Keras AlexNet: Dog vs. Cat Classification

Project URL: https://github.com/Insignite/Alexnet-DogvsCat-Classification

I want to build a simple deep learning model for image classification on the Kaggle Dog vs. Cat dataset. For this project, I decided to use the AlexNet architecture because it was mentioned repeatedly during my Machine Learning course. The project is simple enough to help me understand AlexNet, familiarize myself with Keras, and gain more experience in the ML field.

Dataset

After downloading the dataset and extracting it from the zip file, let’s view the data.

The dataset doesn’t come with a label file, but I can extract the label from each image name in the train dataset.
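Here is a minimal sketch of that label extraction. It assumes the training files are named like `cat.0.jpg` and `dog.0.jpg`, with the class as the first dot-separated part of the name; the helper `label_from_filename` is a hypothetical name used only for illustration.

```python
from pathlib import Path


def label_from_filename(path):
    """Return "cat" or "dog" based on the prefix of the image file name.

    Assumes names of the form "<label>.<index>.jpg", e.g. "cat.0.jpg".
    """
    prefix = Path(path).name.split(".")[0].lower()
    if prefix not in {"cat", "dog"}:
        raise ValueError(f"Cannot infer a label from file name: {path!r}")
    return prefix
```

Calling `label_from_filename("train/dog.42.jpg")` gives `"dog"`. Raising an error on an unexpected prefix keeps a stray file from being silently mislabeled.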

I apply a data generator to add variety to the train dataset, which should help the model generalize and improve its accuracy. This also mimics a real-world dataset, because not every input image will be a perfect picture of a dog or a cat. Let’s view a sample from the generator.
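A generator like this could be configured with Keras's `ImageDataGenerator`. This is only a sketch under assumptions: the post does not list the augmentation settings, so every parameter value below is an illustrative guess, not the configuration used in the project.

```python
from tensorflow.keras.preprocessing.image import ImageDataGenerator

# All values here are assumptions chosen for illustration.
train_datagen = ImageDataGenerator(
    rescale=1.0 / 255,       # scale pixel values from [0, 255] to [0, 1]
    rotation_range=15,       # rotate by up to +/- 15 degrees
    width_shift_range=0.1,   # shift horizontally by up to 10% of the width
    height_shift_range=0.1,  # shift vertically by up to 10% of the height
    shear_range=0.1,         # apply a small shear
    zoom_range=0.2,          # zoom in or out by up to 20%
    horizontal_flip=True,    # randomly mirror images left to right
)
```

Each transform is applied randomly per image, so the model rarely sees exactly the same picture twice across epochs.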

The train dataset is split into 80% training and 20% validation, with the image generator applied to both. The data is now ready for training.
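One simple way to get an 80/20 split is to shuffle the list of file names with a fixed seed and cut it. This is a sketch of that idea, not necessarily how the project does it; `split_train_val`, the seed, and the file names in the usage note are all assumptions.

```python
import random


def split_train_val(filenames, val_fraction=0.2, seed=42):
    """Shuffle file names reproducibly and split them into (train, val)."""
    shuffled = sorted(filenames)           # fixed starting order
    random.Random(seed).shuffle(shuffled)  # deterministic shuffle
    n_val = round(len(shuffled) * val_fraction)
    return shuffled[n_val:], shuffled[:n_val]
```

For example, `train_files, val_files = split_train_val(all_files)` leaves 20% of the names in `val_files`. Fixing the seed keeps the validation set the same between runs, so results from different epoch counts stay comparable.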

Deep Learning Model

As mentioned, I will be using the AlexNet architecture to build the model. AlexNet consists of five convolutional layers, some followed by max pooling layers, and then three fully connected layers. Since the dataset only consists of two classes (Dog and Cat), the last layer is a 2-way softmax, added after the three fully connected layers as shown in the table below.

Layer name	Output	Filters	Kernel size	Stride	Padding
Input	227x227x3	–	–	–	–
Convol_1	55x55x96	96	11×11	4	valid
MaxPool_1	27x27x96	–	3×3	2	valid
Norm_1	27x27x96	–	–	–	–
Convol_2	27x27x256	256	5×5	1	same
MaxPool_2	13x13x256	–	3×3	2	valid
Norm_2	13x13x256	–	–	–	–
Convol_3	13x13x384	384	3×3	1	same
Convol_4	13x13x384	384	3×3	1	same
Convol_5	13x13x256	256	3×3	1	same
MaxPool_3	6x6x256	–	3×3	2	valid
FullConnect_1	4096	–	–	–	–
FullConnect_2	4096	–	–	–	–
FullConnect_3	1000	–	–	–	–
FullConnect_4	2 (Dog vs Cat)	–	–	–	–
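
The table above could be sketched in Keras roughly as follows. A few things here are my assumptions rather than facts from the table: `BatchNormalization` stands in for the Norm layers (the original AlexNet used local response normalization), ReLU is used as the activation, and Convol_2 through Convol_5 use "same" padding, because that is what the listed output sizes require (a 5×5 "valid" convolution on a 27x27 input would give 23x23, not 27x27). The function name `build_alexnet` is also just illustrative.

```python
from tensorflow import keras
from tensorflow.keras import layers


def build_alexnet(input_shape=(227, 227, 3), num_classes=2):
    """Build an AlexNet-style model following the layer table."""
    return keras.Sequential([
        keras.Input(shape=input_shape),
        # Convol_1: 227x227x3 -> 55x55x96
        layers.Conv2D(96, 11, strides=4, padding="valid", activation="relu"),
        layers.MaxPooling2D(pool_size=3, strides=2),  # -> 27x27x96
        layers.BatchNormalization(),                  # stand-in for Norm_1
        # Convol_2: "same" padding keeps 27x27
        layers.Conv2D(256, 5, padding="same", activation="relu"),
        layers.MaxPooling2D(pool_size=3, strides=2),  # -> 13x13x256
        layers.BatchNormalization(),                  # stand-in for Norm_2
        # Convol_3 to Convol_5: "same" padding keeps 13x13
        layers.Conv2D(384, 3, padding="same", activation="relu"),
        layers.Conv2D(384, 3, padding="same", activation="relu"),
        layers.Conv2D(256, 3, padding="same", activation="relu"),
        layers.MaxPooling2D(pool_size=3, strides=2),  # -> 6x6x256
        layers.Flatten(),                             # -> 9216
        layers.Dense(4096, activation="relu"),        # FullConnect_1
        layers.Dense(4096, activation="relu"),        # FullConnect_2
        layers.Dense(1000, activation="relu"),        # FullConnect_3
        layers.Dense(num_classes, activation="softmax"),  # FullConnect_4
    ])
```

Calling `build_alexnet().summary()` prints the per-layer output shapes, which makes it easy to compare them against the table row by row.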

Training

I am using a Huawei Matebook Pro with an 8th Gen Intel i7, 16GB RAM, and an NVIDIA GeForce MX150. It is definitely not a good laptop for running any type of machine learning project, so each epoch takes me roughly 10-15 minutes. I decided to use a small number of epochs, but enough to get decent results. I tried 3, then 10, and finally 20 epochs. If you have stronger hardware, increasing to 50 or so may yield better results.

Let’s graph the training loss, training accuracy, validation loss, and validation accuracy for 20 epochs.
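A plot like this can be drawn with matplotlib from the dictionary that Keras stores in `History.history` after `model.fit`. The sketch below assumes the model was compiled with `metrics=["accuracy"]`, so the keys are `loss`, `val_loss`, `accuracy`, and `val_accuracy` (older Keras versions use `acc` and `val_acc`). The function name and output file name are illustrative.

```python
import matplotlib

matplotlib.use("Agg")  # render to a file; drop this line inside a notebook
import matplotlib.pyplot as plt


def plot_history(history, out_path="training_curves.png"):
    """Plot loss and accuracy curves from a Keras-style history dict."""
    epochs = range(1, len(history["loss"]) + 1)
    fig, (ax_loss, ax_acc) = plt.subplots(1, 2, figsize=(12, 4))

    ax_loss.plot(epochs, history["loss"], label="training loss")
    ax_loss.plot(epochs, history["val_loss"], label="validation loss")
    ax_loss.set_xlabel("epoch")
    ax_loss.set_ylabel("loss")
    ax_loss.legend()

    ax_acc.plot(epochs, history["accuracy"], label="training accuracy")
    ax_acc.plot(epochs, history["val_accuracy"], label="validation accuracy")
    ax_acc.set_xlabel("epoch")
    ax_acc.set_ylabel("accuracy")
    ax_acc.legend()

    fig.tight_layout()
    fig.savefig(out_path)
    plt.close(fig)
    return out_path
```

With `history = model.fit(...)`, the call would be `plot_history(history.history)`. A validation loss that keeps rising while the training loss falls is the usual sign of overfitting to watch for in these curves.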

Result

Let’s show some predictions alongside their images so we can see the results better. I will use the first 20 images from the test results.
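To turn the 2-way softmax output into readable labels, each row of probabilities can be mapped to the class with the highest score. This is a small sketch that assumes index 0 means cat and index 1 means dog (Keras generators sort class names alphabetically, but the actual mapping should be checked on the generator's `class_indices`). `to_labels` is a hypothetical helper.

```python
CLASS_NAMES = ("cat", "dog")  # assumed index order: 0 = cat, 1 = dog


def to_labels(probabilities, class_names=CLASS_NAMES):
    """Map each row of class probabilities to (label, confidence)."""
    results = []
    for row in probabilities:
        # Index of the largest probability in this row.
        best = max(range(len(row)), key=lambda i: row[i])
        results.append((class_names[best], float(row[best])))
    return results
```

For the first 20 test images, this could be called as `to_labels(model.predict(test_batch)[:20])`, where `test_batch` is a placeholder for the preprocessed test images, and each label can then be used as the title of the matching image.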

TADA!!! I now have a simple model to classify pictures of dogs and cats.