Keras AlexNet: Dog vs. Cat Classification

Project URL: https://github.com/Insignite/Alexnet-DogvsCat-Classification

I want to build a simple deep learning model for image classification on the Kaggle Dog vs. Cat dataset. For this project I chose the AlexNet architecture because it came up repeatedly in my Machine Learning course. The project is simple enough to help me understand AlexNet, get familiar with Keras, and gain more experience in the ML field.

Dataset

After downloading the dataset and extracting it from the zip file, let’s view the data.

The dataset doesn’t come with a label file, but I can extract the label from each image name in the train dataset.
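To make the labeling step concrete, here is a minimal sketch of reading the label from a file name. It assumes the training images are named like `cat.0.jpg` and `dog.12.jpg`. The helper name `label_from_filename` and the example paths are hypothetical, not taken from the project code.

```python
from pathlib import Path


def label_from_filename(path):
    """Return 'cat' or 'dog' based on the first part of the file name."""
    # Assumption: files are named '<label>.<index>.jpg', e.g. 'cat.0.jpg'.
    label = Path(path).name.split(".")[0].lower()
    if label not in {"cat", "dog"}:
        raise ValueError(f"Unexpected file name: {path}")
    return label


# Hypothetical file names, used only for illustration.
examples = ["train/cat.0.jpg", "train/dog.12.jpg"]
labels = [label_from_filename(p) for p in examples]
```

Raising an error on an unexpected name is a deliberate choice here: it is safer than silently assigning a wrong label.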

I apply a data generator to add variety to the train dataset, which should help the model generalize and improve its accuracy. It also mimics a real-world dataset, because not every input image will be a perfect picture of a dog or a cat. Let’s view a sample from our generator.

The train dataset is split into 80% training and 20% validation, with the image generator applied to both. The data is now ready for training.
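The 80/20 split can be sketched in plain Python as below. This is only an illustration of the idea, not necessarily how the project does it: Keras's `ImageDataGenerator` also accepts a `validation_split` argument that can do the same job. The function name, the seed, and the file names are assumptions made for the example.

```python
import random


def split_train_validation(items, validation_fraction=0.2, seed=42):
    """Shuffle items reproducibly and split them into (train, validation)."""
    shuffled = list(items)
    # A fixed seed keeps the split identical between runs.
    random.Random(seed).shuffle(shuffled)
    n_val = round(len(shuffled) * validation_fraction)
    return shuffled[n_val:], shuffled[:n_val]


# Hypothetical file names: 50 cats and 50 dogs.
filenames = [f"cat.{i}.jpg" for i in range(50)] + [f"dog.{i}.jpg" for i in range(50)]
train_files, val_files = split_train_validation(filenames)
```

Shuffling before splitting matters because the files are grouped by class. Without it, the validation set could end up containing only one class.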

Deep Learning Model

As mentioned, I will be using the AlexNet architecture to build the model. AlexNet consists of five convolutional layers, some followed by max pooling layers, and then three fully connected layers. Since the dataset has only two classes (Dog and Cat), the last layer of my model is a 2-way softmax.

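| Layer name | Output | Filters | Kernel size | Stride | Padding |
| --- | --- | --- | --- | --- | --- |
| Input | 227x227x3 | | | | |
| Convol_1 | 55x55x96 | 96 | 11×11 | 4 | valid |
| MaxPool_1 | 27x27x96 | | 3×3 | 2 | valid |
| Norm_1 | 27x27x96 | | | | |
| Convol_2 | 27x27x256 | 256 | 5×5 | 1 | same |
| MaxPool_2 | 13x13x256 | | 3×3 | 2 | valid |
| Norm_2 | 13x13x256 | | | | |
| Convol_3 | 13x13x384 | 384 | 3×3 | 1 | same |
| Convol_4 | 13x13x384 | 384 | 3×3 | 1 | same |
| Convol_5 | 13x13x256 | 256 | 3×3 | 1 | same |
| MaxPool_3 | 6x6x256 | | 3×3 | 2 | valid |
| FullConnect_1 | 4096 | | | | |
| FullConnect_2 | 4096 | | | | |
| FullConnect_3 | 1000 | | | | |
| FullConnect_4 | 2 (Dog vs Cat) | | | | |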
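The output sizes in the table can be checked with the usual size formulas. The sketch below assumes "same" padding for Convol_2 through Convol_5, which is what the listed output sizes imply: with "valid" padding, a 5×5 kernel at stride 1 would shrink 27 to 23 rather than keep it at 27. The helper `output_size` is a hypothetical name for this illustration.

```python
import math


def output_size(size, kernel, stride, padding):
    """Spatial output size of a convolution or pooling layer."""
    if padding == "valid":
        return (size - kernel) // stride + 1
    if padding == "same":
        return math.ceil(size / stride)
    raise ValueError(f"Unknown padding: {padding}")


# (layer name, kernel size, stride, padding); padding for Convol_2..5 is assumed.
layers = [
    ("Convol_1", 11, 4, "valid"),
    ("MaxPool_1", 3, 2, "valid"),
    ("Convol_2", 5, 1, "same"),
    ("MaxPool_2", 3, 2, "valid"),
    ("Convol_3", 3, 1, "same"),
    ("Convol_4", 3, 1, "same"),
    ("Convol_5", 3, 1, "same"),
    ("MaxPool_3", 3, 2, "valid"),
]

size = 227  # input is 227x227x3
sizes = {}
for name, kernel, stride, padding in layers:
    size = output_size(size, kernel, stride, padding)
    sizes[name] = size

# MaxPool_3 has 256 channels, so this is the input length of FullConnect_1.
flattened = sizes["MaxPool_3"] ** 2 * 256
```

Running this reproduces the 55, 27, 13, and 6 spatial sizes from the table, and the flattened vector going into FullConnect_1 has 6 × 6 × 256 = 9216 values.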

Training


I am using a Huawei MateBook Pro with an 8th Gen Intel i7, 16GB RAM, and an NVIDIA GeForce MX150. It is definitely not a good laptop for machine learning projects, so each epoch takes roughly 10-15 minutes. I decided to use a small number of epochs, but enough to get decent results. I tried 3, then 10, and finally 20 epochs. If you have stronger hardware, increasing to 50 or so will likely yield a better result.

Let’s graph the train loss, train accuracy, validation loss, and validation accuracy for 20 epochs.

Result

Let’s show some predictions next to their images so we can judge the results more easily. I will use the first 20 images from the test set.
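Turning the 2-way softmax output into a readable label can be sketched as follows. The class order (`cat` at index 0, `dog` at index 1) is an assumption and should be checked against the class indices of the actual generator. The probability values are made up for illustration and are not real model output.

```python
# Assumption: index 0 is 'cat' and index 1 is 'dog'.
CLASS_NAMES = ["cat", "dog"]


def decode_prediction(probabilities):
    """Return (label, confidence) for one softmax output vector."""
    if len(probabilities) != len(CLASS_NAMES):
        raise ValueError("Expected one probability per class")
    # Pick the index with the highest probability (argmax).
    best = max(range(len(probabilities)), key=lambda i: probabilities[i])
    return CLASS_NAMES[best], probabilities[best]


# Hypothetical softmax outputs for two images.
sample_outputs = [[0.9, 0.1], [0.25, 0.75]]
decoded = [decode_prediction(p) for p in sample_outputs]
```

The returned confidence can be printed in each image title, which makes uncertain predictions easy to spot.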

TADA!!! I now have a simple model that classifies pictures of dogs and cats.