Laboratory One Research Blog

Transfer Learning - SpongeBob SquarePants Character Recogniser

August 07, 2018

Transfer learning is an important technique in deep learning: it lets you fine-tune a pretrained network on new data and repurpose it, greatly reducing both computation cost and data requirements. Consider a business problem where you need a machine to recognize SpongeBob SquarePants characters. How could you quickly tackle this pressing issue?

VGG19

VGG19 is a very deep convolutional network for image recognition. It is a 19-layer network trained by the University of Oxford for the 2014 ImageNet Challenge. The network can classify 1000 different objects, so it's a perfect baseline for our task. The following is a diagram of VGG19's architecture:

VGG19 Model Diagram

Let's see how well it does at recognizing an image of SpongeBob SquarePants.

Baseline Prediction

The network guessed the image to be a hook with 17.6656% confidence. As expected, it fails because it wasn't trained on SpongeBob SquarePants.
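A baseline check like this can be sketched with the tf.keras API. I feed a random array here so the snippet is self-contained; in practice you would load the actual screenshot with `image.load_img`. Note that the ImageNet weights are downloaded on first use:

```python
import numpy as np
from tensorflow.keras.applications.vgg19 import VGG19, decode_predictions, preprocess_input

# Full VGG19, including its ImageNet classification head
model = VGG19(weights="imagenet")

# Stand-in input; replace with a preprocessed 224x224 screenshot
x = preprocess_input(np.random.uniform(0, 255, size=(1, 224, 224, 3)))

preds = model.predict(x)
# Top-3 ImageNet guesses as (class_id, name, confidence) tuples
print(decode_predictions(preds, top=3)[0])
```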

Methodology

To apply transfer learning, we need to perform the following steps:

  1. Build a dataset
  2. Load a pretrained network
  3. Freeze a number of layers
  4. Add new layers
  5. Train the model

You can find my code and data for this on my GitHub.

Build a dataset

I built a very small dataset for this task. It consisted of 3 characters (classes):

  • SpongeBob SquarePants
  • Sandy Cheeks
  • Patrick Star

For each class, I had 31 images: 27 for training, 3 for validation, and 1 for the final prediction.
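The dataset follows the one-folder-per-class layout that Keras' `ImageDataGenerator` reads directly. Here is a minimal sketch; the paths are my choice, and I generate stand-in images so the snippet runs end to end — in the real dataset the folders hold screenshots of the characters:

```python
import os
import numpy as np
from PIL import Image
from tensorflow.keras.preprocessing.image import ImageDataGenerator

# One subfolder per class, per split
classes = ["spongebob", "sandy", "patrick"]
for split in ("train", "validation"):
    for c in classes:
        d = os.path.join("data", split, c)
        os.makedirs(d, exist_ok=True)
        # Stand-in images; in the real dataset these are screenshots
        for i in range(2):
            Image.fromarray(np.uint8(np.random.rand(224, 224, 3) * 255)).save(
                os.path.join(d, f"{i}.png"))

gen = ImageDataGenerator(rescale=1.0 / 255)
train = gen.flow_from_directory("data/train", target_size=(224, 224),
                                batch_size=16, class_mode="categorical")
print(train.class_indices)  # folder names become class labels
```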

Validation Images

Load a pretrained network

I used the VGG19 network for this task. The objects this network was trained to recognize are real objects, not cartoons. I wanted to see if it could generalize to screenshots of cartoon characters.
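Loading the convolutional base, minus its ImageNet classification head, is a one-liner in tf.keras. I pass `weights=None` here so the sketch runs without downloading anything; the real run uses `weights="imagenet"` to pull in the pretrained filters:

```python
from tensorflow.keras.applications.vgg19 import VGG19

# include_top=False drops the 1000-way classification head, leaving
# just the stack of convolution and pooling blocks
base = VGG19(weights=None, include_top=False, input_shape=(224, 224, 3))
base.summary()
```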

Freeze a number of layers

Freezing layers means the network will not update those layers' weights during training. We may freeze many of the layers if we don't have sufficient data, or if those layers are already well trained on a relevant set of features. It took a few tries to find the right number of layers to freeze. My combinations were:

  • freeze all layers
  • freeze the first layer
  • freeze the first five layers
  • freeze the first ten layers
  • freeze all but the last ten layers
  • freeze all but the last five layers
  • freeze all but the last layer

My best result was freezing the first five layers.
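In Keras, freezing is just flipping each layer's `trainable` flag before compiling. A sketch of the winning combination (`base.layers[:5]` is my assumption about how "first five layers" maps onto Keras layer objects, which include the input layer; `weights=None` keeps the sketch download-free):

```python
from tensorflow.keras.applications.vgg19 import VGG19

base = VGG19(weights=None, include_top=False, input_shape=(224, 224, 3))

# Freeze the first five layers; everything after stays trainable
for layer in base.layers[:5]:
    layer.trainable = False

print([(layer.name, layer.trainable) for layer in base.layers[:6]])
```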

Add new layers

The last few layers of the VGG19 network classify images into its original 1000 classes. We need to rip these out and add our own. Mine were as follows:

  • fully connected layer outputting 1024 neurons
  • 50% dropout layer
  • fully connected layer outputting 1024 neurons
  • fully connected layer outputting 3 neurons

New Layers
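Stitching this new head onto the base looks like the following with the functional API. The `Flatten` between the conv base and the dense layers is my assumption, and `weights=None` again keeps the sketch light:

```python
from tensorflow.keras.applications.vgg19 import VGG19
from tensorflow.keras.layers import Dense, Dropout, Flatten
from tensorflow.keras.models import Model

base = VGG19(weights=None, include_top=False, input_shape=(224, 224, 3))

x = Flatten()(base.output)               # assumption: flatten conv features first
x = Dense(1024, activation="relu")(x)    # fully connected, 1024 neurons
x = Dropout(0.5)(x)                      # 50% dropout
x = Dense(1024, activation="relu")(x)    # fully connected, 1024 neurons
out = Dense(3, activation="softmax")(x)  # one neuron per character

model = Model(inputs=base.input, outputs=out)
```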

Train the model

Training the model was very easy. Because most of the work was already done, I was able to train all of the freezing combinations above in under 30 minutes. I used a batch size of 16 and trained until accuracy stopped increasing.
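The training step itself is a compile-and-fit. A sketch with the batch size of 16 and an `EarlyStopping` callback standing in for "train until accuracy stops increasing" — I use synthetic data and a smaller input size here so the snippet runs in seconds, whereas the real run fed 224x224 batches from the `ImageDataGenerator`:

```python
import numpy as np
from tensorflow.keras.applications.vgg19 import VGG19
from tensorflow.keras.callbacks import EarlyStopping
from tensorflow.keras.layers import Dense, Dropout, Flatten
from tensorflow.keras.models import Model

# Small input size keeps this sketch fast; the real run used 224x224
base = VGG19(weights=None, include_top=False, input_shape=(64, 64, 3))
for layer in base.layers[:5]:
    layer.trainable = False  # best combination: freeze the first five layers

x = Flatten()(base.output)
x = Dense(1024, activation="relu")(x)
x = Dropout(0.5)(x)
x = Dense(1024, activation="relu")(x)
out = Dense(3, activation="softmax")(x)
model = Model(inputs=base.input, outputs=out)

model.compile(optimizer="adam", loss="categorical_crossentropy",
              metrics=["accuracy"])

# Synthetic stand-in for the generator batches
x_train = np.random.rand(8, 64, 64, 3).astype("float32")
y_train = np.eye(3)[np.random.randint(0, 3, size=8)]

# Stop once accuracy plateaus instead of running a fixed epoch count
stop = EarlyStopping(monitor="accuracy", patience=3)
history = model.fit(x_train, y_train, batch_size=16, epochs=2,
                    callbacks=[stop], verbose=0)
```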

Accuracy

Loss

Predictions

The following are my results.

Freeze all layers

Freeze the first layer

Freeze the first five layers

Freeze the first ten layers

Freeze all but the last ten layers

Freeze all but the last five layers

Freeze all but the last layer

References

Transfer Learning using Keras

Keras Tutorial: Fine-tuning using pre-trained models

Keras

deeplearning.ai


Peter Chau

Written by Peter Chau, a Canadian Software Engineer building AIs, APIs, UIs, and robots.

peter@labone.tech
