Multi-task learning in Computer Vision: Image classification


Ever had to build several deep learning models just to cover a set of related requirements? Multi-task learning can be of great help when the underlying dataset for each of those models is the same. Instead of creating individual models, you extract the features of the dataset with a shared base network and attach different head layers on top of it in parallel, so that several tasks are learned simultaneously. Some of the use cases this can solve in the computer vision domain are:

  • Prediction of prices and type of house
  • Gender and age detection from faces
  • Crop segregation and its health analysis

Here we will look at one such use case: a simple image classification model that detects the colour and type of the automobile present in a given image. Let’s get started.


For this exercise, I have chosen this dataset, which is publicly available on Kaggle. It consists of several directories, each named following the pattern “color_type”. As you may have guessed, we are going to treat this as a multi-task image classification problem, with colour and type as the two target values we want the model to learn. Let us look at the data structure and prepare the dataset accordingly.

The dataset consists of around 18 classes, and each class contains only around 50 images, which is arguably too few for a conventional image classification problem. From a multi-task point of view, however, we can split the problem into two learners within a single model: one for the colour and another for the type of vehicle. Each colour or type label then draws on the images of several directories rather than just one. Let us prepare the dataset for this.

Data loading and preparation

Here the images are loaded and resized to the input size the model requires. We have also separated the 18 classes into two label sets:

  • Colour labels: 4
  • Type labels: 5
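
The split into two label sets can be sketched as follows. This is a minimal sketch rather than the original code: the helper names `split_label` and `scan_dataset` are assumptions, and the only thing taken from the dataset description is the “color_type” directory pattern.

```python
from pathlib import Path


def split_label(folder_name):
    """Split a 'color_type' folder name into (colour, vehicle_type)."""
    colour, vehicle_type = folder_name.split("_", 1)
    return colour, vehicle_type


def scan_dataset(root):
    """Return (paths, colours, types) for every file under root/<color_type>/."""
    paths, colours, types = [], [], []
    for class_dir in sorted(Path(root).iterdir()):
        if not class_dir.is_dir():
            continue
        colour, vehicle_type = split_label(class_dir.name)
        for image_path in sorted(class_dir.iterdir()):
            if image_path.is_file():
                paths.append(image_path)
                colours.append(colour)
                types.append(vehicle_type)
    return paths, colours, types
```

Each returned path can then be opened and resized to 128 x 128 with the image library of your choice, while `colours` and `types` become the two targets.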

Data Augmentation
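
The exact transforms are not specified here. As a minimal sketch, assuming each image is a float array of shape (height, width, 3) scaled to [0, 1], a random horizontal flip combined with a mild brightness change could look like this. The function name `augment` and the 0.8 to 1.2 brightness range are illustrative choices, not the author's settings.

```python
import numpy as np


def augment(image, rng):
    """Randomly flip an HxWx3 image left-right and rescale its brightness."""
    out = image.copy()
    if rng.random() < 0.5:
        out = out[:, ::-1, :]           # horizontal flip
    factor = rng.uniform(0.8, 1.2)      # assumed brightness range
    return np.clip(out * factor, 0.0, 1.0)


rng = np.random.default_rng(0)
image = rng.random((128, 128, 3))       # stand-in for a real, resized image
augmented = augment(image, rng)
```

Because colour is one of the targets, augmentations that shift the hue would change the correct label, so this sketch sticks to geometry and a small brightness change.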

Preparing target data
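
Since the model has two heads, it needs two target arrays, one per task. A minimal sketch, assuming one-hot encoded targets; the helper `one_hot` and the three example labels are hypothetical:

```python
import numpy as np


def one_hot(labels):
    """Return (classes, matrix): sorted class names and one one-hot row per label."""
    classes = sorted(set(labels))
    index = {name: i for i, name in enumerate(classes)}
    matrix = np.zeros((len(labels), len(classes)), dtype="float32")
    for row, name in enumerate(labels):
        matrix[row, index[name]] = 1.0
    return classes, matrix


# Hypothetical labels for three images
colours = ["black", "white", "black"]
types = ["truck", "truck", "bike"]

colour_classes, y_colour = one_hot(colours)   # target for the colour head
type_classes, y_type = one_hot(types)         # target for the type head
```

Keeping `colour_classes` and `type_classes` around matters later, because they are what turns a predicted index back into a readable name.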

Modelling using transfer learning

Now we will create the layers required for the multi-task image classification model. I have decided to use transfer learning, since starting from pre-trained weights usually helps when the dataset is this small. The specifics of the model are as follows:

Model: MobileNetV2 pre-trained on the ImageNet dataset, with an input image size of [128, 128, 3]

  • Layer_1 for learning the colour of vehicles in the dataset
  • Layer_2 for learning the type of vehicles in the dataset

Building and training the model
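
The architecture described above can be sketched in Keras as follows. This is a minimal sketch assuming TensorFlow 2.x, not necessarily the author's exact setup: the head names `colour` and `type`, the frozen base, the global average pooling, the Adam optimiser and the categorical cross-entropy losses are all assumptions.

```python
import tensorflow as tf
from tensorflow.keras import layers


def build_multitask_model(n_colours=4, n_types=5, weights="imagenet"):
    """Shared MobileNetV2 base with two parallel softmax heads."""
    base = tf.keras.applications.MobileNetV2(
        input_shape=(128, 128, 3), include_top=False, weights=weights
    )
    base.trainable = False  # assumption: keep the pre-trained features frozen

    inputs = layers.Input(shape=(128, 128, 3))
    # MobileNetV2 expects pixels in [-1, 1]; this maps 0..255 to that range.
    x = layers.Rescaling(1.0 / 127.5, offset=-1.0)(inputs)
    x = base(x, training=False)
    x = layers.GlobalAveragePooling2D()(x)

    # Two heads attached in parallel to the same feature vector
    colour = layers.Dense(n_colours, activation="softmax", name="colour")(x)
    vehicle_type = layers.Dense(n_types, activation="softmax", name="type")(x)

    model = tf.keras.Model(inputs, [colour, vehicle_type])
    model.compile(
        optimizer="adam",
        loss=["categorical_crossentropy", "categorical_crossentropy"],
        metrics=[["accuracy"], ["accuracy"]],
    )
    return model
```

Training then takes one target array per head, in the same order as the outputs, for example `model.fit(images, [y_colour, y_type], validation_split=0.2)`. By default Keras minimises the sum of the two head losses, so both tasks shape the shared features.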

The model does reasonably well on the validation dataset: in the loss plot, the validation loss ends up close to its lowest values.

Running inference on random images

Let us check the capability of the model on random images from the internet. I have downloaded a few samples to test it and chosen three images: a black truck, a white truck and a green bike. Let’s see how the results pan out:

Getting the predictions and remapping them to the classes
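
The model returns one probability vector per head, so remapping means taking the highest-scoring index from each and looking it up in the matching class list. A minimal sketch follows. The class names below are hypothetical placeholders (use the lists derived from the training directories), and the probabilities are made up to stand in for real model outputs.

```python
import numpy as np

# Hypothetical class lists; replace with the ones built from the dataset.
COLOUR_CLASSES = ["black", "blue", "red", "white"]
TYPE_CLASSES = ["bike", "bus", "car", "truck", "van"]


def decode_predictions(colour_probs, type_probs):
    """Map each pair of probability rows to a (colour, type) name pair."""
    colour_ids = np.argmax(colour_probs, axis=1)
    type_ids = np.argmax(type_probs, axis=1)
    return [
        (COLOUR_CLASSES[c], TYPE_CLASSES[t])
        for c, t in zip(colour_ids, type_ids)
    ]


# Made-up outputs of the two heads for a single image
colour_probs = np.array([[0.7, 0.1, 0.1, 0.1]])
type_probs = np.array([[0.05, 0.05, 0.1, 0.7, 0.1]])
print(decode_predictions(colour_probs, type_probs))  # [('black', 'truck')]
```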

As we have seen, rather than developing separate models for different use cases, we can save time and build efficient models with multi-task learning, provided the underlying dataset is common. Do try this out and give me your feedback. Until then, bye friends. Stay safe.


