Towards a Fully Automated Active Learning Pipeline
Simple, modular, and generic architecture for active learning
In my previous post, I gave a short introduction to the theory and methods of active learning. The next step is the implementation. In this post, I will share my journey towards a fully automated active learning pipeline.
Step 1 — Motivation to implement
At first, like other motivated algorithm developers, I started by implementing the chosen active learning method. I didn’t think about the next steps or the “bigger picture”. On the one hand, this got me to a working implementation (engineering-wise) pretty quickly. On the other hand, it made the next steps harder. In my case, those next steps raised three questions:
- How do I build an active learning pipeline?
- How do I build the pipeline in a modular and generic way?
- How do I incorporate multiple different tasks into the pipeline?
Step 2 — active learning pipeline — semi-automatic
My first active learning pipeline implementation was semi-automatic. Each cycle runs automatically from start to finish, but has to be launched manually. The main addition at this stage is the Data Selector.
The Data Selector encapsulates the purpose of active learning: choosing the next images to be annotated in an informed way. Its input is the set of currently non-annotated data, and its output is the subset to be annotated.
The Data Selector can be an additional neural network, a classic algorithm, a database query, or any other method that works for you.
At each cycle, the Data Selector relies on the best model from the previous cycle. The samples it selects are annotated and then added to the previous cycle’s training set.
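To make this concrete, here is a minimal sketch of one possible Data Selector. It uses least-confidence sampling, which is only one of many selection strategies and not necessarily the one used in my pipeline. The function name `select_for_annotation`, the `predict_proba` callable, and the `budget` parameter are all hypothetical names chosen for illustration.

```python
from typing import Callable, Dict, List, Sequence


def select_for_annotation(
    unlabeled: Dict[str, object],
    predict_proba: Callable[[object], Sequence[float]],
    budget: int,
) -> List[str]:
    """Return the ids of the `budget` samples the model is least confident about."""
    # Confidence of a sample = probability of its most likely class.
    confidence = {
        sample_id: max(predict_proba(sample))
        for sample_id, sample in unlabeled.items()
    }
    # Lowest confidence first; ties are broken by id so the output is deterministic.
    ranked = sorted(confidence, key=lambda sample_id: (confidence[sample_id], sample_id))
    return ranked[:budget]


# Toy usage: the "samples" are already class-probability vectors,
# so the hypothetical model is simply the identity function.
pool = {
    "img_a": [0.95, 0.05],
    "img_b": [0.55, 0.45],
    "img_c": [0.70, 0.30],
}
to_annotate = select_for_annotation(pool, predict_proba=lambda p: p, budget=2)
```

In a real cycle, `predict_proba` would wrap the best model from the previous cycle, and the returned ids would be sent for annotation before joining the training set.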
Why is semi-automatic not good enough?
Besides the overhead of launching each cycle manually, which is time-consuming, a semi-automatic pipeline requires monitoring. By monitoring, I mean that we need to remember which cycle to run next, keep track of where the previous cycle’s state was saved, manually choose the inference model from the previous cycle, and more. This process is prone to errors, bugs, and confusion. Ideally, we want a fully automatic pipeline, which we will get to in Step 4.
Step 3 — modular and generic active learning pipeline
I have several models for several different tasks. One day I found myself copy-pasting from one file into four other files, again and again. At that point, I decided that something needed to change, and that it was time for standardization.
The training pipeline is the most important component from a software engineering point of view. Maintaining a modular, standardized architecture for your training pipeline can save you a lot of trouble the next time you implement a new network.
Inspired by this template, I created my own deep learning code architecture with three main components: Data Loader, Graph, and Agent.
Data Loader — as implied from its name, the Data Loader encapsulates all we need to create a data loader object.
Graph — The Graph contains our network model and other components needed for training, such as optimizer, scheduler, and loss.
By defining the Graph, it is easier to reuse shared components. For example, I have two segmentation tasks, using the same loss function. My Graph superclass contains the loss function implementation, and each task is implemented as a graph class that inherits from the Graph superclass.
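As a rough sketch of this idea, the snippet below puts a shared loss in a `Graph` superclass and lets two task-specific graphs inherit it. The class names `TaskAGraph` and `TaskBGraph`, the `build_model` method, and the choice of a soft Dice loss are assumptions for illustration. A real Graph would also hold the network, optimizer, and scheduler objects of your deep learning framework.

```python
from typing import Sequence


class Graph:
    """Hypothetical superclass: holds what every task's graph shares."""

    def build_model(self) -> str:
        raise NotImplementedError("each task defines its own network")

    def loss(self, pred: Sequence[float], target: Sequence[float]) -> float:
        # Shared soft Dice loss (an illustrative choice of segmentation loss).
        eps = 1e-6
        intersection = sum(p * t for p, t in zip(pred, target))
        denominator = sum(pred) + sum(target)
        return 1.0 - (2.0 * intersection + eps) / (denominator + eps)


class TaskAGraph(Graph):
    def build_model(self) -> str:
        return "task-a-net"  # placeholder for a real network object


class TaskBGraph(Graph):
    def build_model(self) -> str:
        return "task-b-net"  # placeholder for a real network object


# Both tasks reuse the same loss implementation without duplicating it.
pred, target = [0.9, 0.1, 0.8], [1.0, 0.0, 1.0]
loss_a = TaskAGraph().loss(pred, target)
loss_b = TaskBGraph().loss(pred, target)
```

Changing the loss in `Graph` now changes it for every task at once, which is exactly what the copy-paste approach could not offer.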
Agent — The agent is our training class. Each specific task inherits from an Agent superclass and must implement train and eval methods. A generic training loop is implemented in the agent superclass, and any other class that inherits from it can extend it.
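The Agent contract can be sketched with an abstract base class. This is only an illustration under assumptions: the `run` method, the `history` attribute, and the `ToyAgent` subclass (whose "loss" simply halves every epoch) are hypothetical and stand in for real training code.

```python
from abc import ABC, abstractmethod
from typing import Dict, List


class Agent(ABC):
    """Hypothetical training superclass with a generic loop."""

    def __init__(self, num_epochs: int) -> None:
        self.num_epochs = num_epochs
        self.history: List[Dict[str, float]] = []

    @abstractmethod
    def train(self) -> float:
        """Run one training epoch and return the training loss."""

    @abstractmethod
    def eval(self) -> float:
        """Run one evaluation pass and return the validation metric."""

    def run(self) -> List[Dict[str, float]]:
        # Generic loop shared by all tasks: train, evaluate, record.
        for epoch in range(self.num_epochs):
            train_loss = self.train()
            val_metric = self.eval()
            self.history.append(
                {"epoch": epoch, "train_loss": train_loss, "val_metric": val_metric}
            )
        return self.history


class ToyAgent(Agent):
    """Stand-in task: the loss halves each epoch instead of training a network."""

    def __init__(self, num_epochs: int) -> None:
        super().__init__(num_epochs)
        self._loss = 1.0

    def train(self) -> float:
        self._loss /= 2
        return self._loss

    def eval(self) -> float:
        return 1.0 - self._loss


history = ToyAgent(num_epochs=3).run()
```

Because `train` and `eval` are abstract, Python refuses to instantiate a task agent that forgets to implement either of them, so the "must implement" rule is enforced rather than just documented.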
Step 4 — active learning pipeline — automatic
Back to the active learning pipeline, we finally get to the last step. In order to create a fully automated pipeline, we need to close the loop. First, each cycle needs to load the state of the previous cycle and save its own state when it finishes. Second, we need to automatically choose the best inference model to use. This is done using the Model Selector module.
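One simple way to carry state from one cycle to the next is a small JSON file. The sketch below assumes a state made of a cycle counter, the ids in the training set, and the name of the best model. The file name, the field names, and the functions `load_state`, `save_state`, and `run_cycle` are all hypothetical.

```python
import json
import tempfile
from pathlib import Path
from typing import Dict, List


def load_state(state_file: Path) -> Dict:
    """Load the previous cycle's state, or start fresh before the first cycle."""
    if state_file.exists():
        return json.loads(state_file.read_text())
    return {"cycle": 0, "train_ids": [], "best_model": None}


def save_state(state_file: Path, state: Dict) -> None:
    """Persist the current cycle's state for the next cycle to pick up."""
    state_file.write_text(json.dumps(state, indent=2))


def run_cycle(state_file: Path, new_ids: List[str], best_model: str) -> Dict:
    """One cycle: load the previous state, update it, and save it again."""
    state = load_state(state_file)
    state["cycle"] += 1
    state["train_ids"] = state["train_ids"] + new_ids
    state["best_model"] = best_model
    save_state(state_file, state)
    return state


# Toy usage: two consecutive cycles sharing one state file.
with tempfile.TemporaryDirectory() as tmp:
    state_file = Path(tmp) / "al_state.json"
    run_cycle(state_file, ["img_1", "img_2"], "cycle1_epoch7.pt")
    final_state = run_cycle(state_file, ["img_3"], "cycle2_epoch4.pt")
```

With the state on disk, nobody has to remember which cycle comes next or where the previous model was saved.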
When training finishes, we have several saved models from different points in the training loop. The Model Selector chooses the best of them according to a given criterion.
This model will be used as the Data Selector model in the next cycle (for relevant methods) and as the new production model.
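A Model Selector can be as small as the sketch below, assuming each saved checkpoint comes with a single validation score. The function name `select_best_model`, the `higher_is_better` flag, and the checkpoint names and scores are made up for illustration.

```python
from typing import Dict


def select_best_model(checkpoints: Dict[str, float], higher_is_better: bool = True) -> str:
    """Return the checkpoint name with the best validation score."""
    if not checkpoints:
        raise ValueError("no checkpoints to choose from")
    # Use max for metrics such as accuracy, min for metrics such as loss.
    pick = max if higher_is_better else min
    return pick(checkpoints, key=checkpoints.get)


# Toy usage with invented validation scores.
checkpoints = {"epoch_05.pt": 0.81, "epoch_10.pt": 0.87, "epoch_15.pt": 0.85}
best = select_best_model(checkpoints)
```

The criterion itself is a design choice: it could be a validation metric, a loss, or a task-specific score, as long as it is computed the same way for every checkpoint.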
Bonus step — Compare module
Do we always want to retrain our network for any new selected non-annotated batch? Probably not. We want to retrain our model when we notice a deterioration in the results. For this, we use the Compare module.
The compare module is pretty simple and works as follows:
- Pass the newly annotated images through the Data Selector module.
- Measure the quality of the results and compare it to your key performance indicator (KPI).
- If the results meet the KPI, there is no need to retrain.
- If the results do not meet the KPI, retrain.
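The decision above boils down to a single comparison. The sketch below assumes that quality is measured as a per-image score where higher is better, and that the batch is summarized by its mean. The function name `should_retrain`, the scores, and the KPI value are hypothetical.

```python
from typing import Sequence


def should_retrain(scores: Sequence[float], kpi: float) -> bool:
    """Return True when the mean score on the new annotated batch is below the KPI."""
    if not scores:
        raise ValueError("need at least one score")
    mean_score = sum(scores) / len(scores)
    return mean_score < kpi


# Toy usage with invented per-image scores and a KPI of 0.85.
good_batch = should_retrain([0.91, 0.88, 0.93], kpi=0.85)  # meets the KPI
bad_batch = should_retrain([0.70, 0.82, 0.79], kpi=0.85)   # falls short of the KPI
```

Other summaries, such as the worst score in the batch, would make the check stricter. Which one fits depends on how the KPI is defined for your task.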
The compare module can be eliminated in the development stage. In the development stage, we want to prove that our active learning approach works. To do so, we use a known fully annotated dataset and show that we can choose our next training samples in an informed way. For more details, see my previous post.
It’s time to wrap up and summarize the entire architecture. The following diagram shows the automated pipeline with all its modules and their connections.
We did it!
Now we are all familiar with the steps towards building a great active learning pipeline.
It took me some time to reorganize all my existing code to fit this pipeline architecture. But once that was done, adding a new task or changing my Data Selector method became much easier.