Top 5 Machine learning models 2021



A collection of the most noteworthy machine learning models over the past year

Photo by Pawel Czerwinski on Unsplash

This year has been full of great models. In this article, my hope is to highlight five of the most noteworthy ones. I have been regularly reviewing and explaining papers over the year, so I have quite a few good mentions. Disclaimer: there may be other good models not mentioned here, and I am not claiming to be the ultimate expert when it comes to evaluating the quality of machine learning models!

Also, note that this list isn’t ordered!

1. Deepmind NFNets

Our smaller models match the test accuracy of an EfficientNet-B7 on ImageNet while being up to 8.7× faster to train, and our largest models attain a new state-of-the-art top-1 accuracy of 86.5%.

Source: arxiv

Normalizer-free networks were released by Deepmind in February 2021 and received a lot of recognition. I even wrote an article explaining them at the time and it got 41,000 views!


One of the most interesting bits of this paper is the analysis they did of existing optimization techniques in deep learning, such as batch normalization. Their analysis led them to the conclusion that batch normalization is not needed, and that removing it can actually speed up training without causing a decrease in performance, hence the name "normalizer-free nets".
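One of the tricks NFNets use in place of batch normalization is Adaptive Gradient Clipping (AGC), which rescales a gradient whenever its norm grows too large relative to the corresponding weight's norm. Below is a minimal NumPy sketch of the idea; note that the paper applies clipping unit-wise, while this simplified version clips per tensor, and the `clip` and `eps` values are illustrative assumptions.

```python
import numpy as np

def adaptive_grad_clip(grad, weight, clip=0.01, eps=1e-3):
    """Simplified per-tensor Adaptive Gradient Clipping (AGC).

    Scales the gradient down whenever its norm exceeds `clip` times
    the weight norm. (NFNets apply this unit-wise; this per-tensor
    version is a simplification for illustration.)
    """
    w_norm = max(np.linalg.norm(weight), eps)  # guard against tiny weights
    g_norm = np.linalg.norm(grad)
    max_norm = clip * w_norm
    if g_norm > max_norm:
        grad = grad * (max_norm / g_norm)      # rescale, keep direction
    return grad

# A huge gradient gets rescaled; a small one passes through untouched.
w = np.ones(4)                     # ||w|| = 2
big_g = np.full(4, 10.0)           # ||g|| = 20, far above 0.01 * 2
clipped = adaptive_grad_clip(big_g, w)
print(np.linalg.norm(clipped))     # → 0.02
```

The appeal of AGC is that the clipping threshold adapts to each layer's scale, rather than using one global norm cap for the whole network.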

Source: arxiv

If you are interested in finding out more, feel free to check out my article here:

2. OpenAI CLIP

CLIP is a transformer-based neural network that uses Contrastive Language–Image Pre-training to classify images. It handles a very wide range of images by turning image classification into a text-similarity problem. The issue with conventional image classification networks is that they are trained on a fixed set of categories. CLIP doesn't work this way: it learns directly from raw text about images, so it isn't limited by predefined labels and supervision.
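Concretely, "classification as text similarity" means embedding the image once, embedding one text prompt per candidate label, and picking the prompt with the highest cosine similarity. Here is a toy NumPy sketch of that final step; the three-dimensional embeddings and the prompt texts are made-up stand-ins for what CLIP's image and text encoders would actually produce.

```python
import numpy as np

def zero_shot_classify(image_emb, text_embs, temperature=0.07):
    """Rank candidate labels by cosine similarity to the image embedding.

    This mirrors CLIP's zero-shot setup: one prompt per label
    ("a photo of a cat", ...), softmax over the similarities.
    """
    def normalize(v):
        return v / np.linalg.norm(v, axis=-1, keepdims=True)

    img = normalize(image_emb)
    txt = normalize(text_embs)
    logits = txt @ img / temperature            # scaled cosine similarities
    probs = np.exp(logits - logits.max())
    probs /= probs.sum()                        # softmax over labels
    return int(np.argmax(probs)), probs

# Toy embeddings standing in for CLIP's encoders (hypothetical values).
image = np.array([0.9, 0.1, 0.0])
prompts = np.array([[1.0, 0.0, 0.0],   # "a photo of a cat"
                    [0.0, 1.0, 0.0],   # "a photo of a dog"
                    [0.0, 0.0, 1.0]])  # "a photo of a car"
label, _ = zero_shot_classify(image, prompts)
print(label)  # → 0, the prompt most similar to the image
```

Because the label set is just a list of strings, swapping in new categories requires no retraining, only new prompts.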

If you are looking for an interesting paper about transformers on images, check out CLIP:

3. Google Switch Transformers

This paper is about a new method to significantly boost the number of parameters while keeping the number of floating-point operations (FLOPs, the standard metric of ML computational cost) per token constant.

It's well known that increasing the number of parameters increases a model's capacity and its ability to learn (up to a certain point, of course). As expected, the model achieves a 4× pre-training speedup over T5-XXL and a 7× speedup over T5-Base and T5-Large.

The paper draws on a wide variety of ML concepts, such as Mixture of Experts, distillation, and model sharding, which makes it quite an interesting read.
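The core Mixture-of-Experts idea in Switch Transformers is top-1 ("switch") routing: a small router sends each token to exactly one expert, so parameters grow with the number of experts while FLOPs per token stay flat. The NumPy sketch below shows just that routing step; the paper's capacity limits and load-balancing loss are omitted, and the toy "experts" are hypothetical stand-ins for feed-forward layers.

```python
import numpy as np

def switch_route(tokens, router_w, experts):
    """Top-1 ("switch") Mixture-of-Experts routing, simplified.

    Each token runs through exactly one expert (the router's argmax),
    and its output is scaled by the router's gate probability.
    """
    logits = tokens @ router_w                        # [n_tokens, n_experts]
    probs = np.exp(logits - logits.max(axis=-1, keepdims=True))
    probs /= probs.sum(axis=-1, keepdims=True)        # router softmax
    choice = probs.argmax(axis=-1)                    # top-1 expert per token
    out = np.empty_like(tokens)
    for i, tok in enumerate(tokens):
        e = choice[i]
        out[i] = probs[i, e] * experts[e](tok)        # gate-scaled output
    return out, choice

# Two toy "experts" standing in for expert feed-forward layers.
experts = [lambda x: 2.0 * x, lambda x: -1.0 * x]
tokens = np.array([[1.0, 0.0], [0.0, 1.0]])
router_w = np.eye(2) * 5.0    # token i strongly prefers expert i
out, choice = switch_route(tokens, router_w, experts)
print(choice)  # → [0 1]
```

Since only one expert runs per token, doubling the number of experts doubles the parameter count but leaves the per-token compute essentially unchanged, which is exactly the trade-off the paper exploits.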

Source: arxiv (LaTex reproduced table)

Using such techniques, they manage to get a 3–30% performance improvement over very powerful existing transformer-based models, while the compute cost per token remains relatively small.

4. Deepmind AlphaFold2

This is probably my favorite model of the year, as I am a big fan of deep learning in the drug discovery and medical space. AlphaFold2 is an advanced model built by DeepMind to tackle the protein folding problem, one of the biggest problems in bioinformatics. If you want to read more about it, check out my article here.

AlphaFold2 is quite a complex model with a lot of details and tricks. If you want to find out more about how it works and the "Evoformer" they used, feel free to check out my article here:

5. Google EfficientNetV2

EfficientNets have been quite popular in image recognition tasks. This year, Google released the second version, which achieves slightly higher performance than state-of-the-art image models while training 5–10× faster. This reflects a noteworthy pattern among this year's papers: a fair amount of them focused on training models faster rather than squeezing out higher performance. A decent number of papers also focused on image-based problems.

One of the interesting techniques used in EfficientNetV2 is progressive learning: training starts on small images, and the image size increases progressively as training goes on. This solution stems from the fact that EfficientNets' training speed starts to suffer at large image sizes.
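The schedule itself can be as simple as interpolating the image size across the run. Here is a minimal sketch of one possible ramp; the specific sizes and the linear interpolation are illustrative assumptions (the paper also scales regularization strength alongside image size, which is omitted here).

```python
def image_size_schedule(epoch, total_epochs, min_size=128, max_size=300):
    """Linearly grow the training image size over the run.

    Early epochs use small, cheap images (fast training steps);
    later epochs use full-size images (full accuracy). The linear
    ramp and the 128→300 range are illustrative choices, not the
    paper's exact settings.
    """
    frac = min(epoch / max(total_epochs - 1, 1), 1.0)
    return int(min_size + frac * (max_size - min_size))

# Small images at the start, full size by the final epoch.
for e in (0, 5, 9):
    print(e, image_size_schedule(e, total_epochs=10))
# → 0 128 / 5 223 / 9 300
```

In a training loop, the returned size would feed into the data pipeline's resize transform at the start of each epoch.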

Honorable Mentions:

  • Google MLP-Mixer: a high-performing image model that uses simple multilayer perceptrons
  • VarifocalNet: a powerful object detection model that achieves similar or higher performance than YOLOv5
  • OpenAI DALL·E: an interesting model that creates images from text


I know that most of these models were released by huge companies. I am quite sure there are other great models released in 2021 that aren't included in this list, simply because they weren't as popular and didn't reach as wide an audience; the sad truth is that papers from these huge companies will always have a greater reach. To be clear, I didn't pick these models just because they came from big companies: I actually read the papers and explained them in Medium posts. If you know of other great papers, please by all means leave them in the comments below (surely I have missed some good ones)!

If you want to receive regular paper reviews about the latest papers in AI & Machine learning, add your email here & Subscribe!


