How to Read Machine Learning Papers Easily

A step-by-step approach to reading technical papers

Ok, I won’t try to convince you why you need to read technical and research papers. Almost all the algorithms you use in daily life came from the technical literature, even though you mostly meet them in the form of tutorials or easier walkthroughs. Sooner or later you are likely to run into problems that are esoteric and have no standard textbook solution, and that’s when the skill of parsing dense technical literature will come in handy.


Some other solid reasons to read technical literature are:

  1. To keep yourself abreast of what is happening in the field of AI/ML. 33,000 new papers are published every year; a few of them are bound to be informative.
  2. To know more about a particular topic of your interest: what the recent findings are, what research has been done so far, and what questions remain to be answered.
  3. It makes you feel humble that there is a lot that we don’t know and can’t comprehend completely.

I won’t call myself a master in reading papers but over the years with trial and error, I have improved a little. The journey from “how can anyone write in such complicated language” to “hmm…it is difficult but one can understand the main idea” has been challenging, at times infuriating but rewarding.


First things first

Before getting your hands on the paper you need to know two things:

  1. What is the major theme of the topic? It could be recommender systems, computer vision, natural language processing, audio processing, etc.
  2. What is your goal? What are you trying to achieve? Do you want to know more about a topic, or do you want to see whether a paper can help you solve a problem you are stuck on?

Only once you have sorted out these two questions should you put together a reading strategy.


By now you must have come across many sources from where you can scour the technical literature of your field. There are many posts and listicles that give you more than enough information on the same, so I’ll refrain from doing so but will mention the ones from where I download most of the papers.

  1. Papers: arXiv is good, but navigating through the site is like getting lost in a forest where each tree looks the same. The better option is arxiv-sanity-preserver, which preserves your sanity while going through the myriad papers present on arXiv. It gives you a preview and abstract of the paper and also lets you find other papers that are similar to the current paper (based on TF-IDF ranks). All hail Andrej Karpathy!!!
  2. Implementations: Papers with Code provides you with programming implementations and GitHub repos of the papers. You can fork the repos and re-run the code to get a better understanding of the mathematical aspects of the paper.
  3. Videos: Two Minute Papers is a YouTube channel that provides a quick overview of the practical aspects of a paper and the associated implications. Yannic Kilcher is another YouTube channel that I watch sometimes while having lunch or dinner.
  4. Newsletters: I’ll mention Import AI for its crisp and coherent presentation of the material. It has a small section of tech tales, which are short sci-fi stories and quite neat to read.
  5. Podcasts: I would include Linear Digressions, Talking Machines, and Data Skeptic as the audio sources (Spotify, Apple) that are quite fun to listen to. They talk not only about the latest trends but also about practical applications, and they host talk shows with industry leaders and chief data scientists.
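To make the "similar papers based on TF-IDF" idea a bit more concrete, here is a minimal sketch in plain Python. It is not the code arxiv-sanity-preserver actually runs; the smoothed IDF formula, the function names, and the three toy abstracts are all assumptions made up for illustration.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into alphabetic word tokens."""
    return re.findall(r"[a-z]+", text.lower())


def tfidf_vectors(docs):
    """Return one {term: weight} dict per document (TF times a smoothed IDF)."""
    tokenized = [tokenize(doc) for doc in docs]
    n_docs = len(tokenized)
    # Document frequency: in how many documents does each term appear?
    df = Counter(term for tokens in tokenized for term in set(tokens))
    vectors = []
    for tokens in tokenized:
        counts = Counter(tokens)
        total = len(tokens) or 1
        vectors.append({
            # Smoothed IDF is an assumption; other variants exist.
            term: (count / total) * (math.log((1 + n_docs) / (1 + df[term])) + 1)
            for term, count in counts.items()
        })
    return vectors


def cosine(a, b):
    """Cosine similarity between two sparse vectors stored as dicts."""
    dot = sum(weight * b.get(term, 0.0) for term, weight in a.items())
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def most_similar(query_index, docs):
    """Rank the other documents by similarity to docs[query_index]."""
    vectors = tfidf_vectors(docs)
    scores = [
        (cosine(vectors[query_index], vec), i)
        for i, vec in enumerate(vectors)
        if i != query_index
    ]
    return [i for _, i in sorted(scores, reverse=True)]


# Hypothetical abstracts, invented for this example.
abstracts = [
    "attention based transformer model for machine translation",
    "convolutional network for image classification",
    "transformer model with sparse attention for long documents",
]
print(most_similar(0, abstracts))  # indices of the other abstracts, most similar first
```

Words that appear in few abstracts get a higher weight, so two papers sharing rare terms score as more similar than two papers that only share common words.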

The How part

Now that you have a paper in hand or on-screen and a clearly defined goal, you’ll be able to navigate through it easily and understand it better. You don’t need to start with highly complicated, maths-heavy papers to develop the habit.

S. Keshav[1] describes this approach in his paper “How to Read a Paper” (check the references), but different folks have their own version that works for them.

Over time, I have developed my own flavour of the three-pass approach, which has proved fruitful.

The three pass approach

Read the paper in three iterations. Each iteration has a different objective but collectively they gear towards having a greater understanding of the literature at hand.

First iteration

The objective of the first pass is to get a general idea of the paper.

Read the title, the names of the authors, the abstract, the introduction, the subsection headings, and the conclusion. Don’t read any mathematical part. Glance through the references mentioned in the paper.

Don’t spend more than 15 minutes on the first pass. By the end of it, you should have an idea of what problem is being solved, what conclusions have been drawn and, most importantly, whether the paper is well written and pertinent enough for you to spend more time on it.

Second iteration

In this phase, you have to read the paper more critically. Make notes and create rough diagrams while going through the paper as having a pen/pencil/marker makes it an active pursuit.

You still don’t have to read any esoteric mathematical equations; just glance over them superficially. Read all the parts that are written in plain English and see if you can come up with your own crude rendition of the algorithm or process described in the paper.

Check whether there are any plots, graphs, or figures in the paper. Try to understand them, as they hold a lot of information and usually present the key results directly.

After the second pass, check whether any GitHub repo or a tutorial is present for the paper and try to re-run it to replicate the results.

This stage should take you somewhere between 50 and 90 minutes.

If you are satisfied with the paper’s contents and feel that it will help you achieve your goal, then and only then move on to the third iteration.

One fruitful thing to do at the end of the second stage is to write a 150-word paragraph of what you learned to crystallize your understanding. You can later use this summary to discuss with your group or on an online forum.

Third iteration

This is the commitment phase as you’ll be spending a lot of time on the paper trying to understand every word of it.

If your goal was just to acquaint yourself with a new methodology, then the second pass may have been enough. But if you have a particular task at hand whose solution you are trying to find, then move ahead with the third iteration.

In the third pass, you will read the mathematical part thoroughly. You will use pen and paper to break the equations down to their fundamentals, and you will consult external sources for the terms and concepts that are alien to you.

Essentially, you are recreating the paper and understanding the nitty-gritty of the various elements of the paper.

Initially, this phase can take many hours, but as you gain skill the time taken will decrease sharply.
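If you like keeping track of things, the three passes can be written down as a small checklist with time budgets. The sketch below is only one possible way to do it: the class name, the field names, and the two helper functions are hypothetical, while the budgets (15 minutes, up to 90 minutes, open-ended) and the 150-word summary come from the passes described above.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class ReadingPass:
    name: str
    goal: str
    max_minutes: Optional[int]  # None means no fixed budget
    checklist: List[str] = field(default_factory=list)


THREE_PASSES = [
    ReadingPass(
        "first", "get a general idea of the paper", 15,
        ["title and authors", "abstract", "introduction",
         "subsection headings", "conclusion", "references"],
    ),
    ReadingPass(
        "second", "read critically, only glancing at the maths", 90,
        ["notes and rough diagrams", "plots and figures",
         "crude version of the algorithm", "look for a repo or tutorial",
         "150-word summary"],
    ),
    ReadingPass(
        "third", "understand every detail, including the maths", None,
        ["work through the equations", "look up unfamiliar terms"],
    ),
]


def over_budget(reading_pass, minutes_spent):
    """True if the time spent exceeds the pass's budget (when it has one)."""
    return (reading_pass.max_minutes is not None
            and minutes_spent > reading_pass.max_minutes)


def summary_is_short_enough(summary, max_words=150):
    """Check the end-of-second-pass summary against the word limit."""
    return len(summary.split()) <= max_words


for p in THREE_PASSES:
    budget = "open-ended" if p.max_minutes is None else f"up to {p.max_minutes} min"
    print(f"{p.name} pass ({budget}): {p.goal}")
```

Calling over_budget(THREE_PASSES[0], 20) is a quick nudge that the first pass has run past its 15 minutes and it is time to decide whether the paper deserves a second one.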


Use a highlighter or an annotation tool. Don’t print every paper (for environmental reasons), only the ones you feel you need to hold in your hands.

Papers are written in such convoluted language because the audience is generally other researchers and practitioners, who are well acquainted with the technical terms, background, and context and can therefore parse them easily.

You will feel frustrated, even enraged, at one time or another, for “Nothing makes you feel stupid quite like reading a scientific journal article.” (Adam Ruben)

Be patient with yourself. It is a skill akin to riding a bicycle: initially, striking a balance might seem like a herculean task, but you’ll get better over time and will learn how the cycle behaves on the road and which bumps and potholes to avoid.

Despite all the difficulties, gaining knowledge is a noble pursuit, so keep at it. The democratisation of AI isn’t only about MOOCs and more computing power at one’s disposal but also about the perusal of academic and industrial research and the associated literature.

At times you will walk away with half-baked knowledge, but that shouldn’t discourage you. It means you need to upskill yourself to reach the level where you can understand the paper. You can read a few older papers to build a foundation for the new research; they will set the context for you beautifully.

Email the authors with insightful questions. They might take a long time to reply, but they are also looking for ways to explain their work in the simplest words possible.

A word of caution

You don’t have to take the conclusions in a paper as gospel every time. That’s why the source and the authors of the paper are an important aspect here. There are times when the methodology and the theory are very well organised but the reported results come from a biased dataset. That’s why reimplementing the paper to see whether you can replicate the results is an important task.
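One simple way to keep yourself honest when replicating is to run the code several times and compare your numbers with the reported one. The sketch below assumes a single scalar metric (say, accuracy) and a tolerance that you pick yourself; the function name and all the numbers are hypothetical and not taken from any paper.

```python
from statistics import mean, stdev


def replication_report(reported, reproduced_runs, tolerance):
    """Compare a reported metric with the mean of your own runs."""
    avg = mean(reproduced_runs)
    # Standard deviation needs at least two runs.
    spread = stdev(reproduced_runs) if len(reproduced_runs) > 1 else 0.0
    gap = avg - reported
    return {
        "mean": avg,
        "stdev": spread,
        "gap": gap,
        "replicated": abs(gap) <= tolerance,
    }


# Hypothetical example: the paper reports 0.91, three re-runs land near 0.85.
report = replication_report(
    reported=0.91,
    reproduced_runs=[0.84, 0.86, 0.85],
    tolerance=0.02,
)
print(report)
```

A large gap does not automatically mean the paper is wrong; it can also point to a different dataset split, different hyperparameters, or a bug in your own re-run, so treat it as a prompt to dig further.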

At the end

This is optional, but one of the fun things I have picked up from the YouTube series 5 Levels is to ask how I would explain what I just learned from the paper at five levels of increasing difficulty, i.e. in 4 or 5 sentences each to a child, a teen, a college student, a grad student, and an expert.

I can’t say how much it helps with the papers themselves, but it has helped me explain concepts to a non-technical audience.


[1] S. Keshav, How to Read a Paper, ACM SIGCOMM Computer Communication Review, 2007.

Happy Reading and Stay Curious
