Transfer Learning in Medical Diagnosis

What has been done so far and future directions…

The use of Artificial Intelligence (AI) in medical imaging is not a new trend. It has been in place since the 1970s, when the first steps in AI mostly consisted of “if else” statements, mathematical modelling and systems designed to apply rules written by humans.

At the time these systems were called AI. Today we call them expert systems, which are quite different from current AI techniques. The main difference is that expert systems are implemented with rules defined and known by humans, such as logic statements (“if else”), mathematical equations and other computer algorithms that allow computers to solve tasks usually reserved for humans. This means that we give the computer the data and the rules to perform the task, and the computer outputs the final result.

Today AI is very different: computers are able to find a way to solve a task without a human specifying the exact rules to use. That is, we give the computer the data and the expected final result, and the computer “learns” the task and outputs the rules (the model) to perform it. Most models are too complex for a human to work through by hand. However, they are based purely on mathematical and logic operations (usually matrix operations), which require millions of calculations that are only feasible with the processing power of modern computers. This is called Machine Learning.

Image from author.
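To make the contrast between expert systems and Machine Learning concrete, here is a minimal sketch in Python. It is only an illustration: the fever example, the 38.0 threshold and the toy data are assumptions made up for this snippet, and real models learn far more complex rules than a single threshold.

```python
# Expert-system style: a human writes the rule.
def rule_based_fever(temperature_c: float) -> bool:
    """Hand-written rule; the 38.0 threshold is an illustrative value chosen by a human."""
    return temperature_c >= 38.0


# Machine-Learning style: the rule (here a single threshold) is derived from labelled data.
def learn_threshold(samples: list[tuple[float, bool]]) -> float:
    """Return the candidate threshold that classifies the most samples correctly."""
    candidates = sorted(value for value, _ in samples)

    def correct_count(threshold: float) -> int:
        return sum((value >= threshold) == label for value, label in samples)

    return max(candidates, key=correct_count)


# Toy labelled data, invented for this example: (temperature, has_fever)
training_data = [
    (36.5, False), (36.9, False), (37.2, False),
    (38.1, True), (38.6, True), (39.4, True),
]

learned = learn_threshold(training_data)


def learned_fever(temperature_c: float) -> bool:
    """Apply the rule that was learned from the data rather than written by hand."""
    return temperature_c >= learned
```

In the first function we give the computer the data and the rule. In the second case we give it the data and the final result, and the rule comes out of the training step.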

Transfer Learning (TF) is a Machine Learning technique in which a model is not trained from scratch. Instead, a Convolutional Neural Network (CNN) pre-trained on a different dataset is applied to new data. The advantages of TF are that a smaller dataset can be used (hundreds or thousands of images instead of millions) and that the time needed to train the model is significantly reduced (hours instead of days or weeks).
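The idea can be sketched with a toy example in plain Python. This is not a real CNN: the "pre-trained" weights, the data and all function names are invented for illustration. The point is only to show the mechanics, where the base stays frozen and a small new head is trained on a small dataset.

```python
import math

# Stand-in for layers pre-trained on a large source dataset (values are made up).
PRETRAINED_WEIGHTS = [[0.9, -0.4], [-0.3, 0.8]]


def extract_features(x):
    """Frozen feature extractor: these weights are never updated on the new task."""
    return [max(0.0, sum(w * v for w, v in zip(row, x))) for row in PRETRAINED_WEIGHTS]


def predict(head, x):
    """New task-specific head: logistic regression on top of the frozen features."""
    features = extract_features(x)
    z = head["bias"] + sum(w * f for w, f in zip(head["weights"], features))
    return 1.0 / (1.0 + math.exp(-z))


def train_head(data, epochs=500, lr=0.5):
    """Fit only the head's parameters with gradient descent; the base stays fixed."""
    head = {"weights": [0.0, 0.0], "bias": 0.0}
    for _ in range(epochs):
        for x, y in data:
            error = predict(head, x) - y  # gradient of the log-loss w.r.t. z
            features = extract_features(x)
            head["weights"] = [w - lr * error * f
                               for w, f in zip(head["weights"], features)]
            head["bias"] -= lr * error
    return head


# Small labelled target dataset (invented): label 1 when the first input dominates.
target_data = [([1.0, 0.1], 1), ([0.9, 0.2], 1), ([0.1, 1.0], 0), ([0.2, 0.8], 0)]
head = train_head(target_data)
```

In practice the frozen base would be a network such as those trained on ImageNet, and the head would be the final classification layer, but the division of work is the same: only a few parameters are learned from the new data.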

The use of TF in the medical literature has increased drastically in recent years. A quick PubMed search for the terms (“transfer learning” OR “Transfer Learning”) over the last 10 years gives a clear picture. In 2012 and 2013 there were only 16 articles per year; in 2014 this number increased to 27, the next year to 43, then to 54 in 2016 and 100 in 2017. In 2018 the numbers started to rise sharply, with 231 articles published, followed by 395 in 2019, 771 in 2020 and finally 1281 in 2021. I ran this search on 4 January 2022, and by that date 31 articles had already been published in 2022. It is also worth noting that more than 90% of these publications date from 2018 or later.

Image from author.
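The "more than 90%" figure can be checked directly from the yearly counts quoted above. The snippet below is just that arithmetic; the variable names are my own.

```python
# Article counts per year as quoted in the text (2022 is partial, up to 4 January).
articles_per_year = {
    2012: 16, 2013: 16, 2014: 27, 2015: 43, 2016: 54, 2017: 100,
    2018: 231, 2019: 395, 2020: 771, 2021: 1281, 2022: 31,
}

total = sum(articles_per_year.values())
since_2018 = sum(count for year, count in articles_per_year.items() if year >= 2018)
share_since_2018 = since_2018 / total

print(f"Total articles: {total}")                       # Total articles: 2965
print(f"Published 2018 or later: {since_2018} "
      f"({share_since_2018:.1%})")                      # Published 2018 or later: 2709 (91.4%)
```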

Transfer Learning has shown promising results in the medical field, even though medical images are quite different from the real-world images on which these models are trained (the ImageNet dataset, for example). However, since all images share common characteristics such as curves, lines or colours, the parameters learned from these images can be transferred to medical data.

The models most commonly used to classify medical images are summarised in an article by Morid (2021). The authors report Inception-V3 (19%), VGG-16 (18%), AlexNet (15%) and ResNet-50 (13%) as the most frequent choices for medical image classification tasks. The most common types of medical images used are X-ray and MRI.

The results so far are promising, with these Machine Learning models being able to effectively classify and stratify medical images. The best results are seen for binary classification, where transfer learning techniques have been reported to reach accuracy above 80% (Yadav, 2019). Results are weaker when transfer learning is used for image segmentation or non-binary classification, with reported accuracy between 60% and 70% (Cai, 2020).

In spite of these promising results, AI is not expected to replace health workers anytime soon. The results published so far have limitations, which means they cannot yet be applied widely.

Some limitations include:

(1) Data collected from only one medical centre, or from a few interconnected centres. This limits the extrapolation of results because of population-specific variations, the specific techniques used, and differences in equipment brand and model. These variables can cause AI models to perform worse.

(2) Limited data available. Even if some recent papers use thousands of labelled images, this is still not on par with the millions of images needed to train Machine Learning models. Acquiring more labelled medical images is expensive because image classification and segmentation must be done by humans (who, unlike computers, need to be paid).

(3) Unbalanced data available. To train Machine Learning models, the data in the different categories should ideally be balanced, meaning that we need roughly the same number of images in each category. In medical imaging this is rarely the case: most images are “normal” or “healthy”, and when “unhealthy” or “abnormal” images are present there is a higher probability of having multiple images from the same patient. To use unbalanced data in TF there are three options: (a) proceed with the unbalanced data, which may bias the model towards the majority class; (b) use only as many images per category as the least represented category has, which wastes data; and (c) perform data augmentation, which may cause the model to overfit.

(4) There is no reliable benchmark comparing different models, which means that we still don’t know which model is better for a specific task. One base model may be better for X-ray images while a different one is better for MRI, but we cannot know this without extensive testing of different models.

(5) Is Transfer Learning really better than training a model from scratch? Most publications on Machine Learning for image classification and segmentation in the medical field use TF techniques instead of training a model from scratch. Models trained from scratch reportedly give worse results than those using TF (Raghu, 2019). However, this deserves further investigation, since the number of images needed to train a model from scratch is much higher than what is needed for transfer learning, which limits this kind of comparison.
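Options (b) and (c) from limitation (3) can be sketched in a few lines of Python. This is a simplified illustration under my own assumptions: the file names are hypothetical, and the oversampling step only repeats existing images, whereas real data augmentation would add transformed copies (flips, rotations and so on).

```python
import random


def undersample(images_by_class, seed=0):
    """Option (b): keep only as many images per class as the smallest class has."""
    rng = random.Random(seed)
    smallest = min(len(images) for images in images_by_class.values())
    return {label: rng.sample(images, smallest)
            for label, images in images_by_class.items()}


def oversample(images_by_class, seed=0):
    """Option (c), simplified: pad each class up to the size of the largest class.

    Plain repeats stand in for augmented copies here, which also makes it easy
    to see why this approach risks overfitting to the few minority images.
    """
    rng = random.Random(seed)
    largest = max(len(images) for images in images_by_class.values())
    return {
        label: images + [rng.choice(images) for _ in range(largest - len(images))]
        for label, images in images_by_class.items()
    }


# Hypothetical unbalanced dataset: 90 "normal" images and 10 "abnormal" ones.
dataset = {
    "normal": [f"normal_{i}.png" for i in range(90)],
    "abnormal": [f"abnormal_{i}.png" for i in range(10)],
}

balanced_small = undersample(dataset)  # 10 images per class, 80 images discarded
balanced_large = oversample(dataset)   # 90 images per class, 80 of them repeats
```

The trade-off described above shows up in the numbers: undersampling throws away most of the "normal" images, while oversampling makes the model see the same ten "abnormal" images many times.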

In conclusion, Transfer Learning from models pre-trained on a non-medical dataset such as ImageNet might be an effective way to approach medical image classification tasks. However, research gaps are still holding back a more meaningful implementation of AI in medical systems, including the lack of serious benchmarking between models and of larger datasets built with images from different centres.

Thank you for reading.



Morid, M., Borjali, A., & Del Fiol, G. (2021). A scoping review of transfer learning research on medical image analysis using ImageNet. Computers in Biology and Medicine, 128, 104115. doi: 10.1016/j.compbiomed.2020.104115

Yadav, S., & Jadhav, S. (2019). Deep convolutional neural network based medical image classification for disease diagnosis. Journal of Big Data, 6(1). doi: 10.1186/s40537-019-0276-2

Cai, L., Gao, J., & Zhao, D. (2020). A review of the application of deep learning in medical image classification and segmentation. Annals of Translational Medicine, 8(11), 713. doi: 10.21037/atm.2020.02.44

Raghu, M., Zhang, C., Kleinberg, J., & Bengio, S. (2019). Transfusion: Understanding Transfer Learning for Medical Imaging. 33rd Conference on Neural Information Processing Systems.

