Incorporating deep learning into pathology

The medical industry has been leveraging AI for the past decade or so. We will look at several approaches to Artificial Intelligence in Pathology that use Whole Slide Images.

In this article, we will dissect how to approach problems like classification and segmentation for Whole Slide Images (WSI), so that they can provide insights to pathologists quickly!

Why use AI in Pathology?

The use of digital pathology has become very prominent since the beginning of the Information Technology era. Computational pathology can open doors to many applications, including AI. Due to recent advances, it has become easier to develop image-based diagnostic, prognostic, and predictive algorithms.

Some of the main tasks done by pathologists are classification, segmentation, and quantification, and deep learning techniques can dramatically accelerate all of them. Apart from producing more accurate results, investing in AI helps pathologists be more productive by reducing human error, balancing workloads, and shortening patient turnaround time. Additionally, it becomes easier to automate pipelines when dealing with large volumes of images.

A brief about WSI

Whole Slide Imaging, also known as virtual microscopy, aims to imitate conventional microscopy in a computer-generated manner, making it straightforward to archive slides and apply image-based solutions.

A sample WSI

The approaches explained below are generic and can be applied across different use cases in pathology. We will look into both the classification and segmentation of WSI.

Normalizing H&E

If you are working with H&E (hematoxylin and eosin) stained images, you may need to follow an additional set of preprocessing steps to convert images from RGB to optical density (OD) space.
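As a rough sketch of that conversion (assuming NumPy and a white background intensity of 255; the function name `rgb_to_od` is illustrative, not from a specific library):

```python
import numpy as np

def rgb_to_od(rgb, background=255.0):
    """Convert an RGB image (uint8) to optical density (OD) space.

    OD = -log10((I + 1) / I_0), where I_0 is the background intensity.
    The +1 avoids taking the log of zero for fully dark pixels.
    """
    rgb = rgb.astype(np.float64)
    return -np.log10((rgb + 1.0) / background)

# A white (background) pixel maps to OD near 0; darker stain gives higher OD.
patch = np.array([[[255, 255, 255], [50, 30, 120]]], dtype=np.uint8)
od = rgb_to_od(patch)
```

Stain-normalization methods such as Macenko's then operate in this OD space, since stain absorption is linear there.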


WSI resolution is typically very large, often exceeding 15,000 pixels per dimension. Training even a simple classifier on the entire image would demand heavy computational power and memory.

First approach

If the classification features are apparent, then resizing the image and performing vanilla CNN classification will do the trick! It is the simplest approach and takes the least amount of time. The trade-off is that model architecture and image resolution must be chosen based on factors like available compute power and memory.
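A minimal PyTorch sketch of this first approach (the tiny architecture and the 512x512 target size are illustrative choices, not prescriptions):

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class TinyWSIClassifier(nn.Module):
    """A vanilla CNN for classifying a whole, downsized WSI."""
    def __init__(self, num_classes=2):
        super().__init__()
        self.conv1 = nn.Conv2d(3, 16, 3, padding=1)
        self.conv2 = nn.Conv2d(16, 32, 3, padding=1)
        self.pool = nn.MaxPool2d(2)
        self.head = nn.Linear(32, num_classes)

    def forward(self, x):
        x = self.pool(F.relu(self.conv1(x)))
        x = self.pool(F.relu(self.conv2(x)))
        x = x.mean(dim=(2, 3))  # global average pooling over spatial dims
        return self.head(x)

# Downsize the huge WSI to a manageable resolution, then classify.
wsi = torch.rand(1, 3, 2048, 2048)  # stand-in for a loaded WSI
small = F.interpolate(wsi, size=(512, 512), mode="bilinear")
logits = TinyWSIClassifier(num_classes=2)(small)
```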

Second approach

The second approach is tiling, or patching, the Whole Slide Images.

There are libraries, such as CLAM, that can extract patches of tissue from a WSI. Alternatively, pathologists' intervention may be needed to annotate the regions of interest. We then choose the tiles within these annotated regions for training and validation.
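A simple sketch of non-overlapping tiling with NumPy (real pipelines would typically read tiles lazily, e.g. with OpenSlide, rather than loading the full WSI into memory):

```python
import numpy as np

def tile_image(img, tile=512):
    """Split an H x W x C image into non-overlapping tile x tile patches.

    Edge regions that don't fill a full tile are dropped for simplicity.
    """
    h, w = img.shape[:2]
    tiles = []
    for y in range(0, h - tile + 1, tile):
        for x in range(0, w - tile + 1, tile):
            tiles.append(img[y:y + tile, x:x + tile])
    return np.stack(tiles)

wsi = np.zeros((1024, 1536, 3), dtype=np.uint8)  # stand-in WSI
patches = tile_image(wsi, tile=512)              # a 2 x 3 grid = 6 tiles
```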

Tiled approach with a CNN classifier

Classification is applied to each tile of a particular image. Ultimately, the tile-level predictions are aggregated to determine the class of the whole WSI.
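One common way to turn tile-level predictions into a slide-level label is a simple majority vote (a sketch; weighted or attention-based aggregation is also used in practice):

```python
import numpy as np

def slide_label(tile_probs):
    """Aggregate per-tile class probabilities into one WSI label.

    tile_probs: (num_tiles, num_classes) array of softmax outputs.
    Each tile votes for its argmax class; the most-voted class wins.
    """
    votes = np.argmax(tile_probs, axis=1)
    return int(np.bincount(votes, minlength=tile_probs.shape[1]).argmax())

# Three tiles vote for class 1, one for class 0 -> the slide is class 1.
probs = np.array([[0.9, 0.1], [0.2, 0.8], [0.3, 0.7], [0.4, 0.6]])
label = slide_label(probs)
```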

The advantage is that tiling produces many training images from a handful of WSIs. The downside is that it is more time-consuming, since generating the tiles requires additional processing.


Transfer learning is preferable because of the complex features of these images. Weights from models like VGG16 or ResNet are suitable, or there may be open-source pre-trained models specific to the use case. A transfer learning model will usually converge sooner.

Instead of using pre-trained VGG16/ResNet weights, we can also build a CNN model from scratch. This is effective when the relevant image features are prominent, clearly visible, and distinguishable. The model will also be smaller and won't require heavy resource usage.


Segmentation highlights the desired parts of a WSI, like tumors, tissues, and wounds, which helps pathologists make decisions sooner.

Segmentation is pixel-level classification, where a mask is generated for a given set of classes. Tiling is necessary to perform segmentation on high-resolution images such as WSIs.

Generally, the process is as follows,

Tiled approach for segmentation (don’t mind the mask)
  1. With the help of annotations, generate a mask for the WSI.
  2. Run pre-processing (if any) and tile WSIs and their corresponding masks.
  3. Train the model on the tiled data.
  4. Run post-processing (if any) and then restitch the generated masks as a whole.
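Steps 2 and 4 above can be sketched as a tile-then-restitch round trip (NumPy, single-channel mask, non-overlapping tiles, dimensions assumed divisible by the tile size):

```python
import numpy as np

def tile_mask(mask, tile=512):
    """Split an H x W mask into a grid of tile x tile patches."""
    h, w = mask.shape
    grid = mask.reshape(h // tile, tile, w // tile, tile)
    return grid.transpose(0, 2, 1, 3).reshape(-1, tile, tile)

def restitch(patches, h, w):
    """Inverse of tile_mask: reassemble patches into the full H x W mask."""
    tile = patches.shape[1]
    grid = patches.reshape(h // tile, w // tile, tile, tile)
    return grid.transpose(0, 2, 1, 3).reshape(h, w)

mask = (np.random.rand(1024, 1536) > 0.5).astype(np.uint8)
patches = tile_mask(mask)                # a 2 x 3 grid = 6 patches
rebuilt = restitch(patches, 1024, 1536)  # identical to the original mask
```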

It is necessary to establish a standard for WSI resolution/dimensions. A segmentation model conforms to a single input shape, and this must not be overlooked. Therefore, before tiling the original WSIs, we must make sure to resize them appropriately.

To clarify, assume that we have 3 WSIs with different resolutions: 15640x15640, 15360x16896, and 17920x12000. If the input shape or tile dimension is 512x512, not all of these dimensions are divisible by 512. One way would be to resize these WSIs to 15360x15360, which is a multiple of 512.

Admittedly, this works well when the images' resolutions are close to each other. But when the difference is significant, as with the third WSI whose smaller dimension is 12000, the resized image will be skewed and might lose features relevant for segmentation. Ideally, the resolution should be uniform across all WSIs.
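Computing a per-dimension resize target as the nearest multiple of the tile size can be done like this (a sketch; always rounding down to one common size, as in the example above, is an equally valid choice):

```python
def nearest_multiple(n, base=512):
    """Round n to the nearest multiple of base (never below one full tile)."""
    return max(base, round(n / base) * base)

# The example resolutions from above, snapped to multiples of 512.
targets = [(nearest_multiple(h), nearest_multiple(w))
           for h, w in [(15640, 15640), (15360, 16896), (17920, 12000)]]
```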

As with the second classification approach, we can choose only the tiles relevant to the use case.


There are many segmentation model architectures available, but the most popular ones are Unet and Mask-RCNN. Both are capable of performing semantic or instance segmentation on images.

Unet model architecture

Unet was originally created to perform segmentation on medical images. The first advantage of using Unet is that it captures global context and precise location at the same time. Second, it works with very few training samples and still performs well on segmentation tasks.
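A heavily scaled-down Unet sketch in PyTorch, showing the encoder-decoder structure with one skip connection (real Unets have several more levels and far more channels):

```python
import torch
import torch.nn as nn

class MiniUnet(nn.Module):
    """One-level Unet: encode, downsample, decode, with a skip connection."""
    def __init__(self, in_ch=3, num_classes=2):
        super().__init__()
        self.enc = nn.Sequential(nn.Conv2d(in_ch, 16, 3, padding=1), nn.ReLU())
        self.down = nn.MaxPool2d(2)
        self.mid = nn.Sequential(nn.Conv2d(16, 32, 3, padding=1), nn.ReLU())
        self.up = nn.ConvTranspose2d(32, 16, 2, stride=2)
        # The skip connection concatenates encoder features with upsampled
        # ones, combining global context with precise local location.
        self.dec = nn.Sequential(nn.Conv2d(32, 16, 3, padding=1), nn.ReLU(),
                                 nn.Conv2d(16, num_classes, 1))

    def forward(self, x):
        e = self.enc(x)                             # (N, 16, H, W)
        m = self.mid(self.down(e))                  # (N, 32, H/2, W/2)
        u = self.up(m)                              # (N, 16, H, W)
        return self.dec(torch.cat([e, u], dim=1))   # per-pixel class logits

logits = MiniUnet(num_classes=2)(torch.rand(1, 3, 64, 64))
```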

Mask-RCNN model architecture

However, the main advantage of using Mask-RCNN over Unet is its end-to-end simplicity for instance-level tasks, since Unet requires additional post-processing steps to separate individual objects. Overall, there is often little difference between Unet and Mask-RCNN results, though that may depend on the use case.

Regardless of the model architecture, segmentation is a resource-demanding task. Therefore, use a GPU with a powerful machine for both training and inference.

Hierarchical approach

Classification and segmentation can be different steps inside a single pipeline. Classification can filter which images get passed to the segmentation model, producing more accurate results and avoiding redundant processing. Similarly, segmented regions may require further classification. The pipeline can have many stages like these.
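The hierarchical idea can be sketched as a plain Python pipeline with stub models (the `has_tumor` and `segment` stubs are hypothetical placeholders standing in for real trained models):

```python
def has_tumor(wsi):
    """Stub classifier: stands in for a trained WSI-level model."""
    return wsi["suspicious"]

def segment(wsi):
    """Stub segmenter: stands in for a trained segmentation model."""
    return {"mask": f"mask_for_{wsi['id']}"}

def pipeline(slides):
    """Segment only the slides the classifier flags, skipping the rest."""
    results = {}
    for wsi in slides:
        if has_tumor(wsi):                     # stage 1: cheap filter
            results[wsi["id"]] = segment(wsi)  # stage 2: costly segmentation
    return results

slides = [{"id": "a", "suspicious": True}, {"id": "b", "suspicious": False}]
out = pipeline(slides)
```

The cheap classifier stage ensures the expensive segmentation model only ever runs on slides that need it.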

A sample pipeline for segmentation of tumor cells


The objective of this article was to highlight key aspects of deep learning in pathology. The approaches mentioned are generic, and real processes can be more complex depending on the use case. Still, they should give you a good baseline for approaching your own pathology projects.

