Interpreting Image/Text Classification Model with LIME — by Sharath Manjunath

Can you trust your neural network or machine learning model? What if the model could explain why you should trust it? Let’s check it out.


Machine learning systems are increasingly being used in high-stakes situations including health (e.g., radiology, drug research), finance (e.g., stock price prediction, digital financial adviser), and even law (e.g. case summarization, litigation prediction). Despite the growing use, there are currently insufficient approaches available to explain and comprehend the judgments made by these deep learning systems. This can be particularly troublesome in situations where algorithmic conclusions must be explainable or traceable to certain characteristics owing to rules or regulations (such as the right to explanation).

The rate of progress in machine learning is remarkable, and we now have a wide range of models to choose from. For a classification problem, for example, we are no longer limited to logistic regression: decision trees, random forests, SVMs, gradient boosting, neural networks, and many other algorithms are available.

However, we can all agree that most existing machine learning models are black boxes: they do something so sophisticated under the hood that we have no idea why they act the way they do. In other words, we don’t know how our model reasons when it makes a prediction.

Black Box algorithm

Understanding our machine learning model’s behavior is becoming increasingly critical. Judging a model’s performance only on its accuracy is no longer adequate, as your model may deceive you. Consider the following ball classifier, which has only one job: to categorize a ball as either a football or a basketball.

Our ball classifier performed admirably in the image above. It accurately guessed the class of all six pictures. After that, we’re satisfied with it and decide to utilize it in production. We had no idea, however, that our classifier had effectively deceived us. What is the reason behind this? Take a look at the description of the model below.

It turns out that our classifier identified a ball as a football based on human body parts rather than the ball itself. So instead of distinguishing a football from a basketball, our classifier was distinguishing human body parts from a basketball. Obviously, this isn’t what we want, and we shouldn’t put our faith in a model on the basis of its accuracy alone.

Unfortunately, interpreting the behavior of our black-box model, like a deep neural network, is a very difficult thing to do.



LIME stands for Local Interpretable Model-agnostic Explanations. The name itself should give you an intuition about the core idea behind it. LIME is:

Local: Explanations are faithful to the model’s behavior around the instance being explained, not necessarily across the whole input space.

Interpretable: Humans can only process and understand a limited amount of information. For example, the weights of a neural network are not meaningful to a human.

Model-agnostic: Any machine learning algorithm can be used as the predictive model. LIME works with text, image, and tabular data.

Explanations: Artifacts that convey the relationship between the input to an ML model and the model’s prediction.

Interpretable Machine Learning with LIME for Text Classification

LIME generates local explanations, in other words, explanations for individual instances in a dataset. LIME starts by generating a new dataset of perturbations around the instance to be explained. Then, the trained machine learning classifier is used to predict the class of each instance in the newly generated dataset. Finally, a simpler model with intrinsic interpretability, for instance a linear regression model, is fitted and used to explain the prediction of the classifier. Before the simpler model is fitted, the instances in the newly generated dataset are weighted based on their distance to the original instance being explained. This makes it more likely that the simpler model is locally faithful around the explained instance.

In this example, as shown in the plot below, we want to explain the prediction of the blue dot (instance with x1 = 0.8 and x2 = -0.7).

Step 1: Generate random perturbations around the instance being explained

For tabular data, sampling around the mean and standard deviation of the explanatory variables is recommended. Given that the dataset used here was standardized, a normal random sample with mean 0 and standard deviation 1 is generated and stored in a 2D array. At this point, we still do not know the class of each element in this sample; it is predicted in the next step.
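As a rough illustration of this step, the sketch below draws the perturbations with NumPy. The sample size of 500, the fixed seed, and the variable names are assumptions made for the example, not values taken from the original analysis.

```python
import numpy as np

rng = np.random.default_rng(seed=0)   # fixed seed so the sketch is reproducible

num_perturbations = 500               # assumed sample size
num_features = 2                      # x1 and x2

# The instance from the running example (x1 = 0.8, x2 = -0.7).
instance = np.array([0.8, -0.7])

# The features are assumed to be standardized, so each perturbation is drawn
# from a normal distribution with mean 0 and standard deviation 1.
perturbations = rng.normal(loc=0.0, scale=1.0,
                           size=(num_perturbations, num_features))

print(perturbations.shape)
```

Each row of `perturbations` is one artificial (x1, x2) pair; none of them has a class label yet.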

Step 2: Use ML classifier to predict classes of newly generated dataset

The trained classifier is used to predict the class of each pair (x1, x2) in the newly generated dataset.
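The trained classifier from the example is not available here, so the sketch below uses a hypothetical stand-in, `black_box_predict`, with an arbitrary nonlinear decision rule. Any model that maps an array of points to class labels could take its place.

```python
import numpy as np

def black_box_predict(points):
    """Hypothetical stand-in for the trained classifier.

    Returns class 1 or class 0 for each (x1, x2) row, using an arbitrary
    nonlinear rule chosen only for illustration.
    """
    points = np.asarray(points, dtype=float)
    x1, x2 = points[:, 0], points[:, 1]
    return (np.tanh(x1) + x2 + x2 ** 3 < 0).astype(int)

rng = np.random.default_rng(0)
perturbations = rng.normal(size=(500, 2))      # perturbations as in Step 1

# One predicted class per artificial data point.
predicted_classes = black_box_predict(perturbations)
print(np.bincount(predicted_classes))          # how many points fall in each class
```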

Step 3: Compute distances between the instance being explained and each perturbation and compute weights (importance) of the generated instances

The distance between each randomly generated instance and the instance being explained is computed using the Euclidean distance. For other types of data, such as image or text data, the cosine distance can be used. These distances are then mapped to a value between zero and one (a weight) using a kernel function. The kernel width determines how wide we define the “locality” around our instance. For tabular data, the choice of kernel width needs special attention, especially if the data has not been standardized.
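To make the role of the kernel width concrete, here is a minimal sketch using an exponential kernel. The kernel form and the two widths compared (0.3 and 2.0) are assumptions for illustration; other kernels that map distance to a value between zero and one would serve the same purpose.

```python
import numpy as np

def exponential_kernel(distances, kernel_width):
    """Map distances to weights between 0 and 1; a distance of 0 gives weight 1."""
    return np.sqrt(np.exp(-(distances ** 2) / kernel_width ** 2))

rng = np.random.default_rng(0)
instance = np.array([0.8, -0.7])
perturbations = rng.normal(size=(500, 2))

# Euclidean distance from every perturbation to the instance being explained.
distances = np.linalg.norm(perturbations - instance, axis=1)

# A narrow kernel keeps only very close points; a wide one keeps many more.
narrow = exponential_kernel(distances, kernel_width=0.3)
wide = exponential_kernel(distances, kernel_width=2.0)
print("points with weight > 0.5:",
      int((narrow > 0.5).sum()), "(narrow) vs", int((wide > 0.5).sum()), "(wide)")
```

A small width gives meaningful weight only to perturbations very close to the instance, while a large width treats most of the sample as "local".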

The plot below shows what these weights look like around the instance being explained. Green markers represent larger weights, that is, instances with higher importance.

Step 4: Use the newly generated dataset, its class predictions, and their importance (weights) to fit a simpler and interpretable (linear) model

A linear model is fitted as shown below. This linear model produces a new decision boundary that is locally faithful around the explained instance. The linear decision boundary can be seen from the markers with + and - symbols. It is important to emphasize that this new linear decision boundary is not globally faithful, because it is only meant to be a proper discriminator in the locality of the instance being explained (blue dot).

The coefficients of this estimated linear model can be used to understand how changes in the explanatory variables affect the classification output for the instance being explained. For example, as shown below, the estimated coefficients suggest that for the instance being explained, increasing the values of x1 and x2 will cause the prediction to lean towards the prediction of the negative class (darker area).
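The four steps can be put together in a short NumPy sketch. Everything specific here is an assumption: `black_box_predict` is a hypothetical stand-in for the trained classifier, the kernel width is an arbitrary choice, and plain weighted least squares is used as the simple model. The stand-in was chosen so that, as in the example, increasing x1 or x2 pushes the prediction away from class 1.

```python
import numpy as np

def black_box_predict(points):
    """Hypothetical stand-in for the trained classifier (class 1 or class 0)."""
    points = np.asarray(points, dtype=float)
    x1, x2 = points[:, 0], points[:, 1]
    return (np.tanh(x1) + x2 + x2 ** 3 < 0).astype(int)

def fit_weighted_linear(X, y, weights):
    """Weighted least squares: minimizes sum_i w_i * (y_i - b0 - x_i . b)^2."""
    design = np.column_stack([np.ones(len(X)), X])    # prepend intercept column
    root_w = np.sqrt(weights)                         # scale rows by sqrt(w_i)
    solution, *_ = np.linalg.lstsq(design * root_w[:, None], y * root_w,
                                   rcond=None)
    return solution[0], solution[1:]                  # intercept, coefficients

rng = np.random.default_rng(0)
instance = np.array([0.8, -0.7])

# Step 1: perturbations. Step 2: black-box predictions.
perturbations = rng.normal(size=(1000, 2))
labels = black_box_predict(perturbations)

# Step 3: distance-based weights (assumed kernel width).
kernel_width = 0.75 * np.sqrt(2)
distances = np.linalg.norm(perturbations - instance, axis=1)
weights = np.sqrt(np.exp(-(distances ** 2) / kernel_width ** 2))

# Step 4: local linear surrogate.
intercept, coefficients = fit_weighted_linear(perturbations, labels, weights)
print("local coefficients for x1, x2:", coefficients)
```

With this stand-in, both coefficients come out negative: near the instance, larger x1 or x2 makes class 1 less likely, which is the kind of statement the local explanation is meant to support.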

Interpretable Machine Learning with LIME for Image Classification

1. Input data perturbation

If our input data is an image, LIME produces many samples similar to it by turning some of the image’s super-pixels on and off.
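The on/off mechanism can be sketched with NumPy alone. The grid segmentation below is a toy assumption standing in for a real super-pixel algorithm, and `grid_superpixels`, `perturb_image`, and the hide color of 0 are hypothetical names and choices made for this example.

```python
import numpy as np

def grid_superpixels(height, width, cell):
    """Toy segmentation: every cell x cell block gets its own integer id."""
    rows = np.arange(height) // cell
    cols = np.arange(width) // cell
    blocks_per_row = -(-width // cell)          # ceiling division
    return rows[:, None] * blocks_per_row + cols[None, :]

def perturb_image(image, segments, active, hide_color=0.0):
    """Keep super-pixels whose entry in `active` is 1; paint the rest."""
    keep = active[segments].astype(bool)        # per-pixel mask, shape (H, W)
    perturbed = np.full_like(image, hide_color)
    perturbed[keep] = image[keep]
    return perturbed

rng = np.random.default_rng(0)
image = rng.random((8, 8, 3))                   # random stand-in for a picture
segments = grid_superpixels(8, 8, cell=4)       # 4 super-pixels with ids 0..3
num_superpixels = int(segments.max()) + 1

# Each row is one artificial sample: 1 = super-pixel on, 0 = super-pixel off.
on_off = rng.integers(0, 2, size=(10, num_superpixels))
perturbed_batch = np.stack([perturb_image(image, segments, row) for row in on_off])
print(perturbed_batch.shape)
```

The binary rows in `on_off` are the interpretable representation of the samples; the perturbed images are what gets passed to the model.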

2. Predict the class of each artificial data point

Next, LIME uses our trained model to predict the class of each artificial data point. If your input data is an image, this is where a prediction is produced for each perturbed image.

3. Calculate the weight of each artificial data point

If the input data is an image, the cosine distance between each perturbed image and the original image is computed. The more a perturbed image resembles the original image, the greater its weight and importance.
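One way to sketch this weighting is to compute the cosine distance on the binary on/off vectors, where the original image corresponds to the all-ones vector. Working on the on/off vectors rather than raw pixels, the exponential kernel, and the kernel width of 0.25 are assumptions of this sketch.

```python
import numpy as np

def cosine_distance_to_original(on_off):
    """Cosine distance between each on/off vector and the all-ones vector,
    which represents the unperturbed image."""
    on_off = np.asarray(on_off, dtype=float)
    original = np.ones(on_off.shape[1])
    dot = on_off @ original
    norms = np.linalg.norm(on_off, axis=1) * np.linalg.norm(original)
    # An all-zero vector has no direction; treat its similarity as 0.
    similarity = np.divide(dot, norms, out=np.zeros_like(dot), where=norms > 0)
    return 1.0 - similarity

on_off = np.array([
    [1, 1, 1, 1],   # nothing hidden: identical to the original
    [1, 1, 0, 0],   # half of the super-pixels hidden
    [0, 0, 0, 0],   # everything hidden
])
distances = cosine_distance_to_original(on_off)

kernel_width = 0.25  # assumed value
weights = np.sqrt(np.exp(-(distances ** 2) / kernel_width ** 2))
print(distances, weights)
```

The sample that hides nothing gets the largest weight, and the weight shrinks as more super-pixels are switched off.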

4. Fit a linear model to explain the most important features

The weighted artificial data points are then used to fit a linear regression model. After this step, we can obtain the fitted coefficient for each feature, exactly as we would in a standard linear regression analysis. If we then sort the coefficients, the features with the largest coefficients are the ones that have the greatest impact on our black-box machine learning model’s prediction.
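The ranking step might look like the following sketch. The black-box scores are fabricated from a known linear rule (`true_effects`, a hypothetical name) purely so that the recovered coefficients can be checked by eye; in practice the scores would come from the model's predictions on the perturbed images. Sorting by absolute value is a choice made here so that strongly negative features also rank highly.

```python
import numpy as np

rng = np.random.default_rng(0)
num_samples, num_superpixels = 200, 4

# Binary on/off vectors; the first row is the unperturbed image.
on_off = rng.integers(0, 2, size=(num_samples, num_superpixels)).astype(float)
on_off[0] = 1.0

# Hypothetical black-box scores for the class of interest: super-pixel 2
# helps a lot, super-pixel 3 hurts, super-pixel 1 is irrelevant.
true_effects = np.array([0.1, 0.0, 0.6, -0.2])
scores = 0.3 + on_off @ true_effects

# Cosine-distance weights relative to the all-ones vector (assumed width 0.25).
norms = np.linalg.norm(on_off, axis=1)
similarity = np.divide(on_off.sum(axis=1), norms * np.sqrt(num_superpixels),
                       out=np.zeros(num_samples), where=norms > 0)
weights = np.sqrt(np.exp(-((1.0 - similarity) ** 2) / 0.25 ** 2))

# Weighted least squares via row scaling by sqrt(weight).
design = np.column_stack([np.ones(num_samples), on_off])
root_w = np.sqrt(weights)
solution, *_ = np.linalg.lstsq(design * root_w[:, None], scores * root_w,
                               rcond=None)
coefficients = solution[1:]

# Rank super-pixels by the magnitude of their coefficient, largest first.
ranking = np.argsort(-np.abs(coefficients))
print("coefficients:", np.round(coefficients, 3))
print("most influential super-pixels:", ranking)
```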

Example: Let’s say we want our model to make a prediction for the panda image below:

Next, we need to import all of the necessary libraries. Since our input data is an image, we’re going to use the LimeImageExplainer class from lime_image. If your input data is tabular, you need to use the LimeTabularExplainer class from lime_tabular instead.

Now it’s time to start interpreting the prediction of our custom model. All we need to do is call the explain_instance method on the explainer object that we created before.

We pass several arguments to explain_instance:

  • image: The image that we want LIME to explain.
  • classifier_fn: Your image classifier’s prediction function.
  • top_labels: The number of labels that you want LIME to show. If it’s 3, then it will only show the top 3 labels with the highest probabilities and ignore the rest.
  • num_samples: The number of artificial data points similar to our input that LIME will generate.
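A hedged sketch of such a call is shown below. The model from the example is not available here, so `toy_classifier_fn` is a hypothetical stand-in that scores images by brightness, and `explain_image` is a helper name invented for this sketch. The sample count of 200 is an arbitrary choice. The `lime` import sits inside the helper so that the stand-in can be tried without the package installed.

```python
import numpy as np

def toy_classifier_fn(images):
    """Hypothetical stand-in for a real model's batch prediction function.

    Takes an array of shape (N, H, W, 3) and returns class probabilities of
    shape (N, 2), driven only by mean brightness.
    """
    images = np.asarray(images, dtype=float)
    brightness = images.reshape(len(images), -1).mean(axis=1)
    p_bright = 1.0 / (1.0 + np.exp(-10.0 * (brightness - 0.5)))
    return np.column_stack([1.0 - p_bright, p_bright])

def explain_image(image, classifier_fn, top_labels=2, num_samples=200):
    """Run LIME on one image (H, W, 3) and return the explanation object."""
    # Imported here so the rest of the sketch loads without lime installed.
    from lime import lime_image

    explainer = lime_image.LimeImageExplainer()
    return explainer.explain_instance(
        image.astype("double"),    # the image to explain
        classifier_fn,             # batch of images -> class probabilities
        top_labels=top_labels,     # how many of the top classes to explain
        num_samples=num_samples,   # how many perturbed images to generate
    )

# Example call (requires the lime package):
#   image = np.random.default_rng(0).random((64, 64, 3))
#   explanation = explain_image(image, toy_classifier_fn)
```

With a real model, `classifier_fn` would typically be the model's own batch prediction function, as long as it accepts a batch of images and returns one row of class probabilities per image.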

Next, we can proceed to visualize the explanation provided by LIME.

Now we know why our image is classified as a panda by our model! The left picture shows only the super-pixels where the panda can be seen. This indicates that our model labels our image as a panda because of these super-pixels.

In the right picture, the super-pixels colored in green increase the likelihood that our image belongs to the panda class, whereas the super-pixels colored in red decrease it.
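Two views like these can be requested from a LIME image explanation through its `get_image_and_mask` method. The sketch below assumes `explanation` is the object returned by `explain_instance`; the helper name `top_label_views` and the choice of 5 features are assumptions made for illustration.

```python
def top_label_views(explanation, num_features=5):
    """Return (image, mask) pairs for the two views of the top predicted label."""
    label = explanation.top_labels[0]

    # Left-hand view: only the super-pixels that support the label,
    # with the rest of the image hidden.
    supporting_only = explanation.get_image_and_mask(
        label, positive_only=True, num_features=num_features, hide_rest=True
    )

    # Right-hand view: the whole image, with both supporting and opposing
    # super-pixels included in the mask.
    pros_and_cons = explanation.get_image_and_mask(
        label, positive_only=False, num_features=num_features, hide_rest=False
    )
    return supporting_only, pros_and_cons
```

Each returned pair holds an image array and a mask array, which can then be displayed with a plotting library of your choice, for example by overlaying the mask boundaries on the image.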

