Model Benefit Evaluation with Lift and Gain Analysis



Lift and Gain Analysis

Lift and Gain analysis evaluates a model's predictions in terms of the benefit they bring to the business. It is most often used in marketing target analysis, but it is not restricted to that field.

The result of a typical Lift and Gain analysis is presented in a chart like the one below.

Image by Author

Gain and Lift charts are visual aids for evaluating the performance of classification models. Unlike the confusion matrix, which evaluates the model on the overall population, the Gain and Lift chart evaluates model performance on a portion of the population. In other words, we evaluate the model in terms of the benefit we could get by applying it to only part of the population.

The value of Gain and Lift analysis comes from a pattern that is common in business: roughly 80% of revenue often comes from 20% of the customers. This idea is at the heart of the decile analysis used to calculate the Gain and Lift chart. The decile analysis is presented in the chart below.

Decile Analysis (Source: http://www.totalcustomeranalytics.com/decile_analysis.htm)

How is decile analysis applicable to Gain and Lift analysis? Let's take a few steps back and explain how the Gain and Lift analysis is calculated from the beginning.

As I mentioned previously, the Gain and Lift chart is used to evaluate a classification model. For the sake of example, let's create a prediction model. In this article, I will use the churn data from Kaggle.

import numpy as np
import pandas as pd
churn = pd.read_csv('churn.csv')
Image by Author

In this dataset, we have 21 columns, and the target is customer churn. This means we will develop a classification model to predict customer churn. For simplicity, I will clean the data for modelling purposes.

#Drop Customer ID
churn = churn.drop('customerID', axis = 1)
#Encode the Yes/No columns as 1/0 (any value other than 'Yes' becomes 0)
for i in ['Partner', 'Dependents', 'PhoneService', 'OnlineSecurity',
          'OnlineBackup', 'DeviceProtection', 'TechSupport', 'StreamingTV',
          'StreamingMovies', 'PaperlessBilling', 'Churn']:
    churn[i] = churn[i].apply(lambda x: 1 if x == 'Yes' else 0)
#One-hot encode the categorical data
churn = pd.get_dummies(churn, columns = ['gender', 'MultipleLines', 'InternetService', 'Contract', 'PaymentMethod'], drop_first = True)
#Convert TotalCharges from string to float; blank entries become 0
churn['TotalCharges'] = churn['TotalCharges'].apply(lambda x: 0 if x == ' ' else float(x))

After cleaning the data, we will develop the prediction model. For this article, I will use a Logistic Regression model.

#Import the model
from sklearn.model_selection import train_test_split
from sklearn.linear_model import LogisticRegression
#Split the data into training and test sets
X_train, X_test, y_train, y_test = train_test_split(churn.drop('Churn', axis = 1), churn['Churn'], test_size = 0.3, stratify = churn['Churn'], random_state = 101)
#Fit the model
model = LogisticRegression()
model.fit(X_train, y_train)

With our model fitted, we can start the Gain and Lift analysis to evaluate it. For comparison, we will first evaluate the model using the usual metrics.

from sklearn.metrics import classification_report
predictions = model.predict(X_test)
print(classification_report(y_test, predictions))
Image by Author

As we can see from the image above, our model's ability to predict churned customers (class 1) is lower. Would our model still be beneficial if we applied it in the business? Let's find out using the Gain and Lift analysis.

The first step in the Gain and Lift analysis is to get the model's predicted probability of class 1 on the test data and sort it in descending order.

#Get the predicted probability of class 1 and sort it in descending order
X_test['Prob'] = model.predict_proba(X_test)[:,1]
X_test = X_test.sort_values(by = 'Prob', ascending = False)
#Attach the actual labels (aligned on the index)
X_test['Churn'] = y_test
Image by Author

Once we have the probabilities sorted in descending order, we divide the data into deciles. This is similar to the decile analysis shown in the image above; we divide the data into 10 sets and label them. Decile 1 holds the customers with the highest predicted probability and decile 10 those with the lowest.

#Divide the data into deciles (1 = highest probability, 10 = lowest)
X_test['Decile'] = pd.qcut(X_test['Prob'], 10, labels = list(range(10, 0, -1)))
Image by Author
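
To make the decile step concrete, here is a minimal, dependency-free sketch of the same idea on made-up data. The function name `assign_deciles` and the toy probabilities are assumptions for illustration only. The rank-based split below only approximates what `pd.qcut` does and may assign tied probabilities differently.

```python
def assign_deciles(probs, n_bins=10):
    """Rank scores from highest to lowest and cut the ranking into n_bins
    equal-sized groups. Group 1 holds the highest scores."""
    # Positions of the observations, ordered by descending probability
    order = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)
    deciles = [0] * len(probs)
    for rank, idx in enumerate(order):
        # Ranks 0 .. n-1 are mapped onto bins 1 .. n_bins
        deciles[idx] = rank * n_bins // len(probs) + 1
    return deciles

# Hypothetical predicted probabilities for 20 customers: 0.00, 0.05, ..., 0.95
toy_probs = [i / 20 for i in range(20)]
toy_deciles = assign_deciles(toy_probs)
print(toy_deciles)
```

With 20 customers, each decile ends up with two of them, and the customer with the highest probability lands in decile 1.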

After dividing the data by decile, we need to count the actual churn cases (actual class 1, not predicted) in each decile. I call this count the Number of Responses.

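#Calculate the actual churn in each decile
res = pd.crosstab(X_test['Decile'], X_test['Churn'])[1].reset_index().rename(columns = {1: 'Number of Responses'})
#Count the cases in each decile (naming the index first keeps this working across pandas versions)
lg = X_test['Decile'].value_counts(sort = False).rename_axis('Decile').reset_index(name = 'Number of Cases')
#Merge the two counts; the decile labels are an ordered categorical (10 < ... < 1), so a descending sort puts decile 1 first
lg = pd.merge(lg, res, on = 'Decile').sort_values(by = 'Decile', ascending = False).reset_index(drop = True)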
Image by Author

In the image above, we obtain the Number of Cases (the number of observations in each decile) and the Number of Responses (the number of actual positives in each decile). With these numbers, we are able to calculate the Gain.

Gain is the cumulative Number of Responses (actual positives) up to a given decile, divided by the total number of positive observations in the data. Let's calculate it for our data.

#Calculate the cumulative number of responses
lg['Cumulative Responses'] = lg['Number of Responses'].cumsum()
#Calculate each decile's share of all positive cases, in percent
lg['% of Events'] = np.round(((lg['Number of Responses']/lg['Number of Responses'].sum())*100),2)
#Calculate the Gain in each decile
lg['Gain'] = lg['% of Events'].cumsum()
Image by Author

We can see from the Gain image above that the Gain increases with each decile, but each additional decile contributes less to the total than the one before. What is the interpretation of the Gain? Gain is the percentage of targets (actual positives) covered at a given decile level. For example, in decile 2 we had a Gain of 50.44. This means that 50.44% of the targets are covered by the top 20% of the data as ranked by the model. For the churn model, we can say that we can identify and target about 50% of the customers who are likely to churn by targeting just 20% of all customers. Business-wise, it means that with fewer resources we could potentially reach about 50% of the churn events.
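
The Gain calculation can also be written out by hand. The following is a minimal sketch in plain Python; the function name `cumulative_gain` and the response counts are made up for illustration and are not the numbers from the churn data. It rounds once at the end rather than per decile, so it may differ from the pandas code in the last decimal place.

```python
def cumulative_gain(responses_per_decile):
    """Return the Gain (in percent) at each decile, given the number of
    actual positives per decile ordered from decile 1 to decile 10."""
    total = sum(responses_per_decile)
    gains, running = [], 0
    for count in responses_per_decile:
        running += count                      # cumulative responses so far
        gains.append(round(100 * running / total, 2))
    return gains

# Made-up response counts per decile (they sum to 100)
toy_responses = [30, 20, 15, 10, 8, 6, 5, 3, 2, 1]
toy_gain = cumulative_gain(toy_responses)
print(toy_gain)
```

In this toy example, the first two deciles hold 50 of the 100 positives, so the Gain at decile 2 is 50%, and the Gain at the last decile is always 100%.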

Next, we need to calculate the Lift. The Lift measures how much better we can expect to do with the predictive model compared to doing without it.

lg['Decile'] = lg['Decile'].astype('int')
lg['lift'] = np.round((lg['Gain']/(lg['Decile']*10)),2)
Image by Author

The Lift can be interpreted as the ratio of the Gain percentage to the percentage expected from random selection at a given decile level. In layman's terms, the Lift of 2.52 in decile 2 means that when selecting 20% of the data based on the model, we find 2.52 times as many targets (actual positives) as we would by randomly selecting 20% of the data without a model.
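
As a quick check of the arithmetic, here is a small sketch that derives the Lift from a list of Gain values. The function name `lift_from_gain` is hypothetical, and only the decile-2 Gain of 50.44 comes from the analysis above; the other Gain values are made up.

```python
def lift_from_gain(gains):
    """Lift at decile k = Gain at decile k / (k * 10), i.e. the share of
    positives captured divided by the share of the population targeted."""
    return [round(g / (10 * k), 2) for k, g in enumerate(gains, start=1)]

# Only the second value (50.44) is taken from the article; the rest are invented
toy_gain = [30.0, 50.44, 65.0, 75.0, 83.0, 89.0, 94.0, 97.0, 99.0, 100.0]
toy_lift = lift_from_gain(toy_gain)
print(toy_lift[1])   # 50.44 / 20, rounded to two decimals
```

The last decile always has a Lift of 1, because targeting the whole population captures all positives with or without a model.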

Let’s try to visualize the Gain and Lift chart compared to the random picking.

Image by Author
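
The plotting code is not shown here, so the following is only a sketch of how a chart like this could be drawn, assuming matplotlib is installed. The helper names `gain_chart_points` and `plot_gain_and_lift` are my own for illustration, and the random baseline is taken to be a Gain equal to the percentage targeted and a Lift of 1.

```python
def gain_chart_points(gains):
    """Return x values, the model's Gain and the random baseline for the
    Gain chart. A (0, 0) point is prepended so both curves start at the origin."""
    n = len(gains)
    x = [100 * k / n for k in range(n + 1)]   # % of population targeted
    model = [0.0] + list(gains)
    random_pick = list(x)                     # random picking: Gain == % targeted
    return x, model, random_pick

def plot_gain_and_lift(gains, lifts):
    # Imported here so the helper above works without matplotlib
    import matplotlib.pyplot as plt

    x, model, random_pick = gain_chart_points(gains)
    deciles = list(range(1, len(lifts) + 1))
    fig, (ax_gain, ax_lift) = plt.subplots(1, 2, figsize=(12, 4))

    ax_gain.plot(x, model, marker='o', label='Model')
    ax_gain.plot(x, random_pick, linestyle='--', label='Random')
    ax_gain.set_xlabel('% of population targeted')
    ax_gain.set_ylabel('Gain (% of positives captured)')
    ax_gain.set_title('Gain chart')
    ax_gain.legend()

    ax_lift.plot(deciles, lifts, marker='o', label='Model')
    ax_lift.plot(deciles, [1.0] * len(lifts), linestyle='--', label='Random')
    ax_lift.set_xlabel('Decile')
    ax_lift.set_ylabel('Lift')
    ax_lift.set_title('Lift chart')
    ax_lift.legend()

    fig.tight_layout()
    plt.show()

# Possible usage with the table built earlier:
# plot_gain_and_lift(lg['Gain'].tolist(), lg['lift'].tolist())
```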

In the image above, we can evaluate the model by comparing its Gain and Lift curves with the random baseline: the greater the area between them, the better the model. We can see that our model predicts well, because the chart shows a greater Gain and Lift than random picking. Business-wise, churn is found more efficiently with the model, which means fewer resources to spend.
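
One way to put a number on "the greater the area" is to measure the area between the Gain curve and the random diagonal with the trapezoidal rule. This is a sketch rather than part of the analysis above; the function name `area_between_gain_and_random` and the Gain values are assumptions for illustration.

```python
def area_between_gain_and_random(gains):
    """Trapezoidal area between the model's Gain curve and the random
    diagonal, with both axes expressed as fractions between 0 and 1."""
    n = len(gains)
    curve = [0.0] + [g / 100 for g in gains]   # start the curve at the origin
    width = 1 / n                              # each decile covers 1/n of the data
    model_area = sum((curve[k] + curve[k + 1]) / 2 * width for k in range(n))
    random_area = 0.5                          # area under the diagonal
    return model_area - random_area

# A model no better than random: Gain equals the % of population targeted
random_like = [10.0 * k for k in range(1, 11)]
# Made-up Gain values for a model that ranks positives towards the top
toy_gain = [30.0, 50.0, 65.0, 75.0, 83.0, 89.0, 94.0, 97.0, 99.0, 100.0]

print(area_between_gain_and_random(random_like))
print(area_between_gain_and_random(toy_gain))
```

A value close to zero means the model ranks customers no better than random picking, while a larger positive value means more of the positives are concentrated in the top deciles.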
