Bank Churn Modeling with Neural Networks
Using an Artificial Neural Network with Keras in Google Colab
In this article, we will predict customer churn (attrition) with an artificial neural network in Google Colab. The aim is not simply to get good accuracy; the point is to build a sound model by trying different techniques and algorithms.
Churn analysis is a classification problem because the label column holds only binary values. Customers are vital to every company and institution, so analyzing customer attrition is an important task.
Here, we will use a neural network to predict customer attrition based on the input features and the target column:
The input features are CreditScore, Geography, …, IsActiveMember, EstimatedSalary.
The target column is Exited.
The dataset is easily available on Kaggle.
Let’s start writing the Python code in Google Colab.
First of all, we need to mount Google Drive in Colab so that we can access the data directly from the drive. A simple command for this is given below:
from google.colab import drive
drive.mount('/content/drive')
After running this command, we will get a link to authorize Colab to access the drive. We need to click on the link, sign in with Gmail, copy the authorization code, and paste it into the box that appears.
Now, import all the required libraries in Colab.
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
import keras
from keras.models import Sequential
from keras.layers import Dense, Activation, Embedding, Flatten, BatchNormalization, Dropout
from keras.activations import relu, sigmoid
from keras.layers import LeakyReLU
Now, we will read the data from the drive.
bank_df = pd.read_csv('/content/drive/MyDrive/Colab
Now, we will divide the data into input features and target feature as X and y respectively.
X = bank_df.iloc[:, 3:13].values
y = bank_df.iloc[:, 13].values
The input features run from the 4th column through the 13th column, and the target feature is the 14th column, i.e. the Exited column.
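As a quick sanity check on the slicing, here is a tiny sketch with a made-up 14-column frame (not the actual dataset): iloc[:, 3:13] picks ten feature columns by position, and iloc[:, 13] picks the single target column.

```python
import pandas as pd

# hypothetical frame with 14 columns, standing in for the bank data
df = pd.DataFrame([[i + j for j in range(14)] for i in range(2)])

X_demo = df.iloc[:, 3:13].values  # columns at positions 3..12 (ten columns)
y_demo = df.iloc[:, 13].values    # the single column at position 13

print(X_demo.shape)  # (2, 10)
print(y_demo.shape)  # (2,)
```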
There are two categorical columns (Geography and Gender) among the features, and we have to convert them into numerical form to make training easier. Here, we use a LabelEncoder to express the categories as numbers.
# Encoding categorical data
from sklearn.preprocessing import LabelEncoder

labelencoder_X_1 = LabelEncoder()
X[:, 1] = labelencoder_X_1.fit_transform(X[:, 1])
labelencoder_X_2 = LabelEncoder()
X[:, 2] = labelencoder_X_2.fit_transform(X[:, 2])
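To see what the label encoder is doing, here is a minimal sketch with made-up Gender values: each distinct string gets an integer code, assigned in alphabetical order of the categories.

```python
from sklearn.preprocessing import LabelEncoder

genders = ['Female', 'Male', 'Male', 'Female']  # toy column, not the real data

le = LabelEncoder()
encoded = le.fit_transform(genders)

print(list(encoded))      # [0, 1, 1, 0]
print(list(le.classes_))  # ['Female', 'Male'] -- codes follow sorted order
```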
Now, we will use a OneHotEncoder to expand the encoded Geography column into dummy columns.
from sklearn.preprocessing import OneHotEncoder
from sklearn.compose import ColumnTransformer

ct = ColumnTransformer(
    [('one_hot_encoder', OneHotEncoder(categories='auto'), [1])],
    remainder='passthrough')
X = ct.fit_transform(X)
X = X[:, 1:]
ColumnTransformer is an estimator that applies a transformer to the selected columns and passes the remaining columns through unchanged. Dropping the first dummy column afterward avoids keeping perfectly correlated, redundant features (the dummy-variable trap).
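The behaviour above can be sketched on a tiny hypothetical matrix: an already label-encoded three-country column plus one numeric column. One-hot encoding turns the categorical column into one indicator column per category, the numeric column passes through, and slicing off the first indicator removes the redundant one.

```python
import numpy as np
from sklearn.compose import ColumnTransformer
from sklearn.preprocessing import OneHotEncoder

# toy data: column 0 is a label-encoded country, column 1 is numeric
Xc = np.array([[0, 500.0], [1, 600.0], [2, 700.0]])

ct = ColumnTransformer(
    [('one_hot', OneHotEncoder(categories='auto'), [0])],
    remainder='passthrough')
Xt = ct.fit_transform(Xc)
print(Xt.shape)   # (3, 4): 3 dummy columns + 1 passthrough column

Xt = Xt[:, 1:]    # drop the first dummy column
print(Xt.shape)   # (3, 3)
```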
It’s time to divide the features into train and test sets.
# dividing features into training and testing sets
from sklearn.model_selection import train_test_split

Xtrain, Xtest, ytrain, ytest = train_test_split(X, y, test_size=0.2, random_state=0)
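A small sketch with toy data shows what test_size=0.2 does to the shapes: 20% of the rows go to the test set and the rest to the training set.

```python
import numpy as np
from sklearn.model_selection import train_test_split

# 50 toy samples with 2 features each
X_demo = np.arange(100).reshape(50, 2)
y_demo = np.arange(50)

Xtr, Xte, ytr, yte = train_test_split(X_demo, y_demo,
                                      test_size=0.2, random_state=0)

print(Xtr.shape, Xte.shape)  # (40, 2) (10, 2)
print(ytr.shape, yte.shape)  # (40,) (10,)
```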
Now, we will apply feature scaling so that all the features, whatever their original range, end up on a comparable scale.
# Feature Scaling
from sklearn.preprocessing import StandardScaler
sc = StandardScaler()
Xtrain = sc.fit_transform(Xtrain)
Xtest = sc.transform(Xtest)
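A quick check on toy data confirms the scaling behaviour: fit_transform learns the mean and standard deviation from the training set and standardizes it exactly, while transform merely reuses those training statistics on the test set.

```python
import numpy as np
from sklearn.preprocessing import StandardScaler

rng = np.random.RandomState(0)
train = rng.normal(loc=50, scale=10, size=(100, 3))  # toy training features
test = rng.normal(loc=50, scale=10, size=(20, 3))    # toy test features

sc = StandardScaler()
train_s = sc.fit_transform(train)  # learns mean/std, then scales
test_s = sc.transform(test)        # reuses the training mean/std

# each training column now has mean ~0 and std ~1
print(np.allclose(train_s.mean(axis=0), 0))  # True
print(np.allclose(train_s.std(axis=0), 1))   # True
```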
Now, import the Keras classifier wrapper and GridSearchCV for tuning the hyper-parameters.
from keras.wrappers.scikit_learn import KerasClassifier
from sklearn.model_selection import GridSearchCV
It’s time to create a function that builds a neural network with a configurable set of hidden layers.
def create_model(layers, activation):
    model = Sequential()
    for i, nodes in enumerate(layers):
        if i == 0:
            model.add(Dense(nodes, input_dim=Xtrain.shape[1], activation=activation))
        else:
            model.add(Dense(nodes, activation=activation))
    model.add(Dense(1, activation='sigmoid'))  # output layer for the binary label
    model.compile(optimizer='adam', loss='binary_crossentropy', metrics=['accuracy'])
    return model

model = KerasClassifier(build_fn=create_model, verbose=0)
Here, we have given different numbers of layers and activation functions to choose the best as a part of GridSearchCV.
layers = [[35, 25], [40, 25, 15]]
activations = ['sigmoid', 'relu']
param_grid = dict(layers=layers, activation=activations,
                  batch_size=[128, 256], epochs=[30])  # epochs value is a placeholder
grid = GridSearchCV(estimator=model, param_grid=param_grid)
In the next step, we are training the model and getting the best parameters.
grid_result = grid.fit(Xtrain, ytrain)
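The same fit-and-search mechanics can be previewed cheaply with a plain scikit-learn estimator standing in for the Keras model (toy data and a hypothetical grid, purely for illustration): GridSearchCV cross-validates every parameter combination and keeps the best one.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import GridSearchCV

rng = np.random.RandomState(0)
X_demo = rng.normal(size=(200, 4))
y_demo = (X_demo[:, 0] + X_demo[:, 1] > 0).astype(int)  # easy binary target

param_grid = {'C': [0.1, 1.0, 10.0]}  # illustrative grid
grid_demo = GridSearchCV(LogisticRegression(), param_grid, cv=3)
grid_demo.fit(X_demo, y_demo)

print(grid_demo.best_params_)  # the best C found by cross-validation
print(grid_demo.best_score_)   # its mean cross-validated accuracy
```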
Write the following command to get the best parameters for our artificial neural network.
print(grid.best_score_, grid.best_params_)
The best combination found includes 'layers': [40, 25, 15].
Now, we will make predictions with these best parameters.
ypred = grid.predict(Xtest)
For evaluation, we are checking the confusion matrix.
from sklearn.metrics import confusion_matrix
cm = confusion_matrix(ytest, ypred)
It’s time to get the accuracy score of the artificial neural network we created above.
from sklearn.metrics import accuracy_score
score = accuracy_score(ytest, ypred)
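On made-up labels, the link between the two metrics is easy to verify: accuracy is just the confusion-matrix diagonal (the correct predictions) divided by the total number of samples.

```python
import numpy as np
from sklearn.metrics import confusion_matrix, accuracy_score

# toy labels and predictions, not the model's actual output
y_true = np.array([0, 0, 1, 1, 1, 0, 1, 0])
y_hat  = np.array([0, 1, 1, 1, 0, 0, 1, 0])

cm = confusion_matrix(y_true, y_hat)
print(cm)  # rows: true class, columns: predicted class

acc = accuracy_score(y_true, y_hat)
print(acc == cm.trace() / cm.sum())  # True: diagonal over total
```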
These parameters give the model about 86% accuracy.
This is a very simple classification artificial neural network, and it should be useful to beginners.
Trending AI/ML Article Identified & Digested via Granola by Ramsey Elbasheer; a Machine-Driven RSS Bot