Breast Cancer Classification


Detailed Description of the Code for Classification:

1. The Google Colab notebook looks as follows in image 1:
image 1: Google Colab

2. Upload the dataset to Google Drive, in the same directory as the notebook:

image 2: Google Drive

3. To access the files in Google Drive, you need to authenticate. Run the code in image 3 to authorize Colab to use the files in Google Drive.

image 3

After running the code, a link is generated. Click the link and allow Colab to use the files in the selected Drive. Once access is allowed, a token is generated; paste it into the text box below the link and press Enter.
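In current Colab runtimes, the usual way to perform this authorization is `drive.mount` from the `google.colab` package. The sketch below is an assumption about what the code in image 3 does, not a copy of it. The mount point `/content/drive` is the conventional choice, and the helper name `mount_drive` is made up for illustration. The import is guarded so the snippet does nothing outside Colab.

```python
def mount_drive(mount_point="/content/drive"):
    """Mount Google Drive when running inside Colab; return True on success."""
    try:
        # google.colab is only available inside a Colab runtime.
        from google.colab import drive
    except ImportError:
        print("Not running in Colab; skipping Drive mount.")
        return False
    # Opens the authorization prompt, then exposes Drive under mount_point.
    drive.mount(mount_point)
    return True


mounted = mount_drive()
```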

4. Locate the folder that contains the dataset, as in image 4:

image 4

5. Import the necessary libraries:

image 5
  • Pandas library: pandas is a Python library for data manipulation and analysis.
  • NumPy library: NumPy is a Python library used for working with arrays.
  • Matplotlib library: Matplotlib is a plotting library for the Python programming language.
  • Seaborn library: Seaborn is a Python data visualization library based on Matplotlib. It provides a high-level interface for drawing attractive and informative statistical graphics.
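The four libraries listed above are usually imported under short aliases. The sketch below shows that convention; the aliases `pd`, `np`, `plt` and `sns` are community habits and are assumed here rather than taken from the original notebook. The small `demo` DataFrame is only a check that the imports work.

```python
import pandas as pd               # data manipulation and analysis
import numpy as np                # array operations
import matplotlib.pyplot as plt   # plotting
import seaborn as sns             # statistical graphics built on Matplotlib

# Quick check that the libraries work together: a NumPy array wrapped
# in a pandas DataFrame.
demo = pd.DataFrame(np.arange(6).reshape(3, 2), columns=["a", "b"])
print(demo.shape)
```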

6. Run the code shown in image 6 in your Colab notebook to read the dataset and display it.

image 6
  • The pandas library is used to read the CSV file.
  • The pandas function “read_csv” reads the .csv file.
  • A DataFrame called “data” is created.
  • The first rows of the DataFrame are displayed using its head() method.
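The reading step can be tried without the real file. The sketch below feeds `read_csv` a small in-memory CSV. The column names are assumed to follow the Kaggle file's layout, and the three rows are made up purely for illustration. In the notebook, the argument would be the path to the .csv file in Drive.

```python
import io
import pandas as pd

# Toy stand-in for the Kaggle CSV: the column names follow the assumed
# layout of the dataset, but the three rows are invented.
csv_text = """id,diagnosis,radius_mean,texture_mean
1,M,18.0,10.4
2,B,12.5,15.7
3,M,20.6,17.8
"""

# read_csv accepts a file path or any file-like object.
data = pd.read_csv(io.StringIO(csv_text))

# head() returns the first five rows by default (here, all three).
print(data.head())
```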

7. The features and labels are generated from the DataFrame using the code in image 7.

image 7
  • X is an array containing the features of the dataset.
  • y is the array containing the labels, i.e. whether each sample is malignant (M) or benign (B).
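One way to obtain such X and y arrays is sketched below. It assumes the DataFrame has an `id` column, a `diagnosis` label column, and feature columns after that; the rows are made up, and the original notebook may select the columns differently (for example by position).

```python
import pandas as pd

# Made-up rows in the assumed layout: id, diagnosis, then feature columns.
data = pd.DataFrame({
    "id": [1, 2, 3],
    "diagnosis": ["M", "B", "M"],
    "radius_mean": [18.0, 12.5, 20.6],
    "texture_mean": [10.4, 15.7, 17.8],
})

# Features: every column except the id and the label.
X = data.drop(columns=["id", "diagnosis"]).values
# Labels: the diagnosis column, still as strings at this point.
y = data["diagnosis"].values

print(X.shape, y.shape)  # (3, 2) (3,)
```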

8. The categorical data in y is encoded using LabelEncoder from the sklearn library, as shown in image 8.

image 8
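As a minimal sketch of this encoding step, with made-up labels: LabelEncoder sorts the distinct class names and assigns integers in that order, so with the letters used here B becomes 0 and M becomes 1.

```python
import numpy as np
from sklearn.preprocessing import LabelEncoder

y = np.array(["M", "B", "B", "M"])  # made-up labels

encoder = LabelEncoder()
y_encoded = encoder.fit_transform(y)

print(encoder.classes_)  # ['B' 'M']
print(y_encoded)         # [1 0 0 1]
```

`encoder.inverse_transform` maps the integers back to the original letters when predictions need to be reported.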

9. The dataset is split into a training set and a test set in the ratio 80:20 using the train_test_split function of sklearn.

image 9
  • 80% of the dataset is used for training,
  • while the rest, i.e. 20%, is used for testing.
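The 80:20 split can be sketched as follows. Two assumptions are made: sklearn's bundled copy of the Wisconsin breast cancer data stands in for the Kaggle CSV, and `random_state=0` is an arbitrary value chosen only to make the split repeatable.

```python
from sklearn.datasets import load_breast_cancer
from sklearn.model_selection import train_test_split

# Assumption: sklearn's bundled dataset stands in for the Kaggle CSV.
X, y = load_breast_cancer(return_X_y=True)

# test_size=0.2 gives the 80:20 split; random_state (arbitrary) only
# makes the shuffle repeatable.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=0
)

print(X_train.shape, X_test.shape)
print(y_train.shape, y_test.shape)
```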

10. Feature scaling is a technique to bring the independent features in the data onto a common scale. The features are scaled using the StandardScaler class in the sklearn library, which rescales each feature to zero mean and unit variance, as shown in image 10.

image 10
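A minimal sketch of the scaling step, using made-up features on very different scales. Fitting the scaler on the training set only and reusing it on the test set is common practice and is assumed here; the original notebook's exact calls are not visible.

```python
import numpy as np
from sklearn.preprocessing import StandardScaler

rng = np.random.default_rng(0)
# Made-up features on very different scales, standing in for X_train / X_test.
X_train = rng.normal(loc=[10.0, 500.0], scale=[2.0, 100.0], size=(80, 2))
X_test = rng.normal(loc=[10.0, 500.0], scale=[2.0, 100.0], size=(20, 2))

scaler = StandardScaler()
# Learn each feature's mean and standard deviation from the training set ...
X_train_scaled = scaler.fit_transform(X_train)
# ... and reuse them on the test set, so no test information leaks in.
X_test_scaled = scaler.transform(X_test)

# After scaling, each training feature has mean ~0 and standard deviation ~1.
print(X_train_scaled.mean(axis=0).round(6))
print(X_train_scaled.std(axis=0).round(6))
```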

11. The shapes of the label array and feature array used for training are shown in image 11.

image 11

12. Create a deep learning model and train it on the feature and label arrays so that it learns to classify breast cancer.

image 12
  • The libraries needed to create the deep learning model are imported.
  • The layers used in this deep learning model are a Dense layer with 16 units, a Dropout layer with a rate of 10%, and a Dense layer with 1 unit and a sigmoid activation function.
  • The optimizer used is Adam.
  • The loss function is binary cross-entropy.
  • The metric used is accuracy.
  • The classifier is created with these layers and compiled with the optimizer, loss and metric mentioned above.
  • The classifier is fitted on the feature and label arrays with a batch size of 100 and trained for 150 epochs.
  • At the end of training, the accuracy obtained is 99.10% with a loss of 0.0519.
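The configuration described in this list can be sketched in Keras as follows. Several details are assumptions: the activation of the first Dense layer is not named in the text, so ReLU is used; the helper name `build_classifier` is made up; and the training data is random stand-in data (455 samples with 30 features are assumed sizes), so the accuracy printed here says nothing about the real dataset.

```python
import numpy as np
from tensorflow import keras
from tensorflow.keras import layers


def build_classifier(n_features):
    """Dense(16) -> Dropout(0.1) -> Dense(1, sigmoid), compiled as described."""
    model = keras.Sequential([
        keras.Input(shape=(n_features,)),
        # Assumption: ReLU; the text does not name this layer's activation.
        layers.Dense(16, activation="relu"),
        layers.Dropout(0.1),
        layers.Dense(1, activation="sigmoid"),
    ])
    model.compile(optimizer="adam", loss="binary_crossentropy", metrics=["accuracy"])
    return model


# Random stand-in data; the sizes are assumptions, not taken from the notebook.
rng = np.random.default_rng(0)
X_train = rng.normal(size=(455, 30)).astype("float32")
y_train = (X_train[:, 0] > 0).astype("float32")

classifier = build_classifier(n_features=X_train.shape[1])
history = classifier.fit(X_train, y_train, batch_size=100, epochs=150, verbose=0)

# Training accuracy and loss after the last epoch.
print(history.history["accuracy"][-1], history.history["loss"][-1])
```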

13. A neural network solution was implemented, and it improved the accuracy of breast cancer classification compared with the GeeksforGeeks article, which has a test accuracy of 96.51%.

14. The deep learning model was evaluated on the test dataset; the accuracy obtained was 98.24%.
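Test accuracy for a sigmoid classifier is typically computed by thresholding the predicted probabilities and comparing them with the true labels. The sketch below uses hypothetical labels and probabilities in place of real model output, and the 0.5 threshold is an assumption, since the original evaluation code is not shown.

```python
import numpy as np
from sklearn.metrics import accuracy_score, confusion_matrix

# Hypothetical values: true labels and sigmoid outputs for six test samples.
y_test = np.array([1, 0, 1, 1, 0, 0])
probabilities = np.array([0.92, 0.08, 0.61, 0.35, 0.20, 0.70])

# A sigmoid output above 0.5 is read as class 1, otherwise class 0.
y_pred = (probabilities > 0.5).astype(int)

# Fraction of samples whose predicted class matches the true class.
accuracy = accuracy_score(y_test, y_pred)
# Rows are true classes, columns are predicted classes.
cm = confusion_matrix(y_test, y_pred)

print(accuracy)
print(cm)
```

The confusion matrix is worth printing alongside the accuracy because, for a diagnosis task, missed malignant cases and false alarms are not equally costly.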

Thus, breast cancer was successfully classified as malignant or benign using the Kaggle dataset, with an accuracy of 98.24%.

The complete code for breast cancer detection using deep learning is available at: GitHub Link

Thank you for your time.

