# SVM: Complete Guide

After diving into KNN, Linear Regression, and Naive Bayes in previous parts of this series, we’ve finally arrived at one of the most famous algorithms used for both classification and regression: SVM, or **Support Vector Machines.**

SVM is a discriminative classifier that searches for the best hyperplane by which to divide data into groups. In two dimensions, this ideal hyperplane can be visualized as a line separating the plane into two sections, with each section containing the data points of one class. Note that a straight line can only serve as a valid classifier if the data points are linearly separable.

Assume you are given a graph with a plot of two labeled classes as shown in the example image below. Can you choose a line to divide the classes?

The following line divides the data points into two halves: everything to the left of the line is a black dot, and everything to the right is a blue dot.

So, in this supervised machine learning problem, we seek the optimal hyperplane that divides the two classes.

# How does SVM differ from logistic regression?

The major difference between SVM and logistic regression is that SVM is a geometric, margin-based method that outputs a class label directly, whereas logistic regression takes a probabilistic approach and models the probability that a sample belongs to each class.

SVM determines the ideal separating hyperplane as the one with the largest possible margin, which implies that it is equally and maximally distant from both classes. To find this hyperplane, SVM maximizes the width of the margin, i.e., the distance between the hyperplane and the closest training points on either side.

A maximum-margin hyperplane or classifier can be seen in the image as the line in the center. In a two-dimensional feature space the hyperplane is a line, in three dimensions it is a plane, and in general, in n dimensions, it is an (n-1)-dimensional hyperplane. That is how SVM functions.
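Formally, writing the hyperplane as $w \cdot x + b = 0$, the hard-margin SVM objective (a standard formulation, not spelled out explicitly in the article) is:

```latex
\max_{w,\,b}\ \frac{2}{\lVert w \rVert}
\quad \text{subject to} \quad
y_i \left( w \cdot x_i + b \right) \ge 1 \quad \text{for all } i,
```

where $y_i \in \{-1, +1\}$ are the class labels. Maximizing $2 / \lVert w \rVert$ is equivalent to minimizing $\tfrac{1}{2}\lVert w \rVert^2$, which is how the optimization problem is usually solved in practice.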

**Benefits of SVM**

- More accurate in high-dimensional feature spaces than, for instance, KNN.
- Performs well in high-dimensional spaces, and remains effective even when the number of dimensions is greater than the number of samples.

**Negative aspects of SVM**

- Prone to overfitting, particularly when there are many more features or dimensions of the space than there are data points or samples.
- Doesn’t directly offer probability estimates.

**Applications of the SVM**

- Text mining and text category assignment;
- Gene expression classification;
- Speech recognition;
- Image analysis;
- Spam detection;
- Sentiment analysis.

# Hands-on example

For this example, we will be using the Social Network Ads dataset from Kaggle. Given the independent variables below, our model has to classify whether or not a customer purchased the SUV. For simplicity’s sake, in this example we will just use the customer’s age and estimated salary as features.

## 1. Importing Libraries
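The original code block was not preserved in this copy of the article, but a typical set of imports for this walkthrough (assuming the standard numpy/pandas/matplotlib/scikit-learn stack) would look like:

```python
# Standard data-science imports for this walkthrough (assumed; the
# original code block was not preserved in this copy of the article).
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
```

The scikit-learn classes used in later steps (`train_test_split`, `StandardScaler`, `SVC`, `confusion_matrix`) are imported where they are first needed.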

## 2. Load the Dataset

We will not use `UserID` to solve this problem, and we will also remove `Gender` for simplicity’s sake. Remember that X contains the independent variables (features), and Y contains the dependent variable (in our case, whether or not the customer made a purchase).
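The Kaggle file itself isn’t bundled with this article, so the sketch below builds a tiny stand-in DataFrame with the same column layout (names assumed from the dataset’s description) and drops the two unused columns; with the real file you would load it via `pd.read_csv` instead:

```python
import pandas as pd

# With the real Kaggle file you would use:
#   dataset = pd.read_csv('Social_Network_Ads.csv')
# Here we build a tiny stand-in with the same columns (names assumed).
dataset = pd.DataFrame({
    'User ID': [15624510, 15810944, 15668575, 15603246],
    'Gender': ['Male', 'Male', 'Female', 'Female'],
    'Age': [19, 35, 26, 27],
    'EstimatedSalary': [19000, 20000, 43000, 57000],
    'Purchased': [0, 0, 0, 1],
})

# Drop User ID and Gender, keeping only Age, EstimatedSalary, Purchased.
dataset = dataset.drop(columns=['User ID', 'Gender'])
print(dataset.columns.tolist())  # ['Age', 'EstimatedSalary', 'Purchased']
```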

## 3. Split Dataset into X and Y

**Independent variables (X):**

**Dependent variable (Y):**
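A minimal, self-contained sketch of this split (the stand-in DataFrame and its column names are assumptions, since the original code block was not preserved):

```python
import pandas as pd

# Tiny stand-in for the cleaned dataset (column names assumed).
dataset = pd.DataFrame({
    'Age': [19, 35, 26, 27],
    'EstimatedSalary': [19000, 20000, 43000, 57000],
    'Purchased': [0, 0, 0, 1],
})

X = dataset.iloc[:, :-1].values  # independent variables: Age, EstimatedSalary
y = dataset.iloc[:, -1].values   # dependent variable: Purchased
```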

## 4. Split the X and Y Dataset into the Training Set and Test Set
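The article later evaluates on 100 test cases, which is consistent with a 25% test split of the 400-row dataset. A sketch with synthetic stand-in data (the real features would be the Age and EstimatedSalary columns):

```python
import numpy as np
from sklearn.model_selection import train_test_split

# Synthetic stand-in for the 400-row dataset: two features per sample.
rng = np.random.default_rng(0)
X = rng.normal(size=(400, 2))
y = (X[:, 0] + X[:, 1] > 0).astype(int)  # toy binary label

# A 25% test split leaves 100 test samples, matching the article.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0)
print(X_train.shape, X_test.shape)  # (300, 2) (100, 2)
```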

## 5. Perform Feature Scaling

The dataset includes features with very different ranges, so we will need to scale them. We can standardize the data (zero mean, unit variance) using `StandardScaler` from `sklearn`.
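A sketch of the scaling step. The key detail is that the scaler is fit on the training data only and then reused to transform the test data (the sample values below are illustrative stand-ins for Age and EstimatedSalary):

```python
import numpy as np
from sklearn.preprocessing import StandardScaler

# Features on very different scales, like Age (~20-60) vs salary (~20k-150k).
X_train = np.array([[19, 19000], [35, 20000],
                    [26, 43000], [47, 150000]], dtype=float)
X_test = np.array([[30, 87000]], dtype=float)

sc = StandardScaler()
X_train_scaled = sc.fit_transform(X_train)  # fit on training data only
X_test_scaled = sc.transform(X_test)        # reuse the training statistics

print(X_train_scaled.mean(axis=0))  # each column mean is ~0 after scaling
```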

## 6. Fit the model

The `SVC` class enables us to create a kernel SVM model (both linear and non-linear) for classification. The kernel’s default value is `rbf`. We will use `rbf` here because our data is not linearly separable.
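A sketch of the fitting step, using scikit-learn’s `make_moons` as a nonlinearly separable stand-in for the scaled Age/salary data:

```python
from sklearn.datasets import make_moons
from sklearn.svm import SVC

# A nonlinearly separable toy problem (stand-in for the real training set).
X_train, y_train = make_moons(n_samples=200, noise=0.2, random_state=0)

# kernel='rbf' is the default; it is written out explicitly for clarity.
classifier = SVC(kernel='rbf', random_state=0)
classifier.fit(X_train, y_train)
print(classifier.score(X_train, y_train))
```

On this kind of data a linear kernel would underfit badly, which is exactly why the RBF kernel is the right choice here.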

## 7. Predicting the results

To check how many of our predictions are correct, we will calculate a confusion matrix, and use it to further calculate our evaluation metrics. Typically, for any classification problem, you would want to use a combination of performance metrics, including the F1 score, but for the sake of simplicity we will just calculate accuracy here.
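The prediction step itself is a single `predict` call on the held-out test set. A self-contained sketch with synthetic stand-in data:

```python
from sklearn.datasets import make_blobs
from sklearn.model_selection import train_test_split
from sklearn.svm import SVC

# Synthetic two-class stand-in for the scaled Age/salary data.
X, y = make_blobs(n_samples=400, centers=2, random_state=1)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0)

classifier = SVC(kernel='rbf', random_state=0)
classifier.fit(X_train, y_train)

y_pred = classifier.predict(X_test)  # one class label (0 or 1) per test sample
print(y_pred[:10])
```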

## 8. Make the Confusion Matrix

And we got 93% accuracy. Note that accuracy alone is not a very informative metric for many classification problems, but for simplicity’s sake we will use it here.
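A sketch of this evaluation step. The labels below are hypothetical, constructed only so that the error counts match the 7-out-of-100 mistakes the article reports:

```python
import numpy as np
from sklearn.metrics import confusion_matrix, accuracy_score

# Hypothetical test labels and predictions with 7 errors out of 100,
# matching the 93% accuracy reported in the article.
y_test = np.array([0] * 50 + [1] * 50)
y_pred = y_test.copy()
y_pred[:4] = 1     # 4 class-0 samples misclassified as class 1
y_pred[50:53] = 0  # 3 class-1 samples misclassified as class 0

cm = confusion_matrix(y_test, y_pred)
print(cm)                              # [[46  4]
                                       #  [ 3 47]]
print(accuracy_score(y_test, y_pred))  # 0.93
```

The diagonal of the matrix holds the correct predictions; the off-diagonal entries are the two kinds of mistakes.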

## 9. Visualizing Results

The output plot will look like this:

There are a total of 7 incorrect predictions, as seen in the image: three samples predicted green turned out to be red, and four predicted red turned out to be green. Out of the 100 test cases used, this is how we arrive at 93% accuracy.
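A common way to produce this kind of decision-boundary plot (sketched here with synthetic stand-in data, since the original plotting code was not preserved) is to evaluate the classifier on a dense grid and color each cell by the predicted class:

```python
import matplotlib
matplotlib.use('Agg')  # non-interactive backend so this runs headless
import matplotlib.pyplot as plt
import numpy as np
from sklearn.datasets import make_blobs
from sklearn.svm import SVC

# Synthetic two-class stand-in for the scaled Age/salary data.
X, y = make_blobs(n_samples=200, centers=2, random_state=1)
classifier = SVC(kernel='rbf', random_state=0).fit(X, y)

# Evaluate the classifier on a dense grid covering the feature space.
x_min, x_max = X[:, 0].min() - 1, X[:, 0].max() + 1
y_min, y_max = X[:, 1].min() - 1, X[:, 1].max() + 1
xx, yy = np.meshgrid(np.linspace(x_min, x_max, 200),
                     np.linspace(y_min, y_max, 200))
Z = classifier.predict(np.c_[xx.ravel(), yy.ravel()]).reshape(xx.shape)

# Color the regions by predicted class, then overlay the data points.
plt.contourf(xx, yy, Z, alpha=0.3)
plt.scatter(X[:, 0], X[:, 1], c=y, edgecolors='k')
plt.xlabel('feature 1 (e.g. scaled Age)')
plt.ylabel('feature 2 (e.g. scaled EstimatedSalary)')
plt.savefig('svm_decision_boundary.png')
```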

**Conclusion**

With this information, I hope you now have a better understanding of Support Vector Machines. I tried to keep my explanation of SVM and its implementation in Python simple and understandable, so I hope it helped!

If you want to learn other algorithms you can check out some of my other articles:

I advise you to give it a shot. You are welcome to ask me any questions in the comment section if you have any!

*Editor’s Note: **Heartbeat** is a contributor-driven online publication and community dedicated to providing premier educational resources for data science, machine learning, and deep learning practitioners. We’re committed to supporting and inspiring developers and engineers from all walks of life.*

*Editorially independent, Heartbeat is sponsored and published by **Comet**, an MLOps platform that enables data scientists & ML teams to track, compare, explain, & optimize their experiments. We pay our contributors, and we don’t sell ads.*

*If you’d like to contribute, head on over to our **call for contributors**. You can also sign up to receive our weekly newsletters (**Deep Learning Weekly** and the **Comet Newsletter**), join us on **Slack**, and follow Comet on **Twitter** and **LinkedIn** for resources, events, and much more that will help you build better ML models, faster.*
