Hyperparameter Tuning of Support Vector Machine Using GridSearchCV



Models can have many hyperparameters, and grid search is one way to find the best combination of their values.

What is SVM?

SVM stands for Support Vector Machine. It is a supervised machine learning algorithm used for both classification and regression problems. It uses the kernel trick to transform your data and, based on these transformations, finds an optimal boundary between the possible outputs.

Sometimes the data is linearly separable, but usually things are not that simple. Let’s take an example of classification with non-linear data:

Now, to classify this type of data, we add a third dimension to this two-dimensional plot. We define it in a way that is convenient for us: z = x² + y² (you’ll notice that, for a constant z, this is the equation of a circle). This gives us a three-dimensional space. Since we are in three dimensions now, the separating hyperplane is a plane parallel to the x-y plane at a certain z (let’s say z = 1). Now, we map it back into two dimensions.

It looks like this:

And here we go! Our decision boundary is a circle of radius 1, which separates the two classes using SVM.
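The lifting step described above can be sketched in a few lines. This is a minimal illustration on synthetic data, not the article's original figure: the ring-shaped dataset, the random seed, and the value C=10 are all assumptions chosen for the demonstration. It compares a linear SVM on the raw (x, y) points with a linear SVM on the lifted (x, y, z) points, where z = x² + y².

```python
import numpy as np
from sklearn.svm import SVC

rng = np.random.default_rng(0)
n = 100  # points per class (assumed value)

# Synthetic data: class 0 lies inside radius 0.8, class 1 in a ring from 1.2 to 2.0.
angles = rng.uniform(0.0, 2.0 * np.pi, size=2 * n)
radii = np.concatenate([rng.uniform(0.0, 0.8, size=n),
                        rng.uniform(1.2, 2.0, size=n)])
X_2d = np.column_stack([radii * np.cos(angles), radii * np.sin(angles)])
y = np.concatenate([np.zeros(n, dtype=int), np.ones(n, dtype=int)])

# Lift to three dimensions with z = x^2 + y^2.
z = X_2d[:, 0] ** 2 + X_2d[:, 1] ** 2
X_3d = np.column_stack([X_2d, z])

# A straight line cannot separate the rings in 2-D, but a flat plane can in 3-D.
acc_2d = SVC(kernel="linear", C=10).fit(X_2d, y).score(X_2d, y)
acc_3d = SVC(kernel="linear", C=10).fit(X_3d, y).score(X_3d, y)

print(f"linear SVM on (x, y):    training accuracy = {acc_2d:.2f}")
print(f"linear SVM on (x, y, z): training accuracy = {acc_3d:.2f}")
```

In practice you rarely build the extra feature by hand; a kernel such as RBF lets the SVM work in a lifted space implicitly.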

What is Grid Search?

Grid search is a hyperparameter tuning technique that builds and evaluates a model for every combination of parameter values specified in a grid.
We might use 10-fold cross-validation to search for the best values. For a decision tree, for example, settings such as criterion, max_depth, and min_samples_split are fixed before training rather than learned from the data; such values are called hyperparameters. To find the best set of hyperparameters, we use the grid search method: every combination of hyperparameter values is passed to the model one by one, each resulting model is scored, and the combination with the best score is returned. Scikit-learn provides a means of automatically iterating over these combinations using cross-validation.

How does it work?

Grid Search takes the model or object you want to train and a set of candidate values for its hyperparameters. It then calculates the error for each combination of values, allowing you to choose the best one.

Figure: An illustration of how grid search works. Image source: created by the author.

Let the small circles represent different hyperparameter values. We begin with one value and train the model, then train it again with a different value, and continue until we have exhausted the candidate values. Every model produces an error, and we pick the hyperparameter value that minimizes it. To do this, we split our dataset into three parts: the training set, the validation set, and the test set. We train a model on the training set for each hyperparameter value, compute each model’s error on the validation set, and select the value that minimizes the error (or maximizes the score) there. Finally, we measure the chosen model’s performance on the test data.
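The procedure above can be written as a plain loop before reaching for any helper class. The following is a rough sketch under stated assumptions: the 60/20/20 split, the candidate values for C and gamma, the random seed, and the use of StandardScaler are illustrative choices, not values taken from the article.

```python
from itertools import product

from sklearn.datasets import load_wine
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

X, y = load_wine(return_X_y=True)

# Split into training (60%), validation (20%), and test (20%) sets (assumed ratio).
X_train, X_rest, y_train, y_rest = train_test_split(
    X, y, test_size=0.4, random_state=0, stratify=y)
X_val, X_test, y_val, y_test = train_test_split(
    X_rest, y_rest, test_size=0.5, random_state=0, stratify=y_rest)

# Candidate hyperparameter values (assumed for illustration).
candidate_C = [0.1, 1, 10]
candidate_gamma = [0.001, 0.01, 0.1]

best_score, best_params = -1.0, None
for C, gamma in product(candidate_C, candidate_gamma):
    # Train one model per combination and score it on the validation set.
    model = make_pipeline(StandardScaler(), SVC(C=C, gamma=gamma))
    model.fit(X_train, y_train)
    score = model.score(X_val, y_val)
    if score > best_score:
        best_score, best_params = score, {"C": C, "gamma": gamma}

# Refit with the winning combination and evaluate once on the held-out test set.
final_model = make_pipeline(StandardScaler(), SVC(**best_params))
final_model.fit(X_train, y_train)
test_score = final_model.score(X_test, y_test)

print("best parameters:", best_params)
print(f"validation accuracy: {best_score:.3f}, test accuracy: {test_score:.3f}")
```

The test set is touched only once, at the very end, so its score is not biased by the search itself.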

Below, we implement hyperparameter tuning in Python using GridSearchCV from the scikit-learn library.

Step by step implementation in Python:

a. Import necessary libraries:

We import various modules, such as datasets, decision tree classifiers, StandardScaler, and GridSearchCV, from different libraries.

Let’s Start

We use the Wine dataset to train a Support Vector Classifier.

  • Here is dataset information:

Input variables ( based on physicochemical tests ):

1. Alcohol

2. Malic acid

3. Ash

4. Alkalinity of ash

5. Magnesium

6. Total phenols

7. Flavanoids

8. Nonflavanoid phenols

9. Proanthocyanins

10. Color Intensity

11. Hue

12. od280/od315_of_diluted_wines

13. Proline

Libraries used –

  • Pandas
  • Numpy
  • Matplotlib
  • Seaborn
  • Sklearn (scikit-learn), which provides GridSearchCV

Now, load the Wine data from scikit-learn’s built-in datasets.

Next comes data pre-processing, a core part of every data scientist’s work. We first inspect the dataset information using the DESCR attribute, which holds the dataset description. It shows the attribute information and the target column.

In every machine learning model, we first separate our input and output variables, let’s say X and y, respectively.
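The loading and inspection steps above might look like the following sketch. The variable names (wine, df, X, y) and the decision to print only the first 500 characters of the description are assumptions made for this example.

```python
import pandas as pd
from sklearn.datasets import load_wine

# Load the built-in Wine dataset.
wine = load_wine()

# DESCR is a string attribute holding the dataset description.
print(wine.DESCR[:500])

# Put the features into a DataFrame and add the target column.
df = pd.DataFrame(wine.data, columns=wine.feature_names)
df["target"] = wine.target
print(df.head())

# Separate input variables (X) from the output variable (y).
X = df.drop(columns="target")
y = df["target"]
print(X.shape, y.shape)
```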

To understand how each feature relates to the output, we use the seaborn and matplotlib libraries for visualization. First, we use boxplots to examine the relation between the features and the output.

Let’s take one of the features as an example:

Image Source: Image created by the author.

In this boxplot, we easily see a linear relation between alcalinity_of_ash and the wine class.

Another example :

Image Source: Image created by the author.

In this boxplot, we see three outliers, and the wine class changes as total_phenols decreases.

So, our SVM model might assign more importance to the features that vary linearly with the output.
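A per-class boxplot like the ones discussed above can be sketched as follows. Note the swap: the article uses seaborn, while this example uses plain matplotlib to keep the dependencies small (seaborn's boxplot function would produce an equivalent figure). The non-interactive backend and the variable names are assumptions for the example.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend so the script runs without a display
import matplotlib.pyplot as plt
import numpy as np
from sklearn.datasets import load_wine

wine = load_wine()
feature = "alcalinity_of_ash"
col = wine.feature_names.index(feature)

# One array of feature values per wine class (0, 1, 2).
groups = [wine.data[wine.target == c, col] for c in range(3)]

fig, ax = plt.subplots()
ax.boxplot(groups)
ax.set_xticks([1, 2, 3])
ax.set_xticklabels(wine.target_names)
ax.set_xlabel("wine class")
ax.set_ylabel(feature)
ax.set_title(f"{feature} by wine class")

# Medians per class give a numeric view of the trend seen in the plot.
medians = [float(np.median(g)) for g in groups]
print("median per class:", medians)
```

When working interactively, drop the matplotlib.use("Agg") line and call plt.show() to display the figure.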

To see how our data is distributed, we use the matplotlib Python library to plot histograms. Let’s see an example:


Image Source: Image created by the author.

The malic_acid feature follows a right-skewed (positively skewed) distribution: most values are concentrated at the low end, with a long tail to the right.
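A histogram of this feature, together with a quick numeric check of the skew, could be sketched like this. The bin count of 20 and the non-interactive backend are assumptions; comparing the mean with the median is only a rough indicator of skew direction.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend so the script runs without a display
import matplotlib.pyplot as plt
import numpy as np
from sklearn.datasets import load_wine

wine = load_wine()
malic_acid = wine.data[:, wine.feature_names.index("malic_acid")]

fig, ax = plt.subplots()
counts, bin_edges, _ = ax.hist(malic_acid, bins=20)
ax.set_xlabel("malic_acid")
ax.set_ylabel("count")
ax.set_title("Distribution of malic_acid")

# A mean above the median suggests a tail on the right-hand side.
mean_value = float(malic_acid.mean())
median_value = float(np.median(malic_acid))
print(f"mean = {mean_value:.2f}, median = {median_value:.2f}")
```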

Train Test Split:

We split the data into a training dataset and a testing dataset in a 70%/30% ratio using the train_test_split function from sklearn’s model_selection module. We first train our model on the training dataset and then test its accuracy on the testing dataset.
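A 70/30 split as described above might be written as follows. The random_state value is an assumption added so the split is reproducible; the article does not state one.

```python
from sklearn.datasets import load_wine
from sklearn.model_selection import train_test_split

X, y = load_wine(return_X_y=True)

# Hold out 30% of the samples for testing; random_state is an assumed seed.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, random_state=42)

print("training samples:", X_train.shape[0])
print("testing samples:", X_test.shape[0])
```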

Train the Support Vector Classifier without Hyperparameter Tuning :

Now, we train our machine learning model. Because this is a classification problem, we import the Support Vector Classifier (SVC) from sklearn’s svm module.
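A baseline model with default hyperparameters could be trained along these lines. This is a sketch, not the author's exact code: the seed is an assumption, and the resulting accuracy depends on the split.

```python
from sklearn.datasets import load_wine
from sklearn.model_selection import train_test_split
from sklearn.svm import SVC

X, y = load_wine(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, random_state=42)  # assumed seed

# SVC with its default settings (RBF kernel), no tuning yet.
baseline_model = SVC()
baseline_model.fit(X_train, y_train)

# score() returns the mean accuracy on the given data.
baseline_accuracy = baseline_model.score(X_test, y_test)
print(f"baseline test accuracy: {baseline_accuracy:.3f}")
```

This number serves as the reference point that the tuned model should improve on.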

Parameter for gridsearchcv:

The parameter grid passed to Grid Search can be a list that contains Python dictionaries. In each dictionary, the key is the name of a parameter and the value is a list of the values to try for that parameter. Together, these can be viewed as a table of parameter combinations. Grid Search also takes the model object (here, a support vector classifier), a scoring method chosen from the available classification performance metrics, and the number of cross-validation folds. Its outputs include the scores for the different parameter values and the parameter values that achieve the best score.
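To see the "table" of combinations that such a grid expands to, scikit-learn's ParameterGrid helper can enumerate it. The specific values for C, gamma, and kernel below are assumptions chosen for illustration, not the grid used in the article.

```python
from sklearn.model_selection import ParameterGrid

# A list containing one dictionary: parameter name -> list of values to try.
param_grid = [
    {"C": [0.1, 1, 10, 100], "gamma": [1, 0.1, 0.01, 0.001], "kernel": ["rbf"]},
]

# ParameterGrid expands the grid into every combination of values.
combinations = list(ParameterGrid(param_grid))
print("number of combinations:", len(combinations))
for params in combinations[:3]:
    print(params)
```

Each combination is trained once per cross-validation fold, so the total number of fits grows quickly as values are added.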

Now comes the main part: hyper-parameter tuning. First, a hyper-parameter is a parameter whose value is used to control the learning process, and hyper-parameter tuning means choosing optimal values for such parameters. To accomplish this task, we use GridSearchCV, a class in sklearn’s model_selection module. It loops through the predefined hyper-parameter values and fits your estimator (such as SVC) on our training set.
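Since the original snippet is not available, here is a hedged sketch of what setting up and fitting GridSearchCV might look like. The grid values, the choice of 5 folds, the accuracy scoring, and the random seed are all assumptions.

```python
from sklearn.datasets import load_wine
from sklearn.model_selection import GridSearchCV, train_test_split
from sklearn.svm import SVC

X, y = load_wine(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, random_state=42)  # assumed seed

# Assumed grid: 4 values of C x 4 values of gamma with an RBF kernel.
param_grid = {
    "C": [0.1, 1, 10, 100],
    "gamma": [1, 0.1, 0.01, 0.001],
    "kernel": ["rbf"],
}

# cv=5 runs 5-fold cross-validation for every combination in the grid.
grid = GridSearchCV(SVC(), param_grid, cv=5, scoring="accuracy")
grid.fit(X_train, y_train)

n_candidates = len(grid.cv_results_["params"])
print("combinations evaluated:", n_candidates)
```

Only the training set is passed to fit(); the test set stays untouched until the final evaluation.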

Finding the best hyperparameters using GridSearchCV:

We fit the GridSearchCV object. We can then find the best values for the parameters using the best_params_ attribute, and the model refitted with those values using the best_estimator_ attribute.

To measure the accuracy, we use the score() method.
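Reading the results off a fitted search could look like the sketch below. The small grid and the seed are assumptions, and the accuracy you get will depend on them; it is not meant to reproduce the figure reported in the article.

```python
from sklearn.datasets import load_wine
from sklearn.model_selection import GridSearchCV, train_test_split
from sklearn.svm import SVC

X, y = load_wine(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, random_state=42)  # assumed seed

# Small assumed grid to keep the example quick.
param_grid = {"C": [1, 10, 100], "gamma": [0.01, 0.001], "kernel": ["rbf"]}
grid = GridSearchCV(SVC(), param_grid, cv=5)
grid.fit(X_train, y_train)

# Best parameter combination and its mean cross-validated score.
print("best parameters:", grid.best_params_)
print(f"best cross-validation score: {grid.best_score_:.3f}")

# best_estimator_ is the model refitted on the whole training set.
best_model = grid.best_estimator_

# score() on the fitted search delegates to the best estimator.
test_accuracy = grid.score(X_test, y_test)
print(f"test accuracy: {test_accuracy:.3f}")
```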

With the SVC model and GridSearchCV, the final accuracy we get is 90%.



Conclusion:

In this post, we analyzed the Wine dataset (a preloaded dataset included with scikit-learn). Pandas, Seaborn, and Matplotlib were used to organize and plot the data, revealing that several of the features naturally separate into classes. Classifiers were trained and tested using the train/test split paradigm. We have now learned how to work with SVMs and how to tune their hyper-parameters.

Thanks for reading.

