Machine Learning Theory and Programming: Supervised Learning for Multiclass Classification




Multiclass Classification Programming

There are machine-learning-specific programming languages, such as MATLAB, Octave, and R. Some general-purpose programming languages, such as Python, are supplied with machine learning libraries. It is highly recommended not to reinvent the wheel: we can use machine-learning-specific languages or machine learning libraries to solve multiclass classification problems.

MATLAB is a proprietary multi-paradigm programming language and numeric computing environment developed by MathWorks. It provides built-in functions for classification:

  • fitcsvm trains an SVM model for one-class and two-class (binary) classification on a low-dimensional or moderate-dimensional predictor dataset.
  • fitcecoc trains a multiclass model for SVM or other classifiers. ClassificationECOC is an error-correcting output codes (ECOC) classifier for multiclass learning, where the classifier consists of multiple binary learners.
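As a minimal illustration of the difference between the two functions (a sketch assuming the Statistics and Machine Learning Toolbox is installed), fitcsvm handles a binary problem while fitcecoc handles all three iris species at once:

```matlab
load fisheriris                        % 150 iris samples, 3 species
X = meas(:, 3:4);                      % petal length and petal width
isSetosa = strcmp(species, 'setosa');  % logical labels for a binary task

% fitcsvm: one binary SVM (setosa vs. the rest)
binaryMdl = fitcsvm(X, isSetosa);

% fitcecoc: a multiclass ECOC model built from binary SVM learners
multiMdl = fitcecoc(X, species);
```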

Three Binary Classifiers

This program uses three binary classifiers to solve the iris classification problem. It plots the predicted regions in different colors, and the inputs in the iris training set are displayed on top of the regions for validation.
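The author's original listing is not reproduced in this copy of the article. The following is a sketch of an equivalent function, adapted from MathWorks' documentation example for one-vs-all SVM classification; its line numbers only approximate the ones referenced in the walkthrough below.

```matlab
function ThreeBinaryClassifiers
% Train one binary SVM per iris species and plot the predicted regions.
load fisheriris
X = meas(:, 3:4);                          % petal length and petal width
y = species;                               % cell array of class labels

classes = unique(y);                       % {'setosa'; 'versicolor'; 'virginica'}
SVMModels = cell(numel(classes), 1);
for j = 1:numel(classes)
    indx = strcmp(y, classes(j));          % current class as the positive class
    SVMModels{j} = fitcsvm(X, indx, 'ClassNames', [false true], ...
        'Standardize', true, 'KernelFunction', 'gaussian');
end

d = 0.02;                                  % grid density
[x1Grid, x2Grid] = meshgrid(min(X(:,1)):d:max(X(:,1)), ...
                            min(X(:,2)):d:max(X(:,2)));
xGrid = [x1Grid(:), x2Grid(:)];
Scores = zeros(size(xGrid, 1), numel(classes));
for j = 1:numel(classes)
    [~, score] = predict(SVMModels{j}, xGrid);
    Scores(:, j) = score(:, 2);            % positive-class score
end

[~, maxScore] = max(Scores, [], 2);        % winning classifier per grid point
figure
gscatter(xGrid(:,1), xGrid(:,2), maxScore, 'cmy');   % predicted regions
hold on
gscatter(X(:,1), X(:,2), y);               % training set on top, for validation
title('Iris Classification Regions');
xlabel('Petal length');
ylabel('Petal width');
legend('Location', 'northwest');
axis tight
hold off
end
```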

Line 1 specifies the function name, ThreeBinaryClassifiers.

Lines 3–5 load the dataset, fisheriris, and assign data to X and y. X takes the 3rd and 4th columns of the iris data, which are petal length and petal width. y is a cell array copied from species.

Lines 8–13 train 3 binary classifiers, SVMModels. There are 3 unique classes: 'setosa', 'versicolor', and 'virginica', so numel(classes) returns 3. Each classifier takes its own class as the positive class, and fitcsvm trains each SVM model with the 'gaussian' kernel function.

Lines 16–26 generate every point on the 2-D grid, xGrid. The density is defined by d (line 16). meshgrid(x, y) returns 2-D grid coordinates based on the coordinates contained in vectors x and y. For each grid point, the program computes the score from each of the 3 binary classifiers and stores the results in Scores.

Lines 29–31 color each point on the 2-D grid. Each point takes the color of the classifier that yields the largest score: cyan ('c') for 'setosa', magenta ('m') for 'versicolor', and yellow ('y') for 'virginica'. max(Scores, [], 2) at line 29 returns a column vector containing the maximum value of each row. Line 31 calls hold on to retain plots in the current axes so that new plots added to the axes do not delete existing plots.
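For example, max with dimension argument 2 works row-wise, returning both the maximum and the index of the classifier that produced it:

```matlab
Scores = [0.9 0.1 0.2;
          0.3 0.8 0.4];
[m, idx] = max(Scores, [], 2);
% m   = [0.9; 0.8]   -- maximum score in each row
% idx = [1; 2]       -- which classifier won each row
```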

Line 34 plots the iris training set, X, on top of the regions for validation.

Lines 37–42 draw the title, axis labels, and legend.

Line 43 sets the axis limits to the range of data.

Line 44 calls hold off to set the hold state to off.

Line 45 terminates the function.

Here is the generated graph with predicted regions:

Image by Author

Multiclass Classifiers

As we have said, try not to reinvent the wheel. The built-in function fitcecoc trains multiclass classifiers. Although the resulting model consists of multiple binary learners, we should still prefer this built-in model for multiclass classification. It can also produce class posterior probabilities (conditional probabilities) that show how likely each prediction is.

The following program is shorter and provides a more sophisticated result.
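As with the first program, the original listing is not reproduced here. The following sketch, adapted from MathWorks' fitcecoc documentation example, approximates the function the walkthrough below describes; its line numbers may not match the author's listing exactly.

```matlab
function MulticlassClassifier
% Train one ECOC multiclass model and plot the maximum posterior regions.
load fisheriris
X = meas(:, 3:4);                          % petal length and petal width
y = species;                               % cell array of class labels

t = templateSVM('KernelFunction', 'gaussian');   % SVM learner template
Mdl = fitcecoc(X, y, 'Learners', t, ...
    'FitPosterior', true, ...              % scores -> posterior probabilities
    'Verbose', 2);                         % print training diagnostics

d = 0.02;                                  % grid density
[x1Grid, x2Grid] = meshgrid(min(X(:,1)):d:max(X(:,1)), ...
                            min(X(:,2)):d:max(X(:,2)));
xGrid = [x1Grid(:), x2Grid(:)];
[~, ~, ~, PosteriorRegion] = predict(Mdl, xGrid);

% Filled contours of the maximum class posterior at each grid point
contourf(x1Grid, x2Grid, ...
    reshape(max(PosteriorRegion, [], 2), size(x1Grid)));
h = colorbar('southoutside');
h.Label.String = 'Maximum posterior';
hold on
gscatter(X(:,1), X(:,2), y);               % training set on top, for validation
title('Iris Petal Measurements and Maximum Posterior');
xlabel('Petal length');
ylabel('Petal width');
legend('Location', 'northwest');
axis tight
hold off
end
```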

Line 1 specifies the function name, MulticlassClassifier.

Lines 3–5 load the dataset, fisheriris, and assign data to X and y. X takes the 3rd and 4th columns of the iris data, which are petal length and petal width. y is a cell array copied from species.

Lines 8–9 train the multiclass classifier, Mdl. Line 8 creates an SVM learner template with templateSVM, which is suitable for training multiclass models. Line 9 trains the multiclass classifier and transforms classification scores to class posterior probabilities. The 'Verbose' level is set to 2 to display diagnostic information about the training process.

Lines 12–15 generate every point on the 2-D grid, xGrid. The density is defined by d (line 12). meshgrid(x, y) returns 2-D grid coordinates based on the coordinates contained in vectors x and y. For each point, it predicts the posterior probability. The result is captured in PosteriorRegion (line 15).

Lines 18–22 plot the maximum posterior regions. contourf creates a filled contour plot whose isolines connect points of equal class posterior probability; the probabilities are reshaped to match the grid coordinates. Line 20 creates a color bar that maps colors to posterior values; the bar is labeled and positioned at 'southoutside'. Line 22 calls hold on to retain plots in the current axes so that new plots added to the axes do not delete existing plots.

Line 25 plots the iris training set on top of the regions for validation.

Lines 28–31 draw the title, axis labels, and legend.

Line 32 sets the axis limits to the range of data.

Line 33 calls hold off to set the hold state to off.

Line 34 terminates the function.

Here is the generated graph that shows maximum posterior distribution:

Image by Author

Training also prints verbose console output for the 3 binary learners.
