Multi-Label Classification with Deep Learning




Introduction

Multi-label classification is a supervised learning problem in which one instance can be associated with several labels. This differs from most single-label classification (multi-class or binary) tasks, which assign exactly one class label to each instance. The multi-label setting is becoming more common and applies to a wide range of domains, including text, audio, images, video, and bioinformatics.

Deep learning neural networks are well suited to multi-label classification. The Keras deep learning library makes it simple to define and evaluate neural network models for multi-label classification problems.

In this article, you’ll learn how to create multi-label classification deep learning models.

Multi-Label Classification

The most common technique for multi-label classification is to train a separate classifier for each label. Consider the binary relevance (BR) transformation: it reduces a multi-label problem to one binary problem per label, and any off-the-shelf binary classifier is applied to each of these problems independently. Much of the multi-label literature notes that the method's main limitation is its lack of explicit modeling of label interactions, and proposes ways to account for these dependencies.
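As a quick illustration (a sketch, not code from the original article), binary relevance can be expressed with scikit-learn's MultiOutputClassifier, which fits one independent binary classifier per label:

# binary relevance: one independent logistic regression per label
from sklearn.datasets import make_multilabel_classification
from sklearn.linear_model import LogisticRegression
from sklearn.multioutput import MultiOutputClassifier

x, y = make_multilabel_classification(n_samples=500, n_features=5, n_classes=3, n_labels=2, random_state=1)
clf = MultiOutputClassifier(LogisticRegression()).fit(x, y)
# one 0/1 prediction per label for each sample
print(clf.predict(x[:3]))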

Some classification problems require the prediction of several class labels, meaning that class labels and class membership are not mutually exclusive. Such tasks are called multi-label classification.

In multi-label classification, zero or more labels are produced as output for each input sample, and all outputs are predicted concurrently. The output labels are assumed to be a function of the inputs. For example, a sample might receive the label vector [1, 0, 1], indicating that the first and third labels are present and the second is not.

The scikit-learn library's make_multilabel_classification() function can be used to produce a synthetic multi-label classification dataset. There will be 500 samples in our dataset, each with 5 input features. For each sample, the dataset will contain 3 class label outputs, with one of two values for each class (0 or 1, i.e., present or not present).

Below is a detailed example of how to create and summarize a synthetic multi-label classification dataset.

# Example of a multi-label classification dataset
from sklearn.datasets import make_multilabel_classification
# define the dataset
x, y = make_multilabel_classification(n_samples=500, n_features=5, n_classes=3, n_labels=2, random_state=1)
# summarize the dataset shape
print(x.shape, y.shape)
# summarize the first few examples
for i in range(5):
    print(x[i], y[i])

We can see that there are 500 samples, each with 5 input features and 3 output labels, as expected.

The first 5 rows of inputs and outputs are summarized, and we can see that all of the inputs for this dataset are numeric and that each of the three output class labels has a 0 or 1 value.

Output

(500, 5) (500, 3)

Neural Networks for Multi-Label Classification

Multi-label classification is supported out of the box by several machine learning algorithms. Depending on the characteristics of the classification problem, neural network models can also be built to handle multi-label classification and perform well.

Neural networks support multi-label classification directly: simply set the number of nodes in the output layer to the number of target labels in the problem. A task with three output labels (classes), for example, requires a neural network output layer with three nodes.

Every node in the output layer must use the sigmoid activation. This produces, for each label, a probability of class membership between 0 and 1. Finally, the model must be fit with the binary cross-entropy loss function.

To summarize, when building a neural network model for multi-label classification, the following details must be taken into account (see the sketch after this list):

  1. The number of nodes in the output layer matches the number of labels.
  2. Each node in the output layer uses the sigmoid activation.
  3. The model is fit with the binary cross-entropy loss function.
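Here is a minimal Keras sketch of this output configuration (the hidden layer size and input dimension are illustrative, not prescriptive):

# output layer sized to the number of labels, with one sigmoid node per label
from keras.models import Sequential
from keras.layers import Dense

n_labels = 3
model = Sequential()
model.add(Dense(10, input_dim=5, activation='relu'))
model.add(Dense(n_labels, activation='sigmoid'))  # one probability per label
model.compile(loss='binary_crossentropy', optimizer='adam')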

Multi-Label Classification Using a Neural Network

If the dataset is fairly small, it's a good idea to train and evaluate neural network models several times on the same dataset and report the average performance. This is due to the stochastic nature of the learning algorithm.

It's also a good idea to use k-fold cross-validation instead of a single train/test split of the dataset to get an unbiased estimate of model performance when making predictions on new data. The procedure finishes in a reasonable amount of time only if there isn't too much data.

With this in mind, we'll evaluate the MLP model on the multi-label classification problem using repeated k-fold cross-validation with 10 folds and 3 repetitions.

Import the required packages

from numpy import std
from numpy import mean
from sklearn.metrics import accuracy_score
from sklearn.datasets import make_multilabel_classification
from sklearn.model_selection import RepeatedKFold
from keras.layers import Dense
from keras.models import Sequential

Python code

# define the dataset
def get_dataset():
    p, q = make_multilabel_classification(n_samples=500, n_features=5, n_classes=3, n_labels=2, random_state=1)
    return p, q

# define the model
def get_model(x_inputs, x_outputs):
    model = Sequential()
    model.add(Dense(10, input_dim=x_inputs, kernel_initializer='he_uniform', activation='relu'))
    model.add(Dense(x_outputs, activation='sigmoid'))
    model.compile(loss='binary_crossentropy', optimizer='adam')
    return model

# evaluate a model using repeated k-fold cross-validation
def evaluate_model(p, q):
    results = list()
    x_inputs, x_outputs = p.shape[1], q.shape[1]
    # define the evaluation procedure
    cv = RepeatedKFold(n_splits=10, n_repeats=3, random_state=1)
    # enumerate folds
    for train_ix, test_ix in cv.split(p):
        # prepare data
        p_train, p_test = p[train_ix], p[test_ix]
        q_train, q_test = q[train_ix], q[test_ix]
        # define the model
        model = get_model(x_inputs, x_outputs)
        # fit the model
        model.fit(p_train, q_train, verbose=0, epochs=100)
        # make predictions on the test set
        qhat = model.predict(p_test)
        # round probabilities to class labels
        qhat = qhat.round()
        # calculate and store accuracy
        acc = accuracy_score(q_test, qhat)
        print('>%.3f' % acc)
        results.append(acc)
    return results

# load the dataset
p, q = get_dataset()
# evaluate the model
results = evaluate_model(p, q)
# summarize performance
print('Accuracy: %.3f (%.3f)' % (mean(results), std(results)))

Output

>0.600
>0.540
>0.560
>0.540
>0.700
>0.580
>0.540
>0.660
>0.680
>0.740
>0.580
>0.680
>0.760
>0.620
>0.640
>0.620
>0.620
>0.660
>0.660
>0.720
>0.580
>0.740
>0.560
>0.660
>0.760
>0.660
>0.560
Accuracy: 0.633 (0.068)

Your results may vary due to the stochastic nature of the algorithm and evaluation procedure, as well as differences in numerical precision. Consider running the example a few times and comparing the average outcome.

The mean and standard deviation of the accuracy are reported at the end. In this case, the model achieves an accuracy of about 63.3 percent. Note that for multi-label data, scikit-learn's accuracy_score computes subset accuracy: a prediction counts as correct only if all three labels match.
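A tiny check makes this subset-accuracy behavior concrete (the label values here are illustrative):

from numpy import array
from sklearn.metrics import accuracy_score

q_true = array([[1, 1, 0], [0, 1, 1]])
q_pred = array([[1, 1, 0], [0, 1, 0]])  # second sample misses one label
print(accuracy_score(q_true, q_pred))   # 0.5: only exact row matches count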

The next example shows this by fitting the MLP model on the complete multi-label classification dataset, then calling the predict() method on the fitted model to make a prediction for a new row of data.

# make a prediction on new data with an MLP for multi-label classification
from numpy import asarray
from keras.layers import Dense
from keras.models import Sequential
from sklearn.datasets import make_multilabel_classification

# define the dataset
def get_dataset():
    p, q = make_multilabel_classification(n_samples=500, n_features=5, n_classes=3, n_labels=2, random_state=1)
    return p, q

# define the model
def get_model(x_inputs, x_outputs):
    model = Sequential()
    model.add(Dense(20, input_dim=x_inputs, kernel_initializer='he_uniform', activation='relu'))
    model.add(Dense(x_outputs, activation='sigmoid'))
    model.compile(loss='binary_crossentropy', optimizer='adam')
    return model

# load the dataset
p, q = get_dataset()
x_inputs, x_outputs = p.shape[1], q.shape[1]
# define the model
model = get_model(x_inputs, x_outputs)
# fit the model on all data
model.fit(p, q, verbose=0, epochs=100)
# make a prediction for a new row of data
row = [3, 3, 6, 7, 8]
newp = asarray([row])
yhat = model.predict(newp)
print('Prediction: %s' % yhat[0])

Output

Prediction: [0.8969829  0.9227912  0.29882562]
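These values are per-label probabilities, not hard labels. Rounding at a 0.5 threshold converts them into crisp 0/1 labels; for the output above, that marks the first two labels present and the third absent:

# convert per-label probabilities to crisp 0/1 labels
labels = yhat[0].round().astype(int)
print(labels)  # [1 1 0]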

Multi-Label Text Classification Using an Attention-Based Graph Neural Network

In Multi-Label Text Classification (MLTC), one sample can belong to many classes. Most MLTC tasks exhibit dependencies or correlations among labels, yet existing techniques frequently overlook these relationships. To capture the attentive dependency structure among the labels, a graph attention network-based model has been proposed.

The graph attention network uses a feature matrix and a correlation matrix to capture and explore the important dependencies between labels and to build classifiers for the task. The resulting classifiers are applied to sentence feature vectors obtained from a text feature extraction network (a BiLSTM), allowing end-to-end training.

Because attention lets the model assign different weights to neighboring nodes per label, it can learn the relationships among labels without modeling them explicitly. The proposed model has been validated on five real-world MLTC datasets, achieving results equal or superior to earlier state-of-the-art models.
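As a rough, self-contained illustration of the attention idea (a sketch with hypothetical correlation values, not the paper's implementation), attention weights can be derived from a label correlation matrix and used to mix label embeddings:

import numpy as np

# hypothetical co-occurrence scores for 3 labels (row = source label)
corr = np.array([[1.0, 0.8, 0.1],
                 [0.8, 1.0, 0.2],
                 [0.1, 0.2, 1.0]])
# a row-wise softmax turns correlations into attention weights over neighbor labels
attn = np.exp(corr) / np.exp(corr).sum(axis=1, keepdims=True)
# each label representation becomes a weighted mix of all label embeddings
label_emb = np.random.randn(3, 8)  # 3 labels, 8-dim embeddings (illustrative)
label_repr = attn @ label_emb
print(attn.round(2))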

Conclusion

You’ve now learned how to create deep learning models for multi-label classification. Specifically, you learned that multi-label classification is a type of predictive modeling that involves predicting zero or more mutually non-exclusive class labels, and that neural network models can be configured for multi-label classification problems. You also learned how to evaluate a multi-label classification neural network and make a prediction for new data.

Editor’s Note: Heartbeat is a contributor-driven online publication and community dedicated to providing premier educational resources for data science, machine learning, and deep learning practitioners. We’re committed to supporting and inspiring developers and engineers from all walks of life.

Editorially independent, Heartbeat is sponsored and published by Comet, an MLOps platform that enables data scientists & ML teams to track, compare, explain, & optimize their experiments. We pay our contributors, and we don’t sell ads.

If you’d like to contribute, head on over to our call for contributors. You can also sign up to receive our weekly newsletters (Deep Learning Weekly and the Comet Newsletter), join us on Slack, and follow Comet on Twitter and LinkedIn for resources, events, and much more that will help you build better ML models, faster.
