Do you really know the difference between Test and Validation Datasets?*lmMInBT5wd90p6AG

Original Source Here

Do you really know the difference between Test and Validation Datasets?

Photo by Joanna Kosinska on Unsplash

Many people don’t really know the difference between test and validation.
In Machine Learning these two words are often used improperly, but they indicate two very different things.
Even literature sometimes reverses the meaning of these terms.

When training a model the dataset is usually divided into a train set, a validation set and a test set but…why are the last two sets needed? Are they always needed? Keep reading, you will find your answers.

The definition of Validation

In simple words:

During training, Validation is the process of monitoring model’s performance when computing unseen data

While we are training a model using the train set, we need to constantly evaluate its performance using a different set. This is a must because an improvement in training data doesn’t always involve an improvement in unseen data. Literature calls this overfitting problem.
Validation set is the sample of data used during this process.

So we always have to compute the model’s metrics over the validation set, which allows us to monitor the model’s improvement (or worsening). Bad validation metrics are a signal that something is not working well. This might lead us to change something like the model’s hyperparameters.

Problem: by doing this we are indirectly forcing our model to give good results from the validation set because the choice of each hyperparameter is based on metrics obtained from this set.
This way we created a bias: our model changes were influenced by the validation metrics, so predictions from this set should have better results compared to the results obtained from another dataset which didn’t influence model tuning. We still can’t be 100% sure that our model is good.

Solution: test set.

The definition of Test

In simple words:

After training, Test is the process of checking the final model’s performance

In this phase, we compute the final model’s metrics from a new dataset, the train set. This set has never been used during model training or tuning.
This set like the validation one should be composed of samples which completely cover the model’s field of application.

Last words

In different situations, because of the lack of samples, the original dataset is only split into a train set and a validation set because the test set “steals” data which could be used for training. This way the validation set is also used as if it was a test set, but this is not the canonical procedure. When comparing your model’s results with other people’s models (especially from papers), you should use a test set to have a valid comparison.


Trending AI/ML Article Identified & Digested via Granola by Ramsey Elbasheer; a Machine-Driven RSS Bot

%d bloggers like this: