Contrary to what many believe, the machine learning model with the best performance is not necessarily the best solution.
In Kaggle competitions, performance is all you need. In real-life situations, it’s just another factor to consider.
Let’s start with the model’s performance and revisit some of the other considerations to keep in mind when selecting a model to solve a problem.
1. Performance

The quality of the model’s results is a fundamental factor to take into account when choosing a model. You want to prioritize algorithms that maximize that performance.
Depending on the problem, different metrics could be useful to analyze the results of the model. For example, some of the most popular metrics include accuracy, precision, recall, and f1-score.
Keep in mind that not every metric works in every situation. For example, accuracy is not appropriate when working with imbalanced datasets. Selecting a good metric (or set of metrics) to evaluate your model’s performance is a crucial step before starting the model selection process.
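To see why accuracy breaks down on imbalanced data, here is a minimal sketch (the dataset and the always-predict-the-majority "model" are made up for illustration) that computes accuracy, precision, recall, and F1 by hand:

```python
# Sketch: why accuracy misleads on imbalanced datasets.
# A "model" that always predicts the majority class (0) scores high
# accuracy but is useless by precision, recall, and F1.

def accuracy(y_true, y_pred):
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)

def precision_recall_f1(y_true, y_pred, positive=1):
    tp = sum(t == positive and p == positive for t, p in zip(y_true, y_pred))
    fp = sum(t != positive and p == positive for t, p in zip(y_true, y_pred))
    fn = sum(t == positive and p != positive for t, p in zip(y_true, y_pred))
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if precision + recall else 0.0)
    return precision, recall, f1

# Hypothetical data: 95 negatives, 5 positives; model predicts all zeros.
y_true = [0] * 95 + [1] * 5
y_pred = [0] * 100

print(accuracy(y_true, y_pred))             # 0.95 -- looks impressive
print(precision_recall_f1(y_true, y_pred))  # (0.0, 0.0, 0.0) -- useless
```

The 95% accuracy here measures the class imbalance, not the model; precision and recall on the minority class expose the failure immediately.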
2. Explainability

In many situations, explaining the results of a model is paramount. Unfortunately, many algorithms work like black boxes, and the results are hard to explain regardless of how good they are.
The lack of explainability may be a deal-breaker in those situations.
Linear Regression and Decision Trees are good candidates when explainability is an issue. Neural networks, not so much.
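To illustrate why Linear Regression is considered explainable, here is a small sketch (the data is invented for illustration) that fits a one-feature model with the closed-form least-squares solution; the explanation is simply the fitted coefficient:

```python
# Sketch: a linear model's parameters are directly readable.
# Fit y = slope * x + intercept by ordinary least squares (one feature).

xs = [1.0, 2.0, 3.0, 4.0, 5.0]       # hypothetical feature values
ys = [30.0, 35.0, 40.0, 45.0, 50.0]  # hypothetical targets

n = len(xs)
mean_x = sum(xs) / n
mean_y = sum(ys) / n
slope = (sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
         / sum((x - mean_x) ** 2 for x in xs))
intercept = mean_y - slope * mean_x

# The explanation falls out of the model itself:
print(f"each extra unit of x adds {slope:.1f} to the prediction")  # 5.0
print(f"baseline prediction at x=0 is {intercept:.1f}")            # 25.0
```

A neural network fitting the same data would hide this relationship inside thousands of weights, which is exactly the explainability gap the section describes.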
Understanding how easy it is to interpret the result of each model is important before picking a good candidate.
Interestingly, explainability and complexity usually sit at opposite ends of the spectrum, so let’s tackle complexity next.
3. Complexity

A complex model can find more interesting patterns in the data, but at the same time, it will be harder to maintain and explain.
A couple of loose generalizations to keep in mind:
- More complexity can lead to better performance but also larger costs.
- Complexity is inversely proportional to explainability. The more complex the model is, the harder it will be to explain its results.
Putting explainability aside, the cost of building and maintaining a model is a crucial factor for a successful project. A complex setup will keep adding costs throughout the model’s entire lifecycle.
4. Dataset size
The amount of training data available is one of the main factors you should consider when choosing a model.
A Neural Network is really good at processing and synthesizing tons of data. A KNN (K-Nearest Neighbors) model is much better with fewer examples.
Going beyond the amount of available data, a related consideration is how much of it you truly need to achieve good results. Sometimes you can build a great solution with 100 training examples; sometimes, you need 100,000.
Use this information about your problem and the amount of data to choose a model that’s capable of processing it.
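As a concrete example of a model that suits small datasets, here is a minimal K-Nearest Neighbors sketch in pure Python (the points and labels are invented). KNN has no training step at all, which is part of why it can work with only a handful of examples:

```python
# Sketch: a minimal k-nearest-neighbors classifier.
# KNN stores the training data as-is and classifies a query by
# majority vote among its k closest training examples.
import math
from collections import Counter

def knn_predict(train, query, k=3):
    # train: list of (feature_vector, label) pairs
    neighbors = sorted(train, key=lambda ex: math.dist(ex[0], query))[:k]
    votes = Counter(label for _, label in neighbors)
    return votes.most_common(1)[0][0]

# Six labeled points are already enough to classify nearby queries.
train = [([1, 1], "a"), ([1, 2], "a"), ([2, 1], "a"),
         ([8, 8], "b"), ([8, 9], "b"), ([9, 8], "b")]

print(knn_predict(train, [1.5, 1.5]))  # "a"
print(knn_predict(train, [8.5, 8.5]))  # "b"
```

A neural network with millions of parameters would hopelessly overfit six examples; KNN just memorizes them and votes.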
5. Dimensionality

It’s useful to look at dimensionality in two different ways: the vertical size of a dataset represents the amount of data we have, while the horizontal size represents the number of features.
We already discussed how the vertical dimension influences the selection of a good model. It turns out that the horizontal dimension is also important: more features often help your model find better solutions, but they also increase its complexity.
The Curse of Dimensionality is a great introduction to understand how dimensionality affects the complexity of a model.
As you might imagine, not every model scales the same with high-dimensional datasets. We may also need to introduce specific dimensionality reduction algorithms when high-dimensional datasets become a problem. PCA is one of the most popular algorithms for this.
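One symptom of the Curse of Dimensionality can be demonstrated in a few lines: as the number of features grows, the distances between random points concentrate, so the nearest and farthest points end up almost equally far away. This sketch (dimensions and point counts chosen arbitrarily) measures that effect, which is one reason distance-based models like KNN degrade in high dimensions:

```python
# Sketch: distance concentration in high-dimensional spaces.
# The ratio between the farthest and nearest random point shrinks
# toward 1 as dimensionality grows, making "nearest" less meaningful.
import math
import random

random.seed(0)

def distance_spread(dim, n_points=200):
    # Sample random points in the unit hypercube and measure their
    # distances to the cube's center.
    pts = [[random.random() for _ in range(dim)] for _ in range(n_points)]
    center = [0.5] * dim
    dists = [math.dist(p, center) for p in pts]
    return max(dists) / min(dists)

for dim in (2, 10, 100, 1000):
    print(dim, round(distance_spread(dim), 2))  # ratio shrinks as dim grows
```

In two dimensions the farthest point can be many times farther than the nearest; in a thousand dimensions the ratio is close to 1, and dimensionality reduction techniques like PCA become attractive.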
6. Training time and cost
How long does it take, and how much does it cost, to train a model? Would you choose a 98%-accurate model that costs $100,000 to train or a 97%-accurate model that costs $10,000?
Of course, the answer to this question depends on your individual circumstances.
Models that need to incorporate new knowledge in near real-time can’t afford long training cycles. For example, a recommendation system that needs to be constantly updated with every user’s action benefits from an inexpensive training cycle.
Balancing time, costs, and performance is crucial when designing a scalable solution.
7. Inference time
How long does it take to run a model and make a prediction?
Imagine a self-driving system: it needs to make decisions in real-time, so any model that takes too long to run can’t be considered.
For example, most of the processing needed to make predictions with KNN happens at inference time, since every prediction scans the training data. This makes it expensive to run. A Decision Tree, in contrast, does its heavy lifting during training and is much lighter at inference time.
Many people fixate on their favorite model, usually the one they know best and that gave them good results on their last project.
But there’s no free lunch in machine learning. There’s no single model that works in every situation, especially when we consider the constraints of real-life systems.
Understanding these different considerations when choosing a model is critical to ensuring a successful project. To recap, here is the list we just discussed:
- Performance of the model
- Explainability of results
- Model’s complexity
- The size of the dataset
- The dimensionality of the data
- Training time and cost
- Inference time