Improve Linear Regression for Time Series Forecasting*S60F_S8Sslf3TOpZ

Original Source Here

Time series forecasting is a very fascinating task. However, build a machine-learning algorithm to predict future data is trickier than expected. The hardest thing to handle is the temporal dependency present in the data. By their nature, time-series data are subject to shifts. This may result in temporal drifts of various kinds which may become our algorithm inaccurate.

One of the best tips I recommend, when modeling a time series problem, is to stay simple. Most of the time the simpler solutions are the best ones in terms of accuracy and adaptability. They are also easier to maintain or embed and more persistent to possible data shifts. In this sense, the gold standard for time series modeling consists in the adoption of linear-based algorithms. They require few assumptions and simple data manipulations to produce satisfactory results.

In this post, we carry out a sales forecasting task. We provide future forecasts using the standard linear regression and an improved version of it. We are referring to linear trees. They belong to the family of model trees, like classical decision trees, but are different because they compute linear approximations (instead of constant ones) fitting simple linear models in the leaves. The training is computed evaluating the best partitions on the data fitting multiple linear models. The final model is a tree-based structure with linear models in the leaves.

A pythonic implementation of linear trees is available in linear-tree: a python library to build Model Trees with Linear Models at the leaves. The package is fully integrable with sklearn. It provides simple BaseEstimators that wrap every linear model present in sklearn.linear_model to build an optimal linear tree.


For our experiment, we simulate some data that replicate store sales. We generate artificial sales history of 400 stores. Where store sales are influenced by many factors, including promotions, holidays, and seasonality. Our duty is to forecast the future daily sales for up to one year in advance.

We don’t use any past information when we provide our predictions. We engineer all our regressors to be accurate and accessible in any future time period. This enables us to provide long-time forecasts.

We have 400 stores with historical sales from different years with visible daily seasonality.

Series of simulated store sales (image by the author)


We aim to predict sales up to a year ahead for all the stores at our disposal. To make this possible, we build a different model at the store level. We operate parameters tuning on the training set for each store independently. We end with 400 models trained ad-hoc. The procedure is computed fitting simple linear regressions and linear trees. In this way, we can verify the goodness of 400 independent fits by comparing the performances on the unseen test data for both model types.

Training a linear tree is simple as training a standard linear regression. When we train a linear tree, we are simply fitting multiple linear models on different data partitions. We can clearly understand that we benefit from the same advantages, i.e. simple preprocessing and good explicative power.

Inspecting the tree paths of the various fits, we can understand the decision taken by the algorithm. In the case of linear trees, we identify which features are responsible to split the original dataset and provide the benefit of further fits.


Trending AI/ML Article Identified & Digested via Granola by Ramsey Elbasheer; a Machine-Driven RSS Bot

%d bloggers like this: