Transforming a non-stationary series to make it stationary
The simplest method for transforming non-stationary data is differencing, which takes the differences between consecutive observations. Pandas has a diff function to do this, and it can be used to produce first-, second-, and third-order differences.
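As a minimal sketch of differencing with pandas, the example below uses a hypothetical toy series (the squares of 0 to 7), not the data from this article. Note that higher-order differencing means differencing the result again; ts.diff(periods=2) instead computes a lag-2 difference, y[t] - y[t-2], which is a different operation.

```python
import pandas as pd

# Hypothetical toy series with a quadratic trend (t squared); an assumption for illustration.
ts = pd.Series([t ** 2 for t in range(8)], dtype="float64")

diff_1 = ts.diff()                # first-order: y[t] - y[t-1]
diff_2 = ts.diff().diff()         # second-order: difference of the first differences
diff_3 = ts.diff().diff().diff()  # third-order: difference of the second differences

result = pd.DataFrame(
    {"ts": ts, "diff_1": diff_1, "diff_2": diff_2, "diff_3": diff_3}
)
print(result)
```

Each round of differencing loses one observation at the start (shown as NaN), and for this quadratic toy series the second-order difference is already constant.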
For simple series, taking the first-order difference is often enough to make them stationary. Let's check this by using the adfuller function on diff_1, the first-order difference of the series.
When we ran adfuller on the original series ts2, the p-value was close to 1. After differencing, the reported p-value is effectively 0, so we reject the null hypothesis of a unit root and conclude that the differenced series is stationary.
However, some series are not so easy to deal with. Consider this one we saw earlier:
Before taking the difference, we have to account for that obvious non-linear trend. Otherwise, the series will still be non-stationary.
To remove the non-linearity, we apply the logarithm with np.log and then take the first-order difference:
As you can see, the series that returned a p-value of 1 before the transformation now tests as stationary.
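To see why the log transform helps, consider a hypothetical series that grows by a fixed 5% per step (an assumption for illustration, not the article's data). Its raw differences keep growing, but the differences of its logarithm are constant, because log(a * r**t) = log(a) + t * log(r) is linear in t.

```python
import numpy as np
import pandas as pd

# Hypothetical exponentially growing series: 5% growth per step.
t = np.arange(50)
growth = pd.Series(100.0 * 1.05 ** t)

# Plain differencing leaves a trend: the steps themselves keep growing.
raw_diff = growth.diff().dropna()

# Log first, then difference: every step equals log(1.05).
log_diff = np.log(growth).diff().dropna()

print("raw differences (first, last):", raw_diff.iloc[[0, -1]].tolist())
print("log differences (first, last):", log_diff.iloc[[0, -1]].tolist())
```

Real data will not be perfectly exponential, so the log-differenced series will fluctuate rather than stay constant, and a test such as adfuller is still needed to check the result.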
Below is the plot of monthly antibiotics sales in Australia:
As you can see, the series shows both an upward trend and strong seasonality. We will again apply a log transform and, this time, take a yearly difference to remove the seasonality. Because the data is monthly, a yearly difference means a lag of 12 periods (for daily data it would be 365).
Here is what each step looks like:
We can confirm the stationarity with adfuller. The p-value is extremely small, which indicates that the transformation steps had the intended effect.
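The two steps can be sketched on synthetic data. The series below is a hypothetical monthly series built from an exponential trend multiplied by a repeating 12-month pattern; it is not the antibiotics dataset, and the names are illustrative. After the log transform the trend and seasonality become additive, and diff(12) subtracts the value from the same month one year earlier.

```python
import numpy as np
import pandas as pd

# Hypothetical monthly series: exponential trend times a 12-month seasonal pattern.
months = pd.date_range("2010-01-01", periods=72, freq="MS")
t = np.arange(72)
seasonal = 1.0 + 0.3 * np.sin(2 * np.pi * t / 12)
sales = pd.Series(10.0 * np.exp(0.01 * t) * seasonal, index=months)

# Step 1: the log turns the multiplicative trend and seasonality into additive terms.
logged = np.log(sales)

# Step 2: seasonal difference, y[t] - y[t-12]; the first 12 values become NaN.
seasonal_diff = logged.diff(12).dropna()

print(seasonal_diff.describe())
```

In this noise-free toy case the result is a constant (12 times the monthly growth rate), which shows that both the trend and the seasonal pattern were removed. With real data, the transformed series would be passed to adfuller to check stationarity.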
In general, every series is different, and to achieve stationarity you might end up chaining multiple operations. Most of these involve taking logarithms, first- or second-order differences, or seasonal differences.