Beginner’s guide to stock prediction using LSTM


I’ll be predicting the future value of the stock of one of the prominent Indian healthcare companies. The time series forecasting is done with an LSTM (Long Short-Term Memory) model, a type of recurrent neural network used in deep learning that can learn order dependence between items in a sequence. LSTMs can learn the context required to make predictions in time series problems, rather than having that context specified beforehand.

The data chosen spans May 2016 to May 2021, as healthcare companies in India saw a tremendous boom over these 5 years. Strengthened coverage and services, increased expenditure by public and private players, and the recent pandemic gave these stocks considerable momentum. The program implemented here is pretty basic and in no way can it be used to make any significant profits; however, it should give you a fair idea of how market prediction works.

I’ll be starting off by importing the libraries necessary for this model.

Next, I extract the data from Yahoo Finance with the help of pandas-datareader. It requires the stock ticker symbol, which in this case is ‘APOLLOHOSP.NS’, along with the data source (Yahoo Finance) and the start and end dates of the period we want to base our predictions on.

First 10 rows of the healthcare company’s stock data

The dataset has the following variables: High, Low, Open, Close, Volume and Adj Close. The Date is set as our index, which matters in any time series problem because it preserves the order of the observations.

· Variables Open and Close show the starting and final price at which the stock is traded on a particular day.

· Variables High and Low represent the maximum and minimum price of the stock for the day.

· Volume is the number of shares traded during the day.

· Adj Close is the adjusted closing price of the stock: the close price amended to account for corporate actions such as dividends and stock splits. It gives us a better idea of the overall value of the stock and helps make better decisions.

We then proceed with the basic checks. First, we see whether the dataset has any null values; if any are present, we fill them with the column average. We also need to look out for the datatypes of the variables. If they vary, we convert them so that all columns share the same datatype.

Checking for null values
Variable datatypes and memory usage
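
The checks described above could look roughly like the sketch below. The tiny DataFrame and its values are made up purely for illustration (the real table comes from Yahoo Finance), and converting everything to float64 is my assumption about what "same datatype" means here.

```python
import numpy as np
import pandas as pd

# Hypothetical miniature of the price table; the values are invented.
df = pd.DataFrame(
    {
        "Open": [1300.0, 1310.5, np.nan, 1295.0],
        "Close": [1308.0, np.nan, 1301.0, 1299.5],
        "Volume": [520000, 480000, 610000, 455000],
    },
    index=pd.to_datetime(["2016-05-02", "2016-05-03", "2016-05-04", "2016-05-05"]),
)

# Count the missing values in each column.
null_counts = df.isnull().sum()

# Fill each gap with the average of its own column.
df_filled = df.fillna(df.mean())

# Give every column the same floating-point datatype (an assumption).
df_filled = df_filled.astype("float64")

print(null_counts)
print(df_filled.dtypes)
```

On real data, df.info() prints the datatypes together with the memory usage in a single call.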

Finally, we plot the stock’s adjusted closing price from May 2016 to May 2021, as that is the value to be predicted. As is evident, the stock price oscillated in the range of Rs 1000 to 1500 from 2016 to 2019, followed by a sudden ascent in 2020 due to the strong focus of the government on healthcare following the pandemic. The budget estimates for the Department of Health and Family Welfare for 2020–2021 showed an increase of 3.75%, and there was a considerable 10% hike in the allocation for the Department of Health Research. All these factors contributed to the growing stock prices of healthcare companies.

Closing price for the past 5 years.

Since we’ll be predicting the stock prices, we need to split our data into two parts. With time series problems we cannot randomly split the data into train and test sets, as that would break the temporal order. Instead, the split is made by date, because each prediction depends on the data points that precede it. Let’s train on the first 80% of the data: the sharp spike in the stock price is seen only from 2020, so the training set must include some data from 2020 for the predictions to be successful. The remaining 20% is used for testing.
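
A minimal sketch of such a date-ordered split is given here. The helper name chronological_split and the stand-in prices are hypothetical, and the series is assumed to be sorted by date already.

```python
import numpy as np

def chronological_split(values, train_fraction=0.8):
    """Split a time-ordered array into train and test parts without shuffling."""
    split_index = int(len(values) * train_fraction)
    # Everything before the cut is used for training, the rest for testing.
    return values[:split_index], values[split_index:]

# Stand-in series: 10 made-up prices in date order.
prices = np.arange(100.0, 110.0)
train, test = chronological_split(prices, train_fraction=0.8)

print(len(train), len(test))
```

Because the cut is a single index, every test point lies later in time than every training point.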

The data for our time series problem needs to be scaled when training a recurrent neural network like Long Short-Term Memory, as LSTMs are sensitive to the scale of the data. When a network is fit on unscaled data with a wide range of values, such as our stock prices, large inputs can slow down the learning and convergence of the network and in some cases prevent it from effectively learning the problem. I’ll be scaling this data to the range 0 to 1, as specified in ‘feature_range’.
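
The scaling step might be sketched as follows, assuming scikit-learn’s MinMaxScaler (which the ‘feature_range’ parameter suggests). The prices are made up, and fitting the scaler on the training portion only is my assumption; the article does not say which rows the scaler was fitted on.

```python
import numpy as np
from sklearn.preprocessing import MinMaxScaler

# Made-up prices; the scaler expects a 2D array of shape (samples, features).
train_prices = np.array([1000.0, 1250.0, 1500.0, 2000.0]).reshape(-1, 1)
test_prices = np.array([1750.0, 2500.0]).reshape(-1, 1)

scaler = MinMaxScaler(feature_range=(0, 1))

# Learn the minimum and maximum from the training data (an assumption) and scale it.
train_scaled = scaler.fit_transform(train_prices)

# Reuse the same minimum and maximum for the test data.
test_scaled = scaler.transform(test_prices)

print(train_scaled.ravel())
print(test_scaled.ravel())
```

With this choice, a test price above the training maximum scales to a value greater than 1, which is expected.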

The basic idea is that the LSTM uses data from previous days to predict the next day’s value. The window then slides forward by one day to predict the day after, and so on, so this iteration covers the whole dataset, which the network processes in batches. The number of past data points in each window is the timestep. A longer window gives the model more context, although it does not guarantee a more accurate prediction and it leaves fewer training samples. The dataset is built so that each row of X_train and X_test holds the input values of one window ending at time t, while y_train and y_test hold the target value at the next time step (t+1). I have taken timestep=80. X_test and y_test are created in the same way from the test dataset.
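
One way to build such windows is sketched here. The function name create_dataset is hypothetical, and a timestep of 3 is used on a toy series so the result is easy to inspect; the article itself uses timestep=80.

```python
import numpy as np

def create_dataset(series, timestep=80):
    """Build (X, y) pairs: each X row holds `timestep` consecutive values,
    and y is the value that immediately follows that window."""
    X, y = [], []
    for i in range(len(series) - timestep):
        X.append(series[i:i + timestep])   # values at positions i .. i+timestep-1
        y.append(series[i + timestep])     # the next value after the window
    return np.array(X), np.array(y)

# Toy series 0, 1, ..., 9 standing in for the scaled prices.
series = np.arange(10.0)
X_demo, y_demo = create_dataset(series, timestep=3)

print(X_demo.shape, y_demo.shape)
print(X_demo[0], y_demo[0])
```

A series of length n yields n - timestep samples, so a longer window leaves fewer rows to train on.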

Now that we are done with the preprocessing, it’s time to apply the LSTM model. Before that, however, we need to reshape our data. Why reshape? Because the LSTM network expects its input to be 3-dimensional, in the form (number of samples, number of timesteps, number of features), whereas right now X_train is 2D. Here, the number of samples is the number of rows in X_train, the number of timesteps is 80 and the number of features is 1.
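
The reshape itself is a one-liner in NumPy. In this sketch the array of zeros and its 200 rows are placeholders standing in for the real X_train.

```python
import numpy as np

timestep = 80

# Placeholder for the 2D training windows: 200 samples of 80 values each.
X_train = np.zeros((200, timestep))

# Add a trailing axis so the shape becomes (samples, timesteps, features).
X_train_3d = X_train.reshape(X_train.shape[0], X_train.shape[1], 1)

print(X_train_3d.shape)  # (200, 80, 1)
```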

The LSTM architecture is pretty easy to understand. We read in our sequential data and pass it to the model. The network starts from randomly initialized weights and is trained to predict the next value. In the first layer, X_train is fed into 50 hidden units, whose output is eventually reduced to a single predicted stock price. A dropout layer is added as regularization to reduce overfitting. Finally, we have the output Dense layer, and since we need only a single predicted value, its units parameter has to be 1. I have created a function to build and fit an LSTM network to make it hassle free.

Fitting the LSTM model
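
A network along the lines described above could be sketched with Keras as follows. This is not the original function: the name build_and_fit_lstm, the dropout rate of 0.2, the Adam optimizer, the mean squared error loss, the single epoch and the batch size are all my assumptions, and random numbers stand in for the real training windows.

```python
import numpy as np
from tensorflow.keras import Input
from tensorflow.keras.layers import LSTM, Dense, Dropout
from tensorflow.keras.models import Sequential

def build_and_fit_lstm(X_train, y_train, units=50, dropout=0.2, epochs=1, batch_size=32):
    """Build a single-layer LSTM regressor and fit it on the training windows."""
    model = Sequential([
        Input(shape=(X_train.shape[1], X_train.shape[2])),  # (timesteps, features)
        LSTM(units),        # 50 hidden units by default
        Dropout(dropout),   # regularization against overfitting (rate is assumed)
        Dense(1),           # one output: the next scaled price
    ])
    model.compile(optimizer="adam", loss="mean_squared_error")
    model.fit(X_train, y_train, epochs=epochs, batch_size=batch_size, verbose=0)
    return model

# Random stand-in data: 16 windows of 80 timesteps with 1 feature each.
rng = np.random.default_rng(0)
X_demo = rng.random((16, 80, 1)).astype("float32")
y_demo = rng.random(16).astype("float32")

model = build_and_fit_lstm(X_demo, y_demo)
predictions = model.predict(X_demo, verbose=0)
print(predictions.shape)
```

The model returns one value per input window, so the predictions have shape (samples, 1).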

If you want to delve deeper into the workings of LSTM, do refer to the blog below.

We then use the model to predict the stock prices based on X_test. Since the model works on scaled values, applying the scaler’s inverse transform brings the predicted values back to the original price scale.
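
A sketch of the inverse transform step, again assuming scikit-learn’s MinMaxScaler. The prices are made up, and a hard-coded array stands in for the output of model.predict(X_test).

```python
import numpy as np
from sklearn.preprocessing import MinMaxScaler

# Made-up prices used to fit the scaler, shaped (samples, 1).
prices = np.array([1000.0, 1500.0, 2000.0, 3000.0]).reshape(-1, 1)
scaler = MinMaxScaler(feature_range=(0, 1))
scaler.fit(prices)

# Stand-in for model.predict(X_test): scaled values of shape (samples, 1).
scaled_predictions = np.array([[0.25], [0.5], [1.0]])

# Map the scaled predictions back to rupee prices.
predicted_prices = scaler.inverse_transform(scaled_predictions)

print(predicted_prices.ravel())
```

The same scaler object that scaled the inputs must be used here, since it stores the minimum and maximum needed to undo the scaling.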

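I’ll be using RMSE (root mean squared error) to calculate the error value. RMSE is expressed in the same units as the stock price, so a score of 113.45 means the predictions are off by roughly that many rupees on average. That is better than I expected from such a basic model!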

RMSE score
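
For reference, RMSE can be computed in a few lines of NumPy. The function name rmse and the example prices here are illustrative and are not the article’s data.

```python
import numpy as np

def rmse(actual, predicted):
    """Root mean squared error between two equally long sequences."""
    actual = np.asarray(actual, dtype=float)
    predicted = np.asarray(predicted, dtype=float)
    # Square the errors, average them, then take the square root.
    return float(np.sqrt(np.mean((actual - predicted) ** 2)))

# Made-up actual and predicted prices in rupees.
actual_prices = [2900.0, 3000.0, 3100.0]
predicted_prices = [2800.0, 3000.0, 3200.0]

score = rmse(actual_prices, predicted_prices)
print(round(score, 2))
```

Because the errors are squared before averaging, a few large misses raise the score more than many small ones.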

The LSTM model can be tuned further, for example by increasing the number of epochs, changing the number of LSTM layers, or adjusting the dropout rate. However, the predictions from the LSTM alone are not enough to determine whether the stock price will increase or decrease, as the market is also largely affected by news about the company and by broader sentiment. I am very much interested in exploring time series problems in more detail, and I plan to try blending in relevant news data related to our target to help with the prediction.

Though only a minimal set of functionality has been depicted in this walkthrough, I sincerely hope it gives enough of an insight. Below are some resources that I referred to and that you may find useful!


