1. What is Heroku and Why Do We Need It?
Heroku is a platform as a service (PaaS) that deploys apps onto the cloud. The platform officially supports apps written in several programming languages, such as Node.js, Ruby, Java, PHP, Python, Go, Scala and Clojure. It also supports other languages and frameworks (such as R Shiny) through the use of third-party buildpacks.
The advantage of using Heroku, as opposed to creating your own virtual private server (VPS) running a Linux or Windows operating system, is avoiding the time and overhead that come along with maintaining the server. Imagine the headache of dealing with software upgrades, dependencies and compatibility issues.
For example, I have an Ubuntu server running several of my R Shiny apps. It runs perfectly fine, but it is on Ubuntu 14 with an older version of R, which means that it does not support newer R packages. Migrating all the apps to a new server seems such an overwhelming task that I still procrastinate about it to this very day. None of these concerns would matter when using Heroku. The great thing is that you can get started for free, as Heroku provides a free tier of about 550–1,000 dyno hours per month.
Dynos are isolated Linux containers that serve as the building blocks of Heroku’s offerings (more information on Heroku’s dynos here). They range in size from small and economical dynos (offering fewer CPU processors and a small RAM capacity) to larger and costlier dynos (unlimited CPU processors and up to 14 GB of RAM).
2. What Model are We Deploying Today?
The machine learning model that we are going to be deploying today is the Penguins Species classification model.
In essence, the model uses the innate physical characteristics of penguins (i.e. bill length, bill depth, flipper length and body mass) along with their sex and geographical location as input parameters (i.e. the X variables) in order to classify penguins as belonging to one of three species (i.e. Adelie, Chinstrap or Gentoo), which is the Y variable.
A cartoon illustration of the modeling workflow is summarized in the image on the left.
3. Building the Machine Learning Model
Now that we have seen the schematic workflow of how the model is built, let’s take a look at the code.
3.2. Line-by-Line Explanation
In this simple example, we’re going to use the random forest algorithm to classify penguins as belonging to one of the three species (Adelie, Chinstrap and Gentoo) mentioned above.
Line 1 — Import the pandas library, which will subsequently be used to store the values of the penguins dataset.
Line 2 — Read in the penguins dataset from a CSV file using the pd.read_csv() function and store the contents in the penguins variable.
Lines 4 and 5 — Commented text describing that the code block underneath performs ordinal feature encoding.
Line 6 — Copy the contents of the penguins variable to the newly created df dataframe.
Line 7 — Assign the column name 'species' found in the df dataframe as the Y variable.
Line 8 — Create a list of the ordinal features to be encoded.
Lines 10–13 — Here, the ordinal features are encoded. The for loop iterates through the 2 features to be encoded as follows:
— Perform one-hot encoding via the pd.get_dummies() function and concatenate the newly generated one-hot encoded values as new columns of the df dataframe.
— Finally, delete the original column from the df dataframe.
Lines 15–19 — The species column, which will be used as the Y variable, is now converted to numerical form via numerical mapping, whereby the class labels (Adelie, Chinstrap and Gentoo) are mapped to the numerical values (0, 1 and 2) as shown in line 15. A simple custom function is created to perform this numerical mapping (lines 16 and 17), which is applied on line 19. The mapped values are then assigned to the 'species' column of the df dataframe.
Lines 21–23 — The df dataframe is now separated into the X and Y variables on lines 22 and 23, respectively.
Lines 25–28 — This block of code builds the random forest model, as commented on Line 25. It starts by importing the RandomForestClassifier function from the sklearn.ensemble sub-module. The model is instantiated on Line 27, whereby the RandomForestClassifier() function is assigned to the clf variable. The model is finally trained on Line 28 via the clf.fit() function using the X and Y variables as the input data.
Lines 30–32 — Finally, we save the model by serializing it with the pickle library, where the pickle.dump() function saves the trained model stored in the clf variable into a file called penguins_clf.pkl.
We now have the trained model saved from scikit-learn, which we will now use for model deployment.
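The code listing this walkthrough refers to is not reproduced here, so the following is a sketch reconstructed from the line-by-line description above. A tiny inline DataFrame stands in for the penguins CSV file that the original script reads (so the line numbers will not match exactly); everything else follows the steps as described.

```python
import pickle
import pandas as pd
from sklearn.ensemble import RandomForestClassifier

# The original script reads the Palmer Penguins dataset from a CSV file;
# a small inline sample is used here so the sketch runs on its own.
penguins = pd.DataFrame({
    'species': ['Adelie', 'Chinstrap', 'Gentoo', 'Adelie'],
    'island': ['Torgersen', 'Dream', 'Biscoe', 'Biscoe'],
    'bill_length_mm': [39.1, 46.5, 47.5, 38.6],
    'bill_depth_mm': [18.7, 17.9, 15.0, 17.2],
    'flipper_length_mm': [181, 192, 218, 185],
    'body_mass_g': [3750, 3500, 5200, 3800],
    'sex': ['male', 'female', 'female', 'male'],
})

# Ordinal feature encoding: one-hot encode the two categorical features
df = penguins.copy()
encode = ['sex', 'island']
for col in encode:
    dummy = pd.get_dummies(df[col], prefix=col)
    df = pd.concat([df, dummy], axis=1)
    del df[col]

# Map the species class labels to numerical values (0, 1, 2)
target_mapper = {'Adelie': 0, 'Chinstrap': 1, 'Gentoo': 2}

def target_encode(val):
    return target_mapper[val]

df['species'] = df['species'].apply(target_encode)

# Separate into the X and Y variables
X = df.drop('species', axis=1)
Y = df['species']

# Build and train the random forest model
clf = RandomForestClassifier()
clf.fit(X, Y)

# Serialize the trained model to disk
pickle.dump(clf, open('penguins_clf.pkl', 'wb'))
```

Note that the one-hot encoded column names produced here (e.g. sex_male, island_Biscoe) must match what the web app later feeds into clf.predict().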
4. Building the Web App
Full details and a line-by-line explanation of building the Penguins web app in Python using the Streamlit library are provided in the prior article (below).
Let’s now build the Penguins app using the Streamlit library. The code for building the app is provided in the penguins-app.py file shown below. This app will make use of the trained model (penguins_clf.pkl) for predicting the class label (the penguin’s species as being Adelie, Chinstrap or Gentoo) using input parameters from the sidebar panel of the web app’s front-end.
4.2. Launching the App
We are going to launch the app locally on our own computer. First, make sure you have streamlit installed; if you haven’t already, you can do so with the following command:
pip install streamlit
In the terminal, we can now launch the app (the penguins-app.py file) using the command shown below:
streamlit run penguins-app.py
In a short moment we should see the following output in the terminal.
> streamlit run penguins-app.py
You can now view your Streamlit app in your browser.
Local URL: http://localhost:8501
Network URL: http://10.0.0.11:8501
Then a browser should pop up, giving us the Penguins app.
Now that the web app works locally, we can proceed to deploy it to the cloud.
5. Repository for the Penguins App
5.1. Creating a New Repository
Firstly, we will create a new repository on GitHub and name it penguins_heroku, which can be entered into the text box for the Repository name. Then tick Add a README file and click on Create repository.
Secondly, we will upload the trained model (penguins_clf.pkl) and the web app (penguins-app.py) to this new GitHub repository. This can be done by clicking on Add file > Upload files, then choosing and uploading the above 2 files (penguins_clf.pkl and penguins-app.py).
From the above screenshot, you will see that in addition to the 2 uploaded files we have 5 additional files (including the Procfile and runtime.txt) that we will have to create and place inside this repository as well.
5.2. Create New Files in the Repository
To create a new file directly on GitHub, we can click on Add file > Create new file.
In the example below we will create the first of these files, the Procfile.
Then, scroll to the bottom of the page and click on the Commit new file button. Afterwards, you should notice the addition of the Procfile to the repository.
Repeat this for the 4 remaining files.
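The contents of these helper files are not shown in the article. A typical setup.sh for deploying a Streamlit app on Heroku looks like the sketch below; it is an assumption based on common practice, not taken from the original repository. It writes a Streamlit config so the app binds to the port Heroku assigns via the $PORT environment variable:

```shell
# setup.sh: write a Streamlit config so the app binds to Heroku's $PORT
mkdir -p ~/.streamlit/
echo "[server]
headless = true
port = $PORT
enableCORS = false
" > ~/.streamlit/config.toml
```

The Procfile would then contain a single line such as `web: sh setup.sh && streamlit run penguins-app.py`, runtime.txt a pinned Python version such as `python-3.7.9`, and requirements.txt the pinned versions of streamlit, pandas and scikit-learn (again, exact versions here are illustrative).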
6. Deploying to Heroku
Let’s now proceed to deploying the model by heading over to the Heroku website to sign up (if you haven’t already) and log in.
6.1. Sign up to Heroku
To sign up for a free Heroku account find the “Sign up” button at the top right hand corner of the Heroku website as shown below.
6.2. Log-in to Heroku
After signing up, log into your Heroku account.
6.3. Creating a New App
To create a new app, click on New > Create new app button as shown in the screenshot below.
Now, we’re going to give the app a name; here we will use penguins-model, which is still available. It should be noted that if an app name is already taken you will see an error message, in which case you can choose a different name.
To proceed, click on the Create app button at the bottom.
6.4. Connecting to GitHub
We’re now going to connect our App to the GitHub repository. To do this click on GitHub (Connect to GitHub) as shown in the screenshot below.
If this is your first time deploying to Heroku, you will have to authenticate your GitHub account and give Heroku permission to access it. This is done once per Heroku account.
Now, type in the name of the GitHub repository that you have just created into the text box and click on Connect.
If this was successful, you’ll see the Connected to dataprofessor/penguins-heroku message. It should be noted that we can activate Automatic deploys, but this is recommended only after a Manual deploy has succeeded, so we can come back and activate it later.
Now, scroll down and click on the Deploy Branch button.
The build log will update as the container is provisioned and prerequisite libraries are being installed.
After the container has been provisioned and the libraries have been installed successfully, you should see the message Your app was successfully deployed.
Now, click on the View button to launch the deployed web app.
7. Deployed Penguins App
If you see the web app with no error message, congratulations! You have now successfully deployed the Penguins App. Click on the following link if you would like to see a Demo of the Penguins App.