You Don’t Need a High-End System to Train Your Machine Learning Model




Launching the Cloud-Hosted Jupyter Lab

Launching your cloud-hosted machine is simple and straightforward. All you need is a link to paste into your browser, and a Jupyter notebook appears. Let’s see it in action.

After logging into the cloud environment, go to the resources section and click on New Jupyter Server.

Screenshot by Author

My favorite part is the data science pipeline. It comes in handy when you need to deploy end-to-end machine learning or deep learning models: you configure your complete data flow and model-building process in the pipeline, and then run it all automatically.
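
The pipeline itself is configured through the platform’s UI, but conceptually it chains the same steps you would write in code. Here is a minimal scikit-learn sketch of such an end-to-end flow (illustrative only; this is not the platform’s pipeline API, and X_train / y_train are placeholders):

from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.linear_model import LogisticRegression

# One object that chains preprocessing and model fitting
pipe = Pipeline([
    ("scale", StandardScaler()),
    ("model", LogisticRegression()),
])

# pipe.fit(X_train, y_train)   # placeholders for your training data
# pipe.predict(X_new)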

Furthermore, you can expose an API from your pipeline so that your app or website can call the model. Other capabilities include deploying dashboards and connecting to a Snowflake database. Once a dashboard is deployed, you can share it with a single link, and anyone with access can view it and interact with its elements. The same dashboard can also be embedded in your app or website.
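
If you were to wire up such an endpoint by hand, it would look roughly like the Flask sketch below (the model path, route, and request schema are all assumptions for illustration; the platform generates its own endpoint for you):

from flask import Flask, request, jsonify
import joblib

app = Flask(__name__)
model = joblib.load("model.pkl")  # placeholder path to a trained model

@app.route("/predict", methods=["POST"])
def predict():
    features = request.json["features"]  # assumed request schema
    prediction = model.predict([features])
    return jsonify(prediction=prediction.tolist())

if __name__ == "__main__":
    app.run(port=8080)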

Configure your resource

The main benefit of a cloud-based solution is that you can assign a specific configuration to each project, because different use cases need different system configurations. A deep learning model, for example, is complex enough that it needs a good GPU-enabled machine to train in a reasonable time.

It’s a good practice to manage your resources efficiently according to your needs.

Based on your dataset size and the number of training epochs, you can select the RAM and disk space. It is good practice to match the resource to your model complexity and data size; otherwise, a simple model will leave most of the memory you are paying for unused.
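
One quick way to size the RAM you need is to measure the dataset’s in-memory footprint first; here is a minimal pandas sketch (the file name is a placeholder):

import pandas as pd

df = pd.read_csv("trips_sample.csv")  # placeholder dataset

# deep=True also counts the memory held by string (object) columns
mem_bytes = df.memory_usage(deep=True).sum()
print(f"Approximate in-memory size: {mem_bytes / 1e9:.2f} GB")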

A cloud-hosted Jupyter Lab also gives you access to high-powered GPUs, which is great! You can use a GPU only for the time you need it, so you get the most out of it at a lower cost.
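
Once a GPU resource is attached, it is worth confirming that your framework can actually see it. A minimal check, assuming PyTorch is the framework you use:

import torch

# Confirm that the cloud resource exposes a CUDA-capable GPU
if torch.cuda.is_available():
    print("GPU available:", torch.cuda.get_device_name(0))
else:
    print("No GPU visible; training will fall back to the CPU")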

Running Parallel Model Instances

You can also create different resources for different machine learning models: a separate resource for a deep learning model that needs GPU power, and another for a simple machine learning model that may not need a GPU at all.

The benefit of this approach is that you can run many different models on different resources at the same time without slowing anything down, because everything happens in the cloud.

Launching Your Cloud-Hosted Jupyter Lab

Once the Jupyter server is ready, click the green button to start it, then click Jupyter Notebook to launch it inside Jupyter Lab.

A new tab will open with your Jupyter Lab.

You get the familiar Jupyter Lab interface, now running in the cloud, so you no longer need to worry about running out of RAM locally.

Now, click the New button to create a .ipynb notebook. The good thing is that you can also upload your own .ipynb notebooks here, and export any notebook to PDF, HTML, and many other formats.
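
Export uses the standard nbconvert tool that ships with Jupyter; for example, from a notebook cell (the notebook name is a placeholder):

!jupyter nbconvert --to html my_notebook.ipynb
!jupyter nbconvert --to pdf my_notebook.ipynb   # PDF export needs a LaTeX install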

Control your model versions

Another benefit of a cloud-based solution is that you can connect GitHub (or another version control system) to your account, so that each time you tune your model configuration, you can save it as a new version.

You don’t need to copy your model files around manually; most of this is automated. Your changes are even committed to GitHub automatically after a short interval, which is great!

Additionally, you can share access to the repo with others so the whole team can collaborate on the same projects. You can also use the same GitHub repo to showcase your previous project work in interviews.

Project reports you create in the cloud can be made accessible only to the people you choose. If someone wants to check the report for a previously built model, you can generate it with a single click and share it with them. Access stays controlled by the permissions you set, so sharing remains secure.

Parallel Computing with Dask

Dask is very popular for parallel computing. But parallel computing needs a capable system, which is rarely feasible on a personal machine. With a cloud-based solution, it is.

A single decorator line is enough to hand your pandas code over to Dask:

import dask
import pandas as pd

# df is assumed to be a pandas DataFrame of NYC taxi trip data, e.g.:
# df = pd.read_csv("yellow_tripdata.csv")

@dask.delayed
def testfn(df):
    # Mean trip distance per pickup location
    df1 = df.groupby("PULocationID").trip_distance.mean()
    return df1

# testfn(df) only builds a lazy task; .compute() actually runs it
testfn(df).compute()

The same pattern applies to Dask DataFrames: operations are lazy, and calling compute() is what triggers the parallel execution.

Here is an example with Pandas:

pandasDF[["trip_distance"]].mean()

And now, if I use a Dask DataFrame built from the same data and add the compute() call, as shown below:

daskDF[["trip_distance"]].mean().compute()

We are now using dask parallelization.
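
For clarity, daskDF above is not the pandas frame itself; it is a Dask DataFrame built from it. A minimal sketch of the conversion (npartitions is a tunable choice, not a fixed value):

import dask.dataframe as dd

# Split the pandas frame into partitions that Dask can process in parallel
daskDF = dd.from_pandas(pandasDF, npartitions=4)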

Be Industry Ready

Most industries have already started a “move to the cloud” initiative. Because of the flexibility, security, and easy-to-use environments, most of them now prefer cloud-based solutions.

So, if you are already familiar with using Jupyter notebooks in the cloud, things will be much easier. From what I have observed on LinkedIn and other social platforms, many companies are still using local machine learning environments.

Adapting to a cloud-based machine learning environment will surely open up another level of opportunity and ease in your career.
