Installing PyTorch on Apple M1 chip with GPU Acceleration
It finally arrived!
The trajectory of Deep Learning support for the MacOS community has been amazing so far.
Starting with the M1 devices, Apple introduced a built-in graphics processor capable of GPU acceleration, making M1 Macbooks suitable for deep learning tasks. No more Colab for MacOS data scientists!
Next on the agenda was compatibility with the popular ML frameworks. Tensorflow was the first framework to become available on Apple Silicon devices. Using the Metal plugin, Tensorflow can utilize the Macbook’s GPU.
Unfortunately, PyTorch was left behind. You could run PyTorch natively on M1 MacOS, but the GPU was inaccessible.
Until now!
You can access all the articles in the “Setup Apple M-Silicon for Deep Learning” series from here, including the guide on how to install Tensorflow on Mac M1.
How it works
PyTorch, like Tensorflow, uses the Metal framework — Apple’s Graphics and Compute API. The PyTorch team worked in conjunction with Apple’s Metal engineering team to enable high-performance training on the GPU.
Internally, PyTorch uses Apple’s Metal Performance Shaders (MPS) as a backend.
The MPS backend device maps machine learning computational graphs and primitives on the MPS Graph framework and tuned kernels provided by MPS.
Note 1: Do not confuse Apple’s MPS (Metal Performance Shaders) with Nvidia’s MPS! (Multi-Process Service).
Note 2: The M1-GPU support feature is supported only in MacOS Monterey (12.3+).
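As a quick check before you start, you can read the MacOS version from Python itself. A minimal sketch, using only the standard-library platform module:
import platform

# mac_ver() returns a tuple whose first element is the version string, e.g. '12.4'
version = platform.mac_ver()[0]
major, minor = (int(part) for part in version.split(".")[:2])

# MPS support requires MacOS Monterey 12.3 or newer
print("MacOS version:", version)
print("Meets the 12.3+ requirement:", (major, minor) >= (12, 3))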
Installation
The installation process is easy. We will break it into the following steps:
Step 1: Install Xcode
Most Macbooks have Xcode preinstalled. Alternatively, you can easily download it from the App Store. Additionally, install the Command Line Tools:
$ xcode-select --install
Step 2: Set up a new conda environment
This is straightforward. We create a new environment called torch-gpu:
$ conda create -n torch-gpu python=3.8
$ conda activate torch-gpu
The official installation guide does not specify which Python version is compatible. However, I have verified that Python versions 3.8 and 3.9 work properly.
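To double-check which interpreter the new environment uses, a trivial check from inside Python:
import sys

# inside the torch-gpu environment this should print (3, 8) or (3, 9)
print(sys.version_info[:2])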
Step 3: Install PyTorch packages
There are two ways to do that: i) using the download helper from the PyTorch web page, or ii) using the command line.
If you choose the first method, visit the PyTorch page and use the selector to pick the Nightly build for Mac, with Conda as the package manager and Python as the language.
Alternatively, you can directly run the conda install command that the selector displays:
conda install pytorch torchvision torchaudio -c pytorch-nightly
And that’s it!
Pay attention to the following:
- Check the download helper first because the installation command may change in the future.
- Expect the M1-GPU support to be included in the next stable release. For the time being, it is only available in the Nightly release (you can verify which build you got, as shown after this list).
- More things will come in the future, so don’t forget to check the release list now and then for new updates!
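A quick way to confirm that a Nightly build is what actually got installed: Nightly versions carry a .dev suffix in the version string (the exact value below is only an example).
import torch

# Nightly builds report a version such as '1.13.0.dev20220620' (illustrative value)
print(torch.__version__)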
Step 4: Sanity check
Next, let’s make sure everything went as expected. That is:
- PyTorch was installed successfully.
- PyTorch can use the GPU successfully.
To make things easy, install the Jupyter notebook and/or Jupyter lab:
$ conda install -c conda-forge jupyter jupyterlab
Now, we will check if PyTorch can find the Metal Performance Shaders backend. Open a Jupyter notebook and run the following:
import torch
import math

# this ensures that the current MacOS version is at least 12.3+
print(torch.backends.mps.is_available())

# this ensures that the current PyTorch installation was built with MPS activated
print(torch.backends.mps.is_built())
If both commands return True, then PyTorch has access to the GPU!
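In practice, it is convenient to fall back to the CPU when MPS is unavailable, so the same script runs on any machine. A minimal sketch of this pattern:
import torch

# pick the Metal GPU when available, otherwise fall back to the CPU
device = torch.device("mps") if torch.backends.mps.is_available() else torch.device("cpu")
print(f"Using device: {device}")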
Step 5: Final test
Finally, we run an illustrative example to check that everything works properly.
To run PyTorch code on the GPU, use torch.device("mps"), analogous to torch.device("cuda") on an Nvidia GPU. Hence, in this example, we move all computations to the GPU:
dtype = torch.float
device = torch.device("mps")

# Create random input and output data
x = torch.linspace(-math.pi, math.pi, 2000, device=device, dtype=dtype)
y = torch.sin(x)

# Randomly initialize weights
a = torch.randn((), device=device, dtype=dtype)
b = torch.randn((), device=device, dtype=dtype)
c = torch.randn((), device=device, dtype=dtype)
d = torch.randn((), device=device, dtype=dtype)

learning_rate = 1e-6
for t in range(2000):
    # Forward pass: compute predicted y
    y_pred = a + b * x + c * x ** 2 + d * x ** 3

    # Compute and print loss
    loss = (y_pred - y).pow(2).sum().item()
    if t % 100 == 99:
        print(t, loss)

    # Backprop to compute gradients of a, b, c, d with respect to loss
    grad_y_pred = 2.0 * (y_pred - y)
    grad_a = grad_y_pred.sum()
    grad_b = (grad_y_pred * x).sum()
    grad_c = (grad_y_pred * x ** 2).sum()
    grad_d = (grad_y_pred * x ** 3).sum()

    # Update weights using gradient descent
    a -= learning_rate * grad_a
    b -= learning_rate * grad_b
    c -= learning_rate * grad_c
    d -= learning_rate * grad_d

print(f'Result: y = {a.item()} + {b.item()} x + {c.item()} x^2 + {d.item()} x^3')
If you don’t see any error, everything works as expected!
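The same device string also works for full models: moving an nn.Module and its input tensors to "mps" is all it takes to train on the GPU. A minimal sketch (the layer sizes, learning rate, and step count here are arbitrary values chosen only for illustration):
import torch
import torch.nn as nn

device = torch.device("mps")

# a tiny regression model, moved to the GPU with .to(device)
model = nn.Sequential(nn.Linear(10, 32), nn.ReLU(), nn.Linear(32, 1)).to(device)
optimizer = torch.optim.SGD(model.parameters(), lr=1e-3)

# random training data, created directly on the GPU
x = torch.randn(64, 10, device=device)
y = torch.randn(64, 1, device=device)

for step in range(100):
    optimizer.zero_grad()
    loss = nn.functional.mse_loss(model(x), y)
    loss.backward()
    optimizer.step()

print("final loss:", loss.item())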
Evaluation
A follow-up article will benchmark the PyTorch M1 GPU execution against various Nvidia GPU cards.
However, it’s undeniable that GPU acceleration far outpaces training on the CPU. You can verify this yourself: design your own experiments and measure their respective times.
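As a starting point for such an experiment, here is a sketch that times a large matrix multiplication on both devices. The matrix size and repeat count are arbitrary; calling .sum().item() copies the result to the CPU, which forces the asynchronous GPU work to finish before the timer stops:
import time
import torch

def time_matmul(device, size=2000, repeats=10):
    x = torch.randn(size, size, device=device)
    # warm-up run, so one-time setup costs are not measured
    (x @ x).sum().item()
    start = time.perf_counter()
    for _ in range(repeats):
        y = x @ x
    # .item() synchronizes: it waits for all pending GPU work to complete
    y.sum().item()
    return (time.perf_counter() - start) / repeats

print("cpu:", time_matmul(torch.device("cpu")))
print("mps:", time_matmul(torch.device("mps")))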
The Apple engineering team performed an extensive benchmark of popular deep learning models on Apple silicon, and their published results show a substantial speedup of the GPU compared to the CPU.
Closing Remarks
The addition of PyTorch to the set of MacOS-compatible deep-learning frameworks is an amazing milestone.
This milestone allows MacOS fans to stay within their favourite Apple ecosystem and focus on deep learning. They no longer have to resort to the alternatives: Intel-based chips or Colab.
If you are a data scientist and a fan of MacOS, feel free to check the list of all Apple Silicon-related articles.