Installing PyTorch on Apple M1 chip with GPU Acceleration


It finally arrived!


The trajectory of Deep Learning support for the MacOS community has been amazing so far.

Starting with the M1 devices, Apple introduced a built-in graphics processor that enables GPU acceleration. Hence, M1 Macbooks became suitable for deep learning tasks. No more Colab for MacOS data scientists!

Next on the agenda was compatibility with the popular ML frameworks. Tensorflow was the first framework to become available on Apple Silicon devices. Using the Metal plugin, Tensorflow can utilize the Macbook’s GPU.

Unfortunately, PyTorch was left behind. You could run PyTorch natively on M1 MacOS, but the GPU was inaccessible.

Until now!

You can access all the articles in the “Setup Apple M-Silicon for Deep Learning” series from here, including the guide on how to install Tensorflow on Mac M1.

How it works

PyTorch, like Tensorflow, uses the Metal framework — Apple’s Graphics and Compute API. PyTorch worked in conjunction with the Metal Engineering team to enable high-performance training on GPU.

Internally, PyTorch uses Apple’s Metal Performance Shaders (MPS) as a backend.

The MPS backend device maps machine learning computational graphs and primitives on the MPS Graph framework and tuned kernels provided by MPS.
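In practice, the MPS backend is exposed as just another PyTorch device. A minimal sketch of the usual pattern, selecting the MPS device when it is available and falling back to the CPU otherwise:

```python
import torch

# Pick the MPS device when it is available; torch.backends.mps.is_available()
# returns False on Intel Macs and on macOS versions older than 12.3.
device = torch.device("mps" if torch.backends.mps.is_available() else "cpu")

# Tensors created on this device are dispatched to MPS-tuned kernels.
x = torch.ones(3, device=device)
print(device, x.sum().item())
```

The same script then runs unchanged on machines without an Apple GPU, which is convenient when sharing notebooks.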

Note 1: Do not confuse Apple’s MPS (Metal Performance Shaders) with Nvidia’s MPS! (Multi-Process Service).

Note 2: M1 GPU support requires MacOS Monterey (12.3) or later.


The installation process is easy. We will break it into the following steps:

Step 1: Install Xcode

Most Macbooks have Xcode preinstalled. Alternatively, you can easily download it from the App Store. Additionally, install the Command Line Tools:

$ xcode-select --install

Step 2: Setup a new conda environment

This is straightforward. We create a new environment called torch-gpu:

$ conda create -n torch-gpu python=3.8
$ conda activate torch-gpu

The official installation guide does not specify which Python versions are compatible. However, I have verified that Python 3.8 and 3.9 work properly.

Step 3: Install PyTorch packages

There are two ways to do that: i) Using the download helper from the PyTorch web page, or ii) using the command line.

If you choose the first method, visit the PyTorch page and select your OS, the Nightly build, and the conda package manager.

Alternatively, you can directly run the conda install command that the download helper generates:

conda install pytorch torchvision torchaudio -c pytorch-nightly

And that’s it!

Pay attention to the following:

  • Check the download helper first because the installation command may change in the future.
  • Expect the M1-GPU support to be included in the next stable release. For the time being, it is only available in the Nightly release.
  • More things will come in the future, so don’t forget to check the release list now and then for new updates!
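Since the feature currently ships only in the Nightly channel, it can be useful to confirm which build you actually installed. A small sketch, assuming that nightly builds carry a “dev” tag in their version string (e.g. 1.13.0.dev20220520) while stable releases do not:

```python
import torch

# Nightly wheels embed a ".dev<date>" tag in the version string;
# stable releases read e.g. "1.12.0".
print(torch.__version__)
print("nightly build" if "dev" in torch.__version__ else "stable release")
```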

Step 4: Sanity Check

Next, let’s make sure everything went as expected. That is:

  • PyTorch was installed successfully.
  • PyTorch can use the GPU successfully.

To make things easy, install the Jupyter notebook and/or Jupyter lab:

$ conda install -c conda-forge jupyter jupyterlab

Now, we will check if PyTorch can find the Metal Performance Shaders plugin. Open a Jupyter notebook and run the following:

import torch
import math

# this checks that the current MacOS version is at least 12.3+
print(torch.backends.mps.is_available())
# this checks that the current PyTorch installation was built with MPS activated
print(torch.backends.mps.is_built())
If both commands return True, then PyTorch has access to the GPU!

Step 5: Final test

Finally, we run an illustrative example to check that everything works properly.

To run PyTorch code on the GPU, use torch.device("mps"), analogous to torch.device("cuda") on an Nvidia GPU. Hence, in this example, we move all computations to the GPU:

dtype = torch.float
device = torch.device("mps")

# Create random input and output data
x = torch.linspace(-math.pi, math.pi, 2000, device=device, dtype=dtype)
y = torch.sin(x)

# Randomly initialize weights
a = torch.randn((), device=device, dtype=dtype)
b = torch.randn((), device=device, dtype=dtype)
c = torch.randn((), device=device, dtype=dtype)
d = torch.randn((), device=device, dtype=dtype)

learning_rate = 1e-6
for t in range(2000):
# Forward pass: compute predicted y
y_pred = a + b * x + c * x ** 2 + d * x ** 3

# Compute and print loss
loss = (y_pred - y).pow(2).sum().item()
if t % 100 == 99:
print(t, loss)

# Backprop to compute gradients of a, b, c, d with respect to loss
grad_y_pred = 2.0 * (y_pred - y)
grad_a = grad_y_pred.sum()
grad_b = (grad_y_pred * x).sum()
grad_c = (grad_y_pred * x ** 2).sum()
grad_d = (grad_y_pred * x ** 3).sum()

# Update weights using gradient descent
a -= learning_rate * grad_a
b -= learning_rate * grad_b
c -= learning_rate * grad_c
d -= learning_rate * grad_d

print(f'Result: y = {a.item()} + {b.item()} x + {c.item()} x^2 + {d.item()} x^3')

If you don’t see any error, everything works as expected!


A follow-up article will benchmark the PyTorch M1 GPU execution against various NVIDIA GPU cards.

However, it’s undeniable that GPU acceleration is far faster than training on the CPU. You can verify this yourself: design your own experiments and measure their respective times.
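A minimal sketch of such an experiment, timing repeated matrix multiplications on the CPU and, when available, on the MPS device (the matrix size and repetition count are arbitrary choices for illustration):

```python
import time
import torch

def time_matmul(device: torch.device, n: int = 1024, reps: int = 10) -> float:
    """Time `reps` matrix multiplications of two n-by-n matrices on `device`."""
    a = torch.randn(n, n, device=device)
    b = torch.randn(n, n, device=device)
    # Warm-up run so one-time kernel setup is not counted.
    (a @ b).sum().item()
    start = time.perf_counter()
    for _ in range(reps):
        c = a @ b
    # .item() forces the device queue to drain before we stop the clock.
    c.sum().item()
    return time.perf_counter() - start

cpu_time = time_matmul(torch.device("cpu"))
print(f"cpu: {cpu_time:.3f}s")
if torch.backends.mps.is_available():
    print(f"mps: {time_matmul(torch.device('mps')):.3f}s")
```

Note the final `.item()` call: GPU work is queued asynchronously, so without synchronizing before reading the clock the measured time would be misleadingly small.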

The Apple engineering team performed an extensive benchmark of popular deep learning models on the Apple silicon chip. The following image shows the performance speedup of the GPU compared to the CPU.

CPU vs GPU on Mac M1, both for training and evaluation (Source [1])

Closing Remarks

PyTorch’s addition to the set of GPU-accelerated MacOS deep-learning frameworks is an amazing milestone.

This milestone allows MacOS fans to stay within their favourite Apple ecosystem and focus on deep learning. They no longer have to resort to the alternatives — Intel-based chips or Colab.

If you are a data scientist and fan of MacOS, feel free to check the list of all Apple Silicon-related articles:

