Installing TensorFlow on an Apple M1 (ARM native via Miniforge) and CPU versus GPU Testing

11 min read · Aug 28, 2022

The relevance of trying to install TensorFlow on an Apple Mac M1 is that:

Google's TensorFlow machine learning library is very flexible and widely used.
TensorFlow runs best on a high-end GPU, and the M1 contains a great GPU.

On the latter point, Apple used to suggest on its website, through the graphic mentioned in this MacRumors article, that the M1 Ultra (the most powerful M1 GPU at the time of writing, 27 Aug 2022) has the same maximum performance as the most performant discrete graphics card around, the NVIDIA RTX 3090.

The "Highest-end discrete GPU" curve here is in fact incomplete: in reality it continues to a higher performance than the M1's maximum, as is evident from the independent Geekbench 5 test results that follow.

However, according to the same MacRumors page, the performance of the NVIDIA RTX 3090 GPU is found to be at least 2 times that of the M1 Ultra in an independent Geekbench 5 test.

Nevertheless, the Mac Studio (Ultra) has almost 50% of the performance of the best discrete graphics card around, already embedded in your motherboard and that is an impressive feat.

According to the above bar chart, my MacBook Pro 14-inch with an M1 Max should have 30% of the RTX 3090 maximum performance.

So let’s get this GPU beast working!

There are a few tutorials online on how to install TensorFlow on a Mac with an Apple ARM M1 chip (instead of an Intel one), but I had to piece about 6 of them together to make it work for my setup. This resulted in a publicly downloadable apple_tensorflow_20220828.yml file for conda environment creation, which you can simply install by:

$ conda env create --file=apple_tensorflow_20220828.yml --name=apple_tensorflow

If you want to know how I generated this file, read on.

The installation method depends on two requirements I had, and you may also have.

I want to use native ARM code rather than the Rosetta x86 Intel to ARM translation layer on top, mimicking an Intel machine, since removing this extra layer may increase performance a bit, much needed for machine learning training and inference (and I am not sure if as of today, 27 Aug 2022, it’s even possible to get TensorFlow working via Rosetta on an M1.)
I want to isolate the environment from my other Python projects, for which I use Anaconda. I am happy to use Miniforge as well, but I don’t want to touch all my other Anaconda projects where Python packages are Intel based. It turns out you can run [Anaconda (or Miniconda)] and [Miniforge] next to each other on the same system after all. (See section C for details.) For the relation between conda, Anaconda vs. Miniconda and Miniconda vs. Miniforge, check these footnotes¹ ² ³.
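To check the first requirement in practice, here is a minimal sketch that reports whether the running Python interpreter is native ARM or x86. It relies only on the standard library's platform.machine(); the function name describe_arch is my own, purely for illustration.

```python
import platform


def describe_arch(machine: str) -> str:
    """Classify the architecture string reported by platform.machine()."""
    if machine == "arm64":
        return "native ARM (Apple Silicon)"
    if machine == "x86_64":
        return "x86_64 (Intel, or translated by Rosetta on an M1)"
    return f"other ({machine})"


if __name__ == "__main__":
    # Run this inside the environment you want to inspect.
    print(describe_arch(platform.machine()))
```

Running it once in an Anaconda (Intel) environment and once in a Miniforge (ARM) environment should show the difference between the two setups.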

For the semi-hurried, I start in section A with my resulting installation instructions and, in section B, show a test proving that it works (for me). In section C, I mention other installation tutorials for the same, but where I got blocked when trying them out, how I sometimes got no solution and needed a workaround, and sometimes did find a direct solution.

A. Installing TensorFlow and Prerequisites

We assume below that everything happens in the macOS Terminal application set to the bash shell.

1. Update macOS

$ sw_vers -productVersion
12.5.1

is what I get on 27th Aug 2022. For TensorFlow, you need at least version 11 (Big Sur). If you have version 10 or lower, you must upgrade to a version ≥ 11 from the macOS App Store.
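The same check can be scripted. The sketch below compares a version string against the minimum of 11 mentioned above; the helper names are hypothetical, and platform.mac_ver() returns an empty version string on non-macOS systems, which the script handles.

```python
import platform


def parse_version(text: str) -> tuple:
    """Turn a dotted version such as '12.5.1' into (12, 5, 1)."""
    return tuple(int(part) for part in text.split("."))


def macos_is_supported(version: str, minimum: str = "11") -> bool:
    """True if the given macOS version is at least the minimum (Big Sur)."""
    return parse_version(version) >= parse_version(minimum)


if __name__ == "__main__":
    current = platform.mac_ver()[0]  # empty string when not on macOS
    if current:
        print(current, "supported:", macos_is_supported(current))
    else:
        print("Not running on macOS")
```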

2. Install / update the Xcode command line tools

$ xcode-select --install

3. Miniforge

Miniforge allows you to install Apple ARM packages via conda.

From https://github.com/conda-forge/miniforge#miniforge3, download Miniforge3-MacOSX-arm64.sh

$ cd ~/Downloads
$ chmod a+x Miniforge3-MacOSX-arm64.sh
$ ./Miniforge3-MacOSX-arm64.sh

For me, since I accepted the default settings in the dialogue that followed, this installed Miniforge3 in /Users/peter/miniforge3.

Check this with:

$ which conda
/Users/peter/miniforge3/condabin/conda

So conda points to the one inside the miniforge3 directory rather than my Anaconda conda inside /Users/peter/opt/anaconda3. If I want to use the Anaconda one, I can use ~/opt/anaconda3/condabin/conda explicitly or create two aliases via:

$ alias anaconda=~/opt/anaconda3/bin/conda
$ alias miniforge=~/miniforge3/condabin/conda

so that I can use Anaconda and Miniforge now. A somewhat cleaner approach to obtain the same is mentioned in section C.3 below.

4. Set up a conda environment

You can download my apple_tensorflow_20220828.yml file from this GitHub link. Then, make sure that conda refers to Miniforge (as in ~/miniforge3/condabin/conda or so) and do:

$ conda deactivate
$ conda env create --file=apple_tensorflow_20220828.yml --name=apple_tensorflow

Apart from some comments, the above, when successful, should end in:

Preparing transaction: done
Verifying transaction: done
Executing transaction: done
Retrieving notices: …working… done

You can then check:

$ conda env list

and

/Users/peter/miniforge3/envs/apple_tensorflow

should be a line appearing in the list returned. Do

$ conda activate apple_tensorflow

You can skip the next three commands, since the yml file above now includes these two pip packages.

(apple_tensorflow) $ python3 -m pip install --upgrade pip
(apple_tensorflow) $ pip install tensorflow-macos
(apple_tensorflow) $ pip install tensorflow-metal

If, when running an ML program, you encounter problems with protobuf, as in:

tensorflow-metadata 1.10.0 requires protobuf<4,>=3.13, but you have protobuf 4.21.5 which is incompatible.
tensorflow-macos 2.9.2 requires protobuf<3.20,>=3.9.2, but you have protobuf 4.21.5 which is incompatible.

do the following:

(apple_tensorflow) $ pip install protobuf==3.19.4

and rerun.
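To see why 3.19.4 works where 4.21.5 does not, here is a small sketch that tests a protobuf version against the combined range of the two messages above. I am assuming the second constraint reads <3.20, so that the combined range is >=3.13 and <3.20; the helper names are illustrative, and the parser only handles plain numeric versions.

```python
from importlib import metadata


def version_tuple(text: str) -> tuple:
    """'3.19.4' -> (3, 19, 4); only handles plain numeric versions."""
    return tuple(int(part) for part in text.split("."))


def protobuf_ok(version: str) -> bool:
    """True if version lies in the assumed range >=3.13 and <3.20."""
    return version_tuple("3.13") <= version_tuple(version) < version_tuple("3.20")


if __name__ == "__main__":
    try:
        installed = metadata.version("protobuf")
        print(installed, "compatible:", protobuf_ok(installed))
    except metadata.PackageNotFoundError:
        print("protobuf is not installed in this environment")
    except ValueError:
        print("protobuf version is not purely numeric; check it by hand")
```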

Now your TensorFlow environment installation is complete and you will want to test it.
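Before the fuller test in section B, a quick sanity check is to ask TensorFlow which GPUs it can see. This is a minimal sketch using tf.config.list_physical_devices; the function name gpu_report is my own, and the import is guarded so the script also runs where TensorFlow is missing.

```python
def gpu_report() -> str:
    """Summarise which GPUs TensorFlow can see, if TensorFlow is importable."""
    try:
        import tensorflow as tf
    except ImportError:
        return "TensorFlow is not installed in this environment"
    gpus = tf.config.list_physical_devices("GPU")
    return f"TensorFlow {tf.__version__} sees {len(gpus)} GPU(s)"


if __name__ == "__main__":
    print(gpu_report())
```

If tensorflow-metal is active on an M1, I would expect this to report at least one GPU; zero GPUs suggests the metal plugin is not installed in the active environment.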

B. Testing TensorFlow

1. Further install requirements

In the same conda environment “apple_tensorflow”, activated by

$ conda activate apple_tensorflow

you should now see an additional (apple_tensorflow) in your shell prompt. This indicates you are ‘inside’ the conda environment with this name. We now additionally install for testing purposes:

(apple_tensorflow) $ pip install pandas
(apple_tensorflow) $ pip install tensorflow_datasets

2. Test TensorFlow

As a “hello world” test of TensorFlow, we use the MNIST problem, which is part of the tensorflow_datasets package you just installed. It is a collection of raster pictures of hand-written digits from 0 to 9, which we will train a model to classify and then recognise. A nice, very visual as well as full mathematical explanation of this problem and its solution method is on this page by “chrisolah”. A more hands-on interactive Jupyter tutorial for it is here at Google.

You can now download this TensorFlowMNistTest.py script from GitHub (it is an evolved version of the code on this page). You will also need to download the EagerCpuGpuConfig.py script, which contains a class to make your GPU visible or invisible to ML computations.

The TensorFlowMNistTest.py code splits the MNIST data into train and test data. Then it builds a TensorFlow Keras model and compiles it. Lastly, it performs 12 epochs of fitting the model to the training data and validating it on the test data, ending up with 99.07% validation accuracy. This suggests that, in inference, a new digit drawn from the same distribution will be mapped to the correct category from 0 to 9 with a probability of roughly 99%.
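To give an idea of what hiding the GPU and switching eager mode involve, here is a minimal sketch built on two public TensorFlow calls, tf.config.set_visible_devices and tf.config.run_functions_eagerly. It is not the code from EagerCpuGpuConfig.py; the function name configure_tensorflow is hypothetical, and only the two boolean names are taken from the script's description.

```python
def configure_tensorflow(disable_gpu_visibility: bool, eagerly: bool) -> None:
    """Hide the GPU and/or force eager execution.

    Call this before any model or tensor is created, since device
    visibility cannot be changed once TensorFlow has initialised.
    """
    import tensorflow as tf  # imported lazily so this file loads without TF

    if disable_gpu_visibility:
        # An empty device list makes all GPUs invisible to TensorFlow.
        tf.config.set_visible_devices([], "GPU")
    # True: run tf.function code eagerly; False: run it as compiled graphs.
    tf.config.run_functions_eagerly(eagerly)


if __name__ == "__main__":
    configure_tensorflow(disable_gpu_visibility=True, eagerly=False)
    print("TensorFlow configured: GPU hidden, graph mode")
```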

Save these two python files in the same directory. Then do:

(apple_tensorflow) $ python TensorFlowMnistTest.py

This should lead to a computation taking between 1 and 5 minutes on an Apple M1 Max. Collecting the outputs for 4 cases, (run eagerly on/off) * (GPU visible/invisible), set by the 2 booleans ‘disable_gpu_visibility’ and ‘eagerly’ in the code, we get the 4 log files as shown on GitHub here. The timings for the ‘fit’ function, that is, for the training, are:

MacM1MaxEagerCpuNoGpu.log:          0:05:13.037244
MacM1MaxEagerCpuGpu.log:            0:01:33.897871
So 3.3 times faster with than without GPU.

MacM1MaxNotEagerCpuNoGpu.log:       0:03:58.723621
MacM1MaxNotEagerCpuGpu.log:         0:00:51.107395
So 4.7 times faster with than without GPU.
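The speedup factors follow directly from the four timings. As a small worked example, the sketch below parses the H:MM:SS.ffffff strings from the logs and divides the CPU-only time by the time with GPU; the helper names are my own.

```python
from datetime import timedelta


def parse_duration(text: str) -> timedelta:
    """Parse 'H:MM:SS.ffffff' as printed in the log files."""
    hours, minutes, seconds = text.split(":")
    return timedelta(hours=int(hours), minutes=int(minutes), seconds=float(seconds))


def speedup(cpu_only: str, with_gpu: str) -> float:
    """How many times faster the run with GPU is than the CPU-only run."""
    return parse_duration(cpu_only) / parse_duration(with_gpu)


if __name__ == "__main__":
    print(f"eager:     {speedup('0:05:13.037244', '0:01:33.897871'):.2f}x")
    print(f"non-eager: {speedup('0:03:58.723621', '0:00:51.107395'):.2f}x")
```

This gives roughly 3.3 for eager mode and 4.7 for non-eager mode, close to the figures quoted above.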

These time savings of roughly a factor 4 on the Apple M1 when including the GPU are pretty impressive. For larger problems, further speedup could be sought by plugging in an external GPU via Thunderbolt 3, which can be made to work on Linux and on Intel-based Macs; note, however, that Apple does not support external GPUs on M1 machines. Also, at the time of writing, this Apple developer page on tensorflow-metal warns that more than one GPU is not supported, so one cannot yet distribute the work over more than 1 GPU via Metal. Another option is of course to use GPU-heavy cloud machines, which are tuned for this kind of work and are likely to be cost efficient as well.

I hope the above writeup has saved you time. Please let me know if it did, and also about any typos you may have found.

You may want to check out some other TensorFlow tutorials now.

C. Inspiring Reference Tutorials

Thanks to all who published inspiring tutorials before. These are:

1. https://caffeinedev.medium.com/how-to-install-tensorflow-on-m1-mac-8e9b91d93706 from Prabhat Suma Sahu was the first one I found, but I did the Miniforge installation via

$ git clone https://github.com/conda-forge/miniforge
$ build_miniforge_osx.sh

which did not work, since it saw my anaconda installation and stopped right there with:

ERROR: File or directory already exists: ‘/Users/peter/conda’
If you want to update an existing installation, use the -u option.
$

2. Then I found Prabhat's video, and at minute 1:00 he showed how to download the script instead. That download also resulted in a Miniforge3-MacOSX-arm64.sh file, but this one did run to completion.

3. Jeff Heaton’s various videos showed me how to install Miniforge and select the ARM instead of the Intel version. In this video, after talking about installing Miniforge and TensorFlow on the M1, he starts at minute 14 on the topic of running Anaconda and Miniconda next to each other on one machine.

Let me summarise this for you.

Basically, you can install Anaconda (or Miniconda) on your Mac for projects where you do not need ARM-level instructions and just want to work at the Intel level (with Rosetta implicitly below) for maximum compatibility with, say, colleagues who are still on Intel machines. There are of course also still more Python libraries available for the x86 Intel platform than for the Apple M1 platform.

You can also, before or after Anaconda (or Miniconda) installation, install Miniforge (see above for instructions). So, say you have installed both.

Say you have Anaconda installed in /Users/peter/opt/anaconda3. The conda executable then typically is at /Users/peter/opt/anaconda3/bin/conda. Then do

$ cd ~/opt/anaconda3/bin
$ ./conda init

Note that the ./ in the last command is important, since this calls the conda in the local directory rather than the (possibly other, Miniforge) conda found via the PATH environment variable. This last command generates a .bash_profile script (or a .zshrc script if you use zsh instead of bash) in your home directory, which is run when you start a new bash (respectively zsh) shell. This script looks like

# >>> conda initialize >>>
# !! Contents within this block are managed by 'conda init' !!
__conda_setup="$('/Users/peter/opt/anaconda3/bin/conda' 'shell.bash' 'hook' 2> /dev/null)"
if [ $? -eq 0 ]; then
    eval "$__conda_setup"
else
    if [ -f "/Users/peter/opt/anaconda3/etc/profile.d/conda.sh" ]; then
        . "/Users/peter/opt/anaconda3/etc/profile.d/conda.sh"
    else
        export PATH="/Users/peter/opt/anaconda3/bin:$PATH"
    fi
fi
unset __conda_setup
# <<< conda initialize <<<

and it just maps the command “conda” to anaconda. This is something we want to keep under a separately callable command. So copy it as such

$ cd ~
$ cp .bash_profile start_anaconda.sh
# chmod a+x start_anaconda.sh

Now do the same for Miniforge, so

$ cd ~/miniforge3/condabin
$ ./conda init

Again, don’t forget the ./!

This generates/overwrites the ~/.bash_profile file which you then copy as in

$ cd ~
$ cp .bash_profile start_miniforge.sh
# chmod a+x start_miniforge.sh

You now have 2 scripts. You can use the following to switch towards your Intel x86 based conda projects by doing:

$ source ~/start_anaconda.sh

and the next command to switch towards your ARM M1 based conda projects by doing:

$ source ~/start_miniforge.sh

So you can run more than one shell with each a project with different target architecture next to each other, like so:

Anaconda Intel base project in one shell

Miniforge ARM Apple M1 base project in another shell

By the ‘miniforge3’ or ‘anaconda3’ substring in the environment path, you can see with which conda tool you should activate the environment. Note that both conda tools show their own environments but also the environments of the other tool. If you try to activate a Miniforge environment with Anaconda, Anaconda won’t let you and gives a clear error message, as you can see below.

Anaconda won’t activate a Miniforge environment. The environment is a Miniforge created one, so naturally, you should activate it with Miniforge.

The same is true for trying to open an Anaconda environment with Miniforge.
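The substring rule described above is easy to automate. The following sketch guesses, from an environment path as listed by conda env list, which conda tool should activate it. The function name and the example Anaconda environment name are hypothetical.

```python
def conda_tool_for(env_path: str) -> str:
    """Guess which conda installation owns an environment from its path."""
    parts = env_path.split("/")
    if "miniforge3" in parts:
        return "miniforge"
    if "anaconda3" in parts:
        return "anaconda"
    return "unknown"


if __name__ == "__main__":
    for path in (
        "/Users/peter/miniforge3/envs/apple_tensorflow",
        "/Users/peter/opt/anaconda3/envs/some_intel_project",  # made-up name
    ):
        print(path, "->", conda_tool_for(path))
```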

4. This GitHub TensorFlow issue 153 page explains the procedure, up to the point where I got an “ERROR not a supported wheel on this platform”.

5. This page, where other people report problems with the same error:

ERROR: tensorflow_addons_macos-0.1a3-cp38-cp38-macosx_11_0_arm64.whl is not a supported wheel on this platform

strangely had no solution to it, but made me realise that simply not naming specific wheel (.whl) files and doing

$ pip install tensorflow-macos

solved this problem.

6. A reminder on this Apple developer forum page to first run

$ python3 -m pip install --upgrade pip

before pip installing tensorflow-macos was a time saver too.

D. Extra Comparison with both a Wintel and a Mac Intel Machine

We also installed TensorFlow and ran the same program on an Intel-based Windows machine with an NVIDIA GPU as well as on an Intel-based Mac, and described the processes here and here.

Runtimes for 3 machines, for (eager, non eager) * (GPU, no GPU).

In bar graphs this gives:

For the two most recent systems (the Mac M1 from 2021 and the Wintel with Quadro T1000 from 2019), using their GPU improves speed by about a factor 4. For the older system (the Intel-based iMac with AMD GPU from 2015), the GPU helps by a factor 3 in non-eager mode and 5 in eager mode.

We find that the 2021 M1 Max chip (with 8 performance and 2 efficiency CPU cores, and an embedded GPU with 32 cores) is about twice as fast in all modes as the 2019 Wintel i7 when considering CPUs. For executions on GPUs, the time reduction is about 30%. Compared to the Intel Mac from 2015, the CPU time is reduced by about a factor 2 and the GPU time by about a factor 2 as well, in all modes.
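Since the comparison mixes "times as fast" with "time reduction", it helps to convert between the two. The sketch below does so with the relation factor = 1 / (1 - reduction); the function names are my own.

```python
def speedup_from_time_reduction(reduction: float) -> float:
    """A runtime cut by `reduction` (0 <= reduction < 1) gives this speed factor."""
    if not 0 <= reduction < 1:
        raise ValueError("reduction must be in [0, 1)")
    return 1.0 / (1.0 - reduction)


def time_reduction_from_speedup(factor: float) -> float:
    """Being `factor` times as fast cuts the runtime by this fraction."""
    if factor <= 0:
        raise ValueError("factor must be positive")
    return 1.0 - 1.0 / factor


if __name__ == "__main__":
    print(f"30% less time  -> {speedup_from_time_reduction(0.30):.2f}x as fast")
    print(f"2x as fast     -> {time_reduction_from_speedup(2.0):.0%} less time")
```

So a 30% time reduction on the GPU corresponds to being about 1.4 times as fast, while "twice as fast" on the CPU corresponds to a 50% time reduction.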

Footnotes:

¹ Conda is an open source environment and package manager. Miniconda is a free installer for Conda, Python, and a few other useful packages. Anaconda is also a package manager that has a much larger number of packages that you can install. Reference
² Should I use Anaconda or Miniconda? See this docs.conda.io page.
³ Miniforge is the community (conda-forge) driven minimalistic conda installer. Subsequent package installations thus come from the conda-forge channel. Miniconda is the Anaconda (company) driven minimalistic conda installer. Subsequent package installations come from the anaconda channels (default or otherwise). Reference. Miniforge is applicable here because it provides packages tensorflow-macos and tensorflow-metal to get the most out of the M1 chip at ARM level.

Written by Peter Sels on Aug 27th 2022.
