Note: I now highly recommend cibuildwheel instead of custom binary wheels. See GHA Pure Python Wheels and GHA Binary Wheels for modern methods to produce wheels on GitHub Actions (directly applicable to Azure as well, with minor changes; cibuildwheel works on almost all major CI providers). See my new posts on cibuildwheel!
This is the third post in a series about Azure
DevOps. This one is about making Python wheels. If you want to play
nice with Python users, or you have a complex build, wheels will
make your package far more accessible to users. Wheels are faster to install and
to use, and more secure. We will quickly cover making universal wheels, then we
will move on to fully compiled binaries, including C++14, manylinux2010, and
other hot topics. This series was developed to update the testing and releasing
of Python packages for Scikit-HEP. The results of this tutorial can be seen
in the boost-histogram repository, under the .ci
folder.
See Azure Wheel Helpers for a complete implementation example!
Introduction
You should know the basics of pipelines from
my first post. We will be targeting Azure Release
Pipelines, which you can read about in
my second post. I will assume your package
already has a nice setup.py or setup.cfg and uses setuptools, though most of
this should be applicable to the other packaging tools as long as they make
wheels.
SDist
You should provide wheels, but you should always provide an SDist (source
distribution) as well. SDists are easy; there is just one thing to watch out
for: accidentally including too much in your package. You can (and should) write
a good MANIFEST.in (or whatever your packaging tool uses) to make sure you
don’t pick up .pyc, .git, and other files like that. Building your SDist
using CI, such as Azure, helps as well; since the directory can be packaged
before building or running anything, you are less likely to pick up surprises.
Always check your SDists when you are first building them, though; they are just
simple compressed archives (usually .tar.gz) and can be inspected easily.
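As a rough illustration of that inspection step, here is a minimal Python sketch that lists the members of a .tar.gz SDist and flags anything that looks like it should not be there. The pattern list and the function names are my own assumptions for illustration; adjust them to whatever your project considers junk.

```python
import tarfile
from fnmatch import fnmatch

# Hypothetical patterns for files that should not ship in an SDist.
UNWANTED_PATTERNS = ["*.pyc", "*/.git/*", "*/__pycache__/*", "*.so", "*.o"]


def find_unwanted(names, patterns=UNWANTED_PATTERNS):
    """Return the member names that match any unwanted pattern."""
    return [n for n in names if any(fnmatch(n, p) for p in patterns)]


def sdist_members(path):
    """List the file names stored in a .tar.gz SDist."""
    with tarfile.open(path, "r:gz") as tar:
        return tar.getnames()
```

You could call `find_unwanted(sdist_members("dist/my_package-0.1.0.tar.gz"))` (the file name here is made up) in a CI step and fail the job if the list is not empty.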
Adding a DevOps job to make an SDist is easy. Since you can run this right before making a universal wheel, it integrates into that workflow and you will see it in the next section. If you make binaries, however, it should probably be in its own dedicated job, since you only want it to run one time (so it can’t be in a matrix).
Universal wheels
If you do not have any compiled code, and your code will run on any version of Python that it supports [1], then you can make universal wheels. You can make them on any system, and run them on any system.
Here is my suggested azure-pipelines.yml entry:
- job: "Package"
  pool:
    vmImage: "ubuntu-16.04"
  steps:
    - template: .ci/azure-wheel.yml
Here, it doesn’t matter what image we use, so we pick a Linux image. The
contents of .ci/azure-wheel.yml could be the following:
steps:
  - task: UsePythonVersion@0
    inputs:
      versionSpec: "3.7"
      architecture: "x64"
  - script: |
      python -m pip install --upgrade pip
      python -m pip install --upgrade setuptools wheel
    displayName: "Install dependencies"
  - script: |
      python setup.py sdist
    displayName: "Make sdist"
  - script: |
      python setup.py bdist_wheel --universal
    displayName: "Make wheel"
  - task: PublishPipelineArtifact@0
    inputs:
      artifactName: "artifact"
      targetPath: "dist"
Here, we select a Python version (choice doesn’t matter; in fact, the default
may be fine). Next, we make sure pip, setuptools, and wheel are all
latest-and-greatest. The first call to setup.py makes the sdist; we do this
early to make sure we don’t pick up extra generated bits in our source
distribution. We then make the bdist; you can either pass --universal here,
or (better) set the following in your setup.cfg:
[bdist_wheel]
universal=1
We end by publishing the sdist and wheel as Azure pipeline artifacts; this way you can download them later or use them in Release Pipelines.
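If you want to double check that the wheel you produced really is universal, the tags in the file name tell you. Here is a small sketch that splits a wheel file name into its PEP 427 components; the helper names are mine, not part of any packaging tool, and the file names in the usage note are made up.

```python
def wheel_tags(filename):
    """Split a wheel file name into its PEP 427 components."""
    if not filename.endswith(".whl"):
        raise ValueError("not a wheel: " + filename)
    parts = filename[: -len(".whl")].split("-")
    # name-version[-build]-python-abi-platform
    if len(parts) not in (5, 6):
        raise ValueError("unexpected wheel file name: " + filename)
    python_tag, abi_tag, platform_tag = parts[-3:]
    return {
        "name": parts[0],
        "version": parts[1],
        "python": python_tag,
        "abi": abi_tag,
        "platform": platform_tag,
    }


def is_universal(filename):
    """True if the wheel claims to run on Python 2 and 3, any ABI, any platform."""
    tags = wheel_tags(filename)
    return (
        set(tags["python"].split(".")) == {"py2", "py3"}
        and tags["abi"] == "none"
        and tags["platform"] == "any"
    )
```

A universal build of a package called my_package would be named something like my_package-0.1.0-py2.py3-none-any.whl, for which `is_universal` returns True.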
Binary wheels
I like to set up Azure DevOps with two pipelines if you are making non-universal wheels; I really don’t need every possible combination of Python and OS every time I want a PR tested. So I’ll have a test pipeline and a slower build (packaging) pipeline. You can always manually trigger a pipeline in the UI if you need to check all possible Python combinations for some reason.
For the following examples, I will assume you have the package name set as a variable, for example:
variables:
  # This is the output name, - is replaced by _
  package_name: my_package
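The reason for the underscore is the escaping rule that PEP 427 specifies for wheel file names. A minimal sketch of that rule, using the regular expression given in the PEP (the function name is my own):

```python
import re


def wheel_name_component(project_name):
    """Escape a project name the way PEP 427 describes for wheel file names."""
    # Runs of characters other than letters, digits, "_" and "." become "_".
    return re.sub(r"[^\w\d.]+", "_", project_name, flags=re.UNICODE)
```

So a project called boost-histogram produces wheels starting with boost_histogram, and that is the value you would put in package_name.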
Publishing from many jobs with unique names
I am going to start at the end, by making a template that will publish whatever
I produce in ./dist, called .ci/azure-publish-dist.yml:
steps:
  - task: PublishPipelineArtifact@0
    inputs:
      artifactName: "artifact_$(Agent.OS)_$(Agent.JobName)_$(python.architecture)"
      targetPath: "dist"
This tries very hard to make sure all the outputs will have unique names, but
can still be collected with the glob artifact_* (other tasks, like publishing tests
and artifacts, also produce files here; if you do not use any other publishing
lines you can simplify this a little). If you do not run in a matrix (for
example, the SDist), the JobName will not be very descriptive.
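To make the naming scheme concrete, here is a tiny sketch of how such names compose and how the glob picks them up. The OS, job, and architecture values are illustrative stand-ins; the exact strings Azure substitutes for Agent.OS and Agent.JobName may look different.

```python
from fnmatch import fnmatch


def artifact_name(agent_os, job_name, architecture):
    """Mirror the artifactName template used in the publish step."""
    return "artifact_{}_{}_{}".format(agent_os, job_name, architecture)


# Illustrative values only, not captured from a real pipeline run.
names = [
    artifact_name("Linux", "64Bit2010", "x64"),
    artifact_name("Linux", "64Bit", "x64"),
    artifact_name("Linux", "32Bit", "x86"),
    artifact_name("Windows_NT", "Python37", "x64"),
    artifact_name("Windows_NT", "Python37_32", "x86"),
]

all_unique = len(set(names)) == len(names)
all_collected = all(fnmatch(n, "artifact_*") for n in names)
```

The point is just that the three variables together are enough to keep every job in the matrix from overwriting another job's artifact.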
Making the SDist
We need a job that will just make an SDist; let’s keep it clearly separate:
- job: LinuxSDist
  pool:
    vmImage: "ubuntu-16.04"
  variables:
    python.architecture: "none"
  steps:
    - script: |
        python -m pip install setuptools
        python setup.py sdist
      displayName: Publish sdist
    - template: azure-publish-dist.yml
That’s it for the sdist. We set python.architecture because we use it in the
publish step. We don’t need anything special to make an sdist, so it’s really
just these two lines. Warning about multiline commands: failures in lines before
the final line will not cause the job to fail, because only the exit status of
the last command is reported. However, this should be safe here, since the final
line is the one that produces the sdist.
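If you want to see that multiline behavior for yourself, here is a small sketch that runs two short scripts through a POSIX shell and compares their exit codes. It assumes `sh` is available on your PATH (so it will not work on a plain Windows prompt), and the script contents are made up for the demonstration.

```python
import subprocess


def run_script(script):
    """Run a multi-line script with POSIX sh and return its exit status."""
    result = subprocess.run(["sh", "-c", script], stdout=subprocess.DEVNULL)
    return result.returncode


# The failing first line is ignored; the exit status comes from the echo.
unchecked = "false\necho 'still running'\n"

# With "set -e" the shell stops at the first failing command.
checked = "set -e\nfalse\necho 'never reached'\n"
```

Here `run_script(unchecked)` gives 0 while `run_script(checked)` gives a non-zero status, which is why the longer scripts later in this post start with set -e.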
ManyLinux
Since there are many flavors of Linux, Python packagers have come up with a special subset of allowed interactions with the base operating system, and called that “ManyLinux1”. It’s based on CentOS 5, circa 2007, so in theory most Linux distributions released after 2007 should be able to run your wheel. The common exceptions are the unusual distros, like Alpine Linux and Clear Linux, which will download the sdist and build from source. But you can cover CentOS, Fedora, Ubuntu, and many others.
However, CentOS 5 (that is, Red Hat Enterprise Linux 5) has hit end-of-life, so
compiler packages and such are no longer being produced; the latest developer
toolset compiler is GCC 4.8, which does not support C++14. Recently, a new
CentOS 6 based manylinux, called manylinux2010, was released. You need a very recent
version of pip (19.0 or newer) to be able to use it. Note that it is also 64-bit only, if that matters
to you on Linux. Here’s an example of a DevOps job matrix that builds both
ManyLinux1 and ManyLinux2010 wheels:
- job: ManyLinux
  strategy:
    matrix:
      64Bit2010:
        arch: x86_64
        plat: manylinux2010_x86_64
        image: quay.io/pypa/manylinux2010_x86_64
        python.architecture: x64
      64Bit:
        arch: x86_64
        plat: manylinux1_x86_64
        image: quay.io/pypa/manylinux1_x86_64
        python.architecture: x64
      32Bit:
        arch: i686
        plat: manylinux1_i686
        image: quay.io/pypa/manylinux1_i686
        python.architecture: x86
  pool:
    vmImage: "ubuntu-16.04"
  steps:
    - script: |
        set -ex
        docker run -e PLAT=$(plat) -e package_name=$(package_name) --rm -v `pwd`:/io $(image) /io/.ci/build-wheels.sh
        ls -lh wheelhouse/
        mkdir -p dist
        cp wheelhouse/$(package_name)*.whl dist/.
      displayName: Build wheels
    - template: azure-publish-dist.yml
The first few lines should be clear to you by now; we set up three jobs, each
with some custom variables. For the script, we run docker and pass the variables
into the script using -e VARIABLE=$(variable). We map the current working
directory to /io in the container, and we run our script from its container
path. After it runs, we list the contents of the wheelhouse directory (which is
where we build our files instead of the more normal “dist” directory). Finally,
we copy just the package-related wheels to “dist”; if you built some other
wheels, like numpy, along the way, this keeps them out of your dist directory.
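If you would rather do that final copy step from Python than from the shell, a rough equivalent could look like the sketch below. The function names and the file names in the demo are made up for illustration; only the wheelhouse and dist directory names come from the job above.

```python
import glob
import os
import shutil
import tempfile


def collect_wheels(package_name, wheelhouse="wheelhouse", dist="dist"):
    """Copy only the wheels whose file name starts with package_name into dist."""
    os.makedirs(dist, exist_ok=True)
    copied = []
    pattern = os.path.join(wheelhouse, package_name + "*.whl")
    for path in sorted(glob.glob(pattern)):
        shutil.copy(path, dist)
        copied.append(os.path.basename(path))
    return copied


def demo():
    """Build a throwaway wheelhouse with made-up file names and collect from it."""
    with tempfile.TemporaryDirectory() as tmp:
        wheelhouse = os.path.join(tmp, "wheelhouse")
        os.makedirs(wheelhouse)
        for name in (
            "my_package-0.1.0-cp37-cp37m-manylinux1_x86_64.whl",
            "numpy-1.16.4-cp37-cp37m-manylinux1_x86_64.whl",
        ):
            open(os.path.join(wheelhouse, name), "w").close()
        return collect_wheels("my_package", wheelhouse, os.path.join(tmp, "dist"))
```

In the demo, the numpy wheel stays behind in the wheelhouse and only the my_package wheel ends up in dist.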
The helper file here is .ci/build-wheels.sh, and was based on the
official example.
#!/bin/bash
set -e -x

# Collect the pythons
pys=(/opt/python/*/bin)

# Filter out Python 3.4
pys=(${pys[@]//*34*/})

# Compile wheels
for PYBIN in "${pys[@]}"; do
    "${PYBIN}/pip" install -r /io/dev-requirements.txt
    "${PYBIN}/pip" wheel /io/ -w wheelhouse/
done

# Bundle external shared libraries into the wheels
for whl in wheelhouse/$package_name-*.whl; do
    auditwheel repair --plat $PLAT "$whl" -w /io/wheelhouse/
done

# Install packages and test
for PYBIN in "${pys[@]}"; do
    "${PYBIN}/python" -m pip install $package_name --no-index -f /io/wheelhouse
    "${PYBIN}/pytest" /io/tests
done
The main differences here from the official example are the package name (which I pass in) and the filter for Python 3.4 (NumPy does not provide Python 3.4 wheels, so including it slows down the build a lot).
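The bash array substitution used for the filter is a bit cryptic, so here is the same idea spelled out in Python. This is only an illustration of the logic, not something the build script uses; the directory names follow the /opt/python layout of the manylinux images.

```python
def select_pythons(bin_dirs, skip=("34",)):
    """Drop every interpreter directory whose path contains a skipped tag."""
    return [d for d in bin_dirs if not any(tag in d for tag in skip)]


available = [
    "/opt/python/cp27-cp27m/bin",
    "/opt/python/cp27-cp27mu/bin",
    "/opt/python/cp34-cp34m/bin",
    "/opt/python/cp35-cp35m/bin",
    "/opt/python/cp36-cp36m/bin",
    "/opt/python/cp37-cp37m/bin",
]

selected = select_pythons(available)
```

Like the bash version, this matches on a plain substring, so it would also drop any other path that happens to contain "34".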
If you want to build ManyLinux1 wheels with a newer version of GCC, I’ve created
a docker image skhep/manylinuxgcc-x86_64 (and skhep/manylinuxgcc-i686) with
a custom build of GCC 8 or 9; see the formula here. The
ManyLinux2010 image should make this obsolete eventually.
macOS
In order to support macOS, you need to pay attention to which version of macOS your Python was built for. Most sources of macOS Python are built against a recent version of macOS; the official Python.org builds target the oldest versions, and so should always be what you build your wheels against. So, if you want a completely generic setup, you should have something like this:
- script: .ci/macos-install-python.sh '$(python.version)'
  displayName: Install Python.org Python
If you want to do so in a general setup:
- script: .ci/macos-install-python.sh '$(python.version)'
  displayName: Install Python.org Python
  condition: and(succeeded(), eq(variables['Agent.OS'], 'Darwin'))
- task: UsePythonVersion@0
  inputs:
    versionSpec: "$(python.version)"
    architecture: "$(python.architecture)"
  condition: and(succeeded(), ne(variables['Agent.OS'], 'Darwin'))
The special setup only runs on macOS; other operating systems use the normal Azure Python task.
The contents of the macos-install-python.sh file:
#!/usr/bin/env bash
PYTHON_VERSION="$1"
case $PYTHON_VERSION in
  2.7)
    FULL_VERSION=2.7.16
    ;;
  3.6)
    FULL_VERSION=3.6.8
    ;;
  3.7)
    FULL_VERSION=3.7.3
    ;;
  *)
    # Fail early instead of building a broken URL from an empty FULL_VERSION
    echo "Unsupported Python version: $PYTHON_VERSION" >&2
    exit 1
    ;;
esac
INSTALLER_NAME=python-$FULL_VERSION-macosx10.9.pkg
URL=https://www.python.org/ftp/python/$FULL_VERSION/$INSTALLER_NAME
PY_PREFIX=/Library/Frameworks/Python.framework/Versions
set -e -x
curl "$URL" > "$INSTALLER_NAME"
sudo installer -pkg "$INSTALLER_NAME" -target /
sudo rm /usr/local/bin/python
sudo ln -s /usr/local/bin/python$PYTHON_VERSION /usr/local/bin/python
which python
python --version
python -m ensurepip
python -m pip install setuptools twine wheel numpy
This installs 2.7, 3.6, and 3.7. You have a choice here; the most recent releases of Python.org Python have special 64-bit only 10.9+ builds; if you prefer, you can use the older 10.6+ dual architecture builds. If you want to support Python 3.5 on macOS, you’ll need to do this as well as select an older patch release, because Python no longer provides binaries for it. For any C++ build, you’ll probably have to make your code 10.9+ anyway (because libstdc++ was removed in 10.14, and you need 10.9+ to get the replacement, libc++).
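To make the choice between the 10.9+ and 10.6+ installers explicit, here is a sketch of the same version lookup and URL construction in Python. The version table and URL pattern are copied from the shell script above; the function name and the macos_target parameter are my own additions, and you should check that the installer you ask for actually exists on python.org before relying on it.

```python
# Short version -> full patch release, taken from the shell script above.
FULL_VERSIONS = {"2.7": "2.7.16", "3.6": "3.6.8", "3.7": "3.7.3"}


def installer_url(python_version, macos_target="10.9"):
    """Build the python.org installer URL for a short version like "3.7"."""
    try:
        full_version = FULL_VERSIONS[python_version]
    except KeyError:
        raise ValueError("unsupported Python version: " + python_version)
    installer_name = "python-{}-macosx{}.pkg".format(full_version, macos_target)
    return "https://www.python.org/ftp/python/{}/{}".format(
        full_version, installer_name
    )
```

Passing macos_target="10.6" would point at the older dual-architecture build instead, and an unknown version raises an error rather than producing a broken URL.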
At this point, you can just make wheels the normal way, using the same code we will use on Windows (except for Python 2.7):
- script: |
    python -m pip wheel . -w wheelhouse/
  displayName: "Build wheel"
# <INSERT TESTING HERE>
- script: |
    ls -lh wheelhouse
    mkdir -p dist
    cp wheelhouse/$(package_name)* dist/.
  displayName: "Show wheelhouse"
We should end by delocating the wheels; like auditwheel above, this will try to make sure all dependencies are included and referenced properly:
- script: |
    python -m pip install delocate
    /Library/Frameworks/Python.framework/Versions/$(python.version)/bin/delocate-wheel dist/$(package_name)*.whl
  displayName: "Delocate wheels"
  condition: and(succeeded(), eq(variables['Agent.OS'], 'Darwin'))
This is macOS only, so I have added a condition here; you don’t need it unless you share code with Windows (or possibly Linux, but the Docker-centric build makes that unlikely).
Windows
If you don’t care about C++11 and Python 2.7 on Windows, then Windows is easy. First let’s assume you want a pretty standard matrix of versions. Make sure you include 32-bit for Windows; unlike on macOS (which removed 32-bit support years ago) and Linux (which may remove it soon), 32-bit is still the default download option for Windows on Python.org.
- job: Windows
  strategy:
    matrix:
      Python27:
        python.version: "2.7"
        python.architecture: "x64"
      Python36:
        python.version: "3.6"
        python.architecture: "x64"
      Python37:
        python.version: "3.7"
        python.architecture: "x64"
      Python27_32:
        python.version: "2.7"
        python.architecture: "x86"
      Python36_32:
        python.version: "3.6"
        python.architecture: "x86"
      Python37_32:
        python.version: "3.7"
        python.architecture: "x86"
  pool:
    vmImage: "vs2017-win2016"
  steps:
    - template: .ci/azure-setup.yml
    - template: .ci/azure-steps.yml
    - template: .ci/azure-publish-dist.yml
This is a pretty standard matrix (where one might complain that I didn’t use the “matrix” part of matrix where it could have been used, but this is simple). If you don’t need special compilers, this is pretty much trivial. Just run the normal setup, bdist, and publish. You don’t even need to delocate the wheels [2]. Let’s show what it would look like if you need a more powerful compiler, such as MSVC 2017. Note: Do not do this unless you need C++11+! It will force your users to have the MSVC 2015+ redistributable to run, instead of the “normal” 2008 redistributable that Python 2.7 requires. However, I love pybind11 (as you may have noticed from my previous posts), so this is a requirement for me. Hopefully no one is using Windows and Python 2.7 together.
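If spelling out every version and architecture combination by hand bothers you, one option is to generate the entries offline and paste them into the YAML. The following is only a sketch of that idea; the helper is hypothetical and the entry names simply mimic the ones used in the job above.

```python
from itertools import product


def windows_matrix(versions=("2.7", "3.6", "3.7"), architectures=("x64", "x86")):
    """Build the version/architecture combinations as a plain dictionary."""
    matrix = {}
    for architecture, version in product(architectures, versions):
        name = "Python" + version.replace(".", "")
        if architecture == "x86":
            name += "_32"  # same suffix as the hand-written 32-bit entries
        matrix[name] = {
            "python.version": version,
            "python.architecture": architecture,
        }
    return matrix
```

Calling `windows_matrix()` gives the same six entries as the job above; dumping the result with a YAML library of your choice would give you the block to paste in.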
Let’s look at the three files listed above. First, the Windows
.ci/azure-setup.yml:
- task: UsePythonVersion@0
  inputs:
    versionSpec: "$(python.version)"
    architecture: "$(python.architecture)"
- script: |
    mkdir -p dist
    python -m pip install --upgrade pip
    python -m pip install --upgrade pytest wheel twine setuptools
  displayName: "Install dependencies"
Everything there is normal. Next, .ci/azure-steps.yml:
- script: |
    call "C:\Program Files (x86)\Microsoft Visual Studio\2017\Enterprise\VC\Auxiliary\Build\vcvarsall.bat" $(python.architecture)
    set MSSdk=1
    set DISTUTILS_USE_SDK=1
    python -m pip wheel . -w wheelhouse/
  displayName: "Build wheel (Windows Python 2.7)"
  condition: and(succeeded(), eq(variables['python.version'], '2.7'))
- script: |
    python -m pip wheel . -w wheelhouse/
  displayName: "Build wheel"
  condition: and(succeeded(), ne(variables['python.version'], '2.7'))
- script: |
    ls -lh wheelhouse
    mkdir -p dist
    cp wheelhouse/$(package_name)* dist/.
  displayName: "Show wheelhouse"
# <INSERT TESTING HERE>
The special thing here is the setup for MSVC when you are running Python 2.7. You are forcing distutils (really setuptools) to ignore the built-in MSVC settings, and instead pick up the 2017 settings.
You have already seen the publish part.
Wrap up
With that, we have now covered how to make a complete set of Wheels for ManyLinux, Windows, and macOS. You can see an example of all this in action with the boost-histogram package; look in the .ci folder.
If you have suggestions or corrections, either let me know in the comments below, or open an issue here, since this is an open source blog. I would like to thank Eduardo Rodrigues, who helped me edit these posts before they were published.
Bonus: Operating system agnostic files
I actually share many of the files, at least for macOS and Windows. Here
is what azure-setup.yml looks like:
steps:
  - script: .ci/macos-install-python.sh '$(python.version)'
    displayName: Install Python.org Python
    condition: and(succeeded(), eq(variables['Agent.OS'], 'Darwin'))
  - task: UsePythonVersion@0
    inputs:
      versionSpec: "$(python.version)"
      architecture: "$(python.architecture)"
    condition: and(succeeded(), ne(variables['Agent.OS'], 'Darwin'))
  - script: |
      mkdir -p dist
      python -m pip install --upgrade pip
      python -m pip install --upgrade pytest wheel twine setuptools
    displayName: "Install dependencies"
And .ci/azure-steps.yml, including testing:
- script: |
    call "C:\Program Files (x86)\Microsoft Visual Studio\2017\Enterprise\VC\Auxiliary\Build\vcvarsall.bat" $(python.architecture)
    set MSSdk=1
    set DISTUTILS_USE_SDK=1
    python -m pip wheel . -w wheelhouse/
  displayName: "Build wheel (Windows Python 2.7)"
  condition:
    and(succeeded(), eq(variables['Agent.OS'], 'Windows_NT'),
    eq(variables['python.version'], '2.7'))
- script: |
    python -m pip wheel . -w wheelhouse/
  displayName: "Build wheel"
  condition:
    and(succeeded(), not(and(eq(variables['Agent.OS'], 'Windows_NT'),
    eq(variables['python.version'], '2.7'))))
- script: |
    ls -lh wheelhouse
    mkdir -p dist
    cp wheelhouse/$(package_name)* dist/.
  displayName: "Show wheelhouse"
- script: |
    python -m pip install $(package_name) --no-index -f wheelhouse
  displayName: "Install wheel"
- script: |
    python -m pytest --junitxml=junit/test-results.xml
  workingDirectory: tests
  displayName: "Test with pytest"
- task: PublishTestResults@2
  inputs:
    testResultsFiles: "**/test-*.xml"
    testRunTitle: "Publish test results for Python $(python.version)"
  condition: succeededOrFailed()
- script: |
    python -m pip install delocate
    /Library/Frameworks/Python.framework/Versions/$(python.version)/bin/delocate-wheel dist/$(package_name)*.whl
  displayName: "Delocate wheels"
  condition: and(succeeded(), eq(variables['Agent.OS'], 'Darwin'))
[1] This may sound odd. The most common case where it might not be true for pure Python code is if you have Python 2 code that is converted into Python 3 code using 2to3 by setup.py. This was the expected method for adopting Python 3 when it first came out, but it was quickly found to be a complete mess and is no longer in use. Most code is written to support both in a single code base if both versions are supported.
[2] Okay, I can’t get by with making Windows look that good. The reason you can’t delocate the wheels is that the DLL lookup on Windows is terrible, and you have to be very careful to bundle in any DLLs by hand with unique names so that you don’t break another wheel.