This was originally given as a PyHEP 2018 talk. It is designed to be interactive and can be run in SWAN if you have a CERN account. To run it manually, just download the repository: github.com/henryiii/pybindings_cc. It is easy to run in Anaconda.
Focus
- What Python bindings do
- How Python bindings work
- What tools are available
Caveats
- Will cover C++ and C binding only
- Will not cover every tool available
- Will not cover cppyy in detail (but see Enric’s talk)
- Python 2 is dying, long live Python 3!
  - but this talk is also Py2 compatible
Overview:
Part one
Tool(s) | Features | Opinion of Author |
---|---|---|
ctypes, CFFI | Pure Python, C only | Great for simple cases |
CPython | How all bindings work | Too complex for most cases |
SWIG | Multi-language, automatic | Too automatic for most cases |
Cython | New language | Can be very verbose |
Pybind11 | Pure C++11 | Often a great fit |
CPPYY | From ROOT’s JIT engine | Handles templates! |
Part two
- An advanced binding in Pybind11
If you have the original talk from the repository, it is an interactive notebook, and no code will be hidden. Here are the required packages:
!pip install --user cffi pybind11 numba
# Other requirements: cython cppyy (SWIG is also needed but not a python module)
# Using Anaconda recommended for users not using SWAN
If you are not on SWAN, you will want cython and cppyy as well. SWIG is also needed but is not a Python module, so be sure you find a way to get that.
Here are the standard imports. We will also add two variables to help with compiling:
from __future__ import print_function
import os
import sys
import sysconfig
from pybind11 import get_include
inc = "-I " + get_include(user=True) + " -I " + get_include(user=False)
inc += " -I " + sysconfig.get_paths()["include"]  # directory containing Python.h
plat = "-undefined dynamic_lookup" if "darwin" in sys.platform else "-fPIC"
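The `!` shell lines later in this notebook interpolate `{inc}` and `{plat}` into the compiler command. As a rough sketch of what that expansion amounts to, the helper below assembles a similar command as an argument list. `build_command` is a hypothetical name that is not part of the talk, and the include directory in the usage line is made up.

```python
import sys


def build_command(source, output, include_dirs, platform=sys.platform):
    """Assemble a shared-library compile command like the notebook's `!cc` lines.

    Hypothetical helper for illustration; the flags mirror `inc` and `plat`.
    """
    command = ["cc", "-shared", "-o", output, source]
    for directory in include_dirs:
        command += ["-I", directory]
    if "darwin" in platform:
        # macOS: leave Python symbols unresolved until the module is loaded
        command += ["-undefined", "dynamic_lookup"]
    else:
        # Elsewhere: request position-independent code
        command += ["-fPIC"]
    return command


# Made-up include directory, for demonstration only
print(" ".join(build_command("pysimple.c", "pysimple.so", ["/opt/include"], "linux")))
```

Returning a list rather than a string means the result could be handed to `subprocess.run` without worrying about quoting paths that contain spaces.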
What is meant by bindings?
Bindings allow a function(ality) in a library to be accessed from Python.
We will start with this example:
%%writefile simple.c
float square(float x) {
return x*x;
}
Desired usage in Python:
y = square(x)
ctypes
C bindings are very easy. Just compile into a shared library, then open it in Python with the built-in ctypes module:
!cc simple.c -shared -o simple.so
from ctypes import cdll, c_float
lib = cdll.LoadLibrary("./simple.so")
lib.square.argtypes = (c_float,)
lib.square.restype = c_float
lib.square(2.0)
4.0
This may be all you need! Example: AmpGen Python interface. In fact, in Pythonista for iOS, we can even use ctypes to access Apple’s public APIs!
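To see what `argtypes` and `restype` do without compiling anything, the sketch below swaps the shared library for a Python callback wrapped by `ctypes.CFUNCTYPE`. That stand-in is an assumption made for portability, not how the talk loads `square`; the conversions at the call boundary are the same kind ctypes applies to a real C function.

```python
import ctypes
from ctypes import CFUNCTYPE, c_float

# Prototype equivalent to `float square(float)`: return type first, then arguments
SQUARE_PROTOTYPE = CFUNCTYPE(c_float, c_float)

# Stand-in for the compiled function: a Python callable behind a C function pointer
square = SQUARE_PROTOTYPE(lambda x: x * x)

print(square(2.0))

# A C float is single precision, so values are rounded at the boundary
print(c_float(0.1).value == 0.1)

# Arguments that cannot be converted to a C float are rejected before the call
try:
    square("two")
except ctypes.ArgumentError as error:
    print("rejected:", error)
```

The single-precision rounding is worth remembering: a Python float is a C `double`, so a `float` signature quietly loses digits on the way in and out.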
CFFI
- The C Foreign Function Interface for Python
- Still C only
- Developed for PyPy, but available in CPython too
We start with the same example as before:
from cffi import FFI
ffi = FFI()
ffi.cdef("float square(float);")
C = ffi.dlopen("./simple.so")
C.square(2.0)
4.0
CPython
- Let’s see how bindings work before going into C++ binding tools
- This is how CPython itself is implemented
C reminder: static means visible in this file only
%%writefile pysimple.c
#include <Python.h>
float square(float x) {return x*x; }
static PyObject* square_wrapper(PyObject* self, PyObject* args) {
float input, result;
if (!PyArg_ParseTuple(args, "f", &input)) {return NULL;}
result = square(input);
return PyFloat_FromDouble(result);}
static PyMethodDef pysimple_methods[] = {
{ "square", square_wrapper, METH_VARARGS, "Square function" },
{ NULL, NULL, 0, NULL } };
#if PY_MAJOR_VERSION >= 3
static struct PyModuleDef pysimple_module = {
PyModuleDef_HEAD_INIT, "pysimple", NULL, -1, pysimple_methods};
PyMODINIT_FUNC PyInit_pysimple(void) {
return PyModule_Create(&pysimple_module); }
#else
DL_EXPORT(void) initpysimple(void) {
Py_InitModule("pysimple", pysimple_methods); }
#endif
Build:
!cc {inc} -shared -o pysimple.so pysimple.c {plat}
Run:
import pysimple
pysimple.square(2.0)
4.0
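The wrapper above does three things: it unpacks the argument tuple, calls the C function, and boxes the result as a Python object. Below is a loose pure-Python imitation of those steps. The names `to_c_float` and `square_wrapper_sketch` are hypothetical, and `struct` merely mimics the narrowing to a C `float`; none of this is the CPython API itself.

```python
import struct


def to_c_float(value):
    """Round a Python number to single precision, as storing into a C float would."""
    return struct.unpack("f", struct.pack("f", float(value)))[0]


def square(x):
    """Stand-in for the C function `float square(float x)`."""
    return to_c_float(x * x)


def square_wrapper_sketch(args):
    """Imitate square_wrapper: parse the tuple, call the function, box the result."""
    # Step 1: the role played by PyArg_ParseTuple(args, "f", &input)
    if len(args) != 1:
        raise TypeError("function takes exactly 1 argument")
    (obj,) = args
    if not isinstance(obj, (int, float)):
        raise TypeError("a float is required")
    c_input = to_c_float(obj)
    # Step 2: call the underlying function
    result = square(c_input)
    # Step 3: the role played by PyFloat_FromDouble(result)
    return float(result)


print(square_wrapper_sketch((2.0,)))
```

Every binding tool in the rest of this talk generates some version of these three steps for you.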
C++: Why do we need more?
Sometimes simple is enough! And, if we are in C++, we can use extern "C" to export a C interface. But a C++ API can have overloading, classes, memory management, etc. We could manually translate everything using the C API, but there’s a better way…
Solution:
C++ binding tools!
This is our C++ example:
%%writefile SimpleClass.hpp
#pragma once
class Simple {
int x;
public:
Simple(int x): x(x) {}
int get() const {return x;}
};
Overwriting SimpleClass.hpp
SWIG
- SWIG: Produces “automatic” bindings
- Works with many output languages
- Has supporting module built into CMake
- Very mature
Downsides:
- Can be all or nothing
- Hard to customize
- Customizations tend to be language specific
- Slow development
%%writefile SimpleSWIG.i
%module simpleswig
%{
/* Includes the header in the wrapper code */
#include "SimpleClass.hpp"
%}
/* Parse the header file to generate wrappers */
%include "SimpleClass.hpp"
Overwriting SimpleSWIG.i
!swig -python -c++ SimpleSWIG.i
!c++ -shared SimpleSWIG_wrap.cxx {inc} -o _simpleswig.so {plat}
import simpleswig
x = simpleswig.Simple(2)
x.get()
2
Cython
- Built to be a Python+C language for high performance computations
- Competes with Numba in the high performance computation space
- Due to design, also makes binding easy
- Easy to customize result
- Can write Python 2 or 3, regardless of calling language
Downsides:
- Requires learning a new(ish) language
- Have to think with three hats
- Very verbose
Aside: Speed comparison Python, Cython, Numba
We’ll take a quick minute to look at what Cython (and Numba) were built for: fast from-scratch computing.
If we look at a really stupidly useless example in Python:
def f(x):
for _ in range(100000000):
x = x + 1
return x
And time it:
%%time
f(1)
CPU times: user 6.21 s, sys: 708 µs, total: 6.22 s
Wall time: 6.21 s
We’ll see that it takes a long time just to add numbers. Let’s try in Cython:
%load_ext Cython
%%cython
def f(int x):
for _ in range(100000000):
x=x+1
return x
%%timeit
f(23)
64.9 ns ± 3.96 ns per loop (mean ± std. dev. of 7 runs, 10000000 loops each)
Wow, that’s so much faster. In fact, if we assume this only consists of one instruction per add, then 100M instructions in about 65 ns would mean we have a 1.5 PHz machine! Hopefully, this does not sound right; in fact, the C compiler was smart enough to optimise the loop into a single add!
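The clock-rate estimate is a single division. A quick check using the timings reported in this section, under the same deliberately naive assumption of one instruction per add:

```python
iterations = 100000000      # loop count in f
cython_seconds = 64.9e-9    # time per call reported by %%timeit for the Cython version
python_seconds = 6.21       # wall time reported by %%time for the pure Python version

# Adds per second each timing would imply if the loop really ran
implied_rate_hz = iterations / cython_seconds
python_rate_hz = iterations / python_seconds

print("Cython timing implies {:.2f} PHz".format(implied_rate_hz / 1e15))
print("Pure Python manages about {:.1f} MHz".format(python_rate_hz / 1e6))
```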
Let’s try again in Numba. This time, we don’t need any magics or special compilers, just the numba library:
import numba
@numba.jit
def f(x):
for _ in range(100000000):
x = x + 1
return x
%%time
f(41)
CPU times: user 13 µs, sys: 1 µs, total: 14 µs
Wall time: 39.1 µs
%%timeit
f(41)
213 ns ± 19 ns per loop (mean ± std. dev. of 7 runs, 1000000 loops each)
Here, we have a similar result; the first run was “slow”, and the rest of the runs were almost as fast as Cython (and like Cython, this does not depend on the number of iterations; the LLVM compiler has optimised away the loop). There is a tiny bit more overhead in the call, and that’s it.
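The timing stops depending on the iteration count because adding 1 in a loop `n` times has the closed form `x + n`, which is what an optimising compiler can substitute. Here is a small pure-Python check of that equivalence. The function names are illustrative, and the loop count is kept small so the interpreted version finishes quickly.

```python
def f_loop(x, n):
    """The loop from this section, with a configurable count (n >= 0)."""
    for _ in range(n):
        x = x + 1
    return x


def f_closed_form(x, n):
    """What the loop reduces to once a compiler folds it away (n >= 0)."""
    return x + n


# The two agree for every starting value tried here
for start in (1, 23, 41):
    assert f_loop(start, 1000) == f_closed_form(start, 1000)

# The closed form handles the full 100M count instantly
print(f_closed_form(41, 100000000))
```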
Binding with Cython
Back to our original problem, making bindings.
%%writefile simpleclass.pxd
# distutils: language = c++
cdef extern from "SimpleClass.hpp":
cdef cppclass Simple:
Simple(int x)
int get()
%%writefile cythonclass.pyx
# distutils: language = c++
from simpleclass cimport Simple as cSimple
cdef class Simple:
cdef cSimple *cself
def __cinit__(self, int x):
self.cself = new cSimple(x)
def get(self):
return self.cself.get()
def __dealloc__(self):
del self.cself
!cythonize cythonclass.pyx
Compiling pybindings_cc/cythonclass.pyx because it changed
[1/1] Cythonizing pybindings_cc/cythonclass.py
!g++ cythonclass.cpp -shared {inc} -o cythonclass.so {plat}
import cythonclass
x = cythonclass.Simple(3)
x.get()
3
pybind11
- Similar to Boost::Python, but easier to build
- Pure C++11 (no new language required), no dependencies
- Builds remain simple and don’t require preprocessing
- Easy to customize result
- Great Gitter community
- Used in GooFit 2.1+ for CUDA too [CHEP talk]
Downsides:
- Still verbose
- Development pace is variable
%%writefile pybindclass.cpp
#include <pybind11/pybind11.h>
#include "SimpleClass.hpp"
namespace py = pybind11;
PYBIND11_MODULE(pybindclass, m) {
py::class_<Simple>(m, "Simple")
.def(py::init<int>())
.def("get", &Simple::get)
;
}
Overwriting pybindclass.cpp
!c++ -std=c++11 pybindclass.cpp -shared {inc} -o pybindclass.so {plat}
import pybindclass
x = pybindclass.Simple(4)
x.get()
4
CPPYY
- Born from ROOT bindings
- Built on top of Cling
- JIT, so can handle templates
- See Enric’s talk for more
Downsides:
- Header code runs in Cling
- Heavy user requirements (Cling)
- ROOT vs. pip version
- Broken on SWAN due to ROOT version
import cppyy
cppyy.include("SimpleClass.hpp")
x = cppyy.gbl.Simple(5)
x.get()
5