Tools to Bind to Python

This was originally given as a PyHEP 2018 talk. It is designed to be interactive and can be run in SWAN if you have a CERN account. To run it manually, just download the repository: github.com/henryiii/pybindings_cc. It is easy to run in Anaconda.

Focus

  • What Python bindings do
  • How Python bindings work
  • What tools are available

Caveats

  • Will cover C++ and C binding only
  • Will not cover every tool available
  • Will not cover cppyy in detail (but see Enric’s talk)
  • Python 2 is dying, long live Python 3!
    • but this talk is Py2 compatible also

Overview:

Part one

Tool(s)       | Features                   | Opinion of Author
ctypes, CFFI  | Pure Python, C only        | Great for simple cases
CPython       | How all bindings work      | Too complex for most cases
SWIG          | Multi-language, automatic  | Too automatic for most cases
Cython        | New language               | Can be very verbose
Pybind11      | Pure C++11                 | Often a great fit
CPPYY         | From ROOT’s JIT engine     | Handles templates!

Part two

  • An advanced binding in Pybind11

Timeline

If you have the original talk from the repository, it is an interactive notebook, and no code will be hidden. Here are the required packages:

!pip install --user cffi pybind11 numba
# Other requirements: cython cppyy (SWIG is also needed but not a python module)
# Using Anaconda recommended for users not using SWAN

If you are not on SWAN, you will also want cython and cppyy. SWIG is needed as well, but it is not a Python module, so it has to be installed separately.

Here are the standard imports. We will also add two variables to help with compiling:

from __future__ import print_function
import os
import sys
from pybind11 import get_include

inc = "-I " + get_include(user=True) + " -I " + get_include(user=False)
plat = "-undefined dynamic_lookup" if "darwin" in sys.platform else "-fPIC"
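
The builds later in this talk also need Python.h. Depending on the pybind11 version and the environment, the directories returned by get_include may not contain it. A minimal sketch of how the Python header directory could be located with the standard sysconfig module; the names python_include and py_inc are assumptions for illustration, not part of the original notebook:

```python
import sysconfig

# Directory that holds Python.h for the running interpreter
python_include = sysconfig.get_paths()["include"]

# Hypothetical extra flag, written in the same style as `inc` above
py_inc = "-I " + python_include
print(py_inc)
```

If a compile step below cannot find Python.h, a flag like this could be added next to {inc} on the command line.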

What is meant by bindings?

Bindings allow a function(ality) in a library to be accessed from Python.

We will start with this example:

%%writefile simple.c

float square(float x) {
    return x*x;
}

Desired usage in Python:

y = square(x)

ctypes

C bindings are very easy. Just compile the code into a shared library, then open it in Python with the built-in ctypes module:

!cc simple.c -shared -o simple.so
from ctypes import cdll, c_float

lib = cdll.LoadLibrary("./simple.so")
lib.square.argtypes = (c_float,)
lib.square.restype = c_float
lib.square(2.0)
4.0

This may be all you need! Example: AmpGen Python interface. In fact, in Pythonista for iOS, we can even use ctypes to access Apple’s public APIs!
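
One detail worth keeping in mind: square takes and returns a C float, so values pass through single precision on the way in and out. A small sketch of that conversion using only ctypes (no shared library needed); the helper name through_c_float is made up for illustration:

```python
from ctypes import c_float


def through_c_float(x):
    """Round a Python float (a C double) to the nearest C float and back."""
    return c_float(x).value


# 2.0 is exactly representable in single precision, so it survives unchanged
print(through_c_float(2.0) == 2.0)

# 0.1 is not, so the round trip changes the value slightly
print(through_c_float(0.1) == 0.1)
```

So a call such as lib.square(0.1) should not be expected to match 0.1 * 0.1 computed in Python at full double precision.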

CFFI

  • The C Foreign Function Interface for Python
  • Still C only
  • Developed for PyPy, but available in CPython too

We start with the same example as before:

from cffi import FFI

ffi = FFI()
ffi.cdef("float square(float);")
C = ffi.dlopen("./simple.so")
C.square(2.0)
4.0

CPython

  • Let’s see how bindings work before going into C++ binding tools
  • This is how CPython itself is implemented

C reminder: static means visible in this file only

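%%writefile pysimple.c
#include <Python.h>

/* The plain C function we want to expose */
float square(float x) { return x * x; }

/* Wrapper: unpack the Python arguments, call square, box the result */
static PyObject* square_wrapper(PyObject* self, PyObject* args) {
    float input, result;
    /* "f" converts one Python number to a C float; NULL signals an error */
    if (!PyArg_ParseTuple(args, "f", &input)) {
        return NULL;
    }
    result = square(input);
    return PyFloat_FromDouble(result);
}

/* Method table: Python name, C function, calling convention, docstring */
static PyMethodDef pysimple_methods[] = {
    {"square", square_wrapper, METH_VARARGS, "Square function"},
    {NULL, NULL, 0, NULL} /* sentinel */
};

#if PY_MAJOR_VERSION >= 3
/* Python 3: module definition plus init function */
static struct PyModuleDef pysimple_module = {
    PyModuleDef_HEAD_INIT, "pysimple", NULL, -1, pysimple_methods
};

PyMODINIT_FUNC PyInit_pysimple(void) {
    return PyModule_Create(&pysimple_module);
}
#else
/* Python 2: init function only */
DL_EXPORT(void) initpysimple(void) {
    Py_InitModule("pysimple", pysimple_methods);
}
#endif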

Build:

!cc {inc} -shared -o pysimple.so pysimple.c {plat}

Run:

import pysimple

pysimple.square(2.0)
4.0
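
The wrapper's job is to convert between Python objects and C values. The same C API functions can be poked from Python itself through ctypes.pythonapi, which gives a feel for what the unboxing and boxing steps do. A sketch, assuming CPython (it will not work on other interpreters); PyFloat_AsDouble stands in here for the unpacking that PyArg_ParseTuple does in the wrapper, and square_via_c_api is a made-up name:

```python
import ctypes

# Unboxing: Python float object -> C double
as_double = ctypes.pythonapi.PyFloat_AsDouble
as_double.argtypes = (ctypes.py_object,)
as_double.restype = ctypes.c_double

# Boxing: C double -> new Python float object (the call square_wrapper ends with)
from_double = ctypes.pythonapi.PyFloat_FromDouble
from_double.argtypes = (ctypes.c_double,)
from_double.restype = ctypes.py_object


def square_via_c_api(obj):
    """Mimic square_wrapper: unbox the argument, square it, box the result."""
    value = as_double(obj)
    return from_double(value * value)


print(square_via_c_api(2.0))
```

Every binding tool in the rest of this talk generates code that performs these conversions for you.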

C++: Why do we need more?

Sometimes simple is enough! And if we are in C++, we can use extern "C" to export a C interface. But a C++ API can have overloading, classes, memory management, etc. We could translate everything manually using the C API, but there’s a better way…

Solution:

C++ binding tools!

This is our C++ example:

%%writefile SimpleClass.hpp
#pragma once

class Simple {
    int x;
  public:
    Simple(int x): x(x) {}
    int get() const {return x;}
};

SWIG

  • SWIG: Produces “automatic” bindings
  • Works with many output languages
  • Has supporting module built into CMake
  • Very mature

Downsides:

  • Can be all or nothing
  • Hard to customize
  • Customizations tend to be language specific
  • Slow development

%%writefile SimpleSWIG.i

%module simpleswig
%{
/* Includes the header in the wrapper code */
#include "SimpleClass.hpp"
%}

/* Parse the header file to generate wrappers */
%include "SimpleClass.hpp"

!swig -python -c++ SimpleSWIG.i
!c++ -shared SimpleSWIG_wrap.cxx {inc} -o _simpleswig.so {plat}

import simpleswig

x = simpleswig.Simple(2)
x.get()
2

Cython

  • Built to be a Python+C language for high performance computations
  • Competes with Numba in the high-performance computing space
  • Due to design, also makes binding easy
  • Easy to customize result
  • Can write Python 2 or 3, regardless of calling language

Downsides:

  • Requires learning a new(ish) language
  • Have to think with three hats
  • Very verbose

Aside: Speed comparison Python, Cython, Numba

We’ll take a quick minute to look at what Cython (and Numba) were built for: fast from-scratch computing.

If we look at a really stupidly useless example in Python:

def f(x):
    for _ in range(100000000):
        x = x + 1
    return x

And time it:

%%time
f(1)
CPU times: user 6.21 s, sys: 708 µs, total: 6.22 s
Wall time: 6.21 s

We’ll see that it takes a long time just to add numbers. Let’s try in Cython:

%load_ext Cython

%%cython
def f(int x):
    for _ in range(100000000):
        x = x + 1
    return x

%%timeit
f(23)
64.9 ns ± 3.96 ns per loop (mean ± std. dev. of 7 runs, 10000000 loops each)

Wow, that’s so much faster. In fact, if we assume one instruction per add, 100M adds in about 65 ns would mean we have a machine running at roughly 1.5 PHz! Hopefully, this does not sound right; in fact, the C compiler was smart enough to optimise the loop into a single add!
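
The arithmetic behind that claim, using the timings quoted above (the variable names are just for illustration):

```python
n_adds = 100000000     # loop iterations in f
python_time = 6.21     # seconds, wall time of the pure Python run
cython_time = 64.9e-9  # seconds per call, from %%timeit

# Cost of a single iteration in pure Python
python_per_add = python_time / n_adds

# Rate the Cython version would need if it really performed every add
implied_rate = n_adds / cython_time

print("Python: %.1f ns per add" % (python_per_add * 1e9))
print("Cython: %.2e adds per second implied" % implied_rate)
```

The whole Cython call takes about as long as one iteration of the Python loop, and the implied rate is of order 10^15 adds per second. Neither is plausible unless the loop itself is gone.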

Let’s try again in Numba. This time, we don’t need any magics or special compilers, just the numba library:

import numba


@numba.jit
def f(x):
    for _ in range(100000000):
        x = x + 1
    return x

%time
f(41)
CPU times: user 13 µs, sys: 1 µs, total: 14 µs
Wall time: 39.1 µs

%%timeit
f(41)
213 ns ± 19 ns per loop (mean ± std. dev. of 7 runs, 1000000 loops each)

Here, we have a similar result; the first run was “slow”, and the rest of the runs were almost as fast as Cython (and like Cython, this does not depend on the number of iterations; the LLVM compiler has optimised away the loop). There is a tiny bit more overhead in the call, and that’s it.

Binding with Cython

Back to our original problem, making bindings.

%%writefile simpleclass.pxd
# distutils: language = c++

cdef extern from "SimpleClass.hpp":
    cdef cppclass Simple:
        Simple(int x)
        int get()

%%writefile cythonclass.pyx
# distutils: language = c++

from simpleclass cimport Simple as cSimple

cdef class Simple:
    cdef cSimple *cself

    def __cinit__(self, int x):
        self.cself = new cSimple(x)

    def get(self):
        return self.cself.get()

    def __dealloc__(self):
        del self.cself

!cythonize cythonclass.pyx
Compiling pybindings_cc/cythonclass.pyx because it changed
[1/1] Cythonizing pybindings_cc/cythonclass.py

!g++ cythonclass.cpp -shared {inc} -o cythonclass.so {plat}

import cythonclass

x = cythonclass.Simple(3)
x.get()
3

pybind11

  • Similar to Boost.Python, but easier to build
  • Pure C++11 (no new language required), no dependencies
  • Builds remain simple and don’t require preprocessing
  • Easy to customize result
  • Great Gitter community
  • Used in GooFit 2.1+ for CUDA too [CHEP talk]

Downsides:

  • Still verbose
  • Development activity is variable

%%writefile pybindclass.cpp

#include <pybind11/pybind11.h>
#include "SimpleClass.hpp"

namespace py = pybind11;

PYBIND11_MODULE(pybindclass, m) {
    py::class_<Simple>(m, "Simple")
        .def(py::init<int>())
        .def("get", &Simple::get)
    ;
}

!c++ -std=c++11 pybindclass.cpp -shared {inc} -o pybindclass.so {plat}

import pybindclass

x = pybindclass.Simple(4)
x.get()
4

CPPYY

  • Born from ROOT bindings
  • Built on top of Cling
  • JIT, so can handle templates
  • See Enric’s talk for more

Downsides:

  • Header code runs in Cling
  • Heavy user requirements (Cling)
  • ROOT vs. pip version
  • Broken on SWAN due to ROOT version

import cppyy
cppyy.include("SimpleClass.hpp")
x = cppyy.gbl.Simple(5)
x.get()
5
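
All four C++ tools above end up exposing the same interface: a Simple class constructed from an int, with a get() method. A single check can therefore be run against each module. A sketch, with a pure Python stand-in so that it runs without any of the compiled modules; check_simple and PySimple are hypothetical names, not part of the talk:

```python
class PySimple(object):
    """Pure Python stand-in with the same interface as the C++ Simple class."""

    def __init__(self, x):
        self._x = int(x)

    def get(self):
        return self._x


def check_simple(cls, values=(0, 1, -7, 12345)):
    """Return True if cls(v).get() == v for every test value."""
    return all(cls(v).get() == v for v in values)


print(check_simple(PySimple))
```

With the modules built, the same function could be applied to simpleswig.Simple, cythonclass.Simple, pybindclass.Simple, or cppyy.gbl.Simple.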