Azure DevOps: Introduction

Continuous Integration (CI) is fantastic for software development and deployment. One of the newest entries into the CI market is Microsoft’s Azure DevOps. Their Open Source support is impressive; it is likely part of the recent push by Microsoft to be more Open Source friendly. Open Source projects get 10 parallel builds, unlimited build minutes, 6-hour job timeouts, and incredibly fast jobs on macOS, Linux, and Windows, all via a single platform. Quite a few major projects have been moving to Azure since the initial release in December 2018. The configuration of DevOps is second only to GitLab CI in ease of use and is possibly the most expressive system available. The multiple pipeline support also scales well to complicated procedures.

This is the first in a series of posts covering an introduction to setting up projects in Azure DevOps, developed to update the testing and releasing of Python packages for Scikit-HEP, a project for a coherent High Energy Physics Python analysis toolset. The second post covers release pipelines, and the third covers building binary Python packages using DevOps.

Note: I now highly recommend GitHub Actions, which is almost “Azure 2.0”, if you are interested in setting up CI. The language is very similar, although simplified, with some non-backward-compatible bugfixes (for example, a multiline expression now errors if any line fails, whereas in Azure it only errors if the last line fails). You can read my tutorials on GitHub Actions on the Scikit-HEP developer pages.

Categories: Azure DevOps  Tags: programming python ci azure 

ROOT on Conda Forge

Linux and macOS packages for Python 2.7, 3.6, 3.7, and 3.8

For High Energy Physics, the go-to framework for big data analysis has been CERN’s ROOT framework. ROOT is a massive C++ library that even predates the STL in some areas. It is also a JIT C++ interpreter called Cling, probably the best in the business. If you have heard of the Xeus C++ Kernel for Jupyter, that is built on top of Cling. ROOT has everything a HEP physicist could want: math, plotting, histograms, tuple and tree structures, a very powerful file format for IO, machine learning, Python bindings, and more. It also does things like dictionary generation and arbitrary class serialization (other large frameworks like Qt have similar generation tools).

You may already be guessing one of the most common problems for ROOT. It is huge and difficult to install: if you build from source, that’s a several-hour task on a single core. It has gotten much better in the last 6 years, and there are several places you can find ROOT, but there are still areas where it is challenging. This is especially true for Python. ROOT is linked to your distro’s Python (both python2 and python3 if your distro supports it, as of ROOT 6.22), but the common rule for using Python is “don’t touch your system Python”. Modern Python users should therefore be in a virtual environment, and for that ROOT requires the system site-packages option to be enabled, which is not always ideal. And if you use the Anaconda Python distribution, which is the most popular scientific distribution of Python and massively successful for ML frameworks, the general rule even for people who build ROOT themselves has been: don’t. But now, you can get a fully featured ROOT binary package for macOS or Linux, for Python 2.7, 3.6, 3.7, or 3.8, from Conda-Forge, the most popular Anaconda community channel! Many more HEP recipes have now been added to Conda-Forge as well, and ROOT now also provides a conda docker image!

Categories: Python  Tags: programming python physics root 

ROOT Install Options

New Conda Forge package of ROOT for Unix and more options

For particle physicists, ROOT is one of the most important toolkits around. It is a huge suite of tools that predates the C++ standard library, and has almost anything a particle physicist could want. It has driven developments in other areas too. ROOT’s current C++ interpreter, CLING, is the most powerful C++ interpreter available and is used by the Xeus project for Jupyter. The Python work has helped PyPy, with CPPYY also coming from ROOT. However, due to the size, complexity, and age of some parts of ROOT, it can be a bit challenging to install, and it is even more challenging when you want it to talk to Python. I would like to point to the brand-new Conda-Forge ROOT package for Linux and macOS, and point out a few other options for macOS installs. Note for Windows users: Because ROOT expects the type long to match the system pointer size, 64-bit Windows cannot be supported for quite some time. While you can use ROOT in 32-bit form, it is generally impossible to connect that to Python, which will usually be a 64-bit build.

Categories: Python  Tags: programming python physics root 

Histogram Speeds in Python

Let’s compare several ways of making histograms. I’m going to assume you would like to end up with a nice OO histogram interface, so all the 2D methods will fill a Physt histogram. We will be using a 2 x 1,000,000 element array and filling a 2D histogram, or 10,000,000 elements in a 1D histogram. Binnings are regular.

1D 10,000,000 item histogram

| Example | KNL | MBP | X24 |
|---|---|---|---|
| NumPy: histogram | 704 ms | 147 ms | 114 ms |
| NumPy: bincount | 432 ms | 110 ms | 117 ms |
| fast-histogram | 337 ms | 45.9 ms | 45.7 ms |
| Numba | 312 ms | 58.8 ms | 60.7 ms |
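
To make the "NumPy: bincount" row concrete, here is a minimal sketch of how a regular-binning 1D histogram can be filled with `np.bincount`. The helper name `hist1d_bincount` and its arguments are made up for illustration, and the implementation benchmarked in the post may differ. One known difference from `np.histogram`: this sketch treats the range as half-open, so a value exactly equal to the upper edge is dropped rather than counted in the last bin.

```python
import numpy as np


def hist1d_bincount(values, bins, lo, hi):
    """Count values into `bins` equal-width bins over [lo, hi) (illustrative helper)."""
    values = np.asarray(values, dtype=float)
    # Keep only values inside the half-open range [lo, hi).
    inside = values[(values >= lo) & (values < hi)]
    # Regular binning: the bin index is a scaled offset, truncated to an integer.
    idx = ((inside - lo) * (bins / (hi - lo))).astype(np.intp)
    # Guard against floating-point rounding producing an index equal to `bins`.
    np.minimum(idx, bins - 1, out=idx)
    # bincount tallies how often each index occurs.
    return np.bincount(idx, minlength=bins)


rng = np.random.default_rng(42)
vals = rng.normal(size=100_000)
counts = hist1d_bincount(vals, 100, -3.0, 3.0)
reference, _ = np.histogram(vals, bins=100, range=(-3.0, 3.0))
print(counts.sum(), reference.sum())
```

Because the bins are regular, no search over bin edges is needed, which is the reason this approach can beat the general-purpose `np.histogram`.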

2D 1,000,000 item histogram

| Example | KNL | MBP | X24 |
|---|---|---|---|
| Physt | 1.21 s | 293 ms | 246 ms |
| NumPy: histogram2d | 456 ms | 114 ms | 88.3 ms |
| NumPy: add.at | 247 ms | 62.7 ms | 49.7 ms |
| NumPy: bincount | 81.7 ms | 23.3 ms | 20.3 ms |
| fast-histogram | 53.7 ms | 10.4 ms | 7.31 ms |
| fast-hist threaded 0.5 | (6) 62.5 ms | 9.78 ms | (6) 15.4 ms |
| fast-hist threaded (m) | 62.3 ms | 4.89 ms | 3.71 ms |
| Numba | 41.8 ms | 10.2 ms | 9.73 ms |
| Numba threaded | (6) 49.2 ms | 4.23 ms | (6) 4.12 ms |
| Cython | 112 ms | 12.2 ms | 11.2 ms |
| Cython threaded | (6) 128 ms | 5.68 ms | (8) 4.89 ms |
| pybind11 sequential | 93.9 ms | 9.20 ms | 17.8 ms |
| pybind11 OpenMP atomic | 4.06 ms | 6.87 ms | 1.91 ms |
| pybind11 C++11 atomic | (32) 10.7 ms | 7.08 ms | (48) 2.65 ms |
| pybind11 C++11 merge | (32) 23.0 ms | 6.03 ms | (48) 4.79 ms |
| pybind11 OpenMP merge | 8.74 ms | 5.04 ms | 1.79 ms |

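
As a rough illustration of the "NumPy: add.at" and "NumPy: bincount" rows for the 2D case, the sketch below fills the same regular 2D histogram both ways. The function names and the `bins`/`ranges` argument layout are assumptions chosen for this example, not the code that was benchmarked, and the ranges are treated as half-open.

```python
import numpy as np


def _bin_indices_2d(x, y, bins, ranges):
    """Return per-axis bin indices for points inside the half-open ranges."""
    (bx, by), ((xlo, xhi), (ylo, yhi)) = bins, ranges
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    keep = (x >= xlo) & (x < xhi) & (y >= ylo) & (y < yhi)
    # Regular binning: scale the offset from the lower edge and truncate.
    ix = ((x[keep] - xlo) * (bx / (xhi - xlo))).astype(np.intp)
    iy = ((y[keep] - ylo) * (by / (yhi - ylo))).astype(np.intp)
    # Clamp to protect against floating-point rounding at the upper edge.
    return np.minimum(ix, bx - 1), np.minimum(iy, by - 1)


def hist2d_add_at(x, y, bins, ranges):
    """Fill with np.add.at, an unbuffered in-place add that handles repeated indices."""
    ix, iy = _bin_indices_2d(x, y, bins, ranges)
    out = np.zeros(bins, dtype=np.intp)
    np.add.at(out, (ix, iy), 1)
    return out


def hist2d_bincount(x, y, bins, ranges):
    """Fill by flattening (ix, iy) to one index and counting with np.bincount."""
    ix, iy = _bin_indices_2d(x, y, bins, ranges)
    bx, by = bins
    flat = ix * by + iy  # row-major flattening of the 2D bin index
    return np.bincount(flat, minlength=bx * by).reshape(bx, by)


rng = np.random.default_rng(0)
data = rng.normal(size=(2, 100_000))
a = hist2d_add_at(data[0], data[1], (10, 10), ((-3, 3), (-3, 3)))
b = hist2d_bincount(data[0], data[1], (10, 10), ((-3, 3), (-3, 3)))
print(np.array_equal(a, b))
```

Both functions share the same index computation, so any speed difference comes from the accumulation step alone, which is consistent with the gap between the two NumPy rows in the table.
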
Categories: Python  Tags: programming python 

Announcing CLI11 1.6

CLI11, a powerful library for writing beautiful command line interfaces in C++11, has been updated to 1.6, the largest update ever. CLI11 output is more customizable than ever, and has a better functionality separation under the hood.

CLI11 has had the formatting system completely redesigned, with minor or complete customization of the output possible. Reading and writing of configuration files can also be configured; a new example with JSON instead of INI formatting is included. Validators (finally) have custom help output as well. Many odd corner cases have been made possible, such as interleaving options.

Categories: cpp  Tags: programming cpp cli cli11 plumbum