For High Energy Physics, the go-to framework for big data analysis has been
CERN’s ROOT framework. ROOT is a massive C++ library that even predates the
STL in some areas. It also includes a JIT C++ interpreter called Cling, probably
the best in the business. If you have heard of the Xeus C++ kernel for Jupyter,
that is built on top of Cling. ROOT has
everything a HEP physicist could want: math, plotting, histograms, tuple and
tree structures, a very powerful file format for IO, machine learning, Python
bindings, and more. It also does things like dictionary generation and arbitrary
class serialization (other large frameworks like Qt have similar generation
tools).
You may already be guessing one of the most common problems with ROOT: it is huge
and difficult to install. If you build from source, that’s a several-hour task
on a single core. The situation has gotten much better in the
last 6 years, and there are several places you can find ROOT, but there are
still areas where it is challenging. This is especially true for Python: ROOT is
linked to your distro’s Python (both python2 and python3 if your distro supports
them, as of ROOT 6.22), but the common rule for using Python is “don’t touch your
system Python”. Modern Python users should therefore be in a virtual environment,
and for that ROOT requires the system site-packages option to be enabled, which
is not always ideal. And, if you use the Anaconda Python distribution,
which is the most popular scientific distribution of Python and massively
successful for ML frameworks, the general rule even for people who build ROOT
themselves has been: don’t. But now you can get a fully featured ROOT binary
package for macOS or Linux, with Python 2.7, 3.6, 3.7, or 3.8, from Conda-Forge,
the most popular Anaconda community channel! Many more HEP recipes have now been
added to Conda-Forge as well, and ROOT now also provides a conda Docker image.
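
As a rough sketch of the virtual-environment workaround mentioned above, Python's standard-library venv module can create an environment with the system site-packages option enabled, which is what a distro-linked ROOT would need in order to be visible from inside it. The function names and the "root-env" directory are made up for illustration, and the snippet neither installs nor imports ROOT itself.

```python
import tempfile
import venv
from pathlib import Path


def create_env_with_system_packages(env_dir):
    """Create a virtual environment that can also see system site-packages."""
    # system_site_packages=True is the programmatic form of
    # `python -m venv --system-site-packages <dir>`.
    # with_pip=False keeps this sketch fast and offline.
    builder = venv.EnvBuilder(system_site_packages=True, with_pip=False)
    builder.create(env_dir)
    return Path(env_dir)


def sees_system_packages(env_dir):
    """Return True if the environment's pyvenv.cfg enables system site-packages."""
    cfg = Path(env_dir) / "pyvenv.cfg"
    for line in cfg.read_text().splitlines():
        key, _, value = line.partition("=")
        if key.strip() == "include-system-site-packages":
            return value.strip().lower() == "true"
    return False


if __name__ == "__main__":
    # "root-env" is a hypothetical directory name used only for this demo.
    with tempfile.TemporaryDirectory() as tmp:
        env = create_env_with_system_packages(Path(tmp) / "root-env")
        print("system site-packages visible:", sees_system_packages(env))
```

The trade-off is the one noted above: such an environment is no longer isolated from whatever else is installed into the system Python.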