This repository contains code for the data analysis of Timepix-based gaseous detectors.
It contains code to calibrate a Timepix ASIC and to perform an event shape analysis of the data to differentiate between background events (mainly cosmic muons) and signal events (X-rays).
…
Many parts of this repository are specifically related to an InGrid-based X-ray detector in use at the CERN Axion Solar Telescope: http://cast.web.cern.ch/CAST/
…
This repository is one big project combining several tools used to analyze data from Timepix-based detectors, as well as data from the CAST experiment.
NOTE: If you are mainly interested in using the reconstruction and analysis utilities for TOS data, the Analysis folder is what you’re looking for. See the Installation section for more information.
- Analysis: Contains the ingrid module, which provides the major programs of this repository, raw_data_manipulation and reconstruction, and to a lesser extent (depending on your use case) likelihood.
  - raw_data_manipulation: Reads folders of raw TOS data and outputs to a HDF5 file. Supported TOS data types:
    - old ~2015 era Virtex V6 TOS
    - current Virtex V6 TOS
    - soon: current SRS TOS
  - reconstruction: Takes the output of the above program and performs the reconstruction of clusters within the data, i.e. calculates their geometric properties.
  - likelihood: Performs an event shape likelihood based analysis on the reconstructed data, comparing with reference X-ray datasets.
  The other files in the folder are imported by these programs. An exception is the skeleton program analysis, which will eventually become a wrapper of the other programs, so that a nicer interface can be provided. A combination of a GUI based on https://github.com/yglukhov/nimx with a readline based command line interface will be developed.
- CDL-RootToHdf5: A Python tool to (currently only) convert X-ray calibration data from the CAST detector lab from ROOT trees to HDF5 files. This could be easily extended to be a ROOT to HDF5 converter. TODO: this should be moved to Tools.
- endTimeExtractor: A Nim tool to extract the following information from a TOS run:
  - start of the run
  - end of the run
  - total run time
  and output it as an Org date string. TODO: should be moved to Tools.
- extractScintiTriggers: A Nim tool to extract the number of scintillator triggers of a TOS run (either read from a raw run folder or from a HDF5 file). Outputs the total numbers of those and provides functionality to copy raw files containing non-trivial scintillator counts (< 4095 cycles) to a different location, to view them with TOS's event display. TODO: should be moved to Tools.
- Figs: Plots, which were created from the analysis and have been used in a talk etc.
- InGridDatabase: A Nim program which provides, writes to and reads from the InGrid database. If a folder describing the used detector is given to it (containing fsr, threshold, thresholdMeans, ToT calibration and / or SCurves, plus an additional file containing the chip name and further information), the chip can be added to that database, which is simply a HDF5 file. The analysis program makes use of this database to read calibration-relevant data. TODO: link to an explanation of the required folder structure and add files / folders for the current chips part of the database.
- InGrid-Python: A Python module containing additional functions used in the Nim analysis (the fit of the Fe55 spectrum and the polya gas gain fit are done using https://github.com/yglukhov/nimpy) and the Python plotting tool (see below).
- LogReader: A Nim tool to read and process CAST slow control and tracking log files. From these, environmental sensor readings can be extracted if needed for data analysis purposes of CAST data, as well as information about when solar trackings took place. If a HDF5 file is given, the tracking information is added to the appropriate runs.
- NimUtil: The helpers nimble module. It contains general procedures used in the rest of the code, which are unrelated to CAST or Timepix detectors.
- Plotting: A Nim tool to create plots of Timepix calibration data. Reads from the InGrid database and plots ToT calibration (+ fits) and SCurves.
- PlottingPython: A set of Python plotting tools.
  - PyS_createBackgroundRate.py: used to create the background rate plots for the CAST data taking, after the likelihood analysis has been performed.
  - PyS_plotH5data.py: used to plot arbitrary 1D column data (basically everything resulting from the reconstruction) from the reconstruction HDF5 files.
- README.org: this file. :)
- resources: Contains data needed for analysis purposes, e.g. information about run numbers for data taking periods, the 2014/15 background rates etc. TODO: maybe add folders for known chips for the InGrid database in here, or at least an example directory.
- SolarEclipticToEarth: A simple Python tool, part of the solar chameleon analysis, which calculates the projection of the solar ecliptic onto Earth (the chameleon flux potentially varies greatly depending on solar latitude). TODO: should be moved to Tools.
- Tests: Some very simple “test cases”, which typically just test new features separately from the rest of the analysis programs.
- Tools: Directory for other smaller tools, for which a separate directory in the root of the repository does not make sense (either used too infrequently or are very specific and small tools).
- VerticalShiftProblem: A simple Python tool to plot CAST log data to debug a problem with the belt, which slipped and caused misalignment. That problem has since been fixed. TODO: should be moved to Tools.
The project has only a few dependencies, which are mostly easy to install. The Nim compiler is only a dependency to compile the Nim programs; if you just wish to run the built binaries, the Nim compiler is not a dependency! E.g. compiling raw_data_manipulation and reconstruction on an x86-64 Linux system creates (almost) dependency-free binaries.
The following shared libraries are linked at runtime:
- libhdf5
- libnlopt
- libmpfit
- libpcre
Their installation procedures are explained below.
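To check which shared libraries a built binary actually links at runtime, the standard ldd tool can be used once the binaries are built, e.g.:
ldd raw_data_manipulation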
A note about the dependency of the source code on the Nim compiler: this project strictly depends on the devel branch of the Nim compiler! If new features are implemented in the compiler (or in libraries it depends on, for that matter), which are useful for this project, they will be used! If you run into compilation issues, try to update the package which fails to compile to its current #head (if the error happens in a module not part of this repo) and update the Nim compiler!
A general note about compiling Nim programs: unless debugging the code, you should always compile your programs with the -d:release flag. It disables many different run time checks, which otherwise slow down the execution speed by a factor of 5 to 10, depending on the workload!
TODO: include an example of a config.nims, which defines common compilation flags like -d:release, --threads:on or -d:H5_LEGACY (if applicable), to ease the compilation process for users.
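In the meantime, a minimal sketch of what such a config.nims might look like (NimScript; flags as listed above, uncomment what applies to your setup):
# config.nims - applied automatically to compilations in this directory
switch("d", "release")       # disable runtime checks (5-10x faster)
switch("threads", "on")      # the major programs here require threads
# switch("d", "H5_LEGACY")   # only if you are limited to HDF5 1.8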
Nim is obviously required to compile the Nim projects of this repository. There are two approaches to install the Nim compiler: using choosenim or cloning the Nim repository.
Go to some folder where you wish to store the Nim compiler, e.g. ~/src, or create that folder if it does not exist:
cd ~/
mkdir src
Please replace this directory with your choice of directory in the rest of this section.
Then clone the git repository from GitHub (assuming git is installed):
git clone https://github.com/nim-lang/nim
enter the folder:
cd nim
and if you’re on a Unix system run:
sh build_all.sh
to build the compiler and additional tools like nimble (Nim's package manager), nimsuggest (smart auto-completion for Nim procs), etc.
Now add the following to your PATH variable in your shell's configuration file, e.g. ~/.bashrc:
# add location of Nim's binaries to PATH
export PATH=$PATH:$HOME/src/nim/bin
and finally reload the shell via
source ~/.bashrc
or the appropriate shell config (or start a new shell).
With this approach, updating the Nim compiler is trivial. First update your local git repository by pulling from the devel branch:
cd ~/src/nim
git pull origin devel
and finally use Nim's build tool koch to update the Nim compiler:
./koch boot -d:release
An alternative to the method above is to use choosenim. Type the following into your terminal:
curl https://nim-lang.org/choosenim/init.sh -sSf | sh
Then follow the instructions and extend the PATH variable in your shell's configuration file, e.g. ~/.bashrc.
Finally reload that file via:
source ~/.bashrc
or simply start a new shell.
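Since this project strictly requires the devel branch of Nim (see the note above), you can switch to it with choosenim via:
choosenim devel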
The major dependency of the Nim projects is HDF5. On a reasonably modern Linux distribution, libhdf5 should be part of the package repositories. The supported HDF5 versions are:
- 1.8: as a legacy mode; compile the Nim projects with -d:H5_LEGACY
- 1.10: the current HDF5 version and the default
If the HDF5 library is not available on your OS, you may download the binaries or the source code from the HDF group.
HDF View is a very useful tool to look at HDF5 files with a graphical user interface. For HEP users: it is very similar to ROOT’s TBrowser.
Although many package repositories contain a version of HDF View, it is typically relatively old. The current version is 3.0.0, which has some nice features, so it may be a good idea to install it manually.
The NLopt library is a nonlinear optimization library, which is used in this project to fit the rotation angle of clusters and to perform fits of the gas gain. The Nim wrapper is found at https://github.com/vindaar/nimnlopt. To build the C library, follow these instructions:
git clone https://github.com/stevengj/nlopt # clone the repository
cd nlopt
mkdir build
cd build
cmake ..
make
sudo make install
This introduces cmake as a dependency. Note that this installs libnlopt.so system-wide. If you do not wish to do that, you need to set your LD_LIBRARY_PATH accordingly!
Afterwards, installation of the Nim nlopt module is sufficient (done automatically later).
MPfit is a non-linear least squares fitting library. It is required as a dependency, since it’s used to perform different fits in the analysis. The Nim wrapper is located at https://github.com/vindaar/nim-mpfit. Compilation of this shared object is easiest by cloning the git repository of the Nim wrapper:
cd ~/src
git clone https://github.com/vindaar/nim-mpfit
cd nim-mpfit
And then build the library from the c_src directory as follows:
cd c_src
gcc -c -Wall -Werror -fpic mpfit.c mpfit.h
gcc -shared -o libmpfit.so mpfit.o
which should create libmpfit.so. Now install that library system-wide (again, to avoid having to deal with LD_LIBRARY_PATH manually). Depending on your system, a suitable choice may be /usr/local/lib/:
sudo cp libmpfit.so /usr/local/lib
Finally, you may install the Nim wrapper via
nimble install
or tell nimble to point to the directory of the repository via:
nimble develop
The latter makes updating the package much easier, since updating the git repository is enough.
Perl Compatible Regular Expressions (PCRE) is a library for regular expression matching. On almost any unix system, this library is already available. For some distributions (possibly some CentOS or Scientific Linux) it may not be. This currently means you'll have to build this library yourself.
The default regular expression library in Nim, the re module, is a wrapper around PCRE, due to PCRE's very high performance. However, the performance critical parts do not depend on PCRE anymore. In principle we could thus replace the re module with https://github.com/nitely/nim-regex, a purely Nim based regex engine. PRs welcome! :)
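For illustration (this snippet is not part of the repository; pattern and folder name are made up), this is the kind of PCRE-backed matching the re module provides:
import re

# hypothetical example: extract the run number from a TOS run folder name;
# `=~` matches against a PCRE pattern and injects the `matches` array
if "Run_168_180702-15-24" =~ re"Run_(\d+)_":
  echo "run number: ", matches[0]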
Blosc is a compression library used to compress the binary data in the HDF5 files. By default however Zlib compression is used, so this is typically not needed. If one wishes to read Timepix3 based HDF5 files however, this module is needed, although support for these detectors is currently not part of this repository.
Once the dependencies are installed, we can prepare the framework.
We start by cloning the TimepixAnalysis repository somewhere, e.g.:
cd ~/src
git clone https://github.com/Vindaar/TimepixAnalysis
The next step is to prepare installation of the modules within this repository. That means we need to install
- the helpers module
- the InGridDatabase module
- the ingrid (contains the analysis) module
This is done by calling either nimble install or nimble develop in the folders listed above, which contain a .nimble file.
Aside: the difference between nimble's install and develop commands is:
- install: copies the source files of the module to your local nimble packages folder, by default ~/.nimble/pkgs/
- develop: just creates a link in said folder, which points to the location where the source files lie, e.g. ~/src/TimepixAnalysis/InGridDatabase/src or similar.
Thus, using nimble develop is very convenient for packages which are updated frequently using git pull or which are actively developed by yourself. No reinstallation is necessary if the source changes.
Choosing nimble develop, we install the modules as follows (assuming you're in the root of the TimepixAnalysis repository):
cd NimUtil
nimble develop
cd ../InGridDatabase
nimble develop
cd ../Analysis
nimble develop
cd ..
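As a sanity check, nimble can list the installed packages, which should now include the three develop-linked modules:
nimble list -i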
Calling nimble develop in the Analysis directory will install all needed dependencies (in principle also the nimhdf5, nlopt and mpfit libraries).
If there are no regressions upstream in any of the packages (we install the #head of all dependencies), installation should be smooth and you should be set to compile the programs!
Now we're ready to compile the raw_data_manipulation and reconstruction programs. First enter the Analysis directory:
cd Analysis/ingrid
Now a basic Nim compilation looks as follows:
nim c raw_data_manipulation.nim
Here c stands for compile to C (technically just for compile with the default backend; the C target specifically is selected via cc). Alternatively you can use cpp to compile to C++, js to compile to JavaScript, or objc to compile to Objective-C. Note that the .nim file extension is optional.
If you compile a program to actually use it (and not to test or debug), you'll want to compile it with the -d:release flag, like so:
nim c -d:release raw_data_manipulation.nim
Since basically all programs in this project use multiple threads, another option is necessary: the --threads:on flag:
nim c -d:release --threads:on raw_data_manipulation.nim
This in principle is all you need to do to get a standalone binary, which depends on the aforementioned shared libraries.
By default the resulting binary is named after the compiled Nim file, without a file extension. If you wish a different filename, use the --out option:
nim c -d:release --threads:on --out:myName raw_data_manipulation.nim
Note: this can also be used to place the resulting binary in a different folder!
Note 2: take care that you can write neither --threads on nor --threads=on! The colon is mandatory.
The reconstruction program is compiled in the same way:
nim c -d:release --threads:on reconstruction.nim
For some parts of the later analysis a Python module is necessary, because we call Python code from Nim to perform two different fits.
Mainly we need to install the InGrid-Python module, via:
cd InGrid-Python
python3 setup.py develop
cd ..
potentially with sudo rights, depending on your setup. This will create a link to the InGrid-Python directory, similar to what nimble develop does.
TODO: to run the code for the gas gain calculations, we additionally need to compile a small Nim module, procsForPython.nim. This module defines several Nim procs, which are compiled as a shared object and called from Python in order to accelerate the fitting significantly.
That module needs to be compiled as:
cd Analysis/ingrid/
nim c -d:release --app:lib --out:procsForPython.so procsForPython.nim
and potentially copied over to the source directory of the InGrid-Python module.
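For reference, a nimpy-exported proc looks roughly like the following sketch. This is not the actual content of procsForPython.nim; the proc name and body are made up for illustration:
import nimpy

# hypothetical example of a proc callable from Python; after compiling
# with --app:lib as above, Python can simply `import procsForPython`
proc addVectors(x, y: seq[float]): seq[float] {.exportpy.} =
  result = newSeq[float](x.len)
  for i in 0 ..< x.len:
    result[i] = x[i] + y[i]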
If you run into problems trying to run one of the programs, it might be an easy fix.
An error such as
could not import: H5P_LST_FILE_CREATE_g
means that you compiled against a different HDF5 library version than the one installed on your system, which is linked at run time.
Solution: compile the program with the -d:H5_LEGACY option, e.g.:
nim c -d:release --threads:on -d:H5_LEGACY raw_data_manipulation.nim
Another common problem is an error such as:
Error: cannot open file: docopt
This indicates that the module named docopt (only an example) could not be imported. Most likely a simple
nimble install docopt
would suffice. A call to nimble install with a package name will try to install the package from the path declared in packages.json, found here:
https://github.com/nim-lang/packages/blob/master/packages.json
If you know that you need the #head of such a package, you can install it via:
nimble install "docopt@#head"
Note: depending on your shell, the " may not be needed.
Note 2: instead of a simple package name, you may also hand nimble a full path to a git or mercurial repository. This is necessary in some cases, e.g. for the seqmath module, because we depend on a fork:
nimble install "https://github.com/vindaar/seqmath#head"
The following Nim modules are definitely required for raw_data_manipulation and reconstruction:
- loopfusion
- arraymancer
- https://github.com/vindaar/seqmath#head
- https://github.com/vindaar/shell#head
- https://github.com/vindaar/ginger#head
- https://github.com/vindaar/ggplotnim#head
- nimhdf5
- docopt
- mpfit
- nlopt
- plotly
- zero_functional
- helpers
- nimpy
- karax
- parsetoml
- https://github.com/yglukhov/threadpools#head
- ingridDatabase
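Since nimble accepts several package names at once, the packages from the official package index among these can be installed in one go; the forks pinned to #head are installed individually, analogous to the seqmath example above:
nimble install loopfusion arraymancer nimhdf5 docopt mpfit nlopt plotly zero_functional nimpy karax parsetoml
nimble install "https://github.com/vindaar/shell#head"
(helpers and ingridDatabase are the local modules installed via nimble develop earlier.)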
In general the usage of the analysis programs is straightforward and explained in the docstring, which can be echoed by calling a program with the -h or --help option:
./raw_data_manipulation -h
would print:
Version: b49c061 built on: 2018-10-10 at 13:01:29
InGrid raw data manipulation.

Usage:
  raw_data_manipulation <folder> [options]
  raw_data_manipulation <folder> --runType <type> [options]
  raw_data_manipulation <folder> --out=<name> [--nofadc] [--runType=<type>] [--ignoreRunList] [options]
  raw_data_manipulation <folder> --nofadc [options]
  raw_data_manipulation -h | --help
  raw_data_manipulation --version

Options:
  --runType=<type>    Select run type (Calib | Back | Xray)
                      The following are parsed case insensetive:
                      Calib = {"calib", "calibration", "c"}
                      Back = {"back", "background", "b"}
                      Xray = {"xray", "xrayfinger", "x"}
  --out=<name>        Filename of output file
  --nofadc            Do not read FADC files
  --ignoreRunList     If set ignores the run list 2014/15 to indicate
                      using any rfOldTos run
  --overwrite         If set will overwrite runs already existing in the
                      file. By default runs found in the file will be
                      skipped. HOWEVER: overwriting is assumed, if you
                      only hand a run folder!
  -h --help           Show this help
  --version           Show version.
Similar docstrings are available for all programs.
In order to analyze a raw TOS run, we'd perform the following steps. The command line arguments are examples; the required ones will be explained, for the others see the docstrings.
Assuming we have a TOS run folder located in ~/data/Run_168_180702-15-24/:
cd ~/src/TimepixAnalysis/Analysis/ingrid
./raw_data_manipulation ~/data/Run_168_180702-15-24/ --runType=calibration --out=run168.h5
where we give the runType (either a calibration, background or X-ray finger run), which is useful to store in the resulting HDF5 file. For calibration runs, several additional reconstruction steps are done automatically during the reconstruction phase. We also store the data in a file called run168.h5; the default filename is run_file.h5. The HDF5 file now contains two groups (runs and reconstruction). runs stores the raw data. reconstruction is still mainly empty; some datasets are linked from the runs group.
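A rough sketch of the resulting file layout (the group names below are hypothetical; inspect your file with HDF View for the actual structure):
run168.h5
├── runs
│   └── run_168             # raw data of this run
└── reconstruction
    └── run_168             # mostly empty until reconstruction is run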
Alternatively you may also hand a directory which contains several run folders. So if you had several runs located in ~/data, simply handing that directory would work. The program would then work on all runs in ~/data, one after another. Each run is stored in its own group in the resulting HDF5 file.
Afterwards we go on to the reconstruction phase. Here the raw data is read back from the HDF5 file and clusters within events are separated and geometric properties calculated. This is done by:
./reconstruction run168.h5
After the reconstruction is done, and depending on whether the run type is a calibration or a background / X-ray finger run, you can continue to calculate further properties, e.g. the energy of all clusters.
The next step is to apply the ToT calibration to calculate the charge of all clusters via:
./reconstruction run168.h5 --only_charge
Note: this requires an entry for your chip in the InGrid database. See below for more information.
Once the charges are calibrated, you may calculate the gas gain of the run via:
./reconstruction run168.h5 --only_gas_gain
Note: this depends on an optional Python module to fit the polya distribution. See above for an explanation on how to compile that.
Finally, you can calculate the energy of all clusters by doing:
./reconstruction run168.h5 --only_energy_from_e
The last three steps are not part of the first call to reconstruction, due to non-trivial dependencies:
- charge calibration requires ToT data
- gas gain requires the Python module
- energy from charge requires the above two.
The full chain is summarized below.
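Putting the steps together, the full chain for our example calibration run looks like this:
./raw_data_manipulation ~/data/Run_168_180702-15-24/ --runType=calibration --out=run168.h5
./reconstruction run168.h5
./reconstruction run168.h5 --only_charge
./reconstruction run168.h5 --only_gas_gain
./reconstruction run168.h5 --only_energy_from_e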
For a full analysis, you’d now have to perform the likelihood analysis.
TODO: add a note about creation of Fe spectra
The likelihood analysis is the final step done in order to filter out events which are not X-ray like, based on a likelihood cut. The likelihood program, however, needs two different input files. This is not yet as streamlined as it should be, which is why it's not explained here in detail. Take a look at the docstring of the program or ask me (@Vindaar).
TODO: make the CDL data part of the repository somehow?
If you wish to perform the charge calibration and, from that, the energy calibration, you need to add your chip to the InGrid database.
For now take a look at InGridDatabase/src/ingridDatabase.nim to understand how to do that.
TODO: finish explanation on how to do that. For that first add example folder, which is handed.
There are several tools available to visualize the data created by the programs in this repository.
Some words…
The code in this repository is published under the MIT license.