From 9e79cf8ee050a001faacf0d6da6415aabf367f17 Mon Sep 17 00:00:00 2001 From: Sofia Calgaro Date: Mon, 11 Mar 2024 11:23:34 +0100 Subject: [PATCH 1/3] added SC docu --- docs/source/index.rst | 2 +- docs/source/manuals/get_sc_plots.rst | 94 +++++++++++++++++----------- docs/source/manuals/index.rst | 1 + 3 files changed, 59 insertions(+), 38 deletions(-) diff --git a/docs/source/index.rst b/docs/source/index.rst index 91826d7..09340b0 100644 --- a/docs/source/index.rst +++ b/docs/source/index.rst @@ -6,7 +6,7 @@ In particular, this tool helps: * set up dataframe objects containing channel map and status for a given subsystems (pulser, geds, spms) * get data for parameters (from raw/dsp/hit tiers or user defined ones) of interest based on a given dataset -* inspect parameters by providing either a time interval, runs or keys to inspect +* inspect parameters by providing either a time interval, a list of run(s) or key(s) to inspect * plotting status maps (e.g., ON/OFF/...) for each channel, spotting those that are problematic when overcoming/undercoming given thresholds Getting started diff --git a/docs/source/manuals/get_sc_plots.rst b/docs/source/manuals/get_sc_plots.rst index 0da4977..174802a 100644 --- a/docs/source/manuals/get_sc_plots.rst +++ b/docs/source/manuals/get_sc_plots.rst @@ -7,49 +7,84 @@ How to load SC data A number of parameters related to the LEGEND hardware configuration and status are recorded in the Slow Control (SC) database. The latter, PostgreSQL database resides on the ``legend-sc.lngs.infn.it`` host, part of the LNGS network. To access the SC database, follow the `Confluence (Python Software Stack) `_ instructions. -Data are loaded following the ``pylegendmeta`` tutorial , which shows how to inspect the database. +Data are loaded following the `pylegendmeta `_ tutorial, which shows how to retrieve info from the SC database. -... put here some text on how to specify the plotting of a SC parameter in the config file (no ideas for the moment)... 
+Available SC parameters +----------------------- + +Available parameters at the moment include: + +* ``PT114``, ``PT115``, ``PT118`` (cryostat pressures) +* ``PT202``, ``PT205``, ``PT208`` (cryostat vacuum) +* ``LT01`` (water loop fine fill level) +* ``RREiT`` (injected air temperature clean room), ``RRNTe`` (clean room temperature north), ``RRSTe`` (clean room temperature south), ``ZUL_T_RR`` (supply air temperature clean room) +* ``DaqLeft-Temp1``, ``DaqLeft-Temp2``, ``DaqRight-Temp1``, ``DaqRight-Temp2`` (rack present temperatures) +* if you want more, contact us! + +These can be easily accessed for any time range of interest by giving a my_config.json file as input to the command line in the following way: + +.. code-block:: -Files are collected in the output folder specified in the ``output`` config entry: + legend-data-monitor user_scdb --config my_config --port N --pswd ThePassword + +.. note:: + + - ``N`` is any number in the range 1024-65535. Setting a personal port different from the default one (5432) is the safer option: if a port is already in use by another user, you'll receive an error indicating that the port is already taken and you will not be able to access the SC database; + - ``ThePassword`` can be found on Confluence at `this page `_. + +An example of a config.json file is the following: .. code-block:: json { - "output": "/out", - // ... + "output": "/data1/users//prod-ref-v2", + "dataset": { + "experiment": "L200", + "period": "p09", + "version": "tmp-auto", + "path": "/data2/public/prodenv/prod-blind/", + "type": "phy", + "time_selection": ... + }, + "saving": "overwrite", + "slow_control": { + "parameters": ["DaqLeft-Temp1", "ZUL_T_RR"] + } + } -In principle, for plotting the SC data you would need just the start and the end of a time interval of interest. This means that SC data does not depend on any dataset info (``experiment``, ``period``, ``version``, ``type``) but ``time_selection``. 
-However, there are cases were we want to inspect a given run or time period made of keys as we usually do with germanium. +The meaning of each entry is explained below: -In the first case, we end up saving data in the following folder: +* ``output``: folder where to store output files; +* ``dataset``: -.. code-block:: + * ``experiment``: either *L60* (to be checked) or *L200* + * ``period``: period to inspect + * ``version``: prodenv version (e.g. *tmp-auto* or *ref-v1.0.0*) + * ``path``: global path to prod-blind prodenv folder + * ``type``: type of data to inspect (either *cal* or *phy*) + * ``time_selection``: list of either ``runs`` or ``timestamps`` (use the format *YMDTHMSZ*), or add entries ``start`` and ``end`` with format *Y-M-D H:M:S* (see below for more detailed info) - /out/ - └── generated - └── plt - └── SC - └── - ├── SC-.pdf - ├── SC-.log - └── SC-.{dat,bak,dir} +* ``saving``: either *overwrite* (overwrites any already present file) or *append* (takes the previous file and appends new data, e.g. for a new inspected time range) +* ``slow_control``: field for specifying SC parameters + + * ``parameters``: list of parameters to inspect (choose among the available ones) -Otherwise, we store the SC data/plots as usual: + +In principle, for plotting the SC data you would need just the start and the end of a time interval of interest. This means that SC data does not depend on any dataset info (i.e. on entries ``experiment``, ``period``, ``version``, ``type``). +However, these entries are needed to retrieve the channel map for the time range of interest. + +We store SC data in the following way: .. code-block:: - /out/ + └── generated └── plt └── └── - └── SC └── - ├── SC-.pdf - ├── SC-.log + ├── SC-.hdf └── SC-.{dat,bak,dir} @@ -63,18 +98,3 @@ Otherwise, we store the SC data/plots as usual: - if ``{'runs': 1}`` (one run), then = ``r001``; - if ``{'runs': [1, 2, 3]}`` (multiple runs), then = ``r001_r002_r003``. 
-Shelve output objects -~~~~~~~~~~~~~~~~~~~~~ -*Under construction...* - - -Available SC parameters ------------------------ - -Available parameters include: - -- ``PT114``, ``PT115``, ``PT118`` (cryostat pressures) -- ``PT202``, ``PT205``, ``PT208`` (cryostat vacuum) -- ``LT01`` (water loop fine fill level) -- ``RREiT`` (injected air temperature clean room), ``RRNTe`` (clean room temperature north), ``RRSTe`` (clean room temperature south), ``ZUL_T_RR`` (supply air temperature clean room) -- ``DaqLeft-Temp1``, ``DaqLeft-Temp2``, ``DaqRight-Temp1``, ``DaqRight-Temp2`` (rack present temperatures) diff --git a/docs/source/manuals/index.rst b/docs/source/manuals/index.rst index d4d8c7d..fa1dc56 100644 --- a/docs/source/manuals/index.rst +++ b/docs/source/manuals/index.rst @@ -6,4 +6,5 @@ User Manual avail_pars get_plots + get_sc_plots inspect_plots From 49bde726d455f7870efbca9ba28ff40fcb6af6e7 Mon Sep 17 00:00:00 2001 From: Sofia Calgaro Date: Mon, 11 Mar 2024 12:06:29 +0100 Subject: [PATCH 2/3] fixed old docu --- docs/source/manuals/avail_pars.rst | 8 ++ docs/source/manuals/get_plots.rst | 36 +++++---- docs/source/manuals/inspect_plots.rst | 111 ++++++-------------------- 3 files changed, 52 insertions(+), 103 deletions(-) diff --git a/docs/source/manuals/avail_pars.rst b/docs/source/manuals/avail_pars.rst index 4753c5e..d69f71e 100644 --- a/docs/source/manuals/avail_pars.rst +++ b/docs/source/manuals/avail_pars.rst @@ -44,6 +44,14 @@ Available parameters - you can pick only ``phy`` or ``all`` entries - you can flag special events, like ``pulser``, ``pulser01ana``, ``FCbsln`` or ``muon`` events +.. warning:: + + It has been found out that no muon signals were being recorded in the auxiliary channel MUON01 for periods p08 and p09 (up to r003 included). + This means the present code is not able to flag the germanium events for which there was a muon crossing the experiment. + In other words, the dataframe associated to the ``muon`` events here will be empty. 
+ Moreover, if you select ``phy`` entries, these will still contain muons since the muon cut does not work. + + .. important:: Special parameters are typically saved under ``settings/special-parameters.json`` and carefully handled when loading data. diff --git a/docs/source/manuals/get_plots.rst b/docs/source/manuals/get_plots.rst index dc83434..f7402ad 100644 --- a/docs/source/manuals/get_plots.rst +++ b/docs/source/manuals/get_plots.rst @@ -7,9 +7,9 @@ After the installation, a executable is available at ``~/.local/bin``. To automatically generate plots, two different methods are available. All methods rely on the existence of a config file containing the output folder (``output``) where to store results, the ``dataset`` you want to inspect, and the ``subsystems`` (pulser, geds, spms) -you want to study and for which you want to load data. +you want to study and for which you want to load data. See the next section for more details. -You can either run it by importing the ``legend-data-monitor`` module: +You can either run the code by importing the ``legend-data-monitor`` module: .. code-block:: python @@ -23,11 +23,21 @@ Or run it by parsing to the executable the path to the config file: $ legend-data-monitor user_prod --config path_to_config.json +If you want to inspect bunches of data (useful to prevent the process from being killed +when loading lots of heavy files), you can use + +.. code-block:: bash + + $ legend-data-monitor user_bunch --config path_to_config.json --n_files N + +where ``N`` specifies how many files you want to inspect together at each iteration, e.g. ``N=40`` +(one run is usually made up of ca. 160 files). + + .. warning:: Use the ``user_prod`` command line interface for generating your own plots. - ``auto_prod`` was designed to be used during automatic data production, for generating monitoring plots on the fly when processing data. For the moment, no documentation will be provided. 
- ``user_rsync_prod`` was designed to be used by an user for a personal automatic plot generation, using rsync to synchronize with lh5 files automatically produced. + ``auto_prod`` and ``user_rsync_prod`` were designed to be used during automatic data production, for generating monitoring plots on the fly for new processed data. For the moment, no documentation will be provided. Configuration file @@ -40,12 +50,12 @@ Example config .. code-block:: json { - "output": "/out", // output folder + "output": "", // output folder "dataset": { "experiment": "L200", - "period": "p02", - "version": "v06.00", - "path": "/data1/users/marshall/prod-ref", + "period": "p09", + "version": "tmp-auto", + "path": "/data2/public/prodenv/prod-blind/", "type": "phy",// data type (either cal, phy, or ["cal", "phy"]) "start": "2023-02-07 02:00:00", // time cut (here based on start+end) "end": "2023-02-07 03:30:00" @@ -86,16 +96,8 @@ In particular, ``dataset`` settings are: - ``'window': '1d 2h 0m'`` ( time window in the past from current time point) in format ``Xd Xh Xm`` for days, hours, minutes; - ``'runs': 1`` (one run) or ``'runs': [1, 2, 3]`` (list of runs) in integer format. -.. - Note: currently taking range between earliest and latest i.e. also including the ones in between that are not listed, will be modified to either - - 1. require only two timestamps as start and end, or - 2. get only specified timestamps (strange though, because would have gaps in the plot) - - The same happens with run selection. - -Then, ``subsystems`` can either be ``pulser``, ``geds`` or ``spms`` (note, 2023-03-07: spms plots are not implemented yet, but DataLoader can load the respective data if needed). +Then, ``subsystems`` can either be ``pulser``, ``geds`` or ``spms`` (note: spms plots are not implemented yet, but DataLoader can load the respective data if needed). 
For each subsystem to be plotted, specify diff --git a/docs/source/manuals/inspect_plots.rst b/docs/source/manuals/inspect_plots.rst index cf02410..b770e3e 100644 --- a/docs/source/manuals/inspect_plots.rst +++ b/docs/source/manuals/inspect_plots.rst @@ -4,16 +4,22 @@ How to inspect plots Output files ------------ -After the code has run, shelve object files containing the data and plots generated for the inspected parameters/subsystems +After the code has run, hdf object files containing the data and plots generated for the inspected parameters/subsystems are produced, together with a pdf file containing all the generated plots and a log file containing running information. In particular, the last two files are created for each inspected subsystem (pulser, geds, spms). +.. warning:: + + Shelve files are produced as an output as well, this was the first format chosen for the output. + The code still has to be fixed to remove these files from routines. + At the moment, they are important when using the ``"saving": "append"`` option, so do not remove them if you are going to use it! + Files are usually collected in the output folder specified in the ``output`` config entry: .. code-block:: json { - "output": "/out", + "output": "", // ... Then, depending on the chosen dataset (``experiment``, ``period``, ``version``, ``type``, time selection), @@ -21,7 +27,7 @@ different output folders can be created. In general, the output folder is struct .. code-block:: - /out/ + └── prod-ref └── └── generated @@ -32,6 +38,7 @@ different output folders can be created. In general, the output folder is struct ├── ----.pdf ├── ----.log └── ---.{dat,bak,dir} + └── --- = ``r001_r002_r003``. -Shelve output objects -~~~~~~~~~~~~~~~~~~~~~ -*Under construction... 
(structure might change over time, but content should remain the same)* +Output .hdf files +----------------- -The output object ``---.{dat,bak,dir}`` has the following structure: +Output hdf files for ``geds`` have the following dictionary structure, where ```` is the name of one of the inspected parameters, ```` is the event type, e.g. *IsPulser* or *IsBsln*: -.. code-block:: +- ``__info`` = some useful info +- ``_`` = absolute values +- ``__mean`` = average over the first 10% of data (within the selected time window) of ``_`` +- ``__var`` = percentage variation of ```` with respect to ``__mean`` +- ``__pulser01anaRatio`` = ratio of absolute values ``_`` to PULS01ANA absolute values +- ``__pulser01anaRatio_mean`` = average over the first 10% of data (within the selected time window) of ``__pulser01anaRatio`` +- ``__pulser01anaRatio_var`` = percentage variation of ``__pulser01anaRatio`` with respect to ``__pulser01anaRatio_mean`` +- ``__pulser01anaDiff`` = difference between absolute values ``_`` and PULS01ANA absolute values +- ``__pulser01anaDiff_mean`` = average over the first 10% of data (within the selected time window) of ``__pulser01anaDiff`` +- ``__pulser01anaDiff_var`` = percentage variation of ``__pulser01anaDiff`` with respect to ``__pulser01anaDiff_mean`` - --- - └── monitoring - ├── pulser // event type - │ └── cuspEmax_ctc_cal // parameter - │ ├── 4 // this is the channel FC id - │ │ ├── values // these are y plot-values shown - │ │ │ ├── all // every timestamp entry - │ │ │ └── resampled // after the resampling - │ │ ├── timestamp // these are plot-x values shown - │ │ │ ├── all - │ │ │ └── resampled - │ │ ├── mean // mean over the first 10% of data within the range inspected by the user - │ │ └── plot_info // some useful plot-info: ['title', 'subsystem', 'locname', 'unit', 'plot_style', 'parameter', 'label', 'unit_label', 'time_window', 'limits'] - │ ├── ...other channels... - │ ├── df_geds // dataframe containing all geds channels for a given parameter - │ ├──
// Figure object - │ └── map_geds // geds status map (if present) - ├─all - │ └── baseline - │ ├── ...channels data/info... - │ └── ...other summary objects (df/status map/figures)... - │ └── wf_max - │ └── ... - └──phy - └── ... - -One way to open it and inspect the saved objects for a given channel, eg. ID='4', is to do - -.. code-block:: python - - import shelve - - with shelve.open("---") as file: - # get y values - all_data_ch4 = file['monitoring']['pulser']['baseline']['4']['values']['all'] - resampled_data_ch4 = file['monitoring']['pulser']['baseline']['4']['values']['resampled'] - # get info for plotting data - plot_info_ch4 = file['monitoring']['pulser']['baseline']['4']['plot_info'] - -To get the corresponding dataframe (containing all channels with map/status info and loaded parameters), you can use - -.. code-block:: python - - import shelve - - with shelve.open("---") as file: - df_geds = file['monitoring']['pulser']['baseline']['df_geds'].data - -To open the saved figure for a given parameter, one way to do it is through - -.. code-block:: python - - import io - from PIL import Image - with io.BytesIO(shelf['monitoring']['pulser']['baseline']['
']) as obj: - # create a PIL Image object from the bytes - pil_image = Image.open(obj) - # convert the image to RGB color space (to enable PDF saving) - pil_image = pil_image.convert('RGB') - # save image to disk - pil_image.save('figure.pdf', bbox_inches="tight") - -.. important:: - -The key name ``
`` changes depending on the used ``plot_style`` for producing that plot. In particular, - -- if you use ``"plot_style": "per channel"``, then ``
= figure_plot_string_``, where ``string_no`` is the number of one of the available strings; -- if you use ``"plot_style": "per cc4"`` or ``"per string"`` or ``"array"``, then ``
= figure_plot``; -- if you use ``"plot_style": "per barrel"``, then ``
= figure_plot__``, where ```` is either "IB" or "OB, while ```` is either "top" or "bottom". -.. note:: - - There is no need to create one shelve object for each inspected subsystem. - Indeed, one way to separate among pulser, geds and spms is to look at channel IDs. - In any case, the subsystem info is saved under ``["monitoring"][][]["plot_info"]["subsystem"]``. Inspect plots ------------- -*Under construction* - -- Near future: `Dashboard `_ tool -- Future: notebook to interactively inspect plots (with buttons?) +- Some standard plots to monitor detectors' response can be found online on the `Dashboard `_ +- Some notebooks to interactively inspect plots can be found under the ``notebook`` folder From d4e594d902b1011b4c08d01d8822c3e575648a3e Mon Sep 17 00:00:00 2001 From: "pre-commit-ci[bot]" <66853113+pre-commit-ci[bot]@users.noreply.github.com> Date: Mon, 11 Mar 2024 11:07:22 +0000 Subject: [PATCH 3/3] style: pre-commit fixes --- docs/source/manuals/avail_pars.rst | 4 ++-- docs/source/manuals/get_sc_plots.rst | 3 +-- docs/source/manuals/inspect_plots.rst | 4 ++-- 3 files changed, 5 insertions(+), 6 deletions(-) diff --git a/docs/source/manuals/avail_pars.rst b/docs/source/manuals/avail_pars.rst index d69f71e..43ea7e6 100644 --- a/docs/source/manuals/avail_pars.rst +++ b/docs/source/manuals/avail_pars.rst @@ -46,9 +46,9 @@ Available parameters .. warning:: - It has been found out that no muon signals were being recorded in the auxiliary channel MUON01 for periods p08 and p09 (up to r003 included). + It has been found out that no muon signals were being recorded in the auxiliary channel MUON01 for periods p08 and p09 (up to r003 included). This means the present code is not able to flag the germanium events for which there was a muon crossing the experiment. - In other words, the dataframe associated to the ``muon`` events here will be empty. + In other words, the dataframe associated to the ``muon`` events here will be empty. 
Moreover, if you select ``phy`` entries, these will still contain muons since the muon cut does not work. diff --git a/docs/source/manuals/get_sc_plots.rst b/docs/source/manuals/get_sc_plots.rst index 174802a..a2de1d5 100644 --- a/docs/source/manuals/get_sc_plots.rst +++ b/docs/source/manuals/get_sc_plots.rst @@ -56,7 +56,7 @@ An example of a config.json file is the following: The meaning of each entry is explained below: * ``output``: folder where to store output files; -* ``dataset``: +* ``dataset``: * ``experiment``: either *L60* (to be checked) or *L200* * ``period``: period to inspect @@ -97,4 +97,3 @@ We store SC data in the following way: - if ``{'timestamps': ['20230207T103123Z', '20230207T141123Z', '20230207T083323Z']}`` (multiple keys), then = ``20230207T083323Z_20230207T141123Z`` (min/max timestamp interval) - if ``{'runs': 1}`` (one run), then = ``r001``; - if ``{'runs': [1, 2, 3]}`` (multiple runs), then = ``r001_r002_r003``. - diff --git a/docs/source/manuals/inspect_plots.rst b/docs/source/manuals/inspect_plots.rst index b770e3e..68592fe 100644 --- a/docs/source/manuals/inspect_plots.rst +++ b/docs/source/manuals/inspect_plots.rst @@ -11,7 +11,7 @@ the last two files are created for each inspected subsystem (pulser, geds, spms) .. warning:: Shelve files are produced as an output as well, this was the first format chosen for the output. - The code still has to be fixed to remove these files from routines. + The code still has to be fixed to remove these files from routines. At the moment, they are important when using the ``"saving": "append"`` option, so do not remove them if you are going to use it! 
Files are usually collected in the output folder specified in the ``output`` config entry: @@ -81,5 +81,5 @@ Output hdf files for ``geds`` have the following dictionary structure, where ``< Inspect plots ------------- -- Some standard plots to monitor detectors' response can be found online on the `Dashboard `_ +- Some standard plots to monitor detectors' response can be found online on the `Dashboard `_ - Some notebooks to interactively inspect plots can be found under the ``notebook`` folder
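
Note on the naming rules quoted in the ``get_sc_plots.rst`` hunks: the mapping from a ``runs`` or ``timestamps`` selection to the folder/file label can be sketched in a few lines of Python. This is only an illustration of the three cases spelled out in the patch, not the package's actual code; the function name ``time_selection_label`` is hypothetical, and the single-key and ``start``/``end`` cases are left out because their labels are not shown here.

```python
def time_selection_label(time_selection: dict) -> str:
    """Return the label for a 'runs' or 'timestamps' selection (hypothetical helper)."""
    if "runs" in time_selection:
        runs = time_selection["runs"]
        if isinstance(runs, int):
            runs = [runs]  # {'runs': 1} is treated like a one-element list
        # zero-pad each run number to three digits: 1 -> r001
        return "_".join(f"r{run:03d}" for run in runs)

    if "timestamps" in time_selection:
        stamps = time_selection["timestamps"]
        if not isinstance(stamps, list) or len(stamps) < 2:
            # only the multiple-keys case is documented in this patch
            raise NotImplementedError("only lists of several keys are handled")
        # fixed-width keys such as 20230207T083323Z sort chronologically as strings
        return f"{min(stamps)}_{max(stamps)}"

    raise ValueError("expected a 'runs' or 'timestamps' entry")


print(time_selection_label({"runs": [1, 2, 3]}))
```

The timestamp branch relies on all keys sharing the same fixed-width format, so plain string comparison is enough to find the earliest and latest key.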
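
Note on the ``_mean`` and ``_var`` entries listed in the ``inspect_plots.rst`` hunk: the sketch below shows one way to read "average over the first 10% of data" and "% variations" with respect to that average. It is an assumption-laden illustration, not the package's implementation: the helper names ``first_fraction_mean`` and ``percent_variation`` are hypothetical, the first 10% is taken by number of samples (the real code may select by time instead), and the variation is assumed to be ``100 * (value - mean) / mean``.

```python
def first_fraction_mean(values, fraction=0.10):
    """Average of the leading `fraction` of the samples (at least one sample)."""
    if not values:
        raise ValueError("no values to average")
    n_head = max(1, int(len(values) * fraction))
    head = values[:n_head]
    return sum(head) / len(head)


def percent_variation(values, mean):
    """Percentage variation of each value with respect to `mean` (assumed formula)."""
    return [100.0 * (value - mean) / mean for value in values]


# 20 made-up samples: the first 10% is the first two entries
samples = [10.0, 10.0, 11.0, 9.0, 10.0, 12.0] + [10.0] * 14
mean = first_fraction_mean(samples)
variations = percent_variation(samples, mean)
print(mean, variations[5])
```

The same two steps would apply unchanged to the ratio and difference series, since their ``_mean`` and ``_var`` entries are described in the same terms.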