
Sam_guide.md

Lynn Garren edited this page Jun 12, 2023 · 1 revision

Sam guide

The art framework is capable of generating sam metadata, which it stores internally in artroot format files. Internal sam metadata can be extracted into human-readable format using the helper program sam_metadata_dumper:

    sam_metadata_dumper myfile.root

Generating sam metadata in an art program does not require interacting with the samweb server. Rather, generation of sam metadata is controlled by art built-in and user-written services, modules and plugins, with corresponding fcl configurations.

Sam metadata can be classified as either built-in or experiment-specific (the latter are also called parameters). It can be further classified as either per-job (the same for every output file) or per-file (different for different output files). In this article, "per-job" sam metadata means metadata that is known in advance: it can be fully specified as fcl parameters and does not need to be captured or generated inside the program. The two classifications (built-in vs. experiment-specific, and per-job vs. per-file) are orthogonal, so there are four categories of sam metadata and four methods of generating them in art programs. All built-in sam metadata can be generated using art-provided services and modules, requiring only that the user supply the appropriate fcl configuration. Generating experiment-specific sam metadata generally requires some experiment-supplied C++ code.

Art provides a built-in service called FileCatalogMetadata. This service should be configured in any art program whose output files are meant to contain sam metadata. The FileCatalogMetadata service allows all built-in per-job sam metadata to be specified as part of its fcl configuration. Here is a typical configuration showing some commonly used fcl parameters.

    services.FileCatalogMetadata: { applicationFamily:  "art" 
                                    applicationVersion: "development" 
                                    fileType:           "mc" 
                                    group:              "uboone" 
                                    runType:            "physics" 
                                  }

Built-in per-file sam metadata (RootOutput module)

The art-provided module RootOutput is capable of generating all built-in per-file sam metadata (including file parentage, run, and event information). Here is a typical RootOutput fcl configuration, including sam metadata:

    outputs.out1: { module_type: RootOutput
                    fileName:    "output.root" 
                    dataTier:    "simulated" 
                    streamName:  "all"
                  }

The RootOutput fcl parameters related to sam metadata are dataTier and streamName. Any program that generates sam metadata should specify dataTier for each configured RootOutput module. It is important to keep the dataTier name specified here the same as the datatier option in the xml stage configuration. The streamName fcl parameter is optional. If not specified, streamName defaults to the module label (out1 in the above example).
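As an illustration of the streamName default, here is a sketch of a job with two RootOutput modules. The module labels, file names, and the dataTier value "reconstructed" are assumptions chosen for the example; use the values that match the xml stage configuration.

    # First output: streamName is given explicitly.
    outputs.out1: { module_type: RootOutput
                    fileName:    "output_all.root"
                    dataTier:    "reconstructed"
                    streamName:  "all"
                  }

    # Second output: streamName is omitted, so it defaults to the module label, "out2".
    outputs.out2: { module_type: RootOutput
                    fileName:    "output_out2.root"
                    dataTier:    "reconstructed"
                  }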

Experiment-specific sam metadata can be passed to the FileCatalogMetadata service by calling the addMetadata method of that service. These calls can occur anywhere in experiment-provided C++ code. A simple and configurable way of doing this is to write an experiment-specific service that makes the calls (a service calling another service). The experiment-specific service should be configured in the services.user section of the job configuration fcl file.
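A configuration for such a service might look like the following sketch. The service name MyExperimentMetadata and all of its parameters are hypothetical, invented only for illustration; the real names are whatever the experiment-provided service defines. The idea is that the service would read each parameter and pass it on to addMetadata of the FileCatalogMetadata service.

    # Hypothetical experiment-specific service (name and parameters are assumptions).
    services.user.MyExperimentMetadata: { ProjectName:    "my_project"
                                          ProjectStage:   "reco"
                                          ProjectVersion: "v1_0_0"
                                        }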

The project_utilities Python module provides an experiment-configuration hook function called get_sam_metadata for generating the fcl configuration for the experiment-specific sam metadata service.

To be filled.
