Testing for v4.2.0

Pawel Plesniak edited this page Aug 8, 2024 · 16 revisions

Setup

You need to set up a recent nightly and install drunc, preferably at EHN1. Steps for doing this are here.

Configuration generation

There are then two options:

  • run with a DAQConf configuration (see here), or
  • run with an OKS configuration (see here).

DAQConf configuration

You need to create a configuration compatible with drunc.

To do this, you will need to run fddaqconf_gen.

Create a file called daqconf.json which contains the following:

{
    "boot": {
        "use_connectivity_service": true,
        "start_connectivity_service": false,
        "connectivity_service_host": "np04-srv-023",
        "connectivity_service_port": 30005,
        "run_control": "drunc",
        "controller_host": "localhost",
        "ers_impl": "cern",
        "opmon_impl": "cern"
    },
    "daq_common": {
        "data_rate_slowdown_factor": 1
    },
    "detector": {
        "clock_speed_hz": 62500000
    },
    "readout": {
        "use_fake_cards": true,
        "default_data_file": "asset://?label=WIBEth&subsystem=readout"
    },
    "trigger": {
        "trigger_window_before_ticks": 1000,
        "trigger_window_after_ticks": 1000
    },
    "hsi": {
        "random_trigger_rate_hz": 1.0
    }
}

Note the two important parameters:

  • run_control
  • controller_host.
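Before generating the configuration, you may want a quick sanity check on these two parameters. The snippet below is a minimal sketch, assuming that "compatible with drunc" means run_control is set to "drunc" (as in the example above) and controller_host is non-empty. The helper name check_drunc_boot is made up for this page and is not part of any DUNE-DAQ tool.

```python
import json


def check_drunc_boot(conf: dict) -> list:
    """Return a list of problems found in the "boot" section (empty if none)."""
    boot = conf.get("boot", {})
    problems = []
    # Assumption: drunc compatibility means run_control == "drunc".
    if boot.get("run_control") != "drunc":
        problems.append('boot.run_control should be "drunc"')
    # Assumption: any non-empty host name is acceptable here.
    if not boot.get("controller_host"):
        problems.append("boot.controller_host is missing or empty")
    return problems


# In-memory sample mirroring the two parameters from daqconf.json above.
sample = json.loads(
    '{"boot": {"run_control": "drunc", "controller_host": "localhost"}}'
)
print(check_drunc_boot(sample))
```

To check a real file, load it with json.load(open("daqconf.json")) and pass the result to check_drunc_boot.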

Create a dro.json file containing:

[
    {
        "src_id": 100,
        "geo_id": {
            "det_id": 3,
            "crate_id": 1,
            "slot_id": 0,
            "stream_id": 0
        },
        "kind": "eth",
        "parameters": {
            "protocol": "udp",
            "mode": "fix_rate",
            "rx_iface": 0,
            "rx_host": "localhost",
            "rx_mac": "00:00:00:00:00:00",
            "rx_ip": "0.0.0.0",
            "tx_host": "localhost",
            "tx_mac": "00:00:00:00:00:00",
            "tx_ip": "0.0.0.0"
        }
    },
    {
        "src_id": 101,
        "geo_id": {
            "det_id": 3,
            "crate_id": 1,
            "slot_id": 0,
            "stream_id": 1
        },
        "kind": "eth",
        "parameters": {
            "protocol": "udp",
            "mode": "fix_rate",
            "rx_iface": 0,
            "rx_host": "localhost",
            "rx_mac": "00:00:00:00:00:00",
            "rx_ip": "0.0.0.0",
            "tx_host": "localhost",
            "tx_mac": "00:00:00:00:00:00",
            "tx_ip": "0.0.0.0"
        }
    }
]
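The two entries above differ only in src_id and stream_id, so the file can also be generated rather than typed by hand. This is a sketch only: the helper names make_dro_entry and make_dro_map are hypothetical, and all field values are copied from the example above. With n_streams=2 it reproduces the file shown; whether fddaqconf_gen accepts more streams following the same pattern has not been checked here.

```python
import json


def make_dro_entry(src_id: int, stream_id: int) -> dict:
    """Build one readout map entry with the same fixed fields as the example."""
    return {
        "src_id": src_id,
        "geo_id": {"det_id": 3, "crate_id": 1, "slot_id": 0, "stream_id": stream_id},
        "kind": "eth",
        "parameters": {
            "protocol": "udp",
            "mode": "fix_rate",
            "rx_iface": 0,
            "rx_host": "localhost",
            "rx_mac": "00:00:00:00:00:00",
            "rx_ip": "0.0.0.0",
            "tx_host": "localhost",
            "tx_mac": "00:00:00:00:00:00",
            "tx_ip": "0.0.0.0",
        },
    }


def make_dro_map(n_streams: int, first_src_id: int = 100) -> list:
    """Number src_id from first_src_id and stream_id from 0, as in the example."""
    return [make_dro_entry(first_src_id + i, i) for i in range(n_streams)]


dro_map = make_dro_map(2)
print(json.dumps(dro_map, indent=4))
```

To write the file, replace the print with json.dump(dro_map, open("dro.json", "w"), indent=4).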

You can now run:

fddaqconf_gen -c ./daqconf.json -m ./dro.json drunc_config

OKS configuration

First generate a daqconf configuration as described above, then convert the generated boot.json to an OKS Session with:

generate_oks_session drunc_config/boot.json

By default this will generate a Session called generated-session and save it to a file called generatedSession.data.xml. These defaults can be overridden with -s session-name and -d data-file respectively. By convention, OKS data files end in .data.xml while OKS schema files end in .schema.xml.
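Since the session name and data file chosen here must be reused later when booting from the process manager shell, it can help to derive both command lines from the same values. The sketch below only builds the strings and does not run anything. The function name oks_commands is hypothetical, and placing -s and -d before the boot.json argument is an assumption about the argument order.

```python
import shlex

# Defaults quoted from the text above.
DEFAULT_SESSION = "generated-session"
DEFAULT_DATA_FILE = "generatedSession.data.xml"


def oks_commands(boot_json, session=DEFAULT_SESSION, data_file=DEFAULT_DATA_FILE):
    """Return (generate command, process manager boot command) with matching names."""
    generate = ["generate_oks_session"]
    # Only pass the overrides when they differ from the defaults.
    if session != DEFAULT_SESSION:
        generate += ["-s", session]
    if data_file != DEFAULT_DATA_FILE:
        generate += ["-d", data_file]
    generate.append(boot_json)
    # The same data file and session name are reused in the boot command.
    boot = f"boot oks {data_file} {session}"
    return shlex.join(generate), boot


generate_cmd, boot_cmd = oks_commands(
    "drunc_config/boot.json", session="my-session", data_file="my-session.data.xml"
)
print(generate_cmd)
print(boot_cmd)
```

The first line is meant for your normal shell, the second for the pm > prompt described further down.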

Running

You will need at least 3 terminal windows open, with drunc set up in all three of them (i.e. sourcing the same env.sh).

Shell #1 Process Manager

You will need to generate a configuration for the process manager. Most likely, you will only ever need to do this once. The configuration is a single file, which you can create in your home directory. The rest of this page assumes it was created at ~/drunc-process-manager-conf.json.

The content of this file should be:

{
    "type": "ssh",
    "name": "SSHProcessManager",
    "command_address": "0.0.0.0:10054",

    "authoriser": {
        "type": "dummy"
    },

    "broadcaster": {
        "type": "kafka",
        "kafka_address": "monkafka.cern.ch:30092",
        "publish_timeout": 2
    }
}

A couple of interesting things here:

  • command_address is the address on which the process manager listens for commands. Someone may already be using that port, in which case you should change it.
  • broadcaster.kafka_address points to the Kafka instance. This should be changed if you are using pocket.
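To find out whether the port in command_address is already taken, you can try to bind to it yourself before starting the process manager. This is a minimal sketch using only the Python standard library; the helper names are made up for this page. It has to be run on the host where the process manager will run, and it cannot rule out someone grabbing the port between the check and the actual start.

```python
import socket


def split_address(command_address: str):
    """Split "host:port" into (host, port)."""
    host, _, port = command_address.rpartition(":")
    return host, int(port)


def port_is_free(host: str, port: int) -> bool:
    """Return True if a TCP socket can currently be bound to host:port."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as s:
        try:
            s.bind((host, port))
        except OSError:
            # Typically "address already in use".
            return False
    return True


host, port = split_address("0.0.0.0:10054")
print(f"{host}:{port} free: {port_is_free(host, port)}")
```

If the check reports the port as taken, pick another port number in command_address and use that same port when connecting from the process manager shell.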

Now you can run the process manager, by simply doing:

drunc-process-manager ~/drunc-process-manager-conf.json
<snip>
ProcessManager was started on 0.0.0.0:10054

This means that the process manager has started and expects commands on that host/port. To quit it, press ctrl-c (do not do that now, we want to keep the process manager running!).

Shell #2 Process Manager CLI

This shell will be the process manager CLI. For this example, we will use the configuration generated in the previous step to start DAQ applications and a controller (DAQConf or OKS).

Let us start the process manager CLI, which will connect to the process manager in shell #1. Assuming shell #1 is on np04-srv-019, this is how you would run it:

drunc-process-manager-shell np04-srv-019:10054
pm > boot daqconf drunc_config session-name

# OR if you have generated an OKS configuration:
pm > boot oks generatedSession.data.xml generated-session # Note that oks-session-name has to be the SAME as the one provided to generate_oks_session
pm > ps
                                            Processes running
┏━━━━━━━━━━━━━━┳━━━━━━━━━━┳━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━┳━━━━━━━━━━━┓
┃ session      ┃ user     ┃ friendly name   ┃ uuid                                 ┃ alive ┃ exit-code ┃
┡━━━━━━━━━━━━━━╇━━━━━━━━━━╇━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━╇━━━━━━━━━━━┩
│ session-name │ plasorak │ dataflow0       │ 15f32283-b920-4e0b-9746-ea146f2889ad │ True  │ 0         │
│ session-name │ plasorak │ dfo             │ 498bf1e6-78a5-42a1-a429-2b56c5e942ae │ True  │ 0         │
│ session-name │ plasorak │ fakehsi         │ d9a471da-acde-47ae-abe3-98d7f5c1abaa │ True  │ 0         │
│ session-name │ plasorak │ rulocalhosteth0 │ fc76d990-d32d-44ce-be98-f45c28e1c4cd │ True  │ 0         │
│ session-name │ plasorak │ trigger         │ b3082787-3039-4df9-bcf9-3d137bea8faf │ True  │ 0         │
│ session-name │ plasorak │ controller      │ 81901dc8-3417-4870-8ea4-d222fd9a9e12 │ True  │ 0         │
└──────────────┴──────────┴─────────────────┴──────────────────────────────────────┴───────┴───────────┘

We have started 6 processes: the 5 standard DAQ applications and a controller that will control them. The processes write their logs to, and use as their working directory, the ${PWD} from which you ran drunc-process-manager-shell.

To kill the processes, you can do

pm > kill --user plasorak
# or
pm > kill --session session-name
# or
pm > kill --name trigger

The shell has many more features; head over to the process manager CLI documentation to see how to interact with it.

Shell #3 Controller CLI

Assuming you are following these instructions, you now have a controller running. To connect to it, you will need to know the address of the controller. For that, you can either:

  • look at the logs from the controller in the process manager by doing:
pm > logs --name controller
  • read your drunc_config/boot.json and extract it from there.

You can use the drunc-controller-shell as follows:

drunc-controller-shell <controller_host>:<controller_port>
<snip>
INFO "ControllerShell": You are in control.
drunc-controller >

Let us send some commands to the controller then!

drunc-controller > describe
                                                        controller.session-name (controller) commands
┏━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┓
┃ name                ┃ input type                 ┃ return type                       ┃ help                                                                ┃
┡━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┩
│ describe            │ None                       │ request_response_pb2.Description  │ Describe self (return a list of commands, the type of endpoint, the │
│                     │                            │                                   │ name and session).                                                  │
│ get_children_status │ generic_pb2.PlainText,None │ controller_pb2.ChildrenStatus     │ Get the status of all the children. Only get the status from the    │
│                     │                            │                                   │ child if provided in the request.                                   │
│ get_status          │ None                       │ controller_pb2.Status             │ Get the status of self                                              │
│ ls                  │ None                       │ generic_pb2.PlainTextVector       │ List the children                                                   │
│ describe_fsm        │ None                       │ request_response_pb2.Description  │ List available FSM commands for the current state.                  │
│ execute_fsm_command │ controller_pb2.FSMCommand  │ controller_pb2.FSMCommandResponse │ Execute an FSM command                                              │
│ include             │ None                       │ controller_pb2.FSMCommandResponse │ Include self in the current session, if a children is provided,     │
│                     │                            │                                   │ include it and its eventual children                                │
│ exclude             │ None                       │ controller_pb2.FSMCommandResponse │ Exclude self in the current session, if a children is provided,     │
│                     │                            │                                   │ exclude it and its eventual children                                │
│ take_control        │ None                       │ generic_pb2.PlainText             │ Take control of self and children                                   │
│ surrender_control   │ None                       │ generic_pb2.PlainText             │ Surrender control of self and children                              │
│ who_is_in_charge    │ None                       │ generic_pb2.PlainText             │ Get who is in control of self                                       │
└─────────────────────┴────────────────────────────┴───────────────────────────────────┴─────────────────────────────────────────────────────────────────────┘
drunc-controller > ls
['dataflow0', 'dfo', 'fakehsi', 'rulocalhosteth0', 'trigger']
drunc-controller > describe --command fsm
                           controller.session-name (controller) commands
┏━━━━━━┳━━━━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━┳━━━━━━━━━━━━━━━━━━━┓
┃ name ┃ input type                ┃ return type                       ┃ help ┃ Command arguments ┃
┡━━━━━━╇━━━━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━╇━━━━━━━━━━━━━━━━━━━┩
│ conf │ controller_pb2.FSMCommand │ controller_pb2.FSMCommandResponse │      │                   │
└──────┴───────────────────────────┴───────────────────────────────────┴──────┴───────────────────┘
drunc-controller > fsm conf
<snip>
          conf execution report
┏━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━┓
┃ Name                ┃ Command success ┃
┡━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━┩
│ <root>              │ Yes             │
│ ├── dataflow0       │ Yes             │
│ ├── fakehsi         │ Yes             │
│ ├── trigger         │ Yes             │
│ ├── rulocalhosteth0 │ Yes             │
│ └── dfo             │ Yes             │
└─────────────────────┴─────────────────┘
drunc-controller > describe --command fsm
                                                controller.session-name (controller) commands
┏━━━━━━━┳━━━━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━┳━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┓
┃ name  ┃ input type                ┃ return type                       ┃ help ┃ Command arguments                                          ┃
┡━━━━━━━╇━━━━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━╇━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┩
│ start │ controller_pb2.FSMCommand │ controller_pb2.FSMCommandResponse │      │ run_number (INT OPTIONAL) default: 1 help:                 │
│       │                           │                                   │      │ disable_data_storage (BOOL OPTIONAL) default: False help:  │
│       │                           │                                   │      │ trigger_rate (FLOAT OPTIONAL) default: 1.0 help:           │
│ scrap │ controller_pb2.FSMCommand │ controller_pb2.FSMCommandResponse │      │                                                            │
└───────┴───────────────────────────┴───────────────────────────────────┴──────┴────────────────────────────────────────────────────────────┘
drunc-controller > fsm start run_number 55

We have done:

  • INFO "ControllerShell": You are in control.: made sure we are able to control the controller (someone else could be in control, in which case the FSM commands will not work)
  • describe: asked the controller to describe itself (and rendered the information in a table)
  • ls: listed the children (in this case all DAQ applications)
  • describe --command fsm: asked the controller which FSM command it is expecting and what arguments
  • fsm conf: configured the DAQ applications
  • fsm start run_number 55: started run 55.

Let's take the DAQ for a spin:

drunc-controller > fsm enable_triggers
[...we wait for a bit of time, to get a file...]
drunc-controller > fsm disable_triggers
drunc-controller > fsm drain_dataflow
drunc-controller > fsm stop_trigger_sources
drunc-controller > fsm stop
drunc-controller > fsm scrap

You should see a file appearing with events inside.

Several things to note:

  • There is a profusion of logging. It comes from the controller's asynchronous logging. If someone else tries to connect or to execute a command at the same time, you will see it appear in your shell too (you can try it yourself in a different shell). Unfortunately, CLIs do not lend themselves well to this, and a better UI is needed to render the logs properly without distracting.
  • The describe command describes the endpoint, NOT the shell. This means that some commands are not directly available in the shell (for example get_children_status). However, if you run status in the shell, you will get the children's statuses, because the shell calls get_children_status under the hood.
  • The sequence commands (start_run, shutdown etc.) are not available (yet). This means if you start the run, you will need to manually drain_dataflow, stop_trigger_sources and stop.
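Until the sequence commands exist, it can be handy to keep the individual FSM commands in order somewhere. The sketch below simply lists the commands used on this page, split into a start part and a stop part, so that you can paste them one by one into the controller shell. It does not talk to the controller, and the names start_commands and STOP_COMMANDS are made up for this example.

```python
def start_commands(run_number: int) -> list:
    """Commands used above to go from a freshly booted session to taking data."""
    return [
        "fsm conf",
        f"fsm start run_number {run_number}",
        "fsm enable_triggers",
    ]


# Commands used above to end the run, in the order they were issued.
STOP_COMMANDS = [
    "fsm disable_triggers",
    "fsm drain_dataflow",
    "fsm stop_trigger_sources",
    "fsm stop",
    "fsm scrap",
]

print("\n".join(start_commands(55)))
print("... wait for some data ...")
print("\n".join(STOP_COMMANDS))
```

Each line is meant to be entered at the drunc-controller > prompt, waiting for the execution report of one command before sending the next.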

You can now head to the controller CLI documentation for more information.

Shutting down

You can head to shell #1 and press ctrl-c, which should kill every process the process manager manages. If you want to be less brutal, head to shell #2 and kill your session first; you can then check that no process is still running with ps.
