Sanity check bot configuration #289

ocaisa · 2024-11-29T09:44:22Z

It would be great to add something that can sanity-check the bot configuration. For example, when creating a cluster with Magic Castle, it is easy to attach the wrong instance type to a partition name, and the mistake is hard to notice: the Zen2 nodes are called Standard_HB120-16rs_v2 and the Zen4 nodes Standard_HB176-24rs_v4, so the names give no obvious hint.

One way to verify the architecture is to run archdetect on a node of each partition:

[ocaisa@login1 ~]$ srun -n 1 -p x86-64-amd-zen2-node /cvmfs/software.eessi.io/versions/2023.06/init/eessi_archdetect.sh cpupath
x86_64/amd/zen2
[ocaisa@login1 ~]$ srun -n 1 -p x86-64-amd-zen4-node /cvmfs/software.eessi.io/versions/2023.06/init/eessi_archdetect.sh cpupath
x86_64/amd/zen4
[ocaisa@login1 ~]$ srun -n 1 -p aarch64-neoverse-n1-node /cvmfs/software.eessi.io/versions/2023.06/init/eessi_archdetect.sh cpupath
aarch64/neoverse_n1

This would also check that the bot can successfully submit jobs.
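
The manual check above could be scripted roughly as follows. This is only a sketch, not something the bot provides: the function names and the `EXPECTED` table are hypothetical, the table is copied from the three partitions shown above, and the `run` parameter exists only so the check can be exercised without a Slurm cluster.

```python
import subprocess

# Path taken from the srun commands above.
ARCHDETECT = "/cvmfs/software.eessi.io/versions/2023.06/init/eessi_archdetect.sh"

# Hypothetical expectation table: partition name -> CPU path archdetect should report.
EXPECTED = {
    "x86-64-amd-zen2-node": "x86_64/amd/zen2",
    "x86-64-amd-zen4-node": "x86_64/amd/zen4",
    "aarch64-neoverse-n1-node": "aarch64/neoverse_n1",
}


def detect_cpupath(partition, run=subprocess.run):
    """Run archdetect on one node of `partition` and return the reported CPU path."""
    cmd = ["srun", "-n", "1", "-p", partition, ARCHDETECT, "cpupath"]
    # check=True raises if srun fails, so a broken job submission is also caught.
    result = run(cmd, capture_output=True, text=True, check=True)
    return result.stdout.strip()


def check_partitions(expected, run=subprocess.run):
    """Return {partition: (expected, detected)} for every partition that mismatches."""
    mismatches = {}
    for partition, want in expected.items():
        got = detect_cpupath(partition, run=run)
        if got != want:
            mismatches[partition] = (want, got)
    return mismatches
```

Calling `check_partitions(EXPECTED)` on the login node would return an empty dict when every partition reports the expected architecture, and a failing `srun` would surface as a `subprocess.CalledProcessError`.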

You'd also want to verify that the bot can talk to the target repository with the correct permissions.

ocaisa · 2024-11-29T09:52:28Z

I can see how that would be a bit tricky though, given how arch_target_map is defined. Perhaps a sanity check command could be part of that definition? Alternatively, users could define a sanity check script that, when run via srun with the options from an arch_target_map value, should print the corresponding key.
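
To make the second idea concrete, here is a rough sketch under a few assumptions: that arch_target_map is a dict whose values are strings of Slurm options that srun accepts, and that the user-supplied script prints exactly the matching key. The function name `check_arch_target_map` and the `check_script` argument are hypothetical, not part of the bot.

```python
import shlex
import subprocess


def check_arch_target_map(arch_target_map, check_script, run=subprocess.run):
    """Return the keys of arch_target_map whose Slurm options fail the check.

    For each entry, `check_script` is run via srun with the entry's options;
    the entry passes only if the script prints the entry's key.
    """
    failed = []
    for arch, slurm_opts in arch_target_map.items():
        # Assumption: the dict value is a plain option string, e.g. "-p some-partition".
        cmd = ["srun", "-n", "1", *shlex.split(slurm_opts), check_script]
        try:
            result = run(cmd, capture_output=True, text=True, check=True)
        except (subprocess.CalledProcessError, OSError):
            # The job could not be submitted or the script exited non-zero.
            failed.append(arch)
            continue
        if result.stdout.strip() != arch:
            failed.append(arch)
    return failed
```

An empty list would mean every entry maps to nodes of the architecture its key claims. Keeping the script user-defined sidesteps the question of how the keys relate to archdetect output.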
