Sanity check bot configuration #289

Open
ocaisa opened this issue Nov 29, 2024 · 1 comment

Comments

@ocaisa
Member

ocaisa commented Nov 29, 2024

It would be great to add something that can sanity check the bot configuration. For example, when creating a cluster with Magic Castle, it would be quite easy to attach the wrong instance type to a partition name, and the mistake would be hard to notice: the Zen2 nodes are called Standard_HB120-16rs_v2 and the Zen4 nodes Standard_HB176-24rs_v4, so the names give no obvious hint.

One way to verify the architecture is to run the EESSI archdetect script on each partition:

[ocaisa@login1 ~]$ srun -n 1 -p x86-64-amd-zen2-node /cvmfs/software.eessi.io/versions/2023.06/init/eessi_archdetect.sh cpupath
x86_64/amd/zen2
[ocaisa@login1 ~]$ srun -n 1 -p x86-64-amd-zen4-node /cvmfs/software.eessi.io/versions/2023.06/init/eessi_archdetect.sh cpupath
x86_64/amd/zen4
[ocaisa@login1 ~]$ srun -n 1 -p aarch64-neoverse-n1-node /cvmfs/software.eessi.io/versions/2023.06/init/eessi_archdetect.sh cpupath
aarch64/neoverse_n1

This would also check that the bot can successfully submit jobs.
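
The manual check above could be automated along these lines. This is only a rough sketch, not existing bot code: the function names `detect_cpupath` and `find_mismatches` are made up for illustration, and the partition names and expected paths are simply the ones from the transcript above. A failed `srun` is reported as a mismatch, so the same run also shows whether jobs can be submitted at all.

```python
import subprocess
from typing import Callable, Dict, List

# Path taken from the transcript above.
ARCHDETECT = "/cvmfs/software.eessi.io/versions/2023.06/init/eessi_archdetect.sh"


def detect_cpupath(partition: str) -> str:
    """Run archdetect on one node of the given Slurm partition."""
    result = subprocess.run(
        ["srun", "-n", "1", "-p", partition, ARCHDETECT, "cpupath"],
        capture_output=True,
        text=True,
        check=True,   # raise if srun or the script exits non-zero
        timeout=300,  # arbitrary limit so a stuck job does not hang the check
    )
    return result.stdout.strip()


def find_mismatches(
    expected: Dict[str, str],
    detect: Callable[[str], str] = detect_cpupath,
) -> List[str]:
    """Compare the detected CPU path of each partition with the expected one.

    `expected` maps partition name -> expected archdetect output.
    Returns one human-readable message per problem; an empty list means OK.
    """
    mismatches = []
    for partition, want in expected.items():
        try:
            got = detect(partition)
        except (subprocess.SubprocessError, OSError) as err:
            # Job submission itself failed (srun missing, timeout, bad partition).
            got = f"<job failed: {err}>"
        if got != want:
            mismatches.append(f"{partition}: expected {want}, got {got}")
    return mismatches
```

Calling `find_mismatches({"x86-64-amd-zen2-node": "x86_64/amd/zen2", "x86-64-amd-zen4-node": "x86_64/amd/zen4"})` on the login node would then return an empty list when every partition is backed by the intended instance type. The `detect` parameter exists only so the comparison logic can be exercised without a Slurm cluster.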

You'd also want to verify that the bot can talk to the target repository with the correct permissions.

@ocaisa
Member Author

ocaisa commented Nov 29, 2024

I can see how that would be a bit tricky though, given how arch_target_map is defined. Perhaps a sanity check command could be part of that definition? Or maybe bot admins could define a sanity check script that, when run via srun with the options from an arch_target_map dict value, should return the corresponding key.
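
To make the second idea concrete, here is a minimal sketch. It assumes that arch_target_map is a dict whose keys are architecture identifiers and whose values are strings of Slurm options, and that the admin-supplied script prints exactly the key it expects to be running on. Both function names and the script path in the usage note are hypothetical.

```python
import shlex
import subprocess
from typing import Callable, Dict, Optional


def run_check(slurm_options: str, script: str) -> str:
    """Run the sanity check script via srun with the given option string."""
    cmd = ["srun", "-n", "1", *shlex.split(slurm_options), script]
    done = subprocess.run(
        cmd, capture_output=True, text=True, check=True, timeout=300
    )
    return done.stdout.strip()


def check_arch_target_map(
    arch_target_map: Dict[str, str],
    script: str,
    run: Callable[[str, str], str] = run_check,
) -> Dict[str, Optional[str]]:
    """Return the entries whose Slurm options do not yield their own key.

    The result maps each failing key to what the script reported,
    or to None if the job could not be run at all.
    """
    bad: Dict[str, Optional[str]] = {}
    for key, options in arch_target_map.items():
        try:
            reported: Optional[str] = run(options, script)
        except (subprocess.SubprocessError, OSError):
            reported = None
        if reported != key:
            bad[key] = reported
    return bad
```

With this shape, the map itself is the table of expected answers, so no second list of expectations has to be maintained. A call such as `check_arch_target_map(arch_target_map, "/path/to/sanity_check.sh")` (path hypothetical) returning an empty dict would mean every entry resolves to the architecture its key claims.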
