Slurm broken #196

Open
MilesCranmer opened this issue Feb 10, 2024 · 7 comments

@MilesCranmer
Collaborator

MilesCranmer commented Feb 10, 2024

Is anybody maintaining this package? I haven't been able to get Slurm working for the past month or so... It just ends up stalling on connecting to worker 1 out of <N>:

julia> p = addprocs_slurm(2)
connecting to worker 1 out of <N>

The exact same code seemed to work a month ago. This is slurm 22.05.8. Not sure if this new version is breaking things or not.

@Moelf
Collaborator

Moelf commented Feb 11, 2024

I don't have access to a Slurm cluster right now, but it would be useful to know whether a previous version worked.

@MilesCranmer
Collaborator Author

Not sure I know how to test other versions of slurm... I am stuck with whatever my institute cluster has installed

@Moelf
Collaborator

Moelf commented Feb 11, 2024

I guess in this case, ask the HPC admin to see if they know of anything that might be causing the problem.

@kescobo
Collaborator

kescobo commented Feb 11, 2024

Ugh, @MilesCranmer, that's annoying. I also don't currently have access to a SLURM cluster... this is the kind of thing where it would be nice to have #105, which could test on different schedulers 🤦

Would definitely be worth checking in with the cluster admin to see if SLURM was recently updated so we can at least know that that's the culprit.

@MilesCranmer
Collaborator Author

We could if this PR gets finished #193

@cnrrobertson
Contributor

@kescobo I can confirm that this issue started for me after a Slurm upgrade on my institution's cluster. Unfortunately, I don't know what the previous version was, but the current version is 23.11.1.

@kescobo
Collaborator

kescobo commented Mar 5, 2024

From the initial post, it looks like it goes back to v22:

> This is slurm 22.05.8

Does anyone know if SLURM follows SemVer?
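
A minimal sketch for comparing the two versions reported in this thread. It rests on an assumption worth checking against the Slurm release notes: that Slurm versions are date-based (`year.month.patch`, with the first two fields naming the major release) rather than SemVer. `parse_slurm_version` and `same_major_release` are hypothetical helpers written for illustration; they are not part of ClusterManagers.jl or Slurm.

```julia
# Hypothetical helper (not part of ClusterManagers.jl).
# Extracts the first "X.Y.Z" triple from a string such as "slurm 22.05.8".
function parse_slurm_version(s::AbstractString)
    m = match(r"(\d+)\.(\d+)\.(\d+)", s)
    m === nothing && error("could not find a Slurm version in: $s")
    return VersionNumber(parse(Int, m[1]), parse(Int, m[2]), parse(Int, m[3]))
end

# Assumption: the first two fields (year.month) identify the major release,
# so only the third field is a patch-level change.
same_major_release(a::VersionNumber, b::VersionNumber) =
    (a.major, a.minor) == (b.major, b.minor)

reported = parse_slurm_version("slurm 22.05.8")  # from the initial post
later    = parse_slurm_version("23.11.1")        # from the later report

println("same major release: ", same_major_release(reported, later))
println("reported is older:  ", reported < later)
```

If that assumption holds, 22.05.8 and 23.11.1 belong to different major releases, so the two reports do not by themselves pin down which release introduced the stall.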
