
Dask JobQueue and TCP connections between login and compute nodes #354

Closed
orbitfold opened this issue Oct 16, 2019 · 15 comments
Labels: documentation (Documentation-related), question (Further information is requested)

Comments

@orbitfold

Hello all, sorry for the noob question. I am trying to understand why my software works on some clusters and not on others. Does Dask Jobqueue require a TCP connection between the login and compute nodes? That seems to be the difference between the working and non-working clusters right now. If it does require one and such connections are not allowed, is there a work-around?

@lesteve
Member

lesteve commented Oct 16, 2019

Does Dask Jobqueue require a TCP connection between login and compute nodes?

The Dask scheduler needs to connect over TCP to the Dask workers. The scheduler runs in the process where you create your FooCluster; in most cases people do that on the login node because it is the simplest thing to do.
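To make this concrete, here is a pseudocode-level sketch of that setup (it assumes dask-jobqueue on a SLURM system; the resource numbers are illustrative, not recommendations, and it is not runnable outside such a cluster):

```python
# Sketch only: requires dask-jobqueue and a SLURM system.
from dask_jobqueue import SLURMCluster
from dask.distributed import Client

# Creating the cluster object starts the Dask scheduler *in this process*,
# i.e. on whichever node runs this script -- typically the login node.
cluster = SLURMCluster(cores=4, memory="16GB", walltime="01:00:00")

# scale() submits batch jobs; each job starts workers that must be able to
# open a TCP connection back to the scheduler, i.e. back to this node.
cluster.scale(jobs=2)

client = Client(cluster)
```

The key point is that every worker job must be able to reach, over TCP, the node where the FooCluster object was created.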

A hacky work-around would be to start an interactive job and launch the same script / Jupyter notebook inside it. That may work.

Another, longer-term fix would be #186, which seems to have been created with your use case in mind (in particular #186 (comment)). I have to say, I don't think anyone is planning to work on this in the medium term.

Out of interest, do you have an idea why no TCP connections between login and compute nodes are allowed? I am still learning about all the possible configuration tweaks of HPC clusters; it feels like quite an endless task ...
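While waiting for an answer from the admins, you can test connectivity empirically. Here is a small stdlib-only probe (the hostname below is a placeholder; 8786 is Dask's default scheduler port):

```python
import socket


def tcp_open(host: str, port: int, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to host:port succeeds, False otherwise."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


# Example: from the login node, probe a compute node that has something
# listening (hostname is a placeholder for your site):
# print(tcp_open("compute-node-01", 8786))
```

To test in the other direction, start a listener on the login node (e.g. with `nc -l`) and run the probe from inside a job.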

@lesteve
Member

lesteve commented Oct 16, 2019

I have to say, I don't think anyone is planning to work on this in the medium-term.

As always a PR would be more than welcome!

@guillaumeeb
Member

A hacky work-around would be to start an interactive job

Not that hacky; on platforms with a lot of users it may be the best thing to do :)!

@lesteve
Member

lesteve commented Oct 16, 2019

Not that hacky; on platforms with a lot of users it may be the best thing to do :)!

I guess hacky was not the best word; I meant more "not as convenient". For example:

  • the ssh tunnel command, if you use one, is slightly more involved
  • you need to change your ssh tunnel command depending on which node your scheduler job ends up on
  • if your scheduler's interactive job goes over its walltime, you lose everything
  • on some clusters, I am guessing, interactive job walltime is quite limited; interactive jobs are meant for debugging, not for coordinating a long-running computation
  • probably other restrictions I may not know about

@lesteve
Member

lesteve commented Oct 17, 2019

@orbitfold do you know whether all the ports are blocked on your problematic cluster? The reason I am asking is that in #355 only a range of ports is blocked. I am trying to get a better feeling for the possible configurations on different clusters.
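If only a range of ports is blocked, one way forward is to pin the scheduler to a port you know is open. A small stdlib helper can pick a free port in an allowed range (the range below is a placeholder, and the `scheduler_options` keyword in the usage comment is an assumption that only holds on recent dask-jobqueue versions):

```python
import socket


def free_port_in_range(lo: int, hi: int) -> int:
    """Return the first TCP port in [lo, hi) that this host can bind."""
    for port in range(lo, hi):
        with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as s:
            try:
                s.bind(("", port))
                return port
            except OSError:
                continue  # port in use, try the next one
    raise RuntimeError(f"no free port in [{lo}, {hi})")


# Hypothetical usage with dask-jobqueue (assumes your version supports
# scheduler_options, and that ports 60000-60099 are open at your site):
# cluster = SLURMCluster(...,
#     scheduler_options={"port": free_port_in_range(60000, 60100)})
```

Note there is an inherent race: another process could grab the port between the check and the scheduler binding it, so a retry loop may be needed in practice.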

Maybe @guillaumeeb has some sysadmin perspective on how common TCP/IP restrictions between login and compute nodes are, and whether a partial or a total port restriction is more likely.

@orbitfold
Author

orbitfold commented Oct 17, 2019

@orbitfold do you know whether all the ports are blocked on your problematic cluster?

Apparently they don't want a pilot-job type situation, where someone submits a large job that then orchestrates the computations. They want each process to be submitted via the batch system. I guess they don't trust you to do your own load balancing. At least that is what we were told by an admin on one such system.

@orbitfold
Author

I have to say, I don't think anyone is planning to work on this in the medium-term.

As always a PR would be more than welcome!

I'm more than happy to contribute, but I'll have to piece together what needs to be done here.

@lesteve
Member

lesteve commented Oct 17, 2019

I think one really cool thing you could do is check whether the simplest work-around we suggested works.

The idea is to run your script or notebook inside an interactive job.

If that works, it would be great to document the work-around (related to #356).
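When trying that work-around, a quick sanity check that the script really is running inside a job allocation (and not still on the login node) can help; this relies on the standard `SLURM_JOB_ID` environment variable that SLURM sets inside jobs:

```python
import os
import socket


def describe_runtime() -> str:
    """Report whether this process runs inside a SLURM job, and on which host."""
    job_id = os.environ.get("SLURM_JOB_ID")
    host = socket.gethostname()
    if job_id:
        return f"inside SLURM job {job_id} on {host}"
    return f"outside any SLURM job on {host} (probably a login node)"


print(describe_runtime())
```

Other batch systems set analogous variables (e.g. `PBS_JOBID` on PBS), so the same idea ports with a different key.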

@orbitfold
Author

I think one really cool thing you could do is check whether the simplest work-around we suggested works.

The idea is to run your script or notebook inside an interactive job.

If that works, it would be great to document the work-around (related to #356).

So I personally only have access to a cluster where it works. However, I have been told that of the two problem clusters, the work-around worked on one. On the other it failed to establish a TCP connection, which would imply they disallow TCP connections even between compute nodes. That is nuts, but that is the world we live in. If you need more details, I'm happy to try to provide them.

@orbitfold
Author

I also checked whether it works if I run the script on our cluster (where it already works) in interactive mode, and it does. So as long as the cluster admins allow TCP connections between compute nodes, this is a valid work-around.

@lesteve
Member

lesteve commented Oct 18, 2019

Thanks a lot for your feedback. It would be great if you want to contribute some documentation with this content; I am thinking mostly of login node to compute node TCP/IP port restrictions, plus the interactive-job work-around.

On the other it failed to establish a TCP connection, which would imply they disallow TCP connections even between compute nodes. That is nuts, but that is the world we live in.

If you haven't done it already, I would suggest contacting IT about this and explaining your use case. You can certainly point to examples of "serious" clusters, like Cheyenne (see the Pangeo docs) or Summit (the first cluster in the Top 500, IIRC); see https://blog.dask.org/2019/08/28/dask-on-summit. The Pangeo community may be a good place to get involved as well, if you haven't already.

It can be frustrating at times, but some issues cannot be fixed technically, only socially or politically, whatever you want to call it. I fully sympathise with the frustration part: I am currently trying to get an account on the newfangled IT cluster for Artificial Intelligence in France, and let me tell you, there is some room for improvement in the user-experience area ...

Out of curiosity, could you give us a few more clues about the clusters you were mentioning? Info like: name, geographical location, main scientific domain of the cluster's users if any, etc. My goal here is to get a better picture of the variability in HPC situations.

@orbitfold
Author

orbitfold commented Oct 18, 2019

Out of curiosity, could you give us a few more clues about the clusters you were mentioning? Info like: name, geographical location, main scientific domain of the cluster's users if any, etc. My goal here is to get a better picture of the variability in HPC situations.

I personally work for LRZ in the Munich area; we host SuperMUC-NG and a number of smaller clusters. There is no single scientific domain of focus. The nodes are mostly Skylake and Haswell. We use SLURM on all clusters. I asked a colleague to provide info about the other clusters.

@lesteve
Member

lesteve commented Oct 18, 2019

Thanks a lot, that was exactly the level of information I was after.

@lesteve lesteve added the question Further information is requested label Dec 2, 2019
@guillaumeeb
Member

Maybe @guillaumeeb has some sys-admin perspective about how common TCP/IP restriction between login and compute nodes is and whether it is more likely that it is a partial or total port restriction.

I've really no idea how common this is, but I imagine it would be a near-total restriction. In our setup, we've got login nodes that automatically redirect you, upon connection, to what we call interactive nodes (but you can stay on those nodes indefinitely; they are not interactive in the interactive-job sense). The login nodes are highly secured, but the interactive ones are fully open to the compute nodes.

@guillaumeeb guillaumeeb added the documentation Documentation-related label Dec 30, 2019
@guillaumeeb
Member

Closing in favor of #356 as what is needed here is documentation.
