Implementing an Ersatz of ClusterManager to fix jobqueue issues linked to upstream deploy.Cluster limitations #170

Closed
guillaumeeb opened this issue Oct 6, 2018 · 5 comments
Labels
enhancement New feature or request

Comments

guillaumeeb commented Oct 6, 2018

So we have a target: dask/distributed#2235.

We have some issues related to it:

and perhaps less related: #103

I propose, as discussed in some of the issues and PRs mentioned above, to try to fix those issues directly in dask-jobqueue. This will involve duplicating some of the logic of the distributed.deploy.Cluster object here, but it should also give some useful insight into how to refactor things for dask/distributed#2235.

I propose to do this gradually, submitting PRs that fix the issues one by one, and along the way to analyse and highlight the pieces of the current deploy.Cluster mechanism that should be modified for dask/distributed#2235.

guillaumeeb changed the title from "Implementing an Erzatz of ClusterManager to fix jobqueue issues linked to upstream deploy.Cluster limitations" to "Implementing an Eszatz of ClusterManager to fix jobqueue issues linked to upstream deploy.Cluster limitations" on Oct 6, 2018
mrocklin commented Oct 7, 2018

I think that having dask-jobqueue break entirely from the classes in dask-distributed would be an interesting approach. Of the three projects (jobqueue, kubernetes, yarn) jobqueue seems to be the fastest moving right now. Rather than constrain your development by the slowness of the group, it might be useful to see what you come up with and how you restructure things.

guillaumeeb changed the title from "Implementing an Eszatz of ClusterManager to fix jobqueue issues linked to upstream deploy.Cluster limitations" to "Implementing an Ersatz of ClusterManager to fix jobqueue issues linked to upstream deploy.Cluster limitations" on Oct 15, 2018
guillaumeeb added the enhancement label ("New feature or request") on Oct 27, 2018
guillaumeeb self-assigned this on Oct 27, 2018
@guillaumeeb
Member Author

@mrocklin I'm currently working on the ClusterManager implementation here. I'm wondering whether it would be cleaner to simply remove the dependency on distributed.deploy from dask-jobqueue and duplicate the needed parts here. Currently I'm somewhere in between...

@guillaumeeb
Member Author

But this is probably what you had in mind...

@mrocklin
Member

I see no problem with breaking from the distributed.deploy code and starting fresh.

@guillaumeeb
Member Author

Closing this, as the SpecCluster implementation in #306 covers it.
