Support for Dask Gateway clusters from config #135

Open
TomAugspurger opened this issue Jun 29, 2020 · 8 comments

@TomAugspurger
Member

Right now, IIUC, creating a cluster with the button relies on config that specifies a Python class plus the args and kwargs to pass to it. This isn't flexible enough for dask-gateway, which requires creating an intermediate Gateway object.

Two options:

  1. Expand the logic of the lab extension's cluster creation to take a snippet of code to run.
  2. Update dask-gateway to have a "simple" way of creating a cluster that just uses the defaults (cc @jcrist).

dask/dask-gateway#55 is related, but is more focused on expanding dask-labextension to take advantage of dask-gateway.
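
For readers unfamiliar with the class/args/kwargs style of config mentioned above, here is a minimal sketch of how such a factory could be resolved. It is not the extension's actual implementation: the function name `make_cluster` and the dictionary layout are assumptions for illustration, and a stdlib class stands in for a real cluster class so the snippet runs anywhere.

```python
import importlib
from typing import Any


def make_cluster(factory: dict) -> Any:
    """Instantiate the class named by a module/class/args/kwargs config.

    Hypothetical helper: it only illustrates the general pattern of a
    single constructor call driven by config.
    """
    module = importlib.import_module(factory["module"])
    cls = getattr(module, factory["class"])
    return cls(*factory.get("args", []), **factory.get("kwargs", {}))


# Stand-in config using a stdlib class; a real config would name a
# cluster class instead.
example = {"module": "fractions", "class": "Fraction", "args": [3, 4], "kwargs": {}}
obj = make_cluster(example)
```

Because the pattern boils down to one constructor call, anything that needs a second step (build an object, then call a method on it) does not fit without extra logic.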

@jcrist
Member

jcrist commented Jun 29, 2020

dask-gateway doesn't require creating an intermediate Gateway object - you can already call dask_gateway.GatewayCluster directly (we might want to update our docs to better show this). I've verified things work with dask-labextension with default parameters just like any other cluster.

@TomAugspurger
Member Author

Thanks for the info Jim.

@TomAugspurger
Member Author

@jcrist there's still one issue with this: the code generated to connect the client is something like

from dask.distributed import Client

client = Client("gateway://traefik-gcp-uscentral1b-prod-dask-gateway.prod:80/prod.bb16cdceacd541089ac9d7288d717595")
client

but when auth is enabled, you don't have the security object and so that raises

TypeError: Gateway expects a `ssl_context` argument of type ssl.SSLContext, instead got None

Do you or @ian-r-rose have any guesses on whether that can be supported? Nothing comes to my mind immediately.

@ian-r-rose
Collaborator

@TomAugspurger good question. I don't think there is a good way to do this right now from the labextension side. As you point out, the code template to generate a client connection is pretty dumb:

export function getClientCode(cluster: IClusterModel): string {
  return `from dask.distributed import Client
client = Client("${cluster.scheduler_address}")
client`;
}

If the client has a way to pick up an SSL key from the environment context, that would be best from my perspective.
Otherwise, we may need to teach the cluster representation about auth. The current typings for the model are specified here:

/**
 * An interface for a JSON-serializable representation of a cluster.
 */
export interface IClusterModel extends JSONObject {
  /**
   * A unique string ID for the cluster.
   */
  id: string;
  /**
   * A display name for the cluster.
   */
  name: string;
  /**
   * A URI for the dask scheduler.
   */
  scheduler_address: string;
  /**
   * A URL for the Dask dashboard.
   */
  dashboard_link: string;
  /**
   * Total number of cores used by the cluster.
   */
  cores: number;
  /**
   * Total memory used by the cluster, as a human-readable string.
   */
  memory: string;
  /**
   * The number of workers for the cluster.
   */
  workers: number;
  /**
   * If adaptive is enabled for the cluster, this contains an object
   * with the minimum and maximum number of workers. Otherwise it is `null`.
   */
  adapt: null | { minimum: number; maximum: number };
}

so only the address is tracked at the moment.

@jcrist
Member

jcrist commented Jul 14, 2020

Could we make the template configurable and formatted on the server side? Then it could use Gateway.connect instead, which might be cleaner. Would need the cluster variable in the template format, but that's about it.
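
A rough sketch of what server-side template formatting along these lines could look like. Everything here is an assumption for illustration: the names `GATEWAY_TEMPLATE` and `render_client_code`, the `$cluster_name` placeholder, and the example cluster name are hypothetical, and this is not how dask-labextension is implemented. The generated snippet relies on `Gateway.connect` and `get_client` from dask-gateway, and on `Gateway()` picking up its address and auth from configuration.

```python
from string import Template

# Hypothetical user-configurable template. `$cluster_name` is the only
# placeholder the server would fill in.
GATEWAY_TEMPLATE = Template(
    "from dask_gateway import Gateway\n"
    "\n"
    "gateway = Gateway()\n"
    'cluster = gateway.connect("$cluster_name")\n'
    "client = cluster.get_client()\n"
    "client"
)


def render_client_code(cluster_name: str, template: Template = GATEWAY_TEMPLATE) -> str:
    """Return the Python snippet to inject into a notebook cell."""
    # No escaping is done here; a real implementation would need to
    # handle names containing quotes or other special characters.
    return template.substitute(cluster_name=cluster_name)


# "example-cluster" is a made-up name used only to exercise the sketch.
code = render_client_code("example-cluster")
```

`string.Template` is used rather than `str.format` so that braces in a user-supplied template do not need escaping.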

@ian-r-rose
Collaborator

@jcrist yes, that would be doable. Would you envision users setting the template in their config, or adding some kind of entrypoint to dask-gateway? At that point, I wonder if it would also be worthwhile to bite the bullet and special case dask-gateway to use it for cluster discovery and management (at least optionally).

@jcrist
Member

jcrist commented Jul 14, 2020

I was thinking this would be part of the user-side config for dask-labextension. I do think in the long run we'll want to special-case dask-gateway for the lab extension, but exposing the template to the user will resolve this issue, and feels like a useful thing to do generally (there are other kwargs the user might potentially want to set as well).

@thomafred

I see that @consideRatio found a fix using the GatewayCluster class (#203). Confirmed to work with the daskhub Helm chart:

jupyterhub:
  hub:
    extraConfig:
      10-patch-dask-labextension-config: |-
        c.KubeSpawner.environment.setdefault("DASK_LABEXTENSION__FACTORY__MODULE", "dask_gateway")
        c.KubeSpawner.environment.setdefault("DASK_LABEXTENSION__FACTORY__CLASS", "GatewayCluster")
        c.KubeSpawner.environment.setdefault("DASK_LABEXTENSION__FACTORY__ARGS", "[]")
        c.KubeSpawner.environment.setdefault("DASK_LABEXTENSION__FACTORY__KWARGS", "{}")

However, the DASK DASHBOARD URL is not set correctly, as mentioned by Erik in the issue above.
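
For context on the environment variables in the snippet above: Dask's configuration system reads `DASK_`-prefixed variables, treats a double underscore as a nesting separator, and tries to parse values as Python literals. The sketch below mimics that convention with the standard library only; `env_to_config` is a hypothetical helper, not Dask's own code, and edge cases may differ from the real behavior.

```python
import ast


def env_to_config(environ: dict, prefix: str = "DASK_") -> dict:
    """Turn DASK_-style environment variables into a nested dict."""
    config: dict = {}
    for name, raw in environ.items():
        if not name.startswith(prefix):
            continue
        # DASK_LABEXTENSION__FACTORY__CLASS -> ["labextension", "factory", "class"]
        keys = name[len(prefix):].lower().split("__")
        try:
            value = ast.literal_eval(raw)  # "[]" -> [], "{}" -> {}
        except (SyntaxError, ValueError):
            value = raw  # plain strings such as "dask_gateway" stay as-is
        node = config
        for key in keys[:-1]:
            node = node.setdefault(key, {})
        node[keys[-1]] = value
    return config


env = {
    "DASK_LABEXTENSION__FACTORY__MODULE": "dask_gateway",
    "DASK_LABEXTENSION__FACTORY__CLASS": "GatewayCluster",
    "DASK_LABEXTENSION__FACTORY__ARGS": "[]",
    "DASK_LABEXTENSION__FACTORY__KWARGS": "{}",
}
config = env_to_config(env)
```

Under this reading, the four variables amount to a `labextension.factory` section naming `dask_gateway.GatewayCluster` with empty args and kwargs.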

4 participants