Open discussion of various topics... #9

kevin-bates opened this issue May 22, 2019 · 9 comments
@kevin-bates
Collaborator

@takluyver - it's not clear to me where to open these discussions. I believe a Jupyter Server discourse category will be created soon and I'd like to see a Kernel Providers sub-category within it. Until then, I thought I'd open (for now) a single issue with multiple topics that, I believe, are worthy of discussion.

I'm happy to start working on these, but wanted to have the discussion first. I'm also copying @minrk, @rgbkrk, and @Carreau in case they have opinions. Anyone else, please feel free as well.

Whitelisting
Whitelisting is important in scenarios where many (dozens) of "kernelspecs" are configured. It provides a mechanism for administrators to isolate/limit which kernel types are available to the end user. I believe we need to preserve this functionality. Coupled with this PR in traitlets, admins could modify the whitelist without having to restart the server.
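
For reference, today's trait-based mechanism looks roughly like the following (a minimal sketch assuming the existing KernelSpecManager.whitelist trait in jupyter_client; the kernelspec names are placeholders):

# jupyter_notebook_config.py - the current, pre-provider whitelisting,
# shown for reference; the kernelspec names here are illustrative only.
c.KernelSpecManager.whitelist = {'python3', 'spark_scala'}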

The current implementation doesn't consider whitelisting, nor is there a location from which to associate the trait. Was its omission intentional? If so, could you explain why?

Parameters
This came up in the Jupyter Server Design workshop. Each provider that supports parameterization will need to return the parameter metadata corresponding to its provided kernel. Rather than expose a new field returned from the REST API, I propose we make a parameters stanza within the already supported metadata stanza. The format of the parameters will be JSON that adheres to a schema - the idea being that consuming applications will have what they need to appropriately prompt for values. Frontend applications that are interested in presenting parameters and gathering their values would look for the parameters stanza in metadata. Those that aren't will continue as they do today.

This implies that metadata should be (optionally) returned from find_kernels, depending on whether the provider supports parameters or needs to convey other information in that stanza.
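
To make this concrete, here's a hypothetical sketch (as a Python dict) of a kernelspec returned from the REST API with a parameters stanza inside metadata - every name beyond the existing kernelspec structure is illustrative, not settled:

# Hypothetical /api/kernelspecs entry; the proposed "parameters" stanza
# (a JSON Schema) nests inside the already supported "metadata" stanza.
spec = {
    "name": "pyspark_k8s",                  # illustrative kernel type id
    "spec": {
        "display_name": "PySpark on Kubernetes",
        "language": "python",
        "metadata": {
            "parameters": {                 # proposed stanza
                "type": "object",
                "properties": {
                    "memory_gb": {"type": "integer", "default": 8}
                }
            }
        }
    }
}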

I don't think there's any need to return argv. That strikes me as purely server and, for that matter, provider-specific.

Besides the parameter values, other information will need to be fed to the provider methods - so we probably want **kwargs on init or launch. Examples of this are configuration-related items and other items included in the JSON body of the kernel start request.

Kernel Session Persistence
We need the ability to persist relevant information associated with a running kernel in order to reconnect to it from another server instance. The use-case here is high availability. While HA functionality may be provided as an extension to Jupyter Server, methods to save and load that information are necessary, as is the ability to "launch" from the persisted state.

Async Kernels
I think we should ONLY support async kernels in this new implementation. This implies that Jupyter Server will need to support async kernels at its MappingKernelManager level. We can leverage the async kernel management work in the PRs already open on jupyter_client and notebook.

If this isn't the right mechanism to use for discussion, please let me know where that can take place.

Thank you.

@takluyver
Owner

Hi Kevin. Sorry for the delay in replying, but I am happy to use GitHub issues for questions.

  • Whitelisting: This probably makes sense as a parameter like KernelFinder.from_entrypoints(allow_only=['foo', 'bar']). I'm trying to keep traitlets out of this machinery.
  • Parameters: I think what you propose makes sense. I'd pass the parameters back into the launch method, probably like kernel_params={...} rather than **kwargs, because that makes it easier if we want to add other parameters for the infrastructure. (See the sketch after this list.)
  • Persistence: the connection info is a plain dict, so there's no problem there. The manager object (equivalent to your process proxies) can have a method to serialise itself (maybe we should standardise the name for this). And for deserialisation, I'd conceive it as an alternative constructor for the manager object, rather than as an alternative 'launch' mechanism.
  • Async kernels: that probably makes sense. But then, which async framework? The notebook server has traditionally used tornado, which is now shifting to run on top of asyncio. But then I started reading about the ideas behind trio, and that's a pretty attractive design. And how convenient should we make it to use the machinery in straightforward synchronous code? This was about where my thinking got to when I decided to postpone the async question.
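
A rough sketch of what the first three bullets might look like together - apart from KernelFinder.from_entrypoints(allow_only=...) as proposed above, every name here is a placeholder for discussion, not a settled API:

# Whitelisting at construction time, as proposed above.
finder = KernelFinder.from_entrypoints(allow_only=['pyspark', 'python3'])

# Parameters passed explicitly rather than via **kwargs.
connection_info, manager = finder.launch('pyspark',
                                         kernel_params={'memory_gb': 8})

# Persistence: the manager serialises itself (name to be standardised),
# and an alternative constructor rebuilds the manager later.
state = manager.serialise()
manager2 = PySparkKernelManager.from_state(state)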

I should also highlight that this machinery allowing kernels to be dynamically provided was meant to be the counterpart to a rethink of how notebooks are associated with kernels. Naming 'the' kernel for a notebook in its metadata makes sense when that's something like 'python', but not when it might be something like an environment or a container on a specific system, or a remote host that someone else may not have access to.

The mechanisms to replace that never really got fleshed out. I think it would be a mistake to migrate to this machinery without also fixing that problem - the two were always meant to go together. But I don't have the time or the energy to push that kind of major redesign through. So I can't give any estimate of when this machinery would be ready for real use.

@rgbkrk
Contributor

rgbkrk commented May 28, 2019

Hey all! Thanks for pinging me! cc @ivanov @mpacer @MSeal

I realize we've each taken different approaches to explore the space for parametrized kernels. Since I wasn't able to attend the server workshop, I'll have to peruse what all you explored at some point. @SylvainCorlay mentioned you all exploring parametrized kernels there.

The way I've been thinking about it is largely in two areas:

  • User focus: What does allowing the selection of attributes provide for the user and UI developers?
  • Standards: What's the on-disk "spec" for launching these (the kernel.json, argv, etc.), so that we ensure these work across jupyter server implementations, cocalc, nbconvert, papermill, nteract desktop, hydrogen, and vs code?

The major difference from what I'm about to outline is that I don't wish to prescribe how the jupyter server operates; I think this is a great space for you all to innovate. I primarily want to make sure it's interoperable and solves the needs of many operators and users. 😄

Purpose

In the deployment environment I work in, there are many versions of Spark as well as clusters that can be connected to. When a user wants to launch a notebook, they either click through a large menu of kernelspecs or are faced with a huge grid of them (screenshots omitted), when in reality that's more like 4 kernels that have a custom version of Spark and cluster properties set. We currently have to generate these kernelspecs, which works yet is burdensome in our image (and possibly not up to date with the platform).

What I'd love to expose for users is a way to select the version of Spark available, from a list of valid values. On top of that, I'd love to be able to specify jars on start. Because of how Spark is launched, it's paramount to do this in advance of the kernel starting.

Possible means

The approach I've discussed with some other folks (see hackpad for parametrized kernelspecs) involves adding another variable to the kernelspec's argv that will be a path to parameters for a kernel. Alternatively, the parameters could go into the connection file, possibly on a parameters key. For now, let's assume we create a parameters-UUID.json file in the same directory as our connection file.

To make this a bit more clear, here's an example kernel.json:

{
  "display_name": "pyspark kernel",
  "language": "python",
  "argv": [
    "/usr/local/bin/pyspark-kernel",
    "-c",
    "{connection_file}",
    "-p",
    "{parameters_file}"
  ]
}
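
A hypothetical sketch of how a launcher might materialise that file and perform the substitution (the paths, filenames, and parameter values are all made up):

import json, os, tempfile, uuid

# Write validated parameter values next to the connection file, then
# substitute both paths into the kernelspec's argv before exec'ing it.
params = {"sparkVersion": "2.1.1"}
run_dir = tempfile.mkdtemp()
connection_file = os.path.join(run_dir, "kernel-%s.json" % uuid.uuid4())
parameters_file = os.path.join(run_dir, "parameters-%s.json" % uuid.uuid4())
with open(parameters_file, "w") as f:
    json.dump(params, f)

argv = ["/usr/local/bin/pyspark-kernel",
        "-c", "{connection_file}",
        "-p", "{parameters_file}"]
cmd = [arg.format(connection_file=connection_file,
                  parameters_file=parameters_file) for arg in argv]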

In order to fill out these parameters though, we need to know what's allowed! One way is by having a JSON Schema with allowed values. Here's an example parameter-spec.json that would be within the kernelspec directory:

{
  "$schema": "http://json-schema.org/draft-04/schema#",
  "description": "Schema for kernel with parameters",
  "type": "object",
  "properties": {
    "sparkVersion": {
      "type": "string",
      // somehow we want this to be a dropdown that gets provided;
      // default behavior is to provide a text box, but we'd like to be
      // able to query the available options for spark
      "default": "2.1.1"
    },
    "anotherSetting": {
      "type": "boolean",  // generate a checkbox on the frontend
      "default": true
    }
  },
  "required": [
    "sparkVersion",
    "anotherSetting"
  ]
}

This allows for server-side validation before passing values to the kernel. As noted in the comments, the Jupyter server and the UI can use a more focused version of this spec, replacing the string field for the Spark version with an enum when passed to the UI. It would be up to the server to introduce advanced ways of querying and setting this. The on-disk schema is a superset of the schema presented to the frontend.
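
For example, the server-side check could be a straightforward jsonschema validation against the parameter-spec.json above (a sketch assuming the schema file has had its illustrative comments stripped so it parses as JSON):

import json
from jsonschema import validate, ValidationError

with open("parameter-spec.json") as f:     # the schema shown above
    schema = json.load(f)

requested = {"sparkVersion": "2.4.0", "anotherSetting": True}
try:
    validate(instance=requested, schema=schema)  # reject bad values pre-launch
except ValidationError as err:
    raise RuntimeError("invalid kernel parameters: %s" % err.message)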

@kevin-bates
Collaborator Author

(Sorry, I was afk the last 6 days.) I see a great discussion brewing here - good stuff.

I'd like to first respond to @takluyver's comments, then @rgbkrk's in a separate comment...

Whitelisting: This probably makes sense as a parameter like KernelFinder.from_entrypoints(allow_only=['foo', 'bar']). I'm trying to keep traitlets out of this machinery.

That seems like a reasonable way to do things. So the KernelFinder would have access to the whitelist traitlet value and pass that, knowing it spans a set of providers where the allowed values may or may not apply.

@takluyver: I'm curious why you felt the need to keep traitlets out of providers? I think that may be a difficult task, but it's probably too early to fully determine that.

Parameters: I think what you propose makes sense. I'd pass the parameters back into the launch method, probably like kernel_params={...} rather than **kwargs, because that makes it easier if we want to add other parameters for the infrastructure.

We can talk more about this (see next comment). I agree that a separate argument is fine. This argument should be whatever we call it when it's provided in the start kernel REST API body. kernel_params sounds fine to me.

Persistence: the connection info is a plain dict, so there's no problem there. The manager object (equivalent to your process proxies) can have a method to serialise itself (maybe we should standardise the name for this). And for deserialisation, I'd conceive it as an alternative constructor for the manager object, rather than as an alternative 'launch' mechanism.

Hmm. The entity that takes the two pieces of persisted information (connection-info and "provider state") must be the provider. It is up to the provider how the KernelManager is constructed. The caller of the provider has no idea what kind of KernelManager it needs to deal with - so this must be a provider thing and, I would argue, this alternate launch mechanism requires both pieces of information (connection-info and whatever the kernel manager produces following the initial launch) - because the provider may want to produce a new set of connection info when loading from persisted state.

I agree these methods should be standardized. I'll toss out KernelManager.get_launch_state() as a proposed name to return the serialized information required to re-establish connection with the kernel - although this all strikes me as extremely circular. And KernelProvider.launch_from_state(connection_info, launch_state) as the means of deserialization. This method would return the same connection_info and KernelManager instance as launch(). (I suppose the persisted information could be optional parameters to launch(), but I think this is cleaner, and not every provider will implement this.)
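
A minimal sketch of the two proposed hooks - the class names and the pod_name attribute are illustrative; only the method names and shapes are what's being proposed:

class MyKernelManager:
    def __init__(self, pod_name):
        self.pod_name = pod_name

    def get_launch_state(self):
        # Serialize whatever is needed to re-establish the connection.
        return {"pod_name": self.pod_name}   # provider-specific, illustrative


class MyKernelProvider:
    def launch_from_state(self, connection_info, launch_state):
        # Deserialize: return (connection_info, manager), the same shape
        # as launch(). The provider may produce fresh connection info here.
        manager = MyKernelManager(**launch_state)
        return connection_info, manager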

The thing that actually persists these two pieces of information (the connection info and the "provider state") will be responsible for associating them into a single entity. This will likely be done via the subclassing of MappingKernelManager unless we want to bake kernel session persistence into the base server.

Async kernels: that probably makes sense. But then, which async framework? The notebook server has traditionally used tornado, which is now shifting to run on top of asyncio. But then I started reading about the ideas behind trio, and that's a pretty attractive design. And how convenient should we make it to use the machinery in straightforward synchronous code? This was about where my thinking got to when I decided to postpone the async question.

Seems like we should get on the same page here and figure this out. I think as long as Jupyter is based on tornado, we should be using its async model. Then again, I don't know the entire evolutionary story here.

I should also highlight that this machinery allowing kernels to be dynamically provided was meant to be the counterpart to a rethink of how notebooks are associated with kernels. Naming 'the' kernel for a notebook in its metadata makes sense when that's something like 'python', but not when it might be something like an environment or a container on a specific system, or a remote host that someone else may not have access to.

The mechanisms to replace that never really got fleshed out. I think it would be a mistake to migrate to this machinery without also fixing that problem - the two were always meant to go together. But I don't have the time or the energy to push that kind of major redesign through. So I can't give any estimate of when this machinery would be ready for real use.

I guess I'm not really following the issue here. If the kernel metadata in the notebook is 'Python on Kubernetes with Spark 2.4' and that notebook is shared with someone for whom 'Python on Kubernetes with Spark 2.4' doesn't apply, they will get prompted to change the kernel - and, in all likelihood, fail to be able to run that notebook. That seems orthogonal to kernel providers. I suspect I'm missing something though. Could you please shed more light on this? Thanks.

@kevin-bates
Collaborator Author

kevin-bates commented May 29, 2019

@rgbkrk - regarding parameterized kernels...

I think we're all very close here.

I definitely agree that a JSON spec that describes what parameters a given kernel provider supports is the way to go. I'd like to see that particular stanza embedded in the already existing metadata stanza because, well, that's exactly what it is. :-)

Given the proposal for kernel providers (which requires further ratification IMHO), disk-based kernelspecs as we know them cannot be assumed. However, what can be assumed is the REST API content, in which the structure of kernelspecs is defined. Just bringing this up since things can get a little ambiguous between these worlds.

I'd like to be careful about just how parameters are used, and I think this is a good thing relative to the kernel providers proposal. That is, it's up to the provider to determine how and when parameters are applied. The provider is responsible for producing the metadata corresponding to the parameters. Some providers may use a file, others may not. That said, we should agree on what meta-properties are supported, etc., so that client-side tooling can be achieved.

We should not assume that all parameters are interpreted solely by the target kernel. In fact, I believe the vast majority of parameters are used to seed the kernel's environment, rather than getting fed directly to the kernel. So, again, this is more of a launch thing and, thus, provider-specific.

Here's what I'd like to see relative to how parameterization is accomplished...

  1. The provider (which would likely use the kernel.json file but, again, that shouldn't be assumed) includes a parameters: stanza within the metadata: stanza that describes supported parameters, along with various meta-properties, like min/max values, default value, enumerated values, max lengths, etc. Any metadata tooling would need to properly present that information to the user.

  2. The JSON body of the /api/kernels POST request includes a kernel_params dict. This can be a set of name/value pairs, since the provider can link the parameter name to the metadata to perform validation, etc. (See the sketch after this list.)

  3. For compatibility purposes with EG, I'd like to see support for an env dict in the JSON body. This is something Kernel Gateway already does, and it has proven extremely useful as a means of providing variable values. I suppose we could have a meta-property of type environment, but the env stanza would be primarily for backwards-compatibility purposes with existing KG/EG clients. (Shameless plug, but important since B/C is a thing and should be supported whenever possible.)
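
Putting items 2 and 3 together, a hypothetical POST /api/kernels body might look like this (shown as a Python dict; the kernel name and values are made up, and KERNEL_USERNAME is just one example of an env value KG/EG clients send today):

# Hypothetical request body for POST /api/kernels.
body = {
    "name": "pyspark_k8s",                       # kernel type to launch
    "kernel_params": {"sparkVersion": "2.4.0"},  # validated against metadata
    "env": {"KERNEL_USERNAME": "alice"}          # B/C with KG/EG clients
}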

It would be great to hop on a call with everyone at some point to hash this out! Again, I think we're close.

@dhirschfeld
Contributor

I guess I'm not really following the issue here. If the kernel metadata in the notebook <snip> doesn't apply, they will get prompted to change the kernel

I guess one problem is that most people probably have a kernel called Python3 or dev or somesuch but there's no guarantee that anyone else's Python3 environment will have the correct dependencies to be able to run the notebook. At that point, is there much point in embedding kernel information in the notebook itself? Maybe it's useful for informational purposes but not for automatically starting a kernel.

Perhaps the language could be embedded and used to filter the suggested kernels, with the top suggestion being the best fuzzy match to the embedded kernel name... but require the user to actually confirm the kernel choice so they can at least see the possible options to choose from.

@kevin-bates
Collaborator Author

Since the vast majority of notebooks will tend to be used against the same configurations, I think storing the selected kernel, along with its parameterized values, should be the default behavior.

However, client applications should probably enable the ability to enter new values - perhaps as an alternative means of starting the kernel? E.g., I just ran this notebook with 8 GB, now I want 32 GB and 2 GPUs.
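
As a strawman, the notebook metadata might carry the stored values alongside the existing kernelspec entry (the parameters field name is invented for illustration):

# Strawman notebook metadata; "parameters" is an invented field holding
# the values used for the last launch.
nb_metadata = {
    "kernelspec": {
        "name": "pyspark_k8s",
        "display_name": "Python on Kubernetes with Spark 2.4"
    },
    "parameters": {"memory_gb": 32, "gpus": 2}
}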

The generically named kernels, like Python3, are really at the mercy of the environment in which they'll run. I'm not sure you can solve that any better than, I guess, using the new launch option to let them pick a different Python3. Perhaps a hash of the kernelspec could be used to determine whether this Python3 kernelspec is different from the one that the notebook previously used? But again, it comes down to the env.

@rgbkrk
Contributor

rgbkrk commented May 30, 2019

I think storing the selected kernel, along with its parameterized values, should be the default behavior.

Definitely agree there.

@MSeal

MSeal commented May 30, 2019

I'm glad this conversation is advancing! We've been talking about this for a while as something we should do. I agree with the majority of the conclusions above. I did have one question I hadn't seen answered: how should the existing clients behave if they encounter a parameterized kernel? Basically, what's the backwards- or forwards-compatibility story for this change, in particular for jupyter_client?

On the opinion side, I believe we should keep full dependency management out of the scope of the initial parameterized kernel spec. By this I mean that we should be clear that the notebook metadata won't store all of its dependencies in relation to the kernel, and that kernels aren't obligated to present an interface for controlling all of their dependencies within their environment. The examples above are mostly focused on configuring existing tooling in the environment rather than generating a completely new dependency chain for each kernel. I believe this distinction will help with keeping the proposal smaller and define a boundary of responsibility for this step forward.

On the dependency topic, I'm going to submit a proposal to nbformat in June for a notebook requirements/dependency spec based on ongoing conversations with a mix of Jupyter devs and notebook teams at various companies. It's possible we might want to move that into a specialization of parameterized kernels as people review it, but when we initially discussed this topic in depth at NES with folks, it seemed like combining these topics was not a good path forward at the time.

@kevin-bates
Collaborator Author

How should the existing clients behave if they encounter a parameterized kernel? Basically, what's the backwards- or forwards-compatibility story for this change, in particular for jupyter_client?

Do you mean - encounter a parameterized kernel spec or an actual kernel? In the former case, the applications (e.g., Notebook or Lab or a REST client) would ignore the parameter metadata of the kernel spec - which is the case today. It should be the responsibility of the kernel provider to provide reasonable defaults for required parameters and, if those cannot be provided, to fail to launch the kernel. I suspect the same is true in the latter case, and hope kernels that require parameters have reasonable defaults or, again, fail the request.
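
In other words, a provider might resolve parameters along these lines when an old client sends none (purely illustrative; the function name is made up):

# Illustrative: fill schema defaults for parameters the client didn't send;
# fail the launch if a required parameter has no default.
def resolve_params(schema, supplied=None):
    resolved = dict(supplied or {})
    for name, prop in schema.get("properties", {}).items():
        if name not in resolved:
            if "default" in prop:
                resolved[name] = prop["default"]
            elif name in schema.get("required", []):
                raise RuntimeError("missing required kernel parameter: %s" % name)
    return resolved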

I believe the majority of parameters will only apply to the launch environment in which the kernel will run (CPUs, memory, volume to mount, image to launch, etc.) and not necessarily be acted upon by the kernel. I may be off on that, but we should make sure that any parameters that are indeed acted on by the kernel itself are well documented so that different kernel providers can use those same kernel implementations and provide the necessary parameter values.

If your compatibility question is more in regard to the kernel provider architecture, I think any applications that currently subclass KernelSpecManager or KernelManager will likely break since those kinds of customizations are meant to be addressed in terms of the kernel provider.
