Explore API implications for allowing multiple notebook extensions to share the same file format #106694

Closed
rebornix opened this issue Sep 14, 2020 · 5 comments
Assignees
Labels
feature-request Request for new features or functionality notebook-api
Milestone

Comments

@rebornix
Member

Currently, the notebook API consists of three parts: content persistence (content provider), notebook execution (notebook kernels), and output rendering (output renderers). This abstraction helps reduce redundant code: for example, .NET and Python Jupyter notebooks can share the same content provider because the parsing/persisting logic is identical.

However, it's not yet clear how language extensions would opt in if the content provider is shared. For example, how would a .NET extension decide whether it should contribute actions/commands to the editor title bar / context menu in an ipynb file?

@rebornix rebornix added feature-request Request for new features or functionality notebook labels Sep 14, 2020
@rebornix rebornix added this to the September 2020 milestone Sep 14, 2020
@rebornix
Member Author

rebornix commented Sep 15, 2020

NotebookDocument#languages

Currently, the notebook content provider uses this property to indicate which code cell languages are supported in a specific notebook type. However, in Jupyter the supported language is actually part of the kernel info (standard kernels are single-language kernels).

"metadata": {
	"kernelspec": {
		"display_name": "Micronaut",
		"language": "groovy",
		"name": "micronaut"
	},
	"language_info": {
		"codemirror_mode": "groovy",
		"file_extension": ".groovy",
		"mimetype": "",
		"name": "Groovy",
		"nbconverter_exporter": "",
		"version": "2.5.6"
	}
}

If the content provider and the kernel provider are registered in different extensions, the content provider only knows which language was used the last time the document was connected to a kernel:

  • if there is a language_info, that is the language used in existing cells and in new cells
  • if a new kernel is connected to a document, kernelspec/language_info will change accordingly, which also changes the preferred language of the document
  • Jupyter documents currently have no concept of polyglot notebooks, so they have no notion of
    • available languages (a .NET kernel supports csharp, fsharp, ps1)
    • default language (maybe csharp)
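
The first bullet can be sketched as a small helper that derives a document's preferred language from the saved metadata. This is only an illustration, not VS Code API: `JupyterMetadata` and `preferredLanguage` are hypothetical names, and both the fallback to `kernelspec.language` and the lower-casing of the language name are assumptions.

```typescript
// Hypothetical shape: only the fields used below, taken from the ipynb
// metadata shown above.
interface JupyterMetadata {
  kernelspec?: { name: string; display_name: string; language?: string };
  language_info?: { name: string; version?: string; file_extension?: string };
}

// Returns the language last used with this document, or undefined if the
// file carries no kernel or language information.
function preferredLanguage(metadata: JupyterMetadata): string | undefined {
  // Assumption: language ids are lower-case, so "Groovy" maps to "groovy".
  const fromLanguageInfo = metadata.language_info?.name;
  if (fromLanguageInfo) {
    return fromLanguageInfo.toLowerCase();
  }
  // Assumed fallback: the language recorded in the kernelspec.
  return metadata.kernelspec?.language?.toLowerCase();
}

const savedMetadata: JupyterMetadata = {
  kernelspec: { display_name: "Micronaut", language: "groovy", name: "micronaut" },
  language_info: { name: "Groovy", file_extension: ".groovy", version: "2.5.6" },
};
const lastUsedLanguage = preferredLanguage(savedMetadata);
```

With the metadata from the example above, `lastUsedLanguage` is "groovy". A document that has never been connected to a kernel yields `undefined`, which is the case where the content provider has nothing to go on.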

Imagine we build a Jupyter content provider. To meet the requirements listed above, it needs to

  • provide available languages
  • provide preferred language (the preferred language comes from the kernel)
  • listen to the active kernel change event on the document, and read the supported languages of the kernel
    • the kernel should describe which code languages it supports (maybe via NotebookKernel.metadata, which is a dictionary)
    • the content provider then updates the available languages used in the document
  • language for a new cell
    • when an untitled document is opened, we can probably ask users what language they want to use (when there are multiple ones)
    • when appending a cell to the document, we can guess based on the context
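
The requirements above could be sketched roughly as follows. None of these names are VS Code API: `KernelInfo`, `DocumentLanguages`, the `supportedLanguages` and `defaultLanguage` metadata keys, and both functions are hypothetical, and "guess based on the context" is reduced here to "reuse the previous cell's language if the kernel still supports it".

```typescript
// Hypothetical description of a kernel; the metadata keys are assumptions.
interface KernelInfo {
  id: string;
  metadata: { supportedLanguages?: string[]; defaultLanguage?: string };
}

// Hypothetical per-document language state kept by the content provider.
interface DocumentLanguages {
  available: string[];
  preferred?: string;
}

// Recomputes the document's languages when the active kernel changes.
function onActiveKernelChanged(
  current: DocumentLanguages,
  kernel: KernelInfo | undefined
): DocumentLanguages {
  const supported = kernel?.metadata.supportedLanguages ?? [];
  if (supported.length === 0) {
    // No kernel, or the kernel did not describe itself: keep the old state.
    return current;
  }
  const declared = kernel?.metadata.defaultLanguage;
  const preferred =
    declared !== undefined && supported.indexOf(declared) !== -1
      ? declared
      : supported[0];
  return { available: supported.slice(), preferred };
}

// Guesses the language for a newly appended cell from its context.
function languageForNewCell(
  doc: DocumentLanguages,
  previousCellLanguage?: string
): string | undefined {
  if (
    previousCellLanguage !== undefined &&
    doc.available.indexOf(previousCellLanguage) !== -1
  ) {
    return previousCellLanguage;
  }
  return doc.preferred ?? doc.available[0];
}

const dotnetKernel: KernelInfo = {
  id: "dotnet",
  metadata: { supportedLanguages: ["csharp", "fsharp", "ps1"], defaultLanguage: "csharp" },
};
const afterSwitch = onActiveKernelChanged(
  { available: ["python"], preferred: "python" },
  dotnetKernel
);
```

In this sketch, switching a Python document to the .NET kernel replaces the available languages with csharp, fsharp and ps1 and makes csharp the preferred language. A new cell appended after an fsharp cell stays fsharp.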

notes

In Jupyter, the frontend gets a kernel_info_reply message from the kernel, which contains kernelInfo and languageInfo.
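
As a rough illustration, the language part of such a reply could be mapped into the dictionary proposed for NotebookKernel.metadata. The reply type below is a subset of the Jupyter messaging spec's `content` for kernel_info_reply. `KernelLanguageMetadata` and `toKernelMetadata` are hypothetical names, and the `implementation` value in the example is made up.

```typescript
// Subset of the `content` field of a Jupyter kernel_info_reply message.
// Other fields (banner, help_links, ...) are omitted for brevity.
interface KernelInfoReplyContent {
  implementation: string;
  language_info: {
    name: string;
    version: string;
    mimetype: string;
    file_extension: string;
  };
}

// Hypothetical dictionary a kernel could expose to the content provider.
interface KernelLanguageMetadata {
  implementation: string;
  language: string;
  languageVersion: string;
  fileExtension: string;
}

function toKernelMetadata(reply: KernelInfoReplyContent): KernelLanguageMetadata {
  return {
    implementation: reply.implementation,
    language: reply.language_info.name,
    languageVersion: reply.language_info.version,
    fileExtension: reply.language_info.file_extension,
  };
}

// Language values reuse the Groovy example above; the implementation name
// is illustrative only.
const exampleReply: KernelInfoReplyContent = {
  implementation: "example-kernel",
  language_info: { name: "Groovy", version: "2.5.6", mimetype: "", file_extension: ".groovy" },
};
const kernelMetadata = toKernelMetadata(exampleReply);
```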

@rebornix
Member Author

rebornix commented Sep 15, 2020

Notebook document/cell metadata

Prerequisite: the content provider and the kernel providers are contributed by separate extensions.

The Jupyter content provider is responsible for parsing the file based on the file format (jupyter nbdocument v4 or v5). However, kernels/frontends usually evolve faster than the document schema, so the content provider should be flexible enough to allow kernels to insert additional custom document or cell metadata.

For example, cells currently have no identifiers, but if a kernel supports them, it can save one into NotebookCell#metadata.id. The question for the content provider is then: should it save id to disk? It seems only the kernel knows whether this id is persistent or just transient, yet currently only the content provider can control the transient options.

To support this scenario, we may want to enrich the document/cell/output metadata:

interface *Metadata {
  runnable?: boolean;
  ...
  // transient custom metadata
  transient?: { [key: string]: any };
  // other custom metadata
  [key: string]: any
}

*Metadata#transient is always transient. Kernels can use the editing API to add custom transient metadata by modifying this property. All other properties (those not listed in NotebookDocumentContentOptions#transientMetadata) will be saved to disk by the Jupyter content provider.
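
A minimal sketch of that save rule, under stated assumptions: `CellMetadata` mirrors the proposed interface, `metadataToPersist` is a hypothetical helper, and `transientKeys` stands in for whatever the content provider lists in its transient options. Treating `runnable` as transient here is only for illustration.

```typescript
// Hypothetical shape mirroring the proposed *Metadata interface.
interface CellMetadata {
  transient?: { [key: string]: unknown };
  [key: string]: unknown;
}

// Returns the subset of metadata a content provider would write to disk.
function metadataToPersist(
  metadata: CellMetadata,
  transientKeys: string[] = []
): { [key: string]: unknown } {
  const result: { [key: string]: unknown } = {};
  for (const key of Object.keys(metadata)) {
    if (key === "transient" || transientKeys.indexOf(key) !== -1) {
      continue; // never saved
    }
    result[key] = metadata[key];
  }
  return result;
}

const cellMetadata: CellMetadata = {
  id: "cell-1", // custom and persistent: added by a kernel
  runnable: true, // assumed to be listed in the transient options
  transient: { executionSession: 7 }, // custom and transient
};
const persistedMetadata = metadataToPersist(cellMetadata, ["runnable"]);
```

Only `id` survives in `persistedMetadata`. The kernel decides persistence by choosing where it writes, and the content provider never needs to know what `id` means.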

@rebornix
Member Author

rebornix commented Sep 15, 2020

How would extensions contribute features based on context

Now that the Jupyter content provider is contributed by a shared/common extension, language extensions can focus fully on language-related business. However, they need a bit more information from either the core or the content provider to decide whether a particular feature should be enabled:

  • debugging
    • the language extension needs to know if the right kernel is selected: for example, Julia turns on debugging only when the kernel is from the Julia extension.
  • language features
    • Monaco Editor has the context key editorLangId; we may want to add notebookEditorLanguages, which contains all supported languages in the document.
    • language extensions may also need to read the active kernel to contribute actions/commands.
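
To make the idea concrete, here is a sketch of how such context values could be computed and consumed. It is not VS Code API: `NotebookEditorState`, `NotebookContextKeys`, `computeContextKeys` and `canDebug` are hypothetical, `notebookKernelId` is an assumed name for an active-kernel key, and the languages are approximated by the distinct languages of the cells.

```typescript
// Hypothetical snapshot of what the core knows about the active notebook editor.
interface NotebookEditorState {
  cellLanguages: string[]; // language of every cell, in document order
  activeKernelId?: string; // id of the selected kernel, if any
}

// Hypothetical context values; only notebookEditorLanguages is named above.
interface NotebookContextKeys {
  notebookEditorLanguages: string[];
  notebookKernelId: string | undefined;
}

function computeContextKeys(state: NotebookEditorState): NotebookContextKeys {
  const languages: string[] = [];
  for (const lang of state.cellLanguages) {
    if (languages.indexOf(lang) === -1) {
      languages.push(lang); // keep first occurrence, preserve order
    }
  }
  return {
    notebookEditorLanguages: languages,
    notebookKernelId: state.activeKernelId,
  };
}

// A language extension enables debugging only for kernels it contributed.
function canDebug(keys: NotebookContextKeys, ownKernelIds: string[]): boolean {
  return (
    keys.notebookKernelId !== undefined &&
    ownKernelIds.indexOf(keys.notebookKernelId) !== -1
  );
}

const juliaKeys = computeContextKeys({
  cellLanguages: ["julia", "markdown", "julia"],
  activeKernelId: "julia-1.5",
});
const juliaDebugEnabled = canDebug(juliaKeys, ["julia-1.5"]);
```

Here `juliaDebugEnabled` is true because the active kernel id belongs to the extension. With any other kernel selected, the same document would not enable debugging.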

@rebornix
Member Author

A shorter summary of the API implications if the content provider is shared between multiple extensions:

  • Allow content providers to update supported languages (and maybe preferred language)
  • Allow language extensions to add custom (persistent/transient) metadata to notebook document and cells
  • Support metadata in NotebookKernel, which will carry Jupyter kernel/language info
  • Context keys for the active kernel (by id) and language info (supported languages in the document)

Notes:

While doing the above analysis, a question occurred to me: what is the scope of the shared Jupyter extension? Traditional Jupyter consists of two processes:

Kernel

  • runs user code
  • provides partial rich language support (auto complete)

and Frontend, which does all the rest

  • a user interface where the user types code and previews results (REPL/Interactive Window, Jupyter Lab)
  • document management (store user input and outputs in a file)
  • kernel management (kernel detection and connection management)

The current implementation of Jupyter notebook support in VS Code mostly follows this model, with one exception: VS Code is the user interface, and the Python extension handles the document and kernel management.

If we move the Jupyter content provider from the Python extension to a shared extension, where should the kernel management go?

  • it sounds natural that extensions should share the kernel management infrastructure, as they are all Jupyter kernels.
  • but how would a language extension then contribute richer language support on top of the kernel? (Debugging is similar.) We will need to figure out how an extension talks to a kernel. In the current implementation of the Julia and R notebooks, the extension is responsible for spawning the kernel process and then communicates directly with the kernel to provide better debugging/language support.

@rebornix
Member Author

rebornix commented Nov 4, 2021

Closing as we already shipped the serializer API.

@rebornix rebornix closed this as completed Nov 4, 2021
@github-actions github-actions bot locked and limited conversation to collaborators Dec 19, 2021