Jupyter Notebook Roadmap #1815

Open
1 of 6 tasks
jleibs opened this issue Apr 11, 2023 · 1 comment
Labels
enhancement New feature or request notebook Jupyter notebooks etc 🐍 Python API Python logging API 🎄 tracking issue issue that tracks a bunch of subissues

Comments

@jleibs

jleibs commented Apr 11, 2023

The initial version of Jupyter support established in the Jupyter MVP is still extremely limited.

This is a tracking issue intended to plot out the longer-term goals for using notebooks with Rerun.

Much of this is motivated by the desire to support:

The Long-term Vision

Two Different Workflows

The general pattern of creating a view within a notebook always involves three distinct pieces:

  1. Create a Recording
  2. Send data to the Recording
  3. Combine the Recording with a Blueprint to create a View

Although step 1 always comes first, it's important to note that steps 2 and 3 can happen in either order.

When 2 comes before 3, we call this the "end-of-cell" workflow. We don't emit the viewer until the end of cell execution, which means the viewer loads a single static recording payload. The same recording can still be used to create additional views of the data (either in the same cell or in subsequent cells) without rerunning the computation. In practice this is similar to our current "save / open RRD" standalone modes, and it is the only mode supported by the current Jupyter MVP.

When 3 comes before 2, however, we call this an "incremental-cell" workflow. In this mode the View context is created first, and then data is incrementally live-streamed into it. This could all happen within a single long-running cell, or multiple cells could be used to incrementally update a viewer instance output by a previous cell. In practice this is more similar to the standalone "connect to viewer" mode.
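The difference between the two orderings can be sketched with a toy model. This is not the Rerun API: `Recording`, `View`, `show_static`, and `show_live` are hypothetical names invented for illustration, and a simple path prefix stands in for a Blueprint.

```python
from dataclasses import dataclass, field


@dataclass
class Recording:
    """Hypothetical stand-in for a recording: an append-only message log."""
    messages: list = field(default_factory=list)
    subscribers: list = field(default_factory=list)

    def log(self, path, payload):
        msg = (path, payload)
        self.messages.append(msg)
        # Live views (incremental-cell workflow) receive each message as it arrives.
        for view in self.subscribers:
            view.receive(msg)


@dataclass
class View:
    """Hypothetical stand-in for a viewer instance; `prefix` plays the Blueprint role."""
    prefix: str
    received: list = field(default_factory=list)

    def receive(self, msg):
        path, _ = msg
        if path.startswith(self.prefix):
            self.received.append(msg)


def show_static(rec, prefix):
    """End-of-cell: step 2 happened before step 3, so the view loads one static snapshot."""
    view = View(prefix)
    for msg in list(rec.messages):
        view.receive(msg)
    return view


def show_live(rec, prefix):
    """Incremental-cell: step 3 happens before step 2, so the view subscribes to future data."""
    view = View(prefix)
    rec.subscribers.append(view)
    return view


# End-of-cell workflow: create, log, then view.
rec_a = Recording()                          # step 1
rec_a.log("car/lidar", [1, 2, 3])            # step 2
rec_a.log("world/map", "tiles")
static_view = show_static(rec_a, "car")      # step 3
second_view = show_static(rec_a, "world")    # another view, no recomputation

# Incremental-cell workflow: create, view, then log.
rec_b = Recording()                          # step 1
live_view = show_live(rec_b, "car")          # step 3
rec_b.log("car/lidar", [4, 5, 6])            # step 2, streamed into the existing view
```

In this sketch a static view never changes after it is created, while a live view keeps receiving whatever is logged later, possibly from a different cell.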

Creating Blueprints for Views

Regardless of which workflow is being employed, ergonomic APIs for creating these blueprints are an essential part of the Jupyter experience. A user must be able to:

  • Choose what data exists in their view
  • Choose the type of view that will be created
  • Potentially lay out multiple views
  • Apply additional styling to that data
  • Filter the data in different ways

We suspect at least two ways that users might want to construct these blueprints:

  • An object-oriented builder-style API, such as:
view1 = rec.3dview(base='car', paths=['car/sensors/lidar', 'car/detections/*'])
view2 = rec.2dview(base='world', paths=['world/map', 'car/trajectory', 'car/detections/bbox'])
rr.horizontal_layout(view1, view2)
  • A "config-file" style JSON or YAML document.
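As a rough sketch of the "config-file" option, the same two views from the builder example could be described as a plain document. The key names here (`layout`, `views`, `type`, `base`, `paths`) are assumptions made up for illustration; no schema has been decided.

```python
import json

# Hypothetical blueprint document mirroring the builder-style example.
blueprint = {
    "layout": "horizontal",
    "views": [
        {
            "type": "3d",
            "base": "car",
            "paths": ["car/sensors/lidar", "car/detections/*"],
        },
        {
            "type": "2d",
            "base": "world",
            "paths": ["world/map", "car/trajectory", "car/detections/bbox"],
        },
    ],
}

# Serialize to JSON (YAML would carry the same structure) and read it back.
blueprint_json = json.dumps(blueprint, indent=2)
restored = json.loads(blueprint_json)
```

A document like this could be checked into a repository and shared between notebooks, at the cost of being less discoverable than a builder API with autocompletion.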

TBD(@jleibs) continue populating this section.

Tracking Backlog

  • APIs to reset TimePoint #1808
  • Investigate introducing a "RecordingHandle" to the Python SDK to simplify some of the global-context/state pieces. The handle would expose all the existing Python APIs and eventually allow creation of a blueprint, which would return a Jupyter-renderable object.
    rh = rr.start_recording()
    rh.log_image(...)
    rh.log_points(...)
    rh.blueprint().view('img/')
    
    The existing rr APIs would just pass through to the default recording handle. (Update python APIs to make init return a RecordingHandle which manages most state #1903)
  • Long-term, optionally move Rerun from an iframe back to a single instance controlling multiple canvases. This would allow us to have multiple views (from different blueprints) on top of the same data without the need to duplicate memory. (Note: this won't work in Google Colab, so we'll still always want to support an iframe-isolated model.)
  • The "right" way of outputting data in JupyterLab is ultimately with a custom mime-type and renderer extension (https://jupyterlab.readthedocs.io/en/stable/user/file_formats.html). Ideally we would still support both inlined recordings and a reference to a recording-id on an existing server instance. NOTE: this might not be as portable or worth the effort.
  • Port to ipywidgets. See: this guide. This seems like the best candidate for cross-platform support including bidirectional sync for features like retrieving blueprint data back from the viewer or eventually supporting use-callbacks.
  • Rather than encoding the entire rrd as a blob, we should be able to use the IPython websocket to incrementally send (batched?) messages to the Rerun server. ipywidgets (above) is a good candidate for handling this kind of data flow.
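To make the "batched?" idea in the last item concrete, here is a minimal sketch of client-side batching. `BatchingSender` and its `send` callback are hypothetical; the callback stands in for whatever transport ends up being used (for example an ipywidgets comm), which is not modeled here.

```python
from typing import Any, Callable, List


class BatchingSender:
    """Buffers log messages and hands them to `send` in batches (hypothetical helper)."""

    def __init__(self, send: Callable[[List[Any]], None], max_batch: int = 3):
        self._send = send
        self._max_batch = max_batch
        self._pending: List[Any] = []

    def log(self, message: Any) -> None:
        self._pending.append(message)
        # Flush as soon as the buffer reaches the batch size.
        if len(self._pending) >= self._max_batch:
            self.flush()

    def flush(self) -> None:
        # Send whatever is buffered, then start a fresh buffer.
        if self._pending:
            self._send(self._pending)
            self._pending = []


# Usage: collect batches in a list instead of sending them over a real connection.
sent_batches: List[List[Any]] = []
sender = BatchingSender(sent_batches.append, max_batch=3)
for i in range(7):
    sender.log({"frame": i})
sender.flush()  # push the final partial batch, e.g. at the end of a cell
```

A real implementation would probably also flush on a timer so that a slow producer does not leave data sitting in the buffer.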
@jleibs jleibs added enhancement New feature or request 🐍 Python API Python logging API 🎄 tracking issue issue that tracks a bunch of subissues labels Apr 11, 2023
@jleibs jleibs changed the title Jupyter Improvements Jupyter Notebook Roadmap Apr 18, 2023
@emilk emilk added the notebook Jupyter notebooks etc label Apr 19, 2023
@jprochazk

@jleibs What's the status of this after the recent notebook changes? It's linked from our docs, and I'm not sure how the roadmap has changed since over a year ago.
