Proposal: Jupyter should be able to handle large notebooks #75

Open
mlucool opened this issue Oct 27, 2021 · 2 comments
Labels
enhancement New feature or request


mlucool commented Oct 27, 2021

Problem

We should add a benchmark test and make changes so that 2k-cell notebooks feel good to work with. In practice, I have seen some users make notebooks in the 1k-ish range, so 2k is an arbitrary number that is bigger than that (maybe it should be 10k?).

We'd first need to define "feels good to work with" a bit more, which I'll state as something like:

  1. Allows a user to interact with it in no more than 10 seconds
  2. Cells within the notebook become interactive on click as fast as in a 10-cell notebook
  3. Characters typed in code cells are rendered as fast as in a 10-cell notebook
  4. Switching tabs should be no more than 20% slower than with a 10-cell notebook
  5. Scrolling/jumping to a cell (e.g. via ToC) should be interactive in less than 500ms
  6. It does not significantly interfere with the rest of the page (e.g. button clicks take no more than 20% longer than if the notebook was not on the page)

All the numbers and metrics above are just somewhere to get started. Happy to add other metrics and/or change any of the numbers, as I chose them somewhat arbitrarily as well. That being said, today none of these metrics pass.
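To make the targets above checkable in a benchmark, they could be encoded as thresholds against a reference measurement. The sketch below is only an illustration: the `Timings` structure, its field names, and `check_targets` are hypothetical, "as fast as" is interpreted as "no slower than the reference", and collecting the timings (for example from a browser benchmark) is left out entirely.

```python
from dataclasses import dataclass


@dataclass
class Timings:
    """Measured durations in milliseconds (hypothetical field names)."""
    open_to_interactive: float  # criterion 1
    cell_click: float           # criterion 2
    keystroke_render: float     # criterion 3
    tab_switch: float           # criterion 4
    jump_to_cell: float         # criterion 5
    page_button_click: float    # criterion 6


def check_targets(large: Timings, reference: Timings) -> dict:
    """Return pass/fail for each of the six criteria.

    `large` holds timings for the big notebook. `reference` holds timings
    for a 10-cell notebook, except `page_button_click`, which is assumed
    to be measured with no notebook on the page.
    """
    return {
        "open": large.open_to_interactive <= 10_000,
        "click": large.cell_click <= reference.cell_click,
        "typing": large.keystroke_render <= reference.keystroke_render,
        "tab_switch": large.tab_switch <= reference.tab_switch * 1.2,
        "jump": large.jump_to_cell < 500,
        "page_interference": (
            large.page_button_click <= reference.page_button_click * 1.2
        ),
    }
```

A benchmark could then assert `all(check_targets(large, reference).values())`, or report the failing keys individually so regressions are attributed to a specific interaction.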


What is it like today?

Given the notebook generated by this script (note: no cell has any output, which makes this simpler than real-world notebooks):

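import json
import nbformat

# Number of code cells to generate.
NUM_CELLS = 2000

# Start from an empty v4 notebook and attach a Python 3 kernelspec.
nb = nbformat.v4.new_notebook()
nb.metadata.kernelspec = {
    "display_name": "Python 3",
    "language": "python",
    "name": "python3",
}

# Append NUM_CELLS code cells, each holding a one-line comment and no output.
for n in range(NUM_CELLS):
    nb.cells.append(nbformat.v4.new_code_cell("# cell {}".format(n + 1)))

# Serialize the notebook to e.g. generated-2000cells.ipynb.
with open(
    "generated-{}cells.ipynb".format(NUM_CELLS),
    "w",
) as f:
    f.write(json.dumps(nb, indent=4))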
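Since the notebook above has no outputs, a variant with some output per cell may be closer to real-world use. The following is a minimal sketch, not part of the original test case: it writes the nbformat 4 JSON structure directly with the standard library, and the `make_notebook` helper, the `LINES_PER_OUTPUT` value, and the output file name are all assumptions chosen for illustration.

```python
import json

NUM_CELLS = 2000
LINES_PER_OUTPUT = 5  # assumption: arbitrary amount of stdout text per cell


def make_notebook(num_cells, lines_per_output):
    """Build an nbformat 4 notebook dict whose code cells carry stream output."""
    cells = []
    for n in range(1, num_cells + 1):
        # Fake stdout text standing in for what a real execution would print.
        text = "".join(
            f"cell {n} line {i}\n" for i in range(1, lines_per_output + 1)
        )
        cells.append({
            "cell_type": "code",
            "execution_count": n,
            "metadata": {},
            "source": f"# cell {n}",
            "outputs": [
                {"output_type": "stream", "name": "stdout", "text": text}
            ],
        })
    return {
        "cells": cells,
        "metadata": {
            "kernelspec": {
                "display_name": "Python 3",
                "language": "python",
                "name": "python3",
            }
        },
        "nbformat": 4,
        # Minor version 4 is used so that cells do not need an "id" field.
        "nbformat_minor": 4,
    }


if __name__ == "__main__":
    nb = make_notebook(NUM_CELLS, LINES_PER_OUTPUT)
    with open(f"generated-{NUM_CELLS}cells-with-output.ipynb", "w") as f:
        json.dump(nb, f, indent=4)
```

Plain stream output is only one kind of output; rich outputs such as images or HTML would likely stress rendering differently and are not covered here.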

In JupyterLab 3.1, I see the following performance when I open the above notebook and try to use it:
image

Zooming in, all the work seems to be this CodeMirror pattern repeated over and over:
image

While we are working on things like jupyterlab/jupyterlab#10370 and jupyterlab/lumino#231, I thought it would be good to both set a more clearly defined goal and give everyone the same example to test against.

@blois, I'm curious how this notebook performs with the new colab virtualization (#68 (comment)).

What do others think?

CC those who have come to performance meetings as this size notebook was a topic of our first meeting. @fcollonval @sagemaster @echarles @Zsailer @jasongrout @afshin @ellisonbg @3coins @goanpeca

blois commented Oct 27, 2021

@mlucool Scrolling is definitely hitting some layout issues in Colab: the switch between the virtualized editor and the real editor incurs a noticeable layout cost. But typing isn't terrible. https://colab.research.google.com/gist/blois/68f1dc5b50ea315de5071c978d0b3f35/generated-2000cells.ipynb

krassowski (Member) commented

Thanks for the example! Cross-referencing jupyterlab/jupyterlab#9757 - large notebooks can choke the UI due to overuse of layout reflows. It looks like the fix will need to land in lumino.
