addition of cell id introduced with nbformat>=4.5 to text format #1263

Open
itcarroll opened this issue Jul 26, 2024 · 5 comments · May be fixed by #1270

itcarroll commented Jul 26, 2024

Discussion on #735, after it was closed, pointed to the need for an issue on inclusion of the cell id in jupytext text formats. I can't find one, and I think it's an enhancement worth considering.

Currently, a cell id is preserved in paired notebooks, but there are cases where the paired notebook is not present. Primary among these is when only text formats are held in a git repository. In this case, collaborators who generate notebooks locally from the text format each end up with different ids for the same cells. Is there support for directly incorporating the cell id in the "light" format?

The obvious proposal would be to require a start-of-cell delimiter for every cell and to include the id there. The id can be told apart from metadata because 1) it comes first and 2) it is not a key=value pair (the "=" character is not permitted in a cell id).

The examples would become:

# +b457cb9f-93c0_456a-a652-3f597535aa2d
# This is a multiline
# Markdown cell

# +a99ac56a-3859_4a15-9023-bab26654380f
# Another Markdown cell


# +4e9a328c-7d49_4e7e-9af4-a9f86ccddd14
# This is a code cell
class A():
    def one():
        return 1

    def two():
        return 2
# +3435c495-ba0c_4ca6-8a65-7b3658b66733
# A single code cell made of two paragraphs
a = 1


def f(x):
    return x+a
# +a8345b4b-8282_47fe-96c4-1d2c02bc92ca key="value"
# A code cell with metadata

# +a8345b4b-8282_47fe-96c4-1d2c02bc92ca [markdown] key="value"
# A Markdown cell with metadata
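
To make the two distinguishing rules concrete, here is a minimal sketch of how such a delimiter line could be split into id, cell type, and metadata. It is only an illustration of the proposal, not Jupytext's parser: `parse_cell_header` is a hypothetical helper, and metadata values are treated as plain strings for simplicity.

```python
import re
import shlex

# Characters allowed in a cell id; "=" is not among them, so an id can
# never be mistaken for a key=value metadata entry.
CELL_ID = re.compile(r"^[a-zA-Z0-9_-]+$")


def parse_cell_header(line):
    """Split a proposed '# +<id> [type] key="value"' line into its parts.

    Hypothetical helper for illustration only.
    """
    if not line.startswith("# +"):
        raise ValueError("not a start-of-cell delimiter")
    # shlex keeps quoted values together and drops the quotes.
    tokens = shlex.split(line[3:])
    cell_id, cell_type, metadata = None, "code", {}
    for position, token in enumerate(tokens):
        if "=" in token:
            key, _, value = token.partition("=")
            metadata[key] = value
        elif token.startswith("[") and token.endswith("]"):
            cell_type = token[1:-1]
        elif position == 0 and CELL_ID.match(token):
            # Rule 1: the id, when present, is the first token.
            cell_id = token
        else:
            raise ValueError(f"unexpected token: {token!r}")
    return cell_id, cell_type, metadata
```

Because the id is only accepted in first position and can never contain "=", the three kinds of tokens do not collide in this sketch.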

mwouts commented Jul 26, 2024

Hi @itcarroll , thanks for opening this discussion.

Sure, we could do something in that direction. Actually, some of the formats have support for a cell title that dates back to the spyder format. It might make sense to map that to the cell id.

Right now I think the Pandoc markdown format might have support for cell ids, if you want to give it a try, but I understand that you might be more interested in a Python format.

I will have some time to give this a try in two weeks' time or later.

mwouts linked a pull request on Aug 10, 2024 that will close this issue

mwouts commented Aug 10, 2024

Hi @itcarroll , I have a first draft of this functionality in the attached PR (which contains instructions on how to install the development version).

Would you like to give it a try and let me know what you think?

The new option is not active by default. To use it, create a jupytext.toml file with the following content:

cell_id_to_title = true

You can rename the cell ids as you wish, but each new name must match the regular expression ^[a-zA-Z0-9-_]+$; otherwise Jupyter won't open the notebook. For convenience, Jupytext replaces spaces with underscores when converting titles to ids (but it does not convert them back).
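
As a rough illustration of that rule, the sketch below replaces spaces with underscores and then checks the result against the pattern. `title_to_cell_id` is a hypothetical name, not the function used in the PR, and the character class is written as `[a-zA-Z0-9_-]`, which matches the same characters as the expression quoted above.

```python
import re

# Same character set as ^[a-zA-Z0-9-_]+$, with the hyphen placed last
# so it is unambiguously a literal.
CELL_ID = re.compile(r"^[a-zA-Z0-9_-]+$")


def title_to_cell_id(title):
    """Turn a cell title into a candidate cell id (hypothetical helper)."""
    candidate = title.replace(" ", "_")
    if not CELL_ID.match(candidate):
        raise ValueError(f"{candidate!r} is not a valid cell id")
    return candidate
```

For example, `title_to_cell_id("load data")` gives `"load_data"`, while a title containing "=" or "." is rejected rather than silently altered.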

Let me know what you think!

itcarroll commented

Thanks for the work on this! I'll definitely give it a try and report back, but it will be sometime next week.


mwouts commented Aug 13, 2024

Perfect! No rush, and thanks for suggesting that in the first place - I'm curious to see if/how we can turn this into something usable!

itcarroll commented

This is looking very usable already, although I am encountering an error that only shows up when I've created an .ipynb file from a .py file with jupytext --sync *.py. Trying to open the resulting .ipynb file in JupyterLab gives an "Unhandled error".

[W 2024-09-01 10:01:10.423 ServerApp] Notebook test.ipynb is not trusted
[W 2024-09-01 10:01:10.424 ServerApp] test.ipynb (last modified 2024-09-01 14:01:03.779814+00:00) is more recent than test.py (last modified 2024-09-01 14:00:27.713327+00:00)
[I 2024-09-01 10:01:10.424 ServerApp] Reading SOURCE from test.py
[E 2024-09-01 10:01:10.428 ServerApp] Uncaught exception GET /api/contents/test.ipynb?type=notebook&content=1&hash=1&1725199270416 (::1)
    HTTPServerRequest(protocol='http', host='localhost:8889', method='GET', uri='/api/contents/test.ipynb?type=notebook&content=1&hash=1&1725199270416', version='HTTP/1.1', remote_ip='::1')
    Traceback (most recent call last):
      File "/Users/icarroll/tmp/jupyext-pr/venv/lib/python3.11/site-packages/jupyter_server/services/contents/handlers.py", line 155, in get
        self.contents_manager.get(
    TypeError: build_jupytext_contents_manager_class.<locals>.JupytextContentsManager.get() got an unexpected keyword argument 'require_hash'
    
    During handling of the above exception, another exception occurred:
    
    Traceback (most recent call last):
      File "/Users/icarroll/tmp/jupyext-pr/venv/lib/python3.11/site-packages/tornado/web.py", line 1790, in _execute
        result = await result
                 ^^^^^^^^^^^^
      File "/Users/icarroll/tmp/jupyext-pr/venv/lib/python3.11/site-packages/jupyter_server/auth/decorator.py", line 73, in inner
        return await out
               ^^^^^^^^^
      File "/Users/icarroll/tmp/jupyext-pr/venv/lib/python3.11/site-packages/jupyter_server/services/contents/handlers.py", line 168, in get
        self.contents_manager.get(
      File "/Users/icarroll/tmp/jupyext-pr/venv/lib/python3.11/site-packages/jupytext/contentsmanager.py", line 338, in get
        content = read_pair(
                  ^^^^^^^^^^
      File "/Users/icarroll/tmp/jupyext-pr/venv/lib/python3.11/site-packages/jupytext/pairs.py", line 127, in read_pair
        in_text = jupytext.writes(notebook, inputs.fmt)
                  ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
      File "/Users/icarroll/tmp/jupyext-pr/venv/lib/python3.11/site-packages/jupytext/jupytext.py", line 503, in writes
        return writer.writes(notebook, metadata)
               ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
      File "/Users/icarroll/tmp/jupyext-pr/venv/lib/python3.11/site-packages/jupytext/jupytext.py", line 291, in writes
        if self.config.cell_id_to_title and hasattr(cell, "id"):
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^
    AttributeError: 'NoneType' object has no attribute 'cell_id_to_title'
[W 2024-09-01 10:01:10.431 ServerApp] wrote error: 'Unhandled error'
    Traceback (most recent call last):
      File "/Users/icarroll/tmp/jupyext-pr/venv/lib/python3.11/site-packages/jupyter_server/services/contents/handlers.py", line 155, in get
        self.contents_manager.get(
    TypeError: build_jupytext_contents_manager_class.<locals>.JupytextContentsManager.get() got an unexpected keyword argument 'require_hash'
    
    During handling of the above exception, another exception occurred:
    
    Traceback (most recent call last):
      File "/Users/icarroll/tmp/jupyext-pr/venv/lib/python3.11/site-packages/tornado/web.py", line 1790, in _execute
        result = await result
                 ^^^^^^^^^^^^
      File "/Users/icarroll/tmp/jupyext-pr/venv/lib/python3.11/site-packages/jupyter_server/auth/decorator.py", line 73, in inner
        return await out
               ^^^^^^^^^
      File "/Users/icarroll/tmp/jupyext-pr/venv/lib/python3.11/site-packages/jupyter_server/services/contents/handlers.py", line 168, in get
        self.contents_manager.get(
      File "/Users/icarroll/tmp/jupyext-pr/venv/lib/python3.11/site-packages/jupytext/contentsmanager.py", line 338, in get
        content = read_pair(
                  ^^^^^^^^^^
      File "/Users/icarroll/tmp/jupyext-pr/venv/lib/python3.11/site-packages/jupytext/pairs.py", line 127, in read_pair
        in_text = jupytext.writes(notebook, inputs.fmt)
                  ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
      File "/Users/icarroll/tmp/jupyext-pr/venv/lib/python3.11/site-packages/jupytext/jupytext.py", line 503, in writes
        return writer.writes(notebook, metadata)
               ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
      File "/Users/icarroll/tmp/jupyext-pr/venv/lib/python3.11/site-packages/jupytext/jupytext.py", line 291, in writes
        if self.config.cell_id_to_title and hasattr(cell, "id"):
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^
    AttributeError: 'NoneType' object has no attribute 'cell_id_to_title'
[E 2024-09-01 10:01:10.453 ServerApp] {
      "Host": "localhost:8889",
      "Accept": "*/*",
      "Referer": "http://localhost:8889/lab/tree/test.ipynb",
      "User-Agent": "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_7) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/128.0.0.0 Safari/537.36"
    }
[E 2024-09-01 10:01:10.453 ServerApp] 500 GET /api/contents/test.ipynb?type=notebook&content=1&hash=1&1725199270416 (179348aa56624fa2bfe5663b9cfa432b@::1) 16.07ms referer=http://localhost:8889/lab/tree/test.ipynb
[I 2024-09-01 10:01:38.360 ServerApp] Shutting down on /api/shutdown request.
[I 2024-09-01 10:01:38.361 ServerApp] Shutting down 5 extensions

My test.py file is:

# ---
# jupyter:
#   jupytext:
#     text_representation:
#       extension: .py
#       format_name: light
#       format_version: '1.5'
#       jupytext_version: 1.16.5-dev
#   kernelspec:
#     display_name: Python 3 (ipykernel)
#     language: python
#     name: python3
# ---

# + 4a523f6f-1e2a-41c1-8e56-90c9f559dd1a [markdown]
# Nothing.

My pyproject.toml file is:

[tool.jupytext]
cell_id_to_title = true
formats = "ipynb,py"
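
For reference, the last frame of the traceback reads `cell_id_to_title` from a config object that is `None`. The standalone sketch below shows one None-tolerant way to express the same check; `wants_cell_id_title` is a hypothetical helper written for illustration, not code from Jupytext, and the actual fix in the PR may well differ.

```python
from types import SimpleNamespace


def wants_cell_id_title(config, cell):
    """Return True only if a config exists, enables the option, and the cell has an id.

    getattr with a default tolerates config being None, which is the
    situation reported in the traceback.
    """
    return bool(getattr(config, "cell_id_to_title", False)) and hasattr(cell, "id")


# Stand-ins for a notebook cell and a config object (assumptions for the demo).
cell = SimpleNamespace(id="4a523f6f-1e2a-41c1-8e56-90c9f559dd1a")
enabled = SimpleNamespace(cell_id_to_title=True)

print(wants_cell_id_title(None, cell))     # no config available
print(wants_cell_id_title(enabled, cell))  # option switched on
```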
