ipynb writer: handle cell output with raw block of markdown #7563

ickc · 2021-09-10T20:14:50Z

feat(ipynb writer): write RawBlock of markdown in code-cell output
#7561 makes the ipynb reader reads code-cell output with mime "text/markdown" to a RawBlock of markdown
This commit makes the ipynb writer writes this RawBlock of markdown back inside a code-cell output with the same mime, preserving this information in round-trip

test(ipynb): "text/markdown" mime type in code-cell output
This add tests on ipynb reader (#7561) and ipynb writer (#7563)'s ability to handle a "text/markdown" mime type in a code-cell output

ickc · 2021-09-10T21:07:30Z

@jgm, this is a draft and perhaps needs more discussion.

#7561 added support of text/markdown in cell output in ipynb reader.

This one concerns the ipynb writer, specifically how to have some sort of round-trip identity.

This patch so far only map a raw-block of markdown back to text/markdown in a cell output. In a naive test pandoc texp_repr_mimebundle.ipynb -s --to ipynb this would work.

But if the ipynb has ever been converted to markdown, the markdown raw-block will be written as is, without the markdown raw-block syntax (i.e. {=markdown})

How should this be resolved? Should the AST be aggressively (after not matching other cases) converted to a markdown raw-block?

jgm · 2021-09-10T23:15:11Z

Maybe we should render a RawBlock (Format "markdown") in markdown as

a raw markdown block (```{=markdown}...) if it occurs in an output cell
otherwise, as its contents (as we'd usually do)

Would this make sense?

ickc · 2021-09-11T00:33:44Z

@jgm, I found that the same problem exists when there's a raw cell with mime-type text/markdown:

  {
   "cell_type": "raw",
   "metadata": {
    "raw_mimetype": "text/markdown",
    "tags": []
   },
   "source": [
    "some markdown here"
   ]
  }

This markdown raw-cell is a strange thing as there's already a markdown cell type. But nonetheless it exists and is valid (although in Jupyter lab/notebook it won't be rendered.)

Then I remember commit 48fb6d9. So may be the solution is to enhance raw_markdown such that when this is specified (signaling people want to have round trip identity), not only will the markdown be written as is, but also enclosed in a markdown raw-block? What do you think?

Edit: I mean the proposal is that if raw_markdown is specified, any markdown will be enclosed as markdown raw-block, which should includes the 2 cases we mentioned in this thread.

ickc · 2021-12-03T09:27:44Z

@jgm, may be merging this first?

This patch already solve the roundtrip idempotence with ipynb to ipynb conversion.

The discussion above is about ipynb to markdown conversion is lossy (so that markdown back to ipynb doesn't have that markdown RawBlock.)

jgm · 2021-12-06T04:29:43Z

I'm sorry, I can't really understand the issue, reading the comments above. Can you explain it more clearly, with an example, including what pandoc does and what you think it should do?

ickc · 2021-12-06T06:18:44Z

In short, this complements #7561, which implements reading raw markdown from ipynb output cell. This PR writes raw markdown block as ipynb output cell.

I think the motivation of having a raw markdown in output cell is already gone through in #7561. This issue is only to support writing what #7561 knows how to read already.

jgm · 2021-12-06T17:31:22Z

I still need examples to understand.

ickc · 2021-12-06T23:51:16Z

@jgm, I added a test file. See if it helps clarifying.

ickc · 2021-12-07T00:04:57Z

Some of these tests are failing because the ordering of dict within JSON are different. Is there a better way to test this?

jgm · 2021-12-07T01:37:18Z

Looks like in the output the keys are in alphabetical order -- so try making the input that way, too?

ickc · 2021-12-07T01:40:44Z

But the orders seem to depend on the platform? ~~Also, I can't control the order of the output cell from jupyter (although for testing purposes we could change the ordering ourselves.)~~ Edit: the ordering problem happens from native to ipynb, and seems to depends on platforms. So there's nothing than can be done by modifying the test files.

Also, does the test file address your question?

ickc · 2021-12-07T02:26:15Z

The test fails because of #7733. Should this test be temporarily disabled and merge first? Regardless testing, I guess the test files illustrates the reasoning behind this PR?

jgm · 2021-12-07T16:17:34Z

Try rebasing against current master. I think some changes in upstream ipynb will give us deterministic order of keys (but this needs confirmation).

ickc · 2021-12-07T23:51:30Z

From the CI test, it seems that it doesn't help.

I can see that the id is now included.

But it doesn't change the ordering of the ipynb writer, and is still platform/compiler dependent.

jgm · 2021-12-08T16:57:06Z

Try rebasing over commit 8215ca0
I think this might do the trick for a deterministic (sorted) output for mime bundles.

jgm · 2021-12-08T16:57:39Z

(Existing tests may need to change, but at least you should get the same result on all platforms.)

ickc · 2021-12-08T23:54:23Z

I rebased and now it includes 8215ca0. But strangely the output ipynb doesn't change after this commit. (I just use the compiled pandoc itself to generate the ipynb/mime.out.ipynb:

pandoc -t native -f ipynb --ipynb-output=all ipynb/mime.ipynb -o ipynb/mime.native
pandoc -f native -t ipynb --wrap=preserve ipynb/mime.native -o ipynb/mime.out.ipynb

)

I.e. it generates the key order exactly the same as before commit 8215ca0.

Also, looking at the diff between ipynb/mime.ipynb & ipynb/mime.out.ipynb, the key isn't ordered in ipynb/mime.out.ipynb.

Also, 8215ca0 correctly points to jgm/ipynb@00246af

Another observation is that all the UNIX CI have the html and markdown key reversed, while the windows CI and my local macOS shares the same key ordering.

jgm · 2021-12-09T04:21:08Z

We're focusing on the difference in platforms. Is there also a difference in the aeson versions they're compiled with? This should be in the build logs.

ickc · 2021-12-09T04:36:03Z

Unfortunately the log doesn't have the information of which aeson version it is using.
I downloaded the log archive and

find .  -type f -name '*.txt' -exec grep --color=auto -iHnE 'aeson' {} +

and has no results.

Randomly selecting one of the build and search aeson using the web interface also has no results.

jgm · 2021-12-09T07:22:06Z

Okay, I think 9cbea69 may help.

jgm · 2021-12-09T17:37:06Z

Now that we have deterministic ordering, can you do these two things?

remove the third (temporary) commit (fast-fail)
add a fuller commit message to the first commit, describing how this changes pandoc's behavior

otherwise, looks good to me!

#7561 makes the ipynb reader reads code-cell output with mime "text/markdown" to a RawBlock of markdown This commit makes the ipynb writer writes this RawBlock of markdown back inside a code-cell output with the same mime, preserving this information in round-trip

This add tests on ipynb reader (#7561) and ipynb writer (#7563)'s ability to handle a "text/markdown" mime type in a code-cell output

ickc · 2021-12-10T01:57:30Z

@jgm, ready to merge.

jgm · 2021-12-10T04:37:02Z

Thanks!

ickc marked this pull request as ready for review December 3, 2021 09:26

ickc mentioned this pull request Dec 7, 2021

ipynb writer: deterministic key ordering #7733

Closed

ickc added 2 commits December 9, 2021 15:41

test(ipynb): "text/markdown" mime type in code-cell output

6b962e6

This add tests on ipynb reader (#7561) and ipynb writer (#7563)'s ability to handle a "text/markdown" mime type in a code-cell output

jgm merged commit 20eb8ac into jgm:master Dec 10, 2021

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

ipynb writer: handle cell output with raw block of markdown #7563

ipynb writer: handle cell output with raw block of markdown #7563

ickc commented Sep 10, 2021 •

edited

Loading

ickc commented Sep 10, 2021

jgm commented Sep 10, 2021

ickc commented Sep 11, 2021 •

edited

Loading

ickc commented Dec 3, 2021

jgm commented Dec 6, 2021

ickc commented Dec 6, 2021

jgm commented Dec 6, 2021

ickc commented Dec 6, 2021

ickc commented Dec 7, 2021

jgm commented Dec 7, 2021

ickc commented Dec 7, 2021 •

edited

Loading

ickc commented Dec 7, 2021

jgm commented Dec 7, 2021

ickc commented Dec 7, 2021

jgm commented Dec 8, 2021

jgm commented Dec 8, 2021

ickc commented Dec 8, 2021

jgm commented Dec 9, 2021

ickc commented Dec 9, 2021

jgm commented Dec 9, 2021

jgm commented Dec 9, 2021

ickc commented Dec 10, 2021

jgm commented Dec 10, 2021

ipynb writer: handle cell output with raw block of markdown #7563

ipynb writer: handle cell output with raw block of markdown #7563

Conversation

ickc commented Sep 10, 2021 • edited Loading

ickc commented Sep 10, 2021

jgm commented Sep 10, 2021

ickc commented Sep 11, 2021 • edited Loading

ickc commented Dec 3, 2021

jgm commented Dec 6, 2021

ickc commented Dec 6, 2021

jgm commented Dec 6, 2021

ickc commented Dec 6, 2021

ickc commented Dec 7, 2021

jgm commented Dec 7, 2021

ickc commented Dec 7, 2021 • edited Loading

ickc commented Dec 7, 2021

jgm commented Dec 7, 2021

ickc commented Dec 7, 2021

jgm commented Dec 8, 2021

jgm commented Dec 8, 2021

ickc commented Dec 8, 2021

jgm commented Dec 9, 2021

ickc commented Dec 9, 2021

jgm commented Dec 9, 2021

jgm commented Dec 9, 2021

ickc commented Dec 10, 2021

jgm commented Dec 10, 2021

ickc commented Sep 10, 2021 •

edited

Loading

ickc commented Sep 11, 2021 •

edited

Loading

ickc commented Dec 7, 2021 •

edited

Loading