@jonludlam
Collaborator

This commit adds some new functionality to code blocks. First, it allows arbitrary delimiters, so that code containing the usual code block terminator can now be parsed correctly. The syntax for this is:

{delimiter@ocaml[ ... ]delimiter}

The delimiter can contain the chars [ a-z A-Z 0-9 _ - ], the same as the language tag that comes after the '@' symbol. Note that there's no way to have a delimited code block without a language tag.
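
As a rough illustration of the opening syntax described above, the following sketch recognises `{delimiter@lang[` in a plain string. It is not odoc's lexer: `is_tag_char`, `skip_tag_chars` and `parse_opening` are hypothetical names, the character set is taken from the description, and the sketch only handles the '@' form.

```ocaml
(* Hypothetical helper, not odoc's lexer: recognise the opening of a code
   block, "{delim@lang[", and return (delim, lang, offset after the '['). *)

(* Characters allowed in a delimiter or language tag, per the description. *)
let is_tag_char = function
  | 'a' .. 'z' | 'A' .. 'Z' | '0' .. '9' | '_' | '-' -> true
  | _ -> false

(* Index of the first position >= i whose character is not a tag character. *)
let rec skip_tag_chars s i =
  if i < String.length s && is_tag_char s.[i] then skip_tag_chars s (i + 1)
  else i

let parse_opening s =
  let len = String.length s in
  if len = 0 || s.[0] <> '{' then None
  else
    let delim_end = skip_tag_chars s 1 in
    (* The delimiter (possibly empty) must be followed by '@'. *)
    if delim_end >= len || s.[delim_end] <> '@' then None
    else
      let lang_start = delim_end + 1 in
      let lang_end = skip_tag_chars s lang_start in
      (* The language tag must be non-empty and followed by '['. *)
      if lang_end = lang_start || lang_end >= len || s.[lang_end] <> '[' then
        None
      else
        Some
          ( String.sub s 1 (delim_end - 1),
            String.sub s lang_start (lang_end - lang_start),
            lang_end + 1 )
```

Under these assumptions, `parse_opening "{delim@ocaml[ ... ]delim}"` yields the delimiter `"delim"` and the language `"ocaml"`, while `"{delim[ ... ]delim}"` is rejected because a delimiter without a language tag is not accepted.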

The second piece of functionality is that code blocks can now have associated output:

{@ocaml[

... ocaml code ...

][

... odoc formatted output ...

]}

This syntax also supports delimiters as above. The delimiters apply only to the code block, not to the output:

{delim@ocaml[

... ocaml code containing ][ or ]} ...

]delim[

... odoc formatted output ...

]}

The idea is that the odoc formatted output should be well formed and thus any escaping is done in the usual way.
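
To make the role of the delimiter concrete, here is a minimal sketch of how the text following the opening `[` could be split into code and output. It is an illustration under simplifying assumptions, not the parser in this PR: `occurs_at`, `block` and `split_block` are hypothetical names, and the output is assumed to run up to a final `]}` at the very end of the input rather than being parsed as odoc markup.

```ocaml
(* Hypothetical sketch: does [sub] occur in [s] starting at index [i]? *)
let occurs_at s i sub =
  let n = String.length sub in
  i >= 0 && i + n <= String.length s && String.sub s i n = sub

type block = { code : string; output : string option }

(* [body] is everything after the opening '['. The code part ends at the
   first "]delim}" (no output) or "]delim[" (output follows). *)
let split_block ~delim body =
  let close = "]" ^ delim ^ "}" and sep = "]" ^ delim ^ "[" in
  let len = String.length body in
  let rec find i =
    if i >= len then None
    else if occurs_at body i close then
      Some { code = String.sub body 0 i; output = None }
    else if occurs_at body i sep then
      let out_start = i + String.length sep in
      (* Simplifying assumption: the output is closed by a trailing "]}". *)
      if len >= out_start + 2 && occurs_at body (len - 2) "]}" then
        Some
          { code = String.sub body 0 i;
            output = Some (String.sub body out_start (len - 2 - out_start)) }
      else None
    else find (i + 1)
  in
  find 0
```

With `~delim:"delim"`, a `]}` inside the code is skipped because only `]delim}` or `]delim[` ends the code part; with `~delim:""` the sketch falls back to the usual `]}` and `][` terminators.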

The output can then contain, for example, error blocks produced by mdx:

{delim@ocaml[
  let x = "]}
]delim[
{err@mdx-error[
Line 1, characters 9-10:
Error: String literal not terminated]err}
]}

In addition, this allows code blocks to produce rich output, i.e. markup such as tables, headings, images and so on, in such a way that the output is associated with the code block and hence can be manipulated by a 'test-promote' workflow.

@jonludlam
Collaborator Author

Review comment from group review in Cambridge: Could we perhaps make the delimiting generic, e.g.:

{delim@ocaml[ ... ]delim[ ]}
{delim|v ... verbatim v|delim}
{delim|m ... |delim}
{delim|math ... |delim}
{|v ... v|}

@jonludlam
Collaborator Author

While I think having delimiters elsewhere is a worthy goal, we can do that in another PR.

Contributor

@Julow Julow left a comment

The syntax looks reasonable, no need to block for more generic delimiters (which would probably still look different on code blocks).

| In_explicit_list -> (List.rev acc, next_token, where_in_line)
| In_tag -> (List.rev acc, next_token, where_in_line)
| In_table_cell -> (List.rev acc, next_token, where_in_line)
| In_code_results -> (List.rev acc, next_token, where_in_line))
Contributor

Most cases have the same type and could perhaps be written with an or-pattern?

Collaborator Author

Annoyingly not (the return types are different in each case). See the comment immediately below.

Contributor

Indeed.
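
For readers unfamiliar with why an or-pattern does not help here, the sketch below shows the general phenomenon with a GADT. It is a simplified stand-in, not the parser's real context type: the constructor names are borrowed from the snippet above, but their type indices (`int list`, `string list`) and the function `empty` are invented for illustration.

```ocaml
(* Hypothetical, simplified context type: each constructor fixes a
   different result type for functions that match on it. *)
type _ context =
  | In_explicit_list : int list context
  | In_tag : string list context

(* Both branches have the same text, but the first is typed as an
   [int list] and the second as a [string list]. *)
let empty : type a. a context -> a = function
  | In_explicit_list -> []
  | In_tag -> []

(* Merging the branches is rejected by the type checker, because the type
   equations introduced by each constructor are not available in the body
   of an or-pattern:

     let empty : type a. a context -> a = function
       | In_explicit_list | In_tag -> []
*)
```

Textually identical branches therefore have to stay separate whenever each one relies on the type refinement of its own constructor.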


let emit_code_block ~start_offset input metadata c =
let c = trim_trailing_blank_lines c in
let emit_code_block ~start_offset content_offset input metadata delim terminator c has_results =
Contributor

The location calculations in this function are quite complicated and deserve some comments.
Why is content_offset needed?

Collaborator Author

I've added a comment.

Contributor

Thanks!

Contributor

@Julow Julow left a comment

I have a remark on delim_char but otherwise I think it's ready to merge.
The CI failures are not related, everything is OK.

src/lexer.mll Outdated
['a'-'z' 'A'-'Z' '0'-'9' '_' '-' ]

let delim_char =
['a'-'z' 'A'-'Z' '0'-'9' '_' '-' ]
Contributor

Without '-'? This might interfere with the {- ...} syntax with a different parser engine.

Collaborator Author

Huh, I thought I had removed that, as per your previous comment. Will fix, thanks!

