-
Notifications
You must be signed in to change notification settings - Fork 15
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add syntax for tables #11
Conversation
Thanks @gpetiot ! This is definitely an approach that could work very well. However, I think we ought to consider the possibility of having more visual way to layout tables, perhaps something similar to how rst or orgmode does it - as an example both of the following are valid rst tables:
and
and here's an org-mode table:
The reason being that more often than not right now I'm still looking at docs in my editor buffer, or on github, as opposed to nicely rendered on a web page. We'd also get to use pre-existing table-editing modes in various editors. |
Features that other languages have: text alignment of the column, header rows. The short syntax from RST is interesting: https://docutils.sourceforge.io/docs/ref/rst/directives.html#list-table if we drop the structured directive and attributes and make it less ambiguous, we get a syntax easy to type that is similar to lists and might be enough in most cases. I suggest this syntax, which works the same way as lists:
The long syntax that you suggest is needed, like we have for lists, to support nesting of bigger elements. To leave room for the Header row similar to HTML:
The main element can be |
Most lightweight markup languages add two things, headers and specification of column alignement (see for example the github flavoured markdown table definition). I think it would be nice to have these. I wouldn't bother too much with multi-columns. I have mixed feelings about what @jonludlam suggests. Of course these tables look nicer but the given examples are also not very content heavy. Once you start adding links, references and ocamldoc markup in the cell contents I have the impression that the regular, well-delimited, syntax of ocamldoc scales better. |
I've always hated both reading and writing tables drawn and aligned like that. The editor plugins to draw them are not worth learning for the few tables I ever needed to make so I end up not aligning them, they become write-only. Also, multi line cells are a pain. |
@dbuenzli agreed that larger-scale tables won't look as nice as those examples! Nothing stopping us from doing both though - we already have precedent in lists, where we can write both
and
we could simply choose based on the content of the
vs
|
I've never hated reading them, but writing is a bit of a pain without editor support :-) |
Actually, choosing based on contents is not a good idea. We could use a different tag though - |
To motivate with concrete examples here's a sample of tables that I currently write as code and that would benefit from the proposal:
Scroll a bit if you don't see the table, which reminds me: it would be nice to be able to reference them via a label like we do for sections, i.e. |
Regarding a possible lightweight notation. I think it's good at least have a look at what the experts do to when they can start designing from scratch. |
For the light syntax, what do you think about something like:
|
If you really want to add this then I strongly suggest not to invent anything new here. Have a look at pandoc's markdown table extensions for a few alternatives. Personally I think you should just go with |
I implemented the light and heavy syntaxes, this PR is ready to be reviewed. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This parses as two columns, a
and b |c| d
, which is not intuitive:
{t
| a | b |c| d |
|---|--:|:--|:-:|
}
I assume that, since pipetables are pretty standard, there are a set of nasty test cases somewhere that we could use? |
Would it please be possible to update the first comment with the implemented syntaxes. For example how do I specify alignement in the "heavy" syntax ? |
Nothing decided yet, I'm torn between doing something similar to the light syntax:
or something that would be easier to read to my opinion but that would make both implementations diverge:
|
Note that even though IIRC
|
Sounds good to me, @jonludlam what do you think? |
Reusing the syntax we previously had makes sense to me. |
On the other hand, an attribute-like syntax would allow |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm not a fan of the pipe syntax at all but using the most common syntax can't hurt Odoc.
I fixed the parsing issues with the alignment tags for the heavy syntax. Let me know what you think @Julow @jonludlam |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nice!
Adding a block element into a light table makes quite a mess of warnings and continue the parsing in an inconsistent way:
# p {|
{t
| {[ a ]} | b |
|---|---|
}
|} ;;
File "", line 2, characters 5-12:
'{[...]}' (code block) is not allowed in '{t ...}' (table).
File "", line 2, characters 13-14:
'|' is not allowed in top-level text.
Suggestion: move '|' into '{t ...}' (table).
File "", line 2, characters 15-16:
Paragraph should begin on its own line.
File "", line 2, characters 17-18:
'|' is not allowed in top-level text.
Suggestion: move '|' into '{t ...}' (table).
File "", line 3, characters 3-4:
'|' is not allowed in top-level text.
Suggestion: move '|' into '{t ...}' (table).
File "", line 3, characters 7-8:
'|' is not allowed in top-level text.
Suggestion: move '|' into '{t ...}' (table).
File "", line 3, characters 8-11:
Paragraph should begin on its own line.
File "", line 3, characters 11-12:
'|' is not allowed in top-level text.
Suggestion: move '|' into '{t ...}' (table).
File "", line 4, characters 2-3:
Unpaired '}' (end of markup).
Suggestion: try '\}'.
- : Ast.t =
[`Table (Light {header = []; data = []; align = []});
`Paragraph [`Word "b"; `Space " "]; `Paragraph [`Word "---"];
`Paragraph [`Word "---"]; `Paragraph [`Word "}"]]
The parser has the goal of not loosing text in case of bad syntax so the problematic block element shouldn't be removed. It would be nice to handle the conversion from table to paragraphs without generating too much warnings.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nice !
I do have change requested for some things:
- I think that the heavy syntax should allow us to define any cell to be a header cell (just like in html). This allows for several use cases, with two common one being: "vertical" tables (tables where the header is a column) and tables with two "axis" (both a header line and a header column).
Fortunately, I think this will actually simplify the code a lot. - I believe the lexer should do more work in the case of the light table. Currently, as mentioned by @Julow, opening a forbidden block inside a light table will generate a lot of warnings (where one would have sufficed) and the next closing will be wrongly paired. That's a major issue!
Using a specific lexer for the inside of light tables (which is what is done for verbatim for example) also removes the constraint added by this PR to escape|
characters outside of tables. - I don't think the AST type should make such a clear distinction between light and heavy syntax. Similarly to ``
List
, a variant on the side could distinguish between them, the information is almost always irrelevant for consumers of the parser, it has to be easy to ignore.
Another thing: I am not a huge fan of the alignment syntax for heavy tables.
To me, it seems that {th {R text}}
will only have an effect on the cell, by the nesting. (The fact that it has an effect on the whole columns can only be understood with the light syntax in mind).
Moreover, it prevents possible future uses of {L }
to specify the alignment on single cell, especially on header cell (if we want that, not sure...).
Instead, we could take inspiration from latex's table (or the col
html element), where the fact that it acts on columns is evident:
{table {col L C R}
{tr
{th left-aligned} {th centered} {th right-aligned}}
}
}
This leaves room for deciding the line borders in odoc (but maybe that's not something we want):
{table {col | L C | R}
{hline}
{tr
{th left-aligned} {th centered} {th right-aligned}}
}
{hline}
}
New lines inside markup (such as `{e ...}`) were not detected, although it breaks the light table format. Add a warning for such cases. Signed-off-by: Paul-Elliot <peada@free.fr>
Signed-off-by: Paul-Elliot <peada@free.fr>
Signed-off-by: Paul-Elliot <peada@free.fr>
Signed-off-by: Paul-Elliot <peada@free.fr>
contribution to the table PR
I had a quick look to see whether there were any differences running over the set of libraries in odoc's driver with the
In both instances they are actually code snippets without being enclosed in any sort of verbatim/code tags, but I was slightly surprised nevertheless. It's not a show stopper but I'd be quite interested to know why the differences are there. |
Signed-off-by: Paul-Elliot <peada@free.fr>
Fix new paragraph starting on `|`
This commit fixes several alignment issues. 1. No alignment in a cell was encoded as an alignment to `Center. "Centering" was certainly a typo, but there is also the issue that "no specified alignment" should really be different than "specified to ..." For instance (assuming `{R ...}` exists): ``` {R {t |aaa|bbb| |---|---| |xxx|yyy|}} ``` would not be right-aligned inside the table, with the previous behaviour. This commit allows for "unspecified alignment" using an option. 2. No alignment in a table was encoded as an empty list of alignment. This was wrong, since at some point we might want to raise warnings in case the number of alignment and the number of columns differ. Signed-off-by: Paul-Elliot <peada@free.fr>
Fix default alignment
Thanks @jonludlam for spotting a bug! It has been fixed in 97fd2a3. I've updated ocaml/odoc#893 to be in sync with this PR, in case you want to review them as a whole. |
src/syntax.ml
Outdated
@@ -209,9 +324,11 @@ and delimited_inline_element_list : | |||
parent_markup:[< Token.t ] -> | |||
parent_markup_location:Loc.span -> | |||
requires_leading_whitespace:bool -> | |||
context:inline_context -> |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This bit of threaded state is annoying. It's also very similar to the requires_leading_whitespace
state - I wonder if the two could be combined into a single Context.t
or similar? I suppose the new bit is relevant to both delimited_inline_element_list
and inline_element
so maybe not worth it? It's also possibly worth considering getting rid of the context and just checking in the light_table
code after we've called this function that the start and end lines are the same?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I had a quick go: jonludlam@5c49f98 - I think it's a bit less invasive and keeps the check within the light_table_row
function. What do you think?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It's also my opinion that having the check in light_table_row
makes the code clearer.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
As a small remark, the location warning in the current PR and jonludlam@5c49f98 will be different:
- In the current PR, it is only the newline. There is one warning per newline.
- In jonludlam@5c49f98 there is only one warning, with location being the whole inline item.
I think both solutions have different advantages/disadvantages (one is clearer, the other more general), and I don't have a preference. So let's go with @jonludlam's idea!
Looking good now, thanks @gpetiot! |
This is a first draft, I'm asking for feedback before going further. @jonludlam @Julow
I chose this syntax:
The markup is ovsiously inspired from the
<tr></tr><td></td>
HTML tags.The generated output is straightforward in HTML and Latex, at first I wanted to add header cells (
<th></th>
) but I'm not aware of anything similar in Latex but we could apply a different style.Some improvements:
<td colspan="N">...</td>
, in Latex it would be\multicolumn{N}{c}{...}
What do you think?