This repository has been archived by the owner on Nov 17, 2022. It is now read-only.

Incorrect table conversion #22

Open
ntolia opened this issue Apr 18, 2018 · 4 comments

Comments

@ntolia

ntolia commented Apr 18, 2018

First of all, thank you for creating and supporting m2r! Really appreciate it.

That said, I am trying to convert the following (extracted) markdown file into rst.

| Field | Type | Label | Description |
| ----- | ---- | ----- | ----------- |
| foo   | bar |  |  |

<a name="test"/>

The generated table is missing a cell in its last row, which causes errors along the lines of: Error parsing content block for the "list-table" directive: uniform two-level bullet list expected, but row 2 does not contain the same number of items as row 1 (3 vs 4). This happens because the generated output looks like:

.. role:: raw-html-m2r(raw)
   :format: html


.. list-table::
   :header-rows: 1

   * - Field
     - Type
     - Label
     - Description
   * - foo
     - bar
     -


:raw-html-m2r:`<a name="test"/>`

and is missing the fourth cell in the last row of the table. However, if the last line of the markdown is removed, the problem disappears. I am using m2r 0.1.14.

@ntolia
Author

ntolia commented Apr 20, 2018

I have a feeling the bug might be in mistune, or in how m2r uses it, but I haven't been able to narrow it down just yet. Will update this issue if I find out anything else.

@ntolia
Author

ntolia commented Apr 23, 2018

Here is what I have found so far. For certain tables that lack a value in the last cell of the last row and that are followed by an internal HTML hyperlink (and possibly other text too), the tool will chew up the last cell and cause errors in the final HTML docs generation. On further debugging, this seems to happen because of a potential bug in the mistune library where, in parse_table(), the value of the matched group for cells (m.group(3) if you are looking at the code) includes a spurious newline at the end of the text that is being parsed. This is likely happening because of the rule matching in parse() against the table rule (see manipulate in mistune):

table = re.compile(
        r'^ *\|(.+)\n *\|( *[-:]+[-| :]*)\n((?: *\|.*(?:\n|$))*)\n*'
    )

It is possible to show that this bug goes away if I rstrip() the matched group before breaking it up into cells inside mistune's parse_table(), but I am still not sure what the right fix is in this case or what might be wrong in the above regex.
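
The behaviour described above can be reproduced with the standard re module alone. The following is a minimal sketch, not mistune's actual code path: split_rows is a hypothetical helper that assumes parse_table() splits the matched group with the three regular expressions shown in it, and the input is the markdown from the original report.

```python
import re

# Table rule exactly as quoted above.
TABLE = re.compile(
    r'^ *\|(.+)\n *\|( *[-:]+[-| :]*)\n((?: *\|.*(?:\n|$))*)\n*'
)


def split_rows(body):
    """Hypothetical helper: split a table body into rows of cells.

    Assumes the same three regex steps that parse_table() applies
    to m.group(3).
    """
    # Drop an optional trailing pipe together with the final newline.
    body = re.sub(r'(?: *\| *)?\n$', '', body)
    rows = []
    for line in body.split('\n'):
        # Strip the leading and trailing pipe of each row.
        line = re.sub(r'^ *\| *| *\| *$', '', line)
        # Split on unescaped pipes.
        rows.append(re.split(r' *(?<!\\)\| *', line))
    return rows


markdown = (
    '| Field | Type | Label | Description |\n'
    '| ----- | ---- | ----- | ----------- |\n'
    '| foo   | bar |  |  |\n'
    '\n'
    '<a name="test"/>'
)

body = TABLE.match(markdown).group(3)  # ends with '\n'
rows_buggy = split_rows(body)          # trailing pipe is stripped twice
rows_fixed = split_rows(body.rstrip()) # trailing pipe is stripped once

print(rows_buggy)  # [['foo', 'bar', '']]
print(rows_fixed)  # [['foo', 'bar', '', '']]
```

When the matched group ends in a newline, the first substitution already removes the closing pipe, and the per-row substitution then removes the next pipe as well, which is where the empty fourth cell is lost.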

@ntolia
Author

ntolia commented Apr 24, 2018

This is likely the same thing as lepture/mistune#118

@m-holger

The problem does seem to be with mistune. A quick fix is to add the following method to the RestBlockLexer class:

    def parse_table(self, m):
        # Amended version of the mistune.BlockLexer method.
        # Needs `import re` at module level.
        item = self._process_table(m)
        cols = len(item['header'])  # added
        cells = re.sub(r'(?: *\| *)?\n$', '', m.group(3))
        cells = cells.split('\n')
        for i, v in enumerate(cells):
            v = re.sub(r'^ *\| *| *\| *$', '', v)
            cells[i] = re.split(r' *(?<!\\)\| *', v)
            # The header row must match the delimiter row in the number of
            # cells. If not, a table will not be recognized. The remainder of
            # the table's rows may vary in the number of cells. If a row has
            # fewer cells than the header row, empty cells are inserted.
            # See https://github.github.com/gfm/#example-203
            while len(cells[i]) < cols:  # added
                cells[i].append('')
            # If a row has more cells than the header row, the excess is
            # ignored (same GFM example as above).
            del cells[i][cols:]  # added

        item['cells'] = self._process_cells(cells)
        self.tokens.append(item)
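
The padding and truncation that the added lines perform can be tried in isolation. This is only a sketch of that one step: normalize_row is a hypothetical name, not part of mistune or m2r, and it returns a new list rather than modifying the row in place as the method above does.

```python
def normalize_row(row, cols):
    """Hypothetical helper: force a row to exactly `cols` cells.

    Short rows are padded with empty strings; extra cells are dropped.
    """
    # A negative repeat count yields an empty list, so long rows are
    # left unpadded and only truncated by the slice.
    padded = row + [''] * (cols - len(row))
    return padded[:cols]


short_row = normalize_row(['foo', 'bar', ''], 4)
long_row = normalize_row(['a', 'b', 'c', 'd', 'e'], 4)

print(short_row)  # ['foo', 'bar', '', '']
print(long_row)   # ['a', 'b', 'c', 'd']
```

Note that this works around the symptom by restoring the missing cell as an empty string. It does not change how the table body is matched, so the trailing-newline behaviour in mistune remains.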
       
    
