handle colspan on table headers #144

Closed
wants to merge 1 commit into from

isaacwhite

This change fixes an issue where the table formatter ignores colspan attributes on <th> tags. For example, this markup:

<table class="capitalization" summary="Capitalization Examples">
  <caption> Headline Capitalization </caption>
  <colgroup>
    <col style="width:15%">
    <col style="width:15%">
  </colgroup>
  <colgroup>
    <col style="width:10%">
    <col style="width:30%">
    <col style="width:30%">
  </colgroup>
  <thead>
    <tr style="background-color:#ddd">
      <th class="tableheader" colspan="2">Capitalize</th>
      <th class="tableheader" colspan="3">lowercase</th>
    </tr>
  </thead>
  <tbody>
    <tr>
      <td> <i>No <br>    Nor <br>    Not</i> </td>
      <td> <i>Off <br>    Out <br>    So <br>    Up</i> </td>
      <td> <i>a <br>    and <br>    as <br>    at <br>    but <br>    by</i> </td>
      <td> <i>en</i> (in <i>en Route</i>)
        <br><i>for <br>    if <br>    in <br>    of <br>    on</i> </td>
      <td> <i>or</i>
        <br><i>the</i>
        <br><i>to</i>
        <br><i>v.</i> (in legal contexts)
        <br><i>vs. <br>    via</i> </td>
    </tr>
  </tbody>
</table>

Results in this text:

CAPITALIZE   LOWERCASE   
No           Off         a      en (in en Route)     or                        
Nor          Out         and    for                  the                       
Not          So          as     if                   to                        
             Up          at     in                   v. (in legal contexts)    
                         but    of                   vs.                       
                         by     on                   via

The expected output is this text:

CAPITALIZE           LOWERCASE                                                  
No           Off     a           en (in en Route)     or                        
Nor          Out     and         for                  the                       
Not          So      as          if                   to                        
             Up      at          in                   v. (in legal contexts)    
                     but         of                   vs.                       
                     by          on                   via

This change does not attempt to fix the absence of the <caption> tag's content from the output text, which is also demonstrated by the above markup.
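
The intended alignment can be illustrated with a small standalone sketch. It is not the code in this change and not html-to-text's implementation: `columnWidths`, `renderRows`, the `{ text, colspan }` cell shape, and the three-space `gap` are all hypothetical names and values chosen for illustration. The idea is that single-column cells set the base column widths, and a spanning header is padded to the combined width of the columns it covers, widening the last of them only when the header does not fit.

```javascript
// Hypothetical sketch of colspan-aware column layout (assumed names throughout).
// Each row is an array of cells shaped like { text: string, colspan?: number }.
function columnWidths(rows, gap = 3) {
  const widths = [];
  // Pass 1: single-column cells set the base width of each column.
  for (const row of rows) {
    let col = 0;
    for (const cell of row) {
      const span = cell.colspan || 1;
      for (let i = col; i < col + span; i++) {
        widths[i] = widths[i] || 0;
      }
      if (span === 1) {
        widths[col] = Math.max(widths[col], cell.text.length);
      }
      col += span;
    }
  }
  // Pass 2: a spanning cell widens its last column only if it does not fit.
  for (const row of rows) {
    let col = 0;
    for (const cell of row) {
      const span = cell.colspan || 1;
      if (span > 1) {
        let available = gap * (span - 1);
        for (let i = col; i < col + span; i++) {
          available += widths[i];
        }
        if (cell.text.length > available) {
          widths[col + span - 1] += cell.text.length - available;
        }
      }
      col += span;
    }
  }
  return widths;
}

function renderRows(rows, gap = 3) {
  const widths = columnWidths(rows, gap);
  const lines = rows.map((row) => {
    let col = 0;
    const parts = row.map((cell) => {
      const span = cell.colspan || 1;
      // A spanning cell also absorbs the gaps between the columns it covers.
      let width = gap * (span - 1);
      for (let i = col; i < col + span; i++) {
        width += widths[i];
      }
      col += span;
      return cell.text.padEnd(width);
    });
    return parts.join(' '.repeat(gap)).trimEnd();
  });
  return lines.join('\n');
}

// Usage: the second header starts in the same text column as the third body cell.
const demoRows = [
  [{ text: 'CAPITALIZE', colspan: 2 }, { text: 'LOWERCASE', colspan: 3 }],
  [{ text: 'No' }, { text: 'Off' }, { text: 'a' }, { text: 'en' }, { text: 'or' }],
];
console.log(renderRows(demoRows));
```

The sketch handles single-line cells only; multi-line cells such as those produced by <br> in the markup above would need each cell split into lines before measuring.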

@mlegenhausen
Member

@isaacwhite Could you please rebase and add a test case?

jackellenberger pushed a commit to jackellenberger/node-html-to-text that referenced this pull request May 11, 2018
* fixup of html-to-text#144
* update integration test with new functionality
jackellenberger pushed a commit to jackellenberger/node-html-to-text that referenced this pull request May 11, 2018
* fixup of html-to-text#144
* update integration test to cover new colspan functionality
* update unit tests to cover new colspan functionality
jackellenberger added a commit to jackellenberger/node-html-to-text that referenced this pull request Aug 3, 2018
* fixup of html-to-text#144
* update integration test to cover new colspan functionality
* update unit tests to cover new colspan functionality
@KillyMXI
Member

KillyMXI commented Nov 1, 2020

Version 6 output:

CAPITALIZE   LOWERCASE
No    Off    a     en (in en Route)   or
Nor   Out    and   for                the
Not   So     as    if                 to
      Up     at    in                 v. (in legal contexts)
             but   of                 vs.
             by    on                 via

Cells with colspan and rowspan can make use of extra space across columns/rows now.

Your effort is appreciated, but this is no longer needed.

I made a note about caption and colgroup. Maybe I'll take a look at them later, along with extra formatting for thead and tfoot.

KillyMXI closed this Nov 1, 2020