Extra paragraph tags in figures #412

Closed
tunetheweb opened this issue Nov 10, 2019 · 3 comments
Labels: development (Building the Almanac tech stack), enhancement (New feature or request)

Comments

@tunetheweb
Member

tunetheweb commented Nov 10, 2019

(I know @mikegeyser said he would look today, but I'm raising this so we don't forget.)

As discussed in #394 (comment), generating the chapters using npm run generate leads to extra <p></p> lines in both <img> and <table> figures:

table:

            </table>
          </div>
        </div>
        <p></p>
        <figcaption>Figure 4. HTTP version usage for home pages.</figcaption>
        <p></p>
      </figure>

img:

        <figcaption>Figure 9. TCP connections per page. (Source: <a href="https://httparchive.org/reports/state-of-the-web#tcp">HTTP Archive</a>)</figcaption>
        <p></p>
      </figure>

The generated HTML fails validation because of these tags.

At least some of them appear to come from the wrap_tables call: with that call commented out, the issue does not occur.

A simple fix is to add a regex replace in generate_chapters.js to remove these spurious tags:

  body = generate_figure_ids(body);
  body = wrap_tables(body);
  body = body.replace(/<p><\/p>/g,"");
  const toc = generate_table_of_contents(body);
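
For illustration, here is a minimal standalone sketch of that cleanup step. The helper name stripEmptyParagraphs and the sample markup are hypothetical and not part of generate_chapters.js. The pattern is also slightly looser than the one above, as it tolerates whitespace between the tags, which is an assumption about what the serializer might emit.

```javascript
// Hypothetical helper; not part of generate_chapters.js.
function stripEmptyParagraphs(html) {
  // Removes <p></p>, including variants with only whitespace between the tags.
  // Paragraphs that contain any text are left untouched.
  return html.replace(/<p>\s*<\/p>/g, "");
}

// Illustrative sample resembling the generated figure markup.
const sample = [
  "<figure>",
  '  <img src="chart.png" alt="Chart">',
  "  <figcaption>Figure 9. TCP connections per page.</figcaption>",
  "  <p></p>",
  "</figure>",
].join("\n");

const cleaned = stripEmptyParagraphs(sample);
console.log(cleaned);
```

Note that a replace like this acts on the whole body, so it would also remove any empty paragraph outside a figure, and it leaves behind the whitespace on the line where the tag sat.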

Will see if @mikegeyser has a better fix to prevent them happening in the first place before we go this route.

@tunetheweb changed the title from "Extra paragraph tags in visualisations" to "Extra paragraph tags in figures" on Nov 10, 2019
@rviscomi added the development (Building the Almanac tech stack) and enhancement (New feature or request) labels on Nov 10, 2019
@rviscomi added this to the Après Ski milestone on Nov 10, 2019
@mikegeyser
Contributor

As @bazzadp pointed out, I think it's the wrap_tables functionality. That uses JSDOM rather than regex, and so relies on serializing the jsdom document back to a string. We had some uncontrollable behaviour in generate_figure_ids while using that approach, and eventually abandoned it in favour of regex. I think this is probably a similar situation, which is why the problem disappears when you comment out wrap_tables.

I'll carry on looking into it, though, and see if there's an expedient fix.

@mikegeyser
Contributor

Actually, I think we should put in the fix that @bazzadp recommended while I keep working on a proper solution for the next release.

@rviscomi
Member

#415 was an interim fix but I think we can close this since there's no other immediate action.
