Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

HTML API: Use full parser for html5lib tests #7117

Closed
wants to merge 28 commits into from

Conversation

sirreal
Copy link
Member

@sirreal sirreal commented Jul 31, 2024

Trac ticket: Core-61646

Previously, the html5lib tests have only run in the fragment parser mode, but in #6977 a full parser was introduced to the HTML API. In this patch, the html5lib tests are running in the full parser mode where applicable.

#7146 is merged here and should be landed first.

- Tests: 609, Assertions: 405, Skipped: 204.
+ Tests: 1498, Assertions: 787, Skipped: 711.

This vastly increases the tests that are used from HTML5lib-tests from 609 to 1498! It's helped to discover two issues:

@sirreal sirreal force-pushed the html-api/support-initial-mode branch from 4a97cab to 1c834ba Compare July 31, 2024 17:33
Copy link

Test using WordPress Playground

The changes in this pull request can previewed and tested using a WordPress Playground instance.

WordPress Playground is an experimental project that creates a full WordPress instance entirely within the browser.

Some things to be aware of

  • The Plugin and Theme Directories cannot be accessed within Playground.
  • All changes will be lost when closing a tab with a Playground instance.
  • All changes will be lost when refreshing the page.
  • A fresh instance is created each time the link below is clicked.
  • Every time this pull request is updated, a new ZIP file containing all changes is created. If changes are not reflected in the Playground instance,
    it's possible that the most recent build failed, or has not completed. Check the list of workflow runs to be sure.

For more details about these limitations and more, check out the Limitations page in the WordPress Playground documentation.

Test this pull request with WordPress Playground.

Previously, the fragment parser in WP_HTML_Processor has only allowed
creating a fragment with the `<body>` context. In this patch, any
context node is allowed.
strlen( $this->sought_tag_name ) !== $this->tag_name_length ||
0 !== substr_compare( $this->html, $this->sought_tag_name, $this->tag_name_starts_at, $this->tag_name_length, true )
)
) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

we need this. next_tag( 'PLAINTEXT' ) stopped on P

@dmsnell
Copy link
Member

dmsnell commented Aug 2, 2024

@sirreal I've added active format reconstruction to the PLAINTEXT element, even though it's missing from the spec, but all the major parsers clearly perform this. I've submitted a PR to the HTML spec to add the rule.

whatwg/html#10540

@dmsnell
Copy link
Member

dmsnell commented Aug 5, 2024

Leaving a note for posterity's sake: adding active format reconstruction to PLAINTEXT is wrong. Adding PLAINTEXT is more complicated than it originally seemed, because even though every remaining token is a character token, if there are any character tokens left, they may trigger active format reconstruction. This is something I'll try and fix

@dmsnell dmsnell force-pushed the html-api/support-initial-mode branch from adfe657 to a385685 Compare August 5, 2024 22:19
@dmsnell
Copy link
Member

dmsnell commented Aug 5, 2024

Fragment support extracted into #7141 where I thought I had already done so, sorry!

@sirreal
Copy link
Member Author

sirreal commented Aug 6, 2024

Thanks, I'll revert the changes to support arbitrary fragments here and continue to only support body. That work is more complicated and can continue in #7141.

sirreal added 7 commits August 6, 2024 09:22
This is no longer required. The full or fragment parser will be
used as necessary.
The "</#" HTML test demonstrates a bug that must be fixed.
A state change was missing when the input is too short to find a comment
closer in an opened funky comment. This fixes a issue where `</#` does
not correctly report incomplete input, but `</# ` does.
@sirreal sirreal requested a review from dmsnell August 6, 2024 11:18
@sirreal sirreal marked this pull request as ready for review August 6, 2024 11:18
Copy link

github-actions bot commented Aug 6, 2024

The following accounts have interacted with this PR and/or linked issues. I will continue to update these lists as activity occurs. You can also manually ask me to refresh this list by adding the props-bot label.

Core Committers: Use this line as a base for the props when committing in SVN:

Props jonsurrell, dmsnell.

To understand the WordPress project's expectations around crediting contributors, please review the Contributor Attribution page in the Core Handbook.

Copy link
Member

@dmsnell dmsnell left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is a nice improvement in how many tests run, and it's not trivial to fix the test runner. Thanks again @sirreal!

This is a tests-only change.

pento pushed a commit that referenced this pull request Aug 6, 2024
Previously the `html5lib` tests have only run in the fragment parser mode,
assuming IN BODY context. This limited the number of tests which could run
and was a result of the HTML Processor only supporting the IN BODY fragment
parser. In [58836], however, a full parser was added to the HTML Processor.

In this patch the full parser is utilized in order to run more of the
previously-skipped tests, asserting more behaviors in the HTML parsing.

Developed in #7117
Discussed in https://core.trac.wordpress.org/ticket/61646

Props: dmsnell, jonsurrell.
See #61646.


git-svn-id: https://develop.svn.wordpress.org/trunk@58859 602fd350-edb4-49c9-b593-d223f7449a82
@dmsnell
Copy link
Member

dmsnell commented Aug 6, 2024

Merged in [58859]
9928cd6

@dmsnell dmsnell closed this Aug 6, 2024
@dmsnell dmsnell deleted the html-api/support-initial-mode branch August 6, 2024 22:06
markjaquith pushed a commit to markjaquith/WordPress that referenced this pull request Aug 6, 2024
Previously the `html5lib` tests have only run in the fragment parser mode,
assuming IN BODY context. This limited the number of tests which could run
and was a result of the HTML Processor only supporting the IN BODY fragment
parser. In [58836], however, a full parser was added to the HTML Processor.

In this patch the full parser is utilized in order to run more of the
previously-skipped tests, asserting more behaviors in the HTML parsing.

Developed in WordPress/wordpress-develop#7117
Discussed in https://core.trac.wordpress.org/ticket/61646

Props: dmsnell, jonsurrell.
See #61646.

Built from https://develop.svn.wordpress.org/trunk@58859


git-svn-id: http://core.svn.wordpress.org/trunk@58255 1a063a9b-81f0-0310-95a4-ce76da25c4cd
github-actions bot pushed a commit to platformsh/wordpress-performance that referenced this pull request Aug 6, 2024
Previously the `html5lib` tests have only run in the fragment parser mode,
assuming IN BODY context. This limited the number of tests which could run
and was a result of the HTML Processor only supporting the IN BODY fragment
parser. In [58836], however, a full parser was added to the HTML Processor.

In this patch the full parser is utilized in order to run more of the
previously-skipped tests, asserting more behaviors in the HTML parsing.

Developed in WordPress/wordpress-develop#7117
Discussed in https://core.trac.wordpress.org/ticket/61646

Props: dmsnell, jonsurrell.
See #61646.

Built from https://develop.svn.wordpress.org/trunk@58859


git-svn-id: https://core.svn.wordpress.org/trunk@58255 1a063a9b-81f0-0310-95a4-ce76da25c4cd
aslamdoctor pushed a commit to aslamdoctor/wordpress-develop that referenced this pull request Dec 28, 2024
Previously the `html5lib` tests have only run in the fragment parser mode,
assuming IN BODY context. This limited the number of tests which could run
and was a result of the HTML Processor only supporting the IN BODY fragment
parser. In [58836], however, a full parser was added to the HTML Processor.

In this patch the full parser is utilized in order to run more of the
previously-skipped tests, asserting more behaviors in the HTML parsing.

Developed in WordPress#7117
Discussed in https://core.trac.wordpress.org/ticket/61646

Props: dmsnell, jonsurrell.
See #61646.


git-svn-id: https://develop.svn.wordpress.org/trunk@58859 602fd350-edb4-49c9-b593-d223f7449a82
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants