HTML API: Add table support #6040

Closed
wants to merge 59 commits into from
Changes from 1 commit
9ec4b31
Work on table support
sirreal Feb 2, 2024
ab408c1
table processing
sirreal Feb 4, 2024
5ec8ffe
Disable FRAME / HEAD
sirreal Feb 4, 2024
2be4e63
In table body rules
sirreal Feb 5, 2024
0c0484c
prep step_in_row
sirreal Feb 6, 2024
b91df28
Remove unsupported table element tests
sirreal Jul 3, 2024
84b3c74
phpcbf and implement in row
sirreal Jul 4, 2024
1a90c26
Add clear_up_to_last_marker to active formatting
sirreal Jul 4, 2024
dafe4ea
Add step_in_cell method
sirreal Jul 4, 2024
cd8d7e7
Use class name for processor state constants access
sirreal Jul 4, 2024
a7b565b
Update since tags
sirreal Jul 4, 2024
3971e2b
Add close_cell method
sirreal Jul 4, 2024
6316e4b
Pop from open elements instead of removing items
sirreal Jul 4, 2024
cff5faa
Complete cases in step_in_table
sirreal Jul 4, 2024
6ef9df9
Implement in_table_scope
sirreal Jul 4, 2024
9c59014
Add HTML elements to has_element_in_scope handling
sirreal Jul 4, 2024
2ded522
Merge remote-tracking branch 'upstream/trunk' into html-api/add-table…
sirreal Jul 16, 2024
fce641d
Use insert_marker over set_marker
sirreal Jul 16, 2024
693a791
Use newly implemented step_in_X methods
sirreal Jul 16, 2024
d688c10
Clean whitespace
sirreal Jul 16, 2024
39eba92
Use bail method in case of foster parenting
sirreal Jul 16, 2024
2a3d7d4
Add mising cell insertion mode on enter td,th
sirreal Jul 16, 2024
ff7541c
PHPCBF
sirreal Jul 16, 2024
2cf10df
Use todo comments for parse errors
sirreal Jul 16, 2024
a216d55
Remove EOF comment
sirreal Jul 16, 2024
7d7f688
Move stack methods to stack class
sirreal Jul 16, 2024
e171589
Add and use form_element pointer
sirreal Jul 16, 2024
fbd635d
Handle presumptuous tags as if they were comments
sirreal Jul 16, 2024
af1142a
Add test for table > form > #comment
sirreal Jul 16, 2024
a7f5a22
Pop FORM elements off the stack in tables
sirreal Jul 16, 2024
a6a7c7d
Be more consistent in parse error comments
sirreal Jul 16, 2024
345b776
Remove outdated "stub implementation" notes
sirreal Jul 16, 2024
4966e7a
Add return types
sirreal Jul 16, 2024
c111a74
Handle whitespace in TABLE text
sirreal Jul 16, 2024
c7f6da6
Fix table start tag handling
sirreal Jul 16, 2024
d14eaf3
Remove "COL" from void tags test
sirreal Jul 17, 2024
0ce8fc4
Merge remote-tracking branch 'upstream/trunk' into html-api/add-table…
sirreal Jul 17, 2024
eaa8359
Fix handling of table text according to specification
sirreal Jul 17, 2024
380b9c6
Expand text processing comment and whitespace special character form
sirreal Jul 18, 2024
9957194
Fix comment whitespace
sirreal Jul 18, 2024
ffd0e1c
Clarify empty check after processing and null-remove
sirreal Jul 19, 2024
f49812e
Use consistent "\n" style character escapes
sirreal Jul 19, 2024
d9aa8fb
Merge branch 'trunk' into html-api/add-table-support
dmsnell Jul 23, 2024
40b55b4
Remove redundant null byte text replacement
sirreal Jul 23, 2024
9046cb3
Apply suggestion to compare multiple elements against node name
sirreal Jul 23, 2024
e4b874c
Add spec quote when generating a COLGROUP token
sirreal Jul 23, 2024
1cca15f
Add spec quote when generating a TBODY token
sirreal Jul 23, 2024
b29e1d3
Use goto for safer move to "anything else" condition
sirreal Jul 23, 2024
3f780fc
Revert "Remove "COL" from void tags test"
sirreal Jul 23, 2024
b44f7a3
fixup! Apply suggestion to compare multiple elements against node name
sirreal Jul 23, 2024
085950e
Add comment for no-quirks p table nesting
sirreal Jul 23, 2024
e057ff9
Remove strspn default args
sirreal Jul 23, 2024
dc752ff
Remove assertion in implementation from HTML spec
sirreal Jul 23, 2024
2cfe504
Adjust code after review.
dmsnell Jul 23, 2024
9590793
Remove typo.
dmsnell Jul 23, 2024
cf16373
Fix HTML spec quoting close_cell method
sirreal Jul 24, 2024
4c4bfc8
Use pop instruction for form elements that are immediately popped in …
sirreal Jul 24, 2024
0686de8
Remove unwanted change to expects-closer
dmsnell Jul 24, 2024
d43c220
Merge remote-tracking branch 'upstream/trunk' into html-api/add-table…
dmsnell Jul 24, 2024
138 changes: 103 additions & 35 deletions src/wp-includes/html-api/class-wp-html-processor.php
@@ -765,6 +765,7 @@ public function expects_closer( $node = null ): ?bool {
}

return ! (
( $node->has_self_closing_flag ?? false ) ||
Review comment (Member):
Interestingly, @sirreal, this is no longer required, since we moved to processing the event queue. I'm leaving it in for now, though, while pondering the later change which avoids immediately popping the FORM off the stack of open elements.

// Comments, text nodes, and other atomic tokens.
'#' === $token_name[0] ||
// Doctype declarations.
@@ -2138,7 +2139,8 @@ private function step_in_table(): bool {

switch ( $op ) {
/*
* > A character token, if the current node is table, tbody, template, tfoot, thead, or tr element
* > A character token, if the current node is table,
* > tbody, template, tfoot, thead, or tr element
*/
case '#text':
$current_node = $this->state->stack_of_open_elements->current_node();
@@ -2159,7 +2161,6 @@ private function step_in_table(): bool {
* U+0000 NULL bytes then ignore the token.
*/
if ( '' === $text ) {
// @todo Indicate a parse error once it's possible.
return $this->step();
}

@@ -2179,12 +2180,10 @@ private function step_in_table(): bool {
* >
* > Otherwise, insert the characters given by the pending table
* > character tokens list.
* > …
* > ASCII whitespace is U+0009 TAB, U+000A LF, U+000C FF, U+000D CR, or U+0020 SPACE.
*
* @see https://html.spec.whatwg.org/#parsing-main-intabletext
*/
if ( strlen( $text ) === strspn( $text, "\t\n\f\r " ) ) {
if ( strlen( $text ) === strspn( $text, " \t\f\r\n" ) ) {
sirreal marked this conversation as resolved.
$this->insert_html_element( $this->state->current_token );
return true;
}
@@ -2208,7 +2207,7 @@ private function step_in_table(): bool {
* > A DOCTYPE token
*/
case 'html':
// @todo Indicate a parse error once it's possible.
// Parse error: ignore the token.
sirreal marked this conversation as resolved.
return $this->step();

/*
@@ -2235,11 +2234,12 @@ private function step_in_table(): bool {
*/
case '+COL':
$this->state->stack_of_open_elements->clear_to_table_context();

/*
* > Insert an HTML element for a "colgroup" start tag token with no attributes,
* > then switch the insertion mode to "in column group".
*/
$this->insert_html_element( new WP_HTML_Token( null, 'COLGROUP', false ) );
$this->insert_virtual_node( 'COLGROUP' );
sirreal marked this conversation as resolved.
$this->state->insertion_mode = WP_HTML_Processor_State::INSERTION_MODE_IN_COLUMN_GROUP;
return $this->step( self::REPROCESS_CURRENT_NODE );

@@ -2265,18 +2265,20 @@ private function step_in_table(): bool {
* > Insert an HTML element for a "tbody" start tag token with no attributes,
* > then switch the insertion mode to "in table body".
*/
$this->insert_html_element( new WP_HTML_Token( null, 'TBODY', false ) );
$this->insert_virtual_node( 'TBODY' );
$this->state->insertion_mode = WP_HTML_Processor_State::INSERTION_MODE_IN_TABLE_BODY;
return $this->step( self::REPROCESS_CURRENT_NODE );

/*
* > A start tag whose tag name is "table"
*
* This tag in the IN TABLE insertion mode is a parse error.
*/
case '+TABLE':
// pase error
if ( ! $this->state->stack_of_open_elements->has_element_in_table_scope( 'TABLE' ) ) {
return $this->step();
}

$this->state->stack_of_open_elements->pop_until( 'TABLE' );
$this->reset_insertion_mode();
return $this->step( self::REPROCESS_CURRENT_NODE );
@@ -2289,6 +2291,7 @@ private function step_in_table(): bool {
// @todo Indicate a parse error once it's possible.
return $this->step();
}

$this->state->stack_of_open_elements->pop_until( 'TABLE' );
$this->reset_insertion_mode();
return true;
@@ -2307,7 +2310,7 @@ private function step_in_table(): bool {
case '-TH':
case '-THEAD':
case '-TR':
// @todo Indicate a parse error once it's possible.
// Parse error: ignore the token.
return $this->step();

/*
@@ -2318,7 +2321,9 @@ private function step_in_table(): bool {
case '+SCRIPT':
case '+TEMPLATE':
case '-TEMPLATE':
// > Process the token using the rules for the "in head" insertion mode.
/*
* > Process the token using the rules for the "in head" insertion mode.
*/
return $this->step_in_head();

/*
@@ -2339,6 +2344,8 @@ private function step_in_table(): bool {

/*
* > A start tag whose tag name is "form"
*
* This tag in the IN TABLE insertion mode is a parse error.
*/
case '+FORM':
if (
@@ -2347,10 +2354,12 @@ private function step_in_table(): bool {
) {
return $this->step();
}

// This FORM is special because it immediately closes and cannot have other children.
$this->state->current_token->has_self_closing_flag = true;
Review comment (Member):
This is where I'm pondering.

Reply (Member Author):
What's the motivation for doing it this way instead of adding and popping right here?

$this->insert_html_element( $this->state->current_token );
$this->state->form_element = $this->state->current_token;
// > Pop that form element off the stack of open elements.
$this->state->stack_of_open_elements->pop();
return true;
}

@@ -2459,9 +2468,7 @@ private function step_in_table_body(): bool {
case '+TD':
// @todo Indicate a parse error once it's possible.
$this->state->stack_of_open_elements->clear_to_table_body_context();
$this->insert_html_element(
new WP_HTML_Token( null, 'TR', false )
);
$this->insert_virtual_node( 'TR' );
$this->state->insertion_mode = WP_HTML_Processor_State::INSERTION_MODE_IN_ROW;
return $this->step( self::REPROCESS_CURRENT_NODE );

@@ -2471,18 +2478,24 @@ private function step_in_table_body(): bool {
case '-TBODY':
case '-TFOOT':
case '-THEAD':
if (
! $this->state->stack_of_open_elements->has_element_in_table_scope( $tag_name )
) {
// @todo Indicate a parse error once it's possible.
/*
* @todo This needs to check if the element in scope is an HTML element, meaning that
* when SVG and MathML support is added, this needs to differentiate between an
* HTML element of the given name, such as `<center>`, and a foreign element of
* the same given name.
*/
if ( ! $this->state->stack_of_open_elements->has_element_in_table_scope( $tag_name ) ) {
// Parse error: ignore the token.
return $this->step();
}

$this->state->stack_of_open_elements->clear_to_table_body_context();
$this->state->stack_of_open_elements->pop();
$this->state->insertion_mode = WP_HTML_Processor_State::INSERTION_MODE_IN_TABLE;
return true;

/*
* > A start tag whose tag name is one of: "caption", "col", "colgroup", "tbody", "tfoot", "thead"
* > A start tag whose tag name is one of: "caption", "col", "colgroup","tbody", "tfoot", "thead"
* > An end tag whose tag name is "table"
*/
case '+CAPTION':
@@ -2497,7 +2510,7 @@ private function step_in_table_body(): bool {
! $this->state->stack_of_open_elements->has_element_in_table_scope( 'THEAD' ) &&
! $this->state->stack_of_open_elements->has_element_in_table_scope( 'TFOOT' )
) {
// @todo Indicate a parse error once it's possible.
// Parse error: ignore the token.
return $this->step();
}
$this->state->stack_of_open_elements->clear_to_table_body_context();
@@ -2516,7 +2529,7 @@ private function step_in_table_body(): bool {
case '-TD':
case '-TH':
case '-TR':
// @todo Indicate a parse error once it's possible.
// Parse error: ignore the token.
return $this->step();
}

@@ -2564,9 +2577,10 @@ private function step_in_row(): bool {
*/
case '-TR':
if ( ! $this->state->stack_of_open_elements->has_element_in_table_scope( 'TR' ) ) {
// @todo Indicate a parse error once it's possible.
// Parse error: ignore the token.
return $this->step();
}

$this->state->stack_of_open_elements->clear_to_table_row_context();
$this->state->stack_of_open_elements->pop();
$this->state->insertion_mode = WP_HTML_Processor_State::INSERTION_MODE_IN_TABLE_BODY;
@@ -2585,9 +2599,10 @@ private function step_in_row(): bool {
case '+TR':
case '-TABLE':
if ( ! $this->state->stack_of_open_elements->has_element_in_table_scope( 'TR' ) ) {
// @todo Indicate a parse error once it's possible.
// Parse error: ignore the token.
return $this->step();
}

$this->state->stack_of_open_elements->clear_to_table_row_context();
$this->state->stack_of_open_elements->pop();
$this->state->insertion_mode = WP_HTML_Processor_State::INSERTION_MODE_IN_TABLE_BODY;
@@ -2599,14 +2614,22 @@ private function step_in_row(): bool {
case '-TBODY':
case '-TFOOT':
case '-THEAD':
/*
* @todo This needs to check if the element in scope is an HTML element, meaning that
* when SVG and MathML support is added, this needs to differentiate between an
* HTML element of the given name, such as `<center>`, and a foreign element of
* the same given name.
*/
if ( ! $this->state->stack_of_open_elements->has_element_in_table_scope( $tag_name ) ) {
// @todo Indicate a parse error once it's possible.
// Parse error: ignore the token.
return $this->step();
}

if ( ! $this->state->stack_of_open_elements->has_element_in_table_scope( 'TR' ) ) {
// ignore the token.
// Ignore the token.
return $this->step();
}

$this->state->stack_of_open_elements->clear_to_table_row_context();
$this->state->stack_of_open_elements->pop();
$this->state->insertion_mode = WP_HTML_Processor_State::INSERTION_MODE_IN_TABLE_BODY;
@@ -2622,7 +2645,7 @@ private function step_in_row(): bool {
case '-HTML':
case '-TD':
case '-TH':
// @todo Indicate a parse error once it's possible.
// Parse error: ignore the token.
return $this->step();
}

@@ -2659,14 +2682,29 @@ private function step_in_cell(): bool {
*/
case '-TD':
case '-TH':
/*
* @todo This needs to check if the element in scope is an HTML element, meaning that
* when SVG and MathML support is added, this needs to differentiate between an
* HTML element of the given name, such as `<center>`, and a foreign element of
* the same given name.
*/
if ( ! $this->state->stack_of_open_elements->has_element_in_table_scope( $tag_name ) ) {
// @todo Indicate a parse error once it's possible.
// Parse error: ignore the token.
return $this->step();
}

$this->generate_implied_end_tags();
if ( ! $this->state->stack_of_open_elements->current_node()->node_name ) {

/*
* @todo This needs to check if the current node is an HTML element, meaning that
* when SVG and MathML support is added, this needs to differentiate between an
* HTML element of the given name, such as `<center>`, and a foreign element of
* the same given name.
*/
if ( ! $this->state->stack_of_open_elements->current_node_is( $tag_name ) ) {
// @todo Indicate a parse error once it's possible.
}

$this->state->stack_of_open_elements->pop_until( $tag_name );
$this->state->active_formatting_elements->clear_up_to_last_marker();
$this->state->insertion_mode = WP_HTML_Processor_State::INSERTION_MODE_IN_ROW;
@@ -2685,6 +2723,12 @@ private function step_in_cell(): bool {
case '+TH':
case '+THEAD':
case '+TR':
/*
* > Assert: The stack of open elements has a td or th element in table scope.
*
* Nothing to do here, except to verify in tests that this never appears.
*/

$this->close_cell();
return $this->step( self::REPROCESS_CURRENT_NODE );

@@ -2696,7 +2740,7 @@ private function step_in_cell(): bool {
case '-COL':
case '-COLGROUP':
case '-HTML':
// @todo Indicate a parse error once it's possible.
// Parse error: ignore the token.
return $this->step();

/*
@@ -2707,8 +2751,14 @@ private function step_in_cell(): bool {
case '-TFOOT':
case '-THEAD':
case '-TR':
/*
* @todo This needs to check if the element in scope is an HTML element, meaning that
* when SVG and MathML support is added, this needs to differentiate between an
* HTML element of the given name, such as `<center>`, and a foreign element of
* the same given name.
*/
if ( ! $this->state->stack_of_open_elements->has_element_in_table_scope( $tag_name ) ) {
// @todo Indicate a parse error once it's possible.
// Parse error: ignore the token.
return $this->step();
}
$this->close_cell();
@@ -4057,16 +4107,16 @@ private function run_adoption_agency_algorithm(): void {
/**
* Runs the close cell algorithm.
*
* @see https://html.spec.whatwg.org/multipage/parsing.html#close-the-cell
*
* Where the steps above say to close the cell, they mean to run the following algorithm:

*
* > 1. Generate implied end tags.
* > 2. If the current node is not now a td element or a th element, then this is a parse error.
* > 3. Pop elements from the stack of open elements stack until a td element or a th element has been popped from the stack.
* > 4. Clear the list of active formatting elements up to the last marker.
* > 5. Switch the insertion mode to "in row".
*
* @see https://html.spec.whatwg.org/multipage/parsing.html#close-the-cell
*
* @since 6.7.0
*/
private function close_cell(): void {
@@ -4095,6 +4145,24 @@ private function insert_html_element( WP_HTML_Token $token ): void {
$this->state->stack_of_open_elements->push( $token );
}

/**
* Inserts a virtual element on the stack of open elements.
*
* @since 6.7.0
*
* @param string $token_name Name of token to create and insert into the stack of open elements.
* @param string|null $bookmark_name Optional. Name to give bookmark for created virtual node.
* Defaults to auto-creating a bookmark name.
*/
private function insert_virtual_node( $token_name, $bookmark_name = null ): void {
$here = $this->bookmarks[ $this->state->current_token->bookmark_name ];
$name = $bookmark_name ?? $this->bookmark_token();

$this->bookmarks[ $name ] = new WP_HTML_Span( $here->start, 0 );

$this->insert_html_element( new WP_HTML_Token( $name, $token_name, false ) );
}
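
Note on the helper above: it anchors a synthesized element to a zero-length span that starts where the current token starts, so the implied node has a bookmark without claiming any bytes of the input. The following is a minimal standalone sketch of that idea, not the patch's code. `Sketch_Span`, `Sketch_Token`, and `Sketch_Parser` are hypothetical stand-ins for the WordPress classes, and a simple counter is assumed in place of `bookmark_token()`.

```php
<?php
// Hypothetical stand-ins for illustration only; these are not the WordPress classes.
final class Sketch_Span {
	public $start;
	public $length;

	public function __construct( int $start, int $length ) {
		$this->start  = $start;
		$this->length = $length;
	}
}

final class Sketch_Token {
	public $bookmark_name;
	public $node_name;

	public function __construct( string $bookmark_name, string $node_name ) {
		$this->bookmark_name = $bookmark_name;
		$this->node_name     = $node_name;
	}
}

final class Sketch_Parser {
	public $bookmarks = array();
	public $stack     = array();
	private $counter  = 0;

	// Records where a real token sits in the input, without pushing it onto the stack.
	public function bookmark_token( string $node_name, int $start, int $length ): Sketch_Token {
		$name                     = 'token-' . ( ++$this->counter );
		$this->bookmarks[ $name ] = new Sketch_Span( $start, $length );
		return new Sketch_Token( $name, $node_name );
	}

	// Pushes an implied element whose span starts at the current token and covers zero bytes.
	public function insert_virtual_node( Sketch_Token $current, string $node_name ): Sketch_Token {
		$here                     = $this->bookmarks[ $current->bookmark_name ];
		$name                     = 'virtual-' . ( ++$this->counter );
		$this->bookmarks[ $name ] = new Sketch_Span( $here->start, 0 );

		$token         = new Sketch_Token( $name, $node_name );
		$this->stack[] = $token;
		return $token;
	}
}

// In "<table><td>", the TD tag starts at byte offset 7 and is 4 bytes long.
// An implied TR inserted before it gets a zero-length span at the same offset.
$parser = new Sketch_Parser();
$td     = $parser->bookmark_token( 'TD', 7, 4 );
$tr     = $parser->insert_virtual_node( $td, 'TR' );
```

In this sketch the real TD token keeps its own 4-byte span, and only the implied TR is pushed; the TD would be handled when the token is reprocessed.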

/*
* HTML Specification Helpers
*/
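
Note on the table-text check in the `#text` case of `step_in_table()` earlier in this diff: `strspn()` returns the length of the leading run of bytes drawn from the mask, so comparing it with `strlen()` tests whether the text is made up entirely of ASCII whitespace. A small sketch of that comparison follows; `sketch_is_all_ascii_whitespace()` is a hypothetical helper name, since the patch inlines the expression.

```php
<?php
/**
 * Hypothetical helper for illustration; the patch inlines this comparison.
 */
function sketch_is_all_ascii_whitespace( string $text ): bool {
	// The mask lists SPACE, TAB, FORM FEED, CARRIAGE RETURN, and LINE FEED.
	return strlen( $text ) === strspn( $text, " \t\f\r\n" );
}

$only_whitespace = sketch_is_all_ascii_whitespace( " \t\r\n\f" );
$has_text        = sketch_is_all_ascii_whitespace( " \n cell" );
// U+00A0 NO-BREAK SPACE (UTF-8 bytes C2 A0) is not ASCII whitespace.
$nbsp            = sketch_is_all_ascii_whitespace( "\xC2\xA0" );
$empty           = sketch_is_all_ascii_whitespace( '' );
```

The empty string also satisfies the comparison (both sides are 0), which is why the order matters in the diff: the `'' === $text` check runs first and ignores the token before this test is reached.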