WP_HTML_Tag_Processor: Make get_attribute reflect attribute set via set_attribute, even without updating #46680

Merged
merged 3 commits into from
Jan 25, 2023
156 changes: 142 additions & 14 deletions lib/experimental/html/class-wp-html-tag-processor.php
@@ -1059,7 +1059,7 @@ private function parse_next_attribute() {
return true;
}

- /**
+ /*
* > There must never be two or more attributes on
* > the same start tag whose names are an ASCII
* > case-insensitive match for each other.
Expand Down Expand Up @@ -1116,24 +1116,33 @@ private function after_tag() {
* Converts class name updates into tag attributes updates
* (they are accumulated in different data formats for performance).
*
- * This method is only meant to run right before the attribute updates are applied.
- * The behavior in all other cases is undefined.
- *
* @return void
* @since 6.2.0
*
* @see $classname_updates
* @see $lexical_updates
*/
private function class_name_updates_to_attributes_updates() {
- if ( count( $this->classname_updates ) === 0 || isset( $this->lexical_updates['class'] ) ) {
- $this->classname_updates = array();
+ if ( count( $this->classname_updates ) === 0 ) {
  return;
  }

- $existing_class = isset( $this->attributes['class'] )
- ? substr( $this->html, $this->attributes['class']->value_starts_at, $this->attributes['class']->value_length )
- : '';
+ $existing_class = $this->get_enqueued_attribute_value( 'class' );
+ if ( null === $existing_class || true === $existing_class ) {
+ $existing_class = '';
+ }
+
+ if ( false === $existing_class && isset( $this->attributes['class'] ) ) {
+ $existing_class = substr(
+ $this->html,
+ $this->attributes['class']->value_starts_at,
+ $this->attributes['class']->value_length
+ );
+ }
+
+ if ( false === $existing_class ) {
+ $existing_class = '';
+ }

/**
* Updated "class" attribute value.
Expand Down Expand Up @@ -1251,7 +1260,7 @@ private function apply_attributes_updates() {
return;
}

- /**
+ /*
* Attribute updates can be enqueued in any order but as we
* progress through the document to replace them we have to
* make our replacements in the order in which they are found
@@ -1270,7 +1279,7 @@ private function apply_attributes_updates() {
}

foreach ( $this->bookmarks as $bookmark ) {
- /**
+ /*
* As we loop through $this->lexical_updates, we keep comparing
* $bookmark->start and $bookmark->end to $diff->start. We can't
* change it and still expect the correct result, so let's accumulate
Expand Down Expand Up @@ -1370,6 +1379,69 @@ private static function sort_start_ascending( $a, $b ) {
return $a->end - $b->end;
}

/**
* Return the enqueued value for a given attribute, if one exists.
*
* Enqueued updates can take different data types:
* - If an update is enqueued and is boolean, the return will be `true`
* - If an update is otherwise enqueued, the return will be the string value of that update.
* - If an attribute is enqueued to be removed, the return will be `null` to indicate that.
* - If no updates are enqueued, the return will be `false` to differentiate from "removed."
*
* @since 6.2.0
*
* @param string $comparable_name The attribute name in its comparable form.
* @return string|boolean|null Value of enqueued update if present, otherwise false.
*/
private function get_enqueued_attribute_value( $comparable_name ) {
if ( ! isset( $this->lexical_updates[ $comparable_name ] ) ) {
return false;
}

$enqueued_text = $this->lexical_updates[ $comparable_name ]->text;

// Removed attributes erase the entire span.
if ( '' === $enqueued_text ) {
return null;
}

/*
* Boolean attribute updates are just the attribute name without a corresponding value.
*
* This value might differ from the given comparable name in that there could be leading
* or trailing whitespace, and that the casing follows the name given in `set_attribute`.
*
* Example:
* ```
* $p->set_attribute( 'data-TEST-id', 'update' );
* 'update' === $p->get_enqueued_attribute_value( 'data-test-id' );
* ```
*
* Here we detect this based on the absence of the `=`, which _must_ exist in any
* attribute containing a value, e.g. `<input type="text" enabled />`.
*                                            ¹           ²
* 1. Attribute with a string value.
* 2. Boolean attribute whose value is `true`.
*/
$equals_at = strpos( $enqueued_text, '=' );
if ( false === $equals_at ) {
return true;
}

/*
* Finally, a normal update's value will appear after the `=` and
* be double-quoted, as performed incidentally by `set_attribute`.
*
* e.g. `type="text"`
*           ¹²    ³
* 1. Equals is here.
* 2. Double-quoting starts one after the equals sign.
* 3. Double-quoting ends at the last character in the update.
*/
$enqueued_value = substr( $enqueued_text, $equals_at + 2, -1 );
return html_entity_decode( $enqueued_value );
}
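
The four return states documented for `get_enqueued_attribute_value` can be tried outside the class. What follows is a minimal stand-alone sketch, not the processor's code: `sketch_enqueued_value()` is a hypothetical helper, and passing `null` to mean "no update enqueued" is an assumption that stands in for the `isset( $this->lexical_updates[ ... ] )` check.

```php
<?php
/**
 * Hypothetical stand-alone helper mirroring the four documented outcomes.
 *
 * @param string|null $enqueued_text Replacement text of an enqueued update,
 *                                   or null when nothing is enqueued (assumption).
 * @return string|bool|null
 */
function sketch_enqueued_value( $enqueued_text ) {
	// No update enqueued at all.
	if ( null === $enqueued_text ) {
		return false;
	}

	// A removal replaces the whole attribute span with an empty string.
	if ( '' === $enqueued_text ) {
		return null;
	}

	// A boolean attribute is only a name, so its text contains no `=`.
	$equals_at = strpos( $enqueued_text, '=' );
	if ( false === $equals_at ) {
		return true;
	}

	// Skip the `="` pair and drop the closing double quote.
	$raw_value = substr( $enqueued_text, $equals_at + 2, -1 );
	return html_entity_decode( $raw_value );
}

// The sample texts below are made up for illustration.
foreach ( array( null, '', ' disabled', ' title="a &amp; b"' ) as $sample ) {
	echo var_export( sketch_enqueued_value( $sample ), true ), "\n";
}
```

Returning `false` for "nothing enqueued" keeps that case distinct from `null`, which callers read as "enqueued for removal".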

/**
* Returns the value of the parsed attribute in the currently-opened tag.
*
Expand Down Expand Up @@ -1397,12 +1469,43 @@ public function get_attribute( $name ) {
}

$comparable = strtolower( $name );

/*
* For every attribute other than `class` we can perform a quick check if there's an
* enqueued lexical update whose value we should prefer over what's in the input HTML.
*
* The `class` attribute is special though because we expose the helpers `add_class`
* and `remove_class` which form a builder for the `class` attribute, so we have to
* additionally check if there are any enqueued class changes. If there are, we need
* to first flush them out so we can report the full string value of the attribute.
*/
if ( 'class' === $name ) {
$this->class_name_updates_to_attributes_updates();
}
Comment on lines +1482 to +1484
Contributor Author

@ockham ockham Jan 4, 2023

For $this->attribute_updates (below) it makes sense (and is easy enough) to only evaluate the updates for the attribute we're interested in (if any) and to ignore updates for all other attributes. For class name updates OTOH, we would need to replicate most of the logic from class_name_updates_to_attributes_updates here, so we might as well just call that function and have classname updates "promoted" to attribute updates.

cc/ @adamziel since the PHPDoc for class_name_updates_to_attributes_updates says

	 * This method is only meant to run right before the attribute updates are applied.
	 * The behavior in all other cases is undefined.

Although it looks okay to me to call it here?

Member

I think we need to call $this->apply_attributes_updates() here as well to ensure they get applied before any successive class updates occur, lest we lose enqueued class name updates.

Contributor Author

Ah yeah, you're right! While class_name_updates_to_attributes_updates does take an existing class attribute into account, it only looks at $this->attributes['class'] to do so -- but not into $lexical_updates:

$existing_class = isset( $this->attributes['class'] )
? substr( $this->html, $this->attributes['class']->value_starts_at, $this->attributes['class']->value_length )
: '';

So we do need to call $this->apply_attributes_updates() as you say (unless we want to change class_name_updates_to_attributes_updates to take $lexical_updates['class'] into account -- but I guess we'd rather not).

I'll write a test case and add a $this->apply_attributes_updates() call 👍

Contributor Author

Note that this will change existing semantics of add_class and set_attribute -- see:

/**
* When both set_attribute('class', $value) and add_class( $different_value ) are called,
* the final class name should be $value. In other words, the `add_class` call should be ignored,
* and the `set_attribute` call should win. This holds regardless of the order in which these methods
* are called.
*
* @ticket 56299
*
* @covers add_class
* @covers set_attribute
* @covers get_updated_html
*/
public function test_set_attribute_takes_priority_over_add_class() {
$p = new WP_HTML_Tag_Processor( self::HTML_WITH_CLASSES );
$p->next_tag();
$p->add_class( 'add_class' );
$p->set_attribute( 'class', 'set_attribute' );
$this->assertSame(
'<div class="set_attribute" id="first"><span class="not-main bold with-border" id="second">Text</span></div>',
$p->get_updated_html(),
'Calling get_updated_html after updating first tag\'s attributes did not return the expected HTML'
);
$p = new WP_HTML_Tag_Processor( self::HTML_WITH_CLASSES );
$p->next_tag();
$p->set_attribute( 'class', 'set_attribute' );
$p->add_class( 'add_class' );
$this->assertSame(
'<div class="set_attribute" id="first"><span class="not-main bold with-border" id="second">Text</span></div>',
$p->get_updated_html(),
'Calling get_updated_html after updating second tag\'s attributes did not return the expected HTML'
);
}

Adding the apply_attributes_updates call thus breaks that test (plus a few others; I'm going to look into those now).

Contributor Author

Considering something like this:
diff --git a/lib/experimental/html/class-wp-html-tag-processor.php b/lib/experimental/html/class-wp-html-tag-processor.php
index e9c42bf419..9edd6a4105 100644
--- a/lib/experimental/html/class-wp-html-tag-processor.php
+++ b/lib/experimental/html/class-wp-html-tag-processor.php
@@ -1126,14 +1126,24 @@ class WP_HTML_Tag_Processor {
         * @see $lexical_updates
         */
        private function class_name_updates_to_attributes_updates() {
-               if ( count( $this->classname_updates ) === 0 || isset( $this->lexical_updates['class'] ) ) {
+               if ( count( $this->classname_updates ) === 0 ) {
                        $this->classname_updates = array();
                        return;
                }
 
-               $existing_class = isset( $this->attributes['class'] )
-                       ? substr( $this->html, $this->attributes['class']->value_starts_at, $this->attributes['class']->value_length )
-                       : '';
+               if ( isset( $this->lexical_updates['class'] ) ) {
+                       $existing_class_attr  = trim( $this->lexical_updates['class']->text );
+                       $existing_class_value = substr( $existing_class_attr, strlen( 'class' ) + 2, -1 );
+                       $existing_class       = html_entity_decode( $existing_class_value );
+               } elseif ( isset( $this->attributes['class'] ) ) {
+                       $existing_class = substr(
+                               $this->html,
+                               $this->attributes['class']->value_starts_at,
+                               $this->attributes['class']->value_length
+                       );
+               } else {
+                       $existing_class = '';
+               }
 
                /**
                 * Updated "class" attribute value.

But that will also change the semantics: it breaks two test cases, both of which are about the add_class/set_attribute interaction AFAICS.

Member

Do we have a reason to change the behavior? I think it's probably important to ensure that set_attribute and remove_attribute updates wipe out class builder method changes. I can reason about what it means to call add_class after setting the class attribute, but I can't reason out an obvious outcome of calling add_class before the transactional swap of set_attribute.

Contributor

@adamziel adamziel Jan 24, 2023

> an obvious outcome of calling add_class before the transactional swap of set_attribute

It is confusing 🤔 I'd like set_attribute to cancel any add_class/remove_class calls that happened before. I'd be fine with an undefined behavior accompanied by a warning, too, though.

Member

@adamziel In my latest commit I restored that defined behavior. In other words, I now think the change in behavior wasn't necessary, so I've removed it.

Contributor Author

@ockham ockham Jan 25, 2023

Thank you for fixing the behavior @dmsnell!

FWIW, you didn't exactly revert my change 😬: The original behavior -- prior to my changes -- was that set_attribute( 'class', 'abc' ) would "prevail" over add_class( 'xyz' ) even if the add_class was called after the set_attribute, thus resulting in class="abc". This was covered by this unit test (this is trunk!):

/**
* When both set_attribute('class', $value) and add_class( $different_value ) are called,
* the final class name should be $value. In other words, the `add_class` call should be ignored,
* and the `set_attribute` call should win. This holds regardless of the order in which these methods
* are called.
*
* @ticket 56299
*
* @covers add_class
* @covers set_attribute
* @covers get_updated_html
*/
public function test_set_attribute_takes_priority_over_add_class() {
$p = new WP_HTML_Tag_Processor( self::HTML_WITH_CLASSES );
$p->next_tag();
$p->add_class( 'add_class' );
$p->set_attribute( 'class', 'set_attribute' );
$this->assertSame(
'<div class="set_attribute" id="first"><span class="not-main bold with-border" id="second">Text</span></div>',
$p->get_updated_html(),
'Calling get_updated_html after updating first tag\'s attributes did not return the expected HTML'
);
$p = new WP_HTML_Tag_Processor( self::HTML_WITH_CLASSES );
$p->next_tag();
$p->set_attribute( 'class', 'set_attribute' );
$p->add_class( 'add_class' );
$this->assertSame(
'<div class="set_attribute" id="first"><span class="not-main bold with-border" id="second">Text</span></div>',
$p->get_updated_html(),
'Calling get_updated_html after updating second tag\'s attributes did not return the expected HTML'
);
}

I came across this when I was addressing the (apparently) missing $this->apply_attributes_updates() call that you'd pointed out. I realized that simply adding it would introduce other problems, so I worked around those, noting that this changed the outcome of subsequent set_attribute( 'class', 'abc' ) and add_class( 'xyz' ) calls. I did like that my change would have them result in class="abc xyz" if the add_class was called after the set_attribute -- I still consider this an improvement over the original behavior 👍

OTOH, it meant that if add_class( 'xyz' ) was called before set_attribute( 'class', 'abc' ), it would also result in class="abc xyz". I didn't pay enough attention to that consequence, which is clearly not the behavior a user would expect 👎

Thanks to your change, we now have both cases behave as expected 🎉
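
The two orderings discussed in this thread can be illustrated with a toy model. This is only a sketch under stated assumptions: `Toy_Class_Attribute` and its members are hypothetical and are not the processor's implementation. It models a single tag's `class` value, with `set_attribute` discarding earlier `add_class` calls and later `add_class` calls building on the enqueued value.

```php
<?php
/**
 * Toy model of one tag's `class` handling. Every name here is hypothetical.
 */
final class Toy_Class_Attribute {
	/** @var string Class value as parsed from the source HTML. */
	private $parsed;

	/** @var string|null Value enqueued by set_attribute(), or null if none. */
	private $enqueued = null;

	/** @var string[] Class names enqueued by add_class(). */
	private $pending = array();

	public function __construct( $parsed ) {
		$this->parsed = $parsed;
	}

	public function set_attribute( $value ) {
		$this->enqueued = $value;
		// A direct write wipes out builder calls made before it.
		$this->pending = array();
	}

	public function add_class( $name ) {
		$this->pending[] = $name;
	}

	public function get_attribute() {
		// Prefer the enqueued value over the source HTML as the starting point.
		$base  = null !== $this->enqueued ? $this->enqueued : $this->parsed;
		$names = preg_split( '/\s+/', $base, -1, PREG_SPLIT_NO_EMPTY );

		foreach ( $this->pending as $name ) {
			if ( ! in_array( $name, $names, true ) ) {
				$names[] = $name;
			}
		}

		return implode( ' ', $names );
	}
}

$set_then_add = new Toy_Class_Attribute( 'main' );
$set_then_add->set_attribute( 'abc' );
$set_then_add->add_class( 'xyz' );
echo $set_then_add->get_attribute(), "\n"; // abc xyz

$add_then_set = new Toy_Class_Attribute( 'main' );
$add_then_set->add_class( 'xyz' );
$add_then_set->set_attribute( 'abc' );
echo $add_then_set->get_attribute(), "\n"; // abc
```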

Member

Thanks for clarifying @ockham - you were right, I didn't understand that this was broken before the PR. The intention was right, the implementation was wrong.


// If we have an update for this attribute, return the updated value.
$enqueued_value = $this->get_enqueued_attribute_value( $comparable );
if ( false !== $enqueued_value ) {
return $enqueued_value;
}

if ( ! isset( $this->attributes[ $comparable ] ) ) {
return null;
}

$attribute = $this->attributes[ $comparable ];

/*
* This flag distinguishes an attribute with no value
* from an attribute with an empty string value. For
* unquoted attributes this could look very similar.
* It refers to whether an `=` follows the name.
*
* e.g. <div boolean-attribute empty-attribute=></div>
*           ¹                 ²
* 1. Attribute `boolean-attribute` is `true`.
* 2. Attribute `empty-attribute` is `""`.
*/
if ( true === $attribute->is_true ) {
return true;
}
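
The lookup order in this hunk (enqueued update first, parsed source attribute second) can be sketched with plain arrays. This is a simplified illustration, not the method itself: `sketch_get_attribute()` is hypothetical, and it assumes `$enqueued` already holds decoded values, with `null` meaning "enqueued for removal".

```php
<?php
/**
 * Hypothetical lookup: prefer an enqueued value over the parsed one.
 *
 * @param array  $parsed   Attribute name => value from the source HTML (true for boolean).
 * @param array  $enqueued Attribute name => pending value; null means removal (assumption).
 * @param string $name     Attribute name in any casing.
 * @return string|true|null
 */
function sketch_get_attribute( array $parsed, array $enqueued, $name ) {
	$comparable = strtolower( $name );

	// array_key_exists() is used because isset() treats a null entry as missing,
	// which would hide a pending removal.
	if ( array_key_exists( $comparable, $enqueued ) ) {
		return $enqueued[ $comparable ];
	}

	return array_key_exists( $comparable, $parsed ) ? $parsed[ $comparable ] : null;
}

$source = array( 'id' => 'first', 'hidden' => true );

var_dump( sketch_get_attribute( $source, array( 'id' => 'second' ), 'ID' ) );
var_dump( sketch_get_attribute( $source, array( 'id' => null ), 'id' ) );
var_dump( sketch_get_attribute( $source, array(), 'hidden' ) );
```

The diff reaches the same distinction with a `false` sentinel from `get_enqueued_attribute_value`; the sketch uses key existence instead because it has no separate "no update" value.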
Expand Down Expand Up @@ -1582,7 +1685,7 @@ public function set_attribute( $name, $value ) {
$updated_attribute = "{$name}=\"{$escaped_new_value}\"";
}

- /**
+ /*
* > There must never be two or more attributes on
* > the same start tag whose names are an ASCII
* > case-insensitive match for each other.
Expand Down Expand Up @@ -1628,6 +1731,14 @@ public function set_attribute( $name, $value ) {
' ' . $updated_attribute
);
}

/*
* Any calls to update the `class` attribute directly should wipe out any
* enqueued class changes from `add_class` and `remove_class`.
*/
if ( 'class' === $comparable_name && ! empty( $this->classname_updates ) ) {
$this->classname_updates = array();
}
}

/**
@@ -1638,7 +1749,11 @@ public function set_attribute( $name, $value ) {
* @param string $name The attribute name to remove.
*/
public function remove_attribute( $name ) {
- /**
+ if ( $this->is_closing_tag ) {
+ return false;
+ }
+
+ /*
* > There must never be two or more attributes on
* > the same start tag whose names are an ASCII
* > case-insensitive match for each other.
@@ -1647,7 +1762,20 @@ public function remove_attribute( $name ) {
* @see https://html.spec.whatwg.org/multipage/syntax.html#attributes-2:ascii-case-insensitive
*/
$name = strtolower( $name );
- if ( $this->is_closing_tag || ! isset( $this->attributes[ $name ] ) ) {
+
+ /*
+ * Any calls to update the `class` attribute directly should wipe out any
+ * enqueued class changes from `add_class` and `remove_class`.
+ */
+ if ( 'class' === $name && count( $this->classname_updates ) !== 0 ) {
+ $this->classname_updates = array();
+ }
+
+ // If we updated an attribute we didn't originally have, remove the enqueued update and move on.
+ if ( ! isset( $this->attributes[ $name ] ) ) {
+ if ( isset( $this->lexical_updates[ $name ] ) ) {
+ unset( $this->lexical_updates[ $name ] );
+ }
return false;
}
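
The new early-return path can be modelled with two arrays. This is a hedged sketch only: `sketch_remove_attribute()` is hypothetical, and its final branch (enqueueing an empty replacement and returning `true`) is an assumption, since the rest of the method is not shown in this hunk. The empty string follows the "removed attributes erase the entire span" convention noted in `get_enqueued_attribute_value`.

```php
<?php
/**
 * Hypothetical model of removing an attribute.
 *
 * @param string[] $parsed   Names of attributes present in the source tag.
 * @param array    $enqueued Attribute name => pending replacement text (by reference).
 * @param string   $name     Attribute name in any casing.
 * @return bool Whether a removal was enqueued (return value is an assumption).
 */
function sketch_remove_attribute( array $parsed, array &$enqueued, $name ) {
	$name = strtolower( $name );

	if ( ! in_array( $name, $parsed, true ) ) {
		// Nothing in the source to erase: dropping the pending update is enough.
		unset( $enqueued[ $name ] );
		return false;
	}

	// Assumed continuation: erase the attribute's span with an empty replacement.
	$enqueued[ $name ] = '';
	return true;
}

$pending = array( 'title' => ' title="added earlier"' );
var_dump( sketch_remove_attribute( array( 'id' ), $pending, 'title' ) );
var_dump( $pending );
```

Without the `unset`, an attribute that was added with `set_attribute` and then removed would still be written to the output.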
