WIP: Refactor ImageCDN parsing to rely on HTML API instead of RegExps
The introduction of the HTML API into WordPress 6.2 offers a new method
of matching and modifying HTML. In this patch we're replacing code that
attempts to parse the input HTML and extract images that are direct
children of an anchor ("A" tag), then read and modify them based on
the values of their attributes and computed Photon properties.

In the previous code the `Image_CDN` class scanned the entire HTML
document to generate a list of PREG image match objects, then iterated
over those matches and performed string-replace operations on them.

Now the class does a pass from start to finish, visiting each image
tag along the way, and making the appropriate modifications. Extra
care is taken to ensure that only images that are the single child
of a link are matched.
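
As a rough illustration of that single pass (not the code in this patch),
the sketch below uses `WP_HTML_Tag_Processor` from WordPress 6.2. The
function name and the `data-sketch-linked` attribute are made up for the
example, and only `IMG` is handled for brevity. It assumes WordPress is
loaded, and it checks adjacency at the tag level only: the Tag Processor
in 6.2 does not expose text nodes, so text sitting next to the image
inside the link would go unnoticed.

```php
<?php
/**
 * Hypothetical helper, for illustration only.
 * Marks every IMG whose neighbouring tags are exactly <a> and </a>.
 * Requires WordPress 6.2+ (WP_HTML_Tag_Processor must be loaded).
 */
function sketch_mark_linked_images( $html ) {
	$processor = new WP_HTML_Tag_Processor( $html );

	// null: nothing pending; 'A': just saw <a>; 'IMG': saw <a> then <img>.
	$previous = null;

	// Visit opening and closing tags alike, in document order.
	while ( $processor->next_tag( array( 'tag_closers' => 'visit' ) ) ) {
		$tag       = $processor->get_tag();
		$is_closer = $processor->is_tag_closer();

		if ( 'A' === $tag && ! $is_closer ) {
			$previous = 'A';
		} elseif ( 'A' === $previous && 'IMG' === $tag && ! $is_closer ) {
			// Candidate image: remember where it is until </a> confirms it.
			$processor->set_bookmark( 'image' );
			$previous = 'IMG';
		} elseif ( 'IMG' === $previous && 'A' === $tag && $is_closer ) {
			// Confirmed: go back to the image and modify it in place.
			$processor->seek( 'image' );
			$processor->set_attribute( 'data-sketch-linked', 'true' );
			$processor->release_bookmark( 'image' );
			$previous = null;
		} else {
			// Any other tag breaks the <a><img></a> pattern.
			$previous = null;
		}
	}

	return $processor->get_updated_html();
}
```

Bookmarking the candidate and seeking back once the closing `A` arrives
keeps this to one forward scan plus a short rewind, with no list of
matches built up front.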

In this change the value of the `tag` key passed to some of the filters
has changed from the initial matched HTML snippet to the name of the
image tag, which could be `IMG` or `AMP-IMG` or `AMP-ANIM`. An update
to the Tag Processor or a custom sub-class thereof could provide the
original HTML snippet and match the existing behavior, but that hasn't
been done in this patch yet given the author's uncertainty about the
use and value of those snippets.
dmsnell committed Aug 26, 2023
1 parent b5db3fd commit ba4718b
Showing 1 changed file with 311 additions and 244 deletions.