Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Extend urlReplacementPatterns to details items in diff UI #154

Open
gravi2 opened this issue Dec 12, 2019 · 7 comments
Open

Extend urlReplacementPatterns to details items in diff UI #154

gravi2 opened this issue Dec 12, 2019 · 7 comments
Labels

Comments

@gravi2
Copy link
Contributor

gravi2 commented Dec 12, 2019

@patrickhulce
In our builds (e.g PR builds or nightly builds), we have different build specific URLs for page resources ( e.g CSS, JS, Images, etc). And these changes from build to build, for the same page been loaded. For example:

In Build#1, top page (https://www.example.com/some/path) will load following resources:
https://cdn.example.com/build-1/app.css
https://cdn.example.com/build-1/app.js

In Build#2, top page (https://www.example.com/some/path) will load following resources:
https://cdn.example.com/build-2/app.css
https://cdn.example.com/build-2/app.js

As you can see the resource URLs differ only by the build- values.

When we run lhci against the top URL (https://www.example.com/some/path), the lhci dashboard shows 4 lines when comparing the performance of 2 builds. It will show something like following:

https://cdn.example.com/build-1/app.css Removed
https://cdn.example.com/build-1/app.js Removed
https://cdn.example.com/build-2/app.css Added
https://cdn.example.com/build-2/app.js Added

As you can see the files are the same but been served at different URLs for various cache busting reasons. It makes the reading of comparison report a bit difficult.

I was hoping to find support in LHCI (or lighthouse) that allows us to regex replace the resource URLs, so that we can canonicalize them BEFORE they are inserted into the LHCI server for analysis. I see that lhci upload supports urlReplacementPatterns but its seems like it only works on the top URL i.e the page been tested. It does not impact/change the resource URLs.

Please let me know if there is a way to achieve the above. Or if it would be a good idea and within the scope of lighthouse-ci, for a PR.

Also wondering if others are facing similar issues when using bundlers like webpack which may have different bundle hashes between the builds.

@patrickhulce
Copy link
Collaborator

but its seems like it only works on the top URL i.e the page been tested. It does not impact/change the resource URLs.

Yeah this was intentional to avoid losing any information in the report itself, but we can consider applying it to the entire report too.

The logic for the linking items in the table diff is more aggressive and different than the report URL itself.

function replaceNondeterministicStrings(s) {
return (
s
// YouTube Embeds
.replace(/www-embed-player-[0-9a-z]+/i, 'www-embed-player')
.replace(/player_ias-[0-9a-z]+/i, 'player_ias')
// UUIDs
.replace(/\b[0-9a-f]{8}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{12}\b/gi, 'UUID')
// localhost Ports
.replace(/:[0-9]{3,5}\//, ':PORT/')
// Hash components embedded in filenames
.replace(/(\.|-)[0-9a-f]{8}\.(js|css|woff|html|png|jpeg|jpg|svg)/i, '$1HASH.$2')
);
}

I think my preference might be to thread the urlReplacementPatterns all the way through to the diff UI rather than replace at upload time to preserve that data.

@patrickhulce patrickhulce changed the title Regex replace resource URLs Extend urlReplacementPatterns to details items in diff UI Dec 12, 2019
@patrickhulce patrickhulce added enhancement New feature or request P2 labels Dec 12, 2019
@patrickhulce patrickhulce added P1 and removed P2 labels Jan 13, 2020
@patrickhulce
Copy link
Collaborator

OK after revisiting this later. I'm not 100% convinced threading through the build urlReplacementPatterns is the best move. I'm still leaning that way but here are some options I see.

  • Upload urlReplacementPatterns along with each build.
    Pros: rogue builds can't tank the whole project, straightforward to implement, configured in a single place
    Cons: historical results can't be fixed, stores a lot of duplicate data
  • Configure a new property uiDiffReplacementPatterns at the project level
    Pros: consistent behavior across all builds, historical results can benefit, no duplicate storage
    Cons: replacement patterns must be configured in two places (at build time for page URLs and in project for the subresource), requires admin privileges (Server auth / user management #85 Add project management UI #86)
  • Configure this as a localStorage user setting in the UI
    Pros: no migrations required, users can adjust to their preferences, easy to implement, follows more closely user-suggested behavior Add an option to ignore query string on static resources #215
    Cons: replacement patterns must be configured in two places (at build time for page URLs and in diff UI for the subresources), different users will see different diffs/a link is not all the state anymore

Any preferences between these @gravi2? And in the meantime are there any patterns that build-1 and build-2 follow that might be generally applicable we should add? (like they're long hash path components or something)

@boomaker
Copy link

boomaker commented May 7, 2020

And in the meantime are there any patterns that build-1 and build-2 follow that might be generally applicable we should add? (like they're long hash path components or something)

Taking my example, I have different hostnames for each branches, so diffing the master with a branch commit highlights these duplicates and complicates diff reading:

image

The first option looks more consistant with the actual behavior for me. We don't expect to keep a long history anyway.

@KnisterPeter
Copy link
Contributor

@patrickhulce We have the same problem. We are building with webpack and hashing the assets which are then compared in lighthouse ci. Therefore we have diffs in each build regarding all asset-urls.

I would perfer the option Configure a new property uiDiffReplacementPatterns at the project level you described above, because I think its the most easy to understand option.
And managing the replacement patterns twice could be reduced if the config file is a js file instead of just json or yaml. This is already supported.

@patrickhulce
Copy link
Collaborator

Thanks @KnisterPeter !

And managing the replacement patterns twice could be reduced if the config file is a js file instead of just json or yaml.

I'm not sure how this would work. As things work right now, uiDiffReplacementPatterns would need to be configured in the server admin UI of the specific project you're working on. It would be a property of a database record rather than a string in a JS file. There's no real concept of specific projects getting a config file atm. Do you have a proposal for how that might work?

@KnisterPeter
Copy link
Contributor

Oh, my bad. I thought it might be possible to configure the uiDiffReplacementPatterns in the lighthouserc.js as well.
But yes I understand the requirement, since the client/uploader and the server/viewer are separated and do not share a config file. 😿

@AlesJiranek
Copy link

AlesJiranek commented Jun 13, 2024

Hello, is there something happening regarding this issue? Our project is currently facing the exactly same problem

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

No branches or pull requests

5 participants