Add a html-only gatherer #9756

benschwarz · 2019-09-30T00:51:08Z

Feature request summary

At the moment custom plugins aren't able to create their own gatherers, and there isn't a gatherer to retrieve the final HTML that a page produced.

Ref: ampproject/amp-toolbox#509

My proposal is to add a gatherer that returns document.documentElement.outerHTML, or something like it.

What is the motivation or use case for changing this?

Plugins are unable to retrieve the document HTML from the browser.

How is this beneficial to Lighthouse?

Ergonomic plugin authoring

brendankenny · 2019-09-30T01:10:29Z

I think the amp validator would need the initial html, wouldn't it? So this could be calling driver.getRequestContent with the main resource request.

A side benefit is that most of the time the html request would be of pretty reasonable size (at least compared to many ScriptElements), a lot better than the easily monstrous (css in js, etc etc) outerHTML.

benschwarz · 2019-09-30T04:06:57Z

> So this could be calling driver.getRequestContent with the main resource request.

Yah. That'd also account for redirections etc etc 👍

> I think the amp validator would need the initial html, wouldn't it?

I'm not really qualified to say whether amp would want the initial HTML or the fully realised result, but I'm sure @alabiaga or @cramforce could add some guidance here

ithinkihaveacat · 2019-09-30T08:20:29Z

The AMP validator wants the equivalent of Chrome's "view page source." So driver.getRequestContent would be the right thing in this case?
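A rough sketch of that approach, in case it helps the discussion. Only `driver.getRequestContent` comes from this thread; the `NetworkRecord` and `DriverLike` shapes and the `findMainDocument` helper are assumptions written for illustration, not Lighthouse's actual types. Matching on the final URL is one simple way to pick the post-redirect document.

```typescript
// Minimal stand-ins for the objects mentioned in the thread (assumed shapes).
interface NetworkRecord {
  requestId: string;
  url: string;
  resourceType: string;
}

interface DriverLike {
  // Mirrors the driver.getRequestContent call discussed above.
  getRequestContent(requestId: string): Promise<string>;
}

// Hypothetical helper: pick the document request whose URL is the final URL,
// so that earlier hops in a redirect chain are skipped.
function findMainDocument(records: NetworkRecord[], finalUrl: string): NetworkRecord {
  const record = records.find(r => r.resourceType === 'Document' && r.url === finalUrl);
  if (!record) throw new Error(`No document request found for ${finalUrl}`);
  return record;
}

// Returns the response body of the main document, i.e. the "view page source" HTML.
async function getMainDocumentContent(
  driver: DriverLike,
  records: NetworkRecord[],
  finalUrl: string
): Promise<string> {
  const mainDocument = findMainDocument(records, finalUrl);
  return driver.getRequestContent(mainDocument.requestId);
}
```

The result is the server's response body, so scripts that mutate the DOM after load have no effect on it.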

patrickhulce · 2019-09-30T14:47:11Z

Ah yeah the initial HTML document is a much easier ask than the final rendered HTML content :)

This sounds doable as a default gatherer IMO!

alabiaga · 2019-09-30T17:51:40Z

> I'm not really qualified to say whether amp would want the initial HTML or the fully realised result

Only the initial HTML is validated, if I am understanding the terminology correctly. Take the following example: https://playground.amp.dev/?url=https%3A%2F%2Fpreview.amp.dev%2Fdocumentation%2Fexamples%2Fcomponents%2Famp-list&format=websites&_gl=1*1na28s7*_ga*YW1wLWdtNHdvMm5XNUs3dWRqYWFKZHp3eHc.

The markup on the left is what is validated. The realized and final html is the output on the right.

When I created the gatherer, it was because I wasn't aware that there was an async version of the audit call, so I did the work in the gatherer and passed the result to the audit via an artifact. In the gatherer, I don't recall ever using driver.getRequestContent, as I wasn't sure which ID to pass. If I remember correctly, I tried two approaches. One was issuing a Chrome DevTools Protocol command, DOM.getDocument, but I ended up going a simpler route, using an existing gatherer as an example. I think document was just globally available, or the HTML document was a property of some context.
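For comparison, a minimal sketch of what the protocol route could look like. This is not the code from the original gatherer. `DOM.getDocument` and `DOM.getOuterHTML` are Chrome DevTools Protocol methods, while the `ProtocolSession` interface and `getRenderedHtml` function are hypothetical stand-ins. Note that this yields the rendered DOM serialized back to HTML, which is not the initial HTML the validator wants.

```typescript
// Assumed stand-in for whatever object sends protocol commands to the browser.
interface ProtocolSession {
  sendCommand(method: string, params?: Record<string, unknown>): Promise<any>;
}

// Serializes the current (rendered) document via the DOM domain.
async function getRenderedHtml(session: ProtocolSession): Promise<string> {
  // DOM.getDocument returns the root node of the document.
  const {root} = await session.sendCommand('DOM.getDocument');
  // DOM.getOuterHTML returns the markup for the node with the given nodeId.
  const {outerHTML} = await session.sendCommand('DOM.getOuterHTML', {nodeId: root.nodeId});
  return outerHTML;
}
```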

brendankenny · 2019-10-16T16:47:20Z

The AMP plugin can now use artifacts.MainDocumentContent on master/next release
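A minimal sketch of how a plugin audit might read the new artifact. The artifact name `MainDocumentContent` is from this thread; the `Artifacts` and `AuditResult` types are local stand-ins so the snippet is self-contained (a real plugin would extend Lighthouse's `Audit` class), and the `AmpAttributeAudit` check itself is a hypothetical, deliberately rough example rather than what the AMP plugin does.

```typescript
// Local stand-in types (assumptions, not Lighthouse's own typings).
interface Artifacts {
  MainDocumentContent: string; // initial HTML of the main document
}

interface AuditResult {
  score: number;
}

// Hypothetical audit: does the initial HTML declare itself as AMP?
class AmpAttributeAudit {
  static get meta() {
    return {
      id: 'amp-attribute',
      title: 'Document declares the AMP attribute',
      failureTitle: 'Document does not declare the AMP attribute',
      description: 'Checks the initial HTML for an `amp` or `⚡` attribute on <html>.',
      // Requesting the artifact is what makes it available to audit().
      requiredArtifacts: ['MainDocumentContent'],
    };
  }

  static audit(artifacts: Artifacts): AuditResult {
    const html = artifacts.MainDocumentContent;
    // Find the opening <html ...> tag in the source as served.
    const openTag = html.match(/<html\b[^>]*>/i);
    // Rough attribute check; a real validator parses the markup properly.
    const declared = openTag !== null && /\s(amp|⚡)(\s|=|>)/i.test(openTag[0]);
    return {score: declared ? 1 : 0};
  }
}
```

Because the audit only sees the served markup, attributes added later by script are not considered.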

benschwarz mentioned this issue Sep 30, 2019

Lighthouse plugin feedback ampproject/amp-toolbox#509 (Open)

This was referenced Oct 2, 2019

[meta] Lighthouse 6.0 Burndown #9774 (Closed)

core(gather): add new MainDocumentContent public artifact #9781 (Merged)

brendankenny closed this as completed in #9781 Oct 16, 2019
