
core(lantern): Remove min task duration from lantern graphs #9910

Merged
merged 24 commits into GoogleChrome:master on Nov 25, 2019

Conversation

warrengm
Contributor

@warrengm warrengm commented Nov 1, 2019

This PR removes the minimum task duration from the page dependency graph in order to retain necessary edges. For example, this fixes at least two cases of missing paths in the lantern graph:

  1. Script A creates an iframe that issues request B
  2. An XHR load event handler of request A fires request B

In both cases the Lantern graph lacks the A -> B edge whenever the intermediate CPU nodes are fast (< 10 ms). We have observed incorrect simulation results in the Publisher Ads Audits plugin due to these missing edges.

I suspect this PR may reduce some variance in cases where sites have some tasks that waffle around the 10 ms threshold (i.e. a particular task may be non-deterministically included in the graph).

To mitigate run time, I minimize the graph size by pruning short tasks while inserting edges to retain paths between dependencies and dependents. For example, a -> b -> c becomes a -> c if b is a short task.
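
As a rough sketch of that prune-and-rewire step (the node shape and function name here are toy placeholders for illustration, not Lighthouse's actual CPUNode/NetworkNode classes):

// Toy sketch: drop a short CPU task but keep the paths through it,
// so a -> b -> c collapses to a -> c when b is below the threshold.
function pruneShortCpuNodes(nodes, minTaskDurationMs = 10) {
  for (const node of nodes) {
    if (node.type !== 'cpu' || node.durationMs >= minTaskDurationMs) continue;

    // Rewire every dependency of the short node directly to every dependent.
    for (const dependency of node.dependencies) {
      for (const dependent of node.dependents) {
        dependency.dependents.add(dependent);
        dependent.dependencies.add(dependency);
      }
    }

    // Detach the short node itself.
    for (const dependency of node.dependencies) dependency.dependents.delete(node);
    for (const dependent of node.dependents) dependent.dependencies.delete(node);
    node.dependencies.clear();
    node.dependents.clear();
  }
}

// Example: a (network) -> b (2 ms cpu) -> c (network) becomes a -> c.
const a = {type: 'network', dependencies: new Set(), dependents: new Set()};
const b = {type: 'cpu', durationMs: 2, dependencies: new Set([a]), dependents: new Set()};
const c = {type: 'network', dependencies: new Set([b]), dependents: new Set()};
a.dependents.add(b);
b.dependents.add(c);
pruneShortCpuNodes([a, b, c]);
console.log(c.dependencies.has(a)); // true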

@warrengm warrengm changed the title (misc) Remove min task duration from page graph in lantern type(misc) Remove min task duration from page graph in lantern Nov 1, 2019
@warrengm warrengm changed the title type(misc) Remove min task duration from page graph in lantern type(misc) Remove min task duration from langtern page graphs Nov 1, 2019
@warrengm warrengm changed the title type(misc) Remove min task duration from langtern page graphs type(misc): Remove min task duration from lantern graphs Nov 1, 2019
@brendankenny
Member

Feedback would be appreciated on whether I'm missing any important considerations.

we tried this somewhere recently and it slowed down Lighthouse execution time significantly if we dropped the minimum completely. I can't find where that was though. @patrickhulce, do you remember?

@patrickhulce
Collaborator

ya some friends running LH tried this and it doubled total runtime, there are a lot of tasks under 10ms :/ #9627 (comment)

I think we do need to start including some tasks under our threshold, but we need to be more selective about it. We've already identified that the first paint/layout/parsehtml events should be saved regardless of length (#9627 (comment)). Are there other specific patterns of tasks we could be saving without removing the optimization entirely, @warrengm, that would help you out? :)

@patrickhulce
Collaborator

The other alternative here is to try to radically improve the runtime of simulation with large numbers of tasks. There's likely some low-hanging fruit we could tackle to improve things and lower the threshold, but I doubt there's enough to let us include all tasks; we run quite a few simulations per run.

@connorjclark
Collaborator

we run quite a few simulations per run.

can we look into parallelizing simulation? wasm / workers?

@patrickhulce
Collaborator

can we look into parallelizing simulation? wasm / workers?

doing a Promise.all on audit computation and making the simulation somehow async off the main thread would certainly solve that problem, but the complexity of introducing workers to our flow seems like an order of magnitude above being slightly more careful here right now :)

certainly an interesting long-term solution to explore though as we increase the amount of analysis with JS parsing and other heavy-handed audit tasks
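
Purely as a hypothetical sketch of that long-term idea (the worker script path, payload shape, and function names are invented for illustration; Lighthouse does not do this today):

// Hypothetical only: fan simulations out to Node worker threads and
// Promise.all the results. './run-simulation-worker.js' is an assumed
// worker script that posts its simulation result back as a message.
const {Worker} = require('worker_threads');

function simulateInWorker(serializedGraph) {
  return new Promise((resolve, reject) => {
    const worker = new Worker('./run-simulation-worker.js', {workerData: serializedGraph});
    worker.once('message', resolve);
    worker.once('error', reject);
    worker.once('exit', code => {
      if (code !== 0) reject(new Error(`worker exited with code ${code}`));
    });
  });
}

async function simulateAll(serializedGraphs) {
  return Promise.all(serializedGraphs.map(simulateInWorker));
}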

@warrengm warrengm changed the title type(misc): Remove min task duration from lantern graphs misc(lantern): Remove min task duration from lantern graphs Nov 4, 2019
@warrengm
Contributor Author

warrengm commented Nov 4, 2019

I'll look into this a bit more. I think it should be relatively easy to determine which tasks should be kept by checking which tasks end up in the unfiltered lantern graph between network nodes. But I don't have a good sense of how much that will reduce the size. Hopefully we can do a simple filter on task name without needing more complicated heuristics.
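
A sketch of what such a name-based filter could look like (the event names below are the ones discussed in this thread; the task shape and threshold are placeholders rather than the actual tracehouse schema):

// Illustrative allowlist: keep a task if it's long enough OR if its trace
// event name is one we never want to drop.
const ALWAYS_KEEP_EVENT_NAMES = new Set([
  'TimerInstall',
  'TimerFire',
  'XHRReadyStateChange',
  'ResourceSendRequest',
]);

function shouldKeepTask(task, minTaskDurationMs = 10) {
  return task.durationMs >= minTaskDurationMs || ALWAYS_KEEP_EVENT_NAMES.has(task.eventName);
}

// e.g. tasks.filter(task => shouldKeepTask(task));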

I briefly considered adding some external controls for callers. For example LCP simulations probably don't need the entire graph, but I'm not sure that's worth the complexity.

Any algorithm optimizations SGTM but I'm not so familiar with the LH codebase so I'll leave that to you all

@warrengm
Contributor Author

warrengm commented Nov 4, 2019

@patrickhulce What are the main factors in lantern speed? For example, does the number of children in a CPUNode matter? Or is the runtime dominated purely by the number of edges and nodes in the graph?

I think most of the events we care about are children of a RunTask event node but I think we could potentially prune the children if that would have a positive impact on runtime.

@patrickhulce
Collaborator

patrickhulce commented Nov 4, 2019

What are the main factors in lantern speed?

Largely dominated by the number of nodes in the graph. A CPUNode is created for every toplevel main thread task above the threshold, so by keeping that threshold high we limit the number of nodes in the graph that can slow things down.

For example, does the number of children in a CPUNode matter?

Only a marginal effect. It's used when cloning and for determining a multiplier but it's not nearly as dominant as node count.

I think most of the events we care about are children of a RunTask event node but I think we could potentially prune the children if that would have a positive impact on runtime.

Based on this and your original problem statement it sounds like we basically only want to look at these nodes for the purposes of creating relationships between other network nodes. I definitely see how this could meaningfully improve accuracy, so how about this:

  • Remove the threshold for CPU tasks on initial graph creation to build out the edges.
  • Before returning the created graph, do a pass on the graph to filter out small CPU nodes but preserve the dependencies that node would have created between other nodes.

i.e.

NetworkNode (/xhr-1.json) <-- CPUNode (XHR Callback, 2ms) <-- NetworkNode (/xhr-2.json)

becomes

NetworkNode (/xhr-1.json) <-- NetworkNode (/xhr-2.json)

instead of the situation today

NetworkNode (/xhr-1.json)
NetworkNode (/xhr-2.json)

@brendankenny brendankenny mentioned this pull request Nov 5, 2019
@warrengm
Contributor Author

warrengm commented Nov 7, 2019

Started working on (1). I made some changes so the version you looked at is slightly out of date now, but MLB and vine are still two of the most affected sites.

  • For MLB (which is now more accurate), most of the shift in TTFCPUI seems to come from new TimerInstall and TimerFire nodes.
  • For vine, the regression in TTI is due to some new edges between network nodes, since we're no longer ignoring some XHRReadyStateChange events
    • This pushes back the last long task (from a ResourceReceivedData event) due to extra dependencies. This seems like correct behavior even though it's less accurate.

From what I can tell, the new short nodes are always pruned in my current implementation so the effect seems to be purely from introducing new edges.

@warrengm warrengm changed the title misc(lantern): Remove min task duration from lantern graphs core(lantern): Remove min task duration from lantern graphs Nov 7, 2019
@warrengm
Contributor Author

warrengm commented Nov 7, 2019

I did some analysis by gradually introducing the new edges and got the following. My initial reaction is that the results seem acceptable. The biggest regression (in % terms) appears to be Vine, but the new paths between network nodes and ResourceReceivedData CPU nodes appear reasonable.

Adding short timer events

Metric                    p50 (% Error)        p90 (% Error)        p95 (% Error)       
roughEstimateOfFCP        16.2% ↔  0.0%        47.6% ↔  0.0%        61.0% ↔  0.0%       
roughEstimateOfFMP        16.4% ↔  0.0%        49.7% ↔  0.0%        57.8% ↔  0.0%       
roughEstimateOfTTFCPUI    22.2% ↔  0.0%        54.5% ↔  0.0%        75.0% ↔  0.0%       
roughEstimateOfTTI        23.6% ↔  0.0%        57.0% ↔  0.0%        107.1% ↔  0.0%      
roughEstimateOfSI         34.9% ↔  0.0%        75.0% ↔  0.0%        83.1% ↔  0.0%       
 
 ------- Fixes Summary -------
TTFCPUI on http://www.rakuten.ne.jp/ - 21867 (real) 14376 (cur) 13257 (prev)
TTFCPUI on https://weather.com/ - 10046 (real) 22804 (cur) 23104 (prev)
TTI on http://www.rakuten.ne.jp/ - 32059 (real) 23769 (cur) 23108 (prev)
TTFCPUI on http://www.metrolyrics.com/ - 27321 (real) 39256 (cur) 39593 (prev)
 
 ------- Regression Summary -------
TTI on http://www.skype.com/ - 7108 (real) 7654 (cur) 7554 (prev)
TTFCPUI on http://www.dawn.com/ - 8296 (real) 13859 (cur) 13751 (prev)
TTI on http://www.dawn.com/ - 12021 (real) 16330 (cur) 16175 (prev)
TTI on http://www.ynet.com/ - 4546 (real) 8153 (cur) 8096 (prev)
 
 ------- Bucket Summary -------
Good:      243   ↔  0
OK:        180   ↔  0
Bad:       71    ↔  0
 
 ------- % Error Summary -------
p50:       20.6% ↔  0.0%
p90:       56.1% ↔  0.0%
p95:       78.1% ↔  0.0%
 
 
Comparing to master computed values...
❌  FAIL    16 change(s) between expected and computed!

Adding short Timer, XHR, and ResourceSendRequest events

Metric                    p50 (% Error)        p90 (% Error)        p95 (% Error)       
roughEstimateOfFCP        16.2% ↔  0.0%        47.6% ↔  0.0%        61.0% ↔  0.0%       
roughEstimateOfFMP        16.4% ↔  0.0%        49.7% ↔  0.0%        57.8% ↔  0.0%       
roughEstimateOfTTFCPUI    21.3% ↓  0.9%        53.5% ↓  1.0%        75.0% ↔  0.0%       
roughEstimateOfTTI        23.6% ↔  0.0%        57.0% ↔  0.0%        107.1% ↔  0.0%      
roughEstimateOfSI         34.9% ↔  0.0%        75.0% ↔  0.0%        83.1% ↔  0.0%       
 
 ------- Fixes Summary -------
TTFCPUI on http://www.mlb.com/ - 23530 (real) 26390 (cur) 8141 (prev)
TTFCPUI on http://www.rakuten.ne.jp/ - 21867 (real) 14376 (cur) 13257 (prev)
TTFCPUI on https://vine.co/ - 7685 (real) 8276 (cur) 6822 (prev)
TTFCPUI on http://www.t-online.de/ - 20325 (real) 23033 (cur) 23649 (prev)
 
 ------- Regression Summary -------
TTI on https://vine.co/ - 7685 (real) 8572 (cur) 7846 (prev)
TTFCPUI on http://www.verizonwireless.com/ - 12468 (real) 14246 (cur) 13945 (prev)
TTI on http://www.skype.com/ - 7108 (real) 7691 (cur) 7554 (prev)
TTI on http://www.orange.fr/ - 14159 (real) 15495 (cur) 15250 (prev)
 
 ------- Bucket Summary -------
Good:      244   ↔  1
OK:        180   ↔  0
Bad:       70    ↓  1
 
 ------- % Error Summary -------
p50:       20.4% ↓  0.2%
p90:       55.4% ↓  0.6%
p95:       78.1% ↔  0.0%

Comparing to master computed values...
❌  FAIL    28 change(s) between expected and computed!

Adding in all new edges

Metric                    p50 (% Error)        p90 (% Error)        p95 (% Error)       
roughEstimateOfFCP        16.2% ↔  0.0%        47.6% ↔  0.0%        61.0% ↔  0.0%       
roughEstimateOfFMP        16.2% ↓  0.2%        49.7% ↔  0.0%        57.8% ↔  0.0%       
roughEstimateOfTTFCPUI    21.3% ↓  0.9%        53.5% ↓  1.0%        75.0% ↔  0.0%       
roughEstimateOfTTI        23.6% ↔  0.0%        57.0% ↔  0.0%        107.2% ↔  0.1%      
roughEstimateOfSI         34.9% ↔  0.0%        75.0% ↔  0.0%        83.1% ↔  0.0%       
 
 ------- Fixes Summary -------
TTFCPUI on http://www.mlb.com/ - 23530 (real) 26178 (cur) 8141 (prev)
FMP on http://www.metacafe.com/ - 1665 (real) 1849 (cur) 2048 (prev)
TTFCPUI on http://www.rakuten.ne.jp/ - 21867 (real) 14376 (cur) 13257 (prev)
TTFCPUI on https://vine.co/ - 7685 (real) 8276 (cur) 6822 (prev)
 
 ------- Regression Summary -------
TTI on https://vine.co/ - 7685 (real) 8572 (cur) 7846 (prev)
TTFCPUI on http://www.verizonwireless.com/ - 12468 (real) 14246 (cur) 13945 (prev)
TTI on http://www.skype.com/ - 7108 (real) 7691 (cur) 7554 (prev)
TTI on http://www.orange.fr/ - 14159 (real) 15495 (cur) 15250 (prev)
 
 ------- Bucket Summary -------
Good:      245   ↔  2
OK:        179   ↔  1
Bad:       70    ↓  1
 
 ------- % Error Summary -------
p50:       20.4% ↓  0.2%
p90:       55.4% ↓  0.6%
p95:       78.1% ↔  0.0%
 
 
Comparing to master computed values...
❌  FAIL    31 change(s) between expected and computed!

@patrickhulce
Collaborator

Awesome job @warrengm! This is much, much better accuracy-wise. Was it just the move to the second pass that improved things since I took a look?

I'd say we can upgrade this from a draft then, get some tests in, split off the Set changes, and re-analyze the run time to see if it's workable without major algo changes 👍

@warrengm
Contributor Author

I've been busy with some other work, but I've started working on unit tests now. In the meantime, here is the result of some performance testing in aggregate. There don't seem to be any major performance issues.

Running on all assets

Master:

multitime -q -n 40 node lighthouse-core/scripts/lantern/run-on-all-assets.js 
===> multitime results
1: -q node lighthouse-core/scripts/lantern/run-on-all-assets.js
            Mean        Std.Dev.    Min         Median      Max
real        77.964      2.622       73.685      77.929      83.838      
user        102.721     1.762       99.885      102.474     106.364     
sys         8.601       0.201       8.033       8.602       9.174   

My branch:

multitime -q -n 40 node lighthouse-core/scripts/lantern/run-on-all-assets.js 
===> multitime results
1: -q node lighthouse-core/scripts/lantern/run-on-all-assets.js
            Mean        Std.Dev.    Min         Median      Max
real        75.776      1.630       73.086      75.526      81.881      
user        101.777     1.132       99.776      101.645     105.934     
sys         8.665       0.215       8.226       8.660       9.254  

mlb.com

Master

===> multitime results
            Mean        Std.Dev.    Min         Median      Max
real        1.028       0.056       0.922       1.038       1.123       
user        1.354       0.053       1.259       1.344       1.446       
sys         0.104       0.023       0.054       0.101       0.16

My branch

===> multitime results
            Mean        Std.Dev.    Min         Median      Max
real        1.008       0.060       0.882       1.016       1.132       
user        1.359       0.051       1.260       1.363       1.479       
sys         0.095       0.019       0.055       0.095       0.141 

weather.com

Master

            Mean        Std.Dev.    Min         Median      Max
real        1.163       0.103       0.996       1.169       1.500       
user        1.515       0.087       1.351       1.523       1.716       
sys         0.100       0.015       0.073       0.096       0.136   

My branch

            Mean        Std.Dev.    Min         Median      Max
real        1.134       0.077       0.998       1.132       1.316       
user        1.517       0.067       1.367       1.512       1.665       
sys         0.097       0.018       0.053       0.096       0.152      

Collaborator

@patrickhulce patrickhulce left a comment


great news and awesome job @warrengm! this tentatively LGTM!

let's undraft it and let it spin on travis for the accuracy+timing diff there too

package.json Outdated
@@ -134,6 +134,7 @@
 "configstore": "^3.1.1",
 "cssstyle": "1.2.1",
 "details-element-polyfill": "^2.4.0",
+"global": "^4.4.0",
Collaborator


mistake?

Contributor Author


Reverted

@patrickhulce patrickhulce marked this pull request as ready for review November 18, 2019 20:17
Collaborator

@patrickhulce patrickhulce left a comment


Thanks so much for the huge contribution here @warrengm! Your patience throughout has been very much appreciated too :D

Everything looks great locally for me! 🎉

@patrickhulce patrickhulce merged commit 6c44281 into GoogleChrome:master Nov 25, 2019