core(lantern): Remove min task duration from lantern graphs #9910
we tried this somewhere recently and it slowed down Lighthouse execution time significantly if we dropped the minimum completely. I can't find where that was though. @patrickhulce, do you remember?
ya some friends running LH tried this and it doubled total runtime, there are a lot of tasks under 10ms :/ #9627 (comment)
I think we do need to start including some tasks under our threshold, but we need to be more selective about it. we've already identified that the first paint/layout/parsehtml events should be saved regardless of length (#9627 (comment)), are there other specific patterns of tasks that we could be saving without removing the optimization entirely @warrengm that would help you out? :)
The other alternative here is to try to radically improve the runtime of simulation with large numbers of tasks. There's likely some low hanging fruit we could tackle to improve things and lower the threshold but I doubt there's enough to let us include all tasks, we run quite a few simulations per run.
can we look into parallelizing simulation? wasm / workers?
doing a certainly an interesting long-term solution to explore though as we increase the amount of analysis with js parsing and other heavy handed audit tasks
I'll look into this a bit more. I think it should be relatively easy to determine which tasks should be kept by checking which tasks end up in the unfiltered lantern graph between network nodes. But I don't have a good sense of how much that will reduce the size. Hopefully we can do a simple filter on task name without needing more complicated heuristics. I briefly considered adding some external controls for callers. For example, LCP simulations probably don't need the entire graph, but I'm not sure that's worth the complexity. Any algorithm optimizations SGTM, but I'm not so familiar with the LH codebase so I'll leave that to you all.
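To make the "simple filter on task name" idea concrete, here is a minimal sketch. It is not Lighthouse code: the `TaskSketch` shape, the `shouldKeepTask` helper, and the allowlist contents are assumptions made up for illustration, and which event names actually matter is the open question in this thread.

```typescript
// Hypothetical task shape for illustration; Lighthouse's real task objects differ.
interface TaskSketch {
  eventNames: string[]; // names of the trace events inside one top-level task
  durationMs: number;
}

// The 10 ms minimum discussed in this thread.
const THRESHOLD_MS = 10;

// Illustrative allowlist only (an assumption, not a vetted list).
const KEEP_EVENT_NAMES = ['TimerFire', 'XHRReadyStateChange', 'ResourceSendRequest'];

// Keep every long task, plus short tasks that contain an allowlisted event.
function shouldKeepTask(task: TaskSketch): boolean {
  if (task.durationMs >= THRESHOLD_MS) return true;
  return task.eventNames.some(name => KEEP_EVENT_NAMES.indexOf(name) !== -1);
}

const sampleTasks: TaskSketch[] = [
  {eventNames: ['FunctionCall'], durationMs: 2}, // short, no interesting event
  {eventNames: ['ResourceSendRequest'], durationMs: 1}, // short, but issues a request
  {eventNames: ['Layout'], durationMs: 25}, // long, kept regardless of name
];
const keptTasks = sampleTasks.filter(shouldKeepTask);
console.log(keptTasks.length); // 2 of the 3 sample tasks survive the filter
```

A name-based check like this keeps the per-task cost constant, at the price of missing any short task whose relevance is not captured by its event names.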
@patrickhulce What are the main factors in Lantern speed? For example, does the number of children in a I think most of the events we care about are children of a
Largely dominated by the number of nodes in the graph. A CPUNode is created for every toplevel main thread task above the threshold, so by keeping that threshold high we limit the number of nodes in the graph that can slow things down.
Only a marginal effect. It's used when cloning and for determining a multiplier but it's not nearly as dominant as node count.
Based on this and your original problem statement, it sounds like we basically only want to look at these nodes for the purposes of creating relationships between other network nodes. I definitely see how this could meaningfully improve accuracy, so how about this....
i.e.
becomes
instead of the situation today
Started working on (1). I made some changes so the version you looked at is slightly out of date now, but MLB and vine are still two of the most affected sites.
From what I can tell, the new short nodes are always pruned in my current implementation so the effect seems to be purely from introducing new edges.
I did some analysis by gradually introducing the new edges and got the following. My initial reaction is that the results seem acceptable. The biggest regression (in % terms) appears to be Vine, but the new paths between network nodes and
Adding short timer events
Adding short Timer, XHR, and ResourceSendRequest events
Adding in all new edges
Awesome job @warrengm! This is much much better accuracy-wise. Was it just the move to the second pass that improved things since I took a look? I'd say we can upgrade this from a draft then, get some tests in, split off the
I've been busy with some other work but I've started working on unit tests now. In the meantime, here is the result of some performance testing in aggregate. There don't seem to be any major performance issues.
Running on all assets
Master:
My branch:
mlb.com
Master
My branch
weather.com
Master
My branch
great news and awesome job @warrengm! this tentatively LGTM!
let's undraft it and let it spin on travis for the accuracy+timing diff there too
package.json
@@ -134,6 +134,7 @@
"configstore": "^3.1.1",
"cssstyle": "1.2.1",
"details-element-polyfill": "^2.4.0",
"global": "^4.4.0",
mistake?
Reverted
Co-Authored-By: Patrick Hulce <patrick.hulce@gmail.com>
Thanks so much for the huge contribution here @warrengm! Your patience throughout has been very much appreciated too :D
Everything looks great locally for me! 🎉
This PR removes the minimum task duration from the page dependency graph in order to retain necessary edges. For example, this fixes at least two cases of missing paths in the Lantern graph:
- A creates an iframe that issues request B
- A fires request B
In both cases the Lantern graph lacks the A -> B path whenever the intermediate CPU nodes are fast (< 10 ms). We have observed wrong simulation results due to these issues in the Publisher Ads Audits plugin.
I suspect this PR may reduce some variance in cases where sites have some tasks that waffle around the 10 ms threshold (i.e. a particular task may be non-deterministically included in the graph).
To mitigate run time, I minimize the graph size by pruning short tasks while inserting edges to retain paths between dependencies and dependents. For example, a -> b -> c becomes a -> c if b is a short task.
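The pruning step described above can be sketched roughly as follows. This is an illustration under stated assumptions, not the code in this PR: the `SketchNode` and `Edges` types and the `pruneShortTasks` function are hypothetical, the graph is assumed to be acyclic, and only CPU nodes under the 10 ms threshold are treated as prunable.

```typescript
// Hypothetical node shape for illustration; not Lighthouse's actual node classes.
interface SketchNode {
  id: string;
  type: 'network' | 'cpu';
  durationMs: number;
}

// Adjacency list: node id -> ids of the nodes that depend on it.
type Edges = Record<string, string[]>;

const MIN_TASK_DURATION_MS = 10;

function pruneShortTasks(nodes: SketchNode[], edges: Edges): {nodes: SketchNode[]; edges: Edges} {
  // Mark CPU nodes shorter than the threshold as prunable.
  const shortIds: Record<string, boolean> = {};
  nodes.forEach(node => {
    shortIds[node.id] = node.type === 'cpu' && node.durationMs < MIN_TASK_DURATION_MS;
  });

  // Collect the kept nodes reachable from `id` by walking through short tasks only.
  function keptDependents(id: string, seen: Record<string, boolean>, out: string[]): string[] {
    (edges[id] || []).forEach(next => {
      if (seen[next]) return;
      seen[next] = true;
      if (shortIds[next]) {
        keptDependents(next, seen, out); // skip over the short task
      } else {
        out.push(next); // stop at the first kept node on this path
      }
    });
    return out;
  }

  const keptNodes = nodes.filter(node => !shortIds[node.id]);
  const keptEdges: Edges = {};
  keptNodes.forEach(node => {
    keptEdges[node.id] = keptDependents(node.id, {}, []);
  });
  return {nodes: keptNodes, edges: keptEdges};
}

// a -> b -> c, where b is a 2 ms CPU task.
const pruned = pruneShortTasks(
  [
    {id: 'a', type: 'network', durationMs: 100},
    {id: 'b', type: 'cpu', durationMs: 2},
    {id: 'c', type: 'network', durationMs: 50},
  ],
  {a: ['b'], b: ['c'], c: []},
);
console.log(pruned.edges); // b is gone and a now points directly at c
```

The point of the rewiring is that node count, which dominates simulation time according to the discussion above, stays the same as with the old threshold while the A -> B dependency is no longer lost.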