Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

feat(predictive-perf): add CPU estimation #3162

Merged
merged 2 commits into from
Sep 16, 2017
Merged

Conversation

patrickhulce
Copy link
Collaborator

@patrickhulce patrickhulce commented Aug 28, 2017

in the name of fast mode #2703

broken out just the CPU estimation portion from #2720

@@ -15,6 +15,9 @@ const Node = require('../gather/computed/dependency-graph/node.js');
const SCORING_POINT_OF_DIMINISHING_RETURNS = 1700;
const SCORING_MEDIAN = 10000;

// Any CPU task of 20 ms or more will end up being a critical long task on mobile
const CRITICAL_LONG_TASK_THRESHOLD = 20000;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

apologies but can we do this value in ms?

and just * 1000 down in its use? i figure its a bit nicer if all our const values have consistent units.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

sure thing, done

@@ -13,6 +13,7 @@ const DEFAULT_MAXIMUM_CONCURRENT_REQUESTS = 10;
const DEFAULT_RESPONSE_TIME = 30;
const DEFAULT_RTT = 150;
const DEFAULT_THROUGHPUT = 1600 * 1024; // 1.6 Mbps
const DEFAULT_CPU_MULTIPLIER = 5;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

where does this number come from?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

it was the multiplier from emulation adjusted upwards to compensate for not including all relevant tasks, but it's moved down closer to the emulation value in #3163

/**
* @return {boolean}
*/
isRenderBlocking() {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

* @return {string}
*/
get resourceType() {
return this._record._resourceType && this._record._resourceType._name;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

based on other comment... you can just compare against WebInspector.resourceTypes.* so this guy i dont think is needed..

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yep will do


case 'InvalidateLayout':
case 'ScheduleStyleRecalculation':
stackTraceUrls.forEach(url => addDependencyOnUrl(node, url));
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

does it makes sense to generally do this to all events that have a stack trace?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It does, but I wanted to explicitly call out which events we're relying on and do some experimentation as to which are the most informative for when we reduce the set.

tl;dr 90% of the value comes from EvaluateScript, XHRReadyStateChange, and ResourceSendRequest

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

sg. what about a default case for the switch statement?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm on board for #3163 once I measure impact :)

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

k. i think there are ~3 we agreed to punt to 3163.. i forget what they were tho.. hopefully you got a hold of em. :)

Copy link
Collaborator Author

@patrickhulce patrickhulce Sep 7, 2017

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

  1. Refactor the signature of simulateDownloadUntil to accept an object
  2. Add a _nodeState Node -> ProgressEnum map to track the nodes through estimator
  3. The resourceType business
  4. Network initiators improvements
  5. this :)

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

hero. 👍 💯

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

  1. verifyValidDepGraph :)

for (const evt of node.children) {
if (!evt.args.data) continue;

const url = evt.args.data.url;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

seems like it's possible for this to be undefined in some cases? (does ScheduleStyleRecalculation always a url? or EvaluateScript for eval()?)
i guess you could early return in addDependencyOnUrl to avoid some work?

Copy link
Collaborator Author

@patrickhulce patrickhulce Sep 5, 2017

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yep done

const url = evt.args.data.url;
const stackTraceUrls = (evt.args.data.stackTrace || []).map(l => l.url).filter(Boolean);

switch (evt.name) {
Copy link
Member

@paulirish paulirish Sep 2, 2017

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A few other events that mighttttttttt be worthwhile?

actually here's a list of main thread trace events with a url in the data:

   1 "name":"CommitLoad"
   8 "name":"EvaluateScript"
  26 "name":"FunctionCall"
   1 "name":"InvalidateLayout"
   2 "name":"Layout"
 179 "name":"PaintImage"
   7 "name":"ParseHTML"
   1 "name":"RenderFrameImpl::didStartProvisionalLoad"
  61 "name":"ResourceSendRequest"
   3 "name":"ScheduleStyleRecalculation"
  12 "name":"TimerInstall"
   7 "name":"TimerRemove"
   1 "name":"TracingStartedInPage"
   2 "name":"UpdateLayoutTree"
   1 "name":"XHRLoad"
   4 "name":"XHRReadyStateChange"
   9 "name":"v8.compile"

# i got this via this command on one of my traces:
# ag '61815.*?775.*?"name":".*?".*?"url":".*?"' news.google.com_2017-09-01_18-14-51-0.trace.json -o  | ag -o '"name":".*?"' | sort | uniq -c
  • ParseHTML seems good
  • PaintImage seems good?
  • UpdateLayoutTree sometimes has a stack trace, which indicates a forced layout. Not sure if the semantics of that matches...
  • DecodeImage and ResizeImage i think happen off the main thread..

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm hesitant to add reliance on PaintImage since it seems to have a larger "unthrottled view obfuscates relationships" effect, i.e. the most common reason PaintImage is included in this task is because the image happened to finish downloading then not because this task necessarily relied on the image being available.

UpdateLayoutTree, ParseHTML I can add 👍

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

SGTM. as discussed offline, right now the graph is identifying which work is DEFINITELY required before each node. Images don't really fall into this definition.
Later work could introduce a pessimistic graph where relationships like these images being painted are included. But i agree it's unclear that would add value.

return node.endTime <= fmp;
if (node.endTime > fmp) return false;
// Include CPU tasks that performed a layout
if (node.type === Node.TYPES.CPU) return node.didPerformLayout();
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

btw what's special about these?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

they're included in pessimistic estimate because they frequently capture large CPU work that was needed to render something meaningful and the resources that might not have been render-blocking that were required to perform it. example: a client-side rendered page with a large script in the body

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

gotcha.. so i guess this is more semantically like node.containsMeaningfulWork()

.. and in which case i'm wondering if there are any other work payloads besides layout that also fit into the definition. Style/Paint probably do, but I can imagine including them wouldn't change the results so much...

@patrickhulce
Copy link
Collaborator Author

FYI it's rebased and easier to read now :)

*/
static linkNetworkNodes(rootNode, networkNodeOutput) {
networkNodeOutput.nodes.forEach(node => {
const initiators = PageDependencyGraphArtifact.getNetworkInitiators(node.record);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

so there is a case where these initiators kind of lie to you, as far as dependencies are concerned.

example:

  • a page has a stylesheet. in that stylesheet are background images and webfonts that will be required.
  • the styles have not be calculated yet, but they are downloaded. (yes, chrome should probably try to recalc style/scan/preload these resources immediately after download, but it doesn't)
  • an unrelated script on the page runs and forces layout. style recalc is forced because layout.
  • recalc runs and the requests for images and fonts begin.
  • their initiators have a stack trace including this unrelated script (and whatever other unrelated scripts are in the call stack)

unfortunately this happens pretty often.


to avoid misattributing the dependency i think we'd want to exclude a stack trace if the request began during a StyleRecalc main thread event. :)

the good news is brendan and i couldn't think of another case where the call stack shouldn't be considered dependencies.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm a little lost on which initiators you're talking about here, so you're saying that image and font network records have in their initiator information a stack trace that's not relevant and so I should cross-reference that against CPU task information to see if we're in a stylerecalc? This feels like the initiator should be fixed instead though because the initiator isn't the script it's the stylesheet, right?

Copy link
Member

@paulirish paulirish Sep 12, 2017

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

image and font network records have in their initiator information a stack trace that's not relevant

yes correct

I should cross-reference that against CPU task information to see if we're in a stylerecalc

yah that's the solution that comes to mind.

This feels like the initiator should be fixed instead though because the initiator isn't the script it's the stylesheet, right?

If we want to be really specific there are two initiator-type like ideas:

  1. the URL source resource (bad name sorry). There is a specific resource that contains that URL (like a stylesheet). Obviously a request can't be made to it without this file being present.
  2. the initiating task. - the main thread activity that caused the browser to request this file. There is sometimes a JS callstack associated with this, but other times it's an async task like the browser just catching up to styles it hasn't computed yet.

Oftentimes 1 and 2 point to the same url. (a script requesting another script). But this forced layout case is the primary example of when they diverge. IMO developers want both pieces of information.

I'd tweak your statement to say the initiator should be fixed via breaking into the two components. Then we could clearly ask for the URL source resource and it'd give us a stylesheet rather than whatever JS forced a stylerecalc to happen.


starbucks.com is an example of this behavior. see the first google webfont's initiator is a line of JS that calls offsetWidth.

screen shot 2017-09-12 at 10 37 44 am

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ahhhhhh I see, gotcha. Yeah then let's dig into this with the rest of the pending initiator improvements for #3163

Copy link
Member

@brendankenny brendankenny left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

review!

mostly style stuff

* @param {!Array<TraceEvent>=} children
*/
constructor(parentEvent, children = []) {
super(`${parentEvent.tid}.${parentEvent.ts}`);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

pull this out as a named variable to make it clearer what's being constructed?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

done

class CPUNode extends Node {
/**
* @param {!TraceEvent} parentEvent
* @param {!Array<TraceEvent>=} children
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

"children" is a bit confusing in context of an object in a different hierarchy. I don't have a great alternative suggestion, though. childEvents might be clearer while not being too far off? subEvents?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

childEvents sg 👍

}

/**
* @param {!Node} node
* @param {!Object} values
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

real def if possible

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

done


return rootNode;
}

/**
* @param {!Node} rootNode
* @return {number}
* @return {}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

also need a jsdoc string. estimateGraph is less self explanatory vs computeGraphDuration

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yeah along with the other host of refactors for post #3163, estimator is just wrong at this point, simulator makes a lot more sense based on what it's evolved into since the time is easily achieved from the node information

* @return {!Node}
*/
static createGraph(traceOfTab, networkRecords) {
const networkNodeOutput = this.getNetworkNodes(networkRecords);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: static, so should use PageDependencyGraphArtifact.getNetworkNodes (and for below)

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

though not sure why all these are static but Estimator is not :)

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Estimator stores and tracks lots of state but these do not :), done

@@ -254,22 +321,28 @@ class Estimator {

/**
* Estimates the time taken to process all of the graph's nodes.
* @return {number}
* @return {{timeInMs: number, nodeTiming: !Map<!Node, Object>}}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Object?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

fixed

// add root node to queue
this._markNodeAsInQueue(rootNode, totalElapsedTime);

// loop as long as we have nodes in the queue or currently in process
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

progress :)

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

heh done :)

@@ -148,21 +197,31 @@ class Estimator {
* @param {number} totalElapsedTime
*/
_startNodeIfPossible(node, totalElapsedTime) {
if (node.type !== Node.TYPES.NETWORK) return;
if (node.type === Node.TYPES.CPU) {
// Start a CPU task if there's no other CPU task in process
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

for clarity, does this mean for now e.g. all timer callbacks will be run as soon as possible as their dependency URLs are loaded, regardless of when they're actually scheduled? along with other tasks scheduled by chrome itself

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

correct, timer callbacks are available to be processed as soon as their dependencies have been met (including their installing task), opportunity for future improvement could be to require a minimum time between the installer and the callback

@@ -221,9 +288,17 @@ class Estimator {
* @param {number} totalElapsedTime
*/
_updateProgressMadeInTimePeriod(node, timePeriodLength, totalElapsedTime) {
const timingData = this._nodeTiming.get(node);
const isFinished = timingData.estimatedTimeElapsed === timePeriodLength;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

feels like one and only one node should be finishing per estimateWithDetails loop to make reasoning about progress easier (if thinking of it as a sweep line), but you may be thinking differently

one option: _findNextNodeCompletionTime could be _findNextCompletedNode, and _updateProgressMadeInTimePeriod could finish only that one and use its timing for totalElapsedTime

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

because the network logic happens in essentially round trip blocks rather than on a packet simulation level, it's possible to have multiple requests finishing at exactly the same time and get updated all at once and this simplified things to not have to worry about 0 elapsed time

if you think that's confusing though I'm open to a future refactor since that would line up nicely with information stored on each node

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

if you think that's confusing though I'm open to a future refactor since that would line up nicely with information stored on each node

yeah, I think it would generally be easier to reason about if there was only one node finishing at any one point (where that point is defined by actual time + some ordering of the nodes)


// Try to add all its dependents to the queue
for (const dependent of node.getDependents()) {
// Skip this node if one of its dependencies hasn't finished yet
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

// Skip dependent node if one...

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

done

@brendankenny
Copy link
Member

The result of things seems like a mixture of parallel data structures (e.g. nodes and their timing information). Would it makes sense to have timing information right on the nodes? And then the different estimated dependency graphs would be clones with their own timing information from each parallel reality of different network/cpu conditions.

@patrickhulce
Copy link
Collaborator Author

The result of things seems like a mixture of parallel data structures (e.g. nodes and their timing information). Would it makes sense to have timing information right on the nodes? And then the different estimated dependency graphs would be clones with their own timing information from each parallel reality of different network/cpu conditions.

Yeah Paul and I talked this over a bit in the network PR. The primary reason for avoiding attaching the information to the node was that the timing data is an artifact of the specific simulation and not an intrinsic property of the graph which is meant to reflect the actual page load that was observed. The practicality of throwing the info onto the node though is pretty compelling, so maintaining that distinction with a get/setSimulatedData might be worth it in a refactor PR

Copy link
Member

@brendankenny brendankenny left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah Paul and I talked this over a bit in the network PR. The primary reason for avoiding attaching the information to the node was that the timing data is an artifact of the specific simulation and not an intrinsic property of the graph which is meant to reflect the actual page load that was observed. The practicality of throwing the info onto the node though is pretty compelling, so maintaining that distinction with a get/setSimulatedData might be worth it in a refactor PR

yeah, I see it both ways. I think it would be pretty sweet to have different copies of the graph around, each loaded under different conditions, where the original's conditions just happen to be the real world. This would be great for asking questions about just that version of the graph and e.g. visualizations.

We also might (will?) get to the point where the dependency graph's topology can also be speculative (e.g. what would happen if I preloaded this resource instead of waiting for its initiators to initiate it), so ergonomics of the data structure may change yet again :)

@@ -221,9 +288,17 @@ class Estimator {
* @param {number} totalElapsedTime
*/
_updateProgressMadeInTimePeriod(node, timePeriodLength, totalElapsedTime) {
const timingData = this._nodeTiming.get(node);
const isFinished = timingData.estimatedTimeElapsed === timePeriodLength;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

if you think that's confusing though I'm open to a future refactor since that would line up nicely with information stored on each node

yeah, I think it would generally be easier to reason about if there was only one node finishing at any one point (where that point is defined by actual time + some ordering of the nodes)

Copy link
Member

@brendankenny brendankenny left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sorry for the delay

LGTM 🕸 🖥 🐎

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants