Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

core(perf): add timing instrumentation to measure runtime perf #3745

Merged
merged 40 commits into from
Oct 25, 2018

Conversation

paulirish
Copy link
Member

This work fixes #2304 (and is mentioned in #3719)

We make use of marky which generates user timing entries for our instrumentation. We then use a variant of tdresser/performance-observer-tracing to create a trace from those measures.

usage:

$ lighthouse https://flipkart.com --output=json --output-path=flipkart.json

# ........

$ yarn timing flipkart.json

  > Timing trace file saved to: /Users/paulirish/code/lighthouse/flipkart.json.run-timing.trace.json
  > Open this file in chrome://tracing

✨  Done in 0.23s.

image


Since lighthouse-logger gets some changes you'll need to npm link it to absorb them.

Here's the trace in the screenshot f you want to poke around it: flipkart.json.run-timing.trace.json.txt

Copy link
Collaborator

@patrickhulce patrickhulce left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

awesome awesome stuff 💣 ⏲ 💥

@@ -97,7 +97,9 @@ class Driver {
* @return {!Promise<null>}
*/
connect() {
return this._connection.connect();
const status = {str: 'Connecting to browser', id: 'connect'};
log.log('status', status.str, status.id);
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

log.time?

@@ -269,6 +269,7 @@ class Config {
* @param {string=} configPath The absolute path to the config file, if there is one.
*/
constructor(configJSON, configPath) {
log.marky.mark('config-create');
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

i'd prefer log.time here :)

@@ -104,6 +106,16 @@ class Log {
Log._logToStdErr(`${prefix}:${level || ''}`, [method, snippet]);
}

static time({str, id, args=[]}, level='log') {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

how do you feel about msg instead of str?

lighthouseResults.timing.total = endTime - startTime;
lighthouseResults.timing.total = totalEntry.duration;

fs.writeFileSync('generateTrace.html', `
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

seems like it needs fixin :)

@@ -35,15 +37,25 @@ module.exports = function(url, flags = {}, configJSON) {
// Use ConfigParser to generate a valid config file
const config = new Config(configJSON, flags.configPath);

log.marky.mark('ConnectionSetup');
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

log.time?

@@ -25,13 +25,15 @@ class Runner {

// List of top-level warnings for this Lighthouse run.
const lighthouseRunWarnings = [];
log.marky.mark('runner.run');
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

log.time?

@@ -151,6 +160,12 @@ class Runner {
score = report.score;
}

log.timeEnd(status);
const timings = {};
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: let's just keep as timing


const jsonStr = `
{ "traceEvents": [
${events.map(evt => JSON.stringify(evt)).join(',\n')}
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

haha how about JSON.stringify({traceEvents: events}, null, 2) :)

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

well the benefit of stringifying them individually is 1 line per trace event:

image

sooooooo........

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

alright alright

@paulirish
Copy link
Member Author

updated.

@@ -234,13 +252,20 @@ class GatherRunner {
.then(_ => recordTrace && driver.beginTrace(options.flags))
// Navigate.
.then(_ => GatherRunner.loadPage(driver, options))
.then(_ => log.log('statusEnd', status));
.then(_ => log.timeEnd(status, 'log'));
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

the level is log? that's weird haha

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

shrug. i suppose its equivalent level is actually "info"/"informational", but that's otherwise hidden as an implementation detail of lighthouse-logger.

Copy link
Collaborator

@patrickhulce patrickhulce left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

impl looks good to me!

It might be worth noodling on the timing IDs so a hierarchy is a little more self-evident.

a sample of what this might look like with namespaces colon/hyphen concatenated?

lh:init:config:create
lh:gather:beforePass:Styles
lh:audit:unused-css-rules

lh
|- init
    |- config
    |- connect
|- runner
    |- gathering
    |- auditing
|- gather
    |- setup
    |- beforePass:<gatherer name>
    |- load
    |- pass:<gatherer name>
    |- afterPass:<gatherer name>
    |- <gatherer name> *not part of trace, just timing object*
|- audit
    |- <audit name>
|- report
    |- ...
|- cleanup

@@ -3,6 +3,7 @@
"version": "1.0.1",
"license": "Apache-2.0",
"dependencies": {
"debug": "^2.6.8"
"debug": "^2.6.8",
"marky": "^1.2.0"
Copy link
Collaborator

@patrickhulce patrickhulce Nov 6, 2017

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

lighthouse depends on lighthouse-logger explicitly so we don't need to update root package.json too or should we add in root too?

@paulirish
Copy link
Member Author

paulirish commented Nov 15, 2017

  • add checks for asserting nesting & never-ending trace events

@paulirish
Copy link
Member Author

updated based on feedback.

image
zoom
image
sorted by selftime
image

example generated trace: example.timing.json.zip

Copy link
Collaborator

@patrickhulce patrickhulce left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

lgtm!

still somewhat nervous about the marky dep being in logger, we'll need to publish logger and bump our dep on it before publishing next LH I guess?

@@ -0,0 +1,134 @@
/**
* @license Copyright 2016 Google Inc. All Rights Reserved.
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

2017?

@@ -269,6 +269,8 @@ class Config {
* @param {string=} configPath The absolute path to the config file, if there is one.
*/
constructor(configJSON, configPath) {
const status = {msg: 'Create config', id: 'config-create'};
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

lh:init:config maybe?

Copy link
Member

@brendankenny brendankenny left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this is looking good.

I'm still wondering about nesting, though. It seems like that's ok (vs overlapping)? e.g. a gatherer time is inside gatherRunner is inside runner is inside total

If that's ok, should we rely more on nesting for naming context rather than do it manually? e.g. a gatherer doesn't need all of lh:gather:pass:gather-name because it's already nested in those things.

@@ -207,5 +219,6 @@ class Log {
}

Log.events = new Emitter();
Log.marky = marky;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

since marky is a singleton which will persist between runs, maybe we should wrap in a method on Log to also clear? like

static getTimeEntries() {
  const entries = marky.getEntries();
  marky.clear();
  return entries;
}

@@ -71,6 +74,8 @@ class Runner {
if (validPassesAndAudits) {
// Set up the driver and run gatherers.
opts.driver = opts.driverMock || new Driver(connection);
run = run.then(_ => log.timeEnd(runnerStatus));
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

doesn't this need to end in all execution paths?

const path = require('path');

/**
* Technically, it's fine for usertiming measures to overlap, however non-async events make
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not sure I understand overlapping vs async events here. Seems like they are separate concerns?

traceFilePath = path.resolve(process.cwd(), 'run-timing.trace.json');
}
fs.writeFileSync(traceFilePath, jsonStr, 'utf8');
process.stdout.write(`
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

any reason not to use console.log? (with eslint line disable)

*/
function saveTraceFromCLI() {
const printErrorAndQuit = msg => {
process.stderr.write(`ERROR:
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

console.error?

* Takes filename of LHR object. The primary entrypoint on CLI
*/
function saveTraceFromCLI() {
const printErrorAndQuit = msg => {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

pull out as a top level function?

Copy link
Member

@brendankenny brendankenny left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

should smoke tests assert at least some timing stuff exists? Or one of the runner/index tests that runs (but no gathering) has all the timings that make sense? It seems like it could be easy to lose a time or a timeEnd or misplace in a promise chain or something, and not notice it happened

@@ -0,0 +1,134 @@
/**
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

needs tests :)

@paulirish
Copy link
Member Author

Since we're taking a closer look at the LHR these days.. here's what the timing object looks as of this branch:

image

Looking at it with today's eyes, I have two proposals:

  1. remove 'entryType' from the entries
  2. only ship a few select high-level timings we'd be comfortable including in LHR-Lite (v1 or v2) (e.g. loadBlank, loadTargetPage, gathering, auditing, total). This means these other timings are not AS accessible via JSON_EXTRACT, though it may be doable?. (No idea if bigquery supports that).

@patrickhulce
Copy link
Collaborator

img

Copy link
Member

@brendankenny brendankenny left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⏱ 📏 :its_happening:
Last bit, otherwise LGTM!


// Summarize all the timings and drop onto the LHR
try {
log.timeEnd(runnerStatus);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

marky looks like it will only throw if runnerStatus.id hadn't been started, but that looks like that isn't possible here. Still need the try/catch?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think it was happening in a -GA run?

@@ -0,0 +1,57 @@
/**
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

we should really organize scripts/ at some point, but not necessary in this PR :)


const fs = require('fs');
const path = require('path');
const {createTraceString} = require('../lib/timing-trace-saver');
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

can this and timing-trace-saver.js just be the same file? (we can do in a follow up)

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

it was at one point, but i separated it to write tests.
i mean i guess we could have lighthouse-core/test/scripts/... but it seemed weird. :)

typings/artifacts.d.ts Outdated Show resolved Hide resolved
typings/artifacts.d.ts Show resolved Hide resolved
// Any Timing entries in saved artifacts will have a different timeOrigin than the auditing phase
// The `gather` prop is read later in generate-timing-trace and they're added to a separate track of trace events
artifacts.Timing.forEach(entry => (entry.gather = true));
}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

should this happen in gather-runner.js? After line 399

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We only want to break the trace into two tracks if they're done at different times. (Classic -G then -A workflow)

So that's why we assign here to force a diff track.

Copy link
Member Author

@paulirish paulirish left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

thx patrick very much for pushing this!


const fs = require('fs');
const path = require('path');
const {createTraceString} = require('../lib/timing-trace-saver');
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

it was at one point, but i separated it to write tests.
i mean i guess we could have lighthouse-core/test/scripts/... but it seemed weird. :)


// Summarize all the timings and drop onto the LHR
try {
log.timeEnd(runnerStatus);
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think it was happening in a -GA run?

@patrickhulce
Copy link
Collaborator

no problemo @paulirish I'll let you finish up :)

@connorjclark
Copy link
Collaborator

connorjclark commented Oct 24, 2018

RE: Error: Failed to execute \'measure\' on \'Performance\': The mark \'start lh:runner:run\' does not exist.

Marky did us wrong :'(
image

Marky is stubbing a global "performance" variable: lighthouse/node_modules/marky/dist/marky.js

One glaring difference ...

performance is defined (browser)

exports.clear = function () {
perf.clearMarks();
perf.clearMeasures();
};

not defined (node)

exports.clear = function () { entries = []; };

gather-runner clears the marks before the first runner mark ends. Hence the error.

  1. Expose time entires in lighthouse-logger
  2. Don't clear the marks in gather-runner. Do extra processing in runner to dedupe marks.

@connorjclark connorjclark mentioned this pull request Oct 24, 2018
2 tasks
@connorjclark
Copy link
Collaborator

Sent a PR to marky.

@paulirish
Copy link
Member Author

lgtm

Copy link
Member

@brendankenny brendankenny left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

last things, then LGTM ⏱⏲⌚️:)

package.json Outdated
@@ -53,6 +53,8 @@
"fast": "yarn start --disable-device-emulation --throttlingMethod=provided",
"deploy-viewer": "cd lighthouse-viewer && gulp deploy",
"bundlesize": "bundlesize",
"plots-smoke": "bash plots/test/smoke.sh",
"timing": "node lighthouse-core/scripts/generate-timing-trace.js",
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

can we call this something more? timing-trace?

@@ -119,6 +121,11 @@ class Runner {
categories = ReportScoring.scoreAllCategories(runOpts.config.categories, resultsById);
}

log.timeEnd(resultsStatus);

// Summarize all the timings and drop onto the LHR
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

should drop this comment now

...timingEntriesFromArtifacts,
...timingEntriesFromRunner,
].map(entry => /** @type {[string, PerformanceEntry]} */ ([entry.name, entry]));
const timingEntries = Array.from(new Map(timingEntriesKeyValues).values());
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

would be good to get a comment in here about why we're doing this/need to do this. Basically this handles both the auditMode case where gatherer entries need to be merged in and the gather/audit case where timingEntriesFromRunner contains all entries from this run, including those also in timingEntriesFromArtifacts

@@ -304,6 +306,16 @@ declare global {
}[];
}

export interface MeasureEntry {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

we should maybe make this

export interface MeasureEntry extends PerformanceEntry {
  /** Whether timing entry was collected during artifact gathering. */
  gather?: boolean;
}

to make the intended interchangeability clearer

@connorjclark
Copy link
Collaborator

@brendankenny done

Copy link
Member

@brendankenny brendankenny left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

waiting for the green, but LGTM!

@paulirish we should probably add a "how to profile" section to CONTRIBUTING or wherever, but can do that at a later time :)

@brendankenny
Copy link
Member

:rip_van_winkle:

@brendankenny brendankenny merged commit 888bd6d into master Oct 25, 2018
@brendankenny brendankenny deleted the timing branch October 25, 2018 18:15
@patrickhulce
Copy link
Collaborator

this merge deserves its own partay

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Measure the execution time of each task more accurately
5 participants