Possibly use a known profiling format for timings ? #43804

Closed
lqd opened this issue Aug 11, 2017 · 10 comments
Labels
C-feature-request Category: A feature request, i.e: not implemented / a PR. T-compiler Relevant to the compiler team, which will review and decide on the PR/issue.


@lqd
Member

lqd commented Aug 11, 2017

With the new -Z trans-time-graph from #43506, soon to be joined by -Z profile-queries from #43345, and their impact on time-passes, would it make sense to output such timings in an existing profiling format instead of ad-hoc ones?

A couple formats come to mind:

  • the one used by the Gecko profiler and its frontend
  • or Chrome's tracing format + frontend

They both seem to support multiple processes/threads, and the frontend tools are pretty powerful in filtering, sorting, etc.

Chrome's format looks easy to generate, and Aras P had a couple good articles about how they use it at Unity:

Gecko's format and perf-html look more complex/complete and have a seriously impressive, polished UI (but might include some Firefox-related concepts, i.e. JS or C++ contexts one can see in the UI). To see how it looks, here's an example of a big trace from gecko+stylo (I think; I saw this in the Servo IRC channel).

I think both frontends could be used on perf.rlo. I know Chrome's trace viewer can be compiled to a single (huge) html file one can use outside of Chrome to trace json timing files. And I think perf-html is client-side only as well.
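
To give a sense of how little is needed for Chrome's format, here is a minimal sketch in Rust. It writes a JSON array of "complete" events (`"ph": "X"`), each with a start time `ts` and a duration `dur` in microseconds, which is one of the event shapes the trace viewer accepts. The `Span` struct, the `to_chrome_trace` and `escape_json` functions, and the `"pass"` category are hypothetical names chosen for illustration; none of them are rustc APIs, and this is not a proposed implementation.

```rust
/// One timed span, e.g. a compiler pass running on a given thread.
/// (Hypothetical type for illustration; not part of rustc.)
struct Span {
    name: &'static str,
    thread_id: u32,
    start_us: u64,
    duration_us: u64,
}

/// Escape the characters that must not appear raw inside a JSON string.
fn escape_json(s: &str) -> String {
    let mut out = String::with_capacity(s.len());
    for c in s.chars() {
        match c {
            '"' => out.push_str("\\\""),
            '\\' => out.push_str("\\\\"),
            c if (c as u32) < 0x20 => out.push_str(&format!("\\u{:04x}", c as u32)),
            c => out.push(c),
        }
    }
    out
}

/// Serialize spans as a JSON array of Chrome trace "complete" events.
/// `pid` and `tid` let the viewer group events by process and thread.
fn to_chrome_trace(spans: &[Span], pid: u32) -> String {
    let events: Vec<String> = spans
        .iter()
        .map(|s| {
            format!(
                "{{\"name\":\"{}\",\"cat\":\"pass\",\"ph\":\"X\",\"ts\":{},\"dur\":{},\"pid\":{},\"tid\":{}}}",
                escape_json(s.name),
                s.start_us,
                s.duration_us,
                pid,
                s.thread_id
            )
        })
        .collect();
    format!("[{}]", events.join(","))
}
```

Calling `to_chrome_trace` on a slice of spans and writing the result to a `.json` file should give something the trace viewer can load, with one row per `tid`.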

(As an aside, it could also be interesting to output such formats from a more fine-grained successor to time-passes, and not just for the 2 HTML outputs mentioned above. I think this would be extremely useful for seeing and tracking hotspots on perf.rlo, which IIRC will probably switch to only showing totals without per-pass data.)

cc @michaelwoerister, @eddyb, @nikomatsakis

@michaelwoerister
Member

With respect to -Ztrans-time-graph this sounds like an excellent idea! I'm not yet sure whether -Ztime-passes can still be useful, even if we make it handle multiple threads. Due to the compiler's semi-lazy nature, all kinds of things can happen during a "pass". This needs some discussion. It might be better to just always use a real profiler.

@matthewhammer's -Z profile-queries is more specialized and its output could probably not be viewed/processed by a general purpose tool.

cc @Mark-Simulacrum @alexcrichton

@Mark-Simulacrum Mark-Simulacrum added C-feature-request Category: A feature request, i.e: not implemented / a PR. T-compiler Relevant to the compiler team, which will review and decide on the PR/issue. labels Aug 11, 2017
@arielb1
Contributor

arielb1 commented Aug 13, 2017

> It might be better to just always use a real profiler.

That would still have the same spillover problems. A real profiler would show that type-checking uses the time for type-checking + everything is forced. Similarly, -Z time-passes is still useful, because most of resolution happens in "name resolution", most of type-checking happens in "item-bodies checking".

@michaelwoerister
Member

> A real profiler would show that type-checking uses the time for type-checking + everything is forced.

What I mean is that a profiler collects way more information, which can be processed in various ways afterwards with something like flame graphs or perf-focus. This allows us to get the cost of some function, even if it is called on demand.
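
As a rough illustration of that kind of post-processing: given a set of sampled call stacks, the inclusive cost of a function can be estimated by counting the samples in which it appears anywhere on the stack, regardless of which pass happened to trigger it. The sketch below assumes the samples have already been symbolized into lists of function names; `inclusive_counts` is a hypothetical helper, not an existing tool.

```rust
use std::collections::{HashMap, HashSet};

/// For each function, count the samples in which it appears anywhere on the
/// stack (its inclusive cost, measured in samples).
/// Hypothetical helper; assumes stacks are already symbolized.
fn inclusive_counts<'a>(samples: &[Vec<&'a str>]) -> HashMap<&'a str, usize> {
    let mut counts: HashMap<&'a str, usize> = HashMap::new();
    for stack in samples {
        // Deduplicate frames so a recursive function is counted once per sample.
        let unique: HashSet<&'a str> = stack.iter().copied().collect();
        for function in unique {
            *counts.entry(function).or_insert(0) += 1;
        }
    }
    counts
}
```

With samples such as `["main", "typeck", "type_of"]` and `["main", "trans", "type_of"]` (made-up names), `type_of` is charged for both samples even though it was forced from two different passes.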

> Similarly, -Z time-passes is still useful, because most of resolution happens in "name resolution", most of type-checking happens in "item-bodies checking".

I see two options here:

  1. Try to make -Ztime-passes more useful again by forcing as many queries as possible during their "home passes" so that the statement above is mostly true.
  2. Improve our tooling around real profilers so that we can easily extract information from the set of profiler samples.

Option (2) seems more future-proof to me, although option (1) is more platform independent.

@arielb1
Contributor

arielb1 commented Aug 14, 2017

@michaelwoerister

Getting a flame graph of queries out of a normal profiler run on a rustc with debuginfo-lines is pretty easy.
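
One common route, sketched here under assumptions: collapse the profiler's sampled stacks into the "folded" text format that flame graph tools such as flamegraph.pl consume, with one line per unique stack, frames joined by `;`, and a trailing sample count. `fold_stacks` and its input shape (already-symbolized stacks, root frame first) are hypothetical; the thread does not specify a particular tool or pipeline.

```rust
use std::collections::BTreeMap;

/// Collapse sampled stacks into folded lines of the form "root;child;leaf N".
/// Hypothetical helper; assumes each stack lists frames from root to leaf.
fn fold_stacks(samples: &[Vec<&str>]) -> Vec<String> {
    // A BTreeMap keeps the output sorted by stack, which makes it deterministic.
    let mut counts: BTreeMap<String, usize> = BTreeMap::new();
    for stack in samples {
        *counts.entry(stack.join(";")).or_insert(0) += 1;
    }
    counts
        .into_iter()
        .map(|(stack, n)| format!("{} {}", stack, n))
        .collect()
}
```

Writing the returned lines to a file, one per line, should produce input that a flame graph generator can render.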

@michaelwoerister
Member

Yes, I know. That's why I'm not sure if we should put effort into maintaining -Ztime-passes.

@arielb1
Contributor

arielb1 commented Aug 14, 2017

It's useful for getting a quick review of how things are going, especially on "customer machines".

@michaelwoerister
Member

I agree, it's useful for that.

@steveklabnik
Member

It's been a few years; did anything come out of the discussion? Is this bug useful?

@Mark-Simulacrum
Member

This bug doesn't seem to be useful so I'm going to close - we're discussing profiling elsewhere as part of the self profile working group.

@michaelwoerister
Member

And we plan to support a format that both https://perf-html.io/ and the Chromium tools support.
