Collect pystats in subprocesses #263
The tricky thing for case two (benchmarks that spawn their own subprocesses, described below) is that we don't really have a way of knowing whether we've "caught" all of the Python subprocesses (especially the ones that use library code). We just have to either notice that the benchmark stats seem off, or know exactly what the benchmark is doing. I think the current default for stats builds is to not collect stats, and to require an environment variable to enable them at startup. Could we flip that: collect stats by default, but turn them off with an env var? That way all new subprocesses will collect stats by default, without us needing to do anything, and pyperf/pyperformance can explicitly disable stats collection at startup for its subprocesses and only turn it on around the invocation of the benchmark itself. The one catch is that the main pyperformance invocation needs to be called with the env var set (to avoid collecting stats from pyperformance itself). But that seems straightforward enough to deal with: our benchmarking infrastructure can be updated to set everything correctly, and we could update pyperformance/pyperf to exit with a helpful error message if it's called on a stats build without the env var.
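For concreteness, a minimal sketch of how that flipped-default flow could look inside the harness. It assumes a CPython `--enable-pystats` build (which exposes `sys._stats_on()`, `sys._stats_off()`, `sys._stats_clear()`, and `sys._stats_dump()`); the `PYTHONSTATS=0` opt-out convention is part of the proposal, not existing behavior, and the function names are illustrative:

```python
# Sketch only: the proposed "stats on by default" flow, not pyperf's
# actual API. Assumes a CPython --enable-pystats build, which exposes
# sys._stats_on()/_stats_off()/_stats_clear()/_stats_dump().
import os
import sys

IS_STATS_BUILD = hasattr(sys, "_stats_on")

def check_harness_startup():
    # The harness itself must be launched with stats disabled so we don't
    # record pyperformance's own execution. "PYTHONSTATS=0" is the
    # hypothetical opt-out env var proposed above.
    if IS_STATS_BUILD and os.environ.get("PYTHONSTATS") != "0":
        sys.exit("stats build: re-run the harness with PYTHONSTATS=0")

def run_benchmark(func):
    # Collect stats only around the benchmark invocation itself.
    if IS_STATS_BUILD:
        sys._stats_clear()  # discard anything recorded during startup
        sys._stats_on()
    try:
        return func()
    finally:
        if IS_STATS_BUILD:
            sys._stats_off()
            sys._stats_dump()  # write this process's stats out
```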
I'm a little worried this isn't great UX, but it's a balance, right? Just to throw an alternative out there: importing pyperf could monkeypatch subprocess creation so that child processes collect stats.
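A sketch of that alternative (purely illustrative; this is not something pyperf actually does): wrap `subprocess.Popen` so every child spawned after `import pyperf` inherits the stats env var.

```python
# Illustrative monkeypatch only, not pyperf's actual behavior. Every
# child process created via subprocess.Popen after this runs would have
# stats collection enabled via the env var.
import os
import subprocess

_OriginalPopen = subprocess.Popen

class _StatsPopen(_OriginalPopen):
    def __init__(self, *args, env=None, **kwargs):
        env = dict(os.environ if env is None else env)
        env["PYTHONSTATS"] = "1"  # enable pystats in the child process
        super().__init__(*args, env=env, **kwargs)

subprocess.Popen = _StatsPopen
```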
We're probably some of the only people who actually use the stats builds, so I doubt many people would be frustrated by the error message (or be surprised by the subprocess magic). I still prefer the error message, I think: not only is it a bit simpler and less error-prone, but I can't think of a way to do the subprocess solution without "leaking" an additional stats file for the parent process. I actually just thought of a third option... could the initial process just turn stats collection off for itself at startup?
Yeah, that's a good solution.
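A minimal sketch of that third option, under the assumption that it amounts to the parent process disabling its own collection early via the toggles available on `--enable-pystats` builds, so it never dumps a stats file of its own:

```python
# Assumption: the "third option" means the initial (parent) process
# shuts off its own stats collection at startup. Uses the toggles
# exposed on CPython --enable-pystats builds.
import sys

if hasattr(sys, "_stats_off"):
    sys._stats_off()    # stop collecting in this process
    sys._stats_clear()  # drop anything gathered during interpreter startup
```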
There are two ways in which "real work" that we'd want to measure ends up in a subprocess. Neither of these cases presently collects pystats in the subprocess.

1. Using the `pyperf.Runner.bench_command` API (e.g. `python_startup`). These should be handled automatically by this update to pyperf; we just need a new pyperf release and then to update pyperformance and bench_runner to use it.
2. The benchmark itself fires off a subprocess to run something like a webserver (e.g. `djangocms`). These benchmarks will need to be individually updated to enable pystats in the subprocess, under the right circumstances. It should basically amount to checking for `--hook pystats` on the command line and then setting the env var `PYTHONSTATS=1` in the subprocess (see the sketch below). I can't think of a way to do that automatically for all of these, but there aren't that many; we should just update them individually. (This also requires the pyperf update above, first.)

Cc: @brandtbucher
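As a rough sketch of the per-benchmark change in case 2 (the server module and `spawn_server` helper are made up for illustration; a real benchmark would use its existing argument parsing rather than scanning `sys.argv`):

```python
# Schematic per-benchmark change for case 2. The helper and the server
# module name are hypothetical.
import os
import subprocess
import sys

def spawn_server(args, pystats):
    env = dict(os.environ)
    if pystats:
        # On an --enable-pystats build, PYTHONSTATS=1 makes the child
        # interpreter collect stats from startup onward.
        env["PYTHONSTATS"] = "1"
    return subprocess.Popen([sys.executable, *args], env=env)

if __name__ == "__main__":
    # Detect that the benchmark was invoked with "--hook pystats" and
    # forward that choice to the subprocess doing the real work.
    argv = sys.argv[1:]
    pystats = any(
        argv[i] == "--hook" and argv[i + 1] == "pystats"
        for i in range(len(argv) - 1)
    )
    proc = spawn_server(["-m", "my_web_server"], pystats)  # hypothetical module
    proc.wait()
```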