Optimize formatter with hot cache #8135

Closed
konstin opened this issue Oct 23, 2023 · 2 comments
Labels
formatter Related to the formatter

Comments

@konstin
Member

konstin commented Oct 23, 2023

There are a lot of optimization opportunities in the formatter when the cache is hot (the example is Apache Airflow, with rayon removed):

  • Sorting the results takes a long time, and so does building the result vec, even though we know its approximate size
  • Don't read source files in formatter when there is a cache hit #8132
  • Deserializing the cache takes noticeable time, even though we only need about 4k paths plus timestamps
  • We first iterate over each directory and only check the timestamps later; could those two steps be combined?
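
To make the first and last points concrete, here is a minimal sketch, not ruff's actual code. It assumes the discovered files arrive as `(path, mtime)` pairs and that the cache is a plain map from path to the previously recorded mtime. `Outcome` and `classify` are hypothetical names invented for illustration. The result vec is pre-sized from the cache, and the timestamp check happens in the same pass as the iteration:

```rust
use std::collections::HashMap;
use std::path::PathBuf;

/// Hypothetical outcome for one file; not ruff's actual type.
#[derive(Debug, PartialEq, Eq)]
pub enum Outcome {
    /// The cached timestamp matches, so the file can be skipped.
    Unchanged,
    /// The file is new or was modified, so it has to be formatted.
    NeedsFormat,
}

/// Classifies every discovered file in a single pass.
///
/// `cache` maps a path to the modification time (in seconds) recorded on the
/// previous run. Both the layout and the units are assumptions.
pub fn classify<I>(discovered: I, cache: &HashMap<PathBuf, u64>) -> Vec<(PathBuf, Outcome)>
where
    I: IntoIterator<Item = (PathBuf, u64)>,
{
    // The previous run saw roughly the same files, so the cache size is a
    // reasonable capacity hint and avoids repeated reallocation.
    let mut results = Vec::with_capacity(cache.len());
    for (path, mtime) in discovered {
        // Check the timestamp while iterating instead of in a second pass.
        let outcome = match cache.get(&path) {
            Some(&cached) if cached == mtime => Outcome::Unchanged,
            _ => Outcome::NeedsFormat,
        };
        results.push((path, outcome));
    }
    results
}
```

With a hot cache, almost every entry would come back as `Outcome::Unchanged`, so the loop body is little more than one hash lookup per file.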

[Image: flamegraph]

Instructions for generating the flamegraph:

konstin added the formatter label on Oct 23, 2023
@charliermarsh
Member

Fixing the sort here.

charliermarsh added a commit that referenced this issue Oct 24, 2023
## Summary

Related to #8135.

If we're not printing a `--diff`, or a summary of `--check` changes, we
can avoid sorting the list of results. Further, when sorting, we only
need to sort a small subset of the entries, in the common case (i.e., in
general, it's much more likely that a file is formatted than not).
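
As a rough illustration of the idea above (a sketch, not the code from this
commit): `FileResult` and `paths_to_report` are hypothetical names, and
`print_report` stands in for "a `--diff` or a `--check` summary was requested".

```rust
/// Hypothetical per-file result; not ruff's actual type.
#[derive(Debug, Clone, PartialEq, Eq)]
pub struct FileResult {
    pub path: String,
    pub changed: bool,
}

/// Returns the sorted paths to report, or nothing when no report is requested.
pub fn paths_to_report(results: &[FileResult], print_report: bool) -> Vec<&str> {
    if !print_report {
        // Nothing will be printed, so skip the filtering and sorting entirely.
        return Vec::new();
    }
    // Only changed files are reported, which is usually a small subset.
    let mut changed: Vec<&str> = results
        .iter()
        .filter(|r| r.changed)
        .map(|r| r.path.as_str())
        .collect();
    // Sort the subset rather than the full list of results.
    changed.sort_unstable();
    changed
}
```

Filtering is linear in the number of files, while the sort only pays for the
changed entries, which is where the saving in the common case would come from.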

## Test Plan

Local benchmarks suggest a 5-10% speedup on the cached behavior:

```
❯ hyperfine --warmup 3 "./target/release/ruff format ../airflow" "./target/release/sort format ../airflow"
Benchmark 1: ./target/release/ruff format ../airflow
  Time (mean ± σ):      70.3 ms ±   5.2 ms    [User: 52.1 ms, System: 59.0 ms]
  Range (min … max):    68.3 ms … 101.7 ms    42 runs

  Warning: Statistical outliers were detected. Consider re-running this benchmark on a quiet PC without any interferences from other programs. It might help to use the '--warmup' or '--prepare' options.

Benchmark 2: ./target/release/sort format ../airflow
  Time (mean ± σ):      66.0 ms ±   1.4 ms    [User: 48.3 ms, System: 58.4 ms]
  Range (min … max):    64.7 ms …  71.8 ms    44 runs

  Warning: Statistical outliers were detected. Consider re-running this benchmark on a quiet PC without any interferences from other programs. It might help to use the '--warmup' or '--prepare' options.

Summary
  './target/release/sort format ../airflow' ran
    1.07 ± 0.08 times faster than './target/release/ruff format ../airflow'
```
@charliermarsh
Member

We did a bunch of stuff here.
