perf: `RuleTable::any_enabled` #10971

MichaReiser · 2024-04-16T08:40:52Z

Summary

This PR improves the performance of RuleTable::any_enabled which is called frequently in expression checking to determine if a specific set of rules is enabled.

The old implementation used enabled.intersects(RuleSet::from_iter(rules)) to test if the enabled set and the tested rules overlap.
This worked fine when we had few rules but is now becoming a performance bottleneck when bumping RuleSet from 13 to 14 usizes because each call zero initializes a 14 * 8=112 bytes large array on the stack, sets the rule indexes and then computes if the sets overlap.

The new implementation avoids constructing a RuleSet using from_iter based on the assumption that we mainly query any_enabled with a few rules. This avoids writing 112 bytes on each call.

This should make the any_enabled check independent of the size of the RuleSet.

Test Plan

cargo test

codspeed-hq · 2024-04-16T08:46:11Z

CodSpeed Performance Report

Merging #10971 will improve performances by 20.14%

_{Comparing perf-rules-any-enabled (bac6d21) with main (f779bab)}

Summary

⚡ 6 improvements
✅ 24 untouched benchmarks

Benchmarks breakdown

	Benchmark	`main`	`perf-rules-any-enabled`	Change
⚡	`linter/all-rules[large/dataset.py]`	82.4 ms	79 ms	+4.32%
⚡	`linter/default-rules[large/dataset.py]`	19.3 ms	16.1 ms	+20.14%
⚡	`linter/default-rules[numpy/ctypeslib.py]`	4.1 ms	3.7 ms	+13.11%
⚡	`linter/default-rules[numpy/globals.py]`	636.8 µs	595.2 µs	+6.98%
⚡	`linter/default-rules[pydantic/types.py]`	8.6 ms	7.7 ms	+11.52%
⚡	`linter/default-rules[unicode/pypinyin.py]`	1.5 ms	1.3 ms	+16.67%

github-actions · 2024-04-16T08:53:48Z

`ruff-ecosystem` results

Linter (stable)

✅ ecosystem check detected no linter changes.

Linter (preview)

✅ ecosystem check detected no linter changes.

MichaReiser · 2024-04-16T09:02:59Z

crates/ruff_linter/src/registry/rule_set.rs

+        let mut i = 0;
+
+        while i < rules.len() {
+            any |= self.contains(rules[i]);


I intentionally use an bitwise OR here to avoid any branching.

I'd bet it would compile down to the same code if you used an if. :-) But I actually find this pretty readable as it is personally.

Acutally, it does not

The iter version has a jump

mov rax, qword ptr [rsp - 112] movzx edx, byte ptr [rsp - 116] mov cl, 1 bt rax, rdx jb .LBB0_2 movzx ecx, byte ptr [rsp - 114] bt rax, rcx setb cl

The loop version does not (but it requires more instructions)

mov byte ptr [rsp - 117], cl lea rax, [rsp - 117] movzx ecx, byte ptr [rsp - 114] mov edx, 1 shl edx, cl movzx ecx, byte ptr [rsp - 116] bts edx, ecx test dword ptr [rsp - 112], edx setne byte ptr [rsp - 117]

https://godbolt.org/z/oWWzqc6s1

Yeah I just meant, if self.contains(rules[i]) { any = true }.

It makes sense that the iter version has an extra jump because of the short circuit.

ah yeah, that probably compiles down to the same

BurntSushi · 2024-04-16T11:45:34Z

crates/ruff_linter/src/registry/rule_set.rs

+        let mut i = 0;
+
+        while i < rules.len() {
+            any |= self.contains(rules[i]);


I'd bet it would compile down to the same code if you used an if. :-) But I actually find this pretty readable as it is personally.

BurntSushi · 2024-04-16T11:46:40Z

crates/ruff_linter/src/registry/rule_set.rs

+        let mut any = false;
+        let mut i = 0;
+
+        while i < rules.len() {


It looks like this entire function could just be written as rules.iter().any(|r| self.contains(r))? Did you try that and it was slower? (I believe it also has the benefit of short circuiting, which may or may not help depending on the typical length of rules.)

The typical length is like 1-8 rules (where 8 is rare). The downside is that the function can't be const anymore ;).

But the performance is about the same. So lets use any as it is easier to understand.

Edit: Codspeed disagrees. The shift version is slightly faster (23% speedup instead of 20%)

Ah I missed the const requirement.

zanieb · 2024-04-16T13:18:31Z

Woo!

perf: RuleTable::any_enabled

bac6d21

MichaReiser added the performance Potential performance improvement label Apr 16, 2024

MichaReiser assigned BurntSushi Apr 16, 2024

MichaReiser commented Apr 16, 2024

View reviewed changes

BurntSushi reviewed Apr 16, 2024

View reviewed changes

MichaReiser force-pushed the perf-rules-any-enabled branch from e94a8ee to bac6d21 Compare April 16, 2024 12:11

MichaReiser enabled auto-merge (squash) April 16, 2024 12:11

MichaReiser merged commit d4e140d into main Apr 16, 2024
33 checks passed

MichaReiser deleted the perf-rules-any-enabled branch April 16, 2024 12:20

BrewTestBot mentioned this pull request Apr 18, 2024

ruff 0.4.0 Homebrew/homebrew-core#169465

Merged

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

perf: `RuleTable::any_enabled` #10971

perf: `RuleTable::any_enabled` #10971

MichaReiser commented Apr 16, 2024 •

edited

Loading

codspeed-hq bot commented Apr 16, 2024 •

edited

Loading

github-actions bot commented Apr 16, 2024 •

edited

Loading

MichaReiser Apr 16, 2024

BurntSushi Apr 16, 2024

MichaReiser Apr 16, 2024

BurntSushi Apr 16, 2024

MichaReiser Apr 16, 2024

BurntSushi Apr 16, 2024

BurntSushi Apr 16, 2024

MichaReiser Apr 16, 2024 •

edited

Loading

BurntSushi Apr 16, 2024

zanieb commented Apr 16, 2024

perf: RuleTable::any_enabled #10971

perf: RuleTable::any_enabled #10971

Conversation

MichaReiser commented Apr 16, 2024 • edited Loading

Summary

Test Plan

codspeed-hq bot commented Apr 16, 2024 • edited Loading

Merging #10971 will improve performances by 20.14%

Summary

Benchmarks breakdown

github-actions bot commented Apr 16, 2024 • edited Loading

ruff-ecosystem results

Linter (stable)

Linter (preview)

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

MichaReiser Apr 16, 2024 • edited Loading

Choose a reason for hiding this comment

Choose a reason for hiding this comment

zanieb commented Apr 16, 2024

perf: `RuleTable::any_enabled` #10971

perf: `RuleTable::any_enabled` #10971

MichaReiser commented Apr 16, 2024 •

edited

Loading

codspeed-hq bot commented Apr 16, 2024 •

edited

Loading

github-actions bot commented Apr 16, 2024 •

edited

Loading

`ruff-ecosystem` results

MichaReiser Apr 16, 2024 •

edited

Loading