[ty] Garbage-collect reachability constraints #19414

dcreager · 2025-07-17T21:55:44Z

This is a follow-on to #19410 that further reduces the memory usage of our reachability constraints. When finishing the building of a use-def map, we walk through all of the "final" states and mark only those reachability constraints as "used". We then throw away the interior TDD nodes of any reachability constraints that weren't marked as used.

(This helps because we build up quite a few intermediate TDD nodes when constructing complex reachability constraints. These nodes can never be accessed if they were only used as an intermediate TDD node. The marking step ensures that we keep any nodes that ended up being referred to in some accessible use-def map state.)

carljm · 2025-07-17T22:10:01Z

Looks like we are somehow failing to mark some nodes as used that we later try to query?

dcreager · 2025-07-17T22:20:59Z

Looks like we are somehow failing to mark some nodes as used that we later try to query?

blerp blorp. Glad I added that assertion!

github-actions · 2025-07-17T22:25:30Z

`mypy_primer` results

No ecosystem changes detected ✅

Memory usage changes were detected when running on open source projects

trio (https://github.com/python-trio/trio)
- TOTAL MEMORY USAGE: ~176MB
+ TOTAL MEMORY USAGE: ~167MB
-     memo fields = ~138MB
+     memo fields = ~131MB

sphinx (https://github.com/sphinx-doc/sphinx)
- TOTAL MEMORY USAGE: ~301MB
+ TOTAL MEMORY USAGE: ~287MB
-     memo fields = ~236MB
+     memo fields = ~224MB

prefect (https://github.com/PrefectHQ/prefect)
- TOTAL MEMORY USAGE: ~626MB
+ TOTAL MEMORY USAGE: ~596MB
-     memo fields = ~490MB
+     memo fields = ~467MB

codspeed-hq · 2025-07-17T23:24:50Z

CodSpeed WallTime Performance Report

Merging #19414 will not alter performance

_{Comparing dcreager/smoosh-reachability (848c573) with main (dc66019)}

Summary

✅ 7 untouched benchmarks

github-actions · 2025-07-17T23:25:14Z

`ruff-ecosystem` results

Linter (stable)

✅ ecosystem check detected no linter changes.

Linter (preview)

✅ ecosystem check detected no linter changes.

Formatter (stable)

ℹ️ ecosystem check encountered format errors. (no format changes; 1 project error)

openai/openai-cookbook (error)

warning: Detected debug build without --no-cache.
error: Failed to parse examples/mcp/databricks_mcp_cookbook.ipynb:12:1:8: Simple statements must be separated by newlines or semicolons

Formatter (preview)

ℹ️ ecosystem check encountered format errors. (no format changes; 1 project error)

openai/openai-cookbook (error)

ruff format --preview

warning: Detected debug build without --no-cache.
error: Failed to parse examples/mcp/databricks_mcp_cookbook.ipynb:12:1:8: Simple statements must be separated by newlines or semicolons

* main: [ty] Use `…` as the "cut" indicator in diagnostic rendering (#19420) [ty] synthesize __setattr__ for frozen dataclasses (#19307) [ty] Fixed bug in semantic token provider for parameters. (#19418) [ty] Shrink reachability constraints (#19410) Move JUnit rendering to `ruff_db` (#19370)

dcreager · 2025-07-18T14:31:09Z

I've fixed all of the test failures here and tweaked how we keep track of the mapping between indexes in the pre-filtered and post-filtered vectors here. The ecosystem memory usage is showing a slight memory increase for some small projects — this is expected since the index mapping adds a slight overhead if there are not a lot of reachability constraints that are thrown away. But @MichaReiser reports that we are seeing a 3× reduction in memory usage for very large projects. I don't know if we have a way to get some of those into CI, but I even without that I consider this PR to be a win.

dcreager · 2025-07-18T14:54:31Z

The ecosystem memory usage is showing a slight memory increase for some small projects

Okay and now I fixed this by shrinking a vector that we were shrinking before

dcreager · 2025-07-18T17:13:24Z

Somehow requesting Micha's review earlier removed everyone else! I've added you all back not because I specifically want 👀 from everyone, but just in case any of you want to take a look at this. It does have implications on the finish step of the semantic index builder.

carljm

Looks good!

MichaReiser

Nice. There's a small regression on the instrumented benchmarks but the walltime benchmarks are mostly neutral which makes this a huge win. Thank you!

MichaReiser · 2025-07-19T07:28:52Z

crates/ty_python_semantic/src/rank.rs

+    }
+}
+
+impl get_size2::GetSize for RankBitBox {


Isn't this the same as what the automatically derived GetSize implementation gives you

I have to use a helper function to get the size of the bits field since bitvec::BitBox doesn't implement GetSize. I found a way to use the derive impl with a size_fn helper for just that one field.

Cargo.toml

crates/ty_python_semantic/src/rank.rs

MichaReiser · 2025-07-19T14:18:23Z

crates/ty_python_semantic/src/rank.rs

+
+    /// Returns the number of bits _before_ (and not including) the given index that are set.
+    pub(crate) fn rank(&self, index: usize) -> u32 {
+        let chunk_index = index / CHUNK_SIZE;


Hmm, too sad that as_raw_slice isn't const. I wonder if it is worth to mark this function and get_bit as #[inline]

Will #[inline] be necessary if this is only used within the same crate?

It still acts as a hint to the compiler. But yes, it's less useful compared to cross crate inlining without LTO.

👍 Got it! Added

crates/ty_python_semantic/src/semantic_index/use_def/place_state.rs

crates/ty_python_semantic/src/semantic_index/use_def.rs

MichaReiser · 2025-07-19T14:25:31Z

I don't know if we have a way to get some of those into CI, but I even without that I consider this PR to be a win.

You can get more detailed numbers locally using TY_MEMORY_REPORT=full

* main: (25 commits) [ty] Sync vendored typeshed stubs (#19461) [ty] Extend tuple `__len__` and `__bool__` special casing to also cover tuple subclasses (#19289) [ty] bump docstring-adder pin (#19458) [ty] Disallow assignment to `Final` class attributes (#19457) Update dependency ruff to v0.12.4 (#19442) Update pre-commit hook astral-sh/ruff-pre-commit to v0.12.4 (#19443) Update rui314/setup-mold digest to 702b190 (#19441) Update taiki-e/install-action action to v2.56.19 (#19448) Update Rust crate strum_macros to v0.27.2 (#19447) Update Rust crate strum to v0.27.2 (#19446) Update Rust crate rand to v0.9.2 (#19444) Update Rust crate serde_json to v1.0.141 (#19445) Fix `unreachable` panic in parser (#19183) [`ruff`] Support byte strings (`RUF055`) (#18926) [ty] Avoid second lookup for `infer_maybe_standalone_expression` (#19439) [ty] Implemented "go to definition" support for import statements (#19428) [ty] Avoid secondary tree traversal to get call expression for keyword arguments (#19429) [ty] Add goto definition to playground (#19425) [ty] Add support for `@warnings.deprecated` (#19376) [ty] make `del x` force local resolution of `x` in the current scope (#19389) ...

…ability * origin/main: [ty] Expansion of enums into unions of literals (#19382) [ty] Avoid rechecking the entire project when changing the opened files (#19463) [ty] Add warning for unknown `TY_MEMORY_REPORT` value (#19465)

* main: (76 commits) Move fix suggestion to subdiagnostic (#19464) [ty] Implement non-stdlib stub mapping for classes and functions (#19471) [ty] Disallow illegal uses of `ClassVar` (#19483) [ty] Disallow `Final` in function parameter/return-type annotations (#19480) [ty] Extend `Final` test suite (#19476) [ty] Minor change to diagnostic message for invalid Literal uses (#19482) [ty] Detect illegal non-enum attribute accesses in Literal annotation (#19477) [ty] Reduce size of `TypeInference` (#19435) Run MD tests for Markdown-only changes (#19479) Revert "[ty] Detect illegal non-enum attribute accesses in Literal annotation" [ty] Detect illegal non-enum attribute accesses in Literal annotation [ty] Added semantic token support for more identifiers (#19473) [ty] Make tuple subclass constructors sound (#19469) [ty] Pass down specialization to generic dataclass bases (#19472) [ty] Garbage-collect reachability constraints (#19414) [ty] Implicit instance attributes declared `Final` (#19462) [ty] Expansion of enums into unions of literals (#19382) [ty] Avoid rechecking the entire project when changing the opened files (#19463) [ty] Add warning for unknown `TY_MEMORY_REPORT` value (#19465) [ty] Sync vendored typeshed stubs (#19461) ...

dcreager added 3 commits July 17, 2025 17:30

track which reachability constraints are actually used

9e5ba62

throw away unused reachability constraints

d588a1e

clippy

820e6ba

dcreager requested review from AlexWaygood, carljm and sharkdp as code owners July 17, 2025 21:55

dcreager added internal An internal refactor or improvement ty Multi-file analysis & type inference labels Jul 17, 2025

dcreager requested review from MichaReiser and removed request for AlexWaygood, carljm and sharkdp July 17, 2025 21:55

carljm marked this pull request as draft July 17, 2025 22:10

mark some other nodes as used

40c3945

dcreager added 3 commits July 17, 2025 19:07

use bitvec rank

4590c28

clippy

1141131

use vec

a7d6d72

dcreager added 6 commits July 17, 2025 19:33

document

39df2aa

this is expensive, let's debug_assert it

ef06cef

use simpler rank bitvec

fe3ac74

fix wasm

1f3bddf

fix wasm for real

a823a10

dcreager marked this pull request as ready for review July 18, 2025 14:31

use a boxed slice here

441d49f

remove unneeded clippy allow

72a2d82

dcreager requested review from AlexWaygood, carljm and sharkdp July 18, 2025 17:12

carljm approved these changes Jul 18, 2025

View reviewed changes

MichaReiser approved these changes Jul 19, 2025

View reviewed changes

AlexWaygood removed their request for review July 21, 2025 13:34

dcreager added 7 commits July 21, 2025 10:35

use size_fn with GetSize

597e7ad

don't use default bitvec features

ce9f6e2

document overhead

b0cfc4f

rename to finish

9260f4f

inline some stuff

7090b7e

dcreager merged commit 88de572 into main Jul 21, 2025
37 checks passed

dcreager deleted the dcreager/smoosh-reachability branch July 21, 2025 18:16

[ty] Garbage-collect reachability constraints #19414

[ty] Garbage-collect reachability constraints #19414

Uh oh!

Conversation

dcreager commented Jul 17, 2025 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

carljm commented Jul 17, 2025

Uh oh!

dcreager commented Jul 17, 2025

Uh oh!

github-actions bot commented Jul 17, 2025 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

mypy_primer results

Uh oh!

codspeed-hq bot commented Jul 17, 2025 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

CodSpeed WallTime Performance Report

Merging #19414 will not alter performance

Summary

Uh oh!

github-actions bot commented Jul 17, 2025 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

ruff-ecosystem results

Linter (stable)

Linter (preview)

Formatter (stable)

Formatter (preview)

Uh oh!

dcreager commented Jul 18, 2025

Uh oh!

dcreager commented Jul 18, 2025

Uh oh!

dcreager commented Jul 18, 2025

Uh oh!

carljm left a comment

Choose a reason for hiding this comment

Uh oh!

MichaReiser left a comment

Choose a reason for hiding this comment

Uh oh!

MichaReiser Jul 19, 2025

Choose a reason for hiding this comment

Uh oh!

dcreager Jul 21, 2025

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Uh oh!

MichaReiser Jul 19, 2025

Choose a reason for hiding this comment

Uh oh!

dcreager Jul 21, 2025

Choose a reason for hiding this comment

Uh oh!

MichaReiser Jul 21, 2025

Choose a reason for hiding this comment

Uh oh!

dcreager Jul 21, 2025

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Uh oh!

Uh oh!

MichaReiser commented Jul 19, 2025

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

4 participants

dcreager commented Jul 17, 2025 •

edited

Loading

github-actions bot commented Jul 17, 2025 •

edited

Loading

`mypy_primer` results

codspeed-hq bot commented Jul 17, 2025 •

edited

Loading

github-actions bot commented Jul 17, 2025 •

edited

Loading

`ruff-ecosystem` results