[ty] Reduce size of `TypeInference` (#19435)
Conversation
Force-pushed from f824979 to 2443fa5 ("…dings`, and `declarations`")
Okay, you got me. I tried to sneak in a change here: I removed `countme`. I had re-added it to `ty_python_semantic` to gather some statistics and then removed it when cleaning up this PR, which is when I realized that I had forgotten to remove these `countme` fields in `ruff_db`. We no longer need `countme` because `TY_MEMORY_REPORT` exposes the same information (and much more!)
```rust
/// Map based on a `Vec`. It doesn't enforce
/// uniqueness on insertion. Instead, it relies on the caller
/// that elements are uniuqe. For example, the way we visit definitions
```
```diff
- /// that elements are uniuqe. For example, the way we visit definitions
+ /// that elements are unique. For example, the way we visit definitions
```
**sharkdp** left a comment:
Thank you very much for the detailed analysis here and the sequence of improvements. Fantastic results!

My main concern here is not the code duplication, but rather: is this the right point in time for all of these optimizations?

For example, the `HashMap` => `VecMap` change: the analysis you did is correct now, but some of the details may change with future changes to ty's behavior, or future changes to these core data types. Will we re-evaluate this tradeoff in the future?

Another example, the boxed `Extra` sections: making changes to these code sections now comes with increased friction: do I need to add this new field to the struct itself, or to the extra section? If I add another usage site of an existing field on `extra`, is the memory-performance tradeoff still correct?

I realize that these are annoying questions/concerns. Optimizations often have attached maintenance costs. My gut feeling is that some optimizations here may be premature (like the `HashMap` => `VecMap` change), but I might be wrong. And I'm definitely not opposed to introducing any of these optimizations right now.
```diff
  /// The inferred types for a scope region.
  #[derive(Debug, Eq, PartialEq, salsa::Update, get_size2::GetSize)]
- pub(crate) struct TypeInference<'db> {
+ pub(crate) struct ScopeInference<'db> {
```
Wondering if you also considered making `TypeInference` and `TypeInferenceBuilder` generic over the inference region kind (scope, expression, definition)? This might avoid some of the code duplication, but could be a bit tricky/annoying to set up without enum support in const generics.
I wanted to avoid making `TypeInferenceBuilder` generic because it would lead to a lot of monomorphization and also just complicates the type signature overall.

My first version had a `TypeInference` enum but I then realized that it's actually never needed (and we would still need to unwrap at the caller side to get the narrower type that we can put into the queries). But maybe you had something else in mind?
> I wanted to avoid making `TypeInferenceBuilder` generic because it would lead to a lot of monomorphization and also just complicates the type signature overall.
Thanks. I figured you already thought about it.
> My first version had a `TypeInference` enum but I then realized that it's actually never needed (and we would still need to unwrap at the caller side to get the narrower type that we can put into the queries). But maybe you had something else in mind?
I only said enum because I would be inclined to make these structs generic over a `{ Scope, Expression, Definition }` enum. That works in C++, but not (yet?) with Rust const generics. Anyway, thanks for your response.
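Since Rust const generics don't (yet) accept user-defined enums on stable, the closest equivalent is zero-sized marker types with an associated type for the region-specific payload. This is a minimal sketch of that alternative; all names and field types here are illustrative, not ty's actual definitions:

```rust
use std::marker::PhantomData;

// Marker types standing in for a `{ Scope, Expression, Definition }` enum.
struct Scope;
struct Expression;
struct Definition;

// Each region declares which extra data it carries.
trait Region {
    type Extra: Default;
}

impl Region for Scope {
    type Extra = Vec<String>; // e.g. declarations (placeholder type)
}
impl Region for Definition {
    type Extra = Vec<String>; // placeholder
}
impl Region for Expression {
    type Extra = (); // expressions store no declarations at all
}

// One generic struct instead of three near-duplicates.
struct Inference<R: Region> {
    expression_types: Vec<u32>, // placeholder for the real per-expression types
    extra: R::Extra,
    _region: PhantomData<R>,
}
```

The tradeoff mentioned in the thread applies: every use of `Inference<R>` monomorphizes, and signatures grow a type parameter, which is exactly the complexity the PR avoids by writing the three structs out by hand.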
These are fair concerns. We can definitely decide to defer some of those changes until later. I did take some precautions that hopefully will help us catch potential regressions:

I added

We can still default to adding them to the main struct if we don't feel certain, and defer the decision to move them to `extra` until later. But I suspect that the person adding a new field will often have a good sense for how frequently the feature for which they're adding the field is used (they definitely have a better understanding than someone who has to trace through the code).

> If I add another usage site of an existing field on `extra`, is the memory-performance tradeoff still correct?

This will hopefully show up in both the performance profile and memory usage, which should then allow us to move the field (from or to `extra`).

Overall, I think this sets up the infrastructure we need, and moving any field is fairly trivial if a later analysis shows that some assumptions have changed.
I saw that — I was more concerned about the fact that this change relies on the number of entries being small, because the linear scan would otherwise be slow. But I see that you also added tracing output for this, so you also considered that.
That makes sense, thanks. I'm fine with merging this PR as-is.
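For readers following the `HashMap` => linear-scan discussion above, here's a minimal sketch of the pattern under debate (names are illustrative, not ty's actual implementation): a `Vec`-backed map whose `insert` trusts the caller to provide unique keys and whose lookup is a linear scan, which is competitive with a hash map only while the entry count stays small:

```rust
/// A tiny Vec-backed map. Insertion does NOT check for duplicate keys;
/// the caller must guarantee uniqueness (e.g. because the tree walk
/// visits each definition exactly once).
struct VecMap<K, V> {
    entries: Vec<(K, V)>,
}

impl<K: PartialEq, V> VecMap<K, V> {
    fn new() -> Self {
        Self { entries: Vec::new() }
    }

    /// O(1) push; uniqueness is the caller's responsibility.
    fn insert(&mut self, key: K, value: V) {
        self.entries.push((key, value));
    }

    /// O(n) linear scan; fast in practice for < ~10 entries.
    fn get(&self, key: &K) -> Option<&V> {
        self.entries.iter().find(|(k, _)| k == key).map(|(_, v)| v)
    }

    /// Freeze into the compact `Box<[(K, V)]>` representation
    /// (drops the unused capacity a `Vec` may carry).
    fn into_boxed(self) -> Box<[(K, V)]> {
        self.entries.into_boxed_slice()
    }
}
```

The memory win comes from the final `Box<[(K, V)]>`: two words (pointer + length) plus exactly-sized storage, versus a hash map's control bytes, load-factor slack, and larger header.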
* main: (76 commits)
  - Move fix suggestion to subdiagnostic (#19464)
  - [ty] Implement non-stdlib stub mapping for classes and functions (#19471)
  - [ty] Disallow illegal uses of `ClassVar` (#19483)
  - [ty] Disallow `Final` in function parameter/return-type annotations (#19480)
  - [ty] Extend `Final` test suite (#19476)
  - [ty] Minor change to diagnostic message for invalid Literal uses (#19482)
  - [ty] Detect illegal non-enum attribute accesses in Literal annotation (#19477)
  - [ty] Reduce size of `TypeInference` (#19435)
  - Run MD tests for Markdown-only changes (#19479)
  - Revert "[ty] Detect illegal non-enum attribute accesses in Literal annotation"
  - [ty] Detect illegal non-enum attribute accesses in Literal annotation
  - [ty] Added semantic token support for more identifiers (#19473)
  - [ty] Make tuple subclass constructors sound (#19469)
  - [ty] Pass down specialization to generic dataclass bases (#19472)
  - [ty] Garbage-collect reachability constraints (#19414)
  - [ty] Implicit instance attributes declared `Final` (#19462)
  - [ty] Expansion of enums into unions of literals (#19382)
  - [ty] Avoid rechecking the entire project when changing the opened files (#19463)
  - [ty] Add warning for unknown `TY_MEMORY_REPORT` value (#19465)
  - [ty] Sync vendored typeshed stubs (#19461)
  - ...
* main:
  - [ty] Use `ThinVec` for sub segments in `PlaceExpr` (#19470)
  - [ty] Splat variadic arguments into parameter list (#18996)
  - [`flake8-pyi`] Skip fix if all `Union` members are `None` (`PYI016`) (#19416)
  - Skip notebook with errors in ecosystem check (#19491)
  - [ty] Consistent use of American english (in rules) (#19488)
  - [ty] Support iterating over enums (#19486)
  - Fix panic for illegal `Literal[…]` annotations with inner subscript expressions (#19489)
  - Move fix suggestion to subdiagnostic (#19464)
  - [ty] Implement non-stdlib stub mapping for classes and functions (#19471)
  - [ty] Disallow illegal uses of `ClassVar` (#19483)
  - [ty] Disallow `Final` in function parameter/return-type annotations (#19480)
  - [ty] Extend `Final` test suite (#19476)
  - [ty] Minor change to diagnostic message for invalid Literal uses (#19482)
  - [ty] Detect illegal non-enum attribute accesses in Literal annotation (#19477)
  - [ty] Reduce size of `TypeInference` (#19435)
  - Run MD tests for Markdown-only changes (#19479)
  - Revert "[ty] Detect illegal non-enum attribute accesses in Literal annotation"
  - [ty] Detect illegal non-enum attribute accesses in Literal annotation
  - [ty] Added semantic token support for more identifiers (#19473)
  - [ty] Make tuple subclass constructors sound (#19469)
Summary
This PR shrinks the size of the cached `infer_*` queries in memory. This PR doesn't change what data we store (with a few exceptions). The main improvement is to shrink the size of `TypeInference` itself. This is very impactful because ty creates a lot of type inference results (e.g. for one large project, the count of all `infer_*` queries adds up to 10,929,221). For large projects, even a reduction by 8 bytes can have a meaningful impact.

You can go through the individual commits if you're curious how I landed on the current design. If you're not, these are the changes I made in this PR:

- Split `TypeInference` into `ExpressionInference`, `DefinitionInference`, and `ScopeInference`. This has the benefit that each region can store exactly the information it needs. For example, `ExpressionInference` only needs to store the fallback type, the expression types, diagnostics, and bindings, but it doesn't need to store declarations or deferred. This required me to inline all `TypeInference` fields into `TypeInferenceBuilder`.
- For each `Inference` type, split out the less common fields into an `Extra` struct and store that as an `Option<Box<Extra>>` on the `Inference` type. For example, most inference regions have no diagnostics. Therefore, move that field to the `Extra` type so that we only pay the cost of 24 bytes for the diagnostics `Vec` if a region has diagnostics. Another example is that `bindings` are very uncommon in expression scopes, which is why they're stored on the `Extra` type too.
- Put `scope` behind `#[cfg(debug_assertions)]`. It's only used to ensure that we don't accidentally merge inference results from different scopes during type inference building. We don't need this in production.
- Replace `FxHashMap` with `Box<[(Key, Value)]>` and use a linear scan. I gathered some numbers on a large project and noticed that `declarations`, `bindings`, and `deferred` all tend to be very small (less than 10 items), in which case a `Vec` with a linear scan can give very similar performance characteristics to a `HashMap` (but is smaller). This might even save us some time when building the data structures because our tree walking guarantees that all inserted keys are unique.
- Change `cycle_fallback_type` to a `bool`. We always fall back to `Never`, and a `bool` is smaller.

The main downside is that this overall leads to more code and some code duplication. The duplicate code is fairly trivial, which is why I don't consider this a concern.
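To make the `Option<Box<Extra>>` idea concrete, here's a minimal sketch (field and type names are invented placeholders, not ty's actual definitions). The point is that the common case, a region with no diagnostics and no rare fields, pays only one niche-optimized pointer instead of carrying every `Vec` header inline:

```rust
/// Rarely-populated fields, hoisted out of the hot struct.
struct Extra {
    diagnostics: Vec<String>,  // most regions have none
    bindings: Vec<(u32, u32)>, // rare in expression regions
}

struct ExpressionInference {
    expression_types: Vec<u32>, // hot data stays inline
    // `None` in the common case: thanks to the null-pointer niche,
    // this costs a single pointer-sized word, not the two Vec headers.
    extra: Option<Box<Extra>>,
}
```

On a 64-bit target, `Option<Box<Extra>>` is 8 bytes, while inlining the two `Vec` fields would cost 48 bytes per instance whether or not they are used; multiplied by millions of cached inference results, that difference is where the savings come from.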
Besides the memory improvements, I do find that the different `Inference` types help clarify which information is needed when (during building vs. for the different regions).

Closes astral-sh/ty#495
Test Plan
I ran ty on a large project with `TY_MEMORY_REPORT=full` and compared the numbers between different versions:

Initial

These are the numbers from main:
This PR

- `infer_scope_types`: -430 MB
- `infer_definition_types`: -1079 MB
- `infer_expression_types`: -550 MB

In total, this is a reduction of almost 2 GB. Getting this number down further will be trickier, and it might make sense to see if the salsa metadata can be reduced (it is now larger than the actual data for `infer_expression_types`) or if some queries can be removed entirely or be reduced in number (see astral-sh/ty#855).

Memory reports by change
Performance
The change is mostly performance neutral. Codspeed shows a few 1% regressions for instrumented benchmarks but it also shows a 1% improvement for walltime benchmarks... Overall, the change is neutral in performance.