[ty] Add environment variable to dump Salsa memory usage stats #18928

ibraheemdev · 2025-06-24T23:53:06Z

Summary

Setting TY_MEMORY_REPORT=full will generate and print a memory usage report to the CLI after a ty check run:

=======SALSA STRUCTS=======
`Definition`                                       metadata=7.24MB   fields=17.38MB  count=181062
`Expression`                                       metadata=4.45MB   fields=5.94MB   count=92804
`member_lookup_with_policy_::interned_arguments`   metadata=1.97MB   fields=2.25MB   count=35176
...
=======SALSA QUERIES=======
`File -> ty_python_semantic::semantic_index::SemanticIndex`
    metadata=11.46MB  fields=88.86MB  count=1638
`Definition -> ty_python_semantic::types::infer::TypeInference`
    metadata=24.52MB  fields=86.68MB  count=146018
`File -> ruff_db::parsed::ParsedModule`
    metadata=0.12MB   fields=69.06MB  count=1642
...
=======SALSA SUMMARY=======
TOTAL MEMORY USAGE: 577.61MB
    struct metadata = 29.00MB
    struct fields = 35.68MB
    memo metadata = 103.87MB
    memo fields = 409.06MB
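
For readers checking the numbers: the total in the summary is the sum of the four components listed under it. A minimal sketch of that arithmetic follows; `MemorySummary` and its field names are hypothetical and are not the types used by salsa or ty.

```rust
/// Hypothetical container for the four summary lines of the report, in MB.
/// This is an illustration only, not the structure salsa or ty uses.
pub struct MemorySummary {
    pub struct_metadata_mb: f64,
    pub struct_fields_mb: f64,
    pub memo_metadata_mb: f64,
    pub memo_fields_mb: f64,
}

impl MemorySummary {
    /// Total tracked memory: struct and memo metadata plus struct and memo fields.
    pub fn total_mb(&self) -> f64 {
        self.struct_metadata_mb
            + self.struct_fields_mb
            + self.memo_metadata_mb
            + self.memo_fields_mb
    }
}
```

With the values from the report above (29.00, 35.68, 103.87, 409.06), `total_mb()` comes out at 577.61 up to floating-point rounding.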

Eventually, we should integrate these numbers into CI in some form. The one limitation currently is that heap allocations in salsa structs (e.g. interned values) are not tracked, but memoized values should have full coverage. We may also want a peak memory usage counter (that accounts for non-salsa memory), but that is relatively simple to profile manually (e.g. time -v ty check) and would require a compile-time option to avoid runtime overhead.

Depends on salsa-rs/salsa#925.

github-actions · 2025-06-24T23:59:10Z

`mypy_primer` results

No ecosystem changes detected ✅

github-actions · 2025-06-25T00:07:21Z

`ruff-ecosystem` results

Linter (stable)

✅ ecosystem check detected no linter changes.

Linter (preview)

✅ ecosystem check detected no linter changes.

Formatter (stable)

✅ ecosystem check detected no format changes.

Formatter (preview)

✅ ecosystem check detected no format changes.

crates/ty/src/lib.rs

MichaReiser

This is great

I have a few small nits. The only downside of the design is that it's very easy to forget the heap_size attribute on a salsa query, which will result in undercounting. That makes me wonder if we should change the design in salsa so that the stack size and heap_size are reported separately for each query (we can show a total as well), with heap_size shown as Unknown if the heap_size attribute isn't set. This would make it more apparent where heap_size attributes are missing (as opposed to assuming "ah, this query doesn't allocate much").
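
To make the suggestion concrete, here is a rough sketch of what separate stack and heap reporting could look like. Everything in it (`QuerySize`, its fields, the output format) is hypothetical and is not salsa's API; it only illustrates the idea of printing `Unknown` when no heap size is available instead of silently counting zero.

```rust
use std::fmt;

/// Hypothetical per-query size record (not a salsa type).
pub struct QuerySize {
    pub name: &'static str,
    /// Size of the memoized value itself, always known.
    pub stack_bytes: usize,
    /// Heap size, or `None` when the query has no `heap_size` attribute.
    pub heap_bytes: Option<usize>,
}

impl QuerySize {
    /// Total that treats an unknown heap size as zero, i.e. a lower bound.
    pub fn total_bytes(&self) -> usize {
        self.stack_bytes + self.heap_bytes.unwrap_or(0)
    }
}

impl fmt::Display for QuerySize {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        write!(f, "`{}` stack={}B heap=", self.name, self.stack_bytes)?;
        match self.heap_bytes {
            Some(bytes) => write!(f, "{bytes}B"),
            // Make the missing attribute visible in the report.
            None => write!(f, "Unknown"),
        }
    }
}
```

A query without the attribute would then print `heap=Unknown`, which is easy to spot when scanning the report.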

MichaReiser · 2025-06-25T06:22:34Z

crates/ty/src/lib.rs

+    if std::env::var("TY_MEMORY_REPORT").as_deref() == Ok("1") {
+        salsa_memory_dump(&db);
+    }


It would be nice if:

We would print a condensed memory report when running with -vv

We would print the full memory report when running with -vvv

You can get the verbosity from args.verbosity.

I would probably skip the environment variable for now. If we don't, make sure to add it here https://github.com/astral-sh/ty/blob/main/docs/reference/env.md

I think having to run -vvv is a little difficult because of the amount of tracing logs you have to wait for :) I kept the environment variable but added short and full options. We might eventually have to add an option for mypy primer that keeps the diff less sensitive to minor changes.
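
For illustration, a minimal sketch of how an environment variable with short and full options might be parsed. The `full` value appears in the PR summary; the `short` spelling, the `MemoryReportMode` enum, and both function names are assumptions made for this example and may not match the merged code.

```rust
/// Hypothetical report modes (names are illustrative, not ty's actual types).
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum MemoryReportMode {
    Short,
    Full,
}

/// Interpret a raw `TY_MEMORY_REPORT` value.
/// Assumption: the accepted spellings are "short" and "full";
/// anything else (including an unset variable) disables the report.
pub fn parse_memory_report(value: Option<&str>) -> Option<MemoryReportMode> {
    match value {
        Some("short") => Some(MemoryReportMode::Short),
        Some("full") => Some(MemoryReportMode::Full),
        _ => None,
    }
}

/// Read the mode from the process environment.
pub fn memory_report_from_env() -> Option<MemoryReportMode> {
    let value = std::env::var("TY_MEMORY_REPORT").ok();
    parse_memory_report(value.as_deref())
}
```

Keeping the parsing separate from the environment lookup makes the accepted values easy to unit test without touching process state.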

* main:
  [ty] Add builtins to completions derived from scope (#18982)
  [ty] Don't add incorrect subdiagnostic for unresolved reference (#18487)
  [ty] Simplify `KnownClass::check_call()` and `KnownFunction::check_call()` (#18981)
  [ty] Add micro-benchmark for #711 (#18979)
  [`flake8-annotations`] Make `ANN401` example error out-of-the-box (#18974)
  [`flake8-async`] Make `ASYNC110` example error out-of-the-box (#18975)
  [pandas]: Fix issue on `non pandas` dataframe `in-place` usage (PD002) (#18963)
  [`pylint`] Fix `PLC0415` example (#18970)
  [ty] Add environment variable to dump Salsa memory usage stats (#18928)
  [`pylint`] Fix `PLW0108` autofix introducing a syntax error when the lambda's body contains an assignment expression (#18678)
  Bump 0.12.1 (#18969)
  [`FastAPI`] Add fix safety section to `FAST002` (#18940)
  [ty] Add regression test for leading tab mis-alignment in diagnostic rendering (#18965)
  [ty] Resolve python environment in `Options::to_program_settings` (#18960)
  [`ruff`] Fix false positives and negatives in `RUF010` (#18690)
  [ty] Fix rendering of long lines that are indented with tabs
  [ty] Add regression test for diagnostic rendering panic
  [ty] Move venv and conda env discovery to `SearchPath::from_settings` (#18938)

## Summary

Print the [new salsa memory usage dumps](#18928) in mypy primer CI runs to help us catch memory regressions. The numbers are rounded to the nearest power of 1.1 (about a 5% threshold between buckets) to avoid overly sensitive diffs.
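
As a sketch of the rounding described here, one way to snap a value to the nearest power of a base is to round its logarithm in that base. The function name and the handling of non-positive inputs are assumptions for this example; the actual code in #18973 may differ.

```rust
/// Round `value` to the nearest power of `base` (nearest in log space).
/// Hypothetical helper; non-positive inputs are mapped to 0.0 by assumption.
pub fn round_to_power_of(value: f64, base: f64) -> f64 {
    if value <= 0.0 {
        return 0.0;
    }
    // log_base(value), rounded to the nearest integer exponent.
    let exponent = (value.ln() / base.ln()).round();
    base.powf(exponent)
}
```

With a base of 1.1, a rounded value is at most a factor of sqrt(1.1) ≈ 1.049 away from the input, which matches the "about 5%" figure: small fluctuations usually land in the same bucket and produce no diff, although a value sitting near a bucket boundary can still flip.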

ibraheemdev requested a review from carljm as a code owner June 24, 2025 23:53

ibraheemdev added the internal An internal refactor or improvement label Jun 24, 2025

ibraheemdev requested a review from AlexWaygood as a code owner June 24, 2025 23:53

ibraheemdev added the ty Multi-file analysis & type inference label Jun 24, 2025

ibraheemdev requested review from MichaReiser, dcreager, dhruvmanila and sharkdp as code owners June 24, 2025 23:53

ibraheemdev commented Jun 24, 2025

ibraheemdev force-pushed the ibraheem/memory-usage-dump branch from 3582d8c to bc95677 Compare June 24, 2025 23:55

dhruvmanila reviewed Jun 25, 2025

MichaReiser approved these changes Jun 25, 2025

AlexWaygood removed their request for review June 25, 2025 08:04

dhruvmanila reviewed Jun 26, 2025

MichaReiser approved these changes Jun 26, 2025

ibraheemdev added 4 commits June 26, 2025 14:33

add environment variable to dump salsa memory usage stats

539e2bf

update get-size2 version

2f5a0e4

add short memory report option

869a5ea

update salsa

ea3bb1a

ibraheemdev force-pushed the ibraheem/memory-usage-dump branch 4 times, most recently from 007f92f to 23844f1 Compare June 26, 2025 18:45

ibraheemdev enabled auto-merge (squash) June 26, 2025 18:46

move salsa memory dump to ProjectDatabase

c59855c

ibraheemdev force-pushed the ibraheem/memory-usage-dump branch from 23844f1 to c59855c Compare June 26, 2025 21:24

ibraheemdev merged commit 6f7b1c9 into main Jun 26, 2025
35 checks passed

ibraheemdev deleted the ibraheem/memory-usage-dump branch June 26, 2025 21:27

ibraheemdev mentioned this pull request Jun 27, 2025

[ty] Print salsa memory usage totals in mypy primer CI runs #18973

Merged

MichaReiser mentioned this pull request Jun 27, 2025

Document TY_MEMORY_REPORT astral-sh/ty#714

Closed
