Skip to content

Commit fdd0de7

Browse files
committed
[ty] Add caching for submodule completion suggestions
This change makes it so we aren't doing a directory traversal every time we ask for completions from a module. Specifically, submodules that aren't attributes of their parent module can only be discovered by looking at the directory tree. But we want to avoid doing a directory scan unless we think there are changes. To make this work, this change does a little bit of surgery to `FileRoot`. Previously, a `FileRoot` was only used for library search paths. Its revision was bumped whenever a file in that tree was added, deleted or even modified (to support the discovery of `pth` files and changes to its contents). This generally seems fine since these are presumably dependency paths that shouldn't change frequently. In this change, we add a `FileRoot` for the project. But having the `FileRoot`'s revision bumped for every change in the project makes caching based on that `FileRoot` rather ineffective. That is, cache invalidation will occur too aggressively. To the point that there is little point in adding caching in the first place. To mitigate this, a `FileRoot`'s revision is only bumped on a change to a child file's contents when the `FileRoot` is a `LibrarySearchPath`. Otherwise, we only bump the revision when a file is created or added. The effect is that, at least in VS Code, when a new module is added or removed, this change is picked up and the cache is properly invalidated. Other LSP clients with worse support for file watching (which seems to be the case for the CoC vim plugin that I use) don't work as well. Here, the cache is less likely to be invalidated which might cause completions to have stale results. Unless there's an obvious way to fix or improve this, I propose punting on improvements here for now.
1 parent 1fd9103 commit fdd0de7

File tree

7 files changed

+228
-43
lines changed

7 files changed

+228
-43
lines changed

crates/ruff_db/src/files.rs

Lines changed: 13 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -375,12 +375,25 @@ impl File {
375375
}
376376

377377
/// Refreshes the file metadata by querying the file system if needed.
378+
///
379+
/// This also "touches" the file root associated with the given path.
380+
/// This means that any Salsa queries that depend on the corresponding
381+
/// root's revision will become invalidated.
378382
pub fn sync_path(db: &mut dyn Db, path: &SystemPath) {
379383
let absolute = SystemPath::absolute(path, db.system().current_directory());
380384
Files::touch_root(db, &absolute);
381385
Self::sync_system_path(db, &absolute, None);
382386
}
383387

388+
/// Refreshes *only* the file metadata by querying the file system if needed.
389+
///
390+
/// This specifically does not touch any file root associated with the
391+
/// given file path.
392+
pub fn sync_path_only(db: &mut dyn Db, path: &SystemPath) {
393+
let absolute = SystemPath::absolute(path, db.system().current_directory());
394+
Self::sync_system_path(db, &absolute, None);
395+
}
396+
384397
/// Increments the revision for the virtual file at `path`.
385398
pub fn sync_virtual_path(db: &mut dyn Db, path: &SystemVirtualPath) {
386399
if let Some(virtual_file) = db.files().try_virtual_file(path) {

crates/ruff_db/src/files/file_root.rs

Lines changed: 1 addition & 1 deletion
Original file line numberDiff line numberDiff line change
@@ -23,7 +23,7 @@ pub struct FileRoot {
2323
pub path: SystemPathBuf,
2424

2525
/// The kind of the root at the time of its creation.
26-
kind_at_time_of_creation: FileRootKind,
26+
pub kind_at_time_of_creation: FileRootKind,
2727

2828
/// A revision that changes when the contents of the source root change.
2929
///

crates/ty/tests/file_watching.rs

Lines changed: 112 additions & 1 deletion
Original file line numberDiff line numberDiff line change
@@ -15,7 +15,7 @@ use ty_project::metadata::pyproject::{PyProject, Tool};
1515
use ty_project::metadata::value::{RangedValue, RelativePathBuf};
1616
use ty_project::watch::{ChangeEvent, ProjectWatcher, directory_watcher};
1717
use ty_project::{Db, ProjectDatabase, ProjectMetadata};
18-
use ty_python_semantic::{ModuleName, PythonPlatform, resolve_module};
18+
use ty_python_semantic::{Module, ModuleName, PythonPlatform, resolve_module};
1919

2020
struct TestCase {
2121
db: ProjectDatabase,
@@ -1877,3 +1877,114 @@ fn rename_files_casing_only() -> anyhow::Result<()> {
18771877

18781878
Ok(())
18791879
}
1880+
1881+
/// This tests that retrieving submodules from a module has its cache
1882+
/// appropriately invalidated after a file is created.
1883+
#[test]
1884+
fn submodule_cache_invalidation_created() -> anyhow::Result<()> {
1885+
let mut case = setup([("lib.py", ""), ("bar/__init__.py", ""), ("bar/foo.py", "")])?;
1886+
let module = resolve_module(case.db(), &ModuleName::new("bar").unwrap()).expect("`bar` module");
1887+
let get_submodules = |db: &dyn Db, module: &Module| {
1888+
let mut names = module
1889+
.all_submodules(db)
1890+
.iter()
1891+
.map(|name| name.as_str().to_string())
1892+
.collect::<Vec<String>>();
1893+
names.sort();
1894+
names.join("\n")
1895+
};
1896+
1897+
insta::assert_snapshot!(
1898+
get_submodules(case.db(), &module),
1899+
@"foo",
1900+
);
1901+
1902+
std::fs::write(case.project_path("bar/wazoo.py").as_std_path(), "")?;
1903+
let changes = case.stop_watch(|_: &ChangeEvent| true);
1904+
case.apply_changes(changes, None);
1905+
1906+
insta::assert_snapshot!(
1907+
get_submodules(case.db(), &module),
1908+
@r"
1909+
foo
1910+
wazoo
1911+
",
1912+
);
1913+
1914+
Ok(())
1915+
}
1916+
1917+
/// This tests that retrieving submodules from a module has its cache
1918+
/// appropriately invalidated after a file is deleted.
1919+
#[test]
1920+
fn submodule_cache_invalidation_deleted() -> anyhow::Result<()> {
1921+
let mut case = setup([
1922+
("lib.py", ""),
1923+
("bar/__init__.py", ""),
1924+
("bar/foo.py", ""),
1925+
("bar/wazoo.py", ""),
1926+
])?;
1927+
let module = resolve_module(case.db(), &ModuleName::new("bar").unwrap()).expect("`bar` module");
1928+
let get_submodules = |db: &dyn Db, module: &Module| {
1929+
let mut names = module
1930+
.all_submodules(db)
1931+
.iter()
1932+
.map(|name| name.as_str().to_string())
1933+
.collect::<Vec<String>>();
1934+
names.sort();
1935+
names.join("\n")
1936+
};
1937+
1938+
insta::assert_snapshot!(
1939+
get_submodules(case.db(), &module),
1940+
@r"
1941+
foo
1942+
wazoo
1943+
",
1944+
);
1945+
1946+
std::fs::remove_file(case.project_path("bar/wazoo.py").as_std_path())?;
1947+
let changes = case.stop_watch(|_: &ChangeEvent| true);
1948+
case.apply_changes(changes, None);
1949+
1950+
insta::assert_snapshot!(
1951+
get_submodules(case.db(), &module),
1952+
@"foo",
1953+
);
1954+
1955+
Ok(())
1956+
}
1957+
1958+
/// This tests that retrieving submodules from a module has its cache
1959+
/// appropriately invalidated after a file is created and then deleted.
1960+
#[test]
1961+
fn submodule_cache_invalidation_created_then_deleted() -> anyhow::Result<()> {
1962+
let mut case = setup([("lib.py", ""), ("bar/__init__.py", ""), ("bar/foo.py", "")])?;
1963+
let module = resolve_module(case.db(), &ModuleName::new("bar").unwrap()).expect("`bar` module");
1964+
let get_submodules = |db: &dyn Db, module: &Module| {
1965+
let mut names = module
1966+
.all_submodules(db)
1967+
.iter()
1968+
.map(|name| name.as_str().to_string())
1969+
.collect::<Vec<String>>();
1970+
names.sort();
1971+
names.join("\n")
1972+
};
1973+
1974+
insta::assert_snapshot!(
1975+
get_submodules(case.db(), &module),
1976+
@"foo",
1977+
);
1978+
1979+
std::fs::write(case.project_path("bar/wazoo.py").as_std_path(), "")?;
1980+
std::fs::remove_file(case.project_path("bar/wazoo.py").as_std_path())?;
1981+
let changes = case.stop_watch(|_: &ChangeEvent| true);
1982+
case.apply_changes(changes, None);
1983+
1984+
insta::assert_snapshot!(
1985+
get_submodules(case.db(), &module),
1986+
@"foo",
1987+
);
1988+
1989+
Ok(())
1990+
}

crates/ty_project/src/db.rs

Lines changed: 8 additions & 1 deletion
Original file line numberDiff line numberDiff line change
@@ -9,7 +9,7 @@ use crate::{DEFAULT_LINT_REGISTRY, DummyReporter};
99
use crate::{ProgressReporter, Project, ProjectMetadata};
1010
use ruff_db::Db as SourceDb;
1111
use ruff_db::diagnostic::Diagnostic;
12-
use ruff_db::files::{File, Files};
12+
use ruff_db::files::{File, FileRootKind, Files};
1313
use ruff_db::system::System;
1414
use ruff_db::vendored::VendoredFileSystem;
1515
use salsa::Event;
@@ -74,6 +74,13 @@ impl ProjectDatabase {
7474
let program_settings = project_metadata.to_program_settings(db.system(), db.vendored())?;
7575
Program::from_settings(&db, program_settings);
7676

77+
// This adds a file root for the project itself. This enables
78+
// tracking of when changes are made to the files in a project
79+
// at the directory level. At time of writing (2025-07-17),
80+
// this is used for caching completions for submodules.
81+
db.files()
82+
.try_add_root(&db, &project_metadata.root, FileRootKind::Project);
83+
7784
db.project = Some(
7885
Project::from_metadata(&db, project_metadata)
7986
.map_err(|error| anyhow::anyhow!("{}", error.pretty(&db)))?,

crates/ty_project/src/db/changes.rs

Lines changed: 49 additions & 10 deletions
Original file line numberDiff line numberDiff line change
@@ -6,7 +6,8 @@ use std::collections::BTreeSet;
66

77
use crate::walk::ProjectFilesWalker;
88
use ruff_db::Db as _;
9-
use ruff_db::files::{File, Files};
9+
use ruff_db::file_revision::FileRevision;
10+
use ruff_db::files::{File, FileRootKind, Files};
1011
use ruff_db::system::SystemPath;
1112
use rustc_hash::FxHashSet;
1213
use salsa::Setter;
@@ -57,12 +58,6 @@ impl ProjectDatabase {
5758
let mut synced_files = FxHashSet::default();
5859
let mut sync_recursively = BTreeSet::default();
5960

60-
let mut sync_path = |db: &mut ProjectDatabase, path: &SystemPath| {
61-
if synced_files.insert(path.to_path_buf()) {
62-
File::sync_path(db, path);
63-
}
64-
};
65-
6661
for change in changes {
6762
tracing::trace!("Handle change: {:?}", change);
6863

@@ -92,13 +87,55 @@ impl ProjectDatabase {
9287

9388
match change {
9489
ChangeEvent::Changed { path, kind: _ } | ChangeEvent::Opened(path) => {
95-
sync_path(self, &path);
90+
if synced_files.insert(path.to_path_buf()) {
91+
let absolute =
92+
SystemPath::absolute(&path, self.system().current_directory());
93+
File::sync_path_only(self, &absolute);
94+
if let Some(root) = self.files().root(self, &absolute) {
95+
match root.kind_at_time_of_creation(self) {
96+
// When a file inside the root of
97+
// the project is changed, we don't
98+
// want to mark the entire root as
99+
// having changed too. In theory it
100+
// might make sense to, but at time
101+
// of writing, the file root revision
102+
// on a project is used to invalidate
103+
// the submodule files found within a
104+
// directory. If we bumped the revision
105+
// on every change within a project,
106+
// then this caching technique would be
107+
// effectively useless.
108+
//
109+
// It's plausible we should explore
110+
// a more robust cache invalidation
111+
// strategy that models more directly
112+
// what we care about. For example, by
113+
// keeping track of directories and
114+
// their direct children explicitly,
115+
// and then keying the submodule cache
116+
// off of that instead. ---AG
117+
FileRootKind::Project => {}
118+
FileRootKind::LibrarySearchPath => {
119+
root.set_revision(self).to(FileRevision::now());
120+
}
121+
}
122+
}
123+
}
96124
}
97125

98126
ChangeEvent::Created { kind, path } => {
99127
match kind {
100-
CreatedKind::File => sync_path(self, &path),
128+
CreatedKind::File => {
129+
if synced_files.insert(path.to_path_buf()) {
130+
File::sync_path(self, &path);
131+
}
132+
}
101133
CreatedKind::Directory | CreatedKind::Any => {
134+
let absolute =
135+
SystemPath::absolute(&path, self.system().current_directory());
136+
if let Some(root) = self.files().root(self, &absolute) {
137+
root.set_revision(self).to(FileRevision::now());
138+
}
102139
sync_recursively.insert(path.clone());
103140
}
104141
}
@@ -138,7 +175,9 @@ impl ProjectDatabase {
138175
};
139176

140177
if is_file {
141-
sync_path(self, &path);
178+
if synced_files.insert(path.to_path_buf()) {
179+
File::sync_path(self, &path);
180+
}
142181

143182
if let Some(file) = self.files().try_system(self, &path) {
144183
project.remove_file(self, file);

crates/ty_python_semantic/src/module_resolver/module.rs

Lines changed: 44 additions & 29 deletions
Original file line numberDiff line numberDiff line change
@@ -17,6 +17,7 @@ pub struct Module {
1717
inner: Arc<ModuleInner>,
1818
}
1919

20+
#[salsa::tracked]
2021
impl Module {
2122
pub(crate) fn file_module(
2223
name: ModuleName,
@@ -97,11 +98,15 @@ impl Module {
9798
///
9899
/// The names returned correspond to the "base" name of the module.
99100
/// That is, `{self.name}.{basename}` should give the full module name.
100-
pub fn all_submodules(&self, db: &dyn Db) -> Vec<Name> {
101-
self.all_submodules_inner(db).unwrap_or_default()
101+
pub fn all_submodules<'db>(&self, db: &'db dyn Db) -> &'db [Name] {
102+
self.clone()
103+
.all_submodules_inner(db, ())
104+
.as_deref()
105+
.unwrap_or_default()
102106
}
103107

104-
fn all_submodules_inner(&self, db: &dyn Db) -> Option<Vec<Name>> {
108+
#[salsa::tracked(returns(ref))]
109+
fn all_submodules_inner(self, db: &dyn Db, _dummy: ()) -> Option<Vec<Name>> {
105110
fn is_submodule(
106111
is_dir: bool,
107112
is_file: bool,
@@ -136,32 +141,42 @@ impl Module {
136141
);
137142

138143
Some(match path.parent()? {
139-
SystemOrVendoredPathRef::System(parent_directory) => db
140-
.system()
141-
.read_directory(parent_directory)
142-
.inspect_err(|err| {
143-
tracing::debug!(
144-
"Failed to read {parent_directory:?} when looking for \
145-
its possible submodules: {err}"
146-
);
147-
})
148-
.ok()?
149-
.flatten()
150-
.filter(|entry| {
151-
let ty = entry.file_type();
152-
let path = entry.path();
153-
is_submodule(
154-
ty.is_directory(),
155-
ty.is_file(),
156-
path.file_name(),
157-
path.extension(),
158-
)
159-
})
160-
.filter_map(|entry| {
161-
let stem = entry.path().file_stem()?;
162-
is_identifier(stem).then(|| Name::from(stem))
163-
})
164-
.collect(),
144+
SystemOrVendoredPathRef::System(parent_directory) => {
145+
// Read the revision on the corresponding file root to
146+
// register an explicit dependency on this directory
147+
// tree. When the revision gets bumped, the cache
148+
// that Salsa creates does for this routine will be
149+
// invalidated.
150+
if let Some(root) = db.files().root(db, parent_directory) {
151+
let _ = root.revision(db);
152+
}
153+
154+
db.system()
155+
.read_directory(parent_directory)
156+
.inspect_err(|err| {
157+
tracing::debug!(
158+
"Failed to read {parent_directory:?} when looking for \
159+
its possible submodules: {err}"
160+
);
161+
})
162+
.ok()?
163+
.flatten()
164+
.filter(|entry| {
165+
let ty = entry.file_type();
166+
let path = entry.path();
167+
is_submodule(
168+
ty.is_directory(),
169+
ty.is_file(),
170+
path.file_name(),
171+
path.extension(),
172+
)
173+
})
174+
.filter_map(|entry| {
175+
let stem = entry.path().file_stem()?;
176+
is_identifier(stem).then(|| Name::from(stem))
177+
})
178+
.collect()
179+
}
165180
SystemOrVendoredPathRef::Vendored(parent_directory) => db
166181
.vendored()
167182
.read_directory(parent_directory)

crates/ty_python_semantic/src/semantic_model.rs

Lines changed: 1 addition & 1 deletion
Original file line numberDiff line numberDiff line change
@@ -86,7 +86,7 @@ impl<'db> SemanticModel<'db> {
8686
};
8787
let ty = Type::module_literal(self.db, self.file, &submodule);
8888
completions.push(Completion {
89-
name: submodule_basename,
89+
name: submodule_basename.clone(),
9090
ty,
9191
builtin,
9292
});

0 commit comments

Comments
 (0)