Skip to content

Commit a467e7c

Browse files
authored
[red-knot] Case sensitive module resolver (#16521)
## Summary This PR implements the first part of #16440. It ensures that Red Knot's module resolver is case sensitive on all systems. This PR combines a few approaches: 1. It uses `canonicalize` on non-case-sensitive systems to get the real casing of a path. This works for as long as no symlinks or mapped network drives (the windows `E:\` is mapped to `\\server\share` thingy). This is the same as what Pyright does 2. If 1. fails, fall back to recursively list the parent directory and test if the path's file name matches the casing exactly as listed in by list dir. This is the same approach as CPython takes in its module resolver. The main downside is that it requires more syscalls because, unlike CPython, we Red Knot needs to invalidate its caches if a file name gets renamed (CPython assumes that the folders are immutable). It's worth noting that the file watching test that I added that renames `lib.py` to `Lib.py` currently doesn't pass on case-insensitive systems. Making it pass requires some more involved changes to `Files`. I plan to work on this next. There's the argument that landing this PR on its own isn't worth it without this issue being addressed. I think it's still a good step in the right direction even when some of the details on how and where the path case sensitive comparison is implemented. ## Test plan I added multiple integration tests (including a failing one). I tested that the `case-sensitivity` detection works as expected on Windows, MacOS and Linux and that the fast-paths are taken accordingly.
1 parent a128ca7 commit a467e7c

File tree

14 files changed

+543
-27
lines changed

14 files changed

+543
-27
lines changed

crates/red_knot/src/main.rs

Lines changed: 2 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -80,6 +80,8 @@ fn run_check(args: CheckCommand) -> anyhow::Result<ExitStatus> {
8080
countme::enable(verbosity.is_trace());
8181
let _guard = setup_tracing(verbosity)?;
8282

83+
tracing::debug!("Version: {}", version::version());
84+
8385
// The base path to which all CLI arguments are relative to.
8486
let cwd = {
8587
let cwd = std::env::current_dir().context("Failed to get the current working directory")?;

crates/red_knot/tests/file_watching.rs

Lines changed: 80 additions & 3 deletions
Original file line numberDiff line numberDiff line change
@@ -1,5 +1,3 @@
1-
#![allow(clippy::disallowed_names)]
2-
31
use std::collections::HashSet;
42
use std::io::Write;
53
use std::time::{Duration, Instant};
@@ -16,7 +14,7 @@ use ruff_db::source::source_text;
1614
use ruff_db::system::{
1715
OsSystem, System, SystemPath, SystemPathBuf, UserConfigDirectoryOverrideGuard,
1816
};
19-
use ruff_db::Upcast;
17+
use ruff_db::{Db as _, Upcast};
2018
use ruff_python_ast::PythonVersion;
2119

2220
struct TestCase {
@@ -1790,3 +1788,82 @@ fn changes_to_user_configuration() -> anyhow::Result<()> {
17901788

17911789
Ok(())
17921790
}
1791+
1792+
/// Tests that renaming a file from `lib.py` to `Lib.py` is correctly reflected.
1793+
///
1794+
/// This test currently fails on case-insensitive systems because `Files` is case-sensitive
1795+
/// but the `System::metadata` call isn't. This means that
1796+
/// Red Knot considers both `Lib.py` and `lib.py` to exist when only `lib.py` does
1797+
///
1798+
/// The incoming change events then are no-ops because they don't change either file's
1799+
/// status nor does it update their last modified time (renaming a file doesn't bump it's
1800+
/// last modified timestamp).
1801+
///
1802+
/// Fixing this requires to either make `Files` case-insensitive and store the
1803+
/// real-case path (if it differs) on `File` or make `Files` use a
1804+
/// case-sensitive `System::metadata` call. This does open the question if all
1805+
/// `System` calls should be case sensitive. This would be the most consistent
1806+
/// but might be hard to pull off.
1807+
///
1808+
/// What the right solution is also depends on if Red Knot itself should be case
1809+
/// sensitive or not. E.g. should `include="src"` be case sensitive on all systems
1810+
/// or only on case-sensitive systems?
1811+
///
1812+
/// Lastly, whatever solution we pick must also work well with VS Code which,
1813+
/// unfortunately ,doesn't propagate casing-only renames.
1814+
/// <https://github.com/rust-lang/rust-analyzer/issues/9581>
1815+
#[ignore]
1816+
#[test]
1817+
fn rename_files_casing_only() -> anyhow::Result<()> {
1818+
let mut case = setup([("lib.py", "class Foo: ...")])?;
1819+
1820+
assert!(
1821+
resolve_module(case.db(), &ModuleName::new("lib").unwrap()).is_some(),
1822+
"Expected `lib` module to exist."
1823+
);
1824+
assert_eq!(
1825+
resolve_module(case.db(), &ModuleName::new("Lib").unwrap()),
1826+
None,
1827+
"Expected `Lib` module not to exist"
1828+
);
1829+
1830+
// Now rename `lib.py` to `Lib.py`
1831+
if case.db().system().case_sensitivity().is_case_sensitive() {
1832+
std::fs::rename(
1833+
case.project_path("lib.py").as_std_path(),
1834+
case.project_path("Lib.py").as_std_path(),
1835+
)
1836+
.context("Failed to rename `lib.py` to `Lib.py`")?;
1837+
} else {
1838+
// On case-insensitive file systems, renaming a file to a different casing is a no-op.
1839+
// Rename to a different name first
1840+
std::fs::rename(
1841+
case.project_path("lib.py").as_std_path(),
1842+
case.project_path("temp.py").as_std_path(),
1843+
)
1844+
.context("Failed to rename `lib.py` to `temp.py`")?;
1845+
1846+
std::fs::rename(
1847+
case.project_path("temp.py").as_std_path(),
1848+
case.project_path("Lib.py").as_std_path(),
1849+
)
1850+
.context("Failed to rename `temp.py` to `Lib.py`")?;
1851+
}
1852+
1853+
let changes = case.stop_watch(event_for_file("Lib.py"));
1854+
case.apply_changes(changes);
1855+
1856+
// Resolving `lib` should now fail but `Lib` should now succeed
1857+
assert_eq!(
1858+
resolve_module(case.db(), &ModuleName::new("lib").unwrap()),
1859+
None,
1860+
"Expected `lib` module to no longer exist."
1861+
);
1862+
1863+
assert!(
1864+
resolve_module(case.db(), &ModuleName::new("Lib").unwrap()).is_some(),
1865+
"Expected `Lib` module to exist"
1866+
);
1867+
1868+
Ok(())
1869+
}

crates/red_knot_python_semantic/resources/mdtest/import/basic.md

Lines changed: 24 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -140,3 +140,27 @@ import b.foo # error: [unresolved-import] "Cannot resolve import `b.foo`"
140140

141141
```py
142142
```
143+
144+
## Long paths
145+
146+
It's unlikely that a single module component is as long as in this example, but Windows treats paths
147+
that are longer than 200 and something specially. This test ensures that Red Knot can handle those
148+
paths gracefully.
149+
150+
```toml
151+
system = "os"
152+
```
153+
154+
`AveryLongPathAveryLongPathAveryLongPathAveryLongPathAveryLongPathAveryLongPathAveryLongPathAveryLongPathAveryLongPathAveryLongPathAveryLongPathAveryLongPathAveryLongPathAveryLongPathAveryLongPathAveryLongPathAveryLongPath/__init__.py`:
155+
156+
```py
157+
class Foo: ...
158+
```
159+
160+
```py
161+
from AveryLongPathAveryLongPathAveryLongPathAveryLongPathAveryLongPathAveryLongPathAveryLongPathAveryLongPathAveryLongPathAveryLongPathAveryLongPathAveryLongPathAveryLongPathAveryLongPathAveryLongPathAveryLongPathAveryLongPath import (
162+
Foo,
163+
)
164+
165+
reveal_type(Foo()) # revealed: Foo
166+
```

crates/red_knot_python_semantic/resources/mdtest/import/case_sensitive.md

Lines changed: 1 addition & 4 deletions
Original file line numberDiff line numberDiff line change
@@ -1,10 +1,7 @@
11
# Case Sensitive Imports
22

33
```toml
4-
# TODO: This test should use the real file system instead of the memory file system.
5-
# but we can't change the file system yet because the tests would then start failing for
6-
# case-insensitive file systems.
7-
#system = "os"
4+
system = "os"
85
```
96

107
Python's import system is case-sensitive even on case-insensitive file system. This means, importing

crates/red_knot_python_semantic/src/module_resolver/path.rs

Lines changed: 4 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -63,6 +63,10 @@ impl ModulePath {
6363
self.relative_path.pop()
6464
}
6565

66+
pub(super) fn search_path(&self) -> &SearchPath {
67+
&self.search_path
68+
}
69+
6670
#[must_use]
6771
pub(super) fn is_directory(&self, resolver: &ResolverContext) -> bool {
6872
let ModulePath {

crates/red_knot_python_semantic/src/module_resolver/resolver.rs

Lines changed: 85 additions & 2 deletions
Original file line numberDiff line numberDiff line change
@@ -604,14 +604,29 @@ fn resolve_name(db: &dyn Db, name: &ModuleName) -> Option<(SearchPath, File, Mod
604604
/// resolving modules.
605605
fn resolve_file_module(module: &ModulePath, resolver_state: &ResolverContext) -> Option<File> {
606606
// Stubs have precedence over source files
607-
module
607+
let file = module
608608
.with_pyi_extension()
609609
.to_file(resolver_state)
610610
.or_else(|| {
611611
module
612612
.with_py_extension()
613613
.and_then(|path| path.to_file(resolver_state))
614-
})
614+
})?;
615+
616+
// For system files, test if the path has the correct casing.
617+
// We can skip this step for vendored files or virtual files because
618+
// those file systems are case sensitive (we wouldn't get to this point).
619+
if let Some(path) = file.path(resolver_state.db).as_system_path() {
620+
let system = resolver_state.db.system();
621+
if !system.case_sensitivity().is_case_sensitive()
622+
&& !system
623+
.path_exists_case_sensitive(path, module.search_path().as_system_path().unwrap())
624+
{
625+
return None;
626+
}
627+
}
628+
629+
Some(file)
615630
}
616631

617632
fn resolve_package<'a, 'db, I>(
@@ -1842,4 +1857,72 @@ not_a_directory
18421857
let a_module = resolve_module(&db, &a_module_name).unwrap();
18431858
assert_eq!(a_module.file().path(&db), &system_site_packages_location);
18441859
}
1860+
1861+
#[test]
1862+
#[cfg(unix)]
1863+
fn case_sensitive_resolution_with_symlinked_directory() -> anyhow::Result<()> {
1864+
use anyhow::Context;
1865+
use ruff_db::system::OsSystem;
1866+
1867+
let temp_dir = tempfile::TempDir::new()?;
1868+
let root = SystemPathBuf::from_path_buf(
1869+
temp_dir
1870+
.path()
1871+
.canonicalize()
1872+
.context("Failed to canonicalized path")?,
1873+
)
1874+
.expect("UTF8 path for temp dir");
1875+
1876+
let mut db = TestDb::new();
1877+
1878+
let src = root.join("src");
1879+
let a_package_target = root.join("a-package");
1880+
let a_src = src.join("a");
1881+
1882+
db.use_system(OsSystem::new(&root));
1883+
1884+
db.write_file(
1885+
a_package_target.join("__init__.py"),
1886+
"class Foo: x: int = 4",
1887+
)
1888+
.context("Failed to write `a-package/__init__.py`")?;
1889+
1890+
db.write_file(src.join("main.py"), "print('Hy')")
1891+
.context("Failed to write `main.py`")?;
1892+
1893+
// The symlink triggers the slow-path in the `OsSystem`'s `exists_path_case_sensitive`
1894+
// code because canonicalizing the path for `a/__init__.py` results in `a-package/__init__.py`
1895+
std::os::unix::fs::symlink(a_package_target.as_std_path(), a_src.as_std_path())
1896+
.context("Failed to symlink `src/a` to `a-package`")?;
1897+
1898+
Program::from_settings(
1899+
&db,
1900+
ProgramSettings {
1901+
python_version: PythonVersion::default(),
1902+
python_platform: PythonPlatform::default(),
1903+
search_paths: SearchPathSettings {
1904+
extra_paths: vec![],
1905+
src_roots: vec![src],
1906+
custom_typeshed: None,
1907+
python_path: PythonPath::KnownSitePackages(vec![]),
1908+
},
1909+
},
1910+
)
1911+
.expect("Valid program settings");
1912+
1913+
// Now try to resolve the module `A` (note the capital `A` instead of `a`).
1914+
let a_module_name = ModuleName::new_static("A").unwrap();
1915+
assert_eq!(resolve_module(&db, &a_module_name), None);
1916+
1917+
// Now lookup the same module using the lowercase `a` and it should resolve to the file in the system site-packages
1918+
let a_module_name = ModuleName::new_static("a").unwrap();
1919+
let a_module = resolve_module(&db, &a_module_name).expect("a.py to resolve");
1920+
assert!(a_module
1921+
.file()
1922+
.path(&db)
1923+
.as_str()
1924+
.ends_with("src/a/__init__.py"),);
1925+
1926+
Ok(())
1927+
}
18451928
}

crates/red_knot_server/src/system.rs

Lines changed: 10 additions & 2 deletions
Original file line numberDiff line numberDiff line change
@@ -7,8 +7,8 @@ use lsp_types::Url;
77
use ruff_db::file_revision::FileRevision;
88
use ruff_db::system::walk_directory::WalkDirectoryBuilder;
99
use ruff_db::system::{
10-
DirectoryEntry, FileType, GlobError, Metadata, OsSystem, PatternError, Result, System,
11-
SystemPath, SystemPathBuf, SystemVirtualPath, SystemVirtualPathBuf,
10+
CaseSensitivity, DirectoryEntry, FileType, GlobError, Metadata, OsSystem, PatternError, Result,
11+
System, SystemPath, SystemPathBuf, SystemVirtualPath, SystemVirtualPathBuf,
1212
};
1313
use ruff_notebook::{Notebook, NotebookError};
1414

@@ -136,6 +136,10 @@ impl System for LSPSystem {
136136
self.os_system.canonicalize_path(path)
137137
}
138138

139+
fn path_exists_case_sensitive(&self, path: &SystemPath, prefix: &SystemPath) -> bool {
140+
self.os_system.path_exists_case_sensitive(path, prefix)
141+
}
142+
139143
fn read_to_string(&self, path: &SystemPath) -> Result<String> {
140144
let document = self.system_path_to_document_ref(path)?;
141145

@@ -219,6 +223,10 @@ impl System for LSPSystem {
219223
fn as_any_mut(&mut self) -> &mut dyn Any {
220224
self
221225
}
226+
227+
fn case_sensitivity(&self) -> CaseSensitivity {
228+
self.os_system.case_sensitivity()
229+
}
222230
}
223231

224232
fn not_a_text_document(path: impl Display) -> std::io::Error {

crates/red_knot_test/src/db.rs

Lines changed: 11 additions & 2 deletions
Original file line numberDiff line numberDiff line change
@@ -3,8 +3,8 @@ use red_knot_python_semantic::lint::{LintRegistry, RuleSelection};
33
use red_knot_python_semantic::{default_lint_registry, Db as SemanticDb};
44
use ruff_db::files::{File, Files};
55
use ruff_db::system::{
6-
DbWithWritableSystem, InMemorySystem, OsSystem, System, SystemPath, SystemPathBuf,
7-
WritableSystem,
6+
CaseSensitivity, DbWithWritableSystem, InMemorySystem, OsSystem, System, SystemPath,
7+
SystemPathBuf, WritableSystem,
88
};
99
use ruff_db::vendored::VendoredFileSystem;
1010
use ruff_db::{Db as SourceDb, Upcast};
@@ -211,6 +211,15 @@ impl System for MdtestSystem {
211211
self.as_system().read_virtual_path_to_notebook(path)
212212
}
213213

214+
fn path_exists_case_sensitive(&self, path: &SystemPath, prefix: &SystemPath) -> bool {
215+
self.as_system()
216+
.path_exists_case_sensitive(&self.normalize_path(path), &self.normalize_path(prefix))
217+
}
218+
219+
fn case_sensitivity(&self) -> CaseSensitivity {
220+
self.as_system().case_sensitivity()
221+
}
222+
214223
fn current_directory(&self) -> &SystemPath {
215224
self.as_system().current_directory()
216225
}

crates/red_knot_test/src/lib.rs

Lines changed: 3 additions & 1 deletion
Original file line numberDiff line numberDiff line change
@@ -113,7 +113,9 @@ fn run_test(
113113
.canonicalize()
114114
.expect("Canonicalizing to succeed");
115115
let root_path = SystemPathBuf::from_path_buf(root_path)
116-
.expect("Temp directory to be a valid UTF8 path");
116+
.expect("Temp directory to be a valid UTF8 path")
117+
.simplified()
118+
.to_path_buf();
117119

118120
db.use_os_system_with_temp_dir(root_path, dir);
119121
}

crates/red_knot_wasm/src/lib.rs

Lines changed: 10 additions & 2 deletions
Original file line numberDiff line numberDiff line change
@@ -11,8 +11,8 @@ use ruff_db::diagnostic::{DisplayDiagnosticConfig, OldDiagnosticTrait};
1111
use ruff_db::files::{system_path_to_file, File};
1212
use ruff_db::system::walk_directory::WalkDirectoryBuilder;
1313
use ruff_db::system::{
14-
DirectoryEntry, GlobError, MemoryFileSystem, Metadata, PatternError, System, SystemPath,
15-
SystemPathBuf, SystemVirtualPath,
14+
CaseSensitivity, DirectoryEntry, GlobError, MemoryFileSystem, Metadata, PatternError, System,
15+
SystemPath, SystemPathBuf, SystemVirtualPath,
1616
};
1717
use ruff_notebook::Notebook;
1818

@@ -260,6 +260,14 @@ impl System for WasmSystem {
260260
Err(ruff_notebook::NotebookError::Io(not_found()))
261261
}
262262

263+
fn path_exists_case_sensitive(&self, path: &SystemPath, _prefix: &SystemPath) -> bool {
264+
self.path_exists(path)
265+
}
266+
267+
fn case_sensitivity(&self) -> CaseSensitivity {
268+
CaseSensitivity::CaseSensitive
269+
}
270+
263271
fn current_directory(&self) -> &SystemPath {
264272
self.fs.current_directory()
265273
}

0 commit comments

Comments
 (0)