-
Notifications
You must be signed in to change notification settings - Fork 1.2k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[red-knot] Encapsulate module resolution logic in module.rs
#11767
Conversation
d514307
to
f1facf4
Compare
|
f1facf4
to
a2bb2e5
Compare
(Note that this PR essentially implements |
@@ -25,6 +25,7 @@ ctrlc = { version = "3.4.4" } | |||
dashmap = { workspace = true } | |||
hashbrown = { workspace = true } | |||
indexmap = { workspace = true } | |||
is-macro = { workspace = true } |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Haha, nooo xD. This is a huge setback on my quest to remove is-macro
. I prefer writing the is methods manually. Rust-analyzer and other ideas often struggle with what is-macro generats, but that's a personal preference. So, up to you if y ou want to keep it.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Lol, I should have been more opinionated. It hurts that this is merged haha
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hahaha, I'm sorry. I've never had a problem with Rust-analyzer not being able to understand these generated methods locally!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sounds like what @MichaReiser really means by "rust-analyzer and other IDEs often struggle ..." is "RustRover, which I use, often struggles ..." ;)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
(That said, I'm fine with avoiding this kind of thing -- it's also easier for humans to read explicit methods than a proc macro.)
crates/red_knot/src/main.rs
Outdated
let resolved_search_paths = | ||
ResolvedSearchPathOrder::new(vec![], workspace_search_path, None, None); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Without having read the implementation. Passing an empty vec and None
twice seems a bit weird. Is this temporary until we e.g. have typing-shed paths?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reading more through the code. Ignore my comment. What's nice about ResolvedSearchPathOrder
is that I think this can be modeled very easily with salsa when we have settings.
All we need to do is to move it into a query that takes the type checker settings and it returns the resolved search paths.
extra_paths | ||
.into_iter() | ||
.map(|path| ModuleSearchPath::new(path, ModuleSearchPathKind::Extra)) | ||
.chain(std::iter::once(ModuleSearchPath::new( | ||
workspace_root, | ||
ModuleSearchPathKind::FirstParty, | ||
))) | ||
// TODO fallback to vendored typeshed stubs if no custom typeshed directory is provided by the user | ||
.chain( | ||
custom_typeshed.into_iter().map(|path| { | ||
ModuleSearchPath::new(path, ModuleSearchPathKind::StandardLibrary) | ||
}), | ||
) | ||
.chain(site_packages.into_iter().map(|path| { | ||
ModuleSearchPath::new(path, ModuleSearchPathKind::SitePackagesThirdParty) | ||
})) | ||
// TODO vendor typeshed's third-party stubs as well as the stdlib and fallback to them as a final step | ||
.collect(), |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think not using chain
here would be easier to read
extra_paths | |
.into_iter() | |
.map(|path| ModuleSearchPath::new(path, ModuleSearchPathKind::Extra)) | |
.chain(std::iter::once(ModuleSearchPath::new( | |
workspace_root, | |
ModuleSearchPathKind::FirstParty, | |
))) | |
// TODO fallback to vendored typeshed stubs if no custom typeshed directory is provided by the user | |
.chain( | |
custom_typeshed.into_iter().map(|path| { | |
ModuleSearchPath::new(path, ModuleSearchPathKind::StandardLibrary) | |
}), | |
) | |
.chain(site_packages.into_iter().map(|path| { | |
ModuleSearchPath::new(path, ModuleSearchPathKind::SitePackagesThirdParty) | |
})) | |
// TODO vendor typeshed's third-party stubs as well as the stdlib and fallback to them as a final step | |
.collect(), | |
let mut paths = extra_paths.into_iter().map(|path| ModuleSearchPath::new(path, ModuleSearchPathKind::Extra)); | |
paths.push(ModuleSearchPath::new( | |
workspace_root, | |
ModuleSearchPathKind::FirstParty, | |
))); | |
paths.extend(custom_typeshed.into_iter().map(|path| { | |
ModuleSearchPath::new(path, ModuleSearchPathKind::StandardLibrary) | |
})); | |
paths.extend(site_packages.into_iter().map(|path| { | |
ModuleSearchPath::new(path, ModuleSearchPathKind::SitePackagesThirdParty) | |
})); | |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hmm, to me it seems a fair bit more verbose but no more readable :/
--- a/crates/red_knot/src/module.rs
+++ b/crates/red_knot/src/module.rs
@@ -430,26 +430,40 @@ impl ResolvedSearchPathOrder {
site_packages: Option<PathBuf>,
custom_typeshed: Option<PathBuf>,
) -> Self {
- Self(
+ let mut paths: Vec<ModuleSearchPath> = Vec::with_capacity(
+ extra_paths.len()
+ + 1
+ + usize::from(site_packages.is_some())
+ + usize::from(custom_typeshed.is_some()),
+ );
+
+ paths.extend(
extra_paths
.into_iter()
- .map(|path| ModuleSearchPath::new(path, ModuleSearchPathKind::Extra))
- .chain(std::iter::once(ModuleSearchPath::new(
- workspace_root,
- ModuleSearchPathKind::FirstParty,
- )))
- // TODO fallback to vendored typeshed stubs if no custom typeshed directory is provided by the user
- .chain(
- custom_typeshed.into_iter().map(|path| {
- ModuleSearchPath::new(path, ModuleSearchPathKind::StandardLibrary)
- }),
- )
- .chain(site_packages.into_iter().map(|path| {
- ModuleSearchPath::new(path, ModuleSearchPathKind::SitePackagesThirdParty)
- }))
- // TODO vendor typeshed's third-party stubs as well as the stdlib and fallback to them as a final step
- .collect(),
- )
+ .map(|path| ModuleSearchPath::new(path, ModuleSearchPathKind::Extra)),
+ );
+
+ paths.push(ModuleSearchPath::new(
+ workspace_root,
+ ModuleSearchPathKind::FirstParty,
+ ));
+
+ // TODO fallback to vendored typeshed stubs if no custom typeshed directory is provided by the user
+ paths.extend(
+ custom_typeshed
+ .into_iter()
+ .map(|path| ModuleSearchPath::new(path, ModuleSearchPathKind::StandardLibrary)),
+ );
+
+ paths.extend(
+ site_packages.into_iter().map(|path| {
+ ModuleSearchPath::new(path, ModuleSearchPathKind::SitePackagesThirdParty)
+ }),
+ );
+
+ // TODO vendor typeshed's third-party stubs as well as the stdlib and fallback to them as a final step
+
+ Self(paths)
}
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah, I probably would forgo the with_capacity
optimisation which you also won't get with std::iter::chain
because ExactSizeIterator
isn't implemented for chained (because it could overflow).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
which you also won't get with
std::iter::chain
becauseExactSizeIterator
isn't implemented for chained (because it could overflow)
Ahh, I didn't realise that! But it makes sense now that you mention it.
I still kinda like the chain()
personally, though 🤷
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I guess we could also shrink_to_fit
the paths before returning, but probably doesn't matter given that few of these are ever created.
crates/red_knot/src/module.rs
Outdated
/// - `custom_typeshed` is a path to standard-library typeshed stubs. | ||
/// Currently this has to be a directory that exists on disk. | ||
/// (TODO: fall back to vendored stubs if no custom directory is provided.) | ||
impl ResolvedSearchPathOrder { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I would probably call this ResolvedSearchPathOrder
because an instance of it doesn't implement an order, it's the ordered search paths.
An alternative would be that SearchPathOrder
sorts the paths: Given a list of ModuleSearchPath
, sort the paths according to PEP-561 (using the `ModuleSearchPathKind)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I would probably call this
ResolvedSearchPathOrder
Not sure I understand -- that's what it's already called
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think I see what you mean, though. I renamed it to OrderedSearchPaths
in 4f0bcf5
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Oops sorry. That should have been *I probably wouldn't *. I didn't sleep well tonight and it is really showing. Sorry for this.
pub fn new( | ||
extra_paths: Vec<PathBuf>, | ||
workspace_root: PathBuf, | ||
site_packages: Option<PathBuf>, | ||
custom_typeshed: Option<PathBuf>, | ||
) -> Self { | ||
Self( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think a slightly nicer API (requires a few more allocations but whathever, we build this once) is to use a builder:
ResolvedSearchPathBuilder::workspace(workspace_root)
.extras(extras)
.site_packages(site_packages)
.typeshed(typeshed)
.build()
where ResolvedSearchPathBuilder
is
struct ResolvedSearchPathBuilder {
extras: Vec<PathBuf>,
workspace_root: PathBuf,
typeshed: Option<PathBuf>,
site_packages: Option<PathBuf>
}
But I don't think it realy maters.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
👍 I considered this, but here I think I prefer to force us to explicitly state whether we're passing in site-packages
or whether we want to run in "isolated mode", ignoring all site-packages
packages.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I did almost comment on the call to OrderedSearchPaths::new
above that it's hard to read/understand the call in code review, without IDE support, and without Python keyword arguments. Not sure how much this matters. Another way to address it would be taking in a config struct as argument instead of just positional args.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think we can ignore it for now. Ultimately what we'll do is to read the settings and take the fields from there (or fall back to other defaults).
|
||
// e.g. site packages | ||
ThirdParty, | ||
/// Project dependency |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I find the word "dependency" here confusing. I wouldn't think of first-party code as a "dependency" -- the dependencies probably live in site-packages.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I just carried over the pre-existing comment here, but yeah, agree that this isn't the best terminology. I'll fix it up in a followup.
@@ -25,6 +25,7 @@ ctrlc = { version = "3.4.4" } | |||
dashmap = { workspace = true } | |||
hashbrown = { workspace = true } | |||
indexmap = { workspace = true } | |||
is-macro = { workspace = true } |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
(That said, I'm fine with avoiding this kind of thing -- it's also easier for humans to read explicit methods than a proc macro.)
|
||
// e.g. built-in modules, typeshed | ||
/// e.g. built-in modules, typeshed |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
the stdlib portion of typeshed
@@ -225,22 +225,23 @@ struct ModuleSearchPathInner { | |||
kind: ModuleSearchPathKind, | |||
} | |||
|
|||
#[derive(Copy, Clone, Debug, Eq, PartialEq, Hash)] | |||
#[derive(Copy, Clone, Debug, Eq, PartialEq, Hash, is_macro::Is)] | |||
pub enum ModuleSearchPathKind { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Worth a comment that these are listed here in priority order? And even a link to the relevant page of the typing spec (soon hopefully to be updated to match what we're doing here)?
pub fn new( | ||
extra_paths: Vec<PathBuf>, | ||
workspace_root: PathBuf, | ||
site_packages: Option<PathBuf>, | ||
custom_typeshed: Option<PathBuf>, | ||
) -> Self { | ||
Self( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I did almost comment on the call to OrderedSearchPaths::new
above that it's hard to read/understand the call in code review, without IDE support, and without Python keyword arguments. Not sure how much this matters. Another way to address it would be taking in a config struct as argument instead of just positional args.
extra_paths | ||
.into_iter() | ||
.map(|path| ModuleSearchPath::new(path, ModuleSearchPathKind::Extra)) | ||
.chain(std::iter::once(ModuleSearchPath::new( | ||
workspace_root, | ||
ModuleSearchPathKind::FirstParty, | ||
))) | ||
// TODO fallback to vendored typeshed stubs if no custom typeshed directory is provided by the user | ||
.chain( | ||
custom_typeshed.into_iter().map(|path| { | ||
ModuleSearchPath::new(path, ModuleSearchPathKind::StandardLibrary) | ||
}), | ||
) | ||
.chain(site_packages.into_iter().map(|path| { | ||
ModuleSearchPath::new(path, ModuleSearchPathKind::SitePackagesThirdParty) | ||
})) | ||
// TODO vendor typeshed's third-party stubs as well as the stdlib and fallback to them as a final step | ||
.collect(), |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I guess we could also shrink_to_fit
the paths before returning, but probably doesn't matter given that few of these are ever created.
if module_search_path.kind().is_standard_library() { | ||
package_path = package_path.join(TYPESHED_STDLIB_DIRECTORY); | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I guess this is so when people point to a custom typeshed, they can just point to the root of the typeshed repo?
What about instead resolving this once upfront in the way we set our stdlib and typeshed-third-party search paths, so we don't have to do it on every resolution?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I guess this is so when people point to a custom typeshed, they can just point to the root of the typeshed repo?
Yup. That's what mypy and pyright do currently, and it's what I would expect a --custom-typeshed-dir
argument to do.
What about instead resolving this once upfront in the way we set our stdlib and typeshed-third-party search paths, so we don't have to do it on every resolution?
Could do! I decided to go this way so that calling .path()
on the ModuleSearchPath
instance would give you the path for typeshed
rather than the path for typeshed/stdlib
. That felt like it made the most sense here. As discussed elsewhere, I think we're going to have to do different things for each different kind of module search path anyway, so we're not going to be able to treat all paths in the resolved vector the same way even if this module search path pointed to typeshed/stdlib
rather than just typeshed
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
To me it seems more useful if .path()
on the ModuleSearchPath
instance points to the actual path imports will be resolved from.
And agreed that foo-stubs
packages mean special handling for resolutions in site-packages specifically, but in general the more work we can do upfront and the less work we can do on resolution of each package, the better.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Okay, no strong opinion from me on this; I can change this!
This PR encapsulates all module-resolution logic in
module.rs
. Currently aVec
of search paths is passed into the routines inmodule.rs
, but this opens the door to the search paths being passed in an incorrect order. The order of search paths should be maintained as an invariant via a common routine to avoid the possiblity of error.ResolvedSearchPathOrder::new()
implements the module resolution order given at https://typing.readthedocs.io/en/latest/spec/distributing.html#import-resolution-ordering, with some small differences (applying the changes proposed in https://discuss.python.org/t/pep-561-s-module-resolution-order-seems-incorrect/55048).Work towards #11653