Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Avoid non-local definitions in functions #3373

Merged

Conversation

joshtriplett
Copy link
Member

@joshtriplett joshtriplett commented Jan 20, 2023

Summary

Add a warn-by-default lint for items inside functions or expressions that implement methods or traits that are visible outside the function or expression. Consider ramping that lint to deny-by-default for Rust 2024, and evaluating a hard error for 2027.

Motivation

Currently, tools cross-referencing uses and definitions (such as IDEs) must
search inside all function bodies to find potential definitions corresponding
to uses within a function.

Humans cross-referencing such uses and definitions may find themselves
similarly baffled.

With this change, both humans and tools can limit the scope of their search and
avoid looking for definitions inside other functions.

Rendered

Tracking Issue

@joshtriplett joshtriplett added the T-lang Relevant to the language team, which will review and decide on the RFC. label Jan 20, 2023
@steffahn
Copy link
Member

steffahn commented Jan 20, 2023

With this change, both humans and tools can limit the scope of their search and avoid looking for definitions inside other functions.

Is this really something that editions could give us? I would have thought, given your description, that macros defined in edition <=2021 crates would still be allowed to expand to functions containing externally visible method implementations, even if the macro is called in an edition 2024 crate; at least that's how I remember macros to work with regards to other edition-based changes: Users of the macro are not supposed to be broken, i.e. it should still be possible to use existing macros without needing to re-write / modify the crate defining that macro. If this isn't supposed to be the case, maybe it would help to more clearly clarify this point? Or is this what the drawbacks section hints at?

(By the way, the “Rendered” link is now broken.)

@Lokathor
Copy link
Contributor

If "you" (the thing analyzing rust code) can see that the current crate is 2024 and also the macros you're using are from 2024 then you'll know that you don't have to check function bodies.

Over time most crates are likely to move to new editions, so it will slowly bring folks to the new style.

@steffahn
Copy link
Member

To clarify my previous comment: of course there is improvement for humans, because these externally visible impls will become more rare; but I don't exactly see how much tools can benefit, unless those tools want to incorrectly handle corner cases (and also drop support for older editions).

@Lokathor
Copy link
Contributor

Both tools and humans would have to handle older editions correctly, but there's still a big potential speed benefit as things move to the new style. Humans have to read less and Computers have to crunch less. Taking less time to get the result because you have to look at less stuff to know the right answer is a benefit all on its own.

@c-git
Copy link

c-git commented Jan 20, 2023

If someone has time for the uninitiated, please post an example of this internal definitions that are externally accessible. Not a big need but I'm curious to see what it looks like, I didn't even know that was a thing. I would have assumed that it was local.

@thomcc
Copy link
Member

thomcc commented Jan 20, 2023

If someone has time for the uninitiated, please post an example of this internal definitions that are externally accessible. Not a big need but I'm curious to see what it looks like, I didn't even know that was a thing. I would have assumed that it was local.

https://play.rust-lang.org/?version=stable&mode=debug&edition=2021&gist=65532b0d964308e36555207fb14c4770

@clarfonthey

This comment was marked as resolved.

@steffahn

This comment was marked as resolved.

@petrochenkov
Copy link
Contributor

If this is done then it would be good to add the same restriction to #[macro_export] macro_rules! { ... } because they also have similar global effect.

@petrochenkov
Copy link
Contributor

@Lokathor

Both tools and humans would have to handle older editions correctly, but there's still a big potential speed benefit as things move to the new style.

It would be good to see some confirmation that the speed benefit indeed exists.

If "you" (the thing analyzing rust code) can see that the current crate is 2024 and also the macros you're using are from 2024 then you'll know that you don't have to check function bodies.

It may be somewhat more complex than it seems because editions are tracked per-token rather than per-macro.

may not define an `impl Type` block unless the `Type` is also nested inside
the same function or closure.
- An item nested inside a function or closure (through any level of nesting)
may not define an `impl Trait for Type` unless either the `Trait` or the
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What about impl Inner<Outer> for Type or impl Trait for Inner<Outer>?

Copy link
Member

@lnicola lnicola Jan 20, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If Inner and Outer (and you've called that simply because generics are involved) are non-local traits/types, these are just as problematic.

If Inner is a local (declared next to the impl) type or trait and Outer is non-local, I think it's fine to allow them because they won't be visible to the code outside.

I think it boils down to: "looking at some code, can we resolve the methods and names in it without parsing the bodies of the other functions (statics, consts etc.), with the exception of the ancestors of the given code?".

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I meant "Inner" == "local", and "Outer" == "non-local".

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If only parameters of the trait/type are non-local I think it should be fine? Those impls still can't be observed from the outside since you'd need to be able to mention the trait/type still if I am not mistaken

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not necessarily, you can actually leak local types to the outside like this: https://play.rust-lang.org/?version=stable&mode=debug&edition=2021&gist=1921433dd4209e89c7c96949a23e99b8

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Neither the trait nor the type in that impl are local to the function though, so that should be rejected by these restrictions.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

And potentially also cases like in https://internals.rust-lang.org/t/overly-strict-coherence-for-constrained-blanket-implementations/18204/8. I think there's like a whole coherence aspect to this?

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You can still leak non-opaque types while following the proposed rules. If that seems far fetched, consider -> impl IntoIterator<Item = X> where the IntoIter associated type is not specified (but you're going to generate one).

@joshtriplett
Copy link
Member Author

@petrochenkov wrote:

If this is done then it would be good to add the same restriction to #[macro_export] macro_rules! { ... } because they also have similar global effect.

Good catch. Added.

@joshtriplett
Copy link
Member Author

To clarify my previous comment: of course there is improvement for humans, because these externally visible impls will become more rare; but I don't exactly see how much tools can benefit, unless those tools want to incorrectly handle corner cases (and also drop support for older editions).

According to rust-analyzer folks, r-a currently doesn't handle this case, and doesn't plan to.

@Veykril
Copy link
Member

Veykril commented Jan 20, 2023

According to rust-analyzer folks, r-a currently doesn't handle this case

r-a only handles local impls that are observable from the current or parent scopes, so

struct S;
fn f() {
	impl S {
		fn fun() {}
	}
	S::fun(); // resolves for r-a because the impl is in this scope and therefor cheap to observe
}
fn g() {
    S::fun(); // fails to resolve for r-a because we don't know about the impls defined in other bodies
}

, and doesn't plan to.

correct, for us to support this we would need to know about all impls at all times, which means more memory usage, and even worse meaning we are required to look into all bodies to figure out what something resolves to which makes the IDE anything but snappy. Hence we will presumably never support this.

@afetisov
Copy link

Should this be a hard error, or just a deny-by-default lint? Given that it will still be possible in older editions, and there is a lot of old code out there, the IDEs would either have to handle or ignore that edge case anyway. I doubt that I saw many usages of that pattern in the wild.

The benefits of making it a lint would be that some tricky macro code or rustdoc which use that pattern would still be able to use it. An explicit #[allow(inner_pub_impls)] annotation could also serve as a warning to readers and tools, if they want to handle this edge case.

@lnicola
Copy link
Member

lnicola commented Jan 20, 2023

I doubt that I saw many usages of that pattern in the wild.

I think serde used to do this. It's a single usage, but if an IDE doesn't support serde or some other popular crate, users will be very upset. serde is now using an anonymous const of unit type, like in rust-lang/rust-analyzer#7550.

the IDEs would either have to handle or ignore that edge case anyway

The IDEs are not going to want to handle this case, and allowing it will usually lead a poor IDE experience and user complaints.

The benefits of making it a lint would be that some tricky macro code or rustdoc which use that pattern would still be able to use it.

Do we know some examples of this where the top-level const trick wouldn't help?

@weiznich
Copy link
Contributor

As a maintainer of a potentially affected crate, I miss details about how this should interact with the edition boundary and proc-macros. Proc-macros generate code in the context of the crate with the source that the proc macro is applied to. By definition the author of the proc-macro cannot control the edition of that crate, but only the edition of the proc-macro crate itself. I would like to see details about how you imaging to prevent breaking changes for such proc macro crates compiled with an old edition, but used in a >= 2024 edition rust crate.

I also would like to see a section about alternatives to the "removed" design pattern. After all, proc macro authors did use this pattern for some reasons, so just removing it without replacement feels like restricting what proc macros can do.

I doubt that I saw many usages of that pattern in the wild.

I think serde used to do this. It's a single usage, but if an IDE doesn't support serde or some other popular crate, users will be very upset. serde is now using an anonymous const of unit type, like in rust-lang/rust-analyzer#7550.

Diesel did something like that as well. Similarly to serde we have switched to an anonymous const now, as that's the thing that is currently supported by rust-analyzer.

@afetisov
Copy link

The IDEs are not going to want to handle this case, and allowing it will usually lead a poor IDE experience and user complaints.

As simple test shows that Intellij-Rust already handles this pattern. If RA folks don't want to handle it, that's their own decision. Personally I don't believe that it's as big deal as people say in comments: skipping entire modules is a major win for IDE performance, but with the way impls are designed it's impossible. So you still need to parse all modules and expand all macros if you want to find all possible impls. Once you do it, parsing the bodies of functions and adding external impls to the index is a really minor difference.

There could possibly be IDE performance wins if all items inside of function bodies were forbidden, or perhaps at least if types and impls could not be declared inside of function bodies. But simply forbidding impls of outer types for outer traits has pretty minor wins: you still need to parse all bodies, all impls inside them, and you need to check whether that impl should be marked as error, and there would be some complicated rules, similar to orphan rules, about which exact combinations of types and traits is forbidden. What are the conditions where

impl<T: Trait> Frob<Foo, Item=Bar> for Baz<T, Quux, Elt=Pooh, const N: Moo>

is forbidden? It looks like this rule adds more complexity than takes away.

@lnicola
Copy link
Member

lnicola commented Jan 20, 2023

As simple test shows that Intellij-Rust already handles this pattern.

Yeah, that appears to be true.

There could possibly be IDE performance wins if all items inside of function bodies were forbidden, or perhaps at least if types and impls could not be declared inside of function bodies.

But that's the thing, a type declared in a function won't be visible outside of it.

What are the conditions where impl<T: Trait> Frob<Foo, Item=Bar> for Baz<T, Quux, Elt=Pooh, const N: Moo> is forbidden? It looks like this rule adds more complexity than takes away.

As a first approximation, when none of those are declared inside the current block.

@afetisov
Copy link

afetisov commented Jan 20, 2023

But that's the thing, a type declared in a function won't be visible outside of it.

The type won't be, but the impl will be, and a good IDE should be able to search for trait impls thoroughly.

trait Trait {}

fn f() -> impl Trait {
    struct Foo;
    impl Trait for Foo {}
    
    Foo
}

If the result is passed around a few functions, maybe even cast to a trait object, finding the root function can be quite hard. It's much better if, having the question "what this &dyn Trait could point to", I can just search for impls of Trait.

Point being, this RFC doesn't solve any of the real pain points for impl search. All the really hard parts stay: the unbounded search (any module in any crate could have an impl), the parsing of function bodies and inner items, macro expansion, name resolution in item interfaces (since we need to know which items are local), likely even type inference (if we need to disambiguate default type parameters). At this point what does the proposed restriction give us? We can just add the impl to the index and be done with it, instead of running extra complex checks.

@rfcbot rfcbot removed the proposed-final-comment-period Currently awaiting signoff of all team members in order to enter the final comment period. label Jan 10, 2024
@rfcbot
Copy link
Collaborator

rfcbot commented Jan 10, 2024

🔔 This is now entering its final comment period, as per the review above. 🔔

@scottmcm
Copy link
Member

scottmcm commented Jan 10, 2024

I don't actually like this nesting, so I don't have any concerns with having a lint against it.

That said, it's unclear to me how much it's helping a bunch of the motivations if it's only a lint. Looking at

Currently, tools cross-referencing uses and definitions (such as IDEs) must either search inside all function bodies and other expression-containing items to find potential definitions corresponding to uses within another function, or not cross-reference those definitions at all.

Humans cross-referencing such uses and definitions may find themselves similarly baffled.

Both are still problems even with a deny-by-default lint.

So while I don't think it raises to the level of a concern, I'd have been more tempted to postpone this, pending conversation about things like the problem space described in #3527 that want new places to write things and being willing to hard-error it for tools to rely on it.

@dtolnay
Copy link
Member

dtolnay commented Jan 10, 2024

Unresolved questions

Should we flag these definitions in anonymous const items as well, or would that produce unwanted warnings?

As one data point, I do not understand how serde_derive would accommodate this. (Edit: the version where it would also trigger for anonymous const. TC pointed out, the RFC has a carveout for anonymous const.)

Right now #[derive(Serialize)] struct Struct ... expands to:

const _: () = {
    extern crate serde as _serde;
    impl _serde::Serialize for Struct {
        ...
    }
};

I don't know of any other expansion it could use that doesn't rely on impl of non-local trait for non-local type in an expression-containing item.

In particular, changing the macro to produce something like:

impl ::serde::Serialize for Struct {...}

is incompatible with 2015 edition. Releasing serde 2.0 just to drop support for calling serde derives from 2015 edition code is not appealing.

Whereas:

impl serde::Serialize for Struct {...}

means something different and would break a lot of code.

So making the lint apply for anonymous const (as far as hard error in 2027) can't work until something like rust-lang/rust#54363 (comment) is available.

@traviscross
Copy link
Contributor

Note that the proposed RFC does specifically exempt the const _: .. = .. pattern by excluding anonymous const items from its definition of applicable "expression-containing items":

or non-anonymous const items

(Emphasis added.)

It does flag this as an unresolved question:

Should we flag these definitions in anonymous const items as well, or would that produce unwanted warnings?


@rustbot labels -I-lang-nominated

We discussed this today on the T-lang call, and this is now in FCP, so let's unnominate.

@rustbot rustbot removed the I-lang-nominated Indicates that an issue has been nominated for prioritizing at the next lang team meeting. label Jan 10, 2024
@dtolnay
Copy link
Member

dtolnay commented Jan 10, 2024

Good call — I would love to see that unresolved question resolved in the direction of no warning for anonymous const, based on #3373 (comment).

@tmandry
Copy link
Member

tmandry commented Jan 16, 2024

@rfcbot reviewed

@rfcbot rfcbot added finished-final-comment-period The final comment period is finished for this RFC. to-announce and removed final-comment-period Will be merged/postponed/closed in ~10 calendar days unless new substational objections are raised. labels Jan 20, 2024
@rfcbot
Copy link
Collaborator

rfcbot commented Jan 20, 2024

The final comment period, with a disposition to merge, as per the review above, is now complete.

As the automated representative of the governance process, I would like to thank the author for their work and everyone else who contributed.

This will be merged soon.

@traviscross
Copy link
Contributor

We've accepted this RFC and it has now been merged.

Thanks to @joshtriplett for putting this together and to all reviewers who provided helpful feedback.

We've opened a tracking issue in rust-lang/rust#120363. Follow along there for further updates.

@GoldsteinE
Copy link

Are there plans to relax the requirements for macros defined in older-edition crates? As it stands, making this deny-by-default in 2024 means that you can’t upgrade to 2024 unless all your macro dependencies upgraded / introduced a mitigation, which kind of goes against the spirit of “crates from different editions should be able to work together”.

I understand that this applies to a lot of possible edition changes, but this particular change introduces a restriction for an extremely widespread pattern. If you have an abandoned dependency somewhere in dependency tree (which is IMHO not uncommon for Rust projects) it may mean that you can’t upgrade to 2024 ever unless you remove the problematic dependency tree branch, which could be really painful.

@Urgau
Copy link
Member

Urgau commented Feb 26, 2024

@GoldsteinE

  • As stated in the RFC the lint is attached to the impl keyword, so if the impl definition is coming from a macro crate under the 2021 edition it will follow the current level (warn-by-default), even if used in a Edition 2024 crate.
  • It was not yet decided (may != will) to bump the lint to deny-by-default in Edition 2024.

In other words, this won't break any older-edition crates.

Please a open issue on rust-lang/rust for further discussions, closed RFC PR are not a good avenue for discussion.

@dhardy
Copy link
Contributor

dhardy commented Mar 3, 2024

This breaks the design of rand::distributions::Standard (for usage inside "expression-containing items"). Copying out the provided doc-example:

use rand::Rng;
use rand::distributions::{Distribution, Standard};

struct MyF32 {
    x: f32,
}

impl Distribution<MyF32> for Standard {
    fn sample<R: Rng + ?Sized>(&self, rng: &mut R) -> MyF32 {
        MyF32 { x: rng.gen() }
    }
}

As has been demonstrated above and in rust-lang/rust#121621, this impl is observable externally, but only via type inference. This type inference is only possible when it is known that no other impl exists. In this particular case, several other impls of Distribution<T> for Standard already exist so such inference is impossible.

@dhardy
Copy link
Contributor

dhardy commented Mar 3, 2024

Motion: re-open this RFC and expand to document the impl ForeignTrait<LocalType> for ForeignType case, which has now come up several times.

Suggestion: this case should be allowed in cases where type inference from an external location cannot resolve LocalType. (Possibly we should add std::marker::PhantomType and consider that any impl parametrised over PhantomType can never have this type inferred.)

Motion: "expression-containing items" should not be a concept as part of the Rust spec. Possibly instead restrict impls to not be more embedded than any trait/type they are defined over, with a single exception for modules.

@RalfJung
Copy link
Member

RalfJung commented Mar 3, 2024

This breaks the design of rand::distributions::Standard (for usage inside "expression-containing items"). Copying out the provided doc-example:

I don't see a non-local definition in that example.

Also, comments in a closed PR are likely just going to be lost. This should be filed as a new issue and referenced from the tracking issue. I will lock this PR as there have been further comments even after an explicit statement that this is the wrong place for such comments.

@rust-lang rust-lang locked and limited conversation to collaborators Mar 3, 2024
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
disposition-merge This RFC is in PFCP or FCP with a disposition to merge it. finished-final-comment-period The final comment period is finished for this RFC. T-lang Relevant to the language team, which will review and decide on the RFC. T-types Relevant to the types team, which will review and decide on the RFC. to-announce
Projects
Status: Done
Development

Successfully merging this pull request may close these issues.