RFC: Assume bounds for generic functions #3802

Closed
wants to merge 3 commits into from


@DasLixou DasLixou commented Apr 21, 2025

This proposal adds support for #[unsafe(assume)]-ing conditions in where clauses to help with complex generic call stacks and hinting for higher-ranked bounds.

Rendered

@jieyouxu jieyouxu added the T-types Relevant to the types team, which will review and decide on the RFC. label Apr 21, 2025
@programmerjake
Member

this sounds like just adding late-checked bounds, which isn't necessarily unsafe since the compiler could in theory still do all bounds checking at monomorphization time (more or less exactly how C++ templates work), but it does lead to the almost totally unreadable error messages that C++ templates are infamous for.

@DasLixou
Author

> this sounds like just adding late-checked bounds, which isn't necessarily unsafe since the compiler could in theory still do all bounds checking at monomorphization time (more or less exactly how C++ templates work), but it does lead to the almost totally unreadable error messages that C++ templates are infamous for.

As mentioned in the RFC, a post-monomorphization check could help catch some errors, but not all of them. The feature definitely shouldn't be used lightly, so unreadable errors aren't really that much of a problem.
The RFC also explains why it is unsafe: at monomorphization, the compiler is no longer aware of lifetimes, which means something like

fn test<T>()
where
    #[unsafe(assume)] T: Cool<'static>
{}

can't be proven or disproven anymore.
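
A small stable-Rust sketch can make the lifetime point concrete. It relies only on `std::any::type_name`, whose output is documented as best-effort, so the code compares two results with each other instead of relying on an exact string. `name_of` and `names` are hypothetical helpers, not part of the RFC.

```rust
use std::any::type_name;

/// Hypothetical helper: returns the compiler's name for `T`.
fn name_of<T>(_value: &T) -> &'static str {
    type_name::<T>()
}

/// Hypothetical helper: names the type of a `&'static str` and of a `&str`
/// that borrows from a local `String`.
fn names() -> (&'static str, &'static str) {
    let owned = String::from("short-lived");
    let static_ref: &'static str = "static";
    let local_ref: &str = owned.as_str();
    (name_of(&static_ref), name_of(&local_ref))
}

fn main() {
    let (a, b) = names();
    // The two instantiations differ only in a lifetime, and the reported
    // type names do not reflect that difference.
    assert_eq!(a, b);
    println!("{a} == {b}");
}
```

Since the two instantiations are indistinguishable here, a check that runs this late would have nothing to tell `Cool<'static>` apart from `Cool<'a>`.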


`assume`d bounds are just skipped during bounds check and we trust the user.

Later, the compiler could assist with some wrong conditions. If, for example, I passed something in here which doesn't implement `Debug`, the compiler could tell me post-monomorphization that this assumed trait bound is not fulfilled for that _specific_ type. But you shouldn't depend on this 100%, because lifetimes, for example, aren't preserved up to that stage, so any lifetime-dependent condition is completely unchecked, thus making it `unsafe`.
Contributor

Is this just a lint, or might it become a hard error? (Hard errors could cause problems if people are using this feature in dead code.)

Author

Wait, is it a problem because it might error without being used, or because it is eliminated before the error is reported, so that there's no error?

Contributor

@Jules-Bertholet Jules-Bertholet Apr 22, 2025

> because it might error without being used

This one. E.g., code like this:

if check_that_its_really_debug() {
    unsafe { assume_t_implements_debug::<T>() };
}

Author

Ohhh, that's what you mean by dead code. Yeah, then a hint is probably better, if not something even weaker.
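
The dead-code pattern discussed in this thread already exists in stable Rust in a different form. The sketch below is an analogy, not the RFC's feature: it guards an `unsafe` cast with a runtime `TypeId` check, so the guarded branch is compiled for every `T` but only runs when the assumption actually holds. `describe` is a hypothetical name.

```rust
use std::any::TypeId;

/// Hypothetical function: formats `val` as a `u32` if `T` is `u32`.
fn describe<T: 'static>(val: &T) -> String {
    if TypeId::of::<T>() == TypeId::of::<u32>() {
        // SAFETY: the check above proves that `T` is exactly `u32`,
        // so reinterpreting the reference is sound in this branch.
        let as_u32: &u32 = unsafe { &*(val as *const T as *const u32) };
        format!("u32: {as_u32}")
    } else {
        String::from("something else")
    }
}

fn main() {
    // For `T = &str` the `unsafe` branch is still compiled, but never runs.
    assert_eq!(describe(&7u32), "u32: 7");
    assert_eq!(describe(&"text"), "something else");
}
```

A hard post-monomorphization error on an unmet assumed bound would reject the analogous code for every `T` that takes the `else` branch, which is the concern raised above.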

@programmerjake
Member

> It also explains why it is unsafe, because at monomorphization, the compiler isn't aware of lifetimes anymore, which means something like

yes, hence why I said theoretically, since, to do full checking, the compiler would have to be rearchitected to keep lifetimes around until monomorphization (which is very unlikely to happen).

@Jules-Bertholet
Contributor

> this sounds like just adding late-checked bounds, which isn't necessarily unsafe since the compiler could in theory still do all bounds checking at monomorphization time

It's not, because not having a post-mono check means you are not restricted in what you can do in dead code.

@kennytm
Member

kennytm commented Apr 22, 2025

This is already possible using (full) specialization, and I'd argue it is better to evaluate this under a feature with which we already have experience.

#![feature(specialization)]

use std::fmt::Display;

fn print<T: Display>(val: T) {
    println!("good! {val}");
}

fn less_restricted<T>(val: T) {
    trait PrintAssumed {
        fn print_assumed(self);
    }
    impl<X> PrintAssumed for X {
        default fn print_assumed(self) {
            unsafe extern "C" {
                #[link_name = "\n\n[ERROR] less_restricted() called without satisfying T: Display\n\n"]
                fn error();
            }
            unsafe {
                error();
            }
        }
    }
    impl<X: Display> PrintAssumed for X {
        fn print_assumed(self) {
            print(self)
        }
    }

    PrintAssumed::print_assumed(val)
}

fn main() {
    print(5);
    less_restricted(5);
    print("hello");
    less_restricted("hello");
    // print(Some("not display")); // compile error
    // less_restricted(Some("not display")); // linker error

    struct OnlyDisplayIfStatic<'a>(&'a str);
    impl Display for OnlyDisplayIfStatic<'static> {
        fn fmt(&self, f: &mut std::fmt::Formatter) -> std::fmt::Result {
            write!(f, "static: {}", self.0)
        }
    }

    print(OnlyDisplayIfStatic("static"));
    less_restricted(OnlyDisplayIfStatic("static"));
    let bad = "bad".to_string();
    // print(OnlyDisplayIfStatic(&bad)); // compile error
    less_restricted(OnlyDisplayIfStatic(&bad)); // pass, UB.
}

@DasLixou
Author

DasLixou commented Apr 22, 2025

Interesting idea, but especially with your 'static example below you imply that specialization will get lifetime support, somehow.. and I don't think there are that many people who want to make this with the current compiler (if I'm up to date with the debate)

Edit: oh yeah sorry you even wrote UB there, skipped that

@kennytm
Member

kennytm commented Apr 22, 2025

> Interesting idea, but especially with your 'static example below you imply that specialization will get lifetime support, somehow.. and I don't think there are that many people who want to make this with the current compiler (if I'm up to date with the debate)

It is no different from this RFC itself, under which you can't prevent anyone from calling print_assumed(OnlyDisplayIfStatic(&bad)). Isn't this also the rationale for making the attribute unsafe:

  • ..., as for example lifetimes aren't preserved up to that [post-monomorphization] stage, so any lifetime-dependent condition is completely unchecked, thus making it unsafe.

So an #[unsafe(assume)] bound should unconditionally force the function print_assumed to be declared unsafe fn. But a proc-macro generating the specialization above can do the same by marking the generated PrintAssumed::print_assumed as unsafe fn.

# Guide-level explanation
[guide-level-explanation]: #guide-level-explanation

When implementing a function with a `where` clause, like this one:
Member

what about non-functions

trait Foo<U> where #[unsafe(assume)] U: Iterator {
    type Item where #[unsafe(assume)] <U as Iterator>::Item: Into<u16>;
}

impl<U> Foo<U> for U where #[unsafe(assume)] U: Future {
    type Item = U::Output;
}

Author

:0 interesting idea, should probably also work on those..

Contributor

That seems like it would be much more difficult to implement/develop coherent semantics for.

@DasLixou
Author

@kennytm yeah, it isn't very different from the RFC, but the problem with specialization is that the UB there doesn't come from something marked unsafe, but rather from specialization itself being unsound. So how this works out relies solely on how specialization progresses.

@clarfonthey

I'm a bit confused on the benefit of these bounds at all. Like, in what circumstances would it be useful to have them either over a regular bound, or no bound at all?

@DasLixou
Author

> I'm a bit confused on the benefit of these bounds at all. Like, in what circumstances would it be useful to have them either over a regular bound, or no bound at all?

Only on rare occasions, like where the compiler can't verify it itself because of too many indirections or when you really don't want to go through 30 layers of generic functions

@Jules-Bertholet
Contributor

> > I'm a bit confused on the benefit of these bounds at all. Like, in what circumstances would it be useful to have them either over a regular bound, or no bound at all?
>
> Only on rare occasions, like where the compiler can't verify it itself because of too many indirections or when you really don't want to go through 30 layers of generic functions

It would be nice to see a concrete, real-world example.

@Noratrieb
Member

This is a feature with a very major impact on the type system of the Rust language, and such features are not added lightly. The RFC is very short, containing only a short motivation with very few details. This makes it hard to extract what exactly the problem is that you're having, and what other solutions there could be.
As for these other solutions, the RFC makes no attempt to consider alternative solutions to this problem at all. Especially for a large change to the type system, there is a very high chance that many other approaches to this problem exist, and they should all be laid out explicitly and evaluated against each other to determine the best solution. It seems very unlikely to me that the approach proposed here would be the winner.

One major feature that relates to this is "implied bounds", RFC Tracking Issue. It's not guaranteed that this will ever land either, but if you do want to solve your problem, that direction seems a lot more promising, so it's probably better to invest your time there instead of pursuing this RFC, which is likely a dead-end (I am not on a relevant team to make the final call about this, but I can't imagine a world in which this is accepted as-is today).

This is an area with many hidden complexities, so working on it will not yield immediate returns as things like this take time, but if you want to work on this, I really recommend looking into alternative approaches like implied bounds, or entirely different directions you may come up with.

@Jules-Bertholet
Contributor

One possible use case for this is expressing bounds that the compiler can’t understand yet. For example:

unsafe fn foo<T>(param: T)
where
    // We actually only need `T: for<'a, 'b: 'a> Trait<'a, 'b>`,
    // but rustc can’t understand that atm
    #[unsafe(assume)] T: for<'a, 'b> Trait<'a, 'b>
{
    ... 
}
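
For comparison, here is a minimal sketch of the unrestricted bound that rustc does accept today, `T: for<'a, 'b> Trait<'a, 'b>`. The trait method, the `First` type, and the return value of `foo` are assumptions made up for illustration. The restricted `'b: 'a` form is not shown because, per the comment above, rustc can't express it at the moment.

```rust
/// Hypothetical trait with two lifetime parameters.
trait Trait<'a, 'b> {
    fn pick(&self, first: &'a str, second: &'b str) -> &'a str;
}

/// Hypothetical implementor that works for every pair of lifetimes.
struct First;

impl<'a, 'b> Trait<'a, 'b> for First {
    fn pick(&self, first: &'a str, _second: &'b str) -> &'a str {
        first
    }
}

/// The higher-ranked bound lets `foo` call `pick` with lifetimes that only
/// exist inside its own body and that the caller can never name.
fn foo<T>(param: T) -> usize
where
    T: for<'a, 'b> Trait<'a, 'b>,
{
    let first = String::from("hello");
    let second = String::from("hi");
    param.pick(&first, &second).len()
}

fn main() {
    assert_eq!(foo(First), 5);
}
```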

```rs
pub fn print<T>(val: T)
where
    #[unsafe(assume)] T: Debug
```
Contributor

Instead of making the attribute unsafe, would it not make more sense to require the function it is applied to to be unsafe?

Author

Well, not really. The user already assures safety by writing unsafe in the attribute. If we also allow this for bounds in e.g. struct definitions, then there is no other way of making that unsafe. Also, not every function using this has to be unsafe: e.g. a type_id_of_static function that returns the TypeId a type would have if it were 'static would be completely safe.

Member

@kennytm kennytm Apr 25, 2025

@DasLixou

> Well not really.. The user assures the safety by already writing unsafe in the attribute.

According to the RFC, only the declarator of print knows that T: Debug is an unsafe assumption. The actual user, the caller, does not know this requirement:

  • This makes T still behave like having Debug in the function body, but that isn't the case for the caller. There, it skips the check and just assumes the condition is true.

That means a caller can write this without any unsafe {} to indicate that a trusted assumption exists:

struct OnlyDebugIfStatic<'a>(&'a u8);
impl Debug for OnlyDebugIfStatic<'static> { ... }

...

let non_static = 0u8;
your_buggy_crate::print(OnlyDebugIfStatic(&non_static)); // UB without any `unsafe` in sight.

> When we also allow that for bounds in e.g. struct definitions, then there is no way of making that unsafe otherwise.

That means this feature is flawed
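
For contrast, here is a sketch of the same shape with an ordinary, checked `T: Debug` bound, where the compiler rejects the non-`'static` call. `print_checked` and `ZERO` are hypothetical names, and the `fmt` body is an assumption, since the impl body in the comment above is elided.

```rust
use std::fmt::{self, Debug};

struct OnlyDebugIfStatic<'a>(&'a u8);

// `Debug` is implemented only for the `'static` instantiation.
impl Debug for OnlyDebugIfStatic<'static> {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        // Assumed body, chosen only so the sketch has observable output.
        write!(f, "static: {}", self.0)
    }
}

/// Hypothetical checked counterpart of `print`: the bound is verified
/// at every call site.
fn print_checked<T: Debug>(val: T) -> String {
    format!("{val:?}")
}

static ZERO: u8 = 0;

fn main() {
    // Accepted: `&ZERO` is a `&'static u8`.
    assert_eq!(print_checked(OnlyDebugIfStatic(&ZERO)), "static: 0");

    // Rejected at compile time if uncommented, because the borrow of the
    // local does not live for `'static`:
    // let non_static = 0u8;
    // print_checked(OnlyDebugIfStatic(&non_static));
}
```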

# Reference-level explanation
[reference-level-explanation]: #reference-level-explanation

`assume`d bounds are just skipped during bounds check and we trust the user.
Contributor

Does this work?

unsafe fn foo<T>(param: T) -> impl Debug 
where
    #[unsafe(assume)] T: Debug,
{
    param
}

Author

Yes, it should. Is this something I should explicitly provide as an example?
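
As a baseline for that question, this is the checked version that compiles on stable today. It is only a sketch of the ordinary case: the `T: Debug` bound is what proves that the returned `param` satisfies `impl Debug`, which is the step an assumed bound would have to stand in for.

```rust
use std::fmt::Debug;

/// Checked counterpart of the snippet above: `T: Debug` is verified at each
/// call site and is what justifies the `impl Debug` return type.
fn foo<T>(param: T) -> impl Debug
where
    T: Debug,
{
    param
}

fn main() {
    assert_eq!(format!("{:?}", foo(5)), "5");
    assert_eq!(format!("{:?}", foo("hi")), "\"hi\"");
}
```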

@ahicks92

I think that whether or not the attribute should be unsafe is less important than first defining what unsafe means here.

In practice if the point is complex bounds--especially if the point is complex higher-ranked bounds for lifetimes--there's no way you'd be able to spot whether it's safe by reading the code. "you, the human, are now the compiler" isn't a good idea (source: I know C++ and was the human compiler checking lifetimes there...).

For things which aren't higher-ranked you can usually get named centralized bounds:

trait NamedBound where Self: Bound {}
impl<T: ?Sized> NamedBound for T where T: Bound {}

// Use it
fn example<T: NamedBound>(val: T) {
    // code.
}

Which has the added benefit of not requiring you to repeat yourself and also plays nice with feature flags. It does work with lifetimes too. It may work to some limited extent with HRTBs but I've never been crazy enough to try. It can even go in the public API of your crate. For the non-HRTB cases this works and prevents marching up and down the call graph if you need to change a bound.
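
A runnable sketch of the named-bound trick, assuming `Debug + Clone` as the stand-in for the placeholder `Bound`. The body of `example` is hypothetical and exists only to show that the supertraits are usable through the alias.

```rust
use std::fmt::Debug;

/// Named alias for the combined bound. Supertraits are implied wherever
/// `T: NamedBound` is written.
trait NamedBound: Debug + Clone {}

/// Blanket impl: every type that meets the underlying bound gets the alias.
impl<T: Debug + Clone> NamedBound for T {}

/// Uses only the alias, yet can call both `clone` and `Debug` formatting.
fn example<T: NamedBound>(val: &T) -> String {
    let copy = val.clone();
    format!("{copy:?}")
}

fn main() {
    assert_eq!(example(&vec![1, 2]), "[1, 2]");
    assert_eq!(example(&"text"), "\"text\"");
}
```

Changing the underlying bound then means editing the trait and its blanket impl once, instead of every signature that uses it.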

@DasLixou
Author

> This is a feature with a very major impact on the type system of the Rust language, and such features are not added lightly. The RFC is very short, containing only a short motivation with very few details. This makes it hard to extract what exactly the problem is that you're having, and what other solutions there could be. As for these other solutions, the RFC makes no attempt to consider alternative solutions to this problem at all. Especially for a large change to the type system, there is a very high chance that many other approaches to this problem exist, and they should all be laid out explicitly and evaluated against each other to determine the best solution. It seems very unlikely to me that the approach proposed here would be the winner.
>
> One major feature that relates to this is "implied bounds", RFC Tracking Issue. It's not guaranteed that this will ever land either, but if you do want to solve your problem, that direction seems a lot more promising, so it's probably better to invest your time there instead of pursuing this RFC, which is likely a dead-end (I am not on a relevant team to make the final call about this, but I can't imagine a world in which this is accepted as-is today).
>
> This is an area with many hidden complexities, so working on it will not yield immediate returns as things like this take time, but if you want to work on this, I really recommend looking into alternative approaches like implied bounds, or entirely different directions you may come up with.

Implied bounds do look interesting, and I might be able to bend them to my usecase.

As for the real-world example: I want the user to have many nested functions with a signature looking something like fn _<T>() -> impl FnOnce(T), and then there might be an item impl Key which can be created from that T and which assures that T satisfies a special trait bound. In short, for e.g. 20 nested functions it means either expanding each signature to fn _<K: Key, T>(k: K) -> impl FnOnce(T) where T: Has<K::Provided>, or just passing down an impl Key.
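
A rough two-layer sketch of the expanded-signature variant, to show how the `where` clause has to be repeated at every level. Everything here is an assumption built from the description above: the shapes of `Key` and `Has`, the `get` method, `NameKey`, `User`, and the function names. Unlike the signature in the comment, the closures return `K::Provided`, purely so the result can be checked.

```rust
/// Hypothetical shape of the `Key` trait from the description.
trait Key {
    type Provided;
}

/// Hypothetical shape of `Has`: a value that can hand out a `P`.
trait Has<P> {
    fn get(&self) -> P;
}

struct NameKey;
impl Key for NameKey {
    type Provided = String;
}

struct User {
    name: String,
}
impl Has<String> for User {
    fn get(&self) -> String {
        self.name.clone()
    }
}

/// Innermost layer: the only place that really needs the bound.
fn layer_two<K: Key, T>(_key: K) -> impl FnOnce(T) -> K::Provided
where
    T: Has<K::Provided>,
{
    |value: T| value.get()
}

/// Outer layer: must restate the same bound just to forward the call.
fn layer_one<K: Key, T>(key: K) -> impl FnOnce(T) -> K::Provided
where
    T: Has<K::Provided>,
{
    layer_two::<K, T>(key)
}

fn main() {
    let extract = layer_one::<NameKey, User>(NameKey);
    let name = extract(User { name: "ferris".to_string() });
    assert_eq!(name, "ferris");
}
```

With 20 layers instead of two, each intermediate function would carry the same `T: Has<K::Provided>` clause, which is the repetition the comment describes.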

@ahicks92

You might want that but what we're trying to say is that you need to consider all of the implications and write them down. So:

  • Can functions using this be pub? What happens if you use one of these unsafe bounds in a public API? On crates.io?
  • How often do these bounds come up in practice? One person with one usecase isn't worth major compiler work.
  • Why isn't something like the above named bound trick worth it? What's the shortcoming? If there is one, why is the shortcoming enough of a problem that it's worth fixing?
  • If the attribute is unsafe, how does someone who isn't you maintain the codebase if they are working 5 levels up? How does this prevent violating safety without an unsafe block?
  • What about other alternatives? Macros in where clauses and following conventions for naming type params for example?
    • Or a proc macro crate that can add the bounds via an attribute in some generic way perhaps? E.g. #[where_clause(myalias)]
    • Or dedicated syntax to name a bound, which I believe has been proposed before? E.g. bound MyBound = ...
  • As mentioned by others how does this interact with other nightly features?
  • If it is unsafe, what kind of UB can actually happen? How does it interact with the ongoing efforts to precisely define Rust's safety model?
  • Why is it worth basically adopting one of C++'s core shortcomings, which every language tries to avoid, Rust or otherwise?
  • What happens if a function assuming a bound decides to call another function that assumes a different bound?
    • And what happens when those bounds can never be satisfied at the same time?
  • Who is going to do this work if accepted?

I could probably go on for a while. If you think the solutions and points are all obvious, they need to go in your RFC. If you can't find answers you might want to consider retracting. RFCs aren't usually a forum thread that doesn't have a draft where you jump from "no there is no good answer right now" to "here is an RFC".

I think it's also worth pointing out that HRTBs are barely explained right now. There's one chapter in the nomicon about why they're needed for closures but as far as I know that's it. You're also proposing this for a feature which is so underused that it's not even fully documented, at least as far as I know (aka about 8 months ago; I spent a very very long time looking).

@DasLixou
Author

DasLixou commented Apr 26, 2025

I really love treating the type system and generics like a whole other architecture, and I also want my small, fancy transmute equivalent there, which this feature resembles to a certain point. But implied bounds solve my problem quite elegantly, and the process and implementation of an RFC is hard, not only for me. I also don't think I can really write down how I feel about this feature, especially because my problem is kind of solved, so I'll close this RFC for now.
And a big thank you to everyone discussing here and thank you for pointing to implied bounds.

@DasLixou DasLixou closed this Apr 26, 2025