-
Notifications
You must be signed in to change notification settings - Fork 1.6k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Virtually dispatched trait methods #3440
base: master
Are you sure you want to change the base?
Conversation
This is probably not what you want to here, but the pattern you are trying to express is fundamentally one of inheritance, not composition. You are treating Instead, dependencies should be specified explicitly, ie. it should be clear that the implementation of However, this is another red flag when it comes to the design: generally you don't want cycles in your dependencies. It implies that your supposedly separate components are in fact tightly coupled. It could also lead to bugs: in your example, accidental misuse of the It's difficult to say what the correct design should be when the example is so abstract, but one option is to provide a default implementation of pub trait System {
fn op1(&self) -> SystemOp1 {
self.op2() * 5
}
fn op2(&self) -> SystemOp2;
// etc.
} This can be adapted further if you want multiple different "op1 behaviours" depending on the implementation by making the result of pub trait System {
fn op1(&self) -> SystemOp1 {
self.op1(self.op2())
}
fn op2(&self) -> SystemOp2;
fn op1_impl(&self, op2: SystemOp2);
// etc.
} There are plenty more solutions like this that don't require sweeping language changes, are much more concise, whilst being more explicit about dependencies. |
Maybe you could find this helpful: Rfc:what the process is |
@VitWW thanks for the comment. I thought the numbering was unrelated to the PR ids, my bad. I have now updated the PR. Or is your comment referencing something else? I wasn't sure. |
@Diggsey thanks for the comment. Firstly, let me say that I wasn't claiming there aren't workarounds. Rust is pretty powerful as it is. But I cannot agree with you that 1) what i'm proposing is "sweeping language" change - it's an opt-in feature that requires a relatively simple grammar change that can express a pattern in what in my eyes is much more consise and flexible of an approach. In particular, your suggestions of modifying I take your point about unintended recursion being a potential issue. However, at least in the real-life case I have in mind there's a contract that must be respected that guarantees recursion never happens. Sadly, I cannot share specifics here, but I'll try to think of a good alternative example where what I'm talking about makes concrete sense. By the way, I am not sure if I necessarily agree that virtual methods should necessarily be equated to inheritance ( |
I've wanted inheritance once or twice but always found better ways, at least once by using Deref polymorphism. I think Deref polymorphism is much less of an anti-pattern than full inheritance btw. There exist various places where folks use manual vtables in rust, either safe or unsafe, so maybe glance at https://internals.rust-lang.org/t/associated-const-fns-can-clean-up-abstraction-layers/18562 |
Here is a slightly more concrete example. Imagine we have different models of attractiveness of a being. pub enum Being {
Pet,
Person,
}
pub trait AttrModel {
// we define all the differenet qualities we may use to model the attractiveness
// e.g. these are all the things we have concrete data for or models based on concrete data
fn intelligence(&self, being: &Being) -> f64;
fn looks(&self, being: &Being) -> f64;
fn compatibility(&self, being: &Being) -> f64;
fn attractive(&self, being: &Being) -> f64;
} Now, say we have different ideas of what attractiveness could be, e.g. pub struct Population1;
impl AttrModel for Population1 {
fn looks(&self, being: &Being) -> f64 {
// subjective, but according to this model with e.g. data fetched from some db
fetch_looks_pop1(being)
}
fn intelligence(&self, being: &Being) -> f64 {
// we could measure e.g. by some standard test
test1(being)
}
fn compat(&self, being: &Being) -> f64 {
// some measure of compatibility
std::cmp::max(self.intelligence(), self.looks())
}
fn attractive(&self, being: &Being) -> f64 {
// in this model we really only care about intelligence
self.intelligence()
}
}
pub struct Population2;
impl AttrModel for Population2 {
fn looks(&self, being: &Being) -> f64 {
// subjective, but according to this model perceived looks is correlated to intelligence as well
fetch_looks_pop2(being) + 0.2 * self.intelligence()
}
fn intelligence(&self, being: &Being) -> f64 {
// we could measure e.g. by some other standard test
test2(being)
}
fn compat(&self, being: &Being) -> f64 {
// some measure of compatibility, e.g. the average
0.5 * self.intelligence() + 0.5 * self.looks()
}
fn attractive(&self, being: &Being) -> f64 {
// in this model we really only care about a max of all three
std::cmp::max(self.intelligence(), std::cmp::max(self.looks(), self.compat()));
}
} Now, we can have many similar and different models. And we may want to have wrappers that implement standard functionality such as pub struct ExtraIntelligence<A: AttrModel>(A);
impl AttrModel {
fn looks(&self, being: &Being) -> f64 { self.0.looks(being) }
fn intelligence(&self, being: &Being) -> f64 { self.0.intelligence(being) + 1.0 }
fn compat(&self, being: &Being) -> f64 { self.0.compat(being) }
fn attractive(&self, being: &Being) -> f64 { self.0.attractive(being) }
} but that will obviously fail because the statically dispatched The whole thing becomes even more complicated if we start returning different things based on the type of being (which is where the potential of what I'm talking about is really obvious). We could consider pets to be compatible with us if we like how they look (so dispatch to I hope this example is a bit elucidating and I hope it's not offensive to anyone - I'm just trying to work with everyday concepts that everyone can kind of understand. |
Rust is not just a bag of features though, it's a cohesive whole. Sticking another feature on the side that doesn't fit with existing conventions is by definition a sweeping change, because it reduces the consistency of the language considerably.
Traits are there to be used, if a problem is complex enough then it warrants being broken down more, and that might mean more functions. It doesn't have to be public surface area of your API necessarily.
That's kind of the point though. We're only discussing this because you intend to design a modular system (otherwise you could just implement the logic directly) but without explicitly specifying what depends on what, you cannot make a modular system. If you have one "system adapter" that reverses the dependency order, then you will get infinite recursion if you try to combine that adapter with one that uses the original dependency order, because it's intrinsically broken to have mutually dependent attributes. There are a couple of obvious ways to avoid this mutual dependency situation:
In both cases a concise implementation naturally falls out that doesn't require new language features. |
Appreciate your input. I agree that rust is not just a bag of features.
However this doesn’t mean new ones don’t make any sense. Anyone new to rust
expecting it to be cohesive whole quickly learns about lack of
specialization for instance (to take one of the biggest missing things)
Conventions are by definition bound by the existing tools. Prior to const
generics, it might have been conventional to model const integers one way.
Now, the prior convention is unnecessary because we have better tools.
In that sense, I think we may just have to agree to disagree if what I’m
proposing fits consistently with the rest of the language.
Regarding your point about specifying dependencies, I thought self captures
exactly that - a dependence on the object and in the particular case of
traits - the object implementing a trait. What are the dependencies that
allow you to drain a hashmap? From a consumer pov, it’s just a mutable
reference to self. From a library designer’s point of view it might be a
whole lot of inner fields and data structures with their invariants.
Requiring inner implementation details to leak out through the common API
(the trait) is a big red flag for me. If I’m designing a particular
struct’s api, then fine. But for a trait where I devise a common contract
that makes sense within a given context, why should I not be able to
express when things should be statically OR dynamically dispatched?
Regarding dependencies and ending up in infinite loops - I think that’s
something that is up to the developer to prevent. In other words, it’s a
business logic/algorithmic issue. If I make a bug in an implementation of
an algorithm in safe rust and end up in an infinite loop whose fault was
that. I wouldn’t claim it’s rust’s for allowing me to loop {}.
Same with what I’m suggesting here. It’s a simple change that enables great
expressive power. It simplifies APIs, And in my opinion fits well with
rust‘s ability to express certain natural patterns that people will find
themselves wanting to do.
And of course, workarounds will exist. But to me that rings like saying we
don’t need #[derive()] because you can roll your own impl just the same and
it will work. Yeah, it will. But at what cost
El miércoles, 31 de mayo de 2023, Diggory ***@***.***>
escribió:
… But I cannot agree with you that 1) what i'm proposing is "sweeping
language" change - it's an opt-in feature that requires a relatively simple
grammar change that can express a pattern in what in my eyes is much more
consise and flexible of an approach.
Rust is not just a bag of features though, it's a cohesive whole. Sticking
another feature on the side that doesn't fit with existing conventions is
by definition a sweeping change, because it reduces the consistency of the
language considerably.
In particular, your suggestions of modifying System not only increases the
trait surface
Traits are there to be used, if a problem is complex enough then it
warrants being broken down more, and that might mean more functions. It
doesn't have to be public surface area of your API necessarily.
but also presupposes the dependency as being of one op1 on op2. But that
may not be the case (e.g. it could be the other way or both could share a
common dependency on a third op3, etc.)
That's kind of the point though. We're only discussing this because you
intend to design a modular system (otherwise you could just implement the
logic directly) but without explicitly specifying what depends on what, you
cannot make a modular system.
If you have one "system adapter" that reverses the dependency order, then
you *will* get infinite recursion if you try to combine that adapter with
one that uses the original dependency order, because it's intrinsically
broken to have mutually dependent attributes.
There are a couple of obvious ways to avoid this mutual dependency
situation:
1. Be explicit about the direction of the dependency.
2. Split the values into a set of "base values" and "dependent
values". This way any "dependent value" can depend on any "base value"
without causing infinite recursion. Things like ExtraIntelligence can
work by modifying the base values, so you can observe changes to the
"dependent values".
In both cases a concise implementation naturally falls out that doesn't
require new language features.
—
Reply to this email directly, view it on GitHub
<#3440 (comment)>, or
unsubscribe
<https://github.com/notifications/unsubscribe-auth/A6NXUAZ3KP62KNE7KMKITJLXIZTFXANCNFSM6AAAAAAYUGKW6E>
.
You are receiving this because you authored the thread.Message ID:
***@***.***>
|
@blagowtf I meant
|
There is certainly a subjective element to it, but there are also some objective ones:
I could list some more but I think that's enough for now...
I'm not sure exactly what you mean. A "self capture" (by which I'm assuming you mean "self parameter") is just fancy syntax for passing the first argument to a function. It's not even required - you can always call functions using UFCS, ie.
A HashMap depends on it's Key and Value types. Importantly the behaviour of Keys and Values in a HashMap are the same as if you operated on those Keys and Values directly. That is to say - a Key/Value doesn't "know" it's in a HashMap. Similarly, if you put that HashMap inside another type, say This means that when you create a complex type, like
The problem is that in your scenario they are not implementation details though. The order in which the attributes depend on each other matters to the user of the type. They cannot be sure to use it correctly without understanding that order:
ie. How can the developer prevent this if you've hidden that fact as an "implementation detail"? By having "base attributes" and "derived attributes" you actually solve this problem: First, only other combinators (like "ExtraIntelligence") or implementations of the trait (like "Population1") need to care that "base attributes" exist at all. To the downstream developer using these types, they don't need to know about this concept at all. Secondly, it means the combinators (like "ExtraIntelligence") are actually fully interchangeable. You can combine an "ExtraIntelligence" with an "ExtraBlah" without it causing an infinite loop. And this is why it's OK for the downstream developer to be unaware of these methods. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This seems like it adds a lot of syntactic and semantic edge-cases that risk altering much of the rest of the language for the sake of a convenience.
# Unresolved questions | ||
[unresolved-questions]: #unresolved-questions | ||
|
||
The case of `mut self`. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This doesn't state what the unresolved question actually is.
Implementing the above for immutable references is easy. However, mut references are inherently unsafe as we will be aliasing self initially. | ||
|
||
Solution is to either outright ban the above for `&mut self` or only allow it in unsafe contexts (e.g. treat it as `mut ptr`). |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It is unacceptable for &mut self
to have differing semantic guarantees in this context, because this makes analysis of Rust code much harder. This must make clear this would only be allowed with *mut T
. Not even unsafe
code is allowed to break the aliasing rules concerning &mut T
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I understand that. I guess my point was that just like with refs to statics, the mutable case is more complicated and thus may necessitate unsafe {}. But point well taken!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
static
is unique and special and static mut
should be regarded as a mistake, or at least, not something one should build any precedents off of. There are many subtleties regarding statics, like how referencing them is semantically equivalent to borrowing them, last I checked, which are non-obvious and don't really pattern elsewhere in the language. This is to be expected: the rest of the language doesn't really "do" global state, nevermind global mutable state.
|
||
If the called method is also declared virt, using `self->op2()` will retain the `vself` the same as originally passed to `op1`. Otherwise, it will replace it by `self as &dyn System`. | ||
|
||
Outside of traits, `a->b()` wouldn't compile. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Syntax that is sensitive to the trait context but is not part of the trait's declaration syntax, but rather is part of the expression syntax, is extremely unusual. This would mean that macros that work inside a trait's functions do not work inside other functions. This would be very bad for the ecosystem.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
But macros that use this syntax would be meant for the trait case, otherwise normal syntax will still work as it does today. So what is really lost?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Currently, in syntax generation through macros, an expression is an expression and you don't have to think about it beyond that. This would create a "subclass" of $trait_expr
that could not be used where all $expr
could be used. This is a severe problem for macros concealing their implementation details:
Now macros that pass in an expression using this syntax to the inside of a trait and use that to do some work break if they instead pass the expression through to inside free functions or closures to do that work. In order to create slightly more opacity for traits, it blows a giant hole through opacity for macros.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
( maybe that's a bit strong, but "expressions are expressions" is kiiinda important in a general sense. )
# Drawbacks | ||
[drawbacks]: #drawbacks | ||
|
||
I fail to see any major drawbacks besides the override of `->`. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
- Addition of context-specific syntax is a significant drawback.
- Incompatibility with
&mut self
is a significant drawback and a sign this proposal is insufficiently general. - There are many other possible uses we might prefer to put
->
to that would conflict.
Please think about and try to enumerate more of the drawbacks beyond what I can think of off the top of my head.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for taking the time to reply. I find your critique insightful.
The reason why I don't mind the context sensitivity (or see it as an issue) is because of e.g. await in async context. But I see why someone may not like more of that. But mind you, from outside the trait what I'm suggesting simply looks like any old trait call.
Regarding mut self, I just meant it's not something that can obviously live in safe Rust and that's ok. To me it's just like yet another form of ref to self (and as such carries its own pros/cons just like &self, &mut self, self, Box, etc.)
Specific syntax is not that important. I am sure there are other proposals about ->.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
const context is another example
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
const
for the most part doesn't change what raw syntax is permissible. A function in const is still a function and is still called as a function. You are right about async, but it is distinct in that async is a distinct addition to the language, rather than pervading all traits. By being "boxed up" into well-defined spaces, rather than "all traits", they are better-contained.
Also, the "keyword generics" initiative has been launched to try to find a solution to the problem I mention.
|
||
Of course, this behaviour may be what is desired. But sometimes, virtualizing at depth is more appropriate. Certainly when adopting OOP patterns. | ||
|
||
When I first stumbled upon this, I wondered if I could simply take `self` as `&dyn System` (or `Arc` or `Box` equivalent, etc.). Nope. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Isn't this allowed by arbitrary self types?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It would definitely help for quite a few use cases, yes!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hm. To what degree does #![feature(arbitrary_self_types)]
obsolete or outmode this RFC? If what you want (essentially, "more control over vtables") can be built on that more ergonomically, then I think it would be most appropriate to redraft this RFC with that feature in mind.
|
||
- What other designs have been considered and what is the rationale for not choosing them? | ||
|
||
Explicit argument passing considered above. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This does not give the rationale.
|
||
I believe this feature will make Rust more user-friendly to people more inclined to think in OOP terms or who (like me) simply found themselves writing code in a domain that is very amenable to an OOP approach. | ||
|
||
The syntax changes are relatively minimal and there is no extra cost for code that does not use this feature. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It will become possible for this recursively virtualized pattern to be invoked via reference to a static
, even inside a fully concrete function. My intuition suggests this could make optimization significantly harder in some cases.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Care to elaborate here? Rust already knows when you're calling into a trait (forcing you to import it in scope) so any such call should simply use the ref to static as the vself? Or did i miss your point?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
My statement is that by just adding this feature, now functions that internally reference global state can have dependencies that are already hard to resolve at compile time become almost impossible to resolve at compile time, so they may not be statically resolved and thus devirtualized as easily as they could before. This could impact even programs that do not use this feature. It could be that this doesn't actually make anything worse for optimizers, but I suspect it does, so it's probably the case that it is not true that it imposes no costs anywhere else. Many languages, by adding some additional undecidability somewhere, defeat optimization much harder than they may have before.
The proposal to add |
Just quickly about the detection of recursion. I am empathically not suggesting the blocking of it because 1) my proposal would make it at best conditional recursion 2) existing semantics will still apply for all traits as written currently - virt is in the signature after all |
A |
Here is an attempt to use existing language features to implement the slightly more concrete example. pub trait Intelligence {
fn intelligence(&self) -> i64;
}
pub enum Being {
Pet,
Person,
}
impl Intelligence for Being {
fn intelligence(&self) -> i64 {
match self {
Being::Pet => 1,
Being::Person => 10,
}
}
}
pub struct IntelligentBeing<A>(A);
impl<A: Intelligence> Intelligence for IntelligentBeing<A> {
fn intelligence(&self) -> i64 {
self.0.intelligence() + 1
}
} |
In OOP theory the unconditional recursion hazard — as well as less blatant misbehaviors such as accidentally bypassing or duplicating some checks or side effects — is known specifically as the fragile base class problem. (I am not expressing an opinion on this RFC; only hoping the discussion will benefit from additional standard terminology.) |
Also, aside from the RFC's contents, for anything along these lines we would probably want to use one of the existing reserved keywords. |
I think several short side-by-side examples of where virtual inheritance is the only clean or viable solution to common problems would go a long way. (If these examples exist.) I think some discussion about what kinds of inheritance are allowed and disallowed is important. I expect some edge cases where the order of initialization, and the behavior of drop, is unsafe or undefined. I think some discussion of the existing friction in languages that have two or more camps -- one more functional, another more OO -- is important to get anywhere meaningful. Javascript, Scala, C++, Erlang and Lisp (w/ CLOS) stand out as important in that space. |
Rendered