- Feature Name: provide_any
- Start Date: 2021-11-04
- RFC PR: rust-lang/rfcs#3192
- Rust Issue: rust-lang/rust#96024
The Provider interface that is core to this proposal has been rejected for now by the libs team meeting. Without that element, what remains here is essentially just the Demand type (being renamed in rust-lang/rust#113464 to Request). Without Provider, Demand/Request is only usable by types defined within the standard library itself, as is the case in the error_generic_member_access feature proposal. Since error_generic_member_access is the only known (at the time of writing) feature using Demand/Request, the decision was made to track it in the error_generic_member_access feature and mark this as rejected for now.
This RFC proposes extending the any module of the core library with a generic API for objects to provide type-based access to data. (In contrast to the existing APIs, which provide type-driven downcasting, the proposed extension integrates downcasting into data access to provide a safer and more ergonomic API.)
By using the proposed API, a trait object can offer functionality like:
let s: String = object.request();
let s = object.request_field::<str>(); // s: &str
let x = object.request_field::<SpecificData>(); // x: &SpecificData
Here, the request and request_field methods are implemented by the author of object using the proposed API as a framework.
- A major motivation for this RFC is 'generic member access' for the Error trait. That was previously proposed in RFC 2895. This RFC will use Error as a driving example, but explicitly does not propose changes to Error.
- A proof-of-concept implementation of this proposal (and the motivating extension to Error) is at provide-any.
- This work is adapted from dyno.
- Previous iterations of this work included exposing the concept of type tags. These are still used in the implementation but are no longer exposed in the API.
Trait objects (Pointer<dyn Trait>) provide strong abstraction over concrete types, often reducing a wide variety of types to just a few methods. This allows writing code which can operate over many types, using only a restricted interface. However, in practice some kind of partial abstraction is required, where objects are treated abstractly but can be queried for data only present in a subset of all types which implement the trait interface. In this case there are only bad options: speculatively downcasting to concrete types (inefficient, boilerplatey, and fragile due to breaking abstraction) or adding numerous methods to the trait which might be functionally implemented, typically returning an Option where None means not applicable for the concrete type (boilerplatey, confusing, and leads to poor abstractions).
As a concrete example of the above scenario, consider the Error trait. It is often used as a trait object so that all errors can be handled generically. However, classes of errors often have additional context that can be useful when handling or reporting errors. Examples include an error backtrace, information about the runtime state of the program or environment, the location of the error in the source code, or help text suggestions. Adding backtrace methods to Error has already been implemented (currently unstable), but adding methods for all context information is impossible.
Using the API proposed in this RFC, a solution might look something like:
use std::backtrace::Backtrace;
use std::error::Error;
// Some concrete error type.
struct MyError {
    backtrace: Backtrace,
    suggestion: String,
}
impl Error for MyError {
    fn provide_context<'a>(&'a self, req: &mut Demand<'a>) {
        req.provide_ref::<Backtrace>(&self.backtrace)
            .provide_ref::<str>(&self.suggestion);
    }
}
// Perhaps in a different crate or module, a function for handling all errors, not just MyError.
fn report_error(e: &dyn Error) {
    // Generic error handling.
    // ...
    // Backtrace.
    if let Some(bt) = e.get_context_ref::<Backtrace>() {
        emit_backtrace(bt);
    }
    // Help text suggestion.
    // Note, we should really use a newtype here to prevent confusing different string
    // context information (see appendix).
    if let Some(suggestion) = e.get_context_ref::<str>() {
        emit_suggestion(suggestion);
    }
}
An alternative is to do some kind of name-driven access; for example, we could add a method fn get(name: String) -> Option<&dyn Any> to Error (or use something more strongly typed than String to name data). The disadvantage of this approach is that the caller must downcast the returned object, which leads to opportunities for bugs, since there is an implicit connection between the name and type of objects. If that connection changes, it will not be caught at compile time, only at runtime (and probably with a panicking unwrap, since programmers will be likely to unwrap the result of downcasting, believing it to be guaranteed by get). Furthermore, this approach is limited by constraints on Any: we cannot return objects by value, objects which include references (due to the 'static bound on Any), objects which are not object safe, or dynamically sized objects (e.g., we could not return a &str).
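The runtime-only failure mode of name-driven access can be sketched on stable Rust as follows. This is an illustration under assumptions, not part of the proposal: NamedContextError, its get method (taking &str rather than String), and the two helper functions are hypothetical names.

```rust
use std::any::Any;
use std::collections::HashMap;

/// Hypothetical error type exposing name-driven context access.
pub struct NamedContextError {
    context: HashMap<String, Box<dyn Any>>,
}

impl NamedContextError {
    pub fn new() -> Self {
        let mut context: HashMap<String, Box<dyn Any>> = HashMap::new();
        // The provider decides that "suggestion" holds a `String`.
        context.insert("suggestion".to_string(), Box::new(String::from("try --force")));
        NamedContextError { context }
    }

    /// Name-driven lookup: the caller only gets an opaque `&dyn Any`.
    pub fn get(&self, name: &str) -> Option<&dyn Any> {
        self.context.get(name).map(|boxed| &**boxed)
    }
}

/// Name and type agree, so the downcast succeeds.
pub fn suggestion_as_string(e: &NamedContextError) -> Option<&String> {
    e.get("suggestion")?.downcast_ref::<String>()
}

/// Name and type disagree: this still compiles, but yields `None` at runtime.
pub fn suggestion_as_static_str(e: &NamedContextError) -> Option<&&'static str> {
    e.get("suggestion")?.downcast_ref::<&'static str>()
}
```

Nothing ties the name "suggestion" to the type String at compile time, so a caller who unwraps the second helper's result would panic only when the code runs.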
Beyond Error, one could imagine using the proposed API in any situation where we might add arbitrary data to a generic object. Another concrete example might be the Context object passed to Future::poll. See also a full example in the appendix.
In this section I'll describe how the proposed API is used and defined. Typically there is an intermediate library which has a trait which extends Provider (we will use the Error trait in our examples) and has an API which is a facade over any's API. This proposal is relatively complex in implementation. However, most users will not even be aware of its existence. Using the any extensions in an intermediate library has some complexity, but using the intermediate library is straightforward for end users.
This proposal supports generic, type-driven access to data and a mechanism for intermediate implementers to provide such data. The key parts of the interface are the Provider trait for objects which can provide data, and the request_* functions for requesting data from an object which implements Provider. Note that end users should not call the request_* functions directly; they are helper functions for intermediate implementers to use.
For the rest of this section we will explain the earlier example, starting from data access and working our way down the implementation to the changes to any. Note that the fact that Error is in the standard library is mostly immaterial. any is a public module and can be used by any crate. We use Error as an example; we are not proposing changes to Error in this RFC.
A user of Error trait objects can call get_context_ref to access data by type which might be carried by an Error object. The function returns an Option and will return None if the requested type of data is not carried. For example, to access a backtrace (in a world where the Error::backtrace method has been removed): e.get_context_ref::<Backtrace>() (specifying the type using the turbofish syntax may not be necessary if the type can be inferred from the context, though we recommend it).
Let's examine the changes to Error required to make this work:
pub mod error {
    pub trait Error: Debug + Provider {
        ...
        fn provide_context<'a>(&'a self, _req: &mut Demand<'a>) {}
    }
    impl<T: Error> Provider for T {
        fn provide<'a>(&'a self, req: &mut Demand<'a>) {
            self.provide_context(req);
        }
    }
    impl dyn Error + '_ {
        ...
        pub fn get_context_ref<T: ?Sized + 'static>(&self) -> Option<&T> {
            crate::any::request_ref(self)
        }
    }
}
get_context_ref is added as a method on Error trait objects (Error is also likely to support similar methods for values and possibly other types, but I will elide those details); it simply calls any::request_ref (we'll discuss this function below). But where does the context data come from? If a concrete error type supports backtraces, then it must override the provide_context method when implementing Error (by default, the method does nothing, i.e., no data is provided, so get_context_ref will always return None). provide_context is used in the blanket implementation of Provider for Error types; in other words, Provider::provide is delegated to Error::provide_context.
Note that by adding provide_context with a default empty implementation and the blanket impl of Provider, these changes to Error are backwards compatible. However, this pattern is only possible because Provider and Error will be defined in the same crate. Third-party users will implement Provider::provide directly, in the usual way.
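The delegation pattern described above can be sketched on stable Rust with simplified stand-ins. Everything named here is an assumption made for illustration: ErrorLike plays the role of Error, LegacyError and LineError are hypothetical implementers, and this cut-down Demand supports only sized types via TypeId and dyn Any rather than the type tags used in provide-any. For simplicity, ErrorLike also omits the Provider supertrait bound.

```rust
use std::any::{Any, TypeId};
use std::fmt::Debug;

/// Simplified stand-in for `Demand`: one requested type, one reference slot.
pub struct Demand<'a> {
    wanted: TypeId,
    found: Option<&'a dyn Any>,
}

impl<'a> Demand<'a> {
    pub fn provide_ref<T: Any>(&mut self, value: &'a T) -> &mut Self {
        if self.found.is_none() && self.wanted == TypeId::of::<T>() {
            let erased: &'a dyn Any = value;
            self.found = Some(erased);
        }
        self
    }
}

pub trait Provider {
    fn provide<'a>(&'a self, demand: &mut Demand<'a>);
}

/// Stand-in for `Error`. The new method has an empty default body, so
/// implementers written before it existed keep compiling unchanged.
pub trait ErrorLike: Debug {
    fn provide_context<'a>(&'a self, _demand: &mut Demand<'a>) {}
}

/// Blanket impl: `Provider::provide` delegates to `provide_context`.
impl<T: ErrorLike> Provider for T {
    fn provide<'a>(&'a self, demand: &mut Demand<'a>) {
        self.provide_context(demand);
    }
}

pub fn request_ref<'a, T: Any>(provider: &'a dyn Provider) -> Option<&'a T> {
    let mut demand = Demand { wanted: TypeId::of::<T>(), found: None };
    provider.provide(&mut demand);
    demand.found?.downcast_ref::<T>()
}

/// An implementer that predates `provide_context` and never overrides it.
#[derive(Debug)]
pub struct LegacyError;
impl ErrorLike for LegacyError {}

/// An implementer that opts in to providing context.
#[derive(Debug)]
pub struct LineError {
    pub line: u32,
}
impl ErrorLike for LineError {
    fn provide_context<'a>(&'a self, demand: &mut Demand<'a>) {
        demand.provide_ref::<u32>(&self.line);
    }
}
```

In this sketch LegacyError needs no changes to remain a valid implementer, which is the backwards-compatibility property the default method is meant to give.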
In provide_context, an error type provides access to data via a Demand object, e.g., req.provide_ref::<Backtrace>(&self.backtrace). The type of the reference passed to provide_ref is important here (and we encourage users to use explicit types with turbofish syntax even though it is not necessary; this might even be possible to enforce using a lint). When a user calls get_context_ref, the requested type must match the type of an argument to provide_ref, e.g., the type of &self.backtrace is &Backtrace, so a call to get_context_ref::<Backtrace>() will return a reference to self.backtrace. An implementer can make multiple calls to provide_ref to provide multiple pieces of data with different types.
Note that Demand has methods for providing values as well as references, and for providing more complex types. These will be covered in the next section.
The important additions to any are:
pub trait Provider {
    fn provide<'a>(&'a self, req: &mut Demand<'a>);
}
pub fn request_value<'a, T: 'static>(provider: &'a dyn Provider) -> Option<T> { ... }
pub fn request_ref<'a, T: ?Sized + 'static>(provider: &'a dyn Provider) -> Option<&'a T> { ... }
pub struct Demand<'a>(...);
impl<'a> Demand<'a> {
    pub fn provide_value<T: 'static, F: FnOnce() -> T>(&mut self, fulfil: F) -> &mut Self { ... }
    pub fn provide_ref<T: ?Sized + 'static>(&mut self, value: &'a T) -> &mut Self { ... }
}
For details of the implementation, see the provide-any repo which has a complete (though unpolished) implementation of this proposal, and an implementation of the extensions to Error used in the examples above.
impl<'a> Demand<'a> {
    /// Provide a value or other type with only static lifetimes.
    pub fn provide_value<T, F>(&mut self, fulfil: F) -> &mut Self
    where
        T: 'static,
        F: FnOnce() -> T,
    { ... }
    /// Provide a reference. Note that the referent type must be bounded by `'static`, but may be unsized.
    pub fn provide_ref<T: ?Sized + 'static>(&mut self, value: &'a T) -> &mut Self { ... }
}
Demand is an object for provider types to provide data to be accessed. It is required because there must be somewhere for the data (or a reference to it) to exist.
provide_value and provide_ref are convenience methods for the common cases of providing a temporary value and providing a reference to a field of self, respectively. provide_value takes a function to avoid unnecessary work when querying for data of a different type; provide_ref does not use a function since creating a reference is typically cheap.
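The lazy behaviour of provide_value can be illustrated with a simplified stand-in that runs on stable Rust. This is a sketch under assumptions, not the provide-any implementation: it uses TypeId and dyn Any instead of type tags, so it handles only sized types and boxes provided values. Slot, Report, Summary, and the builds counter are hypothetical names introduced for the example.

```rust
use std::any::{Any, TypeId};
use std::cell::Cell;

// What the requester asked for: an owned value or a borrowed reference.
enum Slot<'a> {
    Value(Option<Box<dyn Any>>),
    Ref(Option<&'a dyn Any>),
}

/// Simplified stand-in for the proposed `Demand` (sized types only).
pub struct Demand<'a> {
    wanted: TypeId,
    slot: Slot<'a>,
}

impl<'a> Demand<'a> {
    /// Runs `fulfil` only if a value of exactly type `T` was requested.
    pub fn provide_value<T: Any, F: FnOnce() -> T>(&mut self, fulfil: F) -> &mut Self {
        if self.wanted == TypeId::of::<T>() {
            if let Slot::Value(slot) = &mut self.slot {
                if slot.is_none() {
                    let boxed: Box<dyn Any> = Box::new(fulfil());
                    *slot = Some(boxed);
                }
            }
        }
        self
    }

    /// Stores the reference only if a `&T` was requested.
    pub fn provide_ref<T: Any>(&mut self, value: &'a T) -> &mut Self {
        if self.wanted == TypeId::of::<T>() {
            if let Slot::Ref(slot) = &mut self.slot {
                if slot.is_none() {
                    let erased: &'a dyn Any = value;
                    *slot = Some(erased);
                }
            }
        }
        self
    }
}

pub trait Provider {
    fn provide<'a>(&'a self, demand: &mut Demand<'a>);
}

pub fn request_value<T: Any>(provider: &dyn Provider) -> Option<T> {
    let mut demand = Demand { wanted: TypeId::of::<T>(), slot: Slot::Value(None) };
    provider.provide(&mut demand);
    match demand.slot {
        Slot::Value(Some(boxed)) => boxed.downcast::<T>().ok().map(|b| *b),
        _ => None,
    }
}

pub fn request_ref<'a, T: Any>(provider: &'a dyn Provider) -> Option<&'a T> {
    let mut demand = Demand { wanted: TypeId::of::<T>(), slot: Slot::Ref(None) };
    provider.provide(&mut demand);
    match demand.slot {
        Slot::Ref(Some(found)) => found.downcast_ref::<T>(),
        _ => None,
    }
}

pub struct Summary(pub String);

pub struct Report {
    pub title: String,
    /// Counts how many times the summary closure actually ran.
    pub builds: Cell<u32>,
}

impl Provider for Report {
    fn provide<'a>(&'a self, demand: &mut Demand<'a>) {
        demand
            .provide_value::<Summary, _>(|| {
                self.builds.set(self.builds.get() + 1);
                Summary(format!("report: {}", self.title))
            })
            .provide_ref::<String>(&self.title);
    }
}
```

In this sketch, requesting the title by reference or requesting an unrelated value type leaves builds untouched; the closure runs only when a Summary value is requested.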
There is some overlap in use cases with Any. It is sub-optimal for the standard library to include two systems with such similar functionality. However, I believe the new functionality is a complement to any, rather than an alternative: any supports type-hiding where the concrete type is chosen by the library, whereas with Provider, the library user chooses the concrete type.
This proposal is fairly complex; however, this is mitigated by restricting the exposed complexity to intermediate implementers or to users with advanced use cases. Many Rust programmers won't even know the implementation exists, but will reap the benefits via more powerful error handling, etc.
There are a few ways to do without the proposed feature. In some situations it might be possible to add concrete accessor methods to traits. A trait could implement generic member access based on a value identifier (i.e., name-driven rather than type-driven); such a method would return a dyn Any trait object. Or a user could speculatively downcast a trait object to access its data.
Each of these approaches has significant downsides: adding methods to traits leads to a confusing API and is impractical in many situations (including for Error). Value-driven access only works with types which implement Any, which excludes types with lifetimes and unsized types; furthermore, it requires the caller to downcast the result, which is error-prone and fragile. Speculative downcasting violates the encapsulation of trait objects and is only possible if all implementers are known (again, not possible with Error).
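The fragility of speculative downcasting can be sketched as follows. The Failure trait, its as_any escape hatch, and the three failure types are hypothetical names chosen for illustration; the sketch assumes the caller was written knowing only about ParseFailure.

```rust
use std::any::Any;

pub trait Failure {
    fn message(&self) -> String;
    // Escape hatch that lets callers downcast through the trait object.
    fn as_any(&self) -> &dyn Any;
}

pub struct ParseFailure {
    pub line: u32,
}
pub struct NetworkFailure {
    pub retries: u32,
}
// Added later by another author; it also carries a line number.
pub struct LexFailure {
    pub line: u32,
}

impl Failure for ParseFailure {
    fn message(&self) -> String {
        format!("parse error on line {}", self.line)
    }
    fn as_any(&self) -> &dyn Any {
        self
    }
}

impl Failure for NetworkFailure {
    fn message(&self) -> String {
        format!("network error after {} retries", self.retries)
    }
    fn as_any(&self) -> &dyn Any {
        self
    }
}

impl Failure for LexFailure {
    fn message(&self) -> String {
        format!("lex error on line {}", self.line)
    }
    fn as_any(&self) -> &dyn Any {
        self
    }
}

/// Speculatively downcasts to the one concrete type the caller knows about.
/// Any other implementer, including `LexFailure`, silently yields `None`.
pub fn line_of(failure: &dyn Failure) -> Option<u32> {
    failure.as_any().downcast_ref::<ParseFailure>().map(|p| p.line)
}
```

Here line_of stays correct for ParseFailure but misses the line number carried by LexFailure, because the caller has to enumerate concrete types that it may not know about.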
The proposed API could live in its own crate, rather than in libcore. However, this would not be useful for Error (or other standard library types).
As in earlier drafts of this RFC, the proposed API could be in its own module (provide_any) rather than be part of any, either as a sibling or a child of any.
There are numerous ways to tweak the proposed API. The dyno and object-provider crates provide two such examples.
We could expose type tags (as used in the implementation of these APIs) to the user. However, whether to do so, and if so exactly how type tags should work (even the key abstractions) are open questions (see below for more discussion). Exposing type tags to the user makes for a much more flexible API (any type can be used if the user can write their own tags), but it requires the user to understand a somewhat subtle and complex abstraction. The API as currently presented is simpler.
Operations involving runtime types are intrinsically tied to the specifics of a language, its runtime, and type system. Therefore, there is not much prior art from other languages.
A closely related feature from other languages is reflection. However, reflective field access is usually name-driven rather than type-driven. Due to Rust's architecture, general reflection is 'impossible'.
In Rust, there are several approaches to the problem. This proposal is adapted from dyno; object provider is a similar crate from the same author.
Some Rust error libraries provide roughly similar functionality. For example, Anyhow (and Eyre) allow adding context to errors in Results, which can be accessed by downcasting the error object to the context object.
Can we improve on the proposed API? Although we have iterated on the design extensively, there might be room for improvement either by general simplification, or making the most general cases using type tags more ergonomic.
As with any relatively abstract library, naming the concepts here is difficult. Are there better names? In particular, 'demand' (formerly 'requisition') is very generic.
Extending the API of Error is a primary motivation for this RFC, but those extensions are only sketched in the examples and implementation. What exactly should Error's API look like?
It was suggested by @Plecra in the comments of RFC 2895 that this mechanism could be used for providing data from a future's Context object. That is a more demanding application since it is likely to require &mut references, objects with complex lifetimes, and possibly even closures to be returned. That motivated seeking a general API, rather than only supporting references and values.
The implementation of the proposed API uses type tags. These are similar to the Any trait in that they allow up- and down-casting of types; however, they are fundamentally different in that the abstract trait is not implemented by the concrete type, but rather there is a separate type hierarchy of tags which are a representation of the type. This allows lifetimes to be accurately reflected in the abstract types, which is crucial for the sound implementation of this API. Exactly how these tags should be represented, however, is unresolved. There is one implementation in the provide-any repo, which is an adaptation of dyno. @Plecra has proposed an alternative encoding. There are probably others. Perhaps adding language features will allow still more. I expect this part of the implementation can be tweaked over time. The proposed API is designed to keep type tags fully encapsulated so that they can evolve without backwards compatibility risk.
Type tags (discussed above) are part of the implementation of this proposal. In the future, they could be exposed to the user to make a more flexible, general API. Even with this more flexible API, we would still want to keep the existing API for convenience.
The proposal only supports handling value types (or other types with only 'static lifetimes) and reference types. We could also support mutable references fairly easily. Note, however, that this requires more methods than one might expect since providing mutable references requires a mutable reference to the provider, but for non-mutable references requiring a mutable provider is overly restrictive. Furthermore, providing multiple mutable references to different types requires extending the API of Demand in a non-trivial way (this is done in provide-any to demonstrate feasibility, thanks to @Plecra for the implementation). Therefore, I have not included handling of mutable references in this proposal, but we could easily add such support in the future.
One issue with using type-based provision is that an object might reasonably provide many different values with the same type. If the object intends to provide multiple values with the same type, then it must distinguish them. Even if it doesn't, the user might request a value expecting one thing and get another with the same type, leading to logic errors (e.g., requesting a string from MyError expecting an error code and getting a suggestion message).
To address this, users should use specific newtypes to wrap more generic types. E.g., the MyError author could provide a Suggestion newtype: pub struct Suggestion(pub String). A user would use get_context::<Suggestion>() (or otherwise specify the type, e.g., let s: Suggestion = e.get_context().unwrap();) and would receive a Suggestion which they would then unpack to retrieve the string value.
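The effect of newtypes on type-driven lookup can be sketched with a small type-keyed store. This is an illustration only: Context, its insert and get methods, and the ErrorCode newtype are hypothetical, and a HashMap keyed by TypeId stands in for a real provider.

```rust
use std::any::{Any, TypeId};
use std::collections::HashMap;

/// Newtypes give each piece of string data its own distinct type.
pub struct Suggestion(pub String);
pub struct ErrorCode(pub String);

/// Hypothetical type-keyed store standing in for a provider.
#[derive(Default)]
pub struct Context {
    items: HashMap<TypeId, Box<dyn Any>>,
}

impl Context {
    /// Stores `value` under its type; a later insert of the same type replaces it.
    pub fn insert<T: Any>(&mut self, value: T) {
        self.items.insert(TypeId::of::<T>(), Box::new(value));
    }

    /// Type-driven lookup: the requested type alone selects the entry.
    pub fn get<T: Any>(&self) -> Option<&T> {
        self.items.get(&TypeId::of::<T>())?.downcast_ref::<T>()
    }
}
```

With two bare String values the second insert would overwrite the first in this store, whereas Suggestion and ErrorCode occupy separate entries, and a request for a plain String matches neither of them.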
This appendix gives another example of the proposed API in action, this time for an intermediate trait which is not part of the standard library. The example is a system with a plugin architecture where plugins extend Provider so that plugin authors can access arbitrary data in their plugins without having to downcast the plugins. In the example I show the plugin definition and a single plugin (RulesFilter); I assume these are in different crates. RulesFilter uses the provider API to give access to a statistics summary (produced on demand) and to give access to a borrowed slice of its rules.
// Declared in an upstream crate.
pub trait Plugin: Provider { ... }
// `+ '_` lets these methods be called on trait objects that are not 'static.
impl dyn Plugin + '_ {
    pub fn get_plugin_data<T: 'static>(&self) -> Option<T> {
        any::request_value(self)
    }
    pub fn borrow_plugin_data<T: ?Sized + 'static>(&self) -> Option<&T> {
        any::request_ref(self)
    }
}
// Plugin definition for RulesFilter (downstream)
struct RulesFilter {
    rules: Vec<Rule>,
}
impl Plugin for RulesFilter { ... }
struct Statistics(String);
impl Provider for RulesFilter {
    fn provide<'a>(&'a self, demand: &mut Demand<'a>) {
        demand
            .provide_value(|| Statistics(format!(...)))
            .provide_ref::<[Rule]>(&self.rules);
    }
}
fn log_plugin_stats(plugin: &dyn Plugin) {
    if let Some(Statistics(s)) = plugin.get_plugin_data() {
        log(plugin.id(), s);
    }
}
fn log_active_rules(plugin: &mut dyn Plugin) {
    if let Some(rules) = plugin.borrow_plugin_data::<[Rule]>() {
        for r in rules {
            log(plugin.id(), r);
        }
    }
}