Skip to content

Latest commit

 

History

History
284 lines (192 loc) · 21.7 KB

3192-dyno.md

File metadata and controls

284 lines (192 loc) · 21.7 KB

This RFC was previously approved, but part of it later rejected

The Provider interface that is core to this proposal has been rejected for now by the libs team meeting. Without that element, what remains here is essentially just the Demand type (being renamed in rust-lang/rust#113464 to Request). Without Provider, Demand/Request is only usable by types defined within the standard library itself as is the case in the error_generic_member_access feature proposal. Since error_generic_member_access is the only known (at the time of writing this) feature using Demand/Request, the decision was made to track it in the error_generic_member_access feature and mark this as rejected for now.

Summary

This RFC proposes extending the any module of the core library with a generic API for objects to provide type-based access to data. (In contrast to the existing APIs which provides type-driven downcasting, the proposed extension integrates downcasting into data access to provide a safer and more ergonomic API).

By using the proposed API, a trait object can offer functionality like:

let s: String = object.request();
let s = object.request_field::<str>();           // s: &str
let x = object.request_field::<SpecificData>();  // x: &SpecificData

Here, the request and request_field methods are implemented by the author of object using the proposed API as a framework.

Notes

  • A major motivation for this RFC is 'generic member access' for the Error trait. That was previously proposed in RFC 2895. This RFC will use Error as a driving example, but explicitly does not propose changes to Error.
  • A proof-of-concept implementation of this proposal (and the motivating extension to Error) is at provide-any.
  • This work is adapted from dyno.
  • Previous iterations of this work included exposing the concept of type tags. These are still used in the implementation but are no longer exposed in the API.

Motivation

Trait objects (Pointer<dyn Trait>) provide strong abstraction over concrete types, often reducing a wide variety of types to just a few methods. This allows writing code which can operate over many types, using only a restricted interface. However, in practice some kind of partial abstraction is required, where objects are treated abstractly but can be queried for data only present in a subset of all types which implement the trait interface. In this case there are only bad options: speculatively downcasting to concrete types (inefficient, boilerplatey, and fragile due to breaking abstraction) or adding numerous methods to the trait which might be functionally implemented, typically returning an Option where None means not applicable for the concrete type (boilerplatey, confusing, and leads to poor abstractions).

As a concrete example of the above scenario, consider the Error trait. It is often used as a trait object so that all errors can be handled generically. However, classes of errors often have additional context that can be useful when handling or reporting errors. For example, an error backtrace, information about the runtime state of the program or environment, the location of the error in the source code, or help text suggestions. Adding backtrace methods to Error has already been implemented (currently unstable), but adding methods for all context information is impossible.

Using the API proposed in this RFC, a solution might look something like:

use std::error::Error;

// Some concrete error type.
struct MyError {
    backtrace: Backtrace,
    suggestion: String,
}

impl Error for MyError {
    fn provide_context<'a>(&'a self, req: &mut Demand<'a>) {
        req.provide_ref::<Backtrace>(&self.backtrace)
            .provide_ref::<str>(&self.suggestion);
    }
}

// Perhaps in a different crate or module, a function for handling all errors, not just MyError.
fn report_error(e: &dyn Error) {
    // Generic error handling.
    // ...

    // Backtrace.
    if let Some(bt) = e.get_context_ref::<Backtrace>() {
        emit_backtrace(bt);
    }

    // Help text suggestion.
    // Note, we should really use a newtype here to prevent confusing different string
    // context information (see appendix).
    if let Some(suggestion) = e.get_context_ref::<str>() {
        emit_suggestion(suggestion);
    }
}

An alternative is to do some kind of name-driven access, for example we could add a method fn get(name: String) -> Option<&dyn Any> to Error (or use something more strongly typed than String to name data). The disadvantage of this approach is that the caller must downcast the returned object and that leads to opportunities for bugs, since there is an implicit connection between the name and type of objects. If that connection changes, it will not be caught at compile time, only at runtime (and probably with a panicking unwrap, since programmers will be likely to unwrap the result of downcasting, believing it to be guaranteed by get). Furthermore, this approach is limited by constraints on Any, we cannot return objects by value, return objects which include references (due to the 'static bound on Any), objects which are not object safe, or dynamically sized objects (e.g., we could not return a &str).

Beyond Error, one could imagine using the proposed API in any situation where we might add arbitrary data to a generic object. Another concrete example might be the Context object passed to future::poll. See also a full example in the appendix.

Guide-level explanation

In this section I'll describe how the proposed API is used and defined. Typically there is an intermediate library which has a trait which extends Provider (we will use the Error trait in our examples) and has API which is a facade over any's API. This proposal is relatively complex in implementation. However, most users will not even be aware of its existence. Using the any extensions in an intermediate library has some complexity, but using the intermediate library is straightforward for end users.

This proposal supports generic, type-driven access to data and a mechanism for intermediate implementers to provide such data. The key parts of the interface are the Provider trait for objects which can provide data, and the request_* functions for requesting data from an object which implements Provider. Note that end users should not call the request_ functions directly, they are helper functions for intermediate implementers to use.

For the rest of this section we will explain the earlier example, starting from data access and work our way down the implementation to the changes to any. Note that the fact that Error is in the standard library is mostly immaterial. any is a public module and can be used by any crate. We use Error as an example, we are not proposing changes to Error in this RFC.

A user of Error trait objects can call get_context_ref to access data by type which might be carried by an Error object. The function returns an Option and will return None if the requested type of data is not carried. For example, to access a backtrace (in a world where the Error::backtrace method has been removed): e.get_context_ref::<Backtrace>() (specifying the type using the turbofish syntax may not be necessary if the type can be inferred from the context, though we recommend it).

Lets examine the changes to Error required to make this work:

pub mod error {
    pub trait Error: Debug + Provider {
        ...
        fn provide_context<'a>(&'a self, _req: &mut Demand<'a>) {}
    }

    impl<T: Error> Provider for T {
        fn provide<'a>(&'a self, req: &mut Demand<'a>) {
            self.provide_context(req);
        }
    }

    impl dyn Error + '_ {
        ...
        pub fn get_context_ref<T: ?Sized + 'static>(&self) -> Option<&T> {
            crate::any::request_ref(self)
        }
    }
}

get_context_ref is added as a method on Error trait objects (Error is also likely to support similar methods for values and possibly other types, but I will elide those details), it simply calls any::request_ref (we'll discuss this function below). But where does the context data come from? If a concrete error type supports backtraces, then it must override the provide_context method when implementing Error (by default, the method does nothing, i.e., no data is provided, so get_context_ref will always returns None). provide_context is used in the blanket implementation of Provider for Error types, in other words Provider::provide is delegated to Error::provide_context.

Note that by adding provide_context with a default empty implementation and the blanket impl of Provider, these changes to Error are backwards compatible. However, this pattern is only possible because Provider and Error will be defined in the same crate. For third party users, users will implement Provider::provide directly, in the usual way.

In provide_context, an error type provides access to data via a Demand object, e.g., req.provide_ref::<Backtrace>(&self.backtrace). The type of the reference passed to provide_ref is important here (and we encourage users to use explicit types with turbofish syntax even though it is not necessary, this might even be possible to enforce using a lint). When a user calls get_context_ref, the requested type must match the type of an argument to provide_ref, e.g., the type of &self.backtrace is &Backtrace, so a call to get_context_ref::<Backtrace>() will return a reference to self.backtrace. An implementer can make multiple calls to provide_ref to provide multiple data with different types.

Note that Demand has methods for providing values as well as references, and for providing more complex types. These will be covered in the next section.

The important additions to any are

pub trait Provider {
    fn provide<'a>(&'a self, req: &mut Demand<'a>);
}

pub fn request_value<'a, T: 'static>(provider: &'a dyn Provider) -> Option<T> { ... }
pub fn request_ref<'a, T: ?Sized + 'static>(provider: &'a dyn Provider) -> Option<&'a T> { ... }

pub struct Demand<'a>(...);

impl<'a> Demand<'a> {
    pub fn provide_value<T: 'static, F: FnOnce() -> T>(&mut self, fulfil: F) -> &mut Self { ... }
    pub fn provide_ref<T: ?Sized + 'static>(&mut self, value: &'a T) -> &mut Self { ... }
}

Reference-level explanation

For details of the implementation, see the provide-any repo which has a complete (though unpolished) implementation of this proposal, and an implementation of the extensions to Error used in the examples above.

Demand

impl<'a> Demand<'a> {
    /// Provide a value or other type with only static lifetimes.
    pub fn provide_value<T, F>(&mut self, fulfil: F) -> &mut Self
    where
        T: 'static,
        F: FnOnce() -> T,
    { ... }

    /// Provide a reference, note that the referee type must be bounded by `'static`, but may be unsized.
    pub fn provide_ref<T: ?Sized + 'static>(&mut self, value: &'a T) -> &mut Self { ... }
}

Demand is an object for provider types to provide data to be accessed. It is required because there must be somewhere for the data (or a reference to it) to exist.

provide_value and provide_ref are convenience methods for the common cases of providing a temporary value and providing a reference to a field of self, respectively. provide_value takes a function to avoid unnecessary work when querying for data of a different type; provide_ref does not use a function since creating a reference is typically cheap.

Drawbacks

There is some overlap in use cases with Any. It is sub-optimal for the standard library to include two systems with such similar functionality. However, I believe the new functionality is a complement to any, rather than an alternative: any supports type-hiding where the concrete type is chosen by the library, whereas with Provider, the library user chooses the concrete type.

This proposal is fairly complex, however, this is mitigated by restricting the exposed complexity to intermediate implementers or to users with advanced use cases. For many Rust programmers, they won't even know the implementation exists, but will reap the benefits via more powerful error handling, etc.

Rationale and alternatives

There are are few ways to do without the proposed feature: in some situations it might be possible to add concrete accessor methods to traits. A trait could implement generic member access based on a value identifier (i.e., name-driven rather than type driven), such a method would return a dyn Any trait object. Or a user could speculatively downcast a trait object to access its data.

Each of these approaches has significant downsides: adding methods to traits leads to a confusing API and is impractical in many situations (including for Error). Value-driven access only works with types which implement Any, that excludes types with lifetimes and unsized types; furthermore it requires the caller to downcast the result which is error-prone and fragile. Speculative downcasting violates the encapsulation of trait objects and is only possible if all implementers are known (again, not possible with Error).

The proposed API could live in its own crate, rather than in libcore. However, this would not be useful for Error (or other standard library types).

As in earlier drafts of this RFC, the proposed API could be in its own module (provide_any) rather than be part of any, either as a sibling or a child of any.

There are numerous ways to tweak the proposed API. The dyno and object-provider crates provide two such examples.

We could expose type tags (as used in the implementation of these APIs) to the user. However, whether to do so, and if so exactly how type tags should work (even the key abstractions) are open questions (see below for more discussion). Exposing type tags to the user makes for a much more flexible API (any type can be used if the user can write their own tags), but it requires the user to understand a somewhat subtle and complex abstraction. The API as currently presented is simpler.

Prior art

Operations involving runtime types are intrinsically tied to the specifics of a language, its runtime, and type system. Therefore, there is not much prior art from other languages.

A closely related feature from other languages is reflection. However, reflective field access is usually name-driven rather than type-driven. Due to Rust's architecture, general reflection is 'impossible'.

In Rust, there are several approaches to the problem. This proposal is adapted from dyno; object provider is a similar crate from the same author.

Some Rust error libraries provide roughly similar functionality. For example, Anyhow (and Eyre) allow adding context to errors in Results, which can be accessed by downcasting the error object to the context object.

Unresolved questions

Can we improve on the proposed API? Although we have iterated on the design extensively, there might be room for improvement either by general simplification, or making the most general cases using type tags more ergonomic.

As with any relatively abstract library, naming the concepts here is difficult. Are there better names? In particular, 'demand' (formerly 'requisition') is very generic.

Extending the API of Error is a primary motivation for this RFC, but those extensions are only sketched in the examples and implementation. What exactly should Error's API look like?

It was suggested by @Plecra in the comments of RFC 2895, that this mechanism could be used for providing data from a future's Context object. That is a more demanding application since it is likely to require &mut references, objects with complex lifetimes, and possibly even closures to be returned. That had motivated seeking a general API, rather than only supporting references and values.

The implementation of the proposed API uses type tags. These are similar to the Any trait in that they allow up- and down-casting of types, however, they are fundamentally different in that the abstract trait is not implemented by the concrete type, but rather there is a separate type hierarchy of tags which are a representation of the type. This allows lifetimes to be accurately reflected in the abstract types, which is crucial for the sound implementation of this API. Exactly how these tags should be represented, however, is unresolved. There is one implementation in the provide-any repo, this is an adaption of dyno. @Plecra has proposed an alternative encoding. There are probably others. Perhaps adding language features will allow still more. I expect this part of the implementation can be tweaked over time. The proposed API is designed to keep type tags fully encapsulated so that they can evolve without backwards compatibility risk.

Future possibilities

Type tags (discussed above) are part of the implementation of this proposal. In the future, they could be exposed to the user to make a more flexible, general API. Even with this more flexible API, we would still want to keep the existing API for convenience.

The proposal only supports handling value types (or other types with only 'static lifetimes) and reference types. We could also support mutable references fairly easily. Note, however, that this requires more methods than one might expect since providing mutable references requires a mutable reference to the provider, but for non-mutable references requiring a mutable provider is overly restrictive. Furthermore, providing multiple mutable references to different types requires extending the API of Demand in a non-trivial way (this is done in provide-any to demonstrate feasibility, thanks to @Plecra for the implementation). Therefore, I have not included handling of mutable references in this proposal, but we could easily add such support in the future.

Appendix 1: using newtypes

One issue with using type-based provision is that an object might reasonably provide many different values with the same type. If the object intends to provide multiple values with the same type, then it must distinguish them. Even if it doesn't, the user might request a value expecting one thing and get another with the same type, leading to logic errors (e.g., requesting a string from MyError expecting an error code and getting a suggestion message).

To address this, users should use specific newtypes to wrap more generic types. E.g., the MyError author could provide a Suggestion newtype: pub struct Suggestion(pub String). A user would use get_context::<Suggestion>() (or otherwise specify the type, e.g., let s: Suggestion = e.get_context().unwrap();) and would receive a Suggestion which they would then unpack to retrieve the string value.

Appendix 2: plugin example

This appendix gives another example of the proposed API in action, this time for an intermediate trait which is not part of the standard library. The example is a systems with a plugin architecture where plugins extend Provider so that plugin authors can access arbitrary data in their plugins without having to downcast the plugins. In the example I show the plugin definition and a single plugin (RulesFilter), I assume these are in different crates. RulesFilter uses the provider API to give access to a statistics summary (produced on demand) and to give access to a borrowed slice of its rules.

// Declared in an upstream crate.
pub trait Plugin: Provider { ... }

impl dyn Plugin {
    pub fn get_plugin_data<T: 'static>(&self) -> Option<T> {
        any::request_value(self)
    }

    pub fn borrow_plugin_data<T: ?Sized + 'static>(&self) -> Option<&T> {
        any::request_ref(self)
    }
}

// Plugin definition for RulesFilter (downstream)
struct RulesFilter {
    rules: Vec<Rule>,
}

impl Plugin for RulesFilter { ... }

struct Statistics(String);

impl Provider for RulesFilter {
    fn provide<'a>(&'a self, demand: &mut Demand<'a>) {
        demand
            .provide_value(|| Statistics(format!(...)))
            .provide_ref::<[Rule]>(&self.rules);
    }
}

fn log_plugin_stats(plugin: &dyn Plugin) {
    if let Some(Statistics(s)) = plugin.get_plugin_data() {
        log(plugin.id(), s);
    }
}

fn log_active_rules(plugin: &mut dyn Plugin) {
    if let Some(rules) = plugin.borrow_plugin_data::<[Rule]>() {
        for r in rules {
            log(plugin.id(), r);
        }
    }
}