Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

RFC: trait-declared associated type synonyms #313

Closed
rust-highfive opened this issue Sep 24, 2014 · 6 comments
Closed

RFC: trait-declared associated type synonyms #313

rust-highfive opened this issue Sep 24, 2014 · 6 comments

Comments

@rust-highfive
Copy link

Note: Much or all of the feature requested here should be covered by RFC PR #195


Issue by pnkfelix
Tuesday Feb 19, 2013 at 16:15 GMT

For earlier discussion, see rust-lang/rust#5033

This issue was labelled with: A-servo, A-traits, A-typesystem, B-RFC, I-enhancement, P-high in the Rust repository


Rust's type system reflects many advances in the state of the art.

However, there is one feature (offered by C++ and ML) that is not present in Rust: The ability to declare within a trait a named type (and then each impl item would be obligated to bind that name to an appropriate type for the implementation).

Related work: See Garcia et al. 2003, especially section 6.2, 10.2, and 10.3. Another useful reference is Chakravarty et al 2005

The Haskell community calls such a construct an "associated type synonym."

Here is an small example to illustrate the idea.

First, the easy part: how such a feature would look at the level of trait and impl items:

trait Graph {
  pub type Node;   // Graphs are composed of vertices
  pub type Edge;   // and arcs between the vertices.
  pub fn src(&self, edge: Edge) -> Node;
  pub fn tgt(&self, edge: Edge) -> Node;
}


// A vector of (source,target) pairs is a graph where the edges
// indices into the vector
impl<T : Copy> &[(T,T)] : Graph {
  pub type Node = T;    // Here we satisfy the obligation to provide Node
  pub type Edge = uint;  // and Edge implementation types.
  pub fn src(&self, edge: Edge) -> Node { let (s,_) = self[edge]; s }
  pub fn tgt(&self, edge: Edge) -> Node { let (_,t) = self[edge]; t }
}

Ideally one would extract the type from a trait by using a path, using a type parameter as a path component. So for example:

trait IncidenceGraph : Graph {
  pub fn out_degree(&self, node: self::Node) -> uint;
  pub fn out_edges(&self, node: self::Node, it: fn(self::Edge) -> bool);
}

fn children<G:IncidenceGraph>(&g:G, &n: G::Node) -> ~[G::Node] {
  let result = ~[];
  for g.out_edges(n) |e| {
    ret.push(g.tgt(n))
  }
  ret
}

Note: I am aware that one can express currently express type synonyms in much the same way that one does in Java (or C#, or Eiffel; see Garcia et al paper): by making them a type-parameter of the trait itself. Rust's type-inference system does make that workaround more palatable, but it seems natural to enable the developer to express what they mean more directly: These types are obligations of the implementation, much like the functions declared in the trait, and thus it makes sense to express them the same way.


There are various issues to consider: e.g.:

  • Do we allow bounds on the type declarations in the trait,
  • Is there is any potential ambiguity with module names introduced by allowing traits (or rather, type parameters implementing a trait) to appear as path components in paths that are used in a type-expression context.
  • Are there any gotcha's or generalizations that have been uncovered by the broader PL community that we would want to address. (For example, GHC has indexed data families, with which I am not familiar but seem to be a generalization with some potential gotchas. Caveat: I am not a Haskell programmer.)

Despite the above issues, this seems like a relatively straightforward change to the language (especially since one can already encode the pattern, so adding it should not inject e..g any type soundness bugs, right?).

@cartazio
Copy link

there are no gotchas for assoicated data families, none. they're great! NB that associate data families!= associated type families (and i think rust is currently only providing associated type families?)

one really really nice thing about data families is they provide injectivity so that knowing the type of the data family instance tells you what the index should be! This is not the case with type families.

@eduardoleon
Copy link

I would like to extend this proposal with the ability to declare both abstract and concrete types as trait members. (Presently, associated types can only be abstract.) My current use case is doing something like this:

pub trait InputIterator {
    pub type Element; // abstract
    pub type Leftovers; // abstract
    pub enum Step { Cont(Element, Self), Done(Leftovers) } // concrete
    pub fn get(self) -> Step;
}

pub trait OutputIterator {
    pub type Element;
    pub type Leftovers;
    pub enum Step { Cont(Self), Done(Leftovers) }
    pub fn put(self, Element) -> Step;
}

In implementations, the programmer must only provide definitions for those type members that are abstract in the original trait:

struct Zip<L, R> { left: L, right: R }
impl<L: InputIterator, R: InputIterator> InputIterator for Zip<L, R> {
    type Element = (L::Element, R::Element);
    enum Leftovers {
        MoreL(L::Element, L, R::Leftovers),
        MoreR(L::Leftovers, R::Element, R),
        Neither(L::Leftovers, R::Leftovers)
    }
    // enum Step *cannot* be redefined here! That would mess up any code that uses the trait.
    fn get(self) -> Step {
        match (self.left.get(), self.right.get()) {
            (L::Cont(el, il), R::Cont(er, ir)) => Cont( (el, er), Zip { left: il, right: ir } ),
            // ...
        }
    }
}

The benefits of this approach are:

  1. The names Step, Cont and Done can be reused for both traits.
  2. We do not need to make InputStep a generic with three type parameters, or OutputStep a generic with two type parameters.

@pnkfelix
Copy link
Member

@eduardoleon the associated types proposal has support for defining default type expressions for a given associated type.

So your example could be written there as something like this (right?):

pub mod input { pub enum Step<S, E, L> { Cont(E, S), Done(L) } }
pub mod output { pub enum Step<S, L> { Cont(S), Done(L) } }
pub trait InputIterator {
    type Element; // abstract
    type Leftovers; // abstract
    type Step = input::Step<Self, Element, Leftovers>;
    fn get(self) -> Step;
}

pub trait OutputIterator {
    type Element;
    type Leftovers;
    type Step = output::Step<Self, Element, Leftovers>;
    fn put(self, Element) -> Step;
}

@eduardoleon
Copy link

@pnkfelix It can be written like that... with some caveats. One thing that worries me is that, presently, associated types can be overriden by the programmer in trait implementations. I need external users to be able to pattern-match on Step, for example:

// again, example taken from C++'s standad library
pub enum Transform<II, LI, IO, LO> { StopI(LI, IO), StopO(II, LO) }
pub fn<I: InputIterator, O: OutputIterator>(f: |I::Element| -> O::Element, i: I, o: O) ->
    Transform<I, I::Leftovers, O, O::Leftovers> {
    match i.get() {
        input::Done(l) => StopI(l, o),
        input::Cont(e, i) =>
            match o.put(f(e)) {
                output::Done(l) => StopO(i, l),
                output::Cont(o) => transform(i, o)
            }
    }
}

The pattern matching done above is unsound if type members are overridable. I am aware that this can be worked around by not defining Step in either InputIterator and OutputIterator, and replacing Step with its definition in the type signatures for get and put. However, this comes with its own set of annoyances:

pub trait InputIterator {
    type Element;
    type Leftovers;
    fn get(self) -> input::Step<Self, Element, Leftovers>; // looks good so far
}

// Annoyance 1: Defining ZipInputLeftovers, with no less than 6 generic
// parameters, outside of the InputIterator implementation for Zip.
// This is because generics type definitions alone are too dumb to
// gather the associated types of the type parameters.
pub enum ZipInputLeftovers<IL, EL, LL, IR, ER, LR> {
    More1(EL, IL, LR),
    More2(LL, ER, IR),
    Neither(LL, LR)
}

impl<L: InputIterator, R: InputIterator> InputIterator for Zip<L, R> {
    type Element = (L::Element, R::Element);
    type Leftovers = ZipLeftovers<L, L::Element, L::Leftovers, R, R::Element, R::Leftovers>;
    // Annoyance 2: get's type is way too long. zip is a particularly simple iterator,
    // I can guarantee it gets worse with other iterators, such as flatten and extend.
    fn get(self) -> input::Step(Zip<Element, Leftovers>, Element, Leftovers) {
        match (self.left.get(), self.right.get()) { ... }
    }
}

@quantheory
Copy link
Contributor

I sort of like the idea of having concrete types, but I'm puzzled. Shouldn't something like the following be possible under the existing RFC?

pub enum InputStep<I: InputIterator> {
    Cont(I::Element, I),
    Done(I::Leftovers),
}

pub enum ZipInputLeftovers<L: InputIterator, R: InputIterator> {
    More1(L::Element, L, R::Leftovers),
    More2(L::Leftovers, R::Element, R),
    Neither(L::Leftovers, R::Leftovers),
}

Of course this is more verbose than being able to inline these in the trait or impl, and you have to use names like InputStep in some cases, but this seems much better than making every used type a generic (input) parameter. Or am I wrong and something prevents this?

@pnkfelix
Copy link
Member

I'm going to close this ticket; at this point, the addition of associated items to the language has addressed the original problem that prompted me to file it back in early 2013, and any suggestions of further changes should be filed as separate tickets rather than appended here.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

5 participants