Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Tighten the definition of union types in the new engine #4005

Closed
stuhood opened this issue Oct 27, 2016 · 6 comments
Closed

Tighten the definition of union types in the new engine #4005

stuhood opened this issue Oct 27, 2016 · 6 comments

Comments

@stuhood
Copy link
Member

stuhood commented Oct 27, 2016

Unions in thrift/c/etc are distinct types, of which you can have a concrete instance.

Unions in scala/typescript are a synthetic type, of which you cannot construct an instance. Instead, a value typed as a union will concretely have the type of one of the members of the union.


Currently we are mostly applying the latter (especially since there is no way to instantiate a concrete instance of the union), but our task indexing allows exact-match lookups of the union type, which means that the decision of which task to use acts more like the former case.

Additionally, since we index by both the union and the members of the union, it will be the case that sometimes we select a task for a member of the union, but get some other member at runtime, leading to a runtime failure.

cc @baroquebobcat

@stuhood stuhood added the engine label Oct 27, 2016
@baroquebobcat
Copy link
Contributor

I think we need a concrete union type for the results of tasks like parsing where the actual types are determined at runtime. For me the question is how the union types and their members interact with other tasks.

one proposal would be to model it as though there's a cast/typecheck function that turns a union type'd concrete object into either a member type or doesn't match.
We could then use that to allow tasks that produce a union to fulfill selects for members of the union. The tricky part there is that it wouldn't apply for subset selects, ie if a select selected for a subset of a union, it wouldn't match. I think this is a reasonable behavior though.

A union is a type you can consume as a type, or you can consume a member, but it may or may not be fulfilled via the task that produces the union.
If you have a separate union that is a subset of another union, you would need to create a task to convert. We could add casts for subsets too, but I'd prefer to start without and see where that takes us.

How's this for a proposal:

Get rid of type constraints. Instead we'll only have something called a ProductType.

  • ProductTypes are either singular, concrete types, or unions.
  • A Select's slot for a ProductType can only consume one type.
  • A Task can produce only one ProductType.
  • If a Task produces a ProductType that is a union, we inject cast/type-checks for each of the member types of the union into the rule graph.

@stuhood
Copy link
Member Author

stuhood commented Feb 27, 2017

I think I'm biased against a concrete union type (ie, a type which is instantiated to wrap an instance of the member type), because it doesn't feel particularly first class. A user could easily create a concrete union outside of the engine and then use it to do this themselves.

The tricky part there is that it wouldn't apply for subset selects, ie if a select selected for a subset of a union, it wouldn't match. I think this is a reasonable behavior though.

Removing this (which is what is used for the symbol table, iirc) would be tricky... it's one of the key reasons for unions in the first place. It sounds like you're suggesting that downcasting from the union to a particular type would still be supported in some way though, so I might be misunderstanding.

EDIT: I guess the gist of my objection is that the concrete union type you've suggested sounds like it would be user facing. It would probably make sense to have a concrete union type as an implementation detail, but exposing it to users would be unfortunate.

@baroquebobcat
Copy link
Contributor

Hm. I should have re-reviewed the initial issue after putting together my thoughts.

I think unions should be synthetic in that a task ought to never see an instance of a union type. But unions are user facing in that the engine's scheduling and matching semantics about them are exposed to implementors.

@stuhood
Copy link
Member Author

stuhood commented Feb 27, 2017

Yea, that makes more sense.

@stuhood
Copy link
Member Author

stuhood commented Mar 29, 2017

This also seems to block adding a check immediately after a Task runs that validates that it returns the type that it says it does, because we duplicate the Task for every member of the union, and then allow the Select above it to Noop for members that don't match the appropriate type.

If instead the Task was installed precisely once for the union, then the return value would match the union constraint, and only then would the Select discard it.

stuhood pushed a commit that referenced this issue Mar 6, 2019
…ngine (#7114)

### Problem

See #4535 and #4005, in particular [this comment on #4535](#4535 (comment)). `TypeConstraint` is a pretty general construct that we would like to do more with, for example #6936, and as of [the current discussion in #4535](#4535 (comment)) we realize we can emulate union types in `@rule`s without it, matching just against a specific type.

### Solution

- Convert `output_constraint` in the `Rule` subclasses in `rules.py` into `output_type`, and add a `SubclassesOf(type)` type check in `datatype` fields in those classes to ensure this.
- Remove `satisfied_by()` and `satisfied_by_type()` externs, and add a `product_type()` extern used to intern a python `type` as a `TypeId`.
- Convert all `TypeConstraint`s passed to the engine for intrinsics (e.g. `Snapshot`) into `TypeId`s.
- Match whether a rule's result matches its declared output type by simply using `==`!
- Manually implement `fmt::Debug` for `TypeId` to be the same as `Display` (we may want to do these differently in the future, but it is very useful to see the actual type name when debugging).

### Result

Everything is the same, but now we don't have the additional complexity of `TypeConstraint` down in the engine when we can do simple `TypeId` equality. This lets us get more creative with `TypeConstraint` on the python side, while type checking on the rust side is a little less complex (and probably more efficient to some degree).
@Eric-Arellano
Copy link
Contributor

Unions have been used extensively now and have fairly strong conventions that work out nicely.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

3 participants