Better type hints for attribute definitions #103

Merged
merged 36 commits into from
Jun 10, 2022

math-fehr
Collaborator

This PR changes:

  • The type of Data, which now is a generic type
  • The way we define attribute parameters

This is a first step to make our project pyright-safe.
This does require us to switch to Python 3.10, which I am not thrilled about, but we need to move forward at some point anyway, and it will also let us write pattern-matching code.

Example

Previously, we defined parameters with

param = ParameterDef(MyAttribute)

This somewhat breaks Python's type system.
So here is what I'm proposing:

param: ParameterDef[MyAttribute]

ParameterDef[T] is a type alias for Annotated[T, ParameterDefAnnot], so essentially this will be understood as T by the type checker.

This means that we can now define attributes expecting a single attribute type, and this will be understood by the type checker.
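As a rough illustration of the alias mechanism described above, the sketch below uses stand-in classes (MyAttribute, ParameterDefAnnot and MyParametrized are placeholders here, not the project's actual definitions). It shows that a type checker sees only T, while the marker stays recoverable at runtime through typing.get_type_hints.

```python
from typing import Annotated, TypeVar, get_args, get_origin, get_type_hints

_T = TypeVar("_T")


class ParameterDefAnnot:
    """Stand-in marker class (assumption: the real one may carry more state)."""


class MyAttribute:
    """Stand-in attribute class used only for this example."""


# Generic alias: ParameterDef[X] expands to Annotated[X, ParameterDefAnnot].
ParameterDef = Annotated[_T, ParameterDefAnnot]


class MyParametrized:
    # A type checker treats this annotation as plain MyAttribute.
    param: ParameterDef[MyAttribute]


# include_extras=True keeps the Annotated metadata instead of stripping it.
hint = get_type_hints(MyParametrized, include_extras=True)["param"]
base, *metadata = get_args(hint)

# Without include_extras, only the underlying type remains.
plain_hints = get_type_hints(MyParametrized)
```

A definition-processing decorator could then look for ParameterDefAnnot in the metadata to decide which class-level annotations are parameters.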

Type with non-structural constraints

We can also add more constraints that do not necessarily change the type of the parameter.
For instance, here is how we could define a parameter that is an even number:

param: ParameterDef[Annotated[IntAttr, EvenAttrConstraint]]

For the type system, this still translates to IntAttr, so everything is fine!
For IRDL, this first checks that the parameter is an IntAttr, then calls the constraint to check that the parameter satisfies it.
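The two-step check described above could be sketched roughly as follows. IntAttr and EvenAttrConstraint are simplified stand-ins, and verify_param is a hypothetical helper written for this example, not the actual IRDL verifier.

```python
from dataclasses import dataclass
from typing import Annotated, Any, get_args, get_origin


@dataclass(frozen=True)
class IntAttr:
    """Stand-in for an integer attribute wrapping a Python int."""
    data: int


class EvenAttrConstraint:
    """Hypothetical non-structural constraint: the wrapped int must be even."""

    @staticmethod
    def verify(attr: IntAttr) -> None:
        if attr.data % 2 != 0:
            raise ValueError(f"{attr.data} is not even")


def verify_param(value: Any, annotation: Any) -> None:
    """Check the structural type first, then every extra constraint."""
    if get_origin(annotation) is Annotated:
        base, *constraints = get_args(annotation)
    else:
        base, constraints = annotation, []
    if not isinstance(value, base):
        raise TypeError(f"expected {base.__name__}, got {type(value).__name__}")
    for constraint in constraints:
        constraint.verify(value)


EvenInt = Annotated[IntAttr, EvenAttrConstraint]
verify_param(IntAttr(4), EvenInt)  # passes both checks
```

The type checker only ever sees IntAttr here; the evenness check exists purely at verification time.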

Current drawbacks

The major drawback I see shows up with, for instance, ArrayAttr. We want to define an ArrayAttr of IntegerType using ArrayAttr[IntegerType], but this cannot be understood by IRDL. Thus, we have to write something like Annotated[ArrayAttr[IntegerType], ArrayOfConstraint(IntegerType)], which is not really nice to write.

Maybe one idea would be to also replace ArrayOfConstraint(IntegerType) with ArrayOfConstraint[IntegerType], so we could write something like ArrayOf = Annotated[T, ArrayOfConstraint[T]], making it more concise, by just writing ArrayOf[IntegerType] in the end.

Tell me what you think!

@math-fehr force-pushed the irdl-better-types branch 2 times, most recently from 687dca1 to 134a400 (April 14, 2022 19:52)
Collaborator

@webmiche webmiche left a comment

LGTM! Looking forward to working with this, getting in more tooling support would help a lot in the development with xdsl. Plus, all the red lines in my project are annoying me, since I know that the types actually work out...

src/xdsl/irdl.py Outdated
src/xdsl/util.py Outdated
@michel-steuwer
Member

@math-fehr I committed a couple of changes with 42847e0

The main points of this commit are:

  • I made ArrayAttr a generic class inheriting from Data[List[A]]. This means we can now write:
shape: ParameterDef[ArrayAttr[IntegerAttr]]

instead of the previous

shape: ParameterDef[Annotated[ArrayAttr, ArrayOfConstraint(IntegerAttr)]]

I updated the IRDL implementation to add the ArrayOfConstraint implicitly.
In fact, I made it a bit more generic and added it to any subclass of Data[List[A]] where A is a type variable with bound Attribute.

  • In places I use Sequence instead of List as the type of a function argument, as Sequence allows passing both List and Frozenlist to the function.

  • I added a couple of additional type annotations across xdsl
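To make the implicit constraint from the first point more concrete, here is a minimal sketch of how an element constraint might be inferred from a subscripted generic. The classes are simplified stand-ins, and verify_array is a hypothetical helper; this is not the code added to irdl_to_attr_constraint.

```python
from typing import Any, Generic, List, TypeVar, get_args, get_origin


class Attribute:
    """Stand-in base class for all attributes."""


A = TypeVar("A", bound=Attribute)


class IntegerAttr(Attribute):
    def __init__(self, value: int) -> None:
        self.value = value


class StringAttr(Attribute):
    def __init__(self, value: str) -> None:
        self.value = value


class ArrayAttr(Attribute, Generic[A]):
    """Stand-in for a list-backed attribute."""

    def __init__(self, data: List[A]) -> None:
        self.data = data


def verify_array(attr: Any, annotation: Any) -> None:
    """Derive the 'array of X' check from an annotation like ArrayAttr[X]."""
    origin = get_origin(annotation)      # ArrayAttr
    (elem_type,) = get_args(annotation)  # X
    if not isinstance(attr, origin):
        raise TypeError(f"expected {origin.__name__}")
    for elem in attr.data:
        if not isinstance(elem, elem_type):
            raise TypeError(f"expected elements of type {elem_type.__name__}")


verify_array(ArrayAttr([IntegerAttr(1), IntegerAttr(2)]), ArrayAttr[IntegerAttr])
```

The point of the sketch is only that the type argument already carries everything needed to build the element check, so nothing extra has to be written in the annotation.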

A review of the commit from @math-fehr would be great (specifically the code I added to irdl_to_attr_constraint)!
Additional feedback from @webmiche or @Dinistro is also welcome!

I will look over the rest and see if I have more comments, but I think I am fairly happy with this as a first step to add better typing.

@math-fehr
Collaborator Author

Thanks a lot!

I'll add some more stuff tonight/tomorrow, but it already looks way better.

Collaborator Author

@math-fehr math-fehr left a comment

Thanks a lot for doing a big pass over everything!
I'll try to make one change, to remove the attribute constraint from IRDL, and essentially allow users to encode special cases, rather than write them in IRDL.

value = IntAttr.build(value)
if not isinstance(typ, IndexType):
    typ = IntegerType.build(typ)
return IntegerAttr([value, typ])


@irdl_attr_definition
-class ArrayAttr(Data[List[Attribute]]):
+class ArrayAttr(Data[List[A]]):
Collaborator Author

Here, you are using the A defined in ir.
Could we instead use a new TypeVar for each generic class? Otherwise, a change in ir will have an effect here, and we probably won't think of updating this.

Member

Type variables in Python are sufficiently weird.
The wildcard import of ir imports A as well, which is why it is in scope.
Maybe we can have all type variables start with an underscore, so that they are not automatically imported everywhere?
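The underscore suggestion relies on standard Python behaviour: without an `__all__` list, a wildcard import skips names that begin with an underscore. A small self-contained check, using a throwaway module named fake_ir (a hypothetical name made up for this example):

```python
import sys
import types

# Build a throwaway module that defines one public and one private TypeVar.
fake_ir = types.ModuleType("fake_ir")
exec(
    "from typing import TypeVar\n"
    "A = TypeVar('A')\n"
    "_B = TypeVar('_B')\n",
    fake_ir.__dict__,
)
sys.modules["fake_ir"] = fake_ir

# Simulate `from fake_ir import *` in a fresh namespace.
namespace: dict = {}
exec("from fake_ir import *", namespace)

public_leaked = "A" in namespace        # the public TypeVar leaks into scope
private_leaked = "_B" in namespace      # the underscore-prefixed one does not
```

Note that the wildcard import also pulls in TypeVar itself, since it is a public name in the module, which is the same effect that brought A into scope.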

src/xdsl/irdl.py Outdated
return constraints[0]

# Attribute class case
# This is a coercion for a `BaseAttr`.
if isclass(irdl) and issubclass(irdl, Attribute):
    return BaseAttr(irdl)

# yapf: disable
Collaborator Author

I'm not a big fan of complicating IRDL with this, because we would have the same issue with Dict, or any generic data structure we could put in Data. Every such case would have to be encoded in IRDL, which is not really extensible.

I feel that instead, we could add some fields and methods to ArrayAttr, and make IRDL collect this information to produce these constraints. I'm going to try this out tomorrow to see how it looks.

Member

I actually thought about this as an alternative.
Reasons why I went with this design:

  1. this fixes the type of the data field to be the one we want (e.g. when we write something.data[0].data where something: Data[List[IntAttr]], we get an int back and the type checker knows this)
  2. there is nothing more to be said, the constraint can be inferred automatically (as the implementation shows)
  3. I deliberately made the implementation in IRDL not specific to ArrayAttr but generic over Data[List[A]], so it applies to any subclass of Data with a List[A].
    Given this type, the data field will be List[A], and therefore the constraint to verify is that all elements of the list satisfy the constraint of A.
    All of this is just a continuation of what Data already does.

We could also implement a constraint that verifies the equivalent for Dict[K, V] checking that all keys satisfy the K constraint and all associated values the V constraint.
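A dictionary counterpart might look roughly like the sketch below. DictAttr and verify_dict_of are hypothetical names invented for this example, and the attribute classes are simplified stand-ins.

```python
from typing import Any, Dict, Generic, TypeVar, get_args


class Attribute:
    """Stand-in base class for all attributes."""


class StringAttr(Attribute):
    def __init__(self, data: str) -> None:
        self.data = data


class IntAttr(Attribute):
    def __init__(self, data: int) -> None:
        self.data = data


K = TypeVar("K", bound=Attribute)
V = TypeVar("V", bound=Attribute)


class DictAttr(Attribute, Generic[K, V]):
    """Hypothetical dictionary-backed attribute."""

    def __init__(self, data: Dict[K, V]) -> None:
        self.data = data


def verify_dict_of(attr: DictAttr, annotation: Any) -> None:
    """Check every key against K and every value against V."""
    key_type, value_type = get_args(annotation)
    for key, value in attr.data.items():
        if not isinstance(key, key_type):
            raise TypeError(f"key is not a {key_type.__name__}")
        if not isinstance(value, value_type):
            raise TypeError(f"value is not a {value_type.__name__}")


good = DictAttr({StringAttr("a"): IntAttr(1)})
verify_dict_of(good, DictAttr[StringAttr, IntAttr])
```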

Furthermore, we could restrict what you are allowed to use as a type-argument with Data, e.g. Attribute | List | Dict but nothing more ...

Collaborator Author

I completely agree with points 1 and 2, and don't intend to change them!
The only thing I don't really like is that IRDL now needs to be modified for any new data structure that users might use.
I'll try something now to show an example of how I thought of doing this!

Collaborator Author

I wrote a pull request that should explain what I meant: #105
Feel free to merge it to this PR if you think this makes sense, otherwise we can discuss it on that PR!

Member

I had a look at the PR.
You clearly succeeded at keeping 1, but I don't think you managed to keep 2, as you now have to implement the generic_constraint_coercion function in ArrayAttr.

I am ok with the change, as I can see your point of keeping IRDL free of "special" cases, even though I think that you now open the door to allowing people to misuse the API more easily (as they can define a subclass of Data[List[A]] but forget to implement generic_constraint_coercion, or implement it wrongly).

Collaborator Author

Oh, I think I misunderstood point 2! I don't think there is actually any way to get around it, in the sense that I don't think we can handle most cases in IRDL, since each new case will require a change in IRDL (for instance, I don't think two users will have the same ArrayData using List).

We can be sure that users won't forget to implement generic_constraint_coercion, since IRDL won't accept any generic Data that does not implement it (IRDL cannot translate generic types to attribute constraints on its own). However, I believe we can't prevent them from implementing it wrongly.
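The trade-off discussed in this thread can be sketched as follows. The signature given to generic_constraint_coercion here (a static method taking the tuple of type arguments and returning a predicate) is an assumption made for the example, as are to_constraint and BadArray; the real hook in #105 may look different.

```python
from typing import Any, Callable, Generic, List, Tuple, TypeVar, get_args, get_origin

T = TypeVar("T")
Constraint = Callable[[Any], bool]


class Data(Generic[T]):
    """Stand-in for a data-carrying attribute."""

    def __init__(self, data: T) -> None:
        self.data = data


class IntAttr(Data[int]):
    pass


class ArrayAttr(Data[List[T]]):
    @staticmethod
    def generic_constraint_coercion(args: Tuple[Any, ...]) -> Constraint:
        # Assumed signature: receives the type arguments, returns a check.
        (elem_type,) = args
        return lambda attr: isinstance(attr, ArrayAttr) and all(
            isinstance(elem, elem_type) for elem in attr.data
        )


class BadArray(Data[List[T]]):
    """Generic Data subclass whose author forgot the hook."""


def to_constraint(annotation: Any) -> Constraint:
    """Hypothetical translation step: no built-in special cases for generics."""
    origin = get_origin(annotation)
    if origin is None:  # plain, non-generic class
        return lambda attr: isinstance(attr, annotation)
    hook = getattr(origin, "generic_constraint_coercion", None)
    if hook is None:
        raise TypeError(
            f"{origin.__name__} is generic but does not implement "
            "generic_constraint_coercion"
        )
    return hook(get_args(annotation))


check = to_constraint(ArrayAttr[IntAttr])
```

Under these assumptions, forgetting the hook is caught when the definition is translated, while a hook that returns the wrong check would go unnoticed, which matches the concern raised above.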

On the other PR, I'll add some features to document this better when users forget to implement generic_constraint_coercion!

@math-fehr force-pushed the irdl-better-types branch from 5a0d522 to 73a9476 (May 5, 2022 22:29)
@math-fehr
Collaborator Author

I believe that the PR should soon be ready to merge?

The last thing I'd like to add before is the support for this:

T = TypeVar("T", bound=Attribute)
@irdl_attr_definition
class VectorType(ParametrizedAttribute):
    name = "vector"
    shape: ParameterDef[ArrayAttr[IntegerAttr]]
    element_type: ParameterDef[T]

so we can write VectorType[IntegerAttr] for instance.
To handle that use case, we would have to generate two verifiers: one for VectorType and one for the constraints using VectorType[IntegerAttr].

Now consider this problem. When we encounter a TypeVar in a definition, we might think that we can just transform it into an AnyAttr constraint, since that is the more general constraint. However, if this constraint is placed inside a Not, it has the opposite effect: the result becomes more restrictive. So what I would suggest is to allow TypeVar everywhere except in Not and in custom constraints given by the user (this includes the GenericData), since they might not preserve this "order" on constraints.
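A tiny model of why Not reverses the "order" on constraints, using simplified stand-in classes (AnyAttr, BaseAttr and Not here are toy predicates with an accepts method, not the real IRDL constraint classes):

```python
from dataclasses import dataclass
from typing import Any


class Attribute:
    """Stand-in base class for all attributes."""


class IntAttr(Attribute):
    pass


class StringAttr(Attribute):
    pass


class AnyAttr:
    """Toy constraint accepting every attribute."""

    def accepts(self, attr: Any) -> bool:
        return isinstance(attr, Attribute)


@dataclass
class BaseAttr:
    """Toy constraint accepting instances of one attribute class."""
    cls: type

    def accepts(self, attr: Any) -> bool:
        return isinstance(attr, self.cls)


@dataclass
class Not:
    """Toy negation of another constraint."""
    inner: Any

    def accepts(self, attr: Any) -> bool:
        return not self.inner.accepts(attr)


specific = BaseAttr(IntAttr)  # what T might be instantiated to
widened = AnyAttr()           # what T would be replaced with

# In a positive position, widening only accepts more.
positive_ok = widened.accepts(IntAttr()) and widened.accepts(StringAttr())

# Under Not, the "widened" version accepts less than the specific one.
specific_negated = Not(specific).accepts(StringAttr())
widened_negated = Not(widened).accepts(StringAttr())
```

In this model Not(BaseAttr(IntAttr)) accepts a StringAttr, but Not(AnyAttr()) rejects it, so substituting AnyAttr for a TypeVar under Not would no longer be a generalization.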

I think if we restrict IRDL like this, I should be able to write something to make it work!

@math-fehr
Collaborator Author

math-fehr commented May 9, 2022

I added this support on the last PR, tell me if this works for you!
I think if people are happy with this, we should be able to merge it! (I'll take care of operations in a separate PR)

@webmiche
Collaborator

It would be great if you could wait for #111 to be fixed, so that we can roll out a new version with that fix, which I can use before rewriting everything ;)

@webmiche
Collaborator

@math-fehr This can be merged now (after a rebase I guess)

@math-fehr force-pushed the irdl-better-types branch 2 times, most recently from 0999d9a to 3de15ba (June 4, 2022 16:29)
@math-fehr
Collaborator Author

That should now be good on my end. @michel-steuwer, do you want to add something before I merge it?

@math-fehr math-fehr merged commit 95f5ce2 into main Jun 10, 2022
@math-fehr math-fehr deleted the irdl-better-types branch June 10, 2022 14:40