-
Notifications
You must be signed in to change notification settings - Fork 146
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Introduce typeguard in tests #2073
Conversation
evap/staff/importers/enrollment.py
Outdated
class Meta(type): | ||
def __instancecheck__(cls, instance: object) -> TypeGuard["ValidCourseData"]: | ||
if not isinstance(instance, CourseData): | ||
return False | ||
return all_fields_valid(instance) | ||
|
||
|
||
class ValidCourseData(CourseData, metaclass=Meta): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Wow, didn't know about __instancecheck__
. On one hand this is so cool, but on the other hand it allows programmers to do so many things that are just wrong :D
Two notes:
- should probably be named
ValidCourseDataMeta
instead of justMeta
? - does this allow us to just use
isinstance
in the importer when we currently checkall_fields_valid
? If yes, is that desirable? Feels consistent to me, but I'm unsure.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
if 2., should all_fields_valid
be merged with this instancecheck? What do you think @niklasmohrin?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If I understand correctly, we are overwriting isinstance(x, ValidCourseData)
here. Do you know if this has any implications internally? Afaik, all Python objects have their own __class__
field to look up methods etc. So if we had a method on ValidCourseData
, we could have the situation that an object passes isinstance
, but still uses a method from a parent, right? This is odd to think about, can't we just have def as_valid(self) -> Optional[ValidCourseData]
method where we construct the object of the child class?
(If the answer is no, I would be fine with merging the two methods)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't know if I understood you correctly.
As far as I understood it, isinstance(x, cls)
will use the implementation of __instancecheck__
of the metaclass of cls
, for all and every x
. The metaclass is also "inherited" when subclassing ValidCourseData
in this case. So, the answer should be no I guess? (I may have misunderstood the question)
Context: typeguard uses isinstance
to determine if the annotated return type matches the type of the returned object. typeguard does not support TypeGuard
s though (originally intended for static typechecking only) and complains that valid CourseData
objects aren't actually ValidCourseData
objects at runtime. This is just a limitation of typeguard.
Since metaclasses always cause some headache (see this thread), maybe it is still better to use as_valid
all along? Then I feel like we may be loosing some conveniences of TypeGuard
though.
We can also just don't do 2., i.e. revert 123af29 and leave the __instancecheck__
meta thingy only for typeguard and continue to use all_fields_valid
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
After reading your question again: You mean when we override a function of CourseData
in ValidCourseData
? Then indeed, the implementation in CourseData
should be used, because the type
of the object did not change. Since we are talking about TypeGuard
s here, this is not really relevant because you won't override a function in ValidCourseData
, if I understood TypeGuard
correctly (a "feature" is that the runtime type remains the same). You would only change the function parameter types / return types in ValidCourseData
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I am not really convinced to be honest, I feel that we should generally avoid playing with isinstance
, because it is a function that programmers have strong assumptions about and we are breaking them here. In particular, this typing-related change has implications in non-typing situations, because we actually changed the code.
In the case here, the change is probably harmless and the easiest we will get. If you two prefer the current approach here, I am fine with that, still, I would vote against __instancecheck__
though :D
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'd be ok with it here since ValidCourseData
is a dataclass with no methods, but I see your point, so I'm also open to other workarounds / solutions.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
okay, reverted 123af29 first of all
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
After taking another look at the usage of ValidCourseData
, I am also fine with moving forward with the current version. Constructing a true instance of ValidCourseData
would mean that we need to inject this new instance back into the process, adding another Mapper
. Even worse, we would have to add ValidEnrollmentParsedRow
etc. which have ValidCourseData
instead of CouseData
.
Maybe, as a fun exercise, we could come up with a class decorator that adds a "validated" version of a class into the class namespace, so we would have
@partial_invalid
@dataclass
class CourseData:
"""Holds information about a course, retrieved from an import file."""
name_de: str
name_en: str
degrees: MaybeInvalid[set[Degree]]
course_type: MaybeInvalid[CourseType]
is_graded: MaybeInvalid[bool]
responsible_email: str
def foo(course_data: CourseData):
bar: CourseData.Validated = course_data.validate()
# ...
... and it recurses into member variables that are annotated with partial_invalid
. It is clearly overkill and I am unsure how well typecheckers will be able to see through this, but it sounds like a fun project :^)
0daed1b
to
4451d51
Compare
Github does not update to the last changes somehow... Maybe my changes will come up soon... |
6e7598d
to
123af29
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Have we verified that this catches the kinds of errors we expect it to / are there any examples yet?
Some final remarks:
close #2066