Mixins with only dataframe_checks cause error in SchemaModel #433
Comments
First off, I didn't get to say it in the formal report above, but thank you for putting your time into this wonderful library! To business: the error turns out to be simpler than I first reported. You can reproduce it for any schema with no annotations. For instance, the following code snippet produces a similar error:
import pandera as pa
class ASchema(pa.SchemaModel):
    pass
ASchema.to_schema()
# Traceback (most recent call last):
# File "test.py", line 10, in <module>
# ASchema.to_schema()
# File ".../pandera/model.py", line 150, in to_schema
# cls.__config__ = cls._collect_config()
# File ".../pandera/model.py", line 331, in _collect_config
# base_options = _extract_config_options(config)
# File ".../pandera/model.py", line 97, in _extract_config_options
# for name, value in vars(config).items()
# TypeError: vars() argument must have __dict__ attribute
The issue seems to stem from this line: https://github.com/pandera-dev/pandera/blob/master/pandera/model.py#L126
Apparently, at least in Python 3.8.6, if no annotations are specified in the subclass, the annotations that get picked up are those of the parent class. As an example, printing the annotations seen at that point for the following class gives the parent's:
import pandera as pa
class ASchema(pa.SchemaModel):
    pass
# {'Config': typing.Type[pandera.model.BaseConfig],
# '__checks__': typing.Dict[str, typing.List[pandera.checks.Check]],
# '__config__': typing.Union[typing.Type[pandera.model.BaseConfig], NoneType],
# '__dataframe_checks__': typing.List[pandera.checks.Check],
# '__fields__': typing.Dict[str, typing.Tuple[pandera.typing.AnnotationInfo, pandera.model_components.FieldInfo]],
# '__schema__': typing.Union[pandera.schemas.DataFrameSchema, NoneType]}
But if we add an annotation we get:
import pandera as pa
class ASchema(pa.SchemaModel):
    a: pa.typing.Series[int]
# {'a': pandera.typing.Series[int]}
One possible fix would be to use
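The lookup behaviour described in this comment can be illustrated without pandera. The following is only a minimal sketch: the helper name `own_annotations` and the classes `Base` and `Child` are hypothetical, and this is not necessarily the change that was made in the library. It reads only the annotations declared directly on a class, so a subclass with no annotations yields an empty mapping instead of its parent's.

```python
import inspect
import sys


def own_annotations(cls):
    """Return only the annotations declared directly on cls (hypothetical helper)."""
    if sys.version_info >= (3, 10):
        # inspect.get_annotations does not fall back to a base class.
        return dict(inspect.get_annotations(cls))
    # On older Pythons, read the class namespace directly so the attribute
    # lookup cannot resolve to a parent's __annotations__.
    return dict(cls.__dict__.get("__annotations__", {}))


class Base:
    x: int


class Child(Base):
    pass  # declares no annotations of its own


print(own_annotations(Base))   # {'x': <class 'int'>}
print(own_annotations(Child))  # {}
```

On the Python 3.8 setup discussed here, a plain `Child.__annotations__` lookup would resolve to `Base`'s dictionary instead, which matches the output shown above for the empty schema.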
Thanks @khwilson for the bug report and concise examples. You are right about where the problem comes from. I've just pushed a fix. I simply filter out annotations whose names start with "_", as well as "Config", as is already done when collecting regular fields. Btw, your
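As a rough illustration of the filtering idea in this comment, here is a minimal sketch. The function name `public_annotations` and the sample mapping are hypothetical and not taken from pandera's source. Treating "Config" as an exact name rather than a prefix is also an assumption of this sketch.

```python
def public_annotations(annotations):
    """Drop private names and the Config entry from an annotations mapping.

    Hypothetical helper: "Config" is matched exactly here, which is an
    assumption rather than confirmed library behaviour.
    """
    return {
        name: annotation
        for name, annotation in annotations.items()
        if not name.startswith("_") and name != "Config"
    }


# Stand-in for the annotations a schema class might expose.
sample = {
    "a": int,
    "_hidden": str,
    "__config__": object,
    "Config": type,
}

print(public_annotations(sample))  # {'a': <class 'int'>}
```

With a filter like this, a class that only inherits the base class's internal annotations ends up with no fields, instead of treating those internals as columns.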
Excellent! One quick comment on your solution, though: it's not common practice, but I assume there will be times people have fields that start with an underscore. And I'm happy to do a PR with the primary_key_mixin if you wanted to add it to the main library. I've been getting some decent mileage out of it. :-)
Actually, private annotations have always been ignored. I think the primary key would be better as a recipe. It seems too specific for pandera itself, but @cosmicBboy might have another opinion. It would be nice if you could post the template in the discussions. I'm sure many people would find it useful. One thing is that mypy complains:
Thanks, y'all! Happy to start a discussion about the mixin!
Fixed by #434
Describe the bug
When creating a SchemaModel, if it only contains dataframe_checks then the to_schema function fails when it tries to take vars(config).
Code Sample, a copy-pastable example
This fails
With traceback (replaced some directory names for privacy):
Expected behavior
The above should run to completion
Desktop (please complete the following information):
Additional context
If you add a single class-level variable with an annotation of its own it succeeds, e.g.:
However, if you do NOT have an annotation, it still fails: