Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Serialization bug when using schema inheritance #800

Closed
cristianmatache opened this issue Nov 24, 2024 · 3 comments
Closed

Serialization bug when using schema inheritance #800

cristianmatache opened this issue Nov 24, 2024 · 3 comments

Comments

@cristianmatache
Copy link
Contributor

Describe the bug

if is_union(model.__annotations__[field_name]) and include_type:
asdict["-type"] = value.get_fullname()

In this code section .__annotations__ is misused. __annotations__ is accessed on the instance, which has the same behavior as accessing it on the class (type(instance)). In this case, __annotations__ only contains the fields of the current class but not those of its parent classes.

To Reproduce

from dataclasses import dataclass

from dataclasses_avroschema import AvroModel

@dataclass
class A(AvroModel):
    a: int

@dataclass
class Parent(AvroModel):
    p: A

@dataclass
class Child(Parent):
    c: int

child = Child(p=A(1), c=1)
child.serialize()  # Errors

Further notes on __annotations__:

assert Child.__annotations__ == child.__annotations__ == {'c': int}
from typing import get_type_hints
assert set(get_type_hints(child)) == {'c'}  # On the instance
assert {'p', 'c'}.issubset(set(get_type_hints(Child)))  # On the class

Expected behavior
Serialization should succeed.

Suggestions

  • Use typing.get_type_hints
  • Make it more performant by adding an lru_cache for example
%timeit is_union(get_type_hints(type(model))[field_name])
41.1 µs ± 819 ns per loop (mean ± std. dev. of 7 runs, 10,000 loops each)


from functools import lru_cache
@lru_cache
def f(model_t, field_name):
    return is_union(get_type_hints(model_t)[field_name])

%timeit f(type(model), field_name)
160 ns ± 3.13 ns per loop (mean ± std. dev. of 7 runs, 10,000,000 loops each)
@marcosschroh
Copy link
Owner

Thanks for reporting it. I am a bit busy but as soon as I have time I will take a look

@marcosschroh
Copy link
Owner

Fixed in the next released

@cristianmatache
Copy link
Contributor Author

Thank you, @marcosschroh!

I have 2 questions, out of curiosity

  • why not use @lru_cache, the whole serialization would become much more performant?
  • why do we need this as "a hack"
    annotations = model.__annotations__
    # This is a hack to get the annotations from the parent class
    # https://github.com/marcosschroh/dataclasses-avroschema/issues/800
    if model.__class__.mro()[1] != base_class:
    annotations.update(typing.get_type_hints(model.__class__))
    ?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants