Dataclass descriptor behavior inconsistent #102646
Comments
Thanks for the report! Detailed response below. In summary, I'm leaving this issue open because I think there is one potential code change follow-up (to make the behavior more consistent in the case of
This is true, in the sense that the interaction with descriptors was not initially considered in the dataclasses design. But the choice to clarify and document the behaviors was made carefully and with a lot of discussion (much of it on the typing-sig mailing list IIRC.)
This is a good point; these two should probably be equivalent. In the latter case, the descriptor is (already today) assigned to the class and functions as a descriptor at runtime, but the default value in dataclass' internal
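The difference between the two spellings can be inspected with `dataclasses.fields()`. The sketch below is only an illustration: `IntConversionDescriptor` is modeled on the docs example, the class names `DirectDefault` and `FieldDefault` are hypothetical, and the behavior shown is what CPython does as far as I understand it at the time of this thread.

```python
from dataclasses import dataclass, field, fields


class IntConversionDescriptor:
    """Descriptor modeled on the dataclasses docs example (illustrative only)."""

    def __init__(self, *, default):
        self._default = default

    def __set_name__(self, owner, name):
        self._name = "_" + name

    def __get__(self, obj, objtype=None):
        if obj is None:
            # Class-level access: this is what dataclasses sees as the default.
            return self._default
        return getattr(obj, self._name, self._default)

    def __set__(self, obj, value):
        setattr(obj, self._name, int(value))


@dataclass
class DirectDefault:  # hypothetical name
    quantity: IntConversionDescriptor = IntConversionDescriptor(default=100)


@dataclass
class FieldDefault:  # hypothetical name
    quantity: IntConversionDescriptor = field(
        default=IntConversionDescriptor(default=100)
    )


# The default recorded in the dataclass Field object differs between the two.
direct_default = fields(DirectDefault)[0].default
field_default = fields(FieldDefault)[0].default
print(type(direct_default).__name__)
print(type(field_default).__name__)

# In both cases the class attribute itself is the descriptor object.
direct_attr = vars(DirectDefault)["quantity"]
field_attr = vars(FieldDefault)["quantity"]
print(type(direct_attr).__name__, type(field_attr).__name__)
```

With the direct spelling the recorded default is whatever `__get__(None, cls)` returns; with `field(default=...)` it is the descriptor object itself.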
This doesn't seem like a problem; it just means that internally dataclasses is fetching the default value off the class more than once. A descriptor should be fine with having its
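To see these class-level fetches, one can count `__get__` calls made with `obj=None` while the dataclass is being built. This is a rough sketch: `CountingDescriptor` and `Item` are hypothetical names, and the exact count is an implementation detail that may differ between Python versions, so the code only records it.

```python
from dataclasses import dataclass


class CountingDescriptor:
    """Hypothetical descriptor that counts class-level __get__ calls."""

    def __init__(self, default):
        self._default = default
        self.class_level_gets = 0

    def __set_name__(self, owner, name):
        self._name = "_" + name

    def __get__(self, obj, objtype=None):
        if obj is None:
            # Reached when the attribute is read off the class itself,
            # which is how dataclasses discovers the default value.
            self.class_level_gets += 1
            return self._default
        return getattr(obj, self._name, self._default)

    def __set__(self, obj, value):
        setattr(obj, self._name, value)


counter = CountingDescriptor(default=100)


@dataclass
class Item:
    quantity: int = counter


# Number of class-level fetches performed while @dataclass processed Item.
gets_during_class_creation = counter.class_level_gets
print(gets_during_class_creation)

item = Item()          # __init__ assigns the default through __set__
print(item.quantity)   # instance access goes through __get__ with obj=item
```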
This is not true. Maybe you tested the If you use a data descriptor (one defining All of this is normal Python descriptor behavior that is not particular to dataclasses.
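The general rule referred to here can be shown without dataclasses at all. In this minimal sketch (all class names are made up for illustration), a non-data descriptor is shadowed by an entry in the instance `__dict__`, while a data descriptor takes precedence over it.

```python
class NonDataDescriptor:
    """Defines only __get__, so an instance attribute can shadow it."""

    def __get__(self, obj, objtype=None):
        return "descriptor"


class DataDescriptor:
    """Defines __get__ and __set__, so it wins over the instance __dict__."""

    def __get__(self, obj, objtype=None):
        return "descriptor"

    def __set__(self, obj, value):
        pass  # ignore writes; only the lookup order matters here


class Demo:
    non_data = NonDataDescriptor()
    data = DataDescriptor()


demo = Demo()
# Write straight into the instance dict to bypass any __set__.
demo.__dict__["non_data"] = "instance"
demo.__dict__["data"] = "instance"

print(demo.non_data)  # instance
print(demo.data)      # descriptor
```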
This is true, and useful, but not inconsistent, because
The field will be treated as a dataclass field, and thus e.g. included in the default generated
No, because the premise (that
Agree that being called twice or any number of times is not a problem :) Also, you are right that it does get called for instances if it's a data descriptor. It took me a while to understand why, but it checks out. Also, I was wrong about Sorry for the confusion, and thanks for taking the time to explain point by point.
Yes, but the problem is that we don't ever do the class
See the confusing error message due to the
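The error message referred to above is not reproduced here. As a rough sketch of one failure in this area, and assuming CPython's behavior at the time of this thread, a descriptor passed via `field(default=...)` becomes the default argument of the generated `__init__`, so constructing the class without arguments hands the descriptor object to its own `__set__`. The names `IntConversionDescriptor` and `FieldDefault` are illustrative.

```python
from dataclasses import dataclass, field


class IntConversionDescriptor:
    """Descriptor modeled on the dataclasses docs example (illustrative only)."""

    def __init__(self, *, default):
        self._default = default

    def __set_name__(self, owner, name):
        self._name = "_" + name

    def __get__(self, obj, objtype=None):
        if obj is None:
            return self._default
        return getattr(obj, self._name, self._default)

    def __set__(self, obj, value):
        # int() fails if value is the descriptor object itself.
        setattr(obj, self._name, int(value))


@dataclass
class FieldDefault:  # hypothetical name
    quantity: IntConversionDescriptor = field(
        default=IntConversionDescriptor(default=100)
    )


# Passing an explicit value works: __set__ receives 9.
explicit = FieldDefault(9)
print(explicit.quantity)

# Relying on the default passes the descriptor object to __set__.
try:
    FieldDefault()
except TypeError as exc:
    error = exc
else:
    error = None
print(repr(error))
```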
Although the descriptor is attached to the class and used in both cases, I think the behavior of
Just ran into this today. The documentation at https://docs.python.org/3/library/dataclasses.html#descriptor-typed-fields only mentions the case of using the descriptor directly instead of via
Following is a demonstration based off the example on the above documentation page:

    from dataclasses import dataclass, field
    from typing import Any


    class IntConversionDescriptor:
        def __init__(self, *, default):
            self._default = default

        def __set_name__(self, owner, name):
            self._name = "_" + name

        def __get__(self, obj, type):
            if obj is None:
                return self._default
            return getattr(obj, self._name, self._default)

        def __set__(self, obj, value):
            setattr(obj, self._name, int(value))


    @dataclass
    class InventoryItemDirectDefault:
        quantity_on_hand: IntConversionDescriptor = IntConversionDescriptor(default=100)


    @dataclass
    class InventoryItemFieldDefault:
        quantity_on_hand: IntConversionDescriptor = field(default=IntConversionDescriptor(default=100))


    @dataclass
    class InventoryItemFieldDefaultInitFalse:
        quantity_on_hand: IntConversionDescriptor = field(init=False, default=IntConversionDescriptor(default=100))


    def test(dc_type: type[Any], init_val: int | None) -> None:
        if init_val is None:
            i = dc_type()
        else:
            i = dc_type(init_val)
        print(f"dc_type: {dc_type.__name__}, init_val: {init_val}")
        print(i.quantity_on_hand)  # 100 when no init_val is given
        i.quantity_on_hand = 2.5  # calls __set__ with 2.5
        print(i.quantity_on_hand)  # 2


    if __name__ == '__main__':
        test(InventoryItemFieldDefault, 9)
        test(InventoryItemDirectDefault, 9)
        test(InventoryItemDirectDefault, None)
        test(InventoryItemFieldDefaultInitFalse, None)
        test(InventoryItemFieldDefault, None)

expected output:
actual output:
In #94424, the behavior of dataclasses for a descriptor that was assigned as a field default was defined and documented.
While I might be wrong here, a number of things lead me to believe it was an "accidental feature" being documented:
- `my_field: T = descriptor`), not `my_field: T = field(default=descriptor)`.
- `__get__` is called twice with `instance=None`, but not called for instance attribute gets.
- `__set__` is called for each instance attribute assignment.
We can probably address those with some amount of effort, but first I'd like to make sure this behavior is intentional and desired.
- `ClassVar`?
- `__get__` is called only during class initialization with `instance=None`, then wouldn't the following be equivalent?
Motivation: python/mypy#14869 brought up mypy's dataclasses plugin not properly understanding the finer semantics of assigning a descriptor, which led me to look at what exactly are the semantics.