Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Can immutable dataclasses be hashed as dictionary keys? #160

Closed
u3Izx9ql7vW4 opened this issue Sep 6, 2023 · 4 comments · Fixed by #165
Closed

Can immutable dataclasses be hashed as dictionary keys? #160

u3Izx9ql7vW4 opened this issue Sep 6, 2023 · 4 comments · Fixed by #165
Assignees
Labels
enhancement New feature or request

Comments

@u3Izx9ql7vW4
Copy link

u3Izx9ql7vW4 commented Sep 6, 2023

Is your feature request related to a problem? Please describe.
I'd like to be able to hash (immutable) dataclass objects as keys. Currently I have a field which contains a dict for which the keys are a dataclass object. I get the following error:

mashumaro.exceptions.UnserializableDataError: dict with dataclasses as keys is not supported by mashumaro

I tried subclassing NamedTuple and got:

mashumaro.exceptions.UnserializableDataError: ForwardRef('str') as a field type is not supported by mashumaro

Describe the solution you'd like
Would be great if I could have immutable dataclasses as keys.

Describe alternatives you've considered
Tried named tuples, which did not work.

Additional context
This is my intended usecase:

@dataclass
class NetworkConfig(DataClassYAMLMixin):
    network: dict[NodeIdentifier, NodeConfig]

@dataclass(frozen=True, eq=True)
class NodeIdentifier(DataClassYAMLMixin):
    group: str
    name: str

@dataclass
class NodeConfig:
    name: str
    family: str 
    subscribers: list[NodeIdentifier] = field(default_factory=list)
@u3Izx9ql7vW4
Copy link
Author

u3Izx9ql7vW4 commented Sep 6, 2023

Ok, I kind of gotten it to work with SerializableType if I drop the dataclass decorator on the NodeIdentifier, but printing it out isn't pretty. It would be really great dataclass serialization as keys was supported.

@Fatal1ty
Copy link
Owner

Fatal1ty commented Sep 6, 2023

Hi @user293811

mashumaro.exceptions.UnserializableDataError: dict with dataclasses as keys is not supported by mashumaro

Let me explain why you see this exception. All the supported serialization formats (json, msgpack, yaml, toml) are kind of dictionary based and allow only simple immutable types to be keys. By default dataclasses are serialized into dicts which can't be dictionary keys because they are mutable. NamedTuples are serialized into lists or dicts depending on the engine you choose (default is lists), so they can't be dictionary keys too.

Taking all above into account, if you want to use dataclasses or named tuples as keys in your structure, you should provide a custom serialization (and deserialization for the reverse) method that will convert them into simple immutable object like string. As you have noticed, SerializableType is one of the available approaches to providing a custom serialization method. However, in the current implementation of mappings serialization there is still a code checking that the key is not a dataclass:

def ensure_collection_type_args_supported(
collection_type: Type,
type_args: Sequence[Type],
) -> bool:
if type_args and is_dataclass(type_args[0]):
raise UnserializableDataError(
f"{type_name(collection_type, short=True)} "
f"with dataclasses as keys is not supported by mashumaro"
)
return True

This is why you were forced to remove the dataclass decorator. To workaround this problem we can extend this condition by checking SerializableType subclassing. Serialization will fail with TypeError: unhashable type: 'xxx' if dataclass is intentionally converted to unhashable type, but I think we can leave it.

The proposed fix looks trivial. I can do it but if you're willing to contribute yourself, PR is welcome.

Edit: Oh, I just noticed that using forward references breaks this fix on python 3.8. It would be better if I looked at it myself.

@Fatal1ty Fatal1ty added the enhancement New feature or request label Sep 6, 2023
@Fatal1ty
Copy link
Owner

Fatal1ty commented Sep 6, 2023

I tried subclassing NamedTuple and got:

mashumaro.exceptions.UnserializableDataError: ForwardRef('str') as a field type is not supported by mashumaro

This looks like another problem related to ForwardRef support. I managed to support forward references to TypedDicts here but this is something new. Could you provide a reproducible example with details in a separate issue? I'd like to cover as many cases as possible.

@Fatal1ty
Copy link
Owner

I modified the process of checking to check if the key type is Hashable. This alteration will also trigger an UnserializableField exception for other types that are not hashable.

If you remove frozen=True parameter from NodeIdentifier the following exception will now be raised:

mashumaro.exceptions.UnserializableField: Field "network" of type dict[NodeIdentifier, NodeConfig] in NetworkConfig is not serializable: NodeIdentifier is unhashable and can not be used as a key

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
enhancement New feature or request
Projects
None yet
Development

Successfully merging a pull request may close this issue.

2 participants