
Need a way for class_schema to deserialize to dictionary, to support gradual conversion of legacy code #275

Open
ikriv opened this issue Jul 26, 2024 · 3 comments

Comments

@ikriv

ikriv commented Jul 26, 2024

I converted legacy code that used dictionaries to use dataclasses. However, the conversion is not 100%, so the same type is sometimes represented as a dataclass, and sometimes as a dictionary.

In the new code path, I use class_schema(MyType)().loads(input), which gives me the dataclass, and all is good. But in the old, not-yet-converted code paths, I either need to maintain a legacy schema manually (bad), or I need a way to obtain an 'old style' marshmallow schema from class_schema(MyType) that produces a dictionary. The only reliable way I found to achieve that is to call loads() and then dump():

schema = class_schema(MyType)()  # class_schema() returns a class, so instantiate it
d = schema.dump(schema.loads(input))

Needless to say, this is cumbersome and inefficient.
A better solution would be to have an option to provide a class schema that does not override load() of the base schema, something along the lines of:

dict_schema = class_schema(MyType, use_base_schema_load=True)

This would create a schema class that derives from the provided base schema (marshmallow.Schema by default), defines all necessary fields, but does not override load().
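To make the idea concrete, here is a minimal toy model of the proposal using only the standard library. It is a sketch, not the marshmallow_dataclass implementation: ToySchema and toy_class_schema are hypothetical stand-ins for marshmallow.Schema and class_schema, and use_base_schema_load is the option proposed above, which does not exist in the library today.

```python
import json
from dataclasses import dataclass, fields
from typing import Any, Dict, Type


class ToySchema:
    """Hypothetical stand-in for a base schema: load() returns a plain dict."""

    field_types: Dict[str, type] = {}

    def load(self, data: Dict[str, Any]) -> Any:
        # Coerce each declared field to its annotated type.
        return {name: typ(data[name]) for name, typ in self.field_types.items()}

    def loads(self, text: str) -> Any:
        return self.load(json.loads(text))


def toy_class_schema(clazz: type, use_base_schema_load: bool = False) -> Type[ToySchema]:
    """Hypothetical analogue of class_schema(), for illustration only."""
    declared = {f.name: f.type for f in fields(clazz)}

    class DictSchema(ToySchema):
        field_types = declared

    if use_base_schema_load:
        # Fields are defined, but load() is inherited unchanged from the base.
        return DictSchema

    class DataclassSchema(DictSchema):
        def load(self, data: Dict[str, Any]) -> Any:
            # Override: wrap the dict produced by the base load() in the dataclass.
            return clazz(**super().load(data))

    return DataclassSchema


@dataclass
class Config:
    threshold: float


text = '{"threshold": 0.2}'
as_object = toy_class_schema(Config)().loads(text)
as_dict = toy_class_schema(Config, use_base_schema_load=True)().loads(text)
```

In this model both schema classes share the same field definitions; the only difference is whether load() is overridden, so the dictionary path skips constructing the dataclass entirely.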

Example code before conversion:

# before conversion
import marshmallow
from marshmallow import fields
from marshmallow.validate import Range

class ConfigSchema(marshmallow.Schema):
    # the keyword is validate=, not valid=
    threshold = fields.Float(validate=Range(0, 1), required=True)

def frequent_function(data):
    threshold = data["threshold"]
    # use threshold in some way
    ...

def rare_function(data):
    threshold = data["threshold"]
    # use threshold in some other way
    ...

data = ConfigSchema().loads('{"threshold": 0.2}')
frequent_function(data)
rare_function(data)

Then we convert frequent_function() to use dataclass, but we leave rare_function() as is:

# after partial conversion
from dataclasses import dataclass, field
from typing import Any, Dict

from marshmallow.validate import Range
from marshmallow_dataclass import class_schema

@dataclass
class Config:
    threshold: float = field(metadata={"validate": Range(0, 1)})

def frequent_function(data: Config) -> None:
    threshold = data.threshold
    # use threshold in some way
    ...

def rare_function(data: Dict[str, Any]) -> None:
    threshold = data["threshold"]
    # use threshold in some other way
    ...

input = '{"threshold": 0.2}'
schema = class_schema(Config)()  # class_schema() returns a class, so instantiate it
frequent_function(schema.loads(input))
rare_function(schema.dump(schema.loads(input)))  # convoluted

PROPOSAL:

Let's have

input = '{"threshold": 0.2}'
schema = class_schema(Config)()
dict_schema = class_schema(Config, use_base_schema_load=True)()  # proposed option
frequent_function(schema.loads(input))
rare_function(dict_schema.loads(input))  # better

I studied the code and I am ready to provide a pull request if you are on board with the idea.
@dairiki
Collaborator

dairiki commented Aug 14, 2024

Would the use of dataclasses.asdict help in your case?

@dairiki
Collaborator

dairiki commented Aug 14, 2024

Related, possible dup: #213

@ikriv
Author

ikriv commented Aug 18, 2024

Indeed, asdict would probably work even better than the loads()/dump() round trip, because it preserves the deserialized types, e.g. a datetime will remain a datetime rather than becoming a string. Not sure why I missed it, thanks for the idea!

But still, we would first deserialize to dataclass and then convert to dict, which is clearly not ideal for performance, especially for large data graphs.
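A small standard-library sketch of the asdict approach, assuming a hypothetical Event dataclass (not from the code above). It shows that field values keep their Python types, and it also hints at the cost: asdict walks the whole object graph recursively and copies values as it goes.

```python
from dataclasses import asdict, dataclass
from datetime import datetime


@dataclass
class Event:  # hypothetical example type
    name: str
    when: datetime


event = Event(name="deploy", when=datetime(2024, 8, 18, 12, 0))

# asdict() converts the dataclass (and any nested dataclasses) to plain dicts,
# leaving non-dataclass values such as datetime as they are.
event_dict = asdict(event)
```

Here event_dict["when"] is still a datetime object, whereas a schema dump() would typically serialize it to a string.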

And indeed, my request looks very similar to #213, thanks for checking!
Do you recall why #213 was not merged?
