Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

fix(union_serializer): do not raise warnings in nested unions #1513

Merged

Conversation

lukapeschke
Copy link
Contributor

@lukapeschke lukapeschke commented Oct 31, 2024

In case unions of unions are used, this will bubble-up the errors rather than warning immediately. If no solution is found among all serializers by the top-level union, it will warn as before.

  • Unit tests for the changes exist
  • Documentation reflects the changes where applicable
  • Pydantic tests pass with this pydantic-core (except for expected changes)
  • My PR is ready to review, please add a comment including the phrase "please review" to assign reviewers

Selected Reviewer: @sydney-runkle

Copy link

codecov bot commented Oct 31, 2024

Codecov Report

Attention: Patch coverage is 58.33333% with 10 lines in your changes missing coverage. Please review.

Project coverage is 89.36%. Comparing base (ab503cb) to head (5506e1f).
Report is 224 commits behind head on main.

Files with missing lines Patch % Lines
src/serializers/type_serializers/union.rs 58.33% 10 Missing ⚠️
Additional details and impacted files
@@            Coverage Diff             @@
##             main    #1513      +/-   ##
==========================================
- Coverage   90.21%   89.36%   -0.86%     
==========================================
  Files         106      112       +6     
  Lines       16339    17928    +1589     
  Branches       36       40       +4     
==========================================
+ Hits        14740    16021    +1281     
- Misses       1592     1887     +295     
- Partials        7       20      +13     
Files with missing lines Coverage Δ
src/serializers/type_serializers/union.rs 71.65% <58.33%> (-10.66%) ⬇️

... and 52 files with indirect coverage changes


Continue to review full report in Codecov by Sentry.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 4cb82bf...5506e1f. Read the comment docs.

Copy link

codspeed-hq bot commented Oct 31, 2024

CodSpeed Performance Report

Merging #1513 will not alter performance

Comparing lukapeschke:fix-union-of-unions-serializer (5506e1f) with main (4cb82bf)

Summary

✅ 155 untouched benchmarks

@lukapeschke lukapeschke force-pushed the fix-union-of-unions-serializer branch from a5838c7 to 41d6b25 Compare October 31, 2024 10:56
@lukapeschke
Copy link
Contributor Author

lukapeschke commented Oct 31, 2024

Hello @sydney-runkle , could you please review ?

Also, I'm sitting in a train with bad wifi so I can't clone the pydantic repo to run the tests right now, but I'll do it asap

@lukapeschke
Copy link
Contributor Author

Update: pydantic tests still pass with this change

@davidhewitt
Copy link
Contributor

Hi @lukapeschke, thanks for the PR here.

I took a look at your original issue pydantic/pydantic#10682 and I could still reproduce it on this branch. I sort of see the motivation for the change here and I think it might be ok, but;

the whole serializer warning system is due an overhaul, so I'm reluctant to add the complexity here.

@lukapeschke
Copy link
Contributor Author

Hi @davidhewitt thanks for the reply!

The bug I'm encountering was actually a different one. At first glance I thought it was the same as pydantic/pydantic#10682 but it's not, my bad. I'll unlink the issue.

Below is an MRE for the problem I have, it's the same as the one described in the unit tests. I tested your branch, but the problem is still there.

What would be your preferred way to solve this ? Should I open a different issue in the pydantic repo ? Could you include a fix in your PR if you have an easier one ? If not, I'm also happy to rewrite this if you have a simpler idea in mind, as I'm not very familiar with the pydantic codebase.

Let me know what you'd prefer 🙂

from pydantic import BaseModel, TypeAdapter, Field
from typing import Literal, Annotated


class A(BaseModel):
    a: str


class B(BaseModel):
    b: str


class Animal(BaseModel):
    type_: str = Field(alias="type")


class Dog(Animal):
    type_: Literal["dog"] = Field("dog", alias="type")


class Cat(Animal):
    type_: Literal["cat"] = Field("cat", alias="type")


type Letter = A | B
type Animal = Cat | Dog


class MyModel(BaseModel):
    elems: list[Animal | Letter]


# prints {'elems': [{'type_': 'dog'}]}
print(MyModel(elems=[{"type": "dog"}]).model_dump(warnings="error"))
# I expect this to print {'elems': [{'a': 'a'}]}, but it raises an error
print(MyModel(elems=[{"a": "a"}]).model_dump(warnings="error"))

Copy link
Contributor

@davidhewitt davidhewitt left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Right, thank you for the clarification, I understand your issue now.

I think you're heading in the right direction with your fix, but the problem with matching on the serializer to detect nested unions is that these can be wrapped in combinators like function wrappers or defaults etc, and it will be difficult to exhaustively detect these combinations.

Instead we should leverage the serialization state (the extra), to add handling directly inside the to_python function. I've added two comments below which should be sufficient touch points if you start from a clean main again. We'd need similar handling inside the json_key and serde_serialize functions. Your test cases will hopefully pass after those modifications.

let errors = match to_python_extractor(value, include, exclude, &new_extra, choices) {
ToPythonExtractorResult::Success(obj) => return Ok(obj),
ToPythonExtractorResult::Errors(errs) => errs,
};

if retry_with_lax_check {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This branch should only be entered if extra.check != SerCheck::Strict

Suggested change
if retry_with_lax_check {
if extra.check != SerCheck::Strict && retry_with_lax_check {

return Ok(v);
}
if let ToPythonExtractorResult::Success(v) = to_python_extractor(value, include, exclude, &new_extra, choices) {
return Ok(v);
}
}

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We should check if extra.check != SerCheck::None here. That would tell us if we're inside a nested. union serialization. If so, we should return a PydanticSerializationUnexpectedValue error here containing the errors.

That way, the warning and the inference fallback below this point will only happen at the top level union.

@lukapeschke lukapeschke force-pushed the fix-union-of-unions-serializer branch from 51aaa25 to 043fce1 Compare November 6, 2024 13:18
In case unions of unions are used, this will bubble-up the errors rather
than warning immediately. If no solution is found among all serializers
by the top-level union, it will warn as before.

Signed-off-by: Luka Peschke <mail@lukapeschke.com>
@lukapeschke lukapeschke force-pushed the fix-union-of-unions-serializer branch from 44a3475 to 5c38506 Compare November 6, 2024 13:27
@lukapeschke
Copy link
Contributor Author

@davidhewitt Thanks a lot for the pointers! Indeed, by only checking extra.check, the code is simpler, the PR is almost only tests now. I've added tests for json_key as well, but I've not been able to make serde_serialize fail, so I haven't touched its implem. I guess it is because the function itself is not recursive, but relies on to_python instead ?

Could you please re-review ? Pydantic's tests are still passing with this new version.

Copy link
Contributor

@davidhewitt davidhewitt left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looking great, thanks!

I think let's give serde_serialize the same modification, because while I think you're right that the current structure doesn't lead to recursion, it's not impossible that would change in the future. If it did change, we'd need the same checks so it seems best to just keep all three implementations consistent.

After that, I'm keen to merge this and get this fix into pydantic 2.10 before the final release 👍

Signed-off-by: Luka Peschke <mail@lukapeschke.com>
Copy link
Contributor Author

@lukapeschke lukapeschke left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@davidhewitt thanks for the quick review! 5506e1f adds the same implem to serde_serialize, along with a non-regression test in case it gets refactored at some point

Comment on lines +198 to +200
} else {
// NOTE: if this function becomes recursive at some point, an `Err(_)` containing the errors
// will have to be returned here
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've left this empty for now, as we can't return a PydanticSerializationUnexpectedValue here since the function is supposed to return S::Error., and refactoring this would introduce extra complexity

Copy link
Contributor

@davidhewitt davidhewitt left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks super, thanks very much for the fix!

@davidhewitt davidhewitt merged commit 2419981 into pydantic:main Nov 7, 2024
29 checks passed
@lukapeschke lukapeschke deleted the fix-union-of-unions-serializer branch November 7, 2024 11:06
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants