coerce_numbers_to_str can cause unicode decode errors #10664

Closed
1 task done
AaDalal opened this issue Oct 19, 2024 · 4 comments
Labels: bug V2 (Bug related to Pydantic V2), good first issue

Comments

@AaDalal

AaDalal commented Oct 19, 2024

Initial Checks

  • I confirm that I'm using Pydantic V2

Description

Hi! I just upgraded my pydantic version and was testing passing a string like 'hi there!\ud835', which contains an unpaired Unicode surrogate, to a model with coerce_numbers_to_str=True, and I saw a Unicode error. When I changed it to False, the error went away.

I'm not sure whether this should/shouldn't throw an error, but I think the behavior should be more consistent.

Example Code

from pydantic import BaseModel, ConfigDict

class ModelWithCoercion(BaseModel):
    x: str
    model_config = ConfigDict(coerce_numbers_to_str=True)

class ModelWithoutCoercion(BaseModel):
    x: str

# Ok
ModelWithoutCoercion(x='hi there!\ud835')

# Also Ok
ModelWithoutCoercion(x=b'hi there!\ud835')

# Also Ok
ModelWithCoercion(x=b'hi there!\ud835')

# Error
ModelWithCoercion(x='hi there!\ud835')

Python, Pydantic & OS Version

             pydantic version: 2.9.2
        pydantic-core version: 2.23.4
          pydantic-core build: profile=release pgo=false
                 install path: /Users/aagam.dalal/Library/Caches/pypoetry/virtualenvs/egp-api-backend-YxsoG3qx-py3.11/lib/python3.11/site-packages/pydantic
               python version: 3.11.5 (main, Sep 11 2023, 08:31:25) [Clang 14.0.6 ]
                     platform: macOS-14.3.1-arm64-arm-64bit
             related packages: fastapi-0.115.2 mypy-1.12.0 typing_extensions-4.12.2
                       commit: unknown

AaDalal added the "bug V2" and "pending" labels Oct 19, 2024
AaDalal changed the title from "coerce_numbers_to_str causes unicode decode errors" to "coerce_numbers_to_str can cause unicode decode errors" Oct 19, 2024

@sydney-runkle
Member

sydney-runkle commented Oct 24, 2024

Looks like a bug, though not high priority. PRs welcome with a fix! The fix here will likely be in pydantic-core.

sydney-runkle added the "good first issue" label and removed the "pending" label Oct 24, 2024

@aberenda-optifino

Hi there, is it OK if I take this issue?

@andrey-berenda

The error occurred because StrConstrainedValidator was used when coerce_numbers_to_str was enabled.

StrConstrainedValidator converts the Python string to a Rust string, and that conversion returns an error if the Python string contains an invalid Unicode character such as an unpaired surrogate.

My PR fixes the case where only coerce_numbers_to_str is enabled: StrValidator is now used there instead of StrConstrainedValidator.

If any other constraint is used (min/max length, to_lower, to_upper, strip_whitespace, or pattern), the same error will still occur, but I hope that is expected.
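
To illustrate why the conversion described above can fail, here is a minimal sketch using only the Python standard library, not pydantic-core itself. It assumes that the Rust-side string must be valid UTF-8, so the check is approximated with a strict UTF-8 encode. The helper name is_valid_utf8_text is hypothetical and exists only for this example.

```python
# The same input as in the report: \ud835 is a lone (unpaired) high surrogate.
text = 'hi there!\ud835'


def is_valid_utf8_text(value: str) -> bool:
    """Hypothetical helper: True if `value` can be encoded as strict UTF-8."""
    try:
        value.encode('utf-8')
    except UnicodeEncodeError:
        # Lone surrogates have no strict UTF-8 encoding.
        return False
    return True


print(is_valid_utf8_text('hi there!'))  # plain ASCII text
print(is_valid_utf8_text(text))         # text with the unpaired surrogate

# In a bytes literal, \u is not an escape sequence, so the bytes inputs in the
# report hold a literal backslash followed by "ud835" (six ASCII bytes).
literal = rb'\ud835'
print(len(literal))
```

This may also explain why the bytes cases in the report pass in both models: they never contain a surrogate in the first place, only the six ASCII characters of the escape text.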

@andrey-berenda

pydantic 2.10 has been released, so I think this issue can be closed.
FYI: @sydney-runkle
