In our migration to pydantic 2, we found a JSON document that pydantic 1 was able to load and pydantic 2 cannot. It fails with the error:
Invalid JSON: lone leading surrogate in hex escape at line...
Here's a simple way of reproducing:
import json
from pydantic_core import from_json
data = rb'{"test": "text\udce2\udc80\udc99text"}'
print(json.loads(data))
print(from_json(data))
The first print, using Python's json module, works:
{'test': 'text\udce2\udc80\udc99text'}
The second one, using pydantic_core (used by pydantic 2), raises:
Traceback (most recent call last):
File "check.py", line 7, in <module>
print(from_json(data))
^^^^^^^^^^^^^^^
ValueError: lone leading surrogate in hex escape at line 1 column 20
Here are the versions:
Python 3.12.2
pydantic 2.8.2
pydantic-core 2.20.1
Thank you!
Part of the problem is that a Python str is allowed to contain invalid Unicode sequences such as lone surrogates (see e.g. PEP 383 and the 'surrogateescape' error handler), which are used to carry arbitrary byte payloads in encoded form. Encoding such a string to UTF-8 (and any operation that requires valid UTF-8) will fail.
Rust String data, on the other hand, strictly requires valid UTF-8.
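To make the point about surrogates concrete, here is a minimal sketch using only the standard library. It assumes (this is a guess, not something the report confirms) that the three escapes came from bytes that were decoded with the 'surrogateescape' handler at some earlier stage. Under that assumption, re-encoding with the same handler recovers the original bytes, which happen to be the UTF-8 encoding of U+2019 (right single quotation mark). The variable names are illustrative only.

```python
import json

# Same payload as in the report; the raw bytes literal keeps the \u escapes literal.
data = rb'{"test": "text\udce2\udc80\udc99text"}'

# json.loads accepts the escapes and returns a str that contains lone surrogates.
value = json.loads(data)["test"]

# Strict UTF-8 encoding rejects lone surrogates, which is why a UTF-8-only
# string type cannot hold this value.
try:
    value.encode("utf-8")
    strict_ok = True
except UnicodeEncodeError:
    strict_ok = False

# Assumption: the surrogates are surrogateescape-encoded bytes (PEP 383).
# The handler maps U+DC80..U+DCFF back to the bytes 0x80..0xFF.
raw = value.encode("utf-8", errors="surrogateescape")
recovered = raw.decode("utf-8")

print(strict_ok)   # False
print(raw)         # b'text\xe2\x80\x99text'
print(recovered)   # text’text
```

If the assumption holds, cleaning the data at its source (decoding the original bytes as UTF-8 rather than with 'surrogateescape') would produce JSON that both parsers accept. If it does not hold, the recovered text is not meaningful and this approach should not be applied.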