Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[Serve] Make Ray Serve compatible with Pydantic 1.10.x #39864

Closed
shrekris-anyscale opened this issue Sep 26, 2023 · 6 comments
Closed

[Serve] Make Ray Serve compatible with Pydantic 1.10.x #39864

shrekris-anyscale opened this issue Sep 26, 2023 · 6 comments
Assignees
Labels
bug Something that is supposed to be working; but isn't P1 Issue that should be fixed within a few weeks serve Ray Serve Related Issue

Comments

@shrekris-anyscale
Copy link
Contributor

What happened + What you expected to happen

Currently, Ray's cloudpickle cannot serialize Pydantic dataclasses in Pydantic 1.10.x. However, it can serialize them in Pydantic 1.9.2.

Ray's cloudpickle logic should be updated to successfully serialize Pydantic dataclasses. This test should succeed.

Versions / Dependencies

Ray on the latest master. Pydantic 1.10.x.

Reproduction script

See test_serialize_serve_dataclass.

Issue Severity

High: It blocks me from completing my task.

@shrekris-anyscale shrekris-anyscale added bug Something that is supposed to be working; but isn't P1 Issue that should be fixed within a few weeks serve Ray Serve Related Issue labels Sep 26, 2023
@shrekris-anyscale shrekris-anyscale self-assigned this Sep 26, 2023
@shrekris-anyscale
Copy link
Contributor Author

shrekris-anyscale commented Sep 26, 2023

Here's a debugging script that I'm using to experiment with the failing test:

# import pickle
import ray.cloudpickle as pickle

from copy import deepcopy
from pydantic import BaseModel
from pydantic.dataclasses import is_builtin_dataclass
from dataclasses import dataclass
from typing import Any, Dict, Optional

@dataclass
class BackendMetadata:
    is_blocking: bool = True
    autoscaling_config: Optional[Dict[str, Any]] = None

bm = deepcopy(BackendMetadata)

# breakpoint()

class BackendConfig(BaseModel):
    internal_metadata: BackendMetadata = BackendMetadata()

my = BackendConfig()

def ser():
    print(pickle.dumps(BackendMetadata))

def cmp():
    print(bm is BackendMetadata)

def pd_dc():
    print(is_builtin_dataclass(BackendMetadata))

breakpoint()
culprit = type(my.__dict__["internal_metadata"]).__dict__["__init__"]

print("Finished")

# Hypothesis: adding BackendMetadata to the BaseModel makes pydantic switch
# the dataclass's __init__ method with a cyfunction __init__ method that
# can't be serialized by cloudpickle (but can be serialized with pickle?).

@shrekris-anyscale
Copy link
Contributor Author

And here's a repro script:

# File name: repro.py

import ray
# import pickle
import ray.cloudpickle as pickle

from pydantic import BaseModel
from dataclasses import dataclass
from typing import Any, Dict, List, Optional

# START_RAY = True
START_RAY = False

if START_RAY:
    ray.init(ignore_reinit_error=True)

@dataclass
class BackendMetadata:
    is_blocking: bool = True
    autoscaling_config: Optional[Dict[str, Any]] = None

class BackendConfig(BaseModel):
    internal_metadata: BackendMetadata = BackendMetadata()

if START_RAY:
    ray.get(ray.put(BackendConfig()))

    @ray.remote
    def consume(f):
        pass

    ray.get(consume.remote(BackendConfig()))
else:
    serialized_object = pickle.dumps(BackendConfig())
    deserialized_object = pickle.loads(serialized_object)

@shrekris-anyscale
Copy link
Contributor Author

This turns out to be harder than expected. The issue is that Pydantic 1.10.x converts vanilla dataclasses into Pydantic dataclasses when they're used as type hints in a Pydantic model. During this process, Pydantic also replaces plain Python methods with cythonized methods, which can't be serialized with cloudpickle due to cloudpipe/cloudpickle#408. When Ray attempts to serialize the dataclass or Pydantic model with cloudpickle, it fails with the following error:

% python repro.py 
Traceback (most recent call last):
  File "repro.py", line 32, in <module>
    serialized_object = pickle.dumps(BackendConfig())
  File "/Users/shrekris/Desktop/ray/python/ray/cloudpickle/cloudpickle_fast.py", line 88, in dumps
    cp.dump(obj)
  File "/Users/shrekris/Desktop/ray/python/ray/cloudpickle/cloudpickle_fast.py", line 733, in dump
    return Pickler.dump(self, obj)
AttributeError: Can't pickle local object '__create_fn__.<locals>.__init__'

The __init__ method has been cythonized (from objgraph):

Screen Shot 2023-09-26 at 2 59 18 PM

@shrekris-anyscale
Copy link
Contributor Author

Ideally, Pydantic either (1) wouldn't mutate the original dataclass or (2) would keep a copy of the original dataclass that we could use to create a custom serializer. Unfortunately, it looks like Pydantic (1) mutates the original dataclass and (2) doesn't keep a copy of the original dataclass anywhere.

One possible, hacky workaround: the Pydantic dataclass still has the same fields as the original dataclass. Serve could reconstruct a copy of the original dataclass by creating a new dataclass with the same fields upon deserialization. However, if the user defines a custom constructor, that would be lost.

@ddelange
Copy link
Contributor

ref #40451

@shrekris-anyscale
Copy link
Contributor Author

We've merged the changes to make Ray compatible with Pydantic 2.5+. You can start using Pydantic 2.5+ with Ray 2.9, which should be out at the end of December.

These changes should also be in the Ray nightly, so feel free to try them out!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug Something that is supposed to be working; but isn't P1 Issue that should be fixed within a few weeks serve Ray Serve Related Issue
Projects
None yet
Development

No branches or pull requests

2 participants