
Conversation

@abrarsheikh
Contributor

Follow-up to #58504.

Signed-off-by: abrar <abrar@anyscale.com>
@abrarsheikh abrarsheikh requested a review from a team as a code owner November 10, 2025 20:53
@abrarsheikh abrarsheikh added the go add ONLY when ready to merge, run all tests label Nov 10, 2025
Contributor

@gemini-code-assist gemini-code-assist bot left a comment


Code Review

This pull request makes the ReplicaContext object serializable by correctly handling the non-serializable _handle_registration_callback during pickling. The implementation uses __getstate__ and __setstate__ which is a standard and effective approach. The addition of a new test file with comprehensive unit tests for serialization is excellent and covers various scenarios, ensuring the change is robust. The code is well-written and the changes are solid. I have one minor suggestion to enhance test completeness.
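For readers outside the diff, here is a minimal sketch of the pattern the review describes, assuming only that the callback attribute is named _handle_registration_callback as in the PR; the surrounding class and its other attributes are placeholders, not the actual ReplicaContext definition:

import pickle


class ExampleContext:
    """Toy stand-in for ReplicaContext; only the callback handling mirrors the PR."""

    def __init__(self, replica_id, rank, world_size, handle_registration_callback=None):
        self.replica_id = replica_id
        self.rank = rank
        self.world_size = world_size
        # Not picklable in practice: usually a closure over the live replica object.
        self._handle_registration_callback = handle_registration_callback

    def __getstate__(self):
        # Drop the non-serializable callback before pickling.
        state = self.__dict__.copy()
        state["_handle_registration_callback"] = None
        return state

    def __setstate__(self, state):
        # Everything else round-trips; the callback is simply absent afterwards.
        self.__dict__.update(state)


ctx = ExampleContext("replica-1", rank=0, world_size=2,
                     handle_registration_callback=lambda handle: None)
restored = pickle.loads(pickle.dumps(ctx))
assert restored.replica_id == ctx.replica_id
assert restored._handle_registration_callback is None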

Comment on lines +80 to +82
assert deserialized.replica_id == ctx.replica_id
assert deserialized.rank == ctx.rank
assert deserialized.world_size == ctx.world_size

Severity: medium

For better test completeness and consistency with test_pickle_without_callback, it would be good to also assert that servable_object and _deployment_config are preserved after deserialization. While servable_object is None in this test, explicitly checking it and _deployment_config makes the test more robust against future regressions.

Suggested change
-assert deserialized.replica_id == ctx.replica_id
-assert deserialized.rank == ctx.rank
-assert deserialized.world_size == ctx.world_size
+assert deserialized.replica_id == ctx.replica_id
+assert deserialized.servable_object == ctx.servable_object
+assert deserialized._deployment_config == ctx._deployment_config
+assert deserialized.rank == ctx.rank
+assert deserialized.world_size == ctx.world_size

Contributor

@eicherseiji eicherseiji left a comment


🙏

Contributor

@nrghosh nrghosh left a comment


This makes sense in theory, but from what I see, __getstate__ isn't being respected during Ray's serialization process.

Testing this PR with the same test (from the doc test failures where we discovered this): python doc/source/llm/doc_code/serve/multi_gpu/dp_basic_example.py

Steps:

  1. check out PR
  2. symlink (setup-dev) serve
  3. run doc test

This results in the same failures:

!!! FAIL serialization: no default __reduce__ due to non-trivial __cinit__
(ServeController pid=158116)     Serializing '_handle_registration_callback' <function ReplicaBase._set_internal_replica_context.<locals>.register_handle_callback at 0x7588682f2840>...
(ServeController pid=158116)     !!! FAIL serialization: no default __reduce__ due to non-trivial __cinit__
(ServeController pid=158116)     Detected 1 nonlocal variables. Checking serializability...
(ServeController pid=158116)         Serializing 'self' <ray.serve._private.replica.Replica object at 0x7589b5d83550>...
(ServeController pid=158116)         !!! FAIL serialization: no default __reduce__ due to non-trivial __cinit__
(ServeController pid=158116)             Serializing '_abc_impl' <_abc._abc_data object at 0x7589b5d4b340>...
(ServeController pid=158116)             !!! FAIL serialization: cannot pickle '_abc._abc_data' object
(ServeController pid=158116)     Serializing '_annotated' ReplicaContext...
(ServeController pid=158116) ================================================================================
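The trace above matches the per-attribute output of Ray's serialization inspector (ray.util.inspect_serializability). Below is a hedged, toy reproduction of the failure shape it reports (a callback closing over an object that cannot be pickled), together with the inspector call that prints output of this form; the class names are illustrative, not the real Serve code:

from ray.util import inspect_serializability


class FakeReplica:
    """Stand-in for the live Replica object captured by the callback."""

    def __init__(self):
        # Something cloudpickle cannot serialize (an open file handle).
        self._unpicklable = open(__file__)


class FakeContext:
    def __init__(self, replica):
        # Closure over the replica, like register_handle_callback in the trace above.
        self._handle_registration_callback = lambda handle: replica


# Prints a nested "FAIL serialization" trace for each attribute that cannot be
# serialized, similar to the ServeController output pasted above.
inspect_serializability(FakeContext(FakeReplica()), name="replica_context")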

@ray-gardener ray-gardener bot added the serve Ray Serve Related Issue label Nov 11, 2025
@github-actions

This pull request has been automatically marked as stale because it has not had
any activity for 14 days. It will be closed in another 14 days if no further activity occurs.
Thank you for your contributions.

You can always ask for help on our discussion forum or Ray's public slack channel.

If you'd like to keep this open, just leave any comment, and the stale label will be removed.

@github-actions github-actions bot added the stale The issue is stale. It will be closed within 7 days unless there are further conversation label Nov 27, 2025