[Data] "Actor ... has constructor arguments in the object store" warning with minimal Ray Data use #57733

@bveeramani

What happened + What you expected to happen

I was running a regular batch inference pipeline and saw the error-level log message below. This confused me because I'm not using any large actor arguments.

Running 0: 0.00 row [00:00, ? row/s]{"asctime":"2025-10-15 07:54:31,170","levelname":"E","message":"Actor with class name: 'MapWorker(MapBatches(EmbedPatches))' and ID: '9e175f8de7f8d1f2271c24e308000000' has constructor arguments in the object store and max_restarts > 0. If the arguments in the object store go out of scope or are lost, the actor restart will fail. See #53727 for more details.","filename":"core_worker.cc","lineno":2170}
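To check how often this message shows up and for which operators, the log line above can be parsed mechanically. The following is a minimal sketch, not part of Ray: `parse_warning` and `PATTERN` are hypothetical names, and the regular expression assumes the message wording stays exactly as quoted in this report. It handles both the JSON-formatted record and the plain-text form of the message.

```python
import json
import re

# Wording taken from the log message quoted in this report; if Ray changes
# the message text, this pattern needs to be updated.
PATTERN = re.compile(
    r"Actor with class name: '(?P<cls>[^']+)' and ID: '(?P<actor_id>[0-9a-f]+)' "
    r"has constructor arguments in the object store"
)


def parse_warning(line):
    """Return (class_name, actor_id) if the line carries this warning, else None.

    Hypothetical helper for illustration only.
    """
    message = line
    start = line.find("{")  # skip any progress-bar prefix before the JSON record
    if start != -1:
        try:
            record = json.loads(line[start:])
            message = record.get("message", "")
        except json.JSONDecodeError:
            pass  # not a JSON record; fall back to matching the raw line
    match = PATTERN.search(message)
    if match is None:
        return None
    return match.group("cls"), match.group("actor_id")


# The structured record from this report, including the progress-bar prefix.
RAW_LINE = (
    'Running 0: 0.00 row [00:00, ? row/s]'
    '{"asctime":"2025-10-15 07:54:31,170","levelname":"E",'
    '"message":"Actor with class name: '
    "'MapWorker(MapBatches(EmbedPatches))' and ID: "
    "'9e175f8de7f8d1f2271c24e308000000' has constructor arguments in the "
    "object store and max_restarts > 0. If the arguments in the object store "
    "go out of scope or are lost, the actor restart will fail. "
    'See #53727 for more details.","filename":"core_worker.cc","lineno":2170}'
)

print(parse_warning(RAW_LINE))
```

Running the helper over every line of a job's driver log would list the affected `MapWorker` classes, which may help confirm whether the message appears for every actor-based `map_batches` stage or only some.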

Versions / Dependencies

4036252

Reproduction script

>>> import ray
>>> class Actor:
...     def __call__(self, batch):
...         return batch
... 
>>> ray.data.range(1).map_batches(Actor).materialize()
2025-10-15 10:08:45,989 INFO logging.py:293 -- Registered dataset logger for dataset dataset_6_0
2025-10-15 10:08:45,993 INFO streaming_executor.py:159 -- Starting execution of Dataset dataset_6_0. Full logs are in /tmp/ray/session_2025-10-15_10-08-22_304424_12580/logs/ray-data
**2025-10-15 10:08:45,994 INFO streaming_executor.py:160 -- Execution plan of Dataset dataset_6_0: InputDataBuffer[Input] -> TaskPoolMapOperator[ReadRange] -> ActorPoolMapOperator[MapBatches(Actor)]**

THIS IS THE ISSUE -> [2025-10-15 10:08:46,014 E 12580 63854] core_worker.cc:2162: Actor with class name: 'MapWorker(MapBatches(Actor))' and ID: 'f317cd9c02fcc41ff203c17201000000' has constructor arguments in the object store and max_restarts > 0. If the arguments in the object store go out of scope or are lost, the actor restart will fail. See https://github.com/ray-project/ray/issues/53727 for more details.
                                                
2025-10-15 10:08:46,319 INFO streaming_executor.py:279 -- ✔️  Dataset dataset_6_0 execution finished in 0.33 seconds
✔️  Dataset dataset_6_0 execution finished in 0.33 seconds: 100%|██████████████████████████████████████████████████████████████████| 1.00/1.00 [00:00<00:00, 3.06 row/s] 
- ReadRange->SplitBlocks(20): Tasks: 0; Actors: 0; Queued blocks: 0; Resources: 0.0 CPU, 0.0B object store: 100%|██████████████████| 1.00/1.00 [00:00<00:00, 3.26 row/s]
- MapBatches(Actor): Tasks: 0; Actors: 0; Queued blocks: 0; Resources: 1.0 CPU, 8.0B object store; [0/1 objects local]: : 1.00 row [00:00, 3.25 row/s]                  
MaterializedDataset(num_blocks=1, num_rows=1, schema={id: int64})

Issue Severity

None

Labels

bug: Something that is supposed to be working; but isn't
data: Ray Data-related issues
good-first-issue: Great starter issue for someone just starting to contribute to Ray
