[Data] Fix incorrect pending task size if outputs are empty (#47604)
If an operator outputs empty blocks, Ray Data estimates that the operator has 256 MiB of pending task outputs, even though the estimate should be 0. For example, the following output occupies zero bytes:

```python
import pyarrow as pa

output = pa.Table.from_pydict({"data": [None] * 128})
assert output.nbytes == 0, output.nbytes
```

The bug occurs because we check whether `average_bytes_per_output` is truthy rather than whether it is not `None`. An average of 0 is falsy, so it is handled the same way as a missing average.

https://github.com/ray-project/ray/blob/1f83fb44580e392ba6d39a9e79bbdd8cd5b7d916/python/ray/data/_internal/execution/interfaces/op_runtime_metrics.py#L369-L371

---

Signed-off-by: Balaji Veeramani <bveeramani@berkeley.edu>
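The difference between the two checks can be illustrated with a minimal sketch. This is not the code from `op_runtime_metrics.py`: the function names, the `num_pending_outputs` parameter, and the use of 256 MiB as a fallback constant are assumptions made for illustration only.

```python
from typing import Optional

# Hypothetical fallback used when no average is available. The 256 MiB figure
# comes from the commit message; treating it as a constant is an assumption.
FALLBACK_BYTES = 256 * 1024 * 1024


def pending_bytes_truthy(
    average_bytes_per_output: Optional[float], num_pending_outputs: int
) -> float:
    # Truthiness check: an average of 0 is falsy, so this takes the fallback
    # branch even though a valid estimate exists.
    if average_bytes_per_output:
        return average_bytes_per_output * num_pending_outputs
    return FALLBACK_BYTES


def pending_bytes_none_check(
    average_bytes_per_output: Optional[float], num_pending_outputs: int
) -> float:
    # Explicit None check: only a missing average triggers the fallback.
    if average_bytes_per_output is not None:
        return average_bytes_per_output * num_pending_outputs
    return FALLBACK_BYTES
```

With an average of 0, the truthiness version returns the fallback while the `None` check returns 0. Both versions agree when the average is `None` or a positive number.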