Commit 3cd13da

Fix divide by zero error when all URI size estimations fail
- Add check for empty row_sizes list before calculating average
- Fall back to using number of rows in block when no file sizes are available
- Prevents ZeroDivisionError when all URIs fail to provide size information
1 parent 47131b2 commit 3cd13da

1 file changed: +6 -0 lines changed

python/ray/data/_internal/planner/plan_download_op.py

Lines changed: 6 additions & 0 deletions
@@ -281,6 +281,12 @@ def _estimate_nrows_per_partition(self, block: pa.Table) -> int:
         ]
 
         target_nbytes_per_partition = self._data_context.target_max_block_size
+        if len(row_sizes) == 0:
+            logger.warning(
+                "Unable to estimate file sizes for URIs. "
+                "Falling back to using the number of rows in the block as the partition size."
+            )
+            return len(block)
         avg_nbytes_per_row = sum(row_sizes) / len(row_sizes)
         if avg_nbytes_per_row == 0:
             logger.warning(
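
The guard in this hunk can be tried in isolation with a minimal standalone sketch. It is not the code in plan_download_op.py: the function name, its parameters, the use of None for a failed size lookup, and everything after the zero-average check are assumptions made for illustration, since the diff does not show those parts.

```python
import logging
from typing import Optional, Sequence

logger = logging.getLogger(__name__)


def estimate_nrows_per_partition(
    sampled_sizes: Sequence[Optional[int]],
    num_rows_in_block: int,
    target_nbytes_per_partition: int,
) -> int:
    """Hypothetical standalone version of the guarded estimate."""
    # Assumption: a failed size lookup is represented as None and dropped here.
    row_sizes = [size for size in sampled_sizes if size is not None]

    # Guard from the commit: with no sizes, sum(...) / len(...) would raise
    # ZeroDivisionError, so fall back to the block's row count.
    if len(row_sizes) == 0:
        logger.warning(
            "Unable to estimate file sizes for URIs. "
            "Falling back to using the number of rows in the block as the partition size."
        )
        return num_rows_in_block

    avg_nbytes_per_row = sum(row_sizes) / len(row_sizes)
    if avg_nbytes_per_row == 0:
        # Assumption: the diff only shows a warning being logged on this path;
        # reusing the same fallback is a guess.
        logger.warning("Average row size is zero; falling back to the block's row count.")
        return num_rows_in_block

    # Assumption: rows per partition is the byte target divided by the average
    # row size, clamped to at least one row.
    return max(1, int(target_nbytes_per_partition // avg_nbytes_per_row))
```

With this sketch, a call where every lookup failed, such as estimate_nrows_per_partition([None, None], 10, 128), returns the block's row count instead of raising.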
