[Data] AlltoAll OP, Update Data progress bars to use row as the iteration unit #46924
@@ -134,7 +134,13 @@ def block_until_complete(self, remaining: List[ObjectRef]) -> None:
             done, remaining = ray.wait(
                 remaining, num_returns=len(remaining), fetch_local=False, timeout=0.1
             )
-            self.update(len(done))
+            total_rows_processed = 0
+            for _, result in zip(done, ray.get(done)):
+                num_rows = (
+                    result.num_rows if hasattr(result, "num_rows") else 1
+                )  # Default to 1 if no row count is available
+                total_rows_processed += num_rows
+            self.update(total_rows_processed)
 
             with _canceled_threads_lock:
                 if t in _canceled_threads:
@@ -158,9 +164,15 @@ def fetch_until_complete(self, refs: List[ObjectRef]) -> List[Any]:
             )
             if fetch_local:
                 fetch_local = False
+            total_rows_processed = 0
             for ref, result in zip(done, ray.get(done)):
                 ref_to_result[ref] = result
-            self.update(len(done))
+                num_rows = (
+                    result.num_rows if hasattr(result, "num_rows") else 1
+                )  # Default to 1 if no row count is available
+                total_rows_processed += num_rows
+            # TODO(zhilong): Change the total to total_row when init progress bar
+            self.update(total_rows_processed)
 
             with _canceled_threads_lock:
                 if t in _canceled_threads:

Comment on lines +170 to +175

nice, thanks for the fix here. for consistency, can we also apply the same logic in …

fixed
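Both hunks repeat the same row-counting loop, so the logic could live in one shared helper. The following is a minimal sketch, not code from this PR: `count_rows` and `FakeBlockMetadata` are hypothetical names, and the only assumption taken from the diff is that a result may expose a `num_rows` attribute. Unlike the diff, the sketch also treats a `num_rows` of `None` as 1, since adding `None` to an integer would raise a `TypeError`.

```python
from dataclasses import dataclass
from typing import Any, Iterable, Optional


@dataclass
class FakeBlockMetadata:
    """Hypothetical stand-in for a result object that carries a row count."""

    num_rows: Optional[int]


def count_rows(results: Iterable[Any]) -> int:
    """Sum row counts, counting 1 for results with no usable `num_rows`."""
    total = 0
    for result in results:
        # getattr with a default covers objects that lack the attribute.
        rows = getattr(result, "num_rows", None)
        # Fall back to 1 when the count is missing or unknown (None).
        total += rows if rows is not None else 1
    return total


results = [FakeBlockMetadata(100), FakeBlockMetadata(50), "no row count"]
print(count_rows(results))  # 151
```

With a helper like this, each method would make a single call such as `self.update(count_rows(ray.get(done)))`, which keeps the two code paths from drifting apart.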
Is `self.input_dependencies[0].num_output_rows_total()` something that is static? Should we cache this value with some call like `self._output_rows = self.input_dependencies[0].num_output_rows_total()`? If this total is a live total that is updated as execution continues, it makes sense to leave as is.
Right! Here `self._output_rows` is not static, but it's our primary option, as it will be updated here:
data:image/s3,"s3://crabby-images/43c98/43c983fb9e48b06ba926fb413b74f8705a85e70e" alt="image"
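The trade-off in this thread, caching a total once versus reading it on every update, can be illustrated without Ray. In this sketch `UpstreamOp`, `ProgressTracker`, `set_total`, `live_total` and `cached_total` are all hypothetical. Only the method name `num_output_rows_total` and the idea that the total may change during execution come from the discussion above.

```python
from typing import Optional


class UpstreamOp:
    """Hypothetical upstream operator whose row total becomes known over time."""

    def __init__(self) -> None:
        self._total: Optional[int] = None

    def set_total(self, total: int) -> None:
        self._total = total

    def num_output_rows_total(self) -> Optional[int]:
        return self._total


class ProgressTracker:
    """Hypothetical consumer that can read the total live or from a snapshot."""

    def __init__(self, upstream: UpstreamOp) -> None:
        self._upstream = upstream
        # Snapshot taken once at construction; it never sees later updates.
        self._cached_total = upstream.num_output_rows_total()

    def live_total(self) -> Optional[int]:
        # Re-read on every call so updates made during execution are visible.
        return self._upstream.num_output_rows_total()

    def cached_total(self) -> Optional[int]:
        return self._cached_total


op = UpstreamOp()
tracker = ProgressTracker(op)
op.set_total(1000)  # the total becomes known after the tracker was created

print(tracker.live_total())    # 1000
print(tracker.cached_total())  # None
```

If the upstream total really is updated while the pipeline runs, as the reply states, the cached snapshot would go stale, which is the argument for leaving the live read in place.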