Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[Data] Fix exception in async map #47110

Merged
merged 5 commits into from
Aug 15, 2024

Conversation

Bye-legumes
Copy link
Contributor

Why are these changes needed?

close #47102

Related issue number

#47102

Checks

  • [√] I've signed off every commit(by using the -s flag, i.e., git commit -s) in this PR.
  • [√] I've run scripts/format.sh to lint the changes in this PR.
  • [] I've included any doc changes needed for https://docs.ray.io/en/master/.
    • I've added any new APIs to the API Reference. For example, if I added a
      method in Tune, I've added it in doc/source/tune/api/ under the
      corresponding .rst file.
  • I've made sure the tests are passing. Note that there might be a few flaky tests, see the recent failures at https://flakey-tests.ray.io/
  • Testing Strategy
    • [√] Unit tests
    • Release tests
    • This PR is not tested :(

Signed-off-by: zhilong <zhilong.chen@mail.mcgill.ca>
output_batch_queue.put(output_batch)
except Exception as e:
exception_queue.put(e)
output_batch_queue.put(None) # Signal to stop processing
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: could we use a constant/more specific sentinel value instead of generic None in this case?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I just change it to dict

@scottjlee
Copy link
Contributor

@Bye-legumes Could you also add a unit test? You can base it off the reproducible in #47102, or the script you used to test. You can add it in test_map.py (see test_map_batches_async_generator for an example). thanks!

Signed-off-by: zhilong <zhilong.chen@mail.mcgill.ca>
@Bye-legumes
Copy link
Contributor Author

@Bye-legumes Could you also add a unit test? You can base it off the reproducible in #47102, or the script you used to test. You can add it in test_map.py (see test_map_batches_async_generator for an example). thanks!

Added!

Signed-off-by: zhilong <zhilong.chen@mail.mcgill.ca>
# Here, `out_batch` is a one-row output batch
# from the async generator, corresponding to a
# single row from the input batch.
out_batch = output_batch_queue.get()
if out_batch is None:
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this should also be updated to SENTINEL i think?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Right! I fixed! thx!

Signed-off-by: zhilong <zhilong.chen@mail.mcgill.ca>
@@ -318,13 +318,19 @@ def transform_fn(
# generators, and in the main event loop, yield them from
# the queue as they become available.
output_batch_queue = queue.Queue()
exception_queue = queue.Queue()
SENTINEL = dict() # Signal to stop processing
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

empty dict can also be a valid UDF return.
I think we can avoid having this extra exception_queue and SENTINEL.
Just check if out_batch = output_batch_queue.get() is an exception.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

awesome! That right! I just modify my modification to put exception in the queue

Signed-off-by: zhilong <zhilong.chen@mail.mcgill.ca>
@raulchen raulchen enabled auto-merge (squash) August 15, 2024 21:25
@github-actions github-actions bot added the go add ONLY when ready to merge, run all tests label Aug 15, 2024
@raulchen raulchen merged commit b5a341f into ray-project:master Aug 15, 2024
7 checks passed
simonsays1980 pushed a commit to simonsays1980/ray that referenced this pull request Aug 16, 2024
close ray-project#47102

---------

Signed-off-by: zhilong <zhilong.chen@mail.mcgill.ca>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
go add ONLY when ready to merge, run all tests
Projects
None yet
Development

Successfully merging this pull request may close these issues.

[Data] Async MapBatches hangs upon exception raised from UDF
3 participants