Attach traceback to Exception & Test dispatching process (#1036) #1038

Merged 1 commit on Feb 22, 2023

Contributor

@ejguan ejguan commented Feb 22, 2023

Cherry-pick: #1036

Summary:
Partially fixes #969

Changes

  • Add ExceptionWrapper to attach the traceback to the Exception
    • Reason: a traceback object cannot be serialized, so it has to be passed between processes as a string
    • To provide an informative error message, pass a name for each process, such as dispatching process and worker process <id>.
  • Add tests to validate Error propagation from the dispatching process
    • Parametrize the tests
  • Fix a bug in round_robin_demux so that it returns a list of DataPipes rather than a single DataPipe when num_of_instances is 1.
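
For readers unfamiliar with the traceback constraint above, here is a minimal sketch of the general technique. It is not the code added in this PR: the class body, the `where` parameter, and the helpers `failing_task` and `run_in_process` are hypothetical and used only for illustration. The idea is to format the traceback into a string inside the failing process, ship the picklable wrapper across the process boundary, and re-raise on the receiving side with the process name in the message.

```python
import pickle
import traceback


class ExceptionWrapper:
    """Illustrative wrapper (assumed design, not the PR's implementation).

    Stores the exception type and the formatted traceback as a string,
    because traceback objects themselves cannot be pickled.
    """

    def __init__(self, exc: BaseException, where: str = "in background") -> None:
        self.exc_type = type(exc)
        # Format the traceback now, while it is still available in this process.
        self.exc_msg = "".join(
            traceback.format_exception(type(exc), exc, exc.__traceback__)
        )
        self.where = where

    def reraise(self) -> None:
        """Re-raise in the receiving process with the original traceback text."""
        msg = f"Caught {self.exc_type.__name__} {self.where}.\nOriginal {self.exc_msg}"
        try:
            exc = self.exc_type(msg)
        except TypeError:
            # Some exception types need extra constructor arguments;
            # fall back to a generic error that still carries the message.
            exc = RuntimeError(msg)
        raise exc


def failing_task() -> None:
    # Hypothetical stand-in for work done in a dispatching or worker process.
    raise ValueError("bad sample")


def run_in_process(name: str) -> bytes:
    """Simulate the sending side: catch, wrap, and serialize the error."""
    try:
        failing_task()
    except Exception as e:
        return pickle.dumps(ExceptionWrapper(e, where=f"in {name}"))
    return pickle.dumps(None)


if __name__ == "__main__":
    received = pickle.loads(run_in_process("worker process 0"))
    try:
        received.reraise()
    except ValueError as err:
        print(err)
```

Passing a name such as `dispatching process` or `worker process <id>` as `where` is what lets the re-raised message say which process the error came from.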

Pull Request resolved: #1036

Reviewed By: NivekT

Differential Revision: D43472709

Pulled By: ejguan

fbshipit-source-id: e5c9e581ca881f523fb568b6f46bf16ecfc243d2

@facebook-github-bot facebook-github-bot added the CLA Signed label Feb 22, 2023
@ejguan ejguan mentioned this pull request Feb 22, 2023
10 tasks
@ejguan ejguan requested a review from NivekT February 22, 2023 15:00
@ejguan ejguan merged commit 41cfdfa into pytorch:release/0.6 Feb 22, 2023