test_handler_metrics_saver_dist error #3621
The root cause is that the heavy lifting happens on rank 0, so rank 1 may exit early and clear the process rank information while rank 0 is still working.
Still an issue (https://github.com/Project-MONAI/MONAI/runs/4852155300?check_suite_focus=true). I'll add a barrier, following the comment in #3641 (comment). cc @Nic-Ma
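For anyone following along, here is a rough sketch of why a barrier helps. It is not the MONAI test or the actual fix: it simulates two ranks with threads, and `threading.Barrier` stands in for `torch.distributed.barrier()`. The names `rank0_sees_group` and `group_state` are made up for illustration.

```python
import threading
import time


def rank0_sees_group(use_barrier: bool) -> bool:
    """Return True if rank 0 still saw the shared state when its work finished."""
    world_size = 2
    # Hypothetical stand-in for the process rank information shared by the ranks.
    group_state = {"initialized": True}
    observed = {}
    barrier = threading.Barrier(world_size)

    def worker(rank: int) -> None:
        if rank == 0:
            time.sleep(0.2)  # the heavy lifting happens on rank 0 only
            observed["ok"] = group_state["initialized"]
        if use_barrier:
            barrier.wait()  # no rank continues until every rank has arrived
        group_state["initialized"] = False  # teardown clears the rank information

    threads = [threading.Thread(target=worker, args=(r,)) for r in range(world_size)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return observed["ok"]


if __name__ == "__main__":
    print("without barrier:", rank0_sees_group(use_barrier=False))
    print("with barrier:", rank0_sees_group(use_barrier=True))
```

Without the barrier, the idle rank reaches teardown while rank 0 is still busy. With it, teardown can only start once rank 0 has arrived at the same point.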
wyli added a commit to wyli/MONAI that referenced this issue on Jan 18, 2022
Signed-off-by: Wenqi Li <wenqil@nvidia.com>
wyli added a commit to wyli/MONAI that referenced this issue on Jan 25, 2022
Signed-off-by: Wenqi Li <wenqil@nvidia.com>
wyli added a commit to wyli/MONAI that referenced this issue on Jan 26, 2022
Signed-off-by: Wenqi Li <wenqil@nvidia.com>
wyli added a commit that referenced this issue on Feb 3, 2022
* temp spatial_resample Signed-off-by: Wenqi Li <wenqil@nvidia.com> * fixes resampling Signed-off-by: Wenqi Li <wenqil@nvidia.com> * fixes precisions Signed-off-by: Wenqi Li <wenqil@nvidia.com> * update dict version Signed-off-by: Wenqi Li <wenqil@nvidia.com> * fixes unit tests Signed-off-by: Wenqi Li <wenqil@nvidia.com> * adds docs Signed-off-by: Wenqi Li <wenqil@nvidia.com> * copy grid for resampling Signed-off-by: Wenqi Li <wenqil@nvidia.com> * fixes unit tests Signed-off-by: Wenqi Li <wenqil@nvidia.com> * remove normalize coordinates Signed-off-by: Wenqi Li <wenqil@nvidia.com> * [MONAI] python code formatting Signed-off-by: monai-bot <monai.miccai2019@gmail.com> * try to fix #3621 (#3673) Signed-off-by: Wenqi Li <wenqil@nvidia.com> * fixes typing Signed-off-by: Wenqi Li <wenqil@nvidia.com> * fixes grid_sample, interpolate URLs Signed-off-by: Wenqi Li <wenqil@nvidia.com> * simplify norm_coords Signed-off-by: Wenqi Li <wenqil@nvidia.com> * update docstring Signed-off-by: Wenqi Li <wenqil@nvidia.com> * update moveaxis Signed-off-by: Wenqi Li <wenqil@nvidia.com> * spatial sample tests Signed-off-by: Wenqi Li <wenqil@nvidia.com> * additional tests spatial resample Signed-off-by: Wenqi Li <wenqil@nvidia.com> * test invert saptial resample Signed-off-by: Wenqi Li <wenqil@nvidia.com> * fixes unit tests Signed-off-by: Wenqi Li <wenqil@nvidia.com> * rtol assert close Signed-off-by: Wenqi Li <wenqil@nvidia.com> * fixes TF32 tests Signed-off-by: Wenqi Li <wenqil@nvidia.com> * smaller tests Signed-off-by: Wenqi Li <wenqil@nvidia.com> * skip when quick testing Signed-off-by: Wenqi Li <wenqil@nvidia.com> * comp tensor and ndarray Signed-off-by: Wenqi Li <wenqil@nvidia.com> * update based on comments Signed-off-by: Wenqi Li <wenqil@nvidia.com> * fixes tests Signed-off-by: Wenqi Li <wenqil@nvidia.com> * try to use torch.solve Signed-off-by: Wenqi Li <wenqil@nvidia.com> * Revert "fixes tests" This reverts commit e532490. 
Signed-off-by: Wenqi Li <wenqil@nvidia.com> * update based on comments Signed-off-by: Wenqi Li <wenqil@nvidia.com> * fixes test_affined Signed-off-by: Wenqi Li <wenqil@nvidia.com> * default to float32 rotate/randrotate Signed-off-by: Wenqi Li <wenqil@nvidia.com> * workaround for #3752 Signed-off-by: Wenqi Li <wenqil@nvidia.com> * default to float32 rotate/randrotate Signed-off-by: Wenqi Li <wenqil@nvidia.com> * temp test Signed-off-by: Wenqi Li <wenqil@nvidia.com> * update docstring Signed-off-by: Wenqi Li <wenqil@nvidia.com> Co-authored-by: monai-bot <monai.miccai2019@gmail.com>
wyli added a commit that referenced this issue on Feb 4, 2022
* temp spatial_resample Signed-off-by: Wenqi Li <wenqil@nvidia.com> * fixes resampling Signed-off-by: Wenqi Li <wenqil@nvidia.com> * fixes precisions Signed-off-by: Wenqi Li <wenqil@nvidia.com> * update dict version Signed-off-by: Wenqi Li <wenqil@nvidia.com> * fixes unit tests Signed-off-by: Wenqi Li <wenqil@nvidia.com> * adds docs Signed-off-by: Wenqi Li <wenqil@nvidia.com> * copy grid for resampling Signed-off-by: Wenqi Li <wenqil@nvidia.com> * fixes unit tests Signed-off-by: Wenqi Li <wenqil@nvidia.com> * remove normalize coordinates Signed-off-by: Wenqi Li <wenqil@nvidia.com> * [MONAI] python code formatting Signed-off-by: monai-bot <monai.miccai2019@gmail.com> * try to fix #3621 (#3673) Signed-off-by: Wenqi Li <wenqil@nvidia.com> * fixes typing Signed-off-by: Wenqi Li <wenqil@nvidia.com> * fixes grid_sample, interpolate URLs Signed-off-by: Wenqi Li <wenqil@nvidia.com> * simplify norm_coords Signed-off-by: Wenqi Li <wenqil@nvidia.com> * update docstring Signed-off-by: Wenqi Li <wenqil@nvidia.com> * update moveaxis Signed-off-by: Wenqi Li <wenqil@nvidia.com> * spatial sample tests Signed-off-by: Wenqi Li <wenqil@nvidia.com> * additional tests spatial resample Signed-off-by: Wenqi Li <wenqil@nvidia.com> * test invert saptial resample Signed-off-by: Wenqi Li <wenqil@nvidia.com> * fixes unit tests Signed-off-by: Wenqi Li <wenqil@nvidia.com> * rtol assert close Signed-off-by: Wenqi Li <wenqil@nvidia.com> * fixes TF32 tests Signed-off-by: Wenqi Li <wenqil@nvidia.com> * smaller tests Signed-off-by: Wenqi Li <wenqil@nvidia.com> * skip when quick testing Signed-off-by: Wenqi Li <wenqil@nvidia.com> * comp tensor and ndarray Signed-off-by: Wenqi Li <wenqil@nvidia.com> * update based on comments Signed-off-by: Wenqi Li <wenqil@nvidia.com> * fixes tests Signed-off-by: Wenqi Li <wenqil@nvidia.com> * try to use torch.solve Signed-off-by: Wenqi Li <wenqil@nvidia.com> * temp updates Signed-off-by: Wenqi Li <wenqil@nvidia.com> * enhance typing Signed-off-by: Wenqi Li 
<wenqil@nvidia.com> * temp test Signed-off-by: Wenqi Li <wenqil@nvidia.com> * fixes Signed-off-by: Wenqi Li <wenqil@nvidia.com> * Revert "temp test" This reverts commit 6200a38. Signed-off-by: Wenqi Li <wenqil@nvidia.com> * enhance types Signed-off-by: Wenqi Li <wenqil@nvidia.com> * update util Signed-off-by: Wenqi Li <wenqil@nvidia.com> * reverse workaround Signed-off-by: Wenqi Li <wenqil@nvidia.com> * formatting Signed-off-by: Wenqi Li <wenqil@nvidia.com> * update type def. Signed-off-by: Wenqi Li <wenqil@nvidia.com> Signed-off-by: Wenqi Li <wenqil@nvidia.com> * temp test Signed-off-by: Wenqi Li <wenqil@nvidia.com> * warn unused Signed-off-by: Wenqi Li <wenqil@nvidia.com> * remote ignore Signed-off-by: Wenqi Li <wenqil@nvidia.com> * Revert "warn unused" This reverts commit e645807. Signed-off-by: Wenqi Li <wenqil@nvidia.com> * Revert "temp test" This reverts commit ddc4770. Signed-off-by: Wenqi Li <wenqil@nvidia.com> * update based on comments Signed-off-by: Wenqi Li <wenqil@nvidia.com> Co-authored-by: monai-bot <monai.miccai2019@gmail.com>
wyli added a commit that referenced this issue on Feb 7, 2022
* temp spatial_resample Signed-off-by: Wenqi Li <wenqil@nvidia.com> * fixes resampling Signed-off-by: Wenqi Li <wenqil@nvidia.com> * fixes precisions Signed-off-by: Wenqi Li <wenqil@nvidia.com> * update dict version Signed-off-by: Wenqi Li <wenqil@nvidia.com> * fixes unit tests Signed-off-by: Wenqi Li <wenqil@nvidia.com> * adds docs Signed-off-by: Wenqi Li <wenqil@nvidia.com> * copy grid for resampling Signed-off-by: Wenqi Li <wenqil@nvidia.com> * fixes unit tests Signed-off-by: Wenqi Li <wenqil@nvidia.com> * remove normalize coordinates Signed-off-by: Wenqi Li <wenqil@nvidia.com> * [MONAI] python code formatting Signed-off-by: monai-bot <monai.miccai2019@gmail.com> * try to fix #3621 (#3673) Signed-off-by: Wenqi Li <wenqil@nvidia.com> * fixes typing Signed-off-by: Wenqi Li <wenqil@nvidia.com> * fixes grid_sample, interpolate URLs Signed-off-by: Wenqi Li <wenqil@nvidia.com> * simplify norm_coords Signed-off-by: Wenqi Li <wenqil@nvidia.com> * update docstring Signed-off-by: Wenqi Li <wenqil@nvidia.com> * update moveaxis Signed-off-by: Wenqi Li <wenqil@nvidia.com> * spatial sample tests Signed-off-by: Wenqi Li <wenqil@nvidia.com> * additional tests spatial resample Signed-off-by: Wenqi Li <wenqil@nvidia.com> * test invert saptial resample Signed-off-by: Wenqi Li <wenqil@nvidia.com> * adds a base writer and an itk writer Signed-off-by: Wenqi Li <wenqil@nvidia.com> * update docstrings Signed-off-by: Wenqi Li <wenqil@nvidia.com> * remove return self Signed-off-by: Wenqi Li <wenqil@nvidia.com> * adds reorient_spatial_axes Signed-off-by: Wenqi Li <wenqil@nvidia.com> * update based on comments Signed-off-by: Wenqi Li <wenqil@nvidia.com> * update based on comments Signed-off-by: Wenqi Li <wenqil@nvidia.com> * fixes unit tests Signed-off-by: Wenqi Li <wenqil@nvidia.com> * sync 3701 Signed-off-by: Wenqi Li <wenqil@nvidia.com> * try to fix #3766 Signed-off-by: Wenqi Li <wenqil@nvidia.com> * revise docstring to be concise Signed-off-by: Wenqi Li <wenqil@nvidia.com> * update 
based on comments Signed-off-by: Wenqi Li <wenqil@nvidia.com> * 3765 Enhance `create_multigpu_supervised_XXX` for distributed (#3768) * [DLMED] add check for devices Signed-off-by: Nic Ma <nma@nvidia.com> * [DLMED] update according to comments Signed-off-by: Nic Ma <nma@nvidia.com> * update to support dynamic spatial_size Signed-off-by: Wenqi Li <wenqil@nvidia.com> * update based on comments Signed-off-by: Wenqi Li <wenqil@nvidia.com>
wyli added a commit that referenced this issue on Feb 8, 2022
* temp spatial_resample Signed-off-by: Wenqi Li <wenqil@nvidia.com> * fixes resampling Signed-off-by: Wenqi Li <wenqil@nvidia.com> * fixes precisions Signed-off-by: Wenqi Li <wenqil@nvidia.com> * update dict version Signed-off-by: Wenqi Li <wenqil@nvidia.com> * fixes unit tests Signed-off-by: Wenqi Li <wenqil@nvidia.com> * adds docs Signed-off-by: Wenqi Li <wenqil@nvidia.com> * copy grid for resampling Signed-off-by: Wenqi Li <wenqil@nvidia.com> * fixes unit tests Signed-off-by: Wenqi Li <wenqil@nvidia.com> * remove normalize coordinates Signed-off-by: Wenqi Li <wenqil@nvidia.com> * [MONAI] python code formatting Signed-off-by: monai-bot <monai.miccai2019@gmail.com> * try to fix #3621 (#3673) Signed-off-by: Wenqi Li <wenqil@nvidia.com> * fixes typing Signed-off-by: Wenqi Li <wenqil@nvidia.com> * fixes grid_sample, interpolate URLs Signed-off-by: Wenqi Li <wenqil@nvidia.com> * simplify norm_coords Signed-off-by: Wenqi Li <wenqil@nvidia.com> * update docstring Signed-off-by: Wenqi Li <wenqil@nvidia.com> * update moveaxis Signed-off-by: Wenqi Li <wenqil@nvidia.com> * spatial sample tests Signed-off-by: Wenqi Li <wenqil@nvidia.com> * additional tests spatial resample Signed-off-by: Wenqi Li <wenqil@nvidia.com> * test invert saptial resample Signed-off-by: Wenqi Li <wenqil@nvidia.com> * adds a base writer and an itk writer Signed-off-by: Wenqi Li <wenqil@nvidia.com> * update docstrings Signed-off-by: Wenqi Li <wenqil@nvidia.com> * remove return self Signed-off-by: Wenqi Li <wenqil@nvidia.com> * adds reorient_spatial_axes Signed-off-by: Wenqi Li <wenqil@nvidia.com> * update based on comments Signed-off-by: Wenqi Li <wenqil@nvidia.com> * update based on comments Signed-off-by: Wenqi Li <wenqil@nvidia.com> * fixes unit tests Signed-off-by: Wenqi Li <wenqil@nvidia.com> * sync 3701 Signed-off-by: Wenqi Li <wenqil@nvidia.com> * try to fix #3766 Signed-off-by: Wenqi Li <wenqil@nvidia.com> * revise docstring to be concise Signed-off-by: Wenqi Li <wenqil@nvidia.com> * update 
based on comments Signed-off-by: Wenqi Li <wenqil@nvidia.com> * 3765 Enhance `create_multigpu_supervised_XXX` for distributed (#3768) * [DLMED] add check for devices Signed-off-by: Nic Ma <nma@nvidia.com> * [DLMED] update according to comments Signed-off-by: Nic Ma <nma@nvidia.com> * update to support dynamic spatial_size Signed-off-by: Wenqi Li <wenqil@nvidia.com> * adds nibabel pil writers Signed-off-by: Wenqi Li <wenqil@nvidia.com> * remove unused ignore Signed-off-by: Wenqi Li <wenqil@nvidia.com> * update based on comments Signed-off-by: Wenqi Li <wenqil@nvidia.com>
Describe the bug
a frequent error from the premerge tests, e.g.