Refactor multi-node running command into dedicated functions #6623
Signed-off-by: Mingxin <18563433+mingxin-zheng@users.noreply.github.com>
/build
Looks like the test error is not related to this PR: I can take a look after a few hours.
/build
/build
/integration-test
@wyli blossom-ci looks "skipped" when I clicked into it. May I know if we can re-trigger it?
the last two runs have the same errors:
I'll try to run it again.
/black
Thanks @wyli for pointing it out. I was looking at the wrong things. Let me fix this.
Signed-off-by: monai-bot <monai.miccai2019@gmail.com>
/build
Signed-off-by: Wenqi Li <wenqil@nvidia.com>
thanks @mingxin-zheng, I modified the
/build
Thank you @wyli
Fixes #6567.
Description
This PR refactors the multi-node command preparation and running for Auto3DSeg.
In the initial draft, I assume the functions are internal and only for Auto3DSeg, so I put them in the Auto3DSeg utils. I am open to changes if we think the usage can be more general.
Some details:
Auto3DSeg uses 3 variations of commands:
- python script.py <options>
- torchrun <specs> script.py <options>
- bcprun <specs> -c python script.py <options>
To address them, I split the passing of <options> and <specs> into different stages:
- <options> is handled in the preparation stage, e.g. create_cmd in Auto3DSeg.
- <specs> is related to device configuration, so it is handled in the launch stage, e.g. run_cmd in Auto3DSeg.
Each variation has its own version of preparation and launching.
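To make the split concrete, here is a minimal sketch of the two stages for the three command variations. It is not the code in this PR: all function names are hypothetical, the `--key=value` rendering is an assumption, and the spec names passed to the bcprun helper are placeholders rather than real bcprun flags. The helpers only build token lists and do not execute anything.

```python
from typing import Any, Dict, List


def format_options(options: Dict[str, Any]) -> List[str]:
    """Render keyword arguments as ``--key=value`` tokens (assumed format)."""
    return [f"--{key}={value}" for key, value in options.items()]


# ---- Preparation stage: only <options> are known here ----

def prepare_python_cmd(script: str, **options: Any) -> List[str]:
    # Variation 1: "python script.py <options>" is already complete.
    return ["python", script] + format_options(options)


def prepare_torchrun_cmd(script: str, **options: Any) -> List[str]:
    # Variation 2: the launcher and its <specs> are left for the launch stage.
    return [script] + format_options(options)


def prepare_bcprun_cmd(script: str, **options: Any) -> List[str]:
    # Variation 3: this is the inner command that follows "-c".
    return ["python", script] + format_options(options)


# ---- Launch stage: <specs> describe the device configuration ----

def launch_torchrun_cmd(prepared: List[str], **specs: Any) -> List[str]:
    # Builds "torchrun <specs> script.py <options>".
    return ["torchrun"] + format_options(specs) + prepared


def launch_bcprun_cmd(prepared: List[str], **specs: Any) -> List[str]:
    # Builds "bcprun <specs> -c python script.py <options>".
    # The spec names are placeholders, not actual bcprun flags.
    return ["bcprun"] + format_options(specs) + ["-c", " ".join(prepared)]


if __name__ == "__main__":
    prepared = prepare_torchrun_cmd("script.py", fold=0)
    print(" ".join(launch_torchrun_cmd(prepared, nnodes=2, nproc_per_node=8)))
```

With this layout the prepared command can be created once, without knowing how many nodes or GPUs will be used, and the device-specific part is attached only at launch time.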
Types of changes
./runtests.sh -f -u --net --coverage
./runtests.sh --quick --unittests --disttests
make html command in the docs/ folder.