Drop FairScale sharded parity tests #16069
⚡ Required checks status: All passing 🟢
Groups summary:
* 🟢 pytorch_lightning: Tests workflow
* 🟢 pytorch_lightning: Azure GPU
* 🟢 pytorch_lightning: Benchmarks
* 🟢 pytorch_lightning: Azure HPU
* 🟢 pytorch_lightning: Azure IPU
Thank you for your contribution! 💜
* Remove the deprecated profiler imports (#16059)
* Revert "Load app before setting LIGHTNING_DISPATCHED" (#16064)
  Revert "Load app before setting LIGHTNING_DISPATCHED (#16057)"
  This reverts commit 8d3339a.
* [App] Hot fix: Resolve detection of python debugger (#16068)
  Co-authored-by: thomas <thomas@thomass-MacBook-Pro.local>
  Co-authored-by: Carlos Mocholí <carlossmocholi@gmail.com>
* Load the app before setting `LIGHTNING_DISPATCHED` (#16071)
* fix(cloud): detect and ignore venv (#16056)
  Co-authored-by: Ethan Harris <ethanwharris@gmail.com>
* Add function to remove checkpoint to allow override for extended classes (#16067)
* Drop FairScale sharded parity tests (#16069)
* minor fix: indent spaces in comment-out (#16076)
* ci: print existing candidates (#16077)
* [App] Fix bug where previously deleted apps cannot be re-run from the CLI (#16082)
* Better check for programmatic lightningignore (#16080)
  Co-authored-by: Jirka Borovec <6035284+Borda@users.noreply.github.com>
* [App] Removing single quote (#16079)
* [App] PoC: Add support for Request (#16047)
* Have checkgroup pull the latest runs (#16033)
* Update Multinode Warning (#16091)
* [App] Serve datatypes with better client code (#16018)
* docs: add PT version (#16010)
* docs: add PT version
* stable
  Co-authored-by: Adrian Wälchli <aedu.waelchli@gmail.com>
* add 1.13.1 to adjust versions (#16099)
* Remove redundant `find_unused_parameters=False` in Lite (#16026)
* [App] Add display name property to the work (#16095)
  Co-authored-by: thomas <thomas@thomass-MacBook-Pro.local>
* Fix detection of whether app is running in cloud (#16045)
* [App] Add work.delete (#16103)
  Co-authored-by: thomas <thomas@thomass-MacBook-Pro.local>
* [App] Improve the autoscaler UI (#16063)
* Re-enable Lite CLI on Windows + PyTorch 1.13 (#15645)
  Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
  Co-authored-by: Justus Schock <12886177+justusschock@users.noreply.github.com>
* [App] Min replica=0 would break autoscaler component (#16092)
* fixing the bug where num_replica=0 would fail
* changelog
* [App] Scale out/in interval for autoscaler (#16093)
* Adding arguments for scale out/in interval
* Tests
* Set the default work start method to spawn on MacOS (#16089)
* [App] Add status endpoint, enable `ready` (#16075)
  Co-authored-by: thomas chaton <thomas@grid.ai>
* Clarify `work.stop()` limitation (#16073)
* fix merge errors
* Update torchvision requirement from <=0.14.0,>=0.11.1 to >=0.11.1,<0.15.0 in /requirements (#16108)
  Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
  Co-authored-by: Jirka <jirka.borovec@seznam.cz>
* CI: settle file names (#16098)
* CI: settle file names
* rename
* [pre-commit.ci] auto fixes from pre-commit.com hooks
  for more information, see https://pre-commit.ci
  Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
* Fix test failing on master due to bad auto-merge (#16118)
* fix merge error

Co-authored-by: Carlos Mocholí <carlossmocholi@gmail.com>
Co-authored-by: Akihiro Nitta <nitta@akihironitta.com>
Co-authored-by: thomas chaton <thomas@grid.ai>
Co-authored-by: thomas <thomas@thomass-MacBook-Pro.local>
Co-authored-by: Yurij Mikhalevich <yurij@grid.ai>
Co-authored-by: Ethan Harris <ethanwharris@gmail.com>
Co-authored-by: Sean Naren <snarenthiran@nvidia.com>
Co-authored-by: Qiushi Pan <17402261+qqpann@users.noreply.github.com>
Co-authored-by: Jirka Borovec <6035284+Borda@users.noreply.github.com>
Co-authored-by: Sherin Thomas <sherin@lightning.ai>
Co-authored-by: Justus Schock <12886177+justusschock@users.noreply.github.com>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Co-authored-by: Jirka <jirka.borovec@seznam.cz>
What does this PR do?
These benchmarks check the speed difference between DDP and sharded DDP. However, that comparison is meant to be FairScale's responsibility, and we can assume they run similar benchmarks.
Suggested by @SeanNaren
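For context, a speed-parity check of the kind described above can be sketched roughly as follows. This is only an illustration, not the removed test code: the helper names `measure_seconds` and `within_speed_parity` and the 5% default threshold are hypothetical, and the two callables stand in for a DDP run and a sharded DDP run.

```python
import time
from typing import Callable


def measure_seconds(fn: Callable[[], object], repeats: int = 3) -> float:
    """Return the best wall-clock time of `fn` over `repeats` runs."""
    best = float("inf")
    for _ in range(repeats):
        start = time.perf_counter()
        fn()
        best = min(best, time.perf_counter() - start)
    return best


def within_speed_parity(
    baseline_s: float, candidate_s: float, max_slowdown: float = 0.05
) -> bool:
    """True if the candidate is at most `max_slowdown` (a fraction) slower than the baseline.

    The 5% default is an assumed tolerance chosen for illustration only.
    """
    if baseline_s <= 0:
        raise ValueError("baseline time must be positive")
    return (candidate_s - baseline_s) / baseline_s <= max_slowdown
```

In a real benchmark, the callables passed to `measure_seconds` would run the same model once with each strategy, and the test would assert `within_speed_parity(ddp_time, sharded_time)`.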
Does your PR introduce any breaking changes? If yes, please list them.
None
cc @Borda