Support multi-output models #170

Merged
tjruwase merged 10 commits into master from olruwase/scale_none_loss
Mar 27, 2020
Conversation

@tjruwase
Contributor

Correctly handle multi-output models by performing loss scaling in backward() instead of forward()
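The idea behind the change can be illustrated with a minimal sketch (the `ToyEngine` class and its names are hypothetical, not DeepSpeed's actual API): if forward() returns the model's outputs untouched, a model is free to return a tuple such as (loss, logits), and scaling is applied only to the tensor the caller explicitly passes to backward().

```python
class ToyEngine:
    """Hypothetical sketch of backward-time loss scaling (not DeepSpeed's API)."""

    def __init__(self, model, loss_scale=1024.0):
        self.model = model
        self.loss_scale = loss_scale

    def forward(self, *inputs):
        # No scaling here: outputs pass through untouched, so the model may
        # return a single loss, a (loss, logits) tuple, or even None entries.
        return self.model(*inputs)

    def backward(self, loss):
        # Scaling applies only to the tensor the caller identifies as the loss.
        scaled = loss * self.loss_scale
        # A real engine would now call scaled.backward(); this sketch returns
        # the value instead so it stays framework-free.
        return scaled


# A multi-output "model" returning (loss, logits). Scaling inside forward()
# would have to guess which output is the loss; scaling in backward() does not.
engine = ToyEngine(lambda x: (x * 2.0, [0.1, 0.9]))
loss, logits = engine.forward(3.0)
print(engine.backward(loss))  # 6.0 * 1024.0 = 6144.0
```

Scaling at backward time also naturally covers the case where a model returns no loss at all (e.g. during inference), since backward() is simply never called.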

@g-karthik left a comment

Haven't looked at the tests closely -- the core changes look good to me!

@ShadenSmith
Contributor

@tjruwase I think you will need to tell DeepSpeed to update the DeepSpeedExamples submodule in order to incorporate your SQuAD fix from last night.

You can do that from your branch:

git submodule update --remote DeepSpeedExamples

Then commit that update.

@ShadenSmith
Contributor

@tjruwase would you do me a favor and also enable the Megatron tests in your branch for now? Megatron is disabled by default until I get the nightly tests going; those lines are just commented out in tests/model/run_sanity_check.py. I'm wondering if we need to account for the new loss scaling in those tests too.

@tjruwase
Contributor Author

@ShadenSmith Thanks for the guidance. I have updated the DeepSpeedExamples submodule and enabled Megatron in the model tests. Now run_sanity_check passes. For some reason, GitHub has not received notification of the model tests passing.

@tjruwase tjruwase merged commit 53c73fe into master Mar 27, 2020
kouml pushed a commit to kouml/DeepSpeed that referenced this pull request Apr 3, 2020
* Push to remote

* Correctly handle multi-output models by doing loss scaling in backward();
  unit tests for multi-output models

* Fix formatting issues

* Formatting issues fix

* Fix formatting

* Update DeepSpeedExamples submodule;
  enable Megatron model tests
jeffra pushed a commit to jeffra/DeepSpeed that referenced this pull request Apr 19, 2021
@jeffra jeffra deleted the olruwase/scale_none_loss branch September 24, 2021 04:41
Development

Successfully merging this pull request may close these issues.

Handle None in loss scaling ("DeepSpeed assumes model returns just one variable: loss")

5 participants