Add OSS Train Loop #563

Closed
wants to merge 1 commit into from

@dahsh (Contributor) commented Jun 30, 2022

Summary:

  • Add a simple train loop example to the torchdata examples, using DataLoader2
    (DLv2) with a toy model. It demonstrates DataLoader2 usage with a train step
    (forward + backward + optimize) inside a multi-epoch train loop.

  • This basic example skips dataset and datapipe creation and focuses only on
    how the train loop uses DLv2. More examples will follow to showcase the
    advantages:

(1) Using DLv2 with popular open source datasets.
(2) Integrating datasets/datapipes with different reading services.
(3) Using popular open source datasets in E2E training.
(4) Datapipe manipulation, for example batch, collate, and map.
(5) Distributed usage, with examples of features such as sharding_filter for sharding.
(6) Eventually adding these examples to the PyTorch tutorials.
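
For readers who want the shape of the loop without opening the diff, here is a minimal, dependency-free sketch. It is not the code added by this PR: `ToyLinearModel`, `make_batches`, and `train` are hypothetical stand-ins, the gradients are computed by hand instead of by autograd, and the plain list of batches only plays the role that a DataLoader2 instance would play (something the loop can iterate once per epoch).

```python
import random


class ToyLinearModel:
    """Hypothetical toy model y = w * x + b (stand-in for a torch module)."""

    def __init__(self):
        self.w = 0.0
        self.b = 0.0

    def forward(self, xs):
        return [self.w * x + self.b for x in xs]


def mse_loss_and_grads(model, xs, ys):
    """Forward pass plus hand-written backward pass for mean squared error."""
    preds = model.forward(xs)
    errors = [p - y for p, y in zip(preds, ys)]
    n = len(xs)
    loss = sum(e * e for e in errors) / n
    grad_w = sum(2 * e * x for e, x in zip(errors, xs)) / n
    grad_b = sum(2 * e for e in errors) / n
    return loss, grad_w, grad_b


def make_batches(samples, batch_size):
    """Assumption: a re-iterable list of batches stands in for the data loader."""
    return [samples[i:i + batch_size] for i in range(0, len(samples), batch_size)]


def train(model, loader, epochs, lr):
    """Run the train step over every batch, once per epoch."""
    epoch_losses = []
    for _ in range(epochs):
        total, count = 0.0, 0
        for batch in loader:
            xs = [x for x, _ in batch]
            ys = [y for _, y in batch]
            # forward + backward
            loss, grad_w, grad_b = mse_loss_and_grads(model, xs, ys)
            # optimize (plain SGD update)
            model.w -= lr * grad_w
            model.b -= lr * grad_b
            total += loss
            count += 1
        epoch_losses.append(total / count)
    return epoch_losses


# Synthetic data drawn from y = 2x + 1, seeded so runs are repeatable.
rng = random.Random(0)
samples = [(x, 2.0 * x + 1.0) for x in (rng.uniform(-1.0, 1.0) for _ in range(64))]

loader = make_batches(samples, batch_size=8)
model = ToyLinearModel()
losses = train(model, loader, epochs=50, lr=0.1)
```

In a torch-based version, the hand-written gradient and update lines would typically become `loss.backward()` and `optimizer.step()`, and `loader` would be the DLv2 object built from a datapipe.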

Reviewed By: ejguan

Differential Revision: D37366122

@facebook-github-bot added the CLA Signed and fb-exported labels Jun 30, 2022
@facebook-github-bot (Contributor)

This pull request was exported from Phabricator. Differential Revision: D37366122

Summary:
Pull Request resolved: pytorch#563

fbshipit-source-id: 38863b70d5c8e99928ccbfcd8aa4e50de4d536f5