Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[Big deprecation] Introduces a DataLoaderConfig #2441

Merged
merged 12 commits into from
Feb 14, 2024
Merged

Conversation

muellerzr
Copy link
Collaborator

@muellerzr muellerzr commented Feb 13, 2024

What does this PR do?

As we prepare for 1.0.0, this PR introduces a large deprecation to some of the arguments in the Accelerator, wherein all arguments related to the configuration of the DataLoader are now sent through the DataLoaderConfig. After this PR is merged, we will aim for a 2-release gap between this and 1.0.0 (by May at the latest), as this is a very large (and potentially scary) change for users.

The reasoning is to make it easier for integrations like transformers (and in the future accelerate in general) to configure seperate entities related to more config-heavy items in the accelerator more manageable, and leave the base params in the Accelerator (without config) to be true core important items.

Mapping of old to new:

-accelerator = Accelerator(split_batches=True, dispatch_batches=True)
+dl_config = DataLoaderConfig(split_batches=True, dispatch_batches=True)
+accelerator = Accelerator(dataloader_config=dl_config)

Before submitting

  • This PR fixes a typo or improves the docs (you can dismiss the other checks if that's the case).
  • Did you read the contributor guideline,
    Pull Request section?
  • Was this discussed/approved via a Github issue or the forum? Please add a link
    to it if that's the case.
  • Did you make sure to update the documentation with your changes? Here are the
    documentation guidelines, and
    here are tips on formatting docstrings.
  • Did you write any new necessary tests?

Who can review?

Anyone in the community is free to review the PR once the tests have passed. Feel free to tag
members/contributors who may be interested in your PR.

@SunMarc @BenjaminBossan

@HuggingFaceDocBuilderDev

The docs for this PR live here. All of your documentation changes will be reflected on that endpoint. The docs are available until 30 days after the last update.

Copy link
Member

@BenjaminBossan BenjaminBossan left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In general, this looks good. I have some minor comments and then a discussion about detecting the deprecated arguments, where I'm unsure what the best course of action is. Please take a look.

docs/source/quicktour.md Outdated Show resolved Hide resolved
src/accelerate/accelerator.py Show resolved Hide resolved
src/accelerate/accelerator.py Outdated Show resolved Hide resolved
src/accelerate/accelerator.py Outdated Show resolved Hide resolved
tests/test_accelerator.py Show resolved Hide resolved
Copy link
Member

@BenjaminBossan BenjaminBossan left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I found another potential issue with backwards compatibility, please check.

docs/source/quicktour.md Outdated Show resolved Hide resolved
src/accelerate/accelerator.py Show resolved Hide resolved
src/accelerate/accelerator.py Show resolved Hide resolved
Copy link
Member

@BenjaminBossan BenjaminBossan left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good now from my side, thanks for working on this refactor.

tests/test_accelerator.py Outdated Show resolved Hide resolved
Copy link
Member

@SunMarc SunMarc left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thx for your work !

self.dataloader_config.use_seedable_sampler = use_seedable_sampler
if len(deprecated_dl_args) > 0:
values = ", ".join([f"{k}={v}" for k, v in deprecated_dl_args.items()])
warnings.warn(
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We could maybe use logger.warning so that users have more control of their logs.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In this case we don't want them to mute that as these deprecations aren't ones that will exist for many, many months. They'll exist for a very short (relative) time. We have this for the other 1.0 warnings as well :)

(Just checked, same with transformers)

@@ -55,6 +56,43 @@ def parameterized_custom_name_func(func, param_num, param):


class AcceleratorTester(AccelerateTestCase):
# Should be removed after 1.0.0 release
def test_deprecated_values(self):
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nice !

@muellerzr muellerzr merged commit 164193f into main Feb 14, 2024
25 checks passed
@muellerzr muellerzr deleted the dataloader-config branch February 14, 2024 18:26
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants