
🚨🚨🚨Move to using tags rather than latest for docker images and consolidate image repos 🚨 🚨🚨 #2554

Merged
merged 14 commits into from
Mar 18, 2024

Conversation

muellerzr
Collaborator

@muellerzr muellerzr commented Mar 14, 2024

What does this PR do?

Rather than having huggingface/accelerate-gpu, huggingface/accelerate-cpu, plus a new repository for each new env we add (e.g. soon to be XLA and XLA_gpu), this PR centralizes all our docker images in huggingface/accelerate and uses the right tags to make discovery and management easier.

This PR also includes a README in /docker which mentions what images are available and how to run things.

The Pros:

  • Easier discoverability of images based on tags, rather than needing to know the full repo
  • Easier interoperability and debugging, as images now have nightly tags associated with the date they were run
  • Allows for easier fine-grained tagging and build management later on (such as DeepSpeed-specific builds, XLA-specific builds, etc., which can now be pushed as a separate tag instead of needing to make a ton of new repos)

The Cons:

  • Downstream users using huggingface/accelerate-cpu or huggingface/accelerate-gpu will not receive updated images.

Not sure if there's a good way to alert users to the deprecation inside Docker, so we'll just need to do so thoroughly in comms.

The conversion:

  • huggingface/accelerate-gpu:latest -> huggingface/accelerate:gpu-nightly
  • huggingface/accelerate-cpu:latest -> huggingface/accelerate:cpu-nightly
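In practice the migration is a one-line change to whatever pulls the image. A minimal sketch (the loop just prints the new references; the `docker pull` lines in the comments assume the conversion above):

```shell
# Migration sketch: only the image reference changes.
# Before: docker pull huggingface/accelerate-gpu:latest
# After:  docker pull huggingface/accelerate:gpu-nightly
# Print the new consolidated-repo references for each flavor:
for flavor in gpu cpu; do
  echo "huggingface/accelerate:${flavor}-nightly"
done
```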

New image tags

Nightlies are now built with versioning in mind, so that individuals (or CIs) can roll back easily. This helps us track down whether a new dependency got released, when it happened, and what was new in the build.

  • huggingface/accelerate:{gpu,cpu}-nightly
  • huggingface/accelerate:{gpu,cpu}-nightly-{YYYY}-{MM}-{DD}
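For CI pinning, the dated tags can be composed mechanically. A sketch, assuming hypothetical `FLAVOR`/`DATE` variables and a known-good build date:

```shell
# Pin a job to a dated nightly so it can be rolled back deterministically.
# Tag format: huggingface/accelerate:{gpu,cpu}-nightly-YYYY-MM-DD
FLAVOR="gpu"
DATE="2024-03-14"    # a known-good build date (hypothetical example)
IMAGE="huggingface/accelerate:${FLAVOR}-nightly-${DATE}"
echo "${IMAGE}"
# docker pull "${IMAGE}"    # uncomment to actually pull the pinned image
```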

Before submitting

  • This PR fixes a typo or improves the docs (you can dismiss the other checks if that's the case).
  • Did you read the contributor guideline,
    Pull Request section?
  • Was this discussed/approved via a GitHub issue or the forum? Please add a link
    to it if that's the case.
  • Did you make sure to update the documentation with your changes? Here are the
    documentation guidelines, and
    here are tips on formatting docstrings.
  • Did you write any new necessary tests?

Who can review?

Anyone in the community is free to review the PR once the tests have passed. Feel free to tag
members/contributors who may be interested in your PR.

@ydshieh @BenjaminBossan

@muellerzr
Collaborator Author

New repo is located here: https://hub.docker.com/r/huggingface/accelerate/tags

Currently looking at migrating the README.md over to the new repo.

@HuggingFaceDocBuilderDev

The docs for this PR live here. All of your documentation changes will be reflected on that endpoint. The docs are available until 30 days after the last update.

Contributor

@ydshieh ydshieh left a comment


LGTM, thanks.

Huggingface/accelerate-gpu:latest -> huggingface/accelerate:gpu-latest

I think we don't have any tags containing -latest anymore.

Regarding how long to keep the images, I still think it's better to talk to (or at least inform) the infra team. I'm not sure how large an accelerate image is, but transformers images are around ~10G. (And I don't know if we will run into problems when we have hundreds of such images 😆)

@muellerzr
Collaborator Author

I think we don't have any tags containing -latest anymore.

Thanks, yup, adjusted: it's -nightly now.

Regarding how long to keep the images, I still think it's better to talk to (or at least inform) the infra team. I'm not sure how large an accelerate image is, but transformers images are around ~10G. (And I don't know if we will run into problems when we have hundreds of such images 😆)

Definitely the next step. Accelerate images on Docker Hub are also around 10 GB.

@muellerzr muellerzr changed the title Move to using tags rather than latest for docker images and consolidate image repos 🚨🚨🚨Move to using tags rather than latest for docker images and consolidate image repos 🚨 🚨🚨 Mar 14, 2024
Member

@BenjaminBossan BenjaminBossan left a comment


The reasons for these changes make sense to me. As mentioned, the question about retention should probably be addressed.

Regarding the deprecation, there doesn't seem to be an official way: docker/hub-feedback#907. Not sure what can be done, really, except for mentioning it in the release notes.

@muellerzr
Collaborator Author

Not sure what can be done, really, except for mentioning it in the release notes.

@BenjaminBossan I more or less plan on doing the following:

  1. The official README in the repos will be updated with a prominent deprecation notice and instructions on what to migrate to
  2. We'll publish one more release of the images that prints a statement on entry to the container saying the image is deprecated and users should migrate (NVIDIA does this, so it stands to reason we can too)
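The warning in step 2 could look something like the following entrypoint sketch (hypothetical script, not the actual implementation): it prints the notice to stderr on every container start and then hands off to the original command, similar to what NVIDIA's deprecated CUDA images do.

```shell
#!/bin/sh
# Hypothetical entrypoint for the final deprecated-image release:
# warn on every container start, then exec the requested command.
echo "DEPRECATION WARNING: huggingface/accelerate-gpu is no longer updated." >&2
echo "Please switch to huggingface/accelerate:gpu-nightly." >&2
exec "$@"
```

Writing to stderr keeps the warning out of piped stdout, and `exec "$@"` preserves the container's original command and signal handling.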

Next steps from here are more fine-grained builds as needed internally and externally

  • A DeepSpeed specific build (tag: accelerate:cuda-deepspeed)
  • A TransformerEngine + MS-AMP specific build (tag: accelerate:cuda-fp8)

@muellerzr muellerzr merged commit 2ad42e7 into main Mar 18, 2024
25 checks passed
@muellerzr muellerzr deleted the use-accelerate-repo branch March 18, 2024 13:35