[docs] Doc sprint #3099
Conversation
The docs for this PR live here. All of your documentation changes will be reflected on that endpoint. The docs are available until 30 days after the last update.
Nice! 🤩
cc @SunMarc for the inference related docs
# Utilities for Fully Sharded Data Parallelism
# Fully Sharded Data Parallel utilities

## enable_fsdp_ram_efficient_loading
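As a quick illustration of what calling this utility typically looks like, here is a minimal sketch; it assumes `enable_fsdp_ram_efficient_loading` is exported from `accelerate.utils`, takes no arguments, and only flips the RAM-efficient-loading flag before the model is instantiated:

```python
# Minimal sketch (assumptions noted above): turn on RAM-efficient loading
# before creating the model that will later be wrapped by FSDP.
from accelerate.utils import enable_fsdp_ram_efficient_loading
from transformers import AutoModelForCausalLM

enable_fsdp_ram_efficient_loading()  # assumed to set the FSDP_RAM_EFFICIENT_LOADING env flag
model = AutoModelForCausalLM.from_pretrained("gpt2")
```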
Wish we didn't have to do this everywhere but I suppose can't be helped :/
Thanks for this sprint @stevhliu! The big model inference part is a lot nicer now! I left a few suggestions
The general idea with pipeline parallelism is: say you have 4 GPUs and a model big enough that it can be *split* across four GPUs using `device_map="auto"`. With this method you can send in 4 inputs at a time (for example here, though any amount works) and each model chunk will work on an input, then receive the next input once the prior chunk has finished, making it *much* more efficient **and faster** than the method described earlier. Here's a visual taken from the PyTorch repository:
![PiPPy example](https://camo.githubusercontent.com/681d7f415d6142face9dd1b837bdb2e340e5e01a58c3a4b119dea6c0d99e2ce0/68747470733a2f2f692e696d6775722e636f6d2f657955633934372e706e67)
To illustrate how you can use this with Accelerate, we have created an [example zoo](https://github.com/huggingface/accelerate/tree/main/examples/inference) showcasing a number of different models and situations. In this tutorial, we'll show this method for GPT2 across two GPUs.
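To give a feel for what the GPT2 case can look like, here is a rough sketch modeled on the scripts in the example zoo; the exact `prepare_pippy` signature, the `split_points`/`example_args` arguments, and the launch command are assumptions, so treat the zoo scripts as the canonical version:

```python
# Rough two-GPU pipeline-parallel inference sketch for GPT2, modeled on the
# example zoo (argument names here are assumptions, not the canonical API).
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer
from accelerate import prepare_pippy

model = AutoModelForCausalLM.from_pretrained("gpt2")
model.eval()

tokenizer = AutoTokenizer.from_pretrained("gpt2")
inputs = tokenizer("Pipeline parallelism lets each GPU work on a chunk", return_tensors="pt")

# Trace the model with an example input and split it into stages, one per GPU.
model = prepare_pippy(model, split_points="auto", example_args=(inputs["input_ids"],))

with torch.no_grad():
    output = model(inputs["input_ids"])
```

A script like this is started with a distributed launcher (for example `accelerate launch` with one process per GPU) so that each process hosts one stage and inputs stream through the stages as described above.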
Before you proceed, please make sure you have the latest pippy installed by running the following:
Before you proceed, please make sure you have the latest PyTorch version installed by running the following:
I also updated the installation instructions here, let me know if it's incorrect!
Nope it's correct :)
in a follow-up after this I'll fix the PiPPy example chunk so we don't have a merge conflict :)
Addresses the docs improvements internally discussed:
- `<Note>` tags
- toctree with the header referenced in the doc