
Add pipeline parallel plan to PretrainedConfig and PreTrainedModel #36091

Merged
merged 25 commits into huggingface:main from add-pp-plan on Feb 12, 2025

Conversation

@hmellor (Member) commented on Feb 7, 2025

What does this PR do?

Fixes # (issue)

Before submitting

  • This PR fixes a typo or improves the docs (you can dismiss the other checks if that's the case).
  • Did you read the contributor guideline,
    Pull Request section?
  • Was this discussed/approved via a GitHub issue or the forum? Please add a link
    to it if that's the case.
  • Did you make sure to update the documentation with your changes? Here are the
    documentation guidelines, and
    here are tips on formatting docstrings.
  • Did you write any new necessary tests?

Who can review?

Anyone in the community is free to review the PR once the tests have passed. Feel free to tag
members/contributors who may be interested in your PR.

@NouamaneTazi (Member) left a comment

Finally PP is becoming native in transformers!! 🔥🔥🔥

For the attr to add, I was thinking of something more like:

base_model_pp_plan = OrderedDict([
    ("embed_tokens", {"input_tensors": [...], "output_tensors": [...]}),
    ("some_other_op_that_comes_before_the_first_layer", {"input_tensors": [...], "output_tensors": [...]}),
    ("layers.*", {"input_tensors": [...], "output_tensors": [...]}), # maybe leave ".*" to highlight it's a modulelist
    ("norm", {"input_tensors": [...], "output_tensors": [...]}),
    ("cast_to_fp32", {"input_tensors": [...], "output_tensors": [...]}),
    ("some_other_op_that_comes_after_the_last_layer", {"input_tensors": [...], "output_tensors": [...]})
])

And in LlamaForCausalLM

_pp_plan = base_model_pp_plan.insert(-1, ("lm_head", {"input_tensors": [...], "output_tensors": [...]}))

Or, for example, DeepSeek could add multiple heads, etc.
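
A side note on the `.insert(-1, ...)` line above: `OrderedDict` has no `insert` method, so the extension would need to go through a list. A minimal sketch of the intent, with purely hypothetical tensor names standing in for the `[...]` placeholders:

from collections import OrderedDict

# Hypothetical base plan; the tensor names are illustrative only and stand in
# for the [...] placeholders in the suggestion above.
base_model_pp_plan = OrderedDict([
    ("embed_tokens", {"input_tensors": ["input_ids"], "output_tensors": ["inputs_embeds"]}),
    ("layers.*", {"input_tensors": ["hidden_states"], "output_tensors": ["hidden_states"]}),
    ("norm", {"input_tensors": ["hidden_states"], "output_tensors": ["hidden_states"]}),
    ("cast_to_fp32", {"input_tensors": ["hidden_states"], "output_tensors": ["hidden_states"]}),
])

# OrderedDict has no `insert`, so rebuild from a list to place `lm_head`
# before the final entry, mirroring the `insert(-1, ...)` intent.
items = list(base_model_pp_plan.items())
items.insert(-1, ("lm_head", {"input_tensors": ["hidden_states"], "output_tensors": ["logits"]}))
_pp_plan = OrderedDict(items)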

@NouamaneTazi NouamaneTazi self-assigned this Feb 7, 2025
@HuggingFaceDocBuilderDev

The docs for this PR live here. All of your documentation changes will be reflected on that endpoint. The docs are available until 30 days after the last update.

@ArthurZucker (Collaborator) left a comment

Super nice for an initial design!

@hmellor (Member, Author) commented on Feb 8, 2025

I've simplified the schema as suggested. Perhaps we could also add an Enum so that users can index these schemas without needing to know that [0] means inputs and [1] means outputs?
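
For illustration, a minimal sketch of what the simplified tuple schema plus such an Enum could look like (the tensor and attribute names here are assumptions for the example, not necessarily the exact merged API):

from collections import OrderedDict
from enum import IntEnum

# IntEnum so members can index tuples directly, e.g. plan["layers"][PipelineParallel.inputs]
class PipelineParallel(IntEnum):
    inputs = 0
    outputs = 1

# Simplified schema: each entry maps a module name to an (input_names, output_names) tuple.
base_model_pp_plan = OrderedDict([
    ("embed_tokens", (["input_ids"], ["inputs_embeds"])),
    ("layers", (["hidden_states", "attention_mask"], ["hidden_states"])),
    ("norm", (["hidden_states"], ["hidden_states"])),
])

layer_inputs = base_model_pp_plan["layers"][PipelineParallel.inputs]    # ["hidden_states", "attention_mask"]
layer_outputs = base_model_pp_plan["layers"][PipelineParallel.outputs]  # ["hidden_states"]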

@hmellor hmellor marked this pull request as ready for review February 10, 2025 14:56
@ArthurZucker (Collaborator) left a comment

Super nice!
Maybe the only thing missing is a small test for the models that do support PP!
This could, for example, use something like this:

import torch
from torch.distributed.pipelining import pipeline, SplitPoint

# An example micro-batch input
x = torch.LongTensor([1, 2, 4, 5])

# `mod` is the model being split into pipeline stages
pipe = pipeline(
    module=mod,
    mb_args=(x,),
    split_spec={
        "layers.1": SplitPoint.BEGINNING,
    }
)

with something similar to what we are doing for TP! Though this could be in another PR!
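
For reference, a rough sketch of what such a follow-up test could look like end to end, splitting a causal LM into two stages at the midpoint of its decoder layers. The checkpoint name and module paths are assumptions, and tracing a full transformers model with torch.distributed.pipelining may need extra care:

import torch
from torch.distributed.pipelining import SplitPoint, pipeline
from transformers import AutoModelForCausalLM

# Hypothetical checkpoint; any decoder-only model with `model.layers` would do.
model = AutoModelForCausalLM.from_pretrained("meta-llama/Llama-3.2-1B")
num_layers = model.config.num_hidden_layers

# Split into two stages at the middle decoder layer.
split_spec = {f"model.layers.{num_layers // 2}": SplitPoint.BEGINNING}

# An example micro-batch of token ids.
x = torch.randint(0, model.config.vocab_size, (1, 8))

pipe = pipeline(module=model, mb_args=(x,), split_spec=split_spec)
print(pipe)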

@NouamaneTazi (Member) left a comment

Very clean! Left a minor comment to use a tuple instead of a list.
Thanks!

Comment on lines +1327 to +1329
# be indexed using the `PipelineParallel` enum as follows:
# - `_pp_plan["layers"][PipelineParallel.inputs]`
# - `_pp_plan["layers"][PipelineParallel.outputs]`

Very nice!

@hmellor (Member, Author) commented on Feb 10, 2025

Thanks both for the reviews!

Since this PR doesn't add any pipeline logic yet, let's save testing for a follow-up PR.

@ArthurZucker (Collaborator) left a comment

Nice! I'll try to work on the follow-up PR ASAP to future-proof the design, but LGTM.

@ArthurZucker ArthurZucker merged commit f5fff67 into huggingface:main Feb 12, 2025
22 of 25 checks passed
zucchini-nlp pushed a commit to zucchini-nlp/transformers that referenced this pull request Feb 12, 2025
Add pipeline parallel plan to `PretrainedConfig` and `PreTrainedModel` (huggingface#36091)

* Add `base_model_pp_plan` to `PretrainedConfig`
* Add `_pp_plan` to `PreTrainedModel`
* Add both to Llama for testing
* Fix type error
* Update to suggested schema
* `_pp_plan` keys are not patterns
* Simplify schema
* Fix typing error
* Update input name for Llama
* Add pp plan to Aria
* Add pp plan to Bamba
* Add pp plan to Cohere 1 & 2
* Add pp plan to diffllama and emu3
* Add pp plan to Gemma 1 & 2
* Add pp plan to GLM and GPT NeoX
* Add pp plan to Granite and Helium
* Add pp plan to Mistral and Mixtral
* Add pp plan to OLMo 1 & 2
* Add pp plan to Phi and Phi 3
* Add pp plan for Qwen 2, 2 MoE, 2 VL and 2.5 VL
* Add pp plan for Starcoder 2
* Add enum for accessing inputs and outputs
* Update type hints to use tuples
* Change outer list to tuple

---------

Signed-off-by: Harry Mellor <19981378+hmellor@users.noreply.github.com>
@hmellor hmellor deleted the add-pp-plan branch February 13, 2025 18:50
sbucaille pushed a commit to sbucaille/transformers that referenced this pull request Feb 16, 2025
Add pipeline parallel plan to `PretrainedConfig` and `PreTrainedModel` (huggingface#36091)