
Add arctic model support by adding w2 to all_reduce #6856

Merged · loadams merged 7 commits into deepspeedai:master from pi314ever:arctic-enabling-upstream on Dec 18, 2024

Conversation

pi314ever
Contributor

As title says.

The default behavior of the Arctic model produces shape issues with AutoTP because its MLP layer computes `w2 * act(w1 * w3)`. The method used to fix Mixtral-8x7b in #5257 does not work here, since Arctic's MLP is also used inside a ModuleList for the MoE; the MLP weights end up hidden behind individual experts as layers `#.w#`, which the #5257 fix does not catch. This PR adds the check directly within `replace`, where the actual layer names can be inspected for the `w2` key and patched with `all_reduce`.
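A minimal sketch of the idea (hypothetical helper names, not the actual DeepSpeed implementation): scan the module names for the `w2` projection, including names nested under MoE experts, and all-reduce its output so the row-parallel partial sums are combined.

```python
# Hypothetical sketch only; `patch_w2_with_all_reduce` is not a DeepSpeed API.
import torch
import torch.distributed as dist


def patch_w2_with_all_reduce(model: torch.nn.Module) -> None:
    """Register an all-reduce on the output of every module named `w2`."""

    def _all_reduce_output(module, inputs, output):
        if dist.is_initialized() and dist.get_world_size() > 1:
            dist.all_reduce(output)  # sum partial results across TP ranks
        return output

    for name, module in model.named_modules():
        # Expert MLP weights show up as e.g. "...experts.3.w2", so match on the
        # last component of the dotted name rather than a fixed attribute path.
        if name.split(".")[-1] == "w2":
            module.register_forward_hook(_all_reduce_output)
```

Using a forward hook keyed on the layer name (rather than the module type) is what lets the patch reach the `w2` weights hidden inside each expert of the ModuleList.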

Signed-off-by: Daniel Huang <daniel1.huang@intel.com>
@pi314ever pi314ever force-pushed the arctic-enabling-upstream branch from 2c2084b to 96eb813 Compare December 11, 2024 23:24
@pi314ever
Contributor Author

@microsoft-github-policy-service agree company="Intel"

@loadams loadams requested review from jeffra and removed request for awan-10 December 12, 2024 00:38
@jeffra
Collaborator

jeffra commented Dec 16, 2024

@RezaYazdaniAminabadi @sfc-gh-reyazda can you take a look?

@loadams loadams merged commit 0b25630 into deepspeedai:master Dec 18, 2024
10 of 11 checks passed
siqi654321 pushed a commit to siqi654321/DeepSpeed that referenced this pull request Feb 7, 2025
traincheck-team pushed a commit to traincheck-team/DeepSpeed that referenced this pull request Feb 9, 2025