[`NLLB-MoE`] Fix NLLB MoE 4bit inference #27012

younesbelkada · 2023-10-23T10:21:54Z

What does this PR do?

The hidden states get silently cast to uint8, leading to the error described in #26898.

The check `and self.fc2.weight.dtype != torch.int8` is not sufficient to cover 4-bit models: for these models the weights are stored in uint8. Adding an extra condition that also excludes uint8 fixes the inference issue.
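As a rough illustration of the condition described above, here is a minimal sketch, not the exact code from the diff. The helper name `cast_for_fc2` and the constant `QUANTIZED_STORAGE_DTYPES` are hypothetical and exist only for this example. The sketch assumes that quantized layers expose their weights as int8 (8-bit) or uint8 (4-bit) storage, in which case the hidden states must keep their floating-point dtype.

```python
import torch

# Assumption for this sketch: dtypes used purely as storage by quantized
# layers (int8 for 8-bit weights, uint8 for 4-bit weights).
QUANTIZED_STORAGE_DTYPES = (torch.int8, torch.uint8)


def cast_for_fc2(hidden_states: torch.Tensor, fc2_weight: torch.Tensor) -> torch.Tensor:
    """Hypothetical helper: match hidden_states to the fc2 weight dtype,
    except when that dtype is only a quantized storage format."""
    if (
        isinstance(fc2_weight, torch.Tensor)
        and hidden_states.dtype != fc2_weight.dtype
        # Checking only torch.int8 here would let uint8 (4-bit) weights
        # through, and the activations would be cast to uint8.
        and fc2_weight.dtype not in QUANTIZED_STORAGE_DTYPES
    ):
        hidden_states = hidden_states.to(fc2_weight.dtype)
    return hidden_states


# Usage: float16 activations stay float16 when the weight is uint8 storage.
activations = torch.ones(2, 4, dtype=torch.float16)
packed_weight = torch.zeros(4, 4, dtype=torch.uint8)
result = cast_for_fc2(activations, packed_weight)
```

With a regular floating-point weight (for example float32), the same helper still casts the activations to the weight dtype, so the non-quantized path is unchanged.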

cc @ArthurZucker

HuggingFaceDocBuilderDev · 2023-10-23T10:40:18Z

The documentation is not available anymore as the PR was closed or merged.

ArthurZucker

Interesting! Thanks for the fix

CAH9487 · 2023-10-23T13:06:07Z

Thanks!

fix NLLB MoE 4bit

a74de2a

younesbelkada requested a review from ArthurZucker October 23, 2023 10:34

Merge remote-tracking branch 'upstream/main' into fix-nllb-moe

eb2c545

ArthurZucker approved these changes Oct 23, 2023

younesbelkada merged commit 244a53e into huggingface:main Oct 23, 2023
18 checks passed

younesbelkada deleted the fix-nllb-moe branch October 23, 2023 12:54

staghado pushed a commit to staghado/transformers that referenced this pull request Oct 24, 2023

[NLLB-MoE] Fix NLLB MoE 4bit inference (huggingface#27012)

fa36ec1

fix NLLB MoE 4bit

EduardoPach pushed a commit to EduardoPach/transformers that referenced this pull request Nov 19, 2023

[NLLB-MoE] Fix NLLB MoE 4bit inference (huggingface#27012)

d56d9bf

fix NLLB MoE 4bit
