Mixtral inference breaks when `output_router_logits=True` · Issue #29087
Comments
When running inference you should set `output_router_logits=False`.
Thanks @ArthurZucker. I believe it is a bit hard to spot the correct behaviour from the docs, so I was wondering: if inference always requires turning this off in the config, maybe it should be enforced when […]
Actually this should be enforced when calling […]
Sounds good, I'll happily take care of it @ArthurZucker. Just to make sure: do you think it's better to raise an assertion when Mixtral is used for inference with that configuration, or to raise a warning and ignore it (even if the user set it to True)? I believe at that stage the first option should be preferred, and the second scenario should be handled earlier (maybe when setting the model to inference mode?).
No, I think we should always set it; this is the expected API for […]
System Info

- `transformers` version: 4.38.0.dev0
- model loaded with `device_map="auto"`
Who can help?
@ArthurZucker @gante
Information

Tasks

- An officially supported task in the `examples` folder (such as GLUE/SQuAD, ...)

Reproduction
The snippet […] produces […]
Important details:

- model: Mixtral-8x7B-v0.1, fine-tuned using axolotl (different weights but same shapes)

Expected behavior
`output_router_logits` (see the official docs) should have its usage limited to training (as described here). I believe this was set during training and then stored into the checkpoints; disabling it in the configs produces the expected results. Should this be enforced when `model.eval()` is called?