Nomic Embed Text V2 with Mixture-of-Experts (MoE) architecture #12466
base: master
Conversation
- Adds MoE-based embedding model supporting multilingual embeddings.
- Selects architecture variant based on hyperparameter detection (MoE layers).
- Removes unnecessary subclass initialization checks for clarity.

https://www.nomic.ai/blog/posts/nomic-embed-text-v2

Signed-off-by: Adam Treat <treat.adam@gmail.com>
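For context, the variant selection described above could look roughly like the sketch below. This is only an illustration: the hparam key `moe_every_n_layers` and the architecture strings are assumptions, not copied from the PR diff.

```python
def pick_nomic_arch(hparams: dict) -> str:
    """Choose the architecture variant from the HF config (illustrative sketch).

    The key "moe_every_n_layers" and the returned names are assumptions,
    not taken from the PR.
    """
    if hparams.get("moe_every_n_layers", 0) > 0:
        return "nomic-bert-moe"  # MoE variant (Nomic Embed Text V2)
    return "nomic-bert"          # dense variant
```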
@@ -702,6 +695,8 @@ def get_vocab_base_pre(self, tokenizer) -> str:
        if chkhsh == "ccc2ef013c104be7bae2965776d611e1d7a8a2a9c547dd93a682c9a9fc80352e":
            # ref: https://huggingface.co/Xenova/gpt-4o
            res = "gpt-4o"
        if chkhsh == "a81863d07e75497e2194eb1a1574d5e5cd4d5f85a87a0728b922bf2bed6fb327":
            res = "bert"
The newly added tokenizer is `nomic-embed-text-v2-moe`, not `bert`. Is this expected?
Also, this list is auto-generated, so please make sure not to modify it manually.
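For reference, entries in this list are emitted by `convert_hf_to_gguf_update.py`. A regenerated branch for this model would presumably look something like the sketch below; the `res` string and the HF reference are assumptions based on the reviewer's comment, not the actual generated output.

```python
# Hypothetical auto-generated entry, produced by convert_hf_to_gguf_update.py
# rather than edited by hand (res string assumed from the reviewer's comment):
if chkhsh == "a81863d07e75497e2194eb1a1574d5e5cd4d5f85a87a0728b922bf2bed6fb327":
    # ref: https://huggingface.co/nomic-ai/nomic-embed-text-v2-moe
    res = "nomic-embed-text-v2-moe"
```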
if "mlp.experts.mlp.w1" in name: | ||
data_torch = data_torch.view(self.hparams["num_experts"], self.hparams["n_inner"], self.hparams["n_embd"]) | ||
return [(self.map_tensor_name(name) + ".weight", data_torch)] |
Will this work? There is no need to return here: `map_tensor_name` will append `.weight` if the given original name also has it.

Suggested change:
    - return [(self.map_tensor_name(name) + ".weight", data_torch)]
    + name += ".weight"
Maybe I missed something, but `llm_build_bert` does not seem to support MoE, right? Should we also update the compute graph?
Working on tests to verify the accuracy of the model.
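To make the compute-graph question concrete, here is a conceptual NumPy sketch of the per-token MoE feed-forward step that the bert graph would additionally need. Top-k routing, expert count, and the activation are generic MoE assumptions, not llama.cpp code.

```python
import numpy as np

def moe_ffn(x, router_w, w_up, w_down, top_k=2):
    # x:        (n_embd,)             one token's hidden state
    # router_w: (n_expert, n_embd)    router / gating weights
    # w_up:     (n_expert, n_ff, n_embd)
    # w_down:   (n_expert, n_embd, n_ff)
    logits = router_w @ x
    probs = np.exp(logits - logits.max())
    probs /= probs.sum()                             # softmax over experts
    experts = np.argsort(-probs)[:top_k]             # route to the top-k experts
    weights = probs[experts] / probs[experts].sum()  # renormalize routing weights
    out = np.zeros_like(x)
    for w, e in zip(weights, experts):
        h = np.maximum(w_up[e] @ x, 0.0)             # placeholder activation (real model may differ)
        out += w * (w_down[e] @ h)                   # weighted expert contribution
    return out
```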