model: support arch DbrxForCausalLM
#6515
Commits on Apr 6, 2024
- 1d8de31
- ed582c1
- 3e3d2d1
- 3937100
- c0beb3c
- 0921033
- e4f8ee4
- a7f9a3e
- e3c1e81
- 0a35f58
  convert: dbrx: fix mixed up and down expert tensors
  llama: dbrx: review graph
- c8e6f90
- 916b918
- 03da419
- 76f266b
- 9c7dedb
- fe80898
- 4f12a58
- 6985629
- 7e7cd53
Commits on Apr 7, 2024
- 52c4033
- 06a59ab
- 305ac3b
- b6522a9
- dccb012
- 61be4b9
- 1fb6d95
- 200ce21
  model: dbrx: convert-hf-to-gguf.py fix fix ftype missing, fix tensor names does not suffix with .weight
- 9e17dad
- d7546fd
  llama: quantize: remove wrong look for tensor qkv name as it was badly missing the .weight suffix
- 3a9dc2e
  model: dbrx: convert-hf-to-gguf.py fix 'token_embd.weight' has wrong shape, fix special tokens
- 8154617
- 2449ef4
  llama: dbrx: no weight suffix in ffn_gate_exps, ffn_up_exps and ffn_down_exps. Output tensor not optional.
- 1bd9427
  llama: quantize: remove wrong look for tensor qkv name as it was badly missing the .weight suffix
  model: dbrx: convert to gguf force experts tensors to have .weight suffix
- e9987c6
- d151d8f
- f062b83
- dbfd591
- 7dd84b0
- c9bddbf
- e2c9199
- 50b4373
- 0ab1bae
- 830e46d
- 2897aa6
- 993f836
- b01b062
- 74e6d87
- f8f97e7
- 71f9e47
Commits on Apr 8, 2024
- 52c6276
- 8e22688
- 35dce3e
  llama: dbrx: rename tensor to actual meaning. Fix normalization in graph. Permute expert tensors to the llama.cpp layout
- 506cc2e
- eb0847e
- 81f308a
- 21fb24a
- f20c04f
- 48909ed
- 18a84fe
- 9968952
- e66f1e3
- f30a73b
- ea8b58c
- 55943a2
- c7b9a2e
- ac82aa0
Commits on Apr 9, 2024
- ac75fbd
  gguf-py: dbrx: reverse again the MOE tensors mapping:
  layer.ffn_up_exps -> Up-projection weights (w1)
  layer.ffn_gate_exps -> Gating weights (v1)
  layer.ffn_down_exps -> Down-projection weights (w2)
- e5631cf
Commits on Apr 10, 2024
- 6f813dc
- 74529e5
Commits on Apr 11, 2024
- 06527c6
Commits on Apr 12, 2024
- fc89fee
- bdc4efe
  Is silu activation function applied to MODEL_TENSOR.FFN_GATE_EXP here? If so, we must change this to w1 for DBRX. Each expert in DBRX has 3 linear layers: w1, v1 and w2. For an input tensor x, output from the expert layer would be (silu(x.w1_t) * x.v1_t) . w2_t). Same math is also used in mixtral, only difference being DBRX uses v1 instead of w3 in mixtral.
  Co-authored-by: Megha Agarwal <16129366+megha95@users.noreply.github.com>
- 542585f
  Is silu activation function applied to MODEL_TENSOR.FFN_GATE_EXP here? If so, we must change this to w1 for DBRX. Each expert in DBRX has 3 linear layers: w1, v1 and w2. For an input tensor x, output from the expert layer would be (silu(x.w1_t) * x.v1_t) . w2_t). Same math is also used in mixtral, only difference being DBRX uses v1 instead of w3 in mixtral.
  Co-authored-by: Megha Agarwal <16129366+megha95@users.noreply.github.com>
- ecbfb1b
  Wrong input was being fed to moe layer. This needs to be corrected
  Co-authored-by: Megha Agarwal <16129366+megha95@users.noreply.github.com>
- 647a11b
- 03bdc36
- 8e6758f
- f1256dc
  llama: rename build_moe to build_moe_ffn and fix grok is using gelu instead of silu. Do not pass too much time on this function as it will be replaced in #6505
- e517585
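The messages on commits bdc4efe and 542585f state the per-expert math as silu(x . w1^T) * (x . v1^T), followed by a projection through w2^T. As a reading aid, here is a minimal pure-Python sketch of that formula. The helper names (`silu`, `matvec_t`, `expert_ffn`) and the row-major weight shapes are assumptions made for illustration only; this is not the llama.cpp graph code or the converter's tensor layout, and it deliberately sticks to the names w1, v1 and w2 rather than guessing which GGUF tensor each one maps to.

```python
import math


def silu(z):
    """SiLU activation: z * sigmoid(z), written as z / (1 + exp(-z))."""
    return z / (1.0 + math.exp(-z))


def matvec_t(x, w):
    """Return x . w^T, where each row of w has the same length as x."""
    return [sum(xi * wi for xi, wi in zip(x, row)) for row in w]


def expert_ffn(x, w1, v1, w2):
    """One expert as described in the commit message.

    Assumed shapes (illustrative): x has length d, w1 and v1 have
    shape (d_ffn, d), and w2 has shape (d, d_ffn).
    """
    gate = [silu(g) for g in matvec_t(x, w1)]   # activated branch: silu(x . w1^T)
    up = matvec_t(x, v1)                        # linear branch: x . v1^T
    hidden = [g * u for g, u in zip(gate, up)]  # element-wise product
    return matvec_t(hidden, w2)                 # project back: hidden . w2^T


if __name__ == "__main__":
    # Toy sizes: d = 2, d_ffn = 1.
    x = [1.0, 2.0]
    print(expert_ffn(x, w1=[[1.0, 0.0]], v1=[[0.0, 1.0]], w2=[[1.0], [2.0]]))
```

The point the commit message makes is visible here: the activation is applied only to the w1 branch, so swapping w1 and v1 changes the result even though both have the same shape.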
Commits on Apr 13, 2024
- 9f77484