Commit 2ca7c43

ikawrakow (Iwan Kawrakow) authored and committed
Bitnet: use the fused mul-silu in the FFN network (#110)
I had forgotten that build_bitnet() does not use the standard llm_build_ffn function, so the fused mul-silu didn't get used for Bitnet when I added it to llm_build_ffn. This gives us another ~1% speedup for TG-128.

Co-authored-by: Iwan Kawrakow <iwan.kawrakow@gmail.com>
1 parent 1884411 commit 2ca7c43

1 file changed, +1 -6 lines changed

src/llama.cpp

Lines changed: 1 addition & 6 deletions
@@ -13954,12 +13954,7 @@ struct llm_build_context {
 
 cb(cur, "ffn_gate", il);
 
-
-// combine this with the above scale into ggml_scaled_silu
-cur = ggml_silu(ctx0, cur);
-cb(cur, "ffn_silu", il);
-
-cur = ggml_mul(ctx0, cur, tmp);
+cur = ggml_fused_mul_unary(ctx0, cur, tmp, GGML_UNARY_OP_SILU);
 cb(cur, "ffn_gate_par", il);
 
 cur = llm_build_norm(ctx0, cur, hparams,
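
For readers unfamiliar with the fusion, the following is a minimal standalone sketch of the idea, not the ggml implementation. It assumes the fused op computes silu(cur) * tmp element-wise, which is what the removed ggml_silu + ggml_mul pair did. The helper names silu_then_mul and fused_mul_silu are hypothetical and exist only for this illustration.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// SiLU activation: x * sigmoid(x), written as x / (1 + exp(-x)).
static float silu(float x) {
    return x / (1.0f + std::exp(-x));
}

// Unfused variant (hypothetical helper): two passes over the data and one
// intermediate buffer, mirroring a separate silu op followed by a mul op.
static std::vector<float> silu_then_mul(const std::vector<float>& gate,
                                        const std::vector<float>& up) {
    assert(gate.size() == up.size());
    std::vector<float> activated(gate.size());
    for (std::size_t i = 0; i < gate.size(); ++i) {
        activated[i] = silu(gate[i]);          // pass 1: activation
    }
    std::vector<float> out(gate.size());
    for (std::size_t i = 0; i < gate.size(); ++i) {
        out[i] = activated[i] * up[i];         // pass 2: element-wise product
    }
    return out;
}

// Fused variant (hypothetical helper): a single pass, no intermediate buffer.
static std::vector<float> fused_mul_silu(const std::vector<float>& gate,
                                         const std::vector<float>& up) {
    assert(gate.size() == up.size());
    std::vector<float> out(gate.size());
    for (std::size_t i = 0; i < gate.size(); ++i) {
        out[i] = silu(gate[i]) * up[i];        // activation and product together
    }
    return out;
}
```

Both helpers return the same values for the same inputs. The fused one reads and writes each element once, which is the kind of saving that could plausibly account for a small speedup such as the ~1% reported in the commit message. One visible side effect of the real change is that the intermediate "ffn_silu" callback is no longer emitted, since that tensor no longer exists in the graph.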
