Enable Mixtral LLM model #36

Open · wants to merge 21 commits into base: main

Commits (21)
002f959  Add Mixtral LLM (archana-ramalingam, May 21, 2024)
ea9123f  Refactoring attention, moe and ffn blocks (archana-ramalingam, May 22, 2024)
967c524  Merge branch 'main' into mixtral (archana-ramalingam, May 22, 2024)
95e3b09  Allow _optional_int_prop to handle missing hyperparameters (archana-ramalingam, May 22, 2024)
d577efb  Fixing circular dep and imports (archana-ramalingam, May 23, 2024)
1346208  Fix multiple expert layer weight handling + other issues (archana-ramalingam, May 29, 2024)
7f92421  Add ffn_moe layers and other fixes (archana-ramalingam, Jun 13, 2024)
8a4f5f2  Edit theta slicing (archana-ramalingam, Jun 13, 2024)
5e170fe  Fix ffn_moe theta parsing & wraping (archana-ramalingam, Jun 14, 2024)
211b5f7  Extract tensor unmerging into a function (archana-ramalingam, Jun 14, 2024)
e29c591  Cleaning up debug statements (archana-ramalingam, Aug 19, 2024)
790b70b  Resolve conflict (archana-ramalingam, Aug 19, 2024)
d602cf1  Fix test failure (archana-ramalingam, Aug 19, 2024)
9f46f45  Merge branch 'main' into mixtral (archana-ramalingam, Aug 19, 2024)
4f86051  Merge branch 'main' into mixtral (archana-ramalingam, Aug 19, 2024)
a8e714a  Add rope_freq_base to llama (archana-ramalingam, Aug 19, 2024)
823d95c  Merge branch 'mixtral' of https://github.com/archana-ramalingam/shark… (archana-ramalingam, Aug 19, 2024)
eb91f73  Merge branch 'main' into mixtral (archana-ramalingam, Sep 5, 2024)
dd29409  Fix rotary_embedding.py (archana-ramalingam, Sep 5, 2024)
b01d5a5  Merge branch 'main' into mixtral (archana-ramalingam, Nov 23, 2024)
5a903f6  Revert extra lines (archana-ramalingam, Nov 23, 2024)
12 changes: 11 additions & 1 deletion sharktank/sharktank/layers/ffn_moe_block.py
@@ -84,6 +84,12 @@ def __init__(
        super().__init__(theta)

        if theta.optional_tensor("ffn_gate_exps") is not None:
            '''
            Expands a single merged expert tensor into individual expert tensors.
            E.g. converts blk.0.ffn_gate_exps.weight to blk.0.ffn_gate.0.weight,
            blk.0.ffn_gate.1.weight, etc.
            '''
            merged_tensor = theta.tensor("ffn_gate_exps", "weight")

            expert_tensor = extract_ffn_layer(
@@ -130,7 +136,11 @@ def forward(
def extract_ffn_layer(
    merged_tensor: DefaultPrimitiveTensor, layer_name: str, expert_idx: int
):
    # fetches the block_idx from merged_tensor_name. e.g. blk.0.ffn_gate_exps.weight
    '''
    Given a merged expert tensor and an expert_idx, extracts the corresponding
    expert tensor and constructs a DefaultPrimitiveTensor with the relevant
    expert layer name.
    '''
    expert_layer_name = (
        f"blk.{merged_tensor.name.split('.')[1]}.{layer_name}.{expert_idx}.weight"
    )
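For readers skimming the diff, here is a minimal, self-contained sketch of the unmerging step the new docstrings describe: slicing one expert's weight out of a merged ffn_gate_exps tensor and rebuilding its per-expert name. It uses plain torch tensors and a hypothetical extract_expert_weight helper rather than sharktank's Theta/DefaultPrimitiveTensor API; the stacked-experts layout and the shapes are assumptions for illustration, not the PR's actual implementation.

import torch


def extract_expert_weight(
    merged: torch.Tensor, merged_name: str, layer_name: str, expert_idx: int
) -> tuple[str, torch.Tensor]:
    # "blk.0.ffn_gate_exps.weight" -> block index "0"
    block_idx = merged_name.split(".")[1]
    expert_name = f"blk.{block_idx}.{layer_name}.{expert_idx}.weight"
    # Assumption: experts are stacked along dim 0 of the merged tensor.
    return expert_name, merged[expert_idx]


if __name__ == "__main__":
    # Illustrative shapes only: 8 experts with small FFN dimensions for the demo.
    merged = torch.randn(8, 32, 16)
    name, w = extract_expert_weight(merged, "blk.0.ffn_gate_exps.weight", "ffn_gate", 3)
    print(name, tuple(w.shape))  # blk.0.ffn_gate.3.weight (32, 16)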