
Fix Mixtral-related issues #570

Merged: 18 commits merged into main from fix_mixtral on Apr 10, 2024

Conversation

@artek0chumak (Collaborator) commented on Apr 8, 2024

This PR fixes problems related to #569:

  • block initialization
  • throughput calculation and cache usage
  • Mixtral in tests

Beam search is removed for Mixtral and Llama for now. Those models use DynamicCache, which requires a special method to reorder the cache (see https://github.com/huggingface/transformers/blob/main/src/transformers/cache_utils.py#L161).
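For context, a minimal sketch of why beam search needs special handling with DynamicCache (this assumes the transformers DynamicCache API from the linked cache_utils.py; the tensor shapes and beam_idx values are made up for illustration):

import torch
from transformers import DynamicCache

# Legacy caches are per-layer tuples of tensors that generation code reorders
# with index_select; DynamicCache keeps the key/value states on the cache
# object itself, so beam search has to go through the cache's own reorder method.
cache = DynamicCache()
key = torch.randn(4, 2, 1, 8)    # toy shape: (beams, heads, seq_len, head_dim)
value = torch.randn(4, 2, 1, 8)
cache.update(key, value, layer_idx=0)

beam_idx = torch.tensor([1, 0, 3, 2])  # hypothetical beam choices after one step
cache.reorder_cache(beam_idx)          # reorders each layer's states along the beam dim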

@artek0chumak changed the title from "Fix issue #569 related to Mixtral" to "Fix Mixtral-related issues" on Apr 8, 2024
@artek0chumak mentioned this pull request on Apr 9, 2024
@artek0chumak marked this pull request as ready for review on April 10, 2024 05:38
Resolved (outdated) review threads on:
  • src/petals/server/block_utils.py
  • src/petals/server/throughput.py
  • tests/test_full_model.py
artek0chumak and others added 3 commits April 10, 2024 10:32
Co-authored-by: Max Ryabinin <mryabinin0@gmail.com>
@@ -141,6 +141,10 @@ def test_sampling(tokenizer, model, ref_model, max_new_tokens=10):
), f"Sampling is not identical to HF with {options=}, {multiple_calls=}, {inputs.shape=}"


@pytest.mark.skipif(
    MODEL_NAME.lower().find("bloom") == -1,
    reason="Mixtral and Llama uses DynamicCache, which can change based on beam search choices",
Member:

Suggested change:
- reason="Mixtral and Llama uses DynamicCache, which can change based on beam search choices",
+ reason="Mixtral and Llama use DynamicCache, which can change based on beam search choices",

Collaborator (Author):

Done

@@ -141,6 +141,10 @@ def test_sampling(tokenizer, model, ref_model, max_new_tokens=10):
), f"Sampling is not identical to HF with {options=}, {multiple_calls=}, {inputs.shape=}"


@pytest.mark.skipif(
    MODEL_NAME.lower().find("bloom") == -1,
Member:

Suggested change:
- MODEL_NAME.lower().find("bloom") == -1,
+ "bloom" not in MODEL_NAME.lower(),

Collaborator (Author):

Done
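
With both suggestions applied, the skip condition in tests/test_full_model.py presumably ends up looking like this (a sketch assembled from the two suggestions above, not copied from the merged diff):

@pytest.mark.skipif(
    "bloom" not in MODEL_NAME.lower(),
    reason="Mixtral and Llama use DynamicCache, which can change based on beam search choices",
)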

@justheuristic merged commit d6f4f80 into main on Apr 10, 2024
11 checks passed
@justheuristic deleted the fix_mixtral branch on April 10, 2024 11:49