Add qwen2 #28436

JustinLin610 · 2024-01-10T16:22:43Z

Adding Qwen2

This PR adds the support of codes for the coming Qwen2 models. For information about Qwen, please visit https://github.com/QwenLM/Qwen. @ArthurZucker

src/transformers/models/qwen2/tokenization_qwen2.py

Update dummy_tokenizers_objects.py

HuggingFaceDocBuilderDev · 2024-01-16T15:23:50Z

The docs for this PR live here. All of your documentation changes will be reflected on that endpoint. The docs are available until 30 days after the last update.

ArthurZucker

LGTM! My only concern is that the normal attention layer does not use the new argument (max_window_layers, so results might differ between that and the Qwen2FlashAttention2 layer)

src/transformers/convert_slow_tokenizer.py

ArthurZucker · 2024-01-16T16:55:18Z

src/transformers/models/qwen2/tokenization_qwen2_fast.py

+        # We need to at least pass vocab_file and merges_file to base class
+        # in case a slow tokenizer needs to be initialized; other can be
+        # configured through files


I think we should also define the extra tokens here as well 😉
copying the way we init them in the slow one!

tests/models/qwen2/test_tokenization_qwen2.py

src/transformers/models/auto/tokenization_auto.py

src/transformers/models/qwen2/modeling_qwen2.py

ArthurZucker · 2024-01-16T16:58:58Z

src/transformers/models/qwen2/modeling_qwen2.py

+    return hidden_states.reshape(batch, num_key_value_heads * n_rep, slen, head_dim)
+
+
+class Qwen2Attention(nn.Module):


this attention class will not really support the max_swa_layers as it only uses the attention mask to perform sliced attention I doubt that it will have the same results no?

Warning added for not using flash attention.

addressed review comments

ArthurZucker

Alright, just a final nit on the tokenizer and I think we can merge!
(doc + unk_token for fast)

src/transformers/models/qwen2/tokenization_qwen2_fast.py

src/transformers/models/qwen2/tokenization_qwen2.py

src/transformers/models/qwen2/tokenization_qwen2_fast.py

src/transformers/models/qwen2/modeling_qwen2.py

Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>

ArthurZucker · 2024-01-17T15:02:19Z

Thanks a lot for this pr and bearing with me! 🤗

* add config, modeling, and tokenization * add auto and init * update readme * update readme * update team name * fixup * fixup * update config * update code style * update for fixup * update for fixup * update for fixup * update for testing * update for testing * fix bug for config and tokenization * fix bug for bos token * not doctest * debug tokenizer * not doctest * debug tokenization * debug init for tokenizer * fix style * update init * delete if in token auto * add tokenizer doc * add tokenizer in init * Update dummy_tokenizers_objects.py * update * update * debug * Update tokenization_qwen2.py * debug * Update convert_slow_tokenizer.py * add copies * add copied from and make style * update files map * update test * fix style * fix merge reading and update tests * fix tests * fix tests * fix style * debug a variable in readme * Update src/transformers/models/qwen2/configuration_qwen2.py Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com> * update test and copied from * fix style * update qwen2 tokenization and tests * Update tokenization_qwen2.py * delete the copied from after property * fix style * update tests * update tests * add copied from * fix bugs * update doc * add warning for sliding window attention * update qwen2 tokenization * fix style * Update src/transformers/models/qwen2/modeling_qwen2.py Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com> * fix tokenizer fast --------- Co-authored-by: Ren Xuancheng <jklj077@users.noreply.github.com> Co-authored-by: renxuancheng.rxc <renxuancheng.rxc@alibaba-inc.com> Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>

xd2333 · 2024-02-11T08:33:18Z

Qwen1.5-0.5B-Chat lose output.weight after load from_pretrained and save_pretrained

fxmarty · 2024-02-27T14:06:13Z

SDPA & eager implem don't seem to match on main (5c341d4), even when not using an attn_mask:

FAILED tests/models/qwen2/test_modeling_qwen2.py::Qwen2ModelTest::test_eager_matches_sdpa_inference_0_float16 - AssertionError: False is not true : padding_side=left, use_mask=False, batch_size=1, enable_kernels=False: mean relative difference: 2.188e+00, torch at...
FAILED tests/models/qwen2/test_modeling_qwen2.py::Qwen2ModelTest::test_eager_matches_sdpa_inference_1_bfloat16 - AssertionError: False is not true : padding_side=left, use_mask=False, batch_size=1, enable_kernels=False: mean relative difference: 1.156e+00, torch at...
FAILED tests/models/qwen2/test_modeling_qwen2.py::Qwen2ModelTest::test_eager_matches_sdpa_inference_2_float32 - AssertionError: False is not true : padding_side=left, use_mask=False, batch_size=1, enable_kernels=False: mean relative difference: 1.369e+00, torch at...

fxmarty · 2024-02-27T14:11:54Z

cc @JustinLin610 have you tested both code paths?

YuanzeSun · 2024-02-28T07:49:27Z

Hi, Can I use it on qwen1? If not, how can I adapt it to qwen1? Thank you!

ydshieh · 2024-03-13T08:53:18Z

@JustinLin610 In the tests, Qwen/Qwen2-450m-beta is used in several places, but there is no such repository on Huggingface Hub. Could you let us know which repository you used when adding those tests, please. Thank you in advance.

Currently, this kind implies there is no integration tests (being run) at all when this model was added to transformers.

JustinLin610 · 2024-03-18T09:06:18Z

I'll fix the code paths. Previously we have tested it and passed all the test. However, we had name changes and caused troubles. Sry about this.

JustinLin610 · 2024-03-18T09:07:59Z

@JustinLin610 In the tests, Qwen/Qwen2-450m-beta is used in several places, but there is no such repository on Huggingface Hub. Could you let us know which repository you used when adding those tests, please. Thank you in advance.

Currently, this kind implies there is no integration tests (being run) at all when this model was added to transformers.

https://huggingface.co/Qwen/Qwen1.5-0.5B This is the one that corresponds to the original Qwen/Qwen2-450m-beta

JustinLin610 · 2024-03-18T09:08:51Z

SDPA & eager implem don't seem to match on main (5c341d4), even when not using an attn_mask:

FAILED tests/models/qwen2/test_modeling_qwen2.py::Qwen2ModelTest::test_eager_matches_sdpa_inference_0_float16 - AssertionError: False is not true : padding_side=left, use_mask=False, batch_size=1, enable_kernels=False: mean relative difference: 2.188e+00, torch at...
FAILED tests/models/qwen2/test_modeling_qwen2.py::Qwen2ModelTest::test_eager_matches_sdpa_inference_1_bfloat16 - AssertionError: False is not true : padding_side=left, use_mask=False, batch_size=1, enable_kernels=False: mean relative difference: 1.156e+00, torch at...
FAILED tests/models/qwen2/test_modeling_qwen2.py::Qwen2ModelTest::test_eager_matches_sdpa_inference_2_float32 - AssertionError: False is not true : padding_side=left, use_mask=False, batch_size=1, enable_kernels=False: mean relative difference: 1.369e+00, torch at...

This is strange to me btw as this part of code is exactly copied from the mistral impl.

JustinLin610 · 2024-03-18T09:10:52Z

Hi, Can I use it on qwen1? If not, how can I adapt it to qwen1? Thank you!

No, you can't use it directly btw. You need to transform the state dict for the adaptation

fxmarty · 2024-04-10T12:54:39Z

src/transformers/models/qwen2/tokenization_qwen2.py

+        merges_file,
+        errors="replace",
+        unk_token="<|endoftext|>",
+        bos_token=None,


Cross-posting from slack: is it expected that qwen1.5/qwen1.5-moe models have bos_token_id in their config.json, but not in tokenizer_config.json, while bos_token defaults to None in the tokenizer class?

transformers/src/transformers/models/qwen2/tokenization_qwen2.py

Line 143 in 1854637

bos_token=None,

This results in a loaded tokenizer not having a BOS token and I wonder if this is intended

cc @Giuseppe5

JustinLin610 added 5 commits January 10, 2024 23:59

add config, modeling, and tokenization

f5bf018

add auto and init

3226f3a

update readme

52ab139

update readme

0fe20cd

update team name

8299266

ArthurZucker reviewed Jan 10, 2024

View reviewed changes

src/transformers/models/qwen2/tokenization_qwen2.py Outdated Show resolved Hide resolved

src/transformers/models/qwen2/tokenization_qwen2.py Outdated Show resolved Hide resolved

JustinLin610 and others added 24 commits January 11, 2024 01:45

fixup

b8a8e22

fixup

55802f3

update config

6b587b8

update code style

dfe7d2e

update for fixup

525de40

update for fixup

7f6f5e2

update for fixup

36fcf65

update for testing

42f561a

update for testing

105d98f

fix bug for config and tokenization

47e6e9e

fix bug for bos token

5061a26

not doctest

0987dab

debug tokenizer

bf4e928

not doctest

2658baf

debug tokenization

69b03e7

debug init for tokenizer

725ab32

fix style

57264f3

update init

0489d0f

delete if in token auto

e3b60a6

add tokenizer doc

e1b4a8c

add tokenizer in init

22d3402

Update dummy_tokenizers_objects.py

c05a7be

Merge pull request #1 from jklj077/patch-1

783eec8

Update dummy_tokenizers_objects.py

update

c24ee57

JustinLin610 added 2 commits January 16, 2024 19:53

fix bugs

22f7b1e

update doc

69d3f89

ArthurZucker approved these changes Jan 16, 2024

View reviewed changes

renxuancheng.rxc and others added 5 commits January 17, 2024 11:53

add warning for sliding window attention

2840f0a

update qwen2 tokenization

87f6cf7

Merge pull request #11 from jklj077/patch-8

0a47113

addressed review comments

fix style

047e372

Merge branch 'huggingface:main' into add_qwen2

6b6dae0

ArthurZucker approved these changes Jan 17, 2024

View reviewed changes

JustinLin610 and others added 3 commits January 17, 2024 21:35

Update src/transformers/models/qwen2/modeling_qwen2.py

a4e80ee

Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>

Merge branch 'huggingface:main' into add_qwen2

39cc60b

fix tokenizer fast

a8b0c3c

ArthurZucker merged commit d6ffe74 into huggingface:main Jan 17, 2024
21 checks passed

Jason-CKY mentioned this pull request Feb 19, 2024

Qwen1.5/Qwen2 model additions huggingface/text-generation-inference#1575

Closed

2 tasks

fxmarty mentioned this pull request Feb 26, 2024

How can I export onnx-model for Qwen/Qwen-7B? huggingface/optimum#1703

Open

fxmarty mentioned this pull request Feb 27, 2024

Better SDPA unmasking implementation #29318

Merged

fxmarty reviewed Apr 10, 2024

View reviewed changes

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Add qwen2 #28436

Add qwen2 #28436

JustinLin610 commented Jan 10, 2024

HuggingFaceDocBuilderDev commented Jan 16, 2024

ArthurZucker left a comment

ArthurZucker Jan 16, 2024

ArthurZucker Jan 16, 2024

JustinLin610 Jan 17, 2024

ArthurZucker left a comment

ArthurZucker commented Jan 17, 2024

xd2333 commented Feb 11, 2024

fxmarty commented Feb 27, 2024

fxmarty commented Feb 27, 2024

YuanzeSun commented Feb 28, 2024

ydshieh commented Mar 13, 2024 •

edited

Loading

JustinLin610 commented Mar 18, 2024

JustinLin610 commented Mar 18, 2024

JustinLin610 commented Mar 18, 2024

JustinLin610 commented Mar 18, 2024

fxmarty Apr 10, 2024

fxmarty Apr 10, 2024

		return hidden_states.reshape(batch, num_key_value_heads * n_rep, slen, head_dim)


		class Qwen2Attention(nn.Module):

Add qwen2 #28436

Add qwen2 #28436

Conversation

JustinLin610 commented Jan 10, 2024

Adding Qwen2

HuggingFaceDocBuilderDev commented Jan 16, 2024

ArthurZucker left a comment

Choose a reason for hiding this comment

ArthurZucker Jan 16, 2024

Choose a reason for hiding this comment

ArthurZucker Jan 16, 2024

Choose a reason for hiding this comment

JustinLin610 Jan 17, 2024

Choose a reason for hiding this comment

ArthurZucker left a comment

Choose a reason for hiding this comment

ArthurZucker commented Jan 17, 2024

xd2333 commented Feb 11, 2024

fxmarty commented Feb 27, 2024

fxmarty commented Feb 27, 2024

YuanzeSun commented Feb 28, 2024

ydshieh commented Mar 13, 2024 • edited Loading

JustinLin610 commented Mar 18, 2024

JustinLin610 commented Mar 18, 2024

JustinLin610 commented Mar 18, 2024

JustinLin610 commented Mar 18, 2024

fxmarty Apr 10, 2024

Choose a reason for hiding this comment

fxmarty Apr 10, 2024

Choose a reason for hiding this comment

ydshieh commented Mar 13, 2024 •

edited

Loading