add Glm #33823
Conversation
Hey! 🤗 Thanks for your contribution! Before merging this pull request, the slow tests CI should be triggered. (For maintainers) The documentation for slow tests CI on PRs is here.
Force-pushed from 207ec14 to 152569e
Very very nice!
@@ -0,0 +1,27 @@
# Copyright 2020 The HuggingFace Team. All rights reserved.
Suggested change:
- # Copyright 2020 The HuggingFace Team. All rights reserved.
+ # Copyright 2024 The HuggingFace Team. All rights reserved.
STATE_DICT_MAPPING = {
    "transformer.output_layer.": "lm_head.",
    "transformer.": "model.",
    ".embedding.word_embeddings.": ".embed_tokens.",
    ".encoder.final_layernorm.": ".norm.",
    ".encoder.layers.": ".layers.",
    "rotary_pos_embed.": "rotary_emb.",
    "self_attention.": "self_attn.",
    "query_key_value.": "qkv_proj.",
    "dense.": "o_proj.",
    "dense_h_to_4h.": "gate_up_proj.",
    "dense_4h_to_h.": "down_proj.",
}
Cool! Let's set up good standards though; see MLLAMA, full explicit regexes are more informative IMO! 🤗
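For illustration, here is a minimal sketch of what such a regex-based mapping could look like, rewriting a few of the plain-string rules above as explicit, anchored patterns; `convert_key` is a hypothetical helper, not the converter's actual API:

```python
import re

# A few of the rules above, rewritten as explicit regexes. Anchors and capture
# groups document exactly which checkpoint keys each rule is meant to touch.
REGEX_STATE_DICT_MAPPING = {
    r"^transformer\.output_layer\.": "lm_head.",
    r"^transformer\.": "model.",
    r"\.embedding\.word_embeddings\.": ".embed_tokens.",
    r"\.encoder\.layers\.(\d+)\.": r".layers.\1.",
}


def convert_key(old_key: str) -> str:
    """Apply every pattern in insertion order to rename one checkpoint key."""
    new_key = old_key
    for pattern, replacement in REGEX_STATE_DICT_MAPPING.items():
        new_key = re.sub(pattern, replacement, new_key)
    return new_key


# e.g. convert_key("transformer.output_layer.weight") -> "lm_head.weight"
```

Note that rule order matters: the more specific `transformer.output_layer.` pattern must run before the generic `^transformer\.` prefix rewrite, which dict insertion order guarantees here.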
vocab_size=original_config.pop("padded_vocab_size"),
hidden_size=original_config.pop("hidden_size"),
intermediate_size=original_config.pop("ffn_hidden_size"),
num_hidden_layers=original_config.pop("num_layers"),
num_attention_heads=num_attention_heads,
num_key_value_heads=(
    num_attention_heads
    if not original_config.pop("multi_query_attention")
    else original_config.pop("multi_query_group_num")
),
attention_dropout=original_config.pop("attention_dropout"),
max_position_embeddings=original_config.pop("seq_length"),
rms_norm_eps=original_config.pop("layernorm_epsilon"),
rope_theta=10000.0 * original_config.pop("rope_ratio", 1),
use_cache=original_config.pop("use_cache"),
head_dim=original_config.pop("kv_channels"),
attention_bias=original_config.pop("add_qkv_bias"),
eos_token_id=original_config.pop("eos_token_id"),
pad_token_id=original_config.pop("pad_token_id"),
tie_word_embeddings=original_config.pop("tie_word_embeddings"),
Let's try to use ** here for attributes that have the same name
I didn't, to avoid adding unused fields, but I refactored to make that block nicer to read.
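For reference, a rough sketch of what the `**` suggestion could look like, assuming the converter builds a `GlmConfig`; the `SAME_NAME_KEYS` list below is illustrative, not taken from the PR:

```python
# Keys whose names already match the HF config can be forwarded unchanged.
SAME_NAME_KEYS = [
    "hidden_size",
    "attention_dropout",
    "use_cache",
    "eos_token_id",
    "pad_token_id",
    "tie_word_embeddings",
]

# Pop the same-name keys so nothing unused leaks into the config.
same_name_kwargs = {key: original_config.pop(key) for key in SAME_NAME_KEYS}

config = GlmConfig(
    # Only the keys that change name still need an explicit mapping.
    vocab_size=original_config.pop("padded_vocab_size"),
    intermediate_size=original_config.pop("ffn_hidden_size"),
    num_hidden_layers=original_config.pop("num_layers"),
    **same_name_kwargs,
)
```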
pass

class GlmSdpaAttention(GlmAttention, GraniteSdpaAttention):
Suggested change:
- class GlmSdpaAttention(GlmAttention, GraniteSdpaAttention):
+ class GlmSdpaAttention(GraniteSdpaAttention):
I think this should be enough
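For context, a minimal sketch of the modular-transformers pattern being discussed, assuming the usual layout; the import path is an assumption:

```python
# In modular_glm.py, a class can be declared purely by inheritance: the modular
# converter expands it into a standalone modeling_glm.py by copying the parent's
# code and renaming Granite* references to Glm*, so no body is repeated here.
from transformers.models.granite.modeling_granite import GraniteSdpaAttention


class GlmSdpaAttention(GraniteSdpaAttention):
    pass
```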
Holy moly, so nice!
@require_torch_sdpa
@slow
@is_flaky
def test_eager_matches_sdpa_inference(self, torch_dtype: str):
why do we have to overwrite this one?
Unfortunately, depending on the random inputs, one of the cases may fail from time to time. I overrode the test to add the flaky decorator (which allows it to pass consistently).
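For readers unfamiliar with the decorator: transformers ships `is_flaky` in its testing utilities; a standalone sketch of the general retry idea (not the library's actual implementation) looks like this:

```python
import functools


def flaky(max_attempts: int = 5):
    """Retry a test a few times before declaring it failed."""
    def decorator(test_fn):
        @functools.wraps(test_fn)
        def wrapper(*args, **kwargs):
            for attempt in range(max_attempts):
                try:
                    return test_fn(*args, **kwargs)
                except AssertionError:
                    # Random inputs occasionally trip the tolerance check;
                    # re-raise only once every attempt has been used up.
                    if attempt == max_attempts - 1:
                        raise
        return wrapper
    return decorator
```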
Cool! In general, the less we have to overwrite, the better!
Meaning: are there ways to remove some of the tests you added?
Unfortunately no. Depending on the random seed, some fail from time to time, and they need the flaky decorator to pass consistently.
Ready for last review @ArthurZucker.
Thank you very much for your help. I also saw this huggingface PR. Thank you again for your support!
Of course!
LGTM! Anything missing before we merge?
No, the only issue is the docstrings in the configuration, but this will be solved with auto-docstrings. In the meantime, I just moved the config outside modular to please the CIs.
Confirmed that slow tests pass for the model. Merging.
Squashed commits:
* Create modular_glm.py
* Update modular_glm.py
* Finalize architecture without all attentions
* Add all attentions modules
* Finalize modular
* Update given last version
* Last update
* Finalize model
* Finalize converter
* Update convert_glm_weights_to_hf.py
* style (×2)
* Create __init__.py
* Aff all inits
* Update convert_glm_weights_to_hf.py (×9)
* Correct the rotary embeddings
* Remove apply_residual_connection_post_layernorm (always false)
* remove use_rms_norm (always true)
* remove past_layer_norm (always true)
* Update __init__.py
* Update config and license
* start adding tests and doc
* Add doc + style
* Update test_modeling_glm.py
* Add dummies
* Apply correct modeling
* Refactor attention to follow llama
* Update __init__.py
* Update convert_glm_weights_to_hf.py
* Correct bias
* remove linear_bias and pdrop (never used)
* apply modular
* Simplify converter
* remove dummies + style
* add model_input_names
* Add pretraining_tp to config for when eager attention is used
* Update modular to remove all pretraining_tp
* Update test_modeling_glm.py
* Update the __all__
* Update __all__
* Update __init__.py
* Update test_modeling_glm.py
* add revisions
* Add the correct repos and revisions
* style
* Update __init__.py
* update exports
* remove import of modular files
* style
* Apply Llama changes + refine converter
* Update convert_glm_weights_to_hf.py (×8)
* style
* Use new modular converter
* add pretrainedmodel to init
* style
* Update test_modeling_glm.py
* Move config outside modular to please CI about docstrings
* Add dummies to please CI
* Update glm.md (×2)
@Cyrilvallez Hi Cyril, your PR for the 1M version of the model produces an unexpected generation. Please refer here for more information: https://huggingface.co/THUDM/glm-4-9b-chat-1m/discussions/17
GLM model!