
load_tf_weights doesn't handle the weights added to the TF models at the top level #18802

Closed
ydshieh opened this issue Aug 29, 2022 · 2 comments · Fixed by #18833

ydshieh (Collaborator) commented Aug 29, 2022

System Info

  • transformers version: 4.22.0.dev0
  • Platform: Windows-10-10.0.22000-SP0
  • Python version: 3.9.11
  • Huggingface_hub version: 0.8.1
  • PyTorch version (GPU?): 1.12.1+cu113 (True)
  • Tensorflow version (GPU?): 2.9.1 (False)
  • Flax version (CPU?/GPU?/TPU?): not installed (NA)
  • Jax version: not installed
  • JaxLib version: not installed
  • Using GPU in script?: No
  • Using distributed or parallel set-up in script?: No

Who can help?

@gante

Reproduction

(TF)MarianMTModel has a weight final_logits_bias added at the top level (i.e. not under any layer):

self.final_logits_bias = self.add_weight(

However, the method load_tf_weights only handles weights that live under layers:

for layer in model.layers:

This causes a problem when we load TF checkpoints for TFMarianMTModel: final_logits_bias is not loaded.
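To make the failure mode concrete, here is a minimal pure-Python sketch (no TensorFlow needed; the class and attribute names mirror the Keras structure but are simplified, and the loop only imitates the one in load_tf_weights):

```python
# Sketch of why a loading loop that only walks `model.layers` misses
# weights registered directly on the model (assumed simplified structure).

class Weight:
    def __init__(self, name):
        self.name = name

class Layer:
    def __init__(self, weights):
        self.weights = weights

class Model:
    def __init__(self):
        # weights owned by a sub-layer are visible to the layer loop below
        self.layers = [Layer([Weight("model/main_layer/kernel")])]
        # a weight added via `add_weight` at the top level (like MarianMT's
        # `final_logits_bias`) belongs to the model itself, not to any layer
        self.weights = self.layers[0].weights + [Weight("final_logits_bias")]

def names_seen_by_layer_loop(model):
    # imitates the `for layer in model.layers:` loop in `load_tf_weights`
    seen = []
    for layer in model.layers:
        seen.extend(w.name for w in layer.weights)
    return seen

seen = names_seen_by_layer_loop(Model())
print(seen)  # "final_logits_bias" never appears, so it is never restored
```

Since the loop never visits final_logits_bias, that weight keeps its initial (zero) value after loading, which matches the output shown below.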

from transformers import MarianMTModel, TFMarianMTModel
model_name = "Helsinki-NLP/opus-mt-en-ROMANCE"

pt_model = MarianMTModel.from_pretrained(model_name)
tf_model_from_pt = TFMarianMTModel.from_pretrained(model_name, from_pt=True)
tf_model = TFMarianMTModel.from_pretrained(model_name, from_pt=False)

# Only has `TFMarianMainLayer` in `layers`
print(tf_model.layers)

print(pt_model.final_logits_bias.numpy())
print(tf_model_from_pt.final_logits_bias.numpy())
print(tf_model.final_logits_bias.numpy())

Outputs:

[<transformers.models.marian.modeling_tf_marian.TFMarianMainLayer object at 0x000001F00ECE9940>]
[[11.757146  -1.7759448 -7.3816853 ... -1.6559223 -1.6663467  0.       ]]
[[11.757146  -1.7759448 -7.3816853 ... -1.6559223 -1.6663467  0.       ]]
[[0. 0. 0. ... 0. 0. 0.]]

Expected behavior

load_tf_weights should be able to load weights like final_logits_bias, and the TF checkpoint should be loaded correctly.
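One possible direction (a sketch only; the actual fix landed in #18833 and may differ) is to collect weights from model.weights, which in Keras includes both layer-owned weights and weights added with add_weight directly on the model, instead of iterating model.layers. Reusing the simplified mock from above:

```python
# Illustrative sketch, not the actual change from PR #18833.

class Weight:
    def __init__(self, name):
        self.name = name

class Model:
    def __init__(self):
        # Keras's `model.weights` includes layer-owned weights *and* weights
        # added with `add_weight` on the model itself (e.g. final_logits_bias)
        self.weights = [Weight("model/main_layer/kernel"),
                        Weight("final_logits_bias")]

def names_seen_by_weight_loop(model):
    # iterating model.weights instead of model.layers sees every weight
    return [w.name for w in model.weights]

print(names_seen_by_weight_loop(Model()))
```

With this traversal, final_logits_bias would be matched against the checkpoint like any other weight.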

@ydshieh ydshieh added the bug label Aug 29, 2022
ydshieh (Collaborator, Author) commented Aug 29, 2022

Related to #18149.

ydshieh (Collaborator, Author) commented Aug 29, 2022

cc @patrickvonplaten as we might need to change the core method load_tf_weights.
