bert4keras use electra error #312

Closed
wangxiaoqing1112 opened this issue Mar 16, 2021 · 3 comments

Comments

@wangxiaoqing1112

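When asking a question, please provide the following information where possible:

Basic information

  • Operating system: centos
  • Python version: python3.6
  • Tensorflow version: 1.15
  • Keras version: 2.3.1
  • bert4keras version: 0.10.0
  • Pure keras or tf.keras: keras
  • Pretrained model loaded: electra

Core code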

# Load the pretrained model
bert = build_transformer_model(
    model='electra',
    config_path=config_path,
    checkpoint_path=checkpoint_path,
    # with_pool=True,
    return_keras_model=False,
)

...

output = Lambda(lambda x: x[:, 0], name='CLS-token')(bert.model.output)
output = Dense(
    units=2, activation='softmax', kernel_initializer=bert.initializer
)(output)

model = keras.models.Model(bert.model.input, output)
model.summary()

model.compile(
    loss='sparse_categorical_crossentropy',
    optimizer=Adam(2e-5),  # use a sufficiently small learning rate
    # optimizer=PiecewiseLinearLearningRate(Adam(5e-5), {10000: 1, 30000: 0.1}),
    metrics=['accuracy'],
)
...
model.fit_generator(
    train_generator.forfit(),
    steps_per_epoch=len(train_generator),
    epochs=6,
    callbacks=[evaluator]
)

Output

Epoch 1/6
2021-03-16 15:11:57.653303: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcublas.so.10.0
Traceback (most recent call last):
  File "/home/wuming/software/anaconda3/envs/myKeras/lib/python3.6/site-packages/keras/engine/training_generator.py", line 220, in fit_generator
    reset_metrics=False)
  File "/home/wuming/software/anaconda3/envs/myKeras/lib/python3.6/site-packages/keras/engine/training.py", line 1514, in train_on_batch
    outputs = self.train_function(ins)
  File "/home/wuming/software/anaconda3/envs/myKeras/lib/python3.6/site-packages/tensorflow_core/python/keras/backend.py", line 3476, in __call__
    run_metadata=self.run_metadata)
  File "/home/wuming/software/anaconda3/envs/myKeras/lib/python3.6/site-packages/tensorflow_core/python/client/session.py", line 1472, in __call__
    run_metadata_ptr)
tensorflow.python.framework.errors_impl.InvalidArgumentError: Incompatible shapes: [30522,768] vs. [21128,768]
	 [[{{node training/Adam/sub_4}}]]

What I tried

For ELECTRA, manually extracted the CLS vector: https://www.gitmemory.com/issue/bojone/bert4keras/102/605467096

Copied the ELECTRA base config: ymcui/Chinese-ELECTRA#3

Added type_vocab_size

ELECTRA has no NSP, so the with_pool parameter is not needed

@bojone
Owner

bojone commented Mar 16, 2021

The vocab_size written in the file at config_path does not match the vocab_size of the actual model...
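
One way to catch this kind of mismatch before training is to compare the vocab_size in the config file with the number of entries in the vocabulary file that ships with the checkpoint. The sketch below is illustrative only: the helper names check_vocab_size and count_vocab_entries are hypothetical (not part of bert4keras), and it assumes a JSON config with a top-level vocab_size key and a vocab.txt with one token per line.

```python
import json


def count_vocab_entries(vocab_path):
    """Count the lines in a one-token-per-line vocabulary file (assumed format)."""
    with open(vocab_path, encoding="utf-8") as f:
        return sum(1 for _ in f)


def check_vocab_size(config_path, vocab_path):
    """Hypothetical helper: return (config_size, vocab_file_size, sizes_match)."""
    with open(config_path, encoding="utf-8") as f:
        config_size = json.load(f)["vocab_size"]
    file_size = count_vocab_entries(vocab_path)
    return config_size, file_size, config_size == file_size
```

If the two numbers differ, editing vocab_size in the config to match the checkpoint's vocabulary is the likely fix, in line with the diagnosis above. The shapes in the traceback ([30522,768] vs. [21128,768]) are the two sizes that disagreed in this report.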

@wangxiaoqing1112
Author

Thank you, Su Shen. This is resolved.

@MrSworder

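"Thank you, Su Shen. This is resolved."

Hello, may I ask whether you have run into this problem:
When loading electra_base, because hidden_size == embedding_size, the Embedding-mapping layer is missing, which raises an error in the load_weights_from_checkpoint function.
I saw that you got electra_base working and would like to know how you did it. I hope you can reply.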
