PaddlePaddle Hackathon 57 submission #1128

Merged
merged 17 commits into from
Nov 29, 2021

Conversation

@renmada
Contributor

renmada commented Oct 9, 2021

Task: #1073

Weight files: https://pan.baidu.com/s/1-FJDmtfO8MuPQgq0EEbUhw (extraction code: gst6)
Adds XLNetLMHeadModel, XLNetForMultipleChoice, and XLNetForQuestionAnswering.
Adds unit tests for XLNetLMHeadModel, XLNetForMultipleChoice, and XLNetForQuestionAnswering.

@CLAassistant
CLAassistant commented Oct 9, 2021

CLA assistant check
Thank you for your submission! We really appreciate it. Like many open source projects, we ask that you all sign our Contributor License Agreement before we can accept your contribution.
0 out of 2 committers have signed the CLA.

❌ yingyibiao
❌ deeplaying

@renmada changed the title from "PaddlePaddle Hackathon 52 submission" to "PaddlePaddle Hackathon 57 submission" on Oct 9, 2021
@yingyibiao self-assigned this on Oct 10, 2021
@yingyibiao
Contributor

@renmada
Contributor Author

renmada commented Oct 14, 2021

> For adding new weights, please refer to https://paddlenlp.readthedocs.io/zh/latest/community/contribute_models/contribute_awesome_pretrained_models.html

There is no community folder under transformers. Do I need to create it myself?

@yingyibiao
Contributor

> There is no community folder under transformers. Do I need to create it myself?

That was a mistake on my part; it should be the PaddleNLP/community folder.

@renmada
Contributor Author

renmada commented Oct 15, 2021

> That was a mistake on my part; it should be the PaddleNLP/community folder.

tokenizer_config_file isn't required, is it?

@yingyibiao
Contributor

> tokenizer_config_file isn't required, is it?

The tokenizer_config_file is required as well.

{
"model_config_file": "https://paddlenlp.bj.bcebos.com/models/transformers/community/renmada/distilbert-base-multilingual-cased/model_config.json",
"model_state": "https://paddlenlp.bj.bcebos.com/models/transformers/community/renmada/distilbert-base-multilingual-cased/model_state.pdparams",
"tokenizer_config_file": "https://paddlenlp.bj.bcebos.com/models/transformers/community/renmada/bert-base-uncased-sst-2-finetuned/tokenizer_config.json",
Contributor

The tokenizer_config_file here must be the one that corresponds to this model.
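
The mismatch flagged in this review comment (an entry for distilbert-base-multilingual-cased that points at the tokenizer config of bert-base-uncased-sst-2-finetuned) is the kind of thing a small script can catch. The following is a minimal sketch, assuming that every URL in a community entry should sit in the same `<user>/<model>` directory. The helper name `find_mismatched_files` and that directory rule are assumptions made for illustration; they are not part of PaddleNLP.

```python
import posixpath
from urllib.parse import urlparse


def find_mismatched_files(entry, expected_dir):
    """Return the keys whose URL does not live under expected_dir.

    entry maps a file role (e.g. "model_state") to its download URL.
    expected_dir is the "<user>/<model>" directory every URL should share.
    """
    mismatched = []
    for key, url in entry.items():
        # Directory part of the URL path, without the file name.
        directory = posixpath.dirname(urlparse(url).path)
        if not directory.endswith("/" + expected_dir):
            mismatched.append(key)
    return mismatched


BASE = "https://paddlenlp.bj.bcebos.com/models/transformers/community/renmada"
entry = {
    "model_config_file": BASE + "/distilbert-base-multilingual-cased/model_config.json",
    "model_state": BASE + "/distilbert-base-multilingual-cased/model_state.pdparams",
    "tokenizer_config_file": BASE + "/bert-base-uncased-sst-2-finetuned/tokenizer_config.json",
}

print(find_mismatched_files(entry, "renmada/distilbert-base-multilingual-cased"))
```

With the entry quoted in the review, only the tokenizer_config_file key is reported.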

@yingyibiao
Contributor

Please add the corresponding tokenizer_config_file to the Baidu Netdisk share.

@renmada
Contributor Author

renmada commented Oct 18, 2021

> The tokenizer_config_file is required as well.

I've uploaded them, but by default they are all empty files.

@yingyibiao
Contributor

@renmada
Contributor Author

renmada commented Oct 19, 2021

The issues above have been fixed.

@yingyibiao
Contributor

image
Everyone who has committed to this PR needs to sign the CLA.

@yingyibiao
Contributor

> I've uploaded them, but by default they are all empty files.

Neither tokenizer_config.json nor model_config.json should be empty.
For the exact format, refer to the files written by the save_pretrained interface of the corresponding class.
For example, for "sshleifer/tiny-distilbert-base-uncased-finetuned-sst-2-english", the model_config.json can follow the one written by DistilBertForMaskedLM.save_pretrained, and the tokenizer_config.json can likewise follow the one written by DistilBertTokenizer.save_pretrained.
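
A quick local check for the "must not be empty" requirement might look like the following. It is only a sketch: `is_valid_config` is a hypothetical helper, and treating "a non-empty JSON object" as valid is an assumption, since the thread does not spell out which fields are required. The `init_class` entry is illustrative content, not a statement of what save_pretrained writes.

```python
import json
import os
import tempfile


def is_valid_config(path):
    """Return True if path holds a non-empty JSON object."""
    try:
        with open(path, encoding="utf-8") as f:
            data = json.load(f)
    except (OSError, ValueError):
        # Missing file, zero-byte file, or malformed JSON.
        return False
    return isinstance(data, dict) and len(data) > 0


with tempfile.TemporaryDirectory() as tmp:
    empty = os.path.join(tmp, "tokenizer_config.json")
    filled = os.path.join(tmp, "model_config.json")
    open(empty, "w").close()  # zero-byte placeholder, like the first upload
    with open(filled, "w", encoding="utf-8") as f:
        json.dump({"init_class": "DistilBertForMaskedLM"}, f)  # illustrative content only
    results = (is_valid_config(empty), is_valid_config(filled))

print(results)
```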

@yingyibiao
Contributor

The weight conversion code for the DistilBert model needs to be added.

@yingyibiao
Contributor

Comment on lines 13 to 14
model = DistilBertForMaskedLM.from_pretrained('distilbert-base-multilingual-cased')
tokenizer = DistilBertTokenizer.from_pretrained('distilbert-base-multilingual-cased')

Contributor

These should be called as:
model = DistilBertForMaskedLM.from_pretrained('renmada/distilbert-base-multilingual-cased')
tokenizer = DistilBertTokenizer.from_pretrained('renmada/distilbert-base-multilingual-cased')

Comment on lines 11 to 12
model = DistilBertModel.from_pretrained('distilbert-base-multilingual-cased')
tokenizer = DistilBertTokenizer.from_pretrained('distilbert-base-multilingual-cased')

Contributor

The class that matches these weights is DistilBertForSequenceClassification. The weight name should be changed in the same way as above.

# Model source
https://huggingface.co/sshleifer/tiny-distilbert-base-uncased-finetuned-sst-2-english
# Model usage
This model's parameter names use the bert prefix; they were manually changed to distilbert when converting to Paddle. Because its weights include a pooler and PaddleNLP's DistilBert has no pooler implementation, the example only shows how to load the weights with DistilBertModel.

Contributor

The class that matches these weights is DistilBertForSequenceClassification.

Contributor Author

renmada commented Oct 25, 2021

Personally, I feel this doesn't make much sense here.
The original weights cannot be fully loaded by PaddleNLP's DistilBertForSequenceClassification, because the original weights include a pooler and PaddleNLP's DistilBert has no pooler implementation.

Comment on lines 1468 to 1469
XLNet Model with a language modeling head on top (linear layer with weights tied to the input embeddings).
Args:

Contributor

Add a blank line before "Args:".
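
The layout the reviewer asks for, with an empty line between the summary and the `Args:` section, can be sketched as follows. The class here is a stand-in written for this illustration (including its `xlnet` argument); it is not the actual PaddleNLP code.

```python
import inspect


class XLNetLMHeadModelStub:
    """
    XLNet Model with a language modeling head on top (linear layer with
    weights tied to the input embeddings).

    Args:
        xlnet: Placeholder argument used only for this illustration.
    """

    def __init__(self, xlnet=None):
        self.xlnet = xlnet


doc = inspect.getdoc(XLNetLMHeadModelStub)
# The summary paragraph and the Args section are separated by an empty line.
print("\n\nArgs:" in doc)
```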

# Model source
https://huggingface.co/sshleifer/tiny-distilbert-base-uncased-finetuned-sst-2-english
# Model usage
This model's parameter names use the bert prefix; they were manually changed to distilbert when converting to Paddle. Because its weights include a pooler and PaddleNLP's DistilBert has no pooler implementation, the example only shows how to load the weights with DistilBertModel.

Contributor

A mismatch like the one described above should not exist.

Contributor Author

But PaddleNLP's DistilBert implementation has no pooler.

Contributor Author

The original weights are named more like BertModel than DistilBert.
During conversion, the preceding transformer layers can be renamed to the DistilBert scheme, but the pooler is not implemented in PaddleNLP's DistilBert.

Contributor

image
The part in the red box has the same structure as the pooler; you need to map the parameter keys during conversion.

Contributor Author

The pooler and pre_classifier use different activation functions, tanh and ReLU respectively, so the final forward results will differ.

Contributor

image

image

I checked the code; the two are consistent, both use ReLU.
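
The concern raised in this thread is that, if one head applied tanh and the other ReLU, identical weights would still give different outputs, which is why confirming that both use ReLU settles the question. A small numeric sketch with made-up pre-activation values:

```python
import math


def relu(x):
    """Rectified linear unit for a single float."""
    return max(0.0, x)


pre_activations = [-1.0, 0.0, 0.5, 2.0]  # made-up values for illustration

tanh_out = [math.tanh(x) for x in pre_activations]
relu_out = [relu(x) for x in pre_activations]

# Among these inputs the two activations agree only at 0.
differs = [not math.isclose(t, r) for t, r in zip(tanh_out, relu_out)]
print(differs)
```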

Comment on lines 1672 to 1675
"""
XLNet Model with a span classification head on top for extractive question-answering tasks like SQuAD (a linear
layer on top of the hidden-states output to compute `span start logits` and `span end logits`).
"""

Contributor

Add a docstring for the __init__ function.

Comment on lines 1570 to 1573
"""
XLNet Model with a multiple choice classification head on top (a linear layer on top of the pooled output and a
softmax) e.g. for RACE/SWAG tasks.
"""

Contributor

Add a docstring for the __init__ function.

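@renmada
Contributor Author

renmada commented Oct 25, 2021

> The weight conversion code for the DistilBert model needs to be added.

Two questions:

  1. Where should this code go?
  2. The two models use different naming schemes, so the conversion code is not fully generic.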

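@yingyibiao
Contributor

> Two questions:
>
>   1. Where should this code go?
>   2. The two models use different naming schemes, so the conversion code is not fully generic.

  1. Put the code in the community/renmada directory.
  2. The conversion code needs to map the model's parameter keys. Once the model code (PyTorch version and Paddle version) is fixed, the conversion is fully determined; that mapping is the code that needs to be implemented.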
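
The key mapping described in point 2 can be sketched as a rename table applied to the source state dict. Everything specific here is an assumption for illustration: the key names in `KEY_MAP` are placeholders rather than the verified parameter names of either implementation, and nested lists stand in for tensors. The transpose reflects that PyTorch's `nn.Linear` stores its weight as [out_features, in_features], while Paddle's `nn.Linear` stores it as [in_features, out_features].

```python
# Hypothetical source-key -> target-key table; the real names must be read
# from the two model implementations.
KEY_MAP = {
    "bert.pooler.dense.weight": "pre_classifier.weight",
    "bert.pooler.dense.bias": "pre_classifier.bias",
    "classifier.weight": "classifier.weight",
    "classifier.bias": "classifier.bias",
}


def transpose(matrix):
    """Transpose a 2-D nested list."""
    return [list(row) for row in zip(*matrix)]


def convert_state_dict(src, key_map):
    """Rename keys and transpose 2-D weights; fail loudly on unmapped keys.

    This toy dict holds only linear layers. A real conversion would transpose
    linear weights only and leave embedding matrices untouched.
    """
    converted = {}
    for key, value in src.items():
        if key not in key_map:
            raise KeyError(f"no mapping for source key: {key}")
        is_matrix = bool(value) and isinstance(value[0], list)
        converted[key_map[key]] = transpose(value) if is_matrix else value
    return converted


src = {
    "bert.pooler.dense.weight": [[1, 2, 3], [4, 5, 6]],  # [out=2, in=3]
    "bert.pooler.dense.bias": [0.1, 0.2],
}
dst = convert_state_dict(src, KEY_MAP)
print(dst["pre_classifier.weight"])
```

Raising on an unmapped key, rather than silently skipping it, makes a partially loaded checkpoint (the situation discussed for the pooler) visible at conversion time.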

@renmada
Contributor Author

renmada commented Oct 26, 2021

  • All previously raised issues have been fixed and pushed.
  • model_config and tokenizer_config have been updated to the output of save_pretrained.

@yingyibiao
Contributor

Please sign the CLA.

            return_dict=return_dict, )
        output = transformer_outputs if not return_dict \
            else transformer_outputs["last_hidden_state"]
        logits = self.classifier(output)

Contributor

self.classifier is not defined.

Comment on lines 1747 to 1860

        self.init_weights()

    def forward(
            self,
            input_ids,
            token_type_ids=None,
            attention_mask=None,
            mems=None,
            perm_mask=None,
            target_mapping=None,
            input_mask=None,
            head_mask=None,
            inputs_embeds=None,
            use_mems_train=False,
            use_mems_eval=False,
            return_dict=False, ):
        r"""
        The XLNetForQuestionAnswering forward method, overrides the `__call__()` special method.

        Args:
            input_ids (Tensor):
                See :class:`XLNetModel`.
            token_type_ids (Tensor, optional):
                See :class:`XLNetModel`.
            attention_mask (Tensor, optional):
                See :class:`XLNetModel`.
            mems (Tensor, optional):
                See :class:`XLNetModel`.
            perm_mask (Tensor, optional):
                See :class:`XLNetModel`.
            target_mapping (Tensor, optional):
                See :class:`XLNetModel`.
            input_mask (Tensor, optional):
                See :class:`XLNetModel`.
            head_mask (Tensor, optional):
                See :class:`XLNetModel`.
            inputs_embeds (Tensor, optional):
                See :class:`XLNetModel`.
            use_mems_train (bool, optional):
                See :class:`XLNetModel`.
            use_mems_eval (bool, optional):
                See :class:`XLNetModel`.
            return_dict (bool, optional):
                See :class:`XLNetModel`.

        Returns:
            tuple or dict: Returns tensor (`start_logits`, `end_logits`) or a dict with key-value pairs:
            {"start_logits": `start_logits`, "end_logits": `end_logits`, "mems": `mems`,
            "hidden_states": `hidden_states`, "attentions": `attentions`}

            With the corresponding fields:
            - `start_logits` (Tensor):
                A tensor of the input token classification logits, indicates the start position of the labelled span.
                Its data type should be float32 and its shape is [batch_size, sequence_length].
            - `end_logits` (Tensor):
                A tensor of the input token classification logits, indicates the end position of the labelled span.
                Its data type should be float32 and its shape is [batch_size, sequence_length].
            - `mems` (List[Tensor]):
                See :class:`XLNetModel`.
            - `hidden_states` (List[Tensor], optional):
                See :class:`XLNetModel`.
            - `attentions` (List[Tensor], optional):
                See :class:`XLNetModel`.

        Example:
            .. code-block::

                import paddle
                from paddlenlp.transformers.xlnet.modeling import XLNetForQuestionAnswering
                from paddlenlp.transformers.xlnet.tokenizer import XLNetTokenizer

                tokenizer = XLNetTokenizer.from_pretrained('xlnet-base-cased')
                model = XLNetForQuestionAnswering.from_pretrained('xlnet-base-cased')

                inputs = tokenizer("Welcome to use PaddlePaddle and PaddleNLP!")
                inputs = {k:paddle.to_tensor([v]) for (k, v) in inputs.items()}
                outputs = model(**inputs)
                start_logits = outputs[0]
                end_logits = outputs[1]
        """
        transformer_outputs = self.transformer(
            input_ids,
            token_type_ids=token_type_ids,
            attention_mask=attention_mask,
            mems=mems,
            perm_mask=perm_mask,
            target_mapping=target_mapping,
            input_mask=input_mask,
            head_mask=head_mask,
            inputs_embeds=inputs_embeds,
            use_mems_train=use_mems_train,
            use_mems_eval=use_mems_eval,
            return_dict=return_dict, )
        output = transformer_outputs if not return_dict \
            else transformer_outputs["last_hidden_state"]
        logits = self.classifier(output)
        logits = paddle.transpose(logits, perm=[2, 0, 1])
        start_logits, end_logits = paddle.unstack(x=logits, axis=0)
        return start_logits, end_logits
Contributor

Is the logic of XLNetForQuestionAnswering here inconsistent with the HuggingFace reference code?

Contributor Author

  • What is implemented here is HuggingFace's XLNetForQuestionAnsweringSimple; its overall logic is consistent with the other models in PaddleNLP.
  • HuggingFace's XLNetForQuestionAnswering is more complex. Does it need to be implemented?
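
For reference, the "Simple" head discussed here amounts to projecting each token's hidden state to two values and splitting them into start and end logits. The sketch below mirrors the transpose-then-unstack step from the reviewed code, using nested lists instead of Paddle tensors; the helper name `split_span_logits` and the numbers are made up for illustration.

```python
def split_span_logits(logits):
    """Split logits of shape [batch, seq_len, 2] into two [batch, seq_len] lists.

    Equivalent in effect to transposing to [2, batch, seq_len] and unstacking
    along axis 0.
    """
    start_logits = [[token[0] for token in sequence] for sequence in logits]
    end_logits = [[token[1] for token in sequence] for sequence in logits]
    return start_logits, end_logits


# One sequence of three tokens; each token carries (start, end) scores.
logits = [[[0.1, 0.9], [0.7, 0.2], [0.3, 0.4]]]
start_logits, end_logits = split_span_logits(logits)
print(start_logits)
print(end_logits)
```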

@yingyibiao
Contributor

Please address the review comments and resolve the conflicts as soon as possible.

@renmada
Contributor Author

renmada commented Nov 28, 2021

> Please sign the CLA.

I signed the CLA, but the status never updates and still shows as unsigned. I don't know why.

@yingyibiao
Contributor

yingyibiao left a comment

LGTM

@ZeyuChen merged commit 2afd760 into PaddlePaddle:develop on Nov 29, 2021

5 participants