PaddlePaddle Hackathon 57 submission #1128
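There is no community folder under transformers. Do I need to create it myself?
That was written incorrectly; it should be the PaddleNLP/community folder.
tokenizer_config_file isn't required, is it?
The tokenizer_config_file file is required as well.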
{
  "model_config_file": "https://paddlenlp.bj.bcebos.com/models/transformers/community/renmada/distilbert-base-multilingual-cased/model_config.json",
  "model_state": "https://paddlenlp.bj.bcebos.com/models/transformers/community/renmada/distilbert-base-multilingual-cased/model_state.pdparams",
  "tokenizer_config_file": "https://paddlenlp.bj.bcebos.com/models/transformers/community/renmada/bert-base-uncased-sst-2-finetuned/tokenizer_config.json",
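The tokenizer_config_file used here must be the one that corresponds to this model.
Please add the corresponding tokenizer_config_file to the Baidu Netdisk share.
Uploaded, but by default they are all empty files, aren't they?
The issues above have been fixed.
Neither tokenizer_config.json nor model_config.json should be empty.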
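A small sanity check can catch empty config files before they are uploaded. The sketch below is only an illustration and not part of PaddleNLP: the helper name `check_config_not_empty` is hypothetical, and it only verifies that a file parses as a non-empty JSON object, not that its contents are correct for the model.

```python
import json
import os


def check_config_not_empty(path):
    """Return True if `path` holds a non-empty JSON object (hypothetical helper)."""
    # A missing or zero-byte file is the failure mode raised in the review.
    if not os.path.isfile(path) or os.path.getsize(path) == 0:
        return False
    with open(path, encoding="utf-8") as f:
        try:
            config = json.load(f)
        except json.JSONDecodeError:
            # Whitespace-only or malformed content is treated as empty.
            return False
    # An empty object ("{}") carries no settings either.
    return isinstance(config, dict) and len(config) > 0
```

Running it over both tokenizer_config.json and model_config.json before uploading would flag the empty-file case discussed above.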
The weight conversion code for the DistilBert model needs to be added.
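As a rough idea of what such conversion code does, here is a minimal sketch that works on plain nested lists so it runs without PyTorch or Paddle. It is not the script from this PR. The function names and the rename rules passed in are hypothetical; a real script would typically load tensors with `torch.load` and write them with `paddle.save`. The one fact it relies on is that PyTorch's `nn.Linear` stores its weight as [out_features, in_features] while Paddle's `nn.Linear` stores it as [in_features, out_features].

```python
def transpose_2d(matrix):
    """Transpose a 2-D weight stored as a list of rows."""
    return [list(col) for col in zip(*matrix)]


def convert_state_dict(source_state, rename_rules):
    """Rename keys and transpose Linear weights (illustrative sketch).

    source_state: dict mapping parameter names to nested lists.
    rename_rules: ordered (old_substring, new_substring) pairs; hypothetical.
    """
    converted = {}
    for name, value in source_state.items():
        is_2d = bool(value) and isinstance(value[0], list)
        # 2-D Linear weights change layout between the two frameworks.
        # Embedding tables are also 2-D but keep their layout, and
        # 1-D parameters (biases, LayerNorm weights) are copied unchanged.
        if name.endswith(".weight") and is_2d and "embeddings" not in name:
            value = transpose_2d(value)
        # Apply the rename rules in order to build the target name.
        for old, new in rename_rules:
            name = name.replace(old, new)
        converted[name] = value
    return converted
```

The rename rules are kept as a parameter because the exact name mapping depends on the checkpoint being converted.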
model = DistilBertForMaskedLM.from_pretrained('distilbert-base-multilingual-cased')
tokenizer = DistilBertTokenizer.from_pretrained('distilbert-base-multilingual-cased')
The correct way to call it here is:
model = DistilBertForMaskedLM.from_pretrained('renmada/distilbert-base-multilingual-cased')
tokenizer = DistilBertTokenizer.from_pretrained('renmada/distilbert-base-multilingual-cased')
model = DistilBertModel.from_pretrained('distilbert-base-multilingual-cased')
tokenizer = DistilBertTokenizer.from_pretrained('distilbert-base-multilingual-cased')
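The class that corresponds to these weights is DistilBertForSequenceClassification; change the weight name in the same way as above.
# Model source
https://huggingface.co/sshleifer/tiny-distilbert-base-uncased-finetuned-sst-2-english
# Model usage
The parameter names in this model use the bert prefix, which was manually changed to distilbert when converting to Paddle. Because its weights include a pooler and PaddleNLP's DistilBert has no pooler implementation, the example only shows how to load the weights with DistilBertModel.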
The class that corresponds to these weights is DistilBertForSequenceClassification.
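Personally, I don't think that makes sense here.
The original weights cannot be fully loaded into PaddleNLP's DistilBertForSequenceClassification, because the original weights contain a pooler and PaddleNLP's DistilBert has no pooler implementation.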
XLNet Model with a language modeling head on top (linear layer with weights tied to the input embeddings).
Args:
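Add a blank line before Args:.
# Model source
https://huggingface.co/sshleifer/tiny-distilbert-base-uncased-finetuned-sst-2-english
# Model usage
The parameter names in this model use the bert prefix, which was manually changed to distilbert when converting to Paddle. Because its weights include a pooler and PaddleNLP's DistilBert has no pooler implementation, the example only shows how to load the weights with DistilBertModel.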
A mismatch like the one described above should not exist.
But PaddleNLP's DistilBert implementation has no pooler.
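The naming of the original weights is closer to BertModel than to DistilBert.
During conversion, the transformer layers at the front can be renamed to the DistilBert naming scheme, but its pooler is not implemented in PaddleNLP's DistilBert.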
The pooler and pre_classifier use different activation functions (tanh and relu, respectively), so the final forward results will differ.
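To make the activation point concrete, the sketch below applies tanh and relu to the same made-up vector. It is a toy illustration only: the input values and the helper `apply_head` are assumptions, and no real pooler or pre_classifier weights are involved.

```python
import math


def relu(x):
    """Rectified linear unit on a single float."""
    return max(0.0, x)


def apply_head(hidden, activation):
    """Apply an element-wise activation to a hidden vector."""
    return [activation(h) for h in hidden]


hidden = [-1.0, 0.5, 2.0]  # made-up values for illustration
tanh_out = apply_head(hidden, math.tanh)  # squashes every value into (-1, 1)
relu_out = apply_head(hidden, relu)       # zeroes negatives, keeps positives
print(tanh_out)
print(relu_out)
```

Even with identical weights in the preceding linear layer, the two activations produce different vectors, so the logits computed from them differ as well.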
"""
XLNet Model with a span classification head on top for extractive question-answering tasks like SQuAD (a linear
layers on top of the hidden-states output to compute `span start logits` and `span end logits`).
"""
Add a docstring for the __init__ function.
"""
XLNet Model with a multiple choice classification head on top (a linear layer on top of the pooled output and a
softmax) e.g. for RACE/SWAG tasks.
"""
Add a docstring for the __init__ function.
Please sign the CLA.
    return_dict=return_dict, )
output = transformer_outputs if not return_dict \
    else transformer_outputs["last_hidden_state"]
logits = self.classifier(output)
self.classifier is not defined.

        self.init_weights()

    def forward(
            self,
            input_ids,
            token_type_ids=None,
            attention_mask=None,
            mems=None,
            perm_mask=None,
            target_mapping=None,
            input_mask=None,
            head_mask=None,
            inputs_embeds=None,
            use_mems_train=False,
            use_mems_eval=False,
            return_dict=False, ):
        r"""
        The XLNetForQuestionAnswering forward method, overrides the `__call__()` special method.

        Args:
            input_ids (Tensor):
                See :class:`XLNetModel`.
            token_type_ids (Tensor, optional):
                See :class:`XLNetModel`.
            attention_mask (Tensor, optional):
                See :class:`XLNetModel`.
            mems (Tensor, optional):
                See :class:`XLNetModel`.
            perm_mask (Tensor, optional):
                See :class:`XLNetModel`.
            target_mapping (Tensor, optional):
                See :class:`XLNetModel`.
            input_mask (Tensor, optional):
                See :class:`XLNetModel`.
            head_mask (Tensor, optional):
                See :class:`XLNetModel`.
            inputs_embeds (Tensor, optional):
                See :class:`XLNetModel`.
            use_mems_train (bool, optional):
                See :class:`XLNetModel`.
            use_mems_eval (bool, optional):
                See :class:`XLNetModel`.
            return_dict (bool, optional):
                See :class:`XLNetModel`.

        Returns:
            tuple or dict: Returns tensor (`start_logits`, `end_logits`) or a dict with key-value pairs:
            {"start_logits": `start_logits`, "end_logits": `end_logits`, "mems": `mems`,
            "hidden_states": `hidden_states`, "attentions": `attentions`}

            With the corresponding fields:
            - `start_logits` (Tensor):
                A tensor of the input token classification logits, indicates the start position of the labelled span.
                Its data type should be float32 and its shape is [batch_size, sequence_length].
            - `end_logits` (Tensor):
                A tensor of the input token classification logits, indicates the end position of the labelled span.
                Its data type should be float32 and its shape is [batch_size, sequence_length].
            - `mems` (List[Tensor]):
                See :class:`XLNetModel`.
            - `hidden_states` (List[Tensor], optional):
                See :class:`XLNetModel`.
            - `attentions` (List[Tensor], optional):
                See :class:`XLNetModel`.

        Example:
            .. code-block::

                import paddle
                from paddlenlp.transformers.xlnet.modeling import XLNetForQuestionAnswering
                from paddlenlp.transformers.xlnet.tokenizer import XLNetTokenizer

                tokenizer = XLNetTokenizer.from_pretrained('xlnet-base-cased')
                model = XLNetForQuestionAnswering.from_pretrained('xlnet-base-cased')

                inputs = tokenizer("Welcome to use PaddlePaddle and PaddleNLP!")
                inputs = {k:paddle.to_tensor([v]) for (k, v) in inputs.items()}
                outputs = model(**inputs)
                start_logits = outputs[0]
                end_logits = outputs[1]
        """
        transformer_outputs = self.transformer(
            input_ids,
            token_type_ids=token_type_ids,
            attention_mask=attention_mask,
            mems=mems,
            perm_mask=perm_mask,
            target_mapping=target_mapping,
            input_mask=input_mask,
            head_mask=head_mask,
            inputs_embeds=inputs_embeds,
            use_mems_train=use_mems_train,
            use_mems_eval=use_mems_eval,
            return_dict=return_dict, )
        output = transformer_outputs if not return_dict \
            else transformer_outputs["last_hidden_state"]
        logits = self.classifier(output)
        logits = paddle.transpose(logits, perm=[2, 0, 1])
        start_logits, end_logits = paddle.unstack(x=logits, axis=0)
        return start_logits, end_logits
Is the logic of the XLNetForQuestionAnswering task inconsistent with the HuggingFace reference code?
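- What is implemented here is HuggingFace's XLNetForQuestionAnsweringSimple; its overall logic is consistent with the other models in PaddleNLP
- HuggingFace's XLNetForQuestionAnswering is more complex; does it need to be implemented?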
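For reference, the last steps of the forward method quoted in this thread take per-token logits of shape [batch_size, sequence_length, 2], transpose them with perm=[2, 0, 1], and unstack along axis 0. The sketch below reproduces that split on plain nested lists so it runs without Paddle. The function name `split_span_logits` and the sample numbers are illustrative assumptions, not code from the PR.

```python
def split_span_logits(logits):
    """Split [batch, seq_len, 2] logits into start and end logits.

    Mirrors, on nested lists, a transpose to [2, batch, seq_len]
    followed by unstacking along axis 0.
    """
    # Channel 0 of every token scores it as the start of the answer span.
    start_logits = [[token[0] for token in sequence] for sequence in logits]
    # Channel 1 of every token scores it as the end of the answer span.
    end_logits = [[token[1] for token in sequence] for sequence in logits]
    return start_logits, end_logits


# One sequence with three tokens; the numbers are made up.
logits = [[[0.1, 0.9], [0.2, 0.8], [0.3, 0.7]]]
start_logits, end_logits = split_span_logits(logits)
print(start_logits)
print(end_logits)
```

Each returned list has shape [batch_size, sequence_length], matching the shapes described in the docstring under review.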
Please revise according to the review comments and resolve the conflicts as soon as possible.
LGTM
Task: #1073
Weight files link: https://pan.baidu.com/s/1-FJDmtfO8MuPQgq0EEbUhw extraction code: gst6
Adds XLNetLMHeadModel, XLNetForMultipleChoice, and XLNetForQuestionAnswering.
Adds unit tests for XLNetLMHeadModel, XLNetForMultipleChoice, and XLNetForQuestionAnswering.