
how to train on MASC downstream task? #7

Open
BigHyf opened this issue Jul 18, 2022 · 15 comments

Comments

@BigHyf

BigHyf commented Jul 18, 2022

hello,
I would like to ask whether the current code supports the downstream MASC task, since its input differs from the main task's input: in the MASC task the entities are already known.

@lyhuohuo
Collaborator

The code supports the MASC task; you just need to modify a small part during inference.
In our experiments, during training we use the full output format, like "entity1_start entity1_end sentiment_1 entity2_start entity2_end sentiment_2 ...", and optimize both entities and sentiments.
During inference, we input the golden entities and generate the corresponding sentiments.
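
A minimal sketch of how that flattened training target could be built (the tuple layout and sentiment labels are illustrative assumptions, not the repo's actual code):

```python
# Hypothetical construction of the flattened training target described above:
# each aspect contributes "start end sentiment", concatenated in order.
aspects = [(3, 5, "POS"), (9, 10, "NEG")]  # (start index, end index, sentiment)

target_tokens = []
for start, end, sentiment in aspects:
    target_tokens += [str(start), str(end), sentiment]

print(" ".join(target_tokens))  # -> "3 5 POS 9 10 NEG"
```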

@BigHyf
Author

BigHyf commented Jul 19, 2022

Thank you for your answer!
I would like to ask what input form the minor modification you mentioned takes, i.e., how is the target fed into the model for sentiment classification? Could you elaborate on that, or provide some code?

@lyhuohuo
Collaborator

lyhuohuo commented Jul 19, 2022

The input is the same as for the JMASA task; the difference is only in inference. For the JMASA task, generation starts from the start tokens "bos" "JMASA" and then produces the full output sequence. For the MASC task, during generation we only generate the sentiments, since we input the golden entities. For example, we first input the first entity, so generation starts from "bos" "MASC" "entity1" and then produces the corresponding sentiment. Next, we concatenate the second entity onto the current sequence, and generation continues from "bos" "MASC" "entity1" "sentiment1" "entity2".
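
A minimal sketch of that inference loop, assuming a BART-style seq2seq interface (the function name, `encoder_inputs` layout, and token ids are placeholders, not the repo's code):

```python
import torch

def masc_decode(model, encoder_inputs, golden_entities, bos_ids, sentiment_len=1):
    """Greedy MASC decoding: teacher-force each gold entity span,
    then let the model generate that entity's sentiment token(s)."""
    decoder_ids = list(bos_ids)                 # e.g. ids of "bos" "MASC"
    sentiments = []
    for entity_ids in golden_entities:          # one gold entity at a time
        decoder_ids += list(entity_ids)         # concat the gold entity span
        for _ in range(sentiment_len):          # generate its sentiment
            logits = model(
                **encoder_inputs,
                decoder_input_ids=torch.tensor([decoder_ids]),
            ).logits
            decoder_ids.append(logits[0, -1].argmax().item())
        sentiments.append(decoder_ids[-sentiment_len:])
    return sentiments
```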

@BigHyf
Author

BigHyf commented Jul 19, 2022

So if one example contains multiple targets and sentiments, we merge the targets together when feeding them into the model, rather than having only one target per example as in previous sentiment classification work. When predicting sentiment, is an example only considered correct if the sentiment of every target in it is identified correctly?

@lyhuohuo
Collaborator

Yes for the input, but in evaluation the F1 score is still computed per target.
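
In other words, each gold target's predicted sentiment is scored on its own. A toy illustration (the repo reports F1; this accuracy sketch is illustrative only, not its metric code):

```python
# Each (entity span, sentiment) pair is scored independently,
# regardless of how many targets its sentence contains.
gold = [((3, 5), "POS"), ((9, 10), "NEG")]
pred = [((3, 5), "POS"), ((9, 10), "NEU")]

correct = sum(g == p for g, p in zip(gold, pred))
print(correct / len(gold))  # 0.5: one of the two targets is right
```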

@BigHyf
Author

BigHyf commented Jul 19, 2022

Could you open-source the code for this subtask?
At input time there is effectively an extra target. Is it appended to the end of the text before being fed into the encoder? And may I ask what the decoder's input is? Is it the encoder's input shifted right?

@lyhuohuo
Collaborator

The input is unchanged; I didn't fully understand what you meant just now. The encoder side still takes only the image and the original text, and the decoder side is the same during training. Only at test time do we directly feed the ground truth for the entity part.

BigHyf closed this as completed Jul 19, 2022
BigHyf reopened this Jul 19, 2022
@BigHyf
Author

BigHyf commented Jul 19, 2022

So at training and test time the encoder side is the same, and only the decoder differs. What exactly is the decoder's input during training? And at test time, in what form is the ground truth fed in?

@lyhuohuo
Collaborator

First input the start tokens "bos" "MASC", then concatenate the span of the first golden entity and generate the first sentiment; then concatenate the second entity and continue generating.

@BigHyf
Author

BigHyf commented Jul 19, 2022

For a generative model like BART, do you mean the decoder input is one continuous sequence of the form "golden entity <blank> golden entity <blank>", and the model predicts the contents of the blanks?
When will this part of the code be released?

@lyhuohuo
Collaborator

During training the input is the complete sequence. At test time the entities are fed in one at a time: golden entity1 generates sentiment1, then golden entity1 sentiment1 golden entity2 is fed in to predict sentiment2.
I will update the related code later.
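
So the training side can be ordinary teacher forcing over the full target sequence; a hypothetical sketch (token strings instead of ids, for readability):

```python
# The full target is known at training time, so the decoder input is
# simply the target shifted right by one position behind the start token.
target = ["MASC", "entity1", "sentiment1", "entity2", "sentiment2", "eos"]
decoder_input = ["bos"] + target[:-1]
# Cross-entropy is computed position-wise against `target` in one
# forward pass; no step-by-step loop is needed during training.
```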

@BigHyf
Author

BigHyf commented Jul 19, 2022

"Feeding the entities one at a time": how does the BART decoder achieve this ordering? Also, does the input get split into two kinds of data, one for the first target where only golden entity1 is input, and a second example where golden entity1 sentiment1 golden entity2 is input?

@lyhuohuo
Collaborator

For the MASC data, samples sharing the same text are merged and processed into the same output format as the JMASA task.
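
A hypothetical preprocessing sketch of that merge (field names are assumptions, not the repo's schema):

```python
from collections import defaultdict

# Merge MASC samples that share the same sentence into one
# JMASA-style example with a list of (entity span, sentiment) pairs.
samples = [
    {"text": "the pizza was great but service was slow", "span": (1, 2), "sentiment": "POS"},
    {"text": "the pizza was great but service was slow", "span": (5, 6), "sentiment": "NEG"},
]

merged = defaultdict(list)
for s in samples:
    merged[s["text"]].append((s["span"], s["sentiment"]))

for text, aspects in merged.items():
    print(text, "->", aspects)
```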

@BigHyf
Author

BigHyf commented Jul 19, 2022

You may have misunderstood me. What I mean is: how does the BART decoder handle this step-by-step order? After BART generates the first sentiment, the next target is appended, but isn't the decoder's content fed in all at once, as I understand it? Or could you privately send me the training/test code for this part of the sentiment classification task?

@lyhuohuo
Collaborator

Shall we add each other on WeChat?
