
About retraining #1

Open · shawroad opened this issue Aug 4, 2021 · 8 comments

Comments

shawroad commented Aug 4, 2021

Using the released weights directly already gives very good results. But to push performance further in a specialized domain, I did some additional training. The training loss: compute the cosine similarity between the final encoded vectors of the two sentences, then treat it as regression (MSELoss against the original labels). I wonder whether there is a gap between this and your pretraining objective? Any pointers would be appreciated!
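For concreteness, here is a minimal PyTorch sketch of the fine-tuning loss described above: cosine similarity between the two pooled sentence vectors, regressed against the gold labels with MSE. The function name and the pooling choice are illustrative assumptions, not code from this repo.

```python
import torch
import torch.nn.functional as F

def cosine_mse_loss(emb_a: torch.Tensor, emb_b: torch.Tensor,
                    labels: torch.Tensor) -> torch.Tensor:
    # emb_a, emb_b: (batch, dim) pooled sentence vectors from the encoder;
    # labels: (batch,) gold similarity scores (e.g. 0/1 or graded scores).
    cos = F.cosine_similarity(emb_a, emb_b, dim=-1)  # (batch,)
    return F.mse_loss(cos, labels.float())
```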

renmada (Owner) commented Aug 6, 2021

shawroad (Author) commented Aug 6, 2021

I've tried it. I found these weights work really well: on my task they beat supervised SimCSE by 3 points (accuracy).

Jh10555 commented Apr 14, 2022

> I've tried it. I found these weights work really well: on my task they beat supervised SimCSE by 3 points (accuracy).

Did you manually construct triplet data?

shawroad (Author) commented

No, I trained directly on all the positive pairs.

Jh10555 commented Apr 14, 2022

> No, I trained directly on all the positive pairs.

But supervised SimCSE takes triplets as input.

shawroad (Author) commented

I know. But since I feed in positive pairs, the loss is essentially the unsupervised loss with in-batch sampling.
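For concreteness, a minimal sketch of what "positive pairs with the sampled unsupervised loss" amounts to: an in-batch InfoNCE objective where each row's paired sentence is the positive and the rest of the batch serve as negatives. Names and the temperature value are illustrative assumptions; the supervised SimCSE triplet variant would additionally append a column of hard-negative similarities to these logits.

```python
import torch
import torch.nn.functional as F

def in_batch_contrastive_loss(emb_a: torch.Tensor, emb_b: torch.Tensor,
                              temperature: float = 0.05) -> torch.Tensor:
    # emb_a[i] and emb_b[i] form a labeled positive pair; every other row of
    # the batch acts as a negative, as in unsupervised SimCSE's objective.
    emb_a = F.normalize(emb_a, dim=-1)
    emb_b = F.normalize(emb_b, dim=-1)
    sim = emb_a @ emb_b.T / temperature                       # (batch, batch)
    targets = torch.arange(sim.size(0), device=sim.device)   # diagonal = positives
    return F.cross_entropy(sim, targets)
```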

shawroad (Author) commented

You can refer to this repo of mine: https://github.com/shawroad/Semantic-Textual-Similarity-Pytorch

Jh10555 commented Apr 18, 2022

> You can refer to this repo of mine: https://github.com/shawroad/Semantic-Textual-Similarity-Pytorch

OK, thanks!
