
Question about No unsupervised representation learning experiment #2

bigheiniu opened this issue Jan 13, 2023 · 1 comment
bigheiniu commented Jan 13, 2023

Hi Tianduo,
I really appreciate your work on developing learnable data augmentation for sentence representation learning. Your proposed method, DiffAug, shows strong performance in the semi-supervised and supervised settings.

However, I was wondering how DiffAug performs in the unsupervised setting.

  • If you have already tried this, does DiffAug still perform better than SimCSE?
  • If not, what do you think about first training the prefix with unsupervised contrastive learning (keeping the language model frozen), and then jointly training the language model and the prefix? See the rough sketch after this list.
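
To make the second point concrete, here is a rough, hypothetical sketch of what I have in mind (`PrefixEncoder`, `encode`, `info_nce`, and the `bert-base-uncased` backbone are my own illustrative choices, not DiffAug's actual code):

```python
import torch
import torch.nn as nn
import torch.nn.functional as F
from transformers import AutoModel, AutoTokenizer


class PrefixEncoder(nn.Module):
    """Learnable prefix vectors prepended to the token embeddings (illustrative only)."""

    def __init__(self, prefix_len: int, hidden_size: int):
        super().__init__()
        self.prefix = nn.Parameter(torch.randn(prefix_len, hidden_size) * 0.02)

    def forward(self, batch_size: int) -> torch.Tensor:
        # (prefix_len, H) -> (B, prefix_len, H)
        return self.prefix.unsqueeze(0).expand(batch_size, -1, -1)


def info_nce(z1: torch.Tensor, z2: torch.Tensor, temperature: float = 0.05) -> torch.Tensor:
    """In-batch InfoNCE loss between two views of the same sentences."""
    sim = F.cosine_similarity(z1.unsqueeze(1), z2.unsqueeze(0), dim=-1) / temperature
    labels = torch.arange(z1.size(0), device=z1.device)
    return F.cross_entropy(sim, labels)


def encode(model, prefix_enc, tokenizer, sentences):
    """Prepend the learnable prefix to the word embeddings and return sentence embeddings."""
    batch = tokenizer(sentences, padding=True, truncation=True, return_tensors="pt")
    tok_emb = model.get_input_embeddings()(batch["input_ids"])        # (B, T, H)
    prefix = prefix_enc(tok_emb.size(0))                              # (B, P, H)
    inputs_embeds = torch.cat([prefix, tok_emb], dim=1)
    prefix_mask = torch.ones(tok_emb.size(0), prefix.size(1), dtype=batch["attention_mask"].dtype)
    attn_mask = torch.cat([prefix_mask, batch["attention_mask"]], dim=1)
    out = model(inputs_embeds=inputs_embeds, attention_mask=attn_mask)
    return out.last_hidden_state[:, prefix.size(1)]                   # [CLS] position, right after the prefix


model = AutoModel.from_pretrained("bert-base-uncased")
tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")
prefix_enc = PrefixEncoder(prefix_len=16, hidden_size=model.config.hidden_size)
model.train()  # keep dropout active so two passes give two different views (as in SimCSE)

# Stage 1: freeze the language model, train only the prefix with an unsupervised contrastive loss.
for p in model.parameters():
    p.requires_grad = False
opt1 = torch.optim.AdamW(prefix_enc.parameters(), lr=1e-3)

sentences = ["A dog runs in the park.", "The weather is nice today."]
opt1.zero_grad()
z1 = encode(model, prefix_enc, tokenizer, sentences)  # view 1 (dropout noise)
z2 = encode(model, prefix_enc, tokenizer, sentences)  # view 2 (dropout noise)
info_nce(z1, z2).backward()                           # only the prefix receives gradients
opt1.step()

# Stage 2: unfreeze the language model and train it jointly with the prefix.
for p in model.parameters():
    p.requires_grad = True
opt2 = torch.optim.AdamW(list(model.parameters()) + list(prefix_enc.parameters()), lr=3e-5)
```

Of course this skips batching over a real corpus; I only want to show the freeze-then-joint schedule I am asking about.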
TianduoWang (Owner) commented

Hi Yichuan,

Thanks for your question!

In our preliminary experiments, we did try unsupervised learning objectives (e.g., MLM), but the final performance was not satisfying.

As for your question about whether it is possible to do contrastive learning twice (once for prefix-tuning, once for joint tuning), I suggest reading this paper; the idea is quite relevant to yours.

I believe it is interesting and worthwhile to explore whether we can train a data augmentation module (e.g., a prefix) with only unsupervised data. As we suggested in our paper, making positive pairs meaningfully different is a promising way to improve the performance of contrastive learning.
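
As a toy illustration of that last point (my own sketch here, not code from our repo): instead of the two identical-input dropout views used in unsupervised SimCSE, the positive could come from an augmented encoder (e.g., one carrying a learned prefix), so the two views of each sentence are meaningfully different while still forming a positive pair.

```python
import torch
import torch.nn.functional as F


def contrastive_loss(anchor: torch.Tensor, positive: torch.Tensor, temperature: float = 0.05) -> torch.Tensor:
    """In-batch contrastive loss: each anchor should match its own positive."""
    sim = F.cosine_similarity(anchor.unsqueeze(1), positive.unsqueeze(0), dim=-1) / temperature
    return F.cross_entropy(sim, torch.arange(anchor.size(0)))


# anchor: sentence embeddings from the plain encoder
# positive: embeddings of the *same* sentences from an augmented encoder (stand-in below)
anchor = torch.randn(8, 768)
positive = anchor + 0.1 * torch.randn(8, 768)  # hypothetical "meaningfully different" view
loss = contrastive_loss(anchor, positive)
```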
