
Questions about the paper #1

Open
Gyuyeong opened this issue Sep 1, 2023 · 2 comments

Comments

@Gyuyeong

Gyuyeong commented Sep 1, 2023

After reading the paper, I have some questions:

  1. Is CARP applied to an already fine-tuned LLM like ChatGPT? If so, and if I want to apply this approach to a model that has not been fine-tuned at all (for example, the GPT variants available on Hugging Face), how should I prepare the training data to fine-tune the LLM so that CARP can be applied effectively?
  2. I do not understand what the paper says about how the training set is used. As I understand it, there is a training set, and SimCSE is used to sample some of its examples as few-shot demonstrations (a retrieval sketch follows below). However, I do not see where the training set is used other than for sampling those few-shot examples. Was it used to fine-tune the LLM, or is it only kept around for retrieval?

I apologize if I am asking anything that was already covered in the paper and I missed it. Thank you in advance.
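
For context, here is a minimal sketch (not the paper's code) of how SimCSE-based demonstration retrieval is commonly implemented: embed the training texts and the test input, then take the k nearest training examples as in-context demonstrations. The checkpoint is the public SimCSE model; `train_set`, `retrieve_demos`, and the prompt format are hypothetical placeholders.

```python
# Sketch: kNN demonstration sampling with SimCSE embeddings (assumed workflow).
import torch
import torch.nn.functional as F
from transformers import AutoModel, AutoTokenizer

CKPT = "princeton-nlp/sup-simcse-bert-base-uncased"  # public SimCSE checkpoint
tok = AutoTokenizer.from_pretrained(CKPT)
enc = AutoModel.from_pretrained(CKPT)

def embed(texts):
    """Return L2-normalized [CLS] embeddings, the pooling SimCSE uses."""
    batch = tok(texts, padding=True, truncation=True, return_tensors="pt")
    with torch.no_grad():
        cls = enc(**batch).last_hidden_state[:, 0]
    return F.normalize(cls, dim=-1)

def retrieve_demos(train_set, query, k=8):
    """Pick the k training pairs most similar to the query.

    Note there is no gradient update anywhere: the training set acts
    purely as a retrieval pool for in-context demonstrations.
    """
    texts = [t for t, _ in train_set]
    sims = (embed(texts) @ embed([query]).T).squeeze(1)  # cosine similarity
    top = sims.topk(min(k, len(texts))).indices.tolist()
    return [train_set[i] for i in top]

# Hypothetical usage: assemble a few-shot prompt from the retrieved pairs.
train_set = [("the movie was a delight", "positive"),
             ("a tedious, joyless slog", "negative")]
demos = retrieve_demos(train_set, "an utterly charming film", k=2)
prompt = "\n".join(f"Text: {t}\nLabel: {l}" for t, l in demos)
```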

@PeterXiaTian

After reading the paper, my impression is that the authors did not fine-tune any large model on the training set; they only selected a subset of samples for few-shot prompting. But the sample selection confuses me: for example, SimCSE is used to select texts similar to the query, but where do the query samples come from? (Are some labeled samples masked?) And if there are no prior samples at all, how should this be handled?

@pranerd

pranerd commented Feb 1, 2024

Table 4 in the paper shows that as the training set grows (16, 128, 256, 512, 1024 examples), CARP's accuracy increases. How is the larger training set used? Was the LLM fine-tuned on it?
[screenshot of Table 4 from the paper]
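
If the training set really serves only as a retrieval pool (as suggested above), then growing it from 16 to 1024 examples just enlarges the pool that kNN sampling draws from, and no fine-tuning is involved. A hedged sketch of that reading, reusing the hypothetical `retrieve_demos` helper from the earlier sketch:

```python
# Sketch: Table 4's "training set" sizes read as retrieval-pool sizes.
# Reuses the hypothetical retrieve_demos() defined above; note that no
# fine-tuning step appears anywhere in this loop.
import random

full_train_set = train_set  # placeholder for the full labeled training set

for n in (16, 128, 256, 512, 1024):
    pool = random.sample(full_train_set, min(n, len(full_train_set)))
    demos = retrieve_demos(pool, "an utterly charming film", k=8)
    # A larger pool gives the sampler more candidates, so the retrieved
    # demonstrations tend to be closer to the query, which is one
    # plausible explanation for accuracy rising with pool size.
```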
