Training data #4

Open
Olivia-xu opened this issue May 15, 2023 · 2 comments
Comments

@Olivia-xu

I'd like to ask: how do you ensure the accuracy of the training data?

@donote

donote commented May 16, 2023

> I'd like to ask: how do you ensure the accuracy of the training data?

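The instruction samples are generated by running ChatGPT over the domain knowledge. This assumes ChatGPT is capable enough to understand the given domain knowledge; the prompts are written to draw out as much of that understanding as possible and convert it into the instruction samples we need.
On "how do you ensure the accuracy of the training data": without human intervention, there is no way to guarantee that the data is fully accurate. In practice, instruction tuning on an open-source base model can be seen as catching up with and fitting ChatGPT's capabilities, so fine-tuning data obtained by using ChatGPT as the teacher is acceptable.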
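
For readers who want something concrete, here is a minimal sketch of the teacher-generation idea described above. It is not this repository's actual pipeline: the prompt wording, the JSON reply format, and the names `build_prompt`, `parse_samples`, `generate_samples`, and `ask_teacher` are all hypothetical, and `ask_teacher` stands in for whatever client you use to call ChatGPT.

```python
import json
from typing import Callable, Dict, List

# Hypothetical prompt wording; the real prompts used by the project are not shown in this thread.
PROMPT_TEMPLATE = (
    "You are a domain expert. Read the passage below and write {n} "
    "instruction/response pairs that can be answered from the passage alone.\n"
    "Return a JSON list of objects with the keys 'instruction' and 'output'.\n\n"
    "Passage:\n{passage}"
)


def build_prompt(passage: str, n: int = 3) -> str:
    """Wrap one piece of domain knowledge in the generation prompt."""
    return PROMPT_TEMPLATE.format(n=n, passage=passage.strip())


def parse_samples(raw_reply: str) -> List[Dict[str, str]]:
    """Keep only well-formed pairs from the teacher's reply; drop everything else."""
    try:
        items = json.loads(raw_reply)
    except json.JSONDecodeError:
        return []
    if not isinstance(items, list):
        return []
    samples = []
    for item in items:
        if not isinstance(item, dict):
            continue
        instruction = item.get("instruction")
        output = item.get("output")
        if (
            isinstance(instruction, str)
            and isinstance(output, str)
            and instruction.strip()
            and output.strip()
        ):
            samples.append(
                {"instruction": instruction.strip(), "output": output.strip()}
            )
    return samples


def generate_samples(
    passage: str, ask_teacher: Callable[[str], str], n: int = 3
) -> List[Dict[str, str]]:
    """ask_teacher is an assumed callable: prompt text in, teacher reply text out."""
    return parse_samples(ask_teacher(build_prompt(passage, n)))
```

The filtering here only checks structure. It does not check whether an answer is factually correct, which matches the point above: without human review the accuracy of the samples is not guaranteed.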

@applepieiris

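Open-sourcing data matters so much. Only now do I appreciate how generous Stanford's Alpaca model, the pioneer of instruction tuning, was in open-sourcing its fine-tuning dataset.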
