Question about the training data #21

Open
xiki-1014 opened this issue Feb 28, 2024 · 1 comment

Comments

@xiki-1014

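Hello, I have a few questions about the training data:
Was the training data a mix of general-domain and medical-domain data, or medical data only? If it was a mix, what was the ratio between the two? I ran full-parameter fine-tuning on medical data alone and ran into catastrophic forgetting.
Looking forward to your reply.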

@jymChen
Contributor

jymChen commented Jun 25, 2024

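@xiki-1014 Hello,
For general-domain data, HuatuoGPT-II used only the ShareGPT general instruction-tuning data, mixed together with the medical SFT data. However, we did not evaluate the model's performance in the general domain.
If you see fairly severe catastrophic forgetting in the general domain, we suggest adding more general data.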
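
As a rough illustration of "adding more general data", here is a minimal sketch of mixing general samples into a medical SFT set at a chosen fraction. This is not the HuatuoGPT-II data pipeline: the function name `mix_sft_data`, the `general_fraction` parameter, and the 10% value in the usage lines are all hypothetical, and no mixing ratio is stated in this thread.

```python
import random


def mix_sft_data(medical, general, general_fraction, seed=0):
    """Return a shuffled mix of all medical samples plus sampled general ones.

    general_fraction is the target share of general samples in the final
    mix (an assumed knob for illustration, not a value from HuatuoGPT-II).
    """
    if not 0.0 <= general_fraction < 1.0:
        raise ValueError("general_fraction must be in [0, 1)")

    rng = random.Random(seed)

    # Solve n_general / (n_medical + n_general) = general_fraction.
    n_general = round(len(medical) * general_fraction / (1.0 - general_fraction))
    # Cannot draw more general samples than are available.
    n_general = min(n_general, len(general))

    mixed = list(medical) + rng.sample(list(general), n_general)
    rng.shuffle(mixed)
    return mixed


# Usage with placeholder records; real entries would be SFT conversations.
medical = [{"source": "medical", "id": i} for i in range(90)]
general = [{"source": "general", "id": i} for i in range(50)]
mixed = mix_sft_data(medical, general, general_fraction=0.1)
```

If forgetting persists, one option is to raise `general_fraction` step by step and re-check general-domain behaviour after each run.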
