Actually, model0's results are already acceptable #3
Comments
Results from model3_sft1
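I think we can stop increasing model complexity at the model0-1 scale; there is no need to push dim beyond 1024. From here on, the focus should be on the training data. I only have two 3090s and a single 4090, and I have to compete with other students for them, so large-scale distributed training and large datasets are out of my reach.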
OK, I'll check again.
I only ran an inference test on model0, and the results are impressive: The fable of Ali Baba?
Results from my test with your pretrained model
The earlier poor results were entirely due to this bug
You can retest now; the results shown in the README do not match the actual ones