
Model parallelism issue #82

Open
juemifuji opened this issue Apr 17, 2023 · 4 comments

Comments

@juemifuji

Training the chatglm-6b model with model parallelism is now possible!!! Please follow the link to Chatglm6b_ModelParallel. In the current version, although the loss decreases during training, the model does not actually learn the content; I am still investigating this issue.

Has this issue been resolved yet?

@yuanzhoulvpi2017
Owner

Not resolved yet~ but in a few days I will release new LoRA training code (with support for model parallelism across multiple GPUs).

@huangxd-

mymusise/ChatGLM-Tuning#59 (comment)
I was using ChatGLM-Tuning; it had little effect at first, but after switching to target_modules=["query_key_value", "dense", "dense_h_to_4h", "dense_4h_to_h"] it started working.
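For reference, a minimal sketch of what that change looks like with peft's LoraConfig (the approach ChatGLM-Tuning uses); only the target_modules list comes from the comment above, the remaining hyperparameters are illustrative defaults:

```python
import torch
from transformers import AutoModel
from peft import LoraConfig, TaskType, get_peft_model

# Load the base model (half precision on a single GPU here, just for brevity).
model = AutoModel.from_pretrained(
    "THUDM/chatglm-6b", trust_remote_code=True, torch_dtype=torch.half
).cuda()

lora_config = LoraConfig(
    task_type=TaskType.CAUSAL_LM,
    r=8,                # illustrative rank, not from the comment
    lora_alpha=32,      # illustrative scaling, not from the comment
    lora_dropout=0.1,
    # Target the MLP projections as well as the attention projection,
    # as suggested in the linked ChatGLM-Tuning comment.
    target_modules=["query_key_value", "dense", "dense_h_to_4h", "dense_4h_to_h"],
)

model = get_peft_model(model, lora_config)
model.print_trainable_parameters()
```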

@cxj01

cxj01 commented Apr 20, 2023

@yuanzhoulvpi2017
Using the repository code, even though my machine has two GPUs, everything is still loaded onto one GPU. If I assign individual layers to different GPUs, I get an error saying the tensors are not on the same device.

@yuanzhoulvpi2017
Owner

@yuanzhoulvpi2017 Using the repository code, even though my machine has two GPUs, everything is still loaded onto one GPU. If I assign individual layers to different GPUs, I get an error saying the tensors are not on the same device.

Are you still using the old model files? You need to use the latest ones.
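For reference, a minimal sketch (not the repository's own training code) of loading ChatGLM-6B across the visible GPUs with device_map="auto", which lets accelerate place the layers and move activations between devices instead of assigning layers by hand; it assumes transformers and accelerate are installed and uses the public checkpoint name:

```python
import torch
from transformers import AutoModel, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("THUDM/chatglm-6b", trust_remote_code=True)

# device_map="auto" shards the transformer layers across cuda:0 and cuda:1;
# accelerate inserts hooks that move intermediate tensors between GPUs, which
# is what manual per-layer placement was missing.
model = AutoModel.from_pretrained(
    "THUDM/chatglm-6b",
    trust_remote_code=True,
    torch_dtype=torch.half,
    device_map="auto",
)
print(model.hf_device_map)  # shows which layers ended up on which GPU

# Inputs only need to live on the device of the first shard.
inputs = tokenizer("你好", return_tensors="pt").to("cuda:0")
with torch.no_grad():
    output = model.generate(**inputs, max_new_tokens=32)
print(tokenizer.decode(output[0], skip_special_tokens=True))
```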
