Some questions about FP16 training #841
Hi! I see that mmediting has support for fp16 training. How can I use it?

Comments
Some of the code is ready, but we haven't fully implemented this feature, so you need to modify some source code and possibly fix some potential bugs. Currently, mmediting supports fp16 based on mmcv's `auto_fp16` decorator. To enable fp16, you need to modify the `fp16_enabled` flag. There was a previous PR #320, but it is somewhat out of date.
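For illustration, a minimal sketch of what the model side of this looks like, assuming mmcv 1.x (`ExampleRestorer` is a made-up module; `auto_fp16`, `wrap_fp16_model`, and the `fp16_enabled` flag are the mmcv conventions):

```python
import torch.nn as nn
from mmcv.runner import auto_fp16, wrap_fp16_model


class ExampleRestorer(nn.Module):
    """Made-up module illustrating mmcv's fp16 conventions."""

    def __init__(self):
        super().__init__()
        self.conv = nn.Conv2d(3, 3, 3, padding=1)
        # auto_fp16 is a no-op unless this attribute is True;
        # wrap_fp16_model flips it for every module that defines it.
        self.fp16_enabled = False

    @auto_fp16(apply_to=('img',))
    def forward(self, img):
        # Once fp16 is enabled, `img` is handled in half precision
        # before reaching this body.
        return self.conv(img)


model = ExampleRestorer()
# Enables fp16 for the model (the exact mechanism depends on the
# mmcv/torch version) and sets fp16_enabled = True on the module.
wrap_fp16_model(model)
```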
I turned `fp16_enabled` to True, but nothing happened. The training GPU memory did not get smaller.
Oh, my mistake. Depending on whether you are using distributed or non-distributed training, you need to register the hook around line 179 or 304 in `mmedit/apis/train.py`.
https://github.com/open-mmlab/mmdetection/blob/98949809b7179fab9391663ee5a4ab5978425f90/mmdet/apis/train.py#L153
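For reference, the registration in the linked mmdetection `train.py` looks roughly like this (a sketch, assuming an mmcv 1.x runner; variable names follow that file):

```python
from mmcv.runner import Fp16OptimizerHook

# Build the optimizer hook from the `fp16 = dict(...)` entry in the
# config; fall back to the plain optimizer config when fp16 is off.
fp16_cfg = cfg.get('fp16', None)
if fp16_cfg is not None:
    optimizer_config = Fp16OptimizerHook(
        **cfg.optimizer_config, **fp16_cfg, distributed=distributed)
else:
    optimizer_config = cfg.optimizer_config

runner.register_training_hooks(cfg.lr_config, optimizer_config,
                               cfg.checkpoint_config, cfg.log_config)
```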
It's a dictionary like `fp16 = dict(loss_scale=512.)`. At the current stage, mmdetection is probably a good reference. A good trick is to search for relevant keywords in the repo.
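So in mmediting this would mean adding an entry like the following to the training config (a sketch; 512 is just one of the scales tried below):

```python
# fp16 settings, picked up by the Fp16OptimizerHook registration
# shown above. loss_scale multiplies the loss before backward to
# keep small gradients representable in half precision.
fp16 = dict(loss_scale=512.)
```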
Hello. I have tried to use fp16 but hit a NaN loss problem after around 300~500 iterations. I have tried loss_scale=512/128/64/32, but none of them worked. I have also tried gradient clipping. Do you have any ideas about how to solve this problem?
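A standard mitigation for this kind of overflow, not suggested in the thread itself, is dynamic loss scaling, which newer mmcv versions accept in place of a fixed scale:

```python
# 'dynamic' lets the scaler shrink the loss scale automatically
# whenever gradients overflow, instead of a fixed 512/128/64/32.
fp16 = dict(loss_scale='dynamic')
```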
@LeoXing1996
Hey @Shuweis and @ArchipLab-LinfengZhang, MMEdit 1.x now supports auto-fp16 training, and you are welcome to give it a try.
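In the 1.x (MMEngine-based) stack, mixed precision is typically switched on via the optimizer wrapper in the config, roughly like this (a sketch; the optimizer settings are placeholders):

```python
# MMEngine config: AmpOptimWrapper runs the train step under
# torch.cuda.amp autocast with a GradScaler.
optim_wrapper = dict(
    type='AmpOptimWrapper',
    loss_scale='dynamic',
    optimizer=dict(type='Adam', lr=1e-4),
)
```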
This issue was moved to a discussion. You can continue the conversation there.