
Compare traditional training results with SWA results at the same epoch level #7

Open
Haijunlv opened this issue Jan 13, 2021 · 3 comments
Haijunlv commented Jan 13, 2021

Nice work making SWA work in object detection!
I have one question about the comparison at the same epoch level.
The results look like Faster R-CNN R50 1x + 1x extra SWA training gets the same result as Faster R-CNN R50 2x?
I think there may be a couple of issues with this comparison.

  1. Faster R-CNN R50 1x + 1x extra SWA training uses cyclic training, but the original Faster R-CNN R50 2x uses step-decay LR training.
    This mismatch may lead to different convergence. I think the best way to get a fair comparison is to train the models from scratch with cyclic training.
  2. SWA needs to update the batch norm statistics to match the averaged weights; frozen BN may harm the final ensemble result (a sketch of this step follows below).
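
To make point 2 concrete, here is a minimal sketch using the stock PyTorch SWA utilities (torch.optim.swa_utils) rather than this repo's code; the toy model and loader below are placeholders, not the actual detector.

```python
import torch
from torch import nn
from torch.optim.swa_utils import AveragedModel, update_bn
from torch.utils.data import DataLoader, TensorDataset

# Toy stand-ins for the detector and its training loader (placeholders only).
model = nn.Sequential(nn.Conv2d(3, 8, 3, padding=1), nn.BatchNorm2d(8), nn.ReLU())
loader = DataLoader(TensorDataset(torch.randn(16, 3, 32, 32)), batch_size=4)

swa_model = AveragedModel(model)
# ... during training, swa_model.update_parameters(model) is called once per cycle ...

# The averaged weights no longer match the BN running mean/var stored with any
# single checkpoint, so SWA normally recomputes them with one pass over the data.
# With frozen BN (fixed running statistics, as in typical detector backbones)
# this recomputation changes nothing, which is the concern raised in point 2.
update_bn(loader, swa_model)
```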


hyz-xmaster (Owner) commented Jan 13, 2021

I think it is normal to get the same results, because 1x training is actually not enough to reach the performance saturation point in this case. What you should expect from SWA training is whether it can further improve the saturated performance.


  • You can try training Faster R-CNN from scratch with cyclic learning rates, and you will probably get slightly worse results. Do not ask me why I know this. (A rough sketch of such a cyclic schedule follows this list.)

  • The reason for freezing BN in the backbone is that the batch size used for training object detectors is not large enough to compute accurate BN statistics. There are experiments on this in Section 5.2 of the MMDetection paper. In practice, there is still a considerable AP improvement with frozen BN, so it should not be a big problem.
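
For illustration, a rough sketch of both points in plain PyTorch rather than this repo's MMDetection configs: freezing the backbone BN layers and stepping a cyclic learning-rate schedule per iteration. The layer shapes and learning-rate values here are placeholders, not the settings used in this project.

```python
import torch
from torch import nn

# Stand-in for a detector backbone; shapes are illustrative only.
backbone = nn.Sequential(
    nn.Conv2d(3, 64, kernel_size=7, stride=2, padding=3, bias=False),
    nn.BatchNorm2d(64),
    nn.ReLU(inplace=True),
)

# "Frozen BN": keep the running statistics fixed (eval mode) and stop updating
# the affine weight/bias. Detectors typically re-apply eval() on BN inside an
# overridden train() so this survives later calls to backbone.train().
for m in backbone.modules():
    if isinstance(m, nn.BatchNorm2d):
        m.eval()
        for p in m.parameters():
            p.requires_grad_(False)

optimizer = torch.optim.SGD(
    (p for p in backbone.parameters() if p.requires_grad),
    lr=0.02, momentum=0.9, weight_decay=1e-4,
)

# A cyclic schedule of the kind SWA training uses: the lr ramps between base_lr
# and max_lr once per cycle, and SWA typically averages the weights captured
# near the end of each cycle. This differs from the epoch-based step-decay
# schedule of the standard 2x recipe.
scheduler = torch.optim.lr_scheduler.CyclicLR(
    optimizer, base_lr=0.0002, max_lr=0.02,
    step_size_up=500, step_size_down=500, mode='triangular',
)

for _ in range(2000):        # stepped per iteration (loss/backward omitted)
    optimizer.step()
    scheduler.step()
```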

Haijunlv (Author) commented

OK, then maybe Faster R-CNN R50 2x + SWA could get a further improvement.
Thanks for your answer.

hyz-xmaster (Owner) commented

Yes, I think it would give a further improvement. No worries.
