Hi,

I'm using a dataset with about 3,500 images for training and 400 for validation, across 50 classes, and I'm training in a multi-GPU environment (an HPC machine). At the moment the global AP is about 18; my current settings are the ones that give the best AP I've found so far.

These are the AP and execution times for 2, 4, and 8 GPUs:

2 GPUs: AP 18.6 -- 55 minutes
4 GPUs: AP 18.2 -- 57 minutes
8 GPUs: AP 18.0 -- 56 minutes

So my questions are:

1. Why is the execution time the same whether I run with 2, 4, or 8 GPUs? (The GPUs are all busy; I've checked.)
2. How can I improve the global AP, which is quite low at the moment? I've tried many learning rates, iteration counts, and LR decay schedules (via `cfg.SOLVER.STEPS`), with no success.
3. How can I stop training when the AP has not increased for many iterations? More generally, what is a good criterion for stopping training once the iteration count gets too high?
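On question 1, one thing I've been looking at is the linear scaling rule: if `cfg.SOLVER.IMS_PER_BATCH` stays fixed while the GPU count grows, each GPU just processes a smaller slice of the same global batch, so the number of iterations (and hence the wall time) barely changes. Here is a minimal sketch of the scaling arithmetic I have in mind — the reference values below are placeholders, not my actual settings:

```python
# Sketch of the linear scaling rule for multi-GPU training:
# the global batch size and base LR scale together with the GPU count,
# and the iteration budget shrinks proportionally.
# Reference values are placeholders, not my actual config.

REF_GPUS = 2
REF_IMS_PER_BATCH = 4      # global batch size with 2 GPUs (placeholder)
REF_BASE_LR = 0.0025       # LR tuned for that batch size (placeholder)
REF_MAX_ITER = 30000       # iteration budget at the reference batch size

def scaled_solver(num_gpus):
    """Scale batch size and LR linearly with the number of GPUs."""
    factor = num_gpus / REF_GPUS
    return {
        "IMS_PER_BATCH": int(REF_IMS_PER_BATCH * factor),
        "BASE_LR": REF_BASE_LR * factor,
        # With a larger global batch, the same number of epochs
        # needs proportionally fewer iterations.
        "MAX_ITER": int(REF_MAX_ITER / factor),
    }

for gpus in (2, 4, 8):
    print(gpus, scaled_solver(gpus))
```

If this reasoning is right, keeping `IMS_PER_BATCH` constant across 2/4/8 GPUs would explain the flat wall time I'm seeing.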
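On question 3, the direction I'm considering is a patience-based early stop: halt when the validation AP hasn't improved for N consecutive evaluations. Below is a pure-Python sketch of just the bookkeeping — the wiring into the trainer's periodic evaluation (e.g. a Detectron2 hook) is omitted, and the class name and thresholds are hypothetical:

```python
class EarlyStopper:
    """Track a metric (e.g. validation AP) and signal when it has not
    improved for `patience` consecutive evaluations."""

    def __init__(self, patience=5, min_delta=0.1):
        self.patience = patience      # evaluations to wait before stopping
        self.min_delta = min_delta    # minimum AP gain that counts as progress
        self.best = float("-inf")
        self.bad_evals = 0

    def update(self, metric):
        """Record one evaluation; return True if training should stop."""
        if metric > self.best + self.min_delta:
            self.best = metric
            self.bad_evals = 0
        else:
            self.bad_evals += 1
        return self.bad_evals >= self.patience

# Usage: call update() after each periodic evaluation.
stopper = EarlyStopper(patience=3, min_delta=0.1)
aps = [17.0, 17.9, 18.2, 18.1, 18.2, 18.0, 18.15]
stops = [stopper.update(ap) for ap in aps]
# The AP plateaus after 18.2, so the stop signal fires on the 6th evaluation.
```

Does this kind of criterion make sense here, or is there a more standard way to pick a stopping point?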