You signed in with another tab or window. Reload to refresh your session.You signed out in another tab or window. Reload to refresh your session.You switched accounts on another tab or window. Reload to refresh your session.Dismiss alert
Hi,
I believe there's a small bug in training_loop.py line @
782 which seems to crash my WGAN trainer, having 3 optimizers.
Details:
Take a look at training_loop.py @ line 872: batch_outputs[opt_idx].append(opt_closure_result.training_step_output_for_epoch_end)
opt_idx should point to the optimizer index (0,1,2 in my case) and batch_outputs should be a list of length 3 (in my application).
However, batch_outputs is initialized to a list @ line 818: batch_outputs = [[] for i in range(len(self._get_optimizers_iterable()))]
If self.optimizer_frequencies is defined then the output of _self.get_optimizers_iterable will be a list of a single tuple, therefore batch_outputs shall be a list with a single item only and the loop will fail @ line 872 due to out of index error.
Fix:
@ line 872 replace with: batch_outputs = [[] for i in range(len(self.optimizers))]
replace line 561 with: if optimizer_idx_outputs:
sample_output = optimizer_idx_outputs[-1]
else:
sample_output = None
Attached is a modified training_loop.py
Unfortunately, I don't have a minimal example to replicate the bug and I can't publish my full training testbench.
The text was updated successfully, but these errors were encountered:
awaelchli
changed the title
probably a bug: training multiple optimizers with different frequency
training multiple optimizers with different frequency
Aug 27, 2020
Hi,
I believe there's a small bug in training_loop.py line @
782 which seems to crash my WGAN trainer, having 3 optimizers.
Details:
Take a look at training_loop.py @ line 872:
batch_outputs[opt_idx].append(opt_closure_result.training_step_output_for_epoch_end)
opt_idx should point to the optimizer index (0,1,2 in my case) and batch_outputs should be a list of length 3 (in my application).
However, batch_outputs is initialized to a list @ line 818:
batch_outputs = [[] for i in range(len(self._get_optimizers_iterable()))]
If self.optimizer_frequencies is defined then the output of _self.get_optimizers_iterable will be a list of a single tuple, therefore batch_outputs shall be a list with a single item only and the loop will fail @ line 872 due to out of index error.
Fix:
@ line 872 replace with:
batch_outputs = [[] for i in range(len(self.optimizers))]
replace line 561 with:
if optimizer_idx_outputs:
sample_output = optimizer_idx_outputs[-1]
else:
sample_output = None
Attached is a modified training_loop.py
Unfortunately, I don't have a minimal example to replicate the bug and I can't publish my full training testbench.
The text was updated successfully, but these errors were encountered: