Closed
Labels
bug (Something isn't working)
Description
Describe the bug
The step call updates self.decay, which is then fed back into the step-dependent decay computation on all subsequent steps. This feedback loop corrupts the decay schedule, so the EMA does not behave as intended.
diffusers/examples/text_to_image/train_text_to_image.py
Lines 281 to 293 in ab0e92f
def get_decay(self, optimization_step):
    """
    Compute the decay factor for the exponential moving average.
    """
    value = (1 + optimization_step) / (10 + optimization_step)
    return 1 - min(self.decay, value)

@torch.no_grad()
def step(self, parameters):
    parameters = list(parameters)
    self.optimization_step += 1
    self.decay = self.get_decay(self.optimization_step)
The issue is the assignment self.decay = ...: it should be a local decay = ... so that the configured decay ceiling is not overwritten on every step.
Edit: I just checked the original implementation, and this does indeed look like a bug/typo: https://github.com/CompVis/stable-diffusion/blob/main/ldm/modules/ema.py
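For reference, here is a minimal sketch of what the fixed logic could look like. The class name EMASketch is illustrative (not the diffusers API), plain floats stand in for tensors, and the decay formula follows the CompVis reference linked above: min(self.decay, value) with no 1 - inversion and no write-back to self.decay.

```python
class EMASketch:
    """Minimal EMA sketch (plain floats; tensor machinery elided).

    self.decay is the fixed maximum decay (e.g. 0.9999) and is never
    overwritten; the per-step warm-up decay is a local variable.
    """

    def __init__(self, parameters, decay=0.9999):
        self.decay = decay               # fixed ceiling, never reassigned
        self.optimization_step = 0
        self.shadow_params = list(parameters)

    def get_decay(self, optimization_step):
        # Warm-up: (1 + t) / (10 + t) ramps from 0.1 toward 1,
        # capped at self.decay, as in the CompVis implementation.
        value = (1 + optimization_step) / (10 + optimization_step)
        return min(self.decay, value)    # no `1 -`, no `self.decay = ...`

    def step(self, parameters):
        self.optimization_step += 1
        decay = self.get_decay(self.optimization_step)  # local variable
        self.shadow_params = [
            decay * s + (1 - decay) * p
            for s, p in zip(self.shadow_params, parameters)
        ]
```

With this version the ceiling survives any number of steps, and get_decay reproduces the intended warm-up schedule.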
Reproduction
Replicating the current update rule in isolation shows that the resulting decay sequence makes little sense:
decay = 0.9999
for i in range(10000):
value = (1 + i) / (10 + i)
decay = 1 - min(decay, value)
print(f"{i:07d} {decay:.6f}")
Step Decay (first 10 of 10000 steps shown; the value stays pinned at 0.5 from step 8 onward)
0000000 0.900000
0000001 0.818182
0000002 0.750000
0000003 0.692308
0000004 0.642857
0000005 0.600000
0000006 0.562500
0000007 0.529412
0000008 0.500000
0000009 0.500000
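For comparison, a sketch of the same loop with the fix applied (a local decay, and min(...) without the 1 - inversion, matching the CompVis reference) produces a sensible warm-up that ramps monotonically toward 0.9999:

```python
# Fixed replication: the decay ceiling stays constant; the per-step
# decay is a local variable, never written back.
max_decay = 0.9999
for i in (0, 1, 10, 100, 1000, 9999):
    value = (1 + i) / (10 + i)
    decay = min(max_decay, value)   # ceiling is never overwritten
    print(f"{i:07d} {decay:.6f}")
# 0000000 0.100000
# 0000001 0.181818
# 0000010 0.550000
# 0000100 0.918182
# 0001000 0.990099
# 0009999 0.999101
```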
Logs
No response
System Info
0.12 dev
Metadata
Assignees
patrickvonplaten