Thank you for your attention to our work!
This mask regularizer is a variation of the sigmoid function. As the training step increases, the slope of the function becomes steeper, gradually pushing the continuous mask values toward 0 or 1. However, we observed that as the step approaches the global step, the slope increases sharply, causing instability in training. After testing several parameter sets, we decided to freeze the slope once training reaches 0.8 * global_step; beyond that point the slope is held constant (and is already quite steep). Despite this, the mask parameters continue to update and are still driven toward 0 or 1 under the influence of the active loss (Equation 9 in the paper, dimdown_loss in the code).
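To make the mechanism concrete, here is a minimal sketch of a slope-capped sigmoid mask. This is not the exact code from DimDownModule.py: the names `mask_slope`, `regularized_mask`, `base_slope`, and `growth` are hypothetical, and we assume a linear slope schedule for illustration; the actual schedule in the repo may differ.

```python
import torch

def mask_slope(step: int, global_step: int, max_slope_frac: float = 0.8,
               base_slope: float = 1.0, growth: float = 10.0) -> float:
    # Hypothetical schedule: the slope grows with training progress,
    # but is frozen once step reaches max_slope_frac * global_step,
    # avoiding the instability of an ever-steepening sigmoid.
    progress = min(step, max_slope_frac * global_step) / global_step
    return base_slope + growth * progress

def regularized_mask(scores: torch.Tensor, step: int,
                     global_step: int) -> torch.Tensor:
    # Sharpened sigmoid: a larger slope k pushes outputs toward 0 or 1
    # while staying differentiable, so the mask parameters keep
    # receiving gradients (e.g. from the active/dimdown loss) even
    # after the slope is frozen.
    k = mask_slope(step, global_step)
    return torch.sigmoid(k * scores)
```

The key point the sketch illustrates: freezing the slope stops the *sharpening* of the sigmoid, not the *training* of the mask, which is still optimized through the active loss.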
If you have any other questions, feel free to discuss them with us! And if you come up with any clever ideas, we’d be glad to explore them together!
Hello, I have a few questions from reading the code. Could you help clarify? In DimDownModule.py, the mask stops updating once the step reaches self.global_step * 0.8. What is the reasoning behind this?