add lamb optimizer #7389
Conversation
@@ -14,6 +14,7 @@ See the License for the specific language governing permissions and
 limitations under the License.
 */

+#include <memory>
remove
-      stream, n, scale, l1, l2, beta1, beta2, epsilon, weight_decay, learning_rate, scale_by_ptr,
-      skip_if, reinterpret_cast<const half*>(model_diff), adam_diff, model, m, v, norm_buffer,
-      beta1_t, beta2_t);
+      stream, n, scale, l1, l2, beta1, beta2, epsilon, weight_decay, learning_rate_val, do_bias_correction, bias_correction1_val, bias_correction2_val, learning_rate_ptr, bias_correction1_ptr, bias_correction2_ptr, scale_by_ptr,
Won't of_format split this line into multiple lines?
Code got formatted by CI. Please request CI again if you still want to have this PR merged. If the PR is from a forked repo, please download the patch files from the GitHub Actions web page and apply them locally.
CI failed when running job: cuda-misc. PR label automerge has been removed.
…com:Oneflow-Inc/oneflow into dev_lxy_lambOptim
Speed stats:
Add a LAMB optimizer interface for both eager and graph mode.
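For context, a minimal usage sketch, assuming the new interface follows the pattern of oneflow's existing optimizers (`flow.optim.Adam` etc.); the exact name and signature are defined by this PR and may differ:

```python
import oneflow as flow

model = flow.nn.Linear(128, 10)

# Hypothetical constructor, mirroring the signature of flow.optim.Adam.
optimizer = flow.optim.LAMB(
    model.parameters(),
    lr=1e-3,
    betas=(0.9, 0.999),
    eps=1e-8,
    weight_decay=0.01,
)

# Eager mode: the usual backward/step loop.
loss = model(flow.randn(4, 128)).sum()
loss.backward()
optimizer.step()
optimizer.zero_grad()
```

In graph mode the same optimizer object would instead be registered inside an `nn.Graph` subclass via `self.add_optimizer(optimizer)`.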
The original formulas are as follows:
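The formula images did not survive extraction; below is a sketch of the LAMB update reconstructed from You et al., *Large Batch Optimization for Deep Learning* (the paper this optimizer comes from), with the gradient-normalization step that the next sentence refers to. The notation is an assumption, not copied from the PR:

```latex
% LAMB update sketch (per parameter block i, step t), after You et al. 2020.
\begin{align*}
g_t &= \frac{\nabla \mathcal{L}(x_t)}{\lVert \nabla \mathcal{L}(x_t) \rVert_2}
      && \text{step 1: normalize gradients} \\
m_t &= \beta_1 m_{t-1} + (1-\beta_1)\, g_t,
\quad v_t = \beta_2 v_{t-1} + (1-\beta_2)\, g_t^2
      && \text{Adam moments} \\
\hat m_t &= \frac{m_t}{1-\beta_1^t},
\quad \hat v_t = \frac{v_t}{1-\beta_2^t}
      && \text{bias correction} \\
r_t &= \frac{\hat m_t}{\sqrt{\hat v_t} + \epsilon} + \lambda x_t
      && \text{update direction, weight decay } \lambda \\
x_{t+1}^{(i)} &= x_t^{(i)} - \eta_t\,
      \frac{\phi\!\bigl(\lVert x_t^{(i)} \rVert\bigr)}{\lVert r_t^{(i)} \rVert}\, r_t^{(i)}
      && \text{layer-wise trust ratio}
\end{align*}
```

Here $\phi$ is a scaling function on the weight norm (the identity in common implementations); the layer-wise trust ratio is what distinguishes LAMB from Adam(W). The bias-correction terms correspond to the `bias_correction1_*` / `bias_correction2_*` arguments visible in the diff above.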
Step 1, normalizing the gradients, can be implemented in clip grad.
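A sketch of that mapping, assuming oneflow mirrors PyTorch's `clip_grad_norm_` utility (an assumption, not taken from this PR): clipping rescales the gradients by `max_norm / ||g||` only when `||g|| > max_norm`, so clipping to 1.0 reproduces exact normalization whenever the global gradient norm exceeds 1:

```python
import oneflow as flow

model = flow.nn.Linear(8, 2)
loss = model(flow.randn(4, 8)).sum()
loss.backward()

# Rescales all grads by min(1, max_norm / ||g||_2); with max_norm=1.0 this
# equals exact normalization g / ||g|| whenever the global norm exceeds 1.
flow.nn.utils.clip_grad_norm_(model.parameters(), max_norm=1.0)
```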