Closed
Description
Issue Type
Feature Request
Source
source
Tensorflow Version
2.8
Custom Code
No
OS Platform and Distribution
No response
Mobile device
No response
Python version
No response
Bazel version
No response
GCC/Compiler version
No response
CUDA/cuDNN version
No response
GPU model and memory
No response
Current Behaviour?
tf.keras.optimizers.experimental.AdamW only supports a constant weight_decay. In practice, though, we usually want the weight_decay value to follow the learning rate schedule and decay along with it.
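A minimal sketch of the limitation, assuming the TF 2.8 experimental API (the CosineDecay schedule and values here are illustrative):

```python
import tensorflow as tf

# The learning rate can follow a schedule...
lr_schedule = tf.keras.optimizers.schedules.CosineDecay(
    initial_learning_rate=1e-3, decay_steps=10_000)

# ...but weight_decay only accepts a plain float, so it stays
# constant while the learning rate decays.
opt = tf.keras.optimizers.experimental.AdamW(
    learning_rate=lr_schedule, weight_decay=1e-4)
```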
Standalone code to reproduce the issue
The legacy tfa.optimizers.AdamW (TensorFlow Addons) accepts a callable weight_decay, which makes it straightforward to decay it in lockstep with the learning rate. For example:
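A sketch following the pattern from the tfa documentation, where both values are callables that read the same step variable (the PiecewiseConstantDecay schedule and constants are illustrative):

```python
import tensorflow as tf
import tensorflow_addons as tfa

step = tf.Variable(0, trainable=False)
schedule = tf.keras.optimizers.schedules.PiecewiseConstantDecay(
    [10_000, 15_000], [1e-0, 1e-1, 1e-2])

# Both the learning rate and the weight decay are callables that
# evaluate the schedule at the current step, so they decay together.
lr = lambda: 1e-1 * schedule(step)
wd = lambda: 1e-4 * schedule(step)
opt = tfa.optimizers.AdamW(learning_rate=lr, weight_decay=wd)
```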
Relevant log output
No response