diff --git a/docs/_pages/config-json.md b/docs/_pages/config-json.md
index dc5c37a4c202..b8512b4621af 100755
--- a/docs/_pages/config-json.md
+++ b/docs/_pages/config-json.md
@@ -301,6 +301,7 @@ Enabling and configuring ZeRO memory optimizations
     "elastic_checkpoint" : [true|false],
     "stage3_gather_fp16_weights_on_model_save": [true|false],
     "ignore_unused_parameters": [true|false]
+    "round_robin_gradients": [true|false]
     }
 ```
 
@@ -358,6 +359,12 @@ Enabling and configuring ZeRO memory optimizations
 | ------------------------------------------------------------------------------------------------------------------------------------------ | ------- |
 | For use with ZeRO stage 1, enable backward hooks to reduce gradients during the backward pass or wait until the end of the backward pass. | `True`  |
 
+***round_robin_gradients***: [boolean]
+
+| Description                                                                                                                                | Default |
+| ------------------------------------------------------------------------------------------------------------------------------------------ | ------- |
+| Stage 2 optimization for CPU offloading that parallelizes gradient copying to CPU memory among ranks by fine-grained gradient partitioning. Performance benefit grows with gradient accumulation steps (more copying between optimizer steps) or GPU count (increased parallelism). | `False` |
+
 ***offload_param***: [dictionary]
 
 | Description                                                                                                                                | Default |
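
Review note: the new `round_robin_gradients` entry is described as a stage 2 optimization for CPU offloading, so a reader would most likely enable it together with stage 2 and an optimizer offload target. Below is a minimal sketch of such a config, built in Python and serialized to JSON. The helper name `make_zero2_offload_config`, the batch size, and the accumulation step count are illustrative assumptions and not part of this patch; the sketch also assumes the `offload_optimizer` form of CPU offloading. Serializing through `json.dumps` places the separating commas between keys, which a hand-written config file also needs.

```python
import json


def make_zero2_offload_config(accumulation_steps: int = 4) -> dict:
    """Hypothetical helper: a ZeRO stage 2 config with CPU offload
    and round-robin gradient partitioning switched on."""
    return {
        # Illustrative values only; tune for the actual workload.
        "train_micro_batch_size_per_gpu": 1,
        # More accumulation steps means more gradient copies to CPU
        # between optimizer steps, which is where the option is said to help.
        "gradient_accumulation_steps": accumulation_steps,
        "zero_optimization": {
            "stage": 2,
            # Assumption: CPU offload is configured via offload_optimizer.
            "offload_optimizer": {"device": "cpu"},
            # The option added by this patch (documented default: false).
            "round_robin_gradients": True,
        },
    }


if __name__ == "__main__":
    # Emit JSON that could be saved to a config file.
    print(json.dumps(make_zero2_offload_config(), indent=2))
```

The printed JSON could be saved to a file and passed to the training launcher in the usual way; whether the option pays off depends on the accumulation step count and GPU count, as the description in the patch states.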