[Feature Request] EarlyStopping logging on rank 0 only #13162

Closed
@austinmw

🚀 Feature

Add a toggle to turn off EarlyStopping logging on processes other than rank 0.

Motivation

EarlyStopping logging can be a bit spammy when viewing aggregated logs across all processes, because every rank prints the same message. For example, with my custom CloudWatch logger:

xnpww4j62d-algo-1-vr8o9 | 14:17:49 [INFO] Epoch 9: [ Training | 100%  iter# 49/49    19.28 batches/s ] train/loss_step=0.764418, train/loss_epoch=0.773, train/acc=0.68356
xnpww4j62d-algo-1-vr8o9 | 14:17:55 [INFO] Epoch 9: [ Validation | 100%  iter# 10/10     2.34 batches/s ] val/loss_step=1.253475, val/loss_epoch=1.278802, val/acc=0.6107
xnpww4j62d-algo-1-vr8o9 | 14:17:55 [INFO] [rank: 0] Metric val/acc improved by 0.195 >= min_delta = 0.0. New best score: 0.611
xnpww4j62d-algo-1-vr8o9 | 14:17:55 [INFO] [rank: 2] Metric val/acc improved by 0.195 >= min_delta = 0.0. New best score: 0.611
xnpww4j62d-algo-1-vr8o9 | 14:17:55 [INFO] [rank: 1] Metric val/acc improved by 0.195 >= min_delta = 0.0. New best score: 0.611
xnpww4j62d-algo-1-vr8o9 | 14:17:55 [INFO] [rank: 3] Metric val/acc improved by 0.195 >= min_delta = 0.0. New best score: 0.611
xnpww4j62d-algo-1-vr8o9 | 14:17:55 [INFO] [rank: 4] Metric val/acc improved by 0.195 >= min_delta = 0.0. New best score: 0.611
xnpww4j62d-algo-1-vr8o9 | 14:17:55 [INFO] [rank: 5] Metric val/acc improved by 0.195 >= min_delta = 0.0. New best score: 0.611
xnpww4j62d-algo-1-vr8o9 | 14:17:55 [INFO] [rank: 6] Metric val/acc improved by 0.195 >= min_delta = 0.0. New best score: 0.611
xnpww4j62d-algo-1-vr8o9 | 14:17:55 [INFO] [rank: 7] Metric val/acc improved by 0.195 >= min_delta = 0.0. New best score: 0.611
xnpww4j62d-algo-1-vr8o9 | 14:18:20 [INFO] Epoch 14: [ Training | 100%  iter# 49/49    18.94 batches/s ] train/loss_step=0.611876, train/loss_epoch=0.55, train/acc=0.80096
xnpww4j62d-algo-1-vr8o9 | 14:18:26 [INFO] Epoch 14: [ Validation | 100%  iter# 10/10     2.29 batches/s ] val/loss_step=0.748429, val/loss_epoch=0.828285, val/acc=0.726

Pitch

It would be nice if we could turn off printing of this message on processes other than rank 0. I understand that the per-rank output is actually useful to monitor in some cases, so the toggle could default to False, which keeps the current behavior of logging on every rank.

Alternatives

Custom EarlyStopping callback?
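
Another possible workaround, short of subclassing the callback, is to filter at the logging layer. The sketch below is a minimal, standard-library-only illustration. It assumes the messages keep the `[rank: N]` prefix shown in the log excerpt above, and the names `RankZeroOnlyFilter`, `_ListHandler`, `demo` and the logger name `early_stopping_demo` are hypothetical, not part of Lightning. In a real run the filter would have to be attached, in every process, to whichever handler or logger actually emits the EarlyStopping messages, and that logger's name is not confirmed here.

```python
import logging
import re

# Matches the "[rank: N]" prefix seen in the log excerpt above.
_RANK_PREFIX = re.compile(r"^\[rank: (\d+)\]")


class RankZeroOnlyFilter(logging.Filter):
    """Drop records whose message starts with '[rank: N]' where N != 0.

    Records without a rank prefix pass through unchanged.
    """

    def filter(self, record):
        match = _RANK_PREFIX.match(record.getMessage())
        return match is None or int(match.group(1)) == 0


class _ListHandler(logging.Handler):
    """Hypothetical handler that collects messages so the demo is inspectable."""

    def __init__(self):
        super().__init__()
        self.messages = []

    def emit(self, record):
        self.messages.append(record.getMessage())


def demo():
    # Stand-in logger; the real EarlyStopping logger name is an assumption
    # and is deliberately not used here.
    logger = logging.getLogger("early_stopping_demo")
    logger.setLevel(logging.INFO)
    logger.propagate = False

    handler = _ListHandler()
    handler.addFilter(RankZeroOnlyFilter())
    logger.addHandler(handler)

    # Simulate the eight per-rank messages from the excerpt, plus one
    # unprefixed message that should survive the filter.
    for rank in range(8):
        logger.info(
            f"[rank: {rank}] Metric val/acc improved by 0.195 >= min_delta = 0.0. "
            "New best score: 0.611"
        )
    logger.info("Epoch 14: validation finished")

    logger.removeHandler(handler)
    return handler.messages
```

Calling `demo()` returns only the rank 0 message and the unprefixed one. Filtering on message text is brittle if the prefix format ever changes, which is one reason a built-in toggle on the callback would be preferable.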

cc @Borda @carmocca @awaelchli @rohitgr7
