Catch ConfigError when attempting to set num_parameters in WANDB #33440
Conversation
Force-pushed from 37d6702 to c05e87c.
Force-pushed from c05e87c to d667888.
Hello, cc @parambharat since you wrote the code that catches the first exception. Thank you!
@@ -853,6 +853,10 @@ def setup(self, args, state, model, **kwargs):
            self._wandb.config["model/num_parameters"] = model.num_parameters()
        except AttributeError:
            logger.info("Could not log the number of model parameters in Weights & Biases.")
        except self._wandb.sdk.lib.config_util.ConfigError:
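The hunk above is truncated after the newly added `except` clause. A minimal sketch of what the complete handler could look like, assuming the `ConfigError` branch simply logs and moves on like the existing `AttributeError` path (the exact message is illustrative, not the merged wording):

```python
try:
    self._wandb.config["model/num_parameters"] = model.num_parameters()
except AttributeError:
    logger.info("Could not log the number of model parameters in Weights & Biases.")
except self._wandb.sdk.lib.config_util.ConfigError:
    # On resume, the config may already hold a different parameter count (e.g. extra
    # layers were added), so skip the update instead of failing the run.
    logger.info("Could not log the number of model parameters in Weights & Biases.")
```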
Is there a path for the `ConfigError` that doesn't depend on `self`?
Similar to #33464, but we could import the error at `__init__` instead imo:
transformers/src/transformers/integrations/integration_utils.py
Lines 764 to 769 in d667888
def __init__(self):
    has_wandb = is_wandb_available()
    if not has_wandb:
        raise RuntimeError("WandbCallback requires wandb to be installed. Run `pip install wandb`.")
    if has_wandb:
        import wandb
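A rough sketch of the suggested alternative: import the error type once in `__init__` so later code does not have to reach it through the long `self._wandb.sdk...` attribute chain. This is only an illustration; the attribute name is hypothetical and the change actually merged in #33464 may differ.

```python
def __init__(self):
    has_wandb = is_wandb_available()
    if not has_wandb:
        raise RuntimeError("WandbCallback requires wandb to be installed. Run `pip install wandb`.")
    if has_wandb:
        import wandb
        # Import the error type next to wandb itself (path taken from the reported traceback).
        from wandb.sdk.lib.config_util import ConfigError

        self._wandb = wandb
        self._wandb_config_error = ConfigError  # hypothetical attribute name

# Later, in setup(), both exceptions could then be ignored together:
#     try:
#         self._wandb.config["model/num_parameters"] = model.num_parameters()
#     except (AttributeError, self._wandb_config_error):
#         logger.info("Could not log the number of model parameters in Weights & Biases.")
```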
Hello, thanks for the comment. Yes, I tried to remain consistent with the other `wandb` references in this file, which are all made through `self._wandb`. Feel free to close this PR in favor of #33464 if you prefer the other implementation :-)
Thanks both! IMO it's a bit cleaner to have the error be imported here rather than as a reference to `self`, so I'll go ahead and merge the other PR.
I appreciate your contributions, please keep them coming!
Thank you, I will abandon this PR.
What does this PR do?
The current version of the code catches `AttributeError` when attempting to set the model's number of parameters in the Weights & Biases config.
`ConfigError` is another exception that needs to be caught when the current number of model parameters differs from the one stored in the config. This happens, for example, when resuming from a checkpoint after extra layers have been added.
Example error message: `wandb.sdk.lib.config_util.ConfigError: Attempted to change value of key "model/num_parameters" from 0 to 700416`
Since the current code already ignores `AttributeError`, it feels safe to add this extra catch.
I did not find a test suite for the WANDB integration.
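For reference, a minimal sketch of how the conflict can surface outside the `Trainer`, under the assumption (not verified against a specific wandb version) that re-assigning a config key to a different value raises `ConfigError` unless `allow_val_change` is passed; the key and values mirror the error message above, and the project name is hypothetical.

```python
import wandb
from wandb.sdk.lib.config_util import ConfigError  # path taken from the reported traceback

run = wandb.init(mode="offline", project="config-error-repro")  # hypothetical project name
run.config["model/num_parameters"] = 0       # value recorded before the checkpoint
try:
    # Resuming after extra layers were added: the parameter count no longer matches.
    run.config["model/num_parameters"] = 700416
except ConfigError as err:
    print(f"Ignoring Weights & Biases config conflict: {err}")
run.finish()
```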