Catch ConfigError when attempting to set num_parameters in WANDB #33440
Conversation
Force-pushed from 37d6702 to c05e87c.
Force-pushed from c05e87c to d667888.
Hello, cc @parambharat since you wrote the code that catches the first exception. Thank you!
@@ -853,6 +853,10 @@ def setup(self, args, state, model, **kwargs):
            self._wandb.config["model/num_parameters"] = model.num_parameters()
        except AttributeError:
            logger.info("Could not log the number of model parameters in Weights & Biases.")
        except self._wandb.sdk.lib.config_util.ConfigError:
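The hunk above is truncated after the newly added `except` clause. A minimal sketch of what the complete handler could look like, assuming the `ConfigError` branch simply logs and moves on like the existing `AttributeError` path (the exact message is illustrative, not the merged wording):

```python
try:
    self._wandb.config["model/num_parameters"] = model.num_parameters()
except AttributeError:
    logger.info("Could not log the number of model parameters in Weights & Biases.")
except self._wandb.sdk.lib.config_util.ConfigError:
    # On resume, the config may already hold a different parameter count (e.g. extra
    # layers were added), so skip the update instead of failing the run.
    logger.info("Could not log the number of model parameters in Weights & Biases.")
```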
Is there a path for the `ConfigError` that doesn't depend on `self`?
Similar to #33464, but we could import the error at `__init__` instead imo:
transformers/src/transformers/integrations/integration_utils.py
Lines 764 to 769 in d667888
def __init__(self):
    has_wandb = is_wandb_available()
    if not has_wandb:
        raise RuntimeError("WandbCallback requires wandb to be installed. Run `pip install wandb`.")
    if has_wandb:
        import wandb
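A rough sketch of the suggested alternative: import the error type once in `__init__` so later code does not have to reach it through the long `self._wandb.sdk...` attribute chain. This is only an illustration; the attribute name is hypothetical and the change actually merged in #33464 may differ.

```python
def __init__(self):
    has_wandb = is_wandb_available()
    if not has_wandb:
        raise RuntimeError("WandbCallback requires wandb to be installed. Run `pip install wandb`.")
    if has_wandb:
        import wandb
        # Import the error type next to wandb itself (path taken from the reported traceback).
        from wandb.sdk.lib.config_util import ConfigError

        self._wandb = wandb
        self._wandb_config_error = ConfigError  # hypothetical attribute name

# Later, in setup(), both exceptions could then be ignored together:
#     try:
#         self._wandb.config["model/num_parameters"] = model.num_parameters()
#     except (AttributeError, self._wandb_config_error):
#         logger.info("Could not log the number of model parameters in Weights & Biases.")
```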
Hello, thanks for the comment. Yes, I tried to remain consistent with the other `wandb` references in this file, which are all made through `self._wandb`. Feel free to close this PR in favor of #33464 if you prefer the other implementation :-)
Thanks both! IMO it's a bit cleaner to have the error be imported here rather than as a reference to `self`, so I'll go ahead and merge the other PR.
I appreciate your contributions, please keep them coming!
Thank you, I will abandon this PR.
What does this PR do?
The current version of the code catches `AttributeError` when attempting to set the model's number of parameters in the Weights & Biases config.
`ConfigError` is another exception that needs to be caught when the current number of model parameters differs from the one stored in the config. This happens, for example, when resuming from a checkpoint after extra layers have been added.
Example error message: `wandb.sdk.lib.config_util.ConfigError: Attempted to change value of key "model/num_parameters" from 0 to 700416`
Since the current code already ignores `AttributeError`, it feels safe to add this extra catch.
I did not find a test suite for the WANDB integration.
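For reference, a minimal sketch of how the conflict can surface outside the `Trainer`, under the assumption (not verified against a specific wandb version) that re-assigning a config key to a different value raises `ConfigError` unless `allow_val_change` is passed; the key and values mirror the error message above, and the project name is hypothetical.

```python
import wandb
from wandb.sdk.lib.config_util import ConfigError  # path taken from the reported traceback

run = wandb.init(mode="offline", project="config-error-repro")  # hypothetical project name
run.config["model/num_parameters"] = 0       # value recorded before the checkpoint
try:
    # Resuming after extra layers were added: the parameter count no longer matches.
    run.config["model/num_parameters"] = 700416
except ConfigError as err:
    print(f"Ignoring Weights & Biases config conflict: {err}")
run.finish()
```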