
pretrained_model setting in hparams.yaml has no effect #156

Closed
djangodesmet opened this issue Jul 27, 2023 · 1 comment
Labels
bug Something isn't working

@djangodesmet
🐛 Bug

When calling the load_from_checkpoint function to load a model from a checkpoint, the hparams.yaml file located in the parent folder is not taken into account. For example, the pretrained_model setting in hparams.yaml has no effect.

To Reproduce

contents of hparams.yaml:

activations: Tanh
batch_size: 4
class_identifier: regression_metric
dropout: 0.1
encoder_learning_rate: 1.0e-06
encoder_model: XLM-RoBERTa
final_activation: null
hidden_sizes:
  - 3072
  - 1024
keep_embeddings_frozen: true
layer: mix
layer_norm: false
layer_transformation: sparsemax
layerwise_decay: 0.95
learning_rate: 1.5e-05
loss: mse
nr_frozen_epochs: 0.3
optimizer: AdamW
pool: avg
pretrained_model: /home/jovyan/nllb-gpu/models/xlm-roberta-large
train_data:
  - data/1720-da.csv
validation_data:
  - data/wmt-ende-newstest2021.csv
  - data/wmt-enru-newstest2021.csv
  - data/wmt-zhen-newstest2021.csv
comet = load_from_checkpoint(os.path.join(model_path, "checkpoint", "model.ckpt"))
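
For context, a minimal sketch of the full reproduction; the model_path value below is hypothetical, and the comet.models import is an assumption based on the loader living in models/__init__.py:

import os

from comet.models import load_from_checkpoint

# Hypothetical training output directory containing hparams.yaml and checkpoint/model.ckpt
model_path = "/home/jovyan/nllb-gpu/experiments/comet-da"

comet = load_from_checkpoint(os.path.join(model_path, "checkpoint", "model.ckpt"))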

The RoBERTa model is located at /home/jovyan/nllb-gpu/models/xlm-roberta-large, as indicated by pretrained_model, but an error is thrown because the loader still looks for the model under the plain identifier xlm-roberta-large stored in the checkpoint. This gives the following error message:

EnvironmentError(
OSError: Can't load tokenizer for 'xlm-roberta-large'. If you were trying to load it from 'https://huggingface.co/models', make sure you don't have a local directory with the same name. Otherwise, make sure 'xlm-roberta-large' is the correct path to a directory containing all relevant files for a XLMRobertaTokenizerFast tokenizer.
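
In other words, the tokenizer lookup still uses the identifier stored in the checkpoint rather than the path from hparams.yaml. A rough illustration, using the XLMRobertaTokenizerFast class named in the error:

from transformers import XLMRobertaTokenizerFast

# What effectively happens: the checkpoint's stored hparams still say "xlm-roberta-large",
# so the lookup goes through the Hugging Face Hub and fails in this environment
# with the OSError above.
XLMRobertaTokenizerFast.from_pretrained("xlm-roberta-large")

# What the hparams.yaml value should lead to: loading from the local directory.
XLMRobertaTokenizerFast.from_pretrained("/home/jovyan/nllb-gpu/models/xlm-roberta-large")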

Expected behaviour

I would expect the pretrained_model parameter to be used to determine the location of the model.

This could be achieved by passing hparams_file as an argument to the model_class.load_from_checkpoint call in models/__init__.py:

def load_from_checkpoint(checkpoint_path: str) -> CometModel:
    """Loads models from a checkpoint path.

    Args:
        checkpoint_path (str): Path to a model checkpoint.

    Return:
        COMET model.
    """
    checkpoint_path = Path(checkpoint_path)

    if not checkpoint_path.is_file():
        raise Exception(f"Invalid checkpoint path: {checkpoint_path}")

    parent_folder = checkpoint_path.parents[1]  # .parent.parent
    hparams_file = parent_folder / "hparams.yaml"

    if hparams_file.is_file():
        with open(hparams_file) as yaml_file:
            hparams = yaml.load(yaml_file.read(), Loader=yaml.FullLoader)
        model_class = str2model[hparams["class_identifier"]]
        # Passing hparams_file makes Lightning read hparams.yaml instead of relying
        # solely on the hyperparameters stored inside the checkpoint.
        model = model_class.load_from_checkpoint(
            checkpoint_path, load_pretrained_weights=False, hparams_file=hparams_file
        )
        return model
    else:
        raise Exception(f"hparams.yaml file is missing from {parent_folder}!")

Environment

OS: Linux
Packaging: pip
Version: 2.0.1

@djangodesmet added the bug label on Jul 27, 2023
@ricardorei
Collaborator

Hi @djangodesmet, I added a bit more flexibility to load_from_checkpoint with my last push. To cover your case, I added a reload_hparams argument.
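
A sketch of what using the new flag might look like; the exact signature isn't shown in this thread, so the reload_hparams usage below is an assumption based on the comment above, and the checkpoint path is hypothetical:

from comet.models import load_from_checkpoint

# Assumed usage: reload_hparams=True re-reads hparams.yaml next to the checkpoint,
# so the edited pretrained_model path takes effect instead of the value stored
# in the checkpoint.
model = load_from_checkpoint(
    "/home/jovyan/nllb-gpu/experiments/comet-da/checkpoint/model.ckpt",
    reload_hparams=True,
)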
