
Error in RAG finetuning script #8345

Closed
shamanez opened this issue Nov 5, 2020 · 4 comments · Fixed by #8585
shamanez (Contributor) commented Nov 5, 2020

Environment info


  • transformers version: 3.4.0
  • Platform: Linux-4.18.0-147.5.1.el8_1.x86_64-x86_64-with-centos-8.1.1911-Core
  • Python version: 3.6.8
  • PyTorch version (GPU?): 1.7.0 (True)
  • Tensorflow version (GPU?): not installed (NA)
  • Using GPU in script?:
  • Using distributed or parallel set-up in script?:

Who can help

@patrickvonplaten, @lhoestq

Information

I am using the RAG fine-tuning script. During fine-tuning, it fails with:

torch.nn.modules.module.ModuleAttributeError: 'GenerativeQAModule' object has no attribute 'opt'

The bug appears exactly at [line 332 of finetune.py](https://github.com/huggingface/transformers/blob/master/examples/rag/finetune.py#L332).
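
For context, here is a minimal, self-contained sketch of how this kind of error arises. The class name mirrors the traceback, but the body is a simplified assumption rather than the actual module from finetune.py: self.opt is only created when the optimizer is configured, so any method that reads it earlier falls through to nn.Module.__getattr__ and fails.

    import torch


    class GenerativeQAModule(torch.nn.Module):
        """Simplified stand-in for the Lightning module in finetune.py (illustration only)."""

        def __init__(self):
            super().__init__()
            self.linear = torch.nn.Linear(4, 4)  # dummy parameters so an optimizer can be built

        def configure_optimizers(self):
            # self.opt is only assigned here ...
            self.opt = torch.optim.Adam(self.parameters(), lr=3e-5)
            return self.opt

        def train_dataloader(self):
            # ... but this method reads self.opt, so if it runs before the optimizer
            # has been set up, nn.Module.__getattr__ raises
            # ModuleAttributeError: 'GenerativeQAModule' object has no attribute 'opt'
            return self.opt


    module = GenerativeQAModule()
    module.train_dataloader()  # reproduces the attribute error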

To reproduce

I installed the transformers library from source, not from pip.

shamanez (Contributor, Author) commented Nov 6, 2020

It is related to the optimizer initialization in the finetune.py script. It seems that even in lightning_base.py there is no initialization for the optimizer.

shamanez (Contributor, Author) commented:

@lhoestq, any idea on this?

I managed to work around it by calling the optimizer initialization inside the train_dataloader() function in finetune.py.
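
For reference, a rough sketch of what that workaround looks like. The get_dataloader helper and the exact body of train_dataloader() are assumptions based on this discussion, not a verbatim copy of finetune.py; the key point is that configure_optimizers() is called first so that self.opt exists.

    def train_dataloader(self):
        dataloader = self.get_dataloader("train", batch_size=self.hparams.train_batch_size, shuffle=True)

        # Workaround: build the optimizer before the scheduler code below uses self.opt.
        # configure_optimizers() in lightning_base.py assigns self.opt as a side effect.
        self.configure_optimizers()

        t_total = (
            (len(dataloader.dataset) // (self.hparams.train_batch_size * max(1, self.hparams.gpus)))
            // self.hparams.accumulate_grad_batches
            * float(self.hparams.max_epochs)
        )
        scheduler = get_linear_schedule_with_warmup(
            self.opt, num_warmup_steps=self.hparams.warmup_steps, num_training_steps=t_total
        )
        self.lr_scheduler = scheduler
        return dataloader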

lhoestq (Member) commented Nov 12, 2020

Well, the optimizer/scheduler is already defined in examples/lightning_base.py, in BaseTransformer.configure_optimizers. I'm not sure why the train_dataloader() function in finetune.py tries to define the scheduler as well. This must have been a bad copy-paste...

I think we should remove these lines:

    t_total = (
        (len(dataloader.dataset) // (self.hparams.train_batch_size * max(1, self.hparams.gpus)))
        // self.hparams.accumulate_grad_batches
        * float(self.hparams.max_epochs)
    )
    scheduler = get_linear_schedule_with_warmup(
        self.opt, num_warmup_steps=self.hparams.warmup_steps, num_training_steps=t_total
    )
    if max(scheduler.get_last_lr()) <= 0:
        warnings.warn("All learning rates are 0")
    self.lr_scheduler = scheduler
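
With those lines gone, train_dataloader() would reduce to something like the sketch below, leaving the optimizer and the linear-warmup scheduler entirely to BaseTransformer.configure_optimizers in lightning_base.py (the get_dataloader helper name is an assumption):

    def train_dataloader(self):
        # Only build and return the dataloader; the optimizer and scheduler are
        # created in BaseTransformer.configure_optimizers, which PyTorch Lightning
        # calls on its own.
        return self.get_dataloader("train", batch_size=self.hparams.train_batch_size, shuffle=True)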

I just tried removing them, and now I'm getting this other issue: #7816. I'll fix that one as well and open a PR.

lhoestq self-assigned this Nov 12, 2020
shamanez (Contributor, Author) commented:

That's what I was thinking, since there is a specific definition for this in lightning_base.py.
