
Error in RAG finetuning script #8345

Closed
shamanez opened this issue Nov 5, 2020 · 4 comments · Fixed by #8585
shamanez (Contributor) commented Nov 5, 2020

Environment info


  • transformers version: 3.4.0
  • Platform: Linux-4.18.0-147.5.1.el8_1.x86_64-x86_64-with-centos-8.1.1911-Core
  • Python version: 3.6.8
  • PyTorch version (GPU?): 1.7.0 (True)
  • Tensorflow version (GPU?): not installed (NA)
  • Using GPU in script?:
  • Using distributed or parallel set-up in script?:

Who can help

@patrickvonplaten, @lhoestq

Information

I am using the RAG fine-tuning script. During fine-tuning, it fails with:

torch.nn.modules.module.ModuleAttributeError: 'GenerativeQAModule' object has no attribute 'opt'

The bug appears exactly at [line 332 of finetune.py](https://github.com/huggingface/transformers/blob/master/examples/rag/finetune.py#L332).
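
For context, here is a minimal, self-contained sketch of how this kind of error arises. The class name mirrors the traceback, but the body is a simplified assumption rather than the actual module from finetune.py: self.opt is only created when the optimizer is configured, so any method that reads it earlier falls through to nn.Module.__getattr__ and fails.

    import torch


    class GenerativeQAModule(torch.nn.Module):
        """Simplified stand-in for the Lightning module in finetune.py (illustration only)."""

        def __init__(self):
            super().__init__()
            self.linear = torch.nn.Linear(4, 4)  # dummy parameters so an optimizer can be built

        def configure_optimizers(self):
            # self.opt is only assigned here ...
            self.opt = torch.optim.Adam(self.parameters(), lr=3e-5)
            return self.opt

        def train_dataloader(self):
            # ... but this method reads self.opt, so if it runs before the optimizer
            # has been set up, nn.Module.__getattr__ raises
            # ModuleAttributeError: 'GenerativeQAModule' object has no attribute 'opt'
            return self.opt


    module = GenerativeQAModule()
    module.train_dataloader()  # reproduces the attribute error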

To reproduce

I installed the transformers library from source, not from pip.

shamanez (Contributor, Author) commented Nov 6, 2020

It is related to the optimizer initialization in the finetune.py script. It seems that even in lightning_base.py there is no initialization for the optimizer.

shamanez (Contributor, Author) commented:

@lhoestq, any idea on this?

I managed to work around it by calling the optimizer initialization inside the train_dataloader() function in finetune.py.
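
For reference, a rough sketch of what that workaround looks like. The get_dataloader helper and the exact body of train_dataloader() are assumptions based on this discussion, not a verbatim copy of finetune.py; the key point is that configure_optimizers() is called first so that self.opt exists.

    def train_dataloader(self):
        dataloader = self.get_dataloader("train", batch_size=self.hparams.train_batch_size, shuffle=True)

        # Workaround: build the optimizer before the scheduler code below uses self.opt.
        # configure_optimizers() in lightning_base.py assigns self.opt as a side effect.
        self.configure_optimizers()

        t_total = (
            (len(dataloader.dataset) // (self.hparams.train_batch_size * max(1, self.hparams.gpus)))
            // self.hparams.accumulate_grad_batches
            * float(self.hparams.max_epochs)
        )
        scheduler = get_linear_schedule_with_warmup(
            self.opt, num_warmup_steps=self.hparams.warmup_steps, num_training_steps=t_total
        )
        self.lr_scheduler = scheduler
        return dataloader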

lhoestq (Member) commented Nov 12, 2020

Well, the optimizer/scheduler is already defined in examples/lightning_base.py, in BaseTransformer.configure_optimizers. I'm not sure why the train_dataloader() function in finetune.py tries to define the scheduler as well. This must have been a bad copy-paste...

I think we should remove these lines:

    t_total = (
        (len(dataloader.dataset) // (self.hparams.train_batch_size * max(1, self.hparams.gpus)))
        // self.hparams.accumulate_grad_batches
        * float(self.hparams.max_epochs)
    )
    scheduler = get_linear_schedule_with_warmup(
        self.opt, num_warmup_steps=self.hparams.warmup_steps, num_training_steps=t_total
    )
    if max(scheduler.get_last_lr()) <= 0:
        warnings.warn("All learning rates are 0")
    self.lr_scheduler = scheduler
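
With those lines gone, train_dataloader() would reduce to something like the sketch below, leaving the optimizer and the linear-warmup scheduler entirely to BaseTransformer.configure_optimizers in lightning_base.py (the get_dataloader helper name is an assumption):

    def train_dataloader(self):
        # Only build and return the dataloader; the optimizer and scheduler are
        # created in BaseTransformer.configure_optimizers, which PyTorch Lightning
        # calls on its own.
        return self.get_dataloader("train", batch_size=self.hparams.train_batch_size, shuffle=True)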

I just tried removing them, and now I'm getting this other issue: #7816. I'll fix that one as well and open a PR.

lhoestq self-assigned this Nov 12, 2020
shamanez (Contributor, Author) commented:

That's what I was thinking, since there is a specific definition for this in lightning_base.py.
