I have two NVIDIA A10 24GB GPUs. When I start the finetune.py script I receive this error with batch size 24. Can you help me understand what the real problem is? Thanks.
I encountered this error as well. However, in my case, after I added `os.environ["CUDA_VISIBLE_DEVICES"] = "1"` to restrict training to a single GPU, the problem was resolved.
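For reference, a minimal sketch of this workaround (the device index and the exact spot in finetune.py are assumptions; the key point is that the variable must be set before torch initializes CUDA):

```python
import os

# Restrict this process to a single GPU. This must run before the first
# CUDA call (i.e. before torch initializes CUDA), otherwise it has no effect.
# "1" selects the second A10; use "0" for the first.
os.environ["CUDA_VISIBLE_DEVICES"] = "1"

import torch
print(torch.cuda.device_count())  # expected to print 1
```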
Thank you for your response. Indeed, this solved the problem, and for some reason it is only with this single-GPU launch that 24 GB of memory is enough for training.
Training on multiple GPUs is possible with torchrun; you'll double the effective batch size and halve the training time. Take care to verify the math by hand so that the number of gradient accumulation steps is right (see the sketch below).
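A hedged sketch of that arithmetic, assuming a launch like `torchrun --nproc_per_node=2 finetune.py` and the common batch_size / micro_batch_size convention of finetune-style scripts (the variable names and values here are illustrative assumptions, not the script's exact code):

```python
# Effective (global) batch size you want per optimizer step.
batch_size = 24
# Per-device batch size that fits in 24 GB on one A10 (assumed value).
micro_batch_size = 4
# Number of processes torchrun launches (one per GPU).
world_size = 2

# Each forward/backward pass processes world_size * micro_batch_size samples,
# so gradient accumulation must make up the remainder of the global batch.
gradient_accumulation_steps = batch_size // (micro_batch_size * world_size)

# Sanity-check that the numbers divide evenly, otherwise the effective
# batch size silently drifts from what you intended.
assert gradient_accumulation_steps * micro_batch_size * world_size == batch_size

print(gradient_accumulation_steps)  # 3
```

If the division is not exact, either adjust micro_batch_size or the global batch_size until it is; that is the "verify the math by hand" step.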