
RuntimeError: Loading a quantized checkpoint into non-quantized Linear8bitLt is not supported. Please call module.cuda() before module.load_state_dict() #16

Closed
NanoCode012 opened this issue May 7, 2023 · 2 comments

Comments

@NanoCode012 (Collaborator)

I get the error below at the end of training. I suspect it's caused by loading in 8-bit combined with https://github.com/winglian/axolotl/blob/47ad3890bc35985b9046f403312887035e19f96f/src/axolotl/utils/trainer.py#L99

Stack trace

File "/workspace/scripts/finetune.py", line 246, in <module> 
    fire.Fire(train) 
  File "/usr/local/lib/python3.9/dist-packages/fire/core.py", line 141, in Fire 
    component_trace = _Fire(component, args, parsed_flag_args, context, name) 
  File "/usr/local/lib/python3.9/dist-packages/fire/core.py", line 475, in _Fire 
    component, remaining_args = _CallAndUpdateTrace( 
  File "/usr/local/lib/python3.9/dist-packages/fire/core.py", line 691, in _CallAndUpdateTrace 
    component = fn(*varargs, **kwargs) 
  File "/workspace/scripts/finetune.py", line 235, in train 
    trainer.train(resume_from_checkpoint=resume_from_checkpoint) 
  File "/usr/local/lib/python3.9/dist-packages/transformers/trainer.py", line 1664, in train 
    return inner_training_loop( 
  File "/usr/local/lib/python3.9/dist-packages/transformers/trainer.py", line 2054, in _inner_training_loop 
    self._load_best_model() 
  File "/usr/local/lib/python3.9/dist-packages/transformers/trainer.py", line 2230, in _load_best_model 
    load_result = model.load_state_dict(state_dict, False) 
  File "/usr/local/lib/python3.9/dist-packages/torch/nn/modules/module.py", line 2027, in load_state_dict 
    load(self, state_dict) 
  File "/usr/local/lib/python3.9/dist-packages/torch/nn/modules/module.py", line 2015, in load 
    load(child, child_state_dict, child_prefix) 
  File "/usr/local/lib/python3.9/dist-packages/torch/nn/modules/module.py", line 2015, in load 
    load(child, child_state_dict, child_prefix) 
  File "/usr/local/lib/python3.9/dist-packages/torch/nn/modules/module.py", line 2015, in load 
    load(child, child_state_dict, child_prefix) 
  [Previous line repeated 4 more times] 
  File "/usr/local/lib/python3.9/dist-packages/torch/nn/modules/module.py", line 2009, in load 
    module._load_from_state_dict( 
  File "/usr/local/lib/python3.9/dist-packages/bitsandbytes/nn/modules.py", line 298, in _load_from_state_dict 
    raise RuntimeError("Loading a quantized checkpoint into non-quantized Linear8bitLt is " 
RuntimeError: Loading a quantized checkpoint into non-quantized Linear8bitLt is not supported. Please call module.cuda() before module.load_state_dict()

Info

Commit: winglian/axolotl@cb9a887 (before the dev merge)
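
For anyone reproducing this outside axolotl, here's a minimal standalone sketch of the bitsandbytes restriction being hit. This is an illustration, not code from this repo; it assumes a CUDA device and a bitsandbytes version with Linear8bitLt serialization support (roughly 0.37+, matching the modules.py check in the trace):

import bitsandbytes as bnb

# Moving the layer to the GPU is what quantizes it; the resulting
# state dict then carries quantization state alongside the weights.
layer = bnb.nn.Linear8bitLt(16, 16, has_fp16_weights=False)
layer.cuda()
quantized_state = layer.state_dict()

# Loading that quantized checkpoint into a fresh layer that is still
# on the CPU (i.e. not yet quantized) raises the RuntimeError above.
fresh = bnb.nn.Linear8bitLt(16, 16, has_fp16_weights=False)
try:
    fresh.load_state_dict(quantized_state)
except RuntimeError as err:
    print(err)  # "Loading a quantized checkpoint into non-quantized ..."

# Calling .cuda() first, as the error message suggests, avoids it.
fresh2 = bnb.nn.Linear8bitLt(16, 16, has_fp16_weights=False)
fresh2.cuda()
fresh2.load_state_dict(quantized_state)  # loads cleanly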

@winglian (Collaborator) commented May 7, 2023

Might be easiest to change that training argument to False if load_in_8bit is True.
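
A sketch of that guard (hypothetical: the argument name load_best_model_at_end is inferred from the stack trace, since Trainer._load_best_model only runs when it is True; axolotl's actual config plumbing may differ):

from transformers import TrainingArguments

load_in_8bit = True  # stand-in for axolotl's cfg.load_in_8bit flag

training_args = TrainingArguments(
    output_dir="./out",
    # Skip reloading the "best" checkpoint at the end of training
    # when in 8-bit mode: Trainer._load_best_model's load_state_dict
    # call trips over the quantized Linear8bitLt modules (trace above).
    load_best_model_at_end=not load_in_8bit,
)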

@NanoCode012 (Collaborator, Author)

Related issue upstream: huggingface/peft#394
