fix: TypeError when loading base model remotely in convert_lora_to_gguf #17385
Conversation
CISC left a comment:
Thanks, I was planning to address this, but hadn't gotten around to it yet.
I was thinking of changing this function instead, and then the rest only need minor changes:
```python
def load_hparams_from_hf(hf_model_id: str) -> tuple[dict[str, Any], Path | None]:
    from huggingface_hub import try_to_load_from_cache

    # normally, adapter does not come with base model config, we need to load it from AutoConfig
    config = AutoConfig.from_pretrained(hf_model_id)
    cache_dir = try_to_load_from_cache(hf_model_id, "config.json")
    cache_dir = Path(cache_dir).parent if isinstance(cache_dir, str) else None
    return config.to_dict(), cache_dir
```
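For context, a self-contained sketch of the suggested helper plus an illustrative call site; the usage lines at the bottom are assumptions for demonstration (they require `transformers` and `huggingface_hub` installed and network or cache access), not the converter's actual call site:

```python
from pathlib import Path
from typing import Any

from transformers import AutoConfig


def load_hparams_from_hf(hf_model_id: str) -> tuple[dict[str, Any], Path | None]:
    from huggingface_hub import try_to_load_from_cache

    # Adapters normally ship without the base model config, so load it via AutoConfig.
    config = AutoConfig.from_pretrained(hf_model_id)
    # try_to_load_from_cache returns a str path only when config.json is already cached locally.
    cache_dir = try_to_load_from_cache(hf_model_id, "config.json")
    cache_dir = Path(cache_dir).parent if isinstance(cache_dir, str) else None
    return config.to_dict(), cache_dir


# Illustrative usage (the model id is an example): callers get both the hparams
# dict and a local directory to fall back on when the config was already cached.
hparams, cache_dir = load_hparams_from_hf("Qwen/Qwen2.5-1.5B-Instruct")
print(hparams.get("architectures"), cache_dir)
```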
Hi @CISC, thank you for your guidance :D My implementation introduced additional variables, which was not elegant. I copied your code snippet directly and adjusted the affected call sites:

```python
def load_hparams_from_hf(hf_model_id: str) -> tuple[dict[str, Any], Path | None]:
    from huggingface_hub import try_to_load_from_cache

    # normally, adapter does not come with base model config, we need to load it from AutoConfig
    config = AutoConfig.from_pretrained(hf_model_id)
    cache_dir = try_to_load_from_cache(hf_model_id, "config.json")
    cache_dir = Path(cache_dir).parent if isinstance(cache_dir, str) else None
    return config.to_dict(), cache_dir
```

Are there any other parts of the code that need to be adjusted?

Related Tests:
```
python convert_lora_to_gguf.py --base Qwen2.5-1.5B-Instruct lora_path
INFO:lora-to-gguf:Loading base model: Qwen2.5-1.5B-Instruct
INFO:hf-to-gguf:gguf: indexing model part 'model.safetensors'
INFO:gguf.gguf_writer:gguf: This GGUF file is for Little Endian only
INFO:lora-to-gguf:Exporting model...
INFO:hf-to-gguf:Set meta model
INFO:hf-to-gguf:Set model parameters
INFO:hf-to-gguf:Set model quantization version
INFO:hf-to-gguf:Set model tokenizer
INFO:gguf.gguf_writer:Writing the following files:
INFO:gguf.gguf_writer:lora_path/Lora_Path-F16.gguf: n_tensors = 0, total_size = negligible - metadata only
Writing: 0.00byte [00:00, ?byte/s]
INFO:lora-to-gguf:Model successfully exported to lora_path/Lora_Path-F16.gguf
```

```
python convert_lora_to_gguf.py --base-model-id Qwen/Qwen2.5-1.5B-Instruct lora_path
INFO:lora-to-gguf:Loading base model from Hugging Face: Qwen/Qwen2.5-1.5B-Instruct
INFO:gguf.gguf_writer:gguf: This GGUF file is for Little Endian only
INFO:lora-to-gguf:Exporting model...
INFO:hf-to-gguf:Set meta model
INFO:hf-to-gguf:Set model parameters
INFO:hf-to-gguf:Set model quantization version
INFO:hf-to-gguf:Set model tokenizer
INFO:gguf.gguf_writer:Writing the following files:
INFO:gguf.gguf_writer:lora_path/Lora_Path-F16.gguf: n_tensors = 0, total_size = negligible - metadata only
Writing: 0.00byte [00:00, ?byte/s]
INFO:lora-to-gguf:Model successfully exported to lora_path/Lora_Path-F16.gguf
```

```
python convert_lora_to_gguf.py lora_path
INFO:lora-to-gguf:Loading base model from Hugging Face: Qwen/Qwen2.5-1.5B-Instruct
INFO:gguf.gguf_writer:gguf: This GGUF file is for Little Endian only
INFO:lora-to-gguf:Exporting model...
INFO:hf-to-gguf:Set meta model
INFO:hf-to-gguf:Set model parameters
INFO:hf-to-gguf:Set model quantization version
INFO:hf-to-gguf:Set model tokenizer
INFO:gguf.gguf_writer:Writing the following files:
INFO:gguf.gguf_writer:lora_path/Lora_Path-F16.gguf: n_tensors = 0, total_size = negligible - metadata only
Writing: 0.00byte [00:00, ?byte/s]
INFO:lora-to-gguf:Model successfully exported to lora_path/Lora_Path-F16.gguf
```
I've resubmitted the code. Related Tests:

```
python convert_lora_to_gguf.py --base Qwen2.5-1.5B-Instruct lora_path
INFO:lora-to-gguf:Loading base model: Qwen2.5-1.5B-Instruct
INFO:hf-to-gguf:gguf: indexing model part 'model.safetensors'
INFO:gguf.gguf_writer:gguf: This GGUF file is for Little Endian only
INFO:lora-to-gguf:Exporting model...
INFO:hf-to-gguf:Set meta model
INFO:hf-to-gguf:Set model parameters
INFO:hf-to-gguf:Set model quantization version
INFO:hf-to-gguf:Set model tokenizer
INFO:gguf.gguf_writer:Writing the following files:
INFO:gguf.gguf_writer:lora_path/Lora_Path-F16.gguf: n_tensors = 0, total_size = negligible - metadata only
Writing: 0.00byte [00:00, ?byte/s]
INFO:lora-to-gguf:Model successfully exported to lora_path/Lora_Path-F16.gguf
```

```
python convert_lora_to_gguf.py --base-model-id Qwen/Qwen2.5-1.5B-Instruct lora_path
INFO:lora-to-gguf:Loading base model from Hugging Face: Qwen/Qwen2.5-1.5B-Instruct
INFO:hf-to-gguf:Using remote model with HuggingFace id: Qwen/Qwen2.5-1.5B-Instruct
INFO:gguf.gguf_writer:gguf: This GGUF file is for Little Endian only
INFO:lora-to-gguf:Exporting model...
INFO:hf-to-gguf:Set meta model
INFO:hf-to-gguf:Set model parameters
INFO:hf-to-gguf:Set model quantization version
INFO:hf-to-gguf:Set model tokenizer
INFO:gguf.gguf_writer:Writing the following files:
INFO:gguf.gguf_writer:lora_path/Qwen-Qwen2.5-1.5B-Instruct-F16.gguf: n_tensors = 0, total_size = negligible - metadata only
Writing: 0.00byte [00:00, ?byte/s]
INFO:lora-to-gguf:Model successfully exported to lora_path/Qwen-Qwen2.5-1.5B-Instruct-F16.gguf
```

```
python convert_lora_to_gguf.py lora_path
INFO:lora-to-gguf:Loading base model from Hugging Face: Qwen/Qwen2.5-1.5B-Instruct
INFO:gguf.gguf_writer:gguf: This GGUF file is for Little Endian only
INFO:lora-to-gguf:Exporting model...
INFO:hf-to-gguf:Set meta model
INFO:hf-to-gguf:Set model parameters
INFO:hf-to-gguf:Set model quantization version
INFO:hf-to-gguf:Set model tokenizer
INFO:gguf.gguf_writer:Writing the following files:
INFO:gguf.gguf_writer:lora_path/Lora_Path-F16.gguf: n_tensors = 0, total_size = negligible - metadata only
Writing: 0.00byte [00:00, ?byte/s]
INFO:lora-to-gguf:Model successfully exported to lora_path/Lora_Path-F16.gguf
```
fix: TypeError when loading base model remotely in convert_lora_to_gguf (ggml-org#17385)

* fix: TypeError when loading base model remotely in convert_lora_to_gguf
* refactor: simplify base model loading using cache_dir from HuggingFace
* Update convert_lora_to_gguf.py

  Co-authored-by: Sigbjørn Skjæret <sigbjorn.skjaeret@scala.com>

* feat: add remote_hf_model_id to trigger lazy mode in LoRA converter

---------

Co-authored-by: Sigbjørn Skjæret <sigbjorn.skjaeret@scala.com>
When loading the base model from Hugging Face, `dir_base_model` is `None`, causing a `TypeError` in `index_tensors()`. Passes `remote_hf_model_id` to `LoraModel` to load tensors from Hugging Face.

Related issue:
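For illustration, a minimal sketch of the failure mode and the shape of the fix. Only the `dir_model`/`None` path handling, `remote_hf_model_id`, and `index_tensors()` come from the PR description; the class below is a simplified stand-in, not the actual converter code:

```python
from pathlib import Path


class ModelSketch:
    """Simplified stand-in for the converter's model class (illustrative, not upstream code)."""

    def __init__(self, dir_model: Path | None, remote_hf_model_id: str | None = None):
        self.dir_model = dir_model
        self.remote_hf_model_id = remote_hf_model_id

    def index_tensors(self) -> list[str]:
        if self.remote_hf_model_id is not None:
            # Lazy/remote mode: tensors are resolved from the Hub, so no
            # local directory is needed.
            return [f"remote:{self.remote_hf_model_id}"]
        # Before the fix, a remote base model reached this branch with
        # dir_model=None, and Path(None) raises TypeError.
        return [p.name for p in Path(self.dir_model).glob("*.safetensors")]


# Before the fix this configuration raised TypeError; with remote_hf_model_id
# forwarded to the model, the remote branch is taken instead.
model = ModelSketch(dir_model=None, remote_hf_model_id="Qwen/Qwen2.5-1.5B-Instruct")
print(model.index_tensors())
```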