
Conversation

o7si (Contributor) commented Nov 19, 2025

When loading the base model from Hugging Face, dir_base_model is None, causing a TypeError in index_tensors():

TypeError: unsupported operand type(s) for /: 'NoneType' and 'str'

The fix passes remote_hf_model_id to LoraModel so that tensors are loaded from Hugging Face.
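
For illustration, a minimal sketch of why the division fails and of the direction of the fix; apart from dir_base_model and remote_hf_model_id, the names below are assumptions for the example, not code quoted from the script:

from pathlib import Path

dir_base_model: Path | None = None  # None when the base model only exists on Hugging Face
remote_hf_model_id: str | None = "Qwen/Qwen2.5-1.5B-Instruct"  # example id, matching the tests below
part_name = "model.safetensors"

if dir_base_model is not None:
    # local base model directory: the original code path
    tensor_path = dir_base_model / part_name
elif remote_hf_model_id is not None:
    # dir_base_model / part_name would raise
    # "TypeError: unsupported operand type(s) for /: 'NoneType' and 'str'",
    # so the Hugging Face model id is forwarded instead and tensors are fetched remotely
    print(f"loading tensors remotely for {remote_hf_model_id}")
else:
    raise ValueError("need either a local base model directory or a Hugging Face model id")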

Related issue:

o7si requested a review from CISC as a code owner November 19, 2025 15:44
github-actions bot added the python (python script changes) label Nov 19, 2025
CISC (Collaborator) left a comment


Thanks, I was planning to address this, but hadn't gotten around to it yet.

I was thinking of changing this function instead, so the rest only needs minor changes:

def load_hparams_from_hf(hf_model_id: str) -> tuple[dict[str, Any], Path | None]:
    from huggingface_hub import try_to_load_from_cache

    # normally, adapter does not come with base model config, we need to load it from AutoConfig
    config = AutoConfig.from_pretrained(hf_model_id)
    cache_dir = try_to_load_from_cache(hf_model_id, "config.json")
    cache_dir = Path(cache_dir).parent if isinstance(cache_dir, str) else None

    return config.to_dict(), cache_dir
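
For context, the returned pair could then drive the choice between local and remote loading, roughly like this (a sketch built on the assumption that a missing cache directory means falling back to remote loading; it relies on the load_hparams_from_hf defined above, and the other variable names are not quoted from the script):

hparams, dir_base_model = load_hparams_from_hf("Qwen/Qwen2.5-1.5B-Instruct")

if dir_base_model is None:
    # config.json was not found in the local Hugging Face cache,
    # so keep the model id around and load the tensors remotely
    remote_hf_model_id = "Qwen/Qwen2.5-1.5B-Instruct"
else:
    # the cached snapshot directory can be treated like a local model directory
    remote_hf_model_id = None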

o7si (Contributor, Author) commented Nov 20, 2025

Hi @CISC, thank you for your guidance :D My implementation introduced additional variables, which was not elegant.

I copied this code snippet directly and adjusted several call sites:

def load_hparams_from_hf(hf_model_id: str) -> tuple[dict[str, Any], Path | None]:
    from huggingface_hub import try_to_load_from_cache

    # normally, adapter does not come with base model config, we need to load it from AutoConfig
    config = AutoConfig.from_pretrained(hf_model_id)
    cache_dir = try_to_load_from_cache(hf_model_id, "config.json")
    cache_dir = Path(cache_dir).parent if isinstance(cache_dir, str) else None

    return config.to_dict(), cache_dir

Are there any other parts of the code that need to be adjusted?

Related Tests:

  • Test Case 1: Local Base Model Directory
python convert_lora_to_gguf.py --base Qwen2.5-1.5B-Instruct lora_path 
INFO:lora-to-gguf:Loading base model: Qwen2.5-1.5B-Instruct
INFO:hf-to-gguf:gguf: indexing model part 'model.safetensors'
INFO:gguf.gguf_writer:gguf: This GGUF file is for Little Endian only
INFO:lora-to-gguf:Exporting model...
INFO:hf-to-gguf:Set meta model
INFO:hf-to-gguf:Set model parameters
INFO:hf-to-gguf:Set model quantization version
INFO:hf-to-gguf:Set model tokenizer
INFO:gguf.gguf_writer:Writing the following files:
INFO:gguf.gguf_writer:lora_path/Lora_Path-F16.gguf: n_tensors = 0, total_size = negligible - metadata only
Writing: 0.00byte [00:00, ?byte/s]
INFO:lora-to-gguf:Model successfully exported to lora_path/Lora_Path-F16.gguf
  • Test Case 2: Explicit Remote Model ID
python convert_lora_to_gguf.py --base-model-id Qwen/Qwen2.5-1.5B-Instruct lora_path 
INFO:lora-to-gguf:Loading base model from Hugging Face: Qwen/Qwen2.5-1.5B-Instruct
INFO:gguf.gguf_writer:gguf: This GGUF file is for Little Endian only
INFO:lora-to-gguf:Exporting model...
INFO:hf-to-gguf:Set meta model
INFO:hf-to-gguf:Set model parameters
INFO:hf-to-gguf:Set model quantization version
INFO:hf-to-gguf:Set model tokenizer
INFO:gguf.gguf_writer:Writing the following files:
INFO:gguf.gguf_writer:lora_path/Lora_Path-F16.gguf: n_tensors = 0, total_size = negligible - metadata only
Writing: 0.00byte [00:00, ?byte/s]
INFO:lora-to-gguf:Model successfully exported to lora_path/Lora_Path-F16.gguf
  • Test Case 3: Auto-infer from Adapter Config
python convert_lora_to_gguf.py lora_path     
INFO:lora-to-gguf:Loading base model from Hugging Face: Qwen/Qwen2.5-1.5B-Instruct
INFO:gguf.gguf_writer:gguf: This GGUF file is for Little Endian only
INFO:lora-to-gguf:Exporting model...
INFO:hf-to-gguf:Set meta model
INFO:hf-to-gguf:Set model parameters
INFO:hf-to-gguf:Set model quantization version
INFO:hf-to-gguf:Set model tokenizer
INFO:gguf.gguf_writer:Writing the following files:
INFO:gguf.gguf_writer:lora_path/Lora_Path-F16.gguf: n_tensors = 0, total_size = negligible - metadata only
Writing: 0.00byte [00:00, ?byte/s]
INFO:lora-to-gguf:Model successfully exported to lora_path/Lora_Path-F16.gguf
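
A side note on Test Case 3 above: "auto-infer from adapter config" means taking the base model id from the adapter's adapter_config.json, whose base_model_name_or_path key is the usual PEFT convention. A hypothetical helper illustrating the idea (not code from the script):

import json
from pathlib import Path

def infer_base_model_id(lora_path: str) -> str | None:
    # PEFT adapters ship an adapter_config.json that records the
    # Hugging Face id of the base model they were trained against
    config = json.loads((Path(lora_path) / "adapter_config.json").read_text())
    return config.get("base_model_name_or_path")

# e.g. infer_base_model_id("lora_path") -> "Qwen/Qwen2.5-1.5B-Instruct"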

o7si and others added 2 commits November 20, 2025 17:33
o7si (Contributor, Author) commented Nov 20, 2025

I've resubmitted the code.

Related Tests:

  • Test Case 1: Local Base Model Directory
python convert_lora_to_gguf.py --base Qwen2.5-1.5B-Instruct lora_path 
INFO:lora-to-gguf:Loading base model: Qwen2.5-1.5B-Instruct
INFO:hf-to-gguf:gguf: indexing model part 'model.safetensors'
INFO:gguf.gguf_writer:gguf: This GGUF file is for Little Endian only
INFO:lora-to-gguf:Exporting model...
INFO:hf-to-gguf:Set meta model
INFO:hf-to-gguf:Set model parameters
INFO:hf-to-gguf:Set model quantization version
INFO:hf-to-gguf:Set model tokenizer
INFO:gguf.gguf_writer:Writing the following files:
INFO:gguf.gguf_writer:lora_path/Lora_Path-F16.gguf: n_tensors = 0, total_size = negligible - metadata only
Writing: 0.00byte [00:00, ?byte/s]
INFO:lora-to-gguf:Model successfully exported to lora_path/Lora_Path-F16.gguf
  • Test Case 2: Explicit Remote Model ID
python convert_lora_to_gguf.py --base-model-id Qwen/Qwen2.5-1.5B-Instruct lora_path 
INFO:lora-to-gguf:Loading base model from Hugging Face: Qwen/Qwen2.5-1.5B-Instruct
INFO:hf-to-gguf:Using remote model with HuggingFace id: Qwen/Qwen2.5-1.5B-Instruct
INFO:gguf.gguf_writer:gguf: This GGUF file is for Little Endian only
INFO:lora-to-gguf:Exporting model...
INFO:hf-to-gguf:Set meta model
INFO:hf-to-gguf:Set model parameters
INFO:hf-to-gguf:Set model quantization version
INFO:hf-to-gguf:Set model tokenizer
INFO:gguf.gguf_writer:Writing the following files:
INFO:gguf.gguf_writer:lora_path/Qwen-Qwen2.5-1.5B-Instruct-F16.gguf: n_tensors = 0, total_size = negligible - metadata only
Writing: 0.00byte [00:00, ?byte/s]
INFO:lora-to-gguf:Model successfully exported to lora_path/Qwen-Qwen2.5-1.5B-Instruct-F16.gguf
  • Test Case 3: Auto-infer from Adapter Config
INFO:lora-to-gguf:Loading base model from Hugging Face: Qwen/Qwen2.5-1.5B-Instruct
INFO:gguf.gguf_writer:gguf: This GGUF file is for Little Endian only
INFO:lora-to-gguf:Exporting model...
INFO:hf-to-gguf:Set meta model
INFO:hf-to-gguf:Set model parameters
INFO:hf-to-gguf:Set model quantization version
INFO:hf-to-gguf:Set model tokenizer
INFO:gguf.gguf_writer:Writing the following files:
INFO:gguf.gguf_writer:lora_path/Lora_Path-F16.gguf: n_tensors = 0, total_size = negligible - metadata only
Writing: 0.00byte [00:00, ?byte/s]
INFO:lora-to-gguf:Model successfully exported to lora_path/Lora_Path-F16.gguf

CISC merged commit 5088b43 into ggml-org:master Nov 20, 2025
6 checks passed
Anico2 added a commit to Anico2/llama.cpp that referenced this pull request Jan 15, 2026
…ora_to_gguf (ggml-org#17385)

* fix: TypeError when loading base model remotely in convert_lora_to_gguf

* refactor: simplify base model loading using cache_dir from HuggingFace

* Update convert_lora_to_gguf.py

Co-authored-by: Sigbjørn Skjæret <sigbjorn.skjaeret@scala.com>

* feat: add remote_hf_model_id to trigger lazy mode in LoRA converter

---------

Co-authored-by: Sigbjørn Skjæret <sigbjorn.skjaeret@scala.com>

Labels

python python script changes


Development

Successfully merging this pull request may close these issues.

convert_lora_to_gguf.py fails with dir_model is None when loading base model from HF

2 participants