fix: #3558 wrong model metadata import or download from HuggingFace #3725
Describe Your Changes
When an HF GGUF model is downloaded via URL import from the model hub, it is saved with default settings instead of the settings defined by the model: the context length is fixed at 2048, the prompt template is incorrect, and the stop word is wrong. Importing the same GGUF file directly does not show these problems.
Steps to reproduce
Download the HF GGUF model from the model hub using URL import.
Check the settings for context length, prompt template, and stop word.
Compare the settings with the correct settings for the GGUF model.
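The comparison in the last step can be sketched as a small helper. This is only an illustration: the `ImportedSettings` shape and the field names `ctx_len`, `prompt_template`, and `stop` are assumptions standing in for the context length, prompt template, and stop word mentioned above, not the extension's actual types.

```typescript
// Hypothetical, simplified view of the three settings named in this report.
interface ImportedSettings {
  ctx_len: number
  prompt_template: string
  stop: string[]
}

// Returns the names of the fields whose values differ between two imports.
function diffSettings(a: ImportedSettings, b: ImportedSettings): string[] {
  const differing: string[] = []
  if (a.ctx_len !== b.ctx_len) differing.push('ctx_len')
  if (a.prompt_template !== b.prompt_template) differing.push('prompt_template')
  // Stop words are compared element by element, in order.
  const sameStop =
    a.stop.length === b.stop.length && a.stop.every((s, i) => s === b.stop[i])
  if (!sameStop) differing.push('stop')
  return differing
}
```

Calling `diffSettings` with the settings produced by URL import and those produced by direct import should return an empty array once the two paths agree.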
Expected behavior
The HF GGUF model downloaded via URL import should have the correct settings for context length, prompt template, and stop word, matching the settings when GGUF is imported directly.
Additional context
The issue seems to be specific to downloading the HF GGUF model via URL import from the model hub. Importing GGUF directly does not exhibit the same issue with default settings.
Fixes Issues
model.yaml for Model Downloaded via URL #3558
Screenshots
Code Changes
extensions/model-extension/src/index.test.ts
extensions/model-extension/src/index.ts
extensions/model-extension/src/node/index.ts
extensions/model-extension/src/node/node.test.ts
Here's a summary of the major changes across these files:
index.test.ts
downloadMock, mkdirMock, writeFileSyncMock, copyFileMock
fetch mock.
renderJinjaTemplate, Template
downloadModel with invalid GGUF metadata.
index.ts
retrieveGGUFMetadata that returns a partially updated model based on metadata.
parameters and settings using GGUF metadata.
node/index.ts
renderedTemplate from the GGUF metadata parsing process.
renderJinjaTemplate for parsing Jinja templates based on the metadata.
node/node.test.ts
renderJinjaTemplate function.
Detailed Examples
GGUF Metadata Handling: Added code to check for and update model settings based on metadata attributes like eos_token_id, context_length, block_count, etc.
Infrastructural Mocks in Tests: Created mocks and their respective test setups to emulate filesystem and network behaviors for more deterministic tests.
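The metadata handling described above can be illustrated with a minimal sketch. It is not the code from this PR: the function name `patchFromGgufMetadata`, the `ModelPatch` shape, and the choice of `ngl = block_count + 1` are assumptions made for the example. The metadata keys (`general.architecture`, `<arch>.context_length`, `<arch>.block_count`, `tokenizer.ggml.eos_token_id`, `tokenizer.ggml.tokens`) follow the GGUF key naming convention.

```typescript
// Hypothetical shapes for illustration; not the extension's actual types.
type GgufMetadata = Record<string, unknown>

interface ModelPatch {
  settings: { ctx_len?: number; ngl?: number }
  parameters: { stop?: string[] }
}

// Accept only finite numbers; anything else is treated as missing.
function asNumber(value: unknown): number | undefined {
  return typeof value === 'number' && Number.isFinite(value) ? value : undefined
}

// Builds a partial update from whatever metadata is present.
// Fields that cannot be derived are left unset so existing defaults survive.
function patchFromGgufMetadata(metadata: GgufMetadata): ModelPatch {
  const patch: ModelPatch = { settings: {}, parameters: {} }

  const arch = metadata['general.architecture']
  if (typeof arch === 'string') {
    const ctx = asNumber(metadata[`${arch}.context_length`])
    if (ctx !== undefined) patch.settings.ctx_len = ctx

    // Assumption: offload every block plus one extra layer.
    const blocks = asNumber(metadata[`${arch}.block_count`])
    if (blocks !== undefined) patch.settings.ngl = blocks + 1
  }

  // Resolve the end-of-sequence token id to its text and use it as a stop word.
  const eosId = asNumber(metadata['tokenizer.ggml.eos_token_id'])
  const tokens = metadata['tokenizer.ggml.tokens']
  if (eosId !== undefined && Array.isArray(tokens)) {
    const eos = tokens[eosId]
    if (typeof eos === 'string') patch.parameters.stop = [eos]
  }

  return patch
}
```

Returning a partial patch rather than a full model keeps invalid or incomplete metadata from overwriting settings that were already correct.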
These changes appear aimed at making model handling more robust and flexible, in particular by adjusting model settings dynamically based on the metadata of fetched models.