Model File Manager #789
base: master
Conversation
I've left a few comments. The main issue is being completely clear about ownership and resource disposal (models are expensive to load and to keep loaded, so it's important that resource management is completely unambiguous!)
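The "disposable and ref counter" approach mentioned in the commit messages could be sketched roughly as follows. All names here (`SharedModelHandle`, `Acquire`) are illustrative placeholders, not the actual LLamaSharp API:

```csharp
using System;
using System.Threading;

// Illustrative sketch only: a reference-counted wrapper around expensive
// model weights. The weights are disposed exactly once, when the last
// holder releases its reference.
public sealed class SharedModelHandle<T> : IDisposable where T : IDisposable
{
    private readonly T _resource;
    private int _refCount = 1;

    public SharedModelHandle(T resource) => _resource = resource;

    // Access the underlying weights, guarding against use-after-dispose.
    public T Resource
    {
        get
        {
            if (Volatile.Read(ref _refCount) <= 0)
                throw new ObjectDisposedException(nameof(SharedModelHandle<T>));
            return _resource;
        }
    }

    // Take an extra reference; every Acquire must be paired with a Dispose.
    public SharedModelHandle<T> Acquire()
    {
        if (Interlocked.Increment(ref _refCount) <= 1)
            throw new ObjectDisposedException(nameof(SharedModelHandle<T>));
        return this;
    }

    // Release one reference; the weights are freed when the count hits zero.
    public void Dispose()
    {
        if (Interlocked.Decrement(ref _refCount) == 0)
            _resource.Dispose();
    }
}
```

The key property is that ownership is unambiguous: whoever holds a reference must dispose it exactly once, and the weights live exactly as long as at least one reference does.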
* seed manager
* model manager init
* interface
* test
* tests
* no default configurator
* Rename class
* handle already disposed

Co-authored-by: Pat Hov <hov@hov.com>
Co-authored-by: pat_hov <hov@hovbook>
* organization
* disposable and ref counter

Co-authored-by: pat_hov <hov@hovbook>
Thank you for your contribution! I left some comments; please let me know if any help is needed.
* organization
* disposable and ref counter
* separate concerns a bit more
* check
* tweak

Co-authored-by: pat_hov <hov@hovbook>
* organization
* disposable and ref counter
* separate concerns a bit more
* check
* tweak
* stash
* note

Co-authored-by: pat_hov <hov@hovbook>
LLama/Model/HuggingFaceModelRepo.cs
Essentially for demo purposes; I wanted to see how abstract the interface is.
If this is just a test, do we want to keep it in this PR? I think @AsakusaRinne was working on HF integrations for model loading, so you might want to check the status of that work and add something in a separate PR?
Curious what the thoughts are on that
Sorry for the late reply. I agree that it could be removed from this PR and added to the LLama.Experimental project in a separate PR. With HuggingfaceHub, it's easy to download a model from the Hugging Face Hub. It's easy to implement a remote model manager for gguf files, but the APIs might change in the future if we want to support other formats (.safetensors, .bin) based on GGMLSharp. So I would recommend putting it in LLama.Experimental first.
```csharp
/// <summary>
/// Unload and dispose of a model with the given id
/// </summary>
/// <param name="modelId"></param>
/// <returns></returns>
public bool UnloadModel(string modelId);

/// <summary>
/// Unload all currently loaded models
/// </summary>
public void UnloadAllModels();
```
There are a few challenges with supporting this, and I'm thinking maybe it's better to just drop these methods and have the caller explicitly call Dispose on the weights?
A few questions:

- Should multiple instances of the same model be allowed to be loaded? I think yes. If so, we'll need a way to ensure calling unload is specific to a model.
- Should we force model aliases in the cache to be unique? This would help the issue where I load multiple instances of the model: calling dispose would then be guaranteed to operate on that same model. If we don't want to enforce this restriction, unloading the correct model becomes trickier and might require the original model to be passed in, as opposed to the alias.
- Are we better off getting rid of this class altogether? My main goal was to have something like IModelRepo, and this quickly went out of scope. If we think it'll be useful I'm happy to leave it, but I'm struggling a bit to justify it, especially given the unload/dispose challenges.
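One way to make unload specific to a single instance (the first question above) is to have every load return its own disposable handle, so disposal never has to be resolved through an alias. A minimal sketch with made-up names (`InstanceCache`, `Handle`), not the PR's actual types:

```csharp
using System;
using System.Collections.Generic;

// Illustrative sketch only: each Load returns a distinct handle, so two
// loads of the same model id can be unloaded independently.
public sealed class InstanceCache
{
    private readonly HashSet<Handle> _live = new();

    public sealed class Handle : IDisposable
    {
        private readonly InstanceCache _owner;
        public string ModelId { get; }

        internal Handle(InstanceCache owner, string modelId)
        {
            _owner = owner;
            ModelId = modelId;
        }

        // Disposing removes exactly this instance, never a same-id sibling.
        public void Dispose() => _owner._live.Remove(this);
    }

    public Handle Load(string modelId)
    {
        var handle = new Handle(this, modelId);
        _live.Add(handle);
        return handle;
    }

    public int LiveCount => _live.Count;
}
```

Because the handle is the unit of identity (reference equality), unloading one of several same-id instances is unambiguous without forcing aliases to be unique.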
> Should multiple instances of the same model be allowed to be loaded? I think yes. If so, we'll need a way to ensure calling unload is specific to a model.

Definitely. You can load a model with different settings (e.g. loras) which affect the output almost as if it's another model.

> Should we force model aliases in the cache to be unique?

Do you mean the modelId string? If so, then I would think loading a model with an ID which is already in the cache would either:

- throw an error
- return the existing model

I'd lean towards throwing the error out of those choices.

> Are we better off getting rid of this class altogether?

The model repo idea is interesting, and I think it's something Rinne has looked at as well. But yeah, maybe it would be better to remove it from this PR and make this PR all about the new shared lifetimes (which I think is a pretty big improvement in itself). The model repo can be done in a follow-up PR, fully taking advantage of the new lifetimes.
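The throw-on-duplicate behaviour preferred above could look roughly like this; `ModelCache` and its members are a hypothetical sketch, not the PR's actual types:

```csharp
using System;
using System.Collections.Generic;

// Illustrative sketch only: a cache that enforces unique model ids by
// throwing when an id is already present, rather than silently returning
// the existing model.
public sealed class ModelCache<T>
{
    private readonly Dictionary<string, T> _models = new();

    public void Add(string modelId, T model)
    {
        // Dictionary.TryAdd returns false instead of throwing, letting us
        // raise a clearer, domain-specific error.
        if (!_models.TryAdd(modelId, model))
            throw new ArgumentException(
                $"A model with id '{modelId}' is already loaded.", nameof(modelId));
    }

    public bool TryGet(string modelId, out T model) =>
        _models.TryGetValue(modelId, out model!);

    public bool Remove(string modelId) => _models.Remove(modelId);
}
```

Throwing makes the duplicate visible at the call site, which fits the goal of keeping resource ownership explicit: silently returning the existing model would hide the fact that the caller did not actually load anything.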
If anything, I'd rather keep the model repo and the file system class than this, but I've further reworked it to be more explicit. I haven't been able to finalize what a "good" API for something like this looks like (name aside), but the discussion here has helped add context.
cleanup
Have any thoughts on this, @AsakusaRinne?
Overall the code looks good. Since this PR adds many new APIs, could you please add a document introducing the usage of the model cache and model repo? Please refer to #747 to see how to add a doc. :)
Create a simple manager to make administration of local models simpler