[Backend] Support embedding model #242
Signed-off-by: csh <458761603@qq.com>
If the model-card repository contains an embedding.json file, the corresponding embedding model is downloaded, and both the embedding and chat models are loaded.
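To make the rule above concrete, here is a minimal sketch of the check, not the PR's actual code. It assumes `embedding.json` sits at the top level of the local model-card checkout; `ModelKind` and `models_to_load` are hypothetical names introduced only for illustration.

```rust
use std::path::Path;

/// Which models the backend should load (hypothetical type for this sketch).
#[derive(Debug, PartialEq)]
pub enum ModelKind {
    Chat,
    Embedding,
}

/// Decide what to load based on whether `embedding.json` exists in the
/// local model-card checkout. The file location is an assumption.
pub fn models_to_load(model_cards_dir: &Path) -> Vec<ModelKind> {
    // The chat model is always loaded.
    let mut models = vec![ModelKind::Chat];
    // The embedding model is added only when its descriptor file is present.
    if model_cards_dir.join("embedding.json").is_file() {
        models.push(ModelKind::Embedding);
    }
    models
}
```

With this shape, a checkout without `embedding.json` keeps the previous chat-only behavior.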
We can set the environment variable |
Thanks. Please see my comment in the model card repo about the startup parameters for the nomic-embed model. Also, when does the Moxin app download the embedding model? What happens if the embedding model is not yet fully downloaded when the user starts to chat?
The PR's approach is to download the embedding model at Moxin's startup, which takes many seconds on my machine and could be slower for other people. A good first step might be to move this download to a background task and have chat interactions use the embedding model whenever it is available. Of course, we would need some UI indication to tell the user whether text embedding is in use, but we can do that in a separate PR.
@L-jasmine I tested this PR, but the embedding JSON file was not downloaded in my first attempts. Only after I removed the downloaded models-cards repo did the app recreate (clone) it and download the embedding model. So I'm not sure whether existing users will get the embedding model. It looks like the "git pull" (in the models-cards repo folder) is not working for some reason.
@juntao @L-jasmine I'm not merging this yet because I feel the issues raised above are worth addressing. However, I'm curious to know what you think. Is it possible and convenient to download the embedding model on a separate thread so that it does not block application startup? Or should we just merge and try to address it on the frontend somehow? BTW @L-jasmine, let me know if you were able to reproduce the other issue, where you already have a local copy of the model-cards repo but it fails to pull the new JSON file.
Yeah. Let's download it in a separate thread. We could prevent the user from chatting before this download completes. Thanks. @L-jasmine
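The suggestion above (download on a separate thread, block chatting until it completes) could be sketched roughly as follows. This is an illustration under assumptions, not the code that landed in the PR: `EmbeddingReady`, `spawn_embedding_download`, and `can_chat` are hypothetical names, and the real download is stood in for by a closure.

```rust
use std::sync::atomic::{AtomicBool, Ordering};
use std::sync::Arc;
use std::thread::{self, JoinHandle};

/// Shared flag: true once the embedding model is fully downloaded.
/// (Hypothetical type for this sketch.)
#[derive(Clone, Default)]
pub struct EmbeddingReady(Arc<AtomicBool>);

impl EmbeddingReady {
    pub fn is_ready(&self) -> bool {
        self.0.load(Ordering::Acquire)
    }
}

/// Run `download` on a background thread so application startup is not
/// blocked. The flag flips only if `download` reports success.
pub fn spawn_embedding_download<F>(download: F) -> (EmbeddingReady, JoinHandle<()>)
where
    F: FnOnce() -> Result<(), String> + Send + 'static,
{
    let ready = EmbeddingReady::default();
    let flag = ready.clone();
    let handle = thread::spawn(move || {
        if download().is_ok() {
            flag.0.store(true, Ordering::Release);
        }
    });
    (ready, handle)
}

/// Gate for the chat entry point: refuse to chat until the download is done.
pub fn can_chat(ready: &EmbeddingReady) -> bool {
    ready.is_ready()
}
```

An atomic flag keeps the check cheap enough to call on every chat request. The trade-off is that the caller has to poll it; nothing is pushed to the UI when the state changes.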
Signed-off-by: csh <458761603@qq.com>
Signed-off-by: csh <458761603@qq.com>
@L-jasmine Tested again. Here are the results: 1- The git pull issue is resolved 👍 2- I'm hitting an issue with the download of the embedded schema in the background. It may require some thought on the UI side to properly handle the first time the embedding model is downloaded and how the user is informed about it. But let's handle that in another PR. Accepting and merging now.
Yes. Because I have no way to notify the frontend that the embedding model has finished downloading, I can only load the embedding model during the reload.
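One hypothetical way to close that gap in a follow-up is for the download thread to push a completion event over a channel that the frontend drains from its event loop. The sketch below uses only the Rust standard library. `BackendEvent` and `download_with_notification` are assumed names, not part of Moxin's existing backend protocol.

```rust
use std::sync::mpsc::{self, Receiver};
use std::thread;

/// Events the backend could push to the frontend (hypothetical names).
#[derive(Debug, PartialEq)]
pub enum BackendEvent {
    EmbeddingDownloaded,
    EmbeddingDownloadFailed(String),
}

/// Start the download in the background and return the receiving end,
/// which a frontend could poll with `try_recv` from its event loop.
pub fn download_with_notification<F>(download: F) -> Receiver<BackendEvent>
where
    F: FnOnce() -> Result<(), String> + Send + 'static,
{
    let (tx, rx) = mpsc::channel();
    thread::spawn(move || {
        let event = match download() {
            Ok(()) => BackendEvent::EmbeddingDownloaded,
            Err(reason) => BackendEvent::EmbeddingDownloadFailed(reason),
        };
        // The receiver may already be gone if the app is shutting down.
        let _ = tx.send(event);
    });
    rx
}
```

On `EmbeddingDownloaded` the frontend could trigger the embedding model load immediately instead of waiting for the next reload.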