
[Backend] Support embedding model #242

Merged
3 commits merged into moxin-org:dev on Sep 9, 2024

Conversation

L-jasmine (Collaborator)

No description provided.

Signed-off-by: csh <458761603@qq.com>
@L-jasmine requested review from juntao and jmbejar, September 1, 2024 16:00
@L-jasmine (Collaborator, Author)

@juntao @jmbejar

If the model-card repository contains an embedding.json file, the corresponding embedding model will be downloaded, and both the embedding and chat models will be loaded.

moxin-org/model-cards#8
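The gating described above could be sketched roughly as follows. This is a hypothetical illustration, not Moxin's actual code: the function name and the assumption that `embedding.json` sits at the root of the cloned model-card repo are mine.

```rust
use std::fs;
use std::path::Path;

// Hypothetical check mirroring the behavior described above: the
// embedding model is only fetched when the cloned model-card repo
// contains an embedding.json file. Names and layout are illustrative.
fn has_embedding_config(repo_dir: &Path) -> bool {
    repo_dir.join("embedding.json").is_file()
}

fn main() {
    let dir = std::env::temp_dir().join("model-card-demo");
    fs::create_dir_all(&dir).unwrap();
    fs::write(dir.join("embedding.json"), "{}").unwrap();
    assert!(has_embedding_config(&dir));
    println!("embedding config detected");
}
```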

@L-jasmine (Collaborator, Author)

https://github.com/moxin-org/moxin/pull/242/files#diff-d406d942d2259b118601d9d68766930f7b8a6bf5d6700fb22011065649d4024fR207-R208

We can set the environment variable MODEL_CARDS_REPO at compile time to control which model-card repository the backend should clone.
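A minimal sketch of such a compile-time override, assuming the variable is read with `option_env!` (the default URL below is an assumption for illustration):

```rust
// Fallback repository URL; an assumption for this sketch.
const DEFAULT_REPO: &str = "https://github.com/moxin-org/model-cards";

fn model_cards_repo() -> &'static str {
    // option_env! resolves at compile time: the build environment,
    // not the runtime environment, decides which repo gets cloned.
    option_env!("MODEL_CARDS_REPO").unwrap_or(DEFAULT_REPO)
}

fn main() {
    println!("cloning model cards from {}", model_cards_repo());
}
```

Building with `MODEL_CARDS_REPO=https://example.com/my-cards cargo build` would bake the alternate URL into the binary.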

@juntao (Collaborator) commented Sep 1, 2024

Thanks. Please see my comment on the model card repo on the startup parameters for the nomic-embed model.

Also, when does the moxin app download the embedding model? What if the embedding model is not yet fully downloaded when the user starts to chat?

@jmbejar (Collaborator) commented Sep 2, 2024

> Thanks. Please see my comment on the model card repo on the startup parameters for the nomic-embed model.
>
> Also, when does the moxin app download the embedding model? What if the embedding model is not yet fully downloaded when the user starts to chat?

The PR's approach is to download the embedding model at Moxin's startup, which takes many seconds on my machine and could be slower for other people.

Maybe a good first step is to move this download to a background task and let chat interactions use it whenever it becomes available. Of course, we would need some UI indication to tell the user whether text embedding is in use, but we can do that in a separate PR.

@jmbejar (Collaborator) commented Sep 2, 2024

@L-jasmine I tested this PR, but the embedding json file was not downloaded in my first attempts. Only when I removed the downloaded model-cards repo did the app recreate (clone) it and download the embedding model. So I'm not sure existing users will get the embedding model downloaded.

It looks like the "git pull" (in the model-cards repo folder) is not working for some reason.
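A clone-or-update helper of the kind this comment implies is missing might look like the sketch below. The function name and the `--ff-only` flag are my assumptions, not Moxin's actual code; the point is that an existing checkout must actually be pulled, or newly added files such as embedding.json never appear locally.

```shell
# Hypothetical helper: update an existing model-cards checkout,
# or clone it fresh when no checkout exists yet.
sync_model_cards() {
    repo_url="$1"
    dest="$2"
    if [ -d "$dest/.git" ]; then
        # Existing checkout: fast-forward to pick up new files.
        git -C "$dest" pull --ff-only
    else
        git clone "$repo_url" "$dest"
    fi
}
```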

@jmbejar (Collaborator) commented Sep 5, 2024

@juntao @L-jasmine I'm not merging this yet because I feel the issues raised in the comments are worth addressing.

However, I'm curious what you think. Is it feasible (and worthwhile) to download the embedding model on a separate thread so it does not block application startup, or should we just merge and try to address it on the frontend somehow?

BTW @L-jasmine, let me know if you were able to reproduce the other issue, where you already have a local copy of the model-cards repo but it fails to pull the new json file.

@juntao (Collaborator) commented Sep 5, 2024

Yeah. Let's download it in a separate thread. We could prevent the user from chatting before this download completes. Thanks. @L-jasmine
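The plan agreed on here, download on a worker thread and block chat until it finishes, can be sketched as below. All names (`download_embedding_model`, `spawn_embedding_download`) are assumptions for illustration, not Moxin's actual API.

```rust
use std::sync::atomic::{AtomicBool, Ordering};
use std::sync::Arc;
use std::thread;

// Placeholder for the real download logic (an assumption here).
fn download_embedding_model() {}

// Spawn the download on a worker thread; return a shared readiness flag.
fn spawn_embedding_download() -> Arc<AtomicBool> {
    let ready = Arc::new(AtomicBool::new(false));
    let flag = Arc::clone(&ready);
    thread::spawn(move || {
        download_embedding_model();
        flag.store(true, Ordering::Release);
    });
    ready
}

fn main() {
    let ready = spawn_embedding_download();
    // The chat path would refuse (or queue) requests until this is true:
    while !ready.load(Ordering::Acquire) {
        thread::yield_now();
    }
    println!("embedding model ready; chat unblocked");
}
```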

Signed-off-by: csh <458761603@qq.com>
Signed-off-by: csh <458761603@qq.com>
@jmbejar (Collaborator) commented Sep 9, 2024

@L-jasmine Tested again. Here are the results:

1- The git pull issue is resolved 👍

2- I'm hitting an issue with the background download of the embedding model.
I tested having a chat conversation while the embedding model was being downloaded, but later I realized I need to reload the model manually (from the UI) to have the embedding included in the chat section.

It may require some thought on the UI side to properly handle the first time the embedding model is downloaded and how the user is informed about it. But let's handle that in another PR.

Accepting and merging now

@jmbejar jmbejar merged commit 6714258 into moxin-org:dev Sep 9, 2024
5 checks passed
@L-jasmine (Collaborator, Author)

> @L-jasmine Tested again. Here are the results:
>
> 1- The git pull issue is resolved 👍
>
> 2- I'm hitting an issue with the background download of the embedding model. I tested having a chat conversation while the embedding model was being downloaded, but later I realized I need to reload the model manually (from the UI) to have the embedding included in the chat section.
>
> It may require some thought on the UI side to properly handle the first time the embedding model is downloaded and how the user is informed about it. But let's handle that in another PR.
>
> Accepting and merging now

Yes. I currently have no way to notify the frontend that the embedding model has finished downloading, so I can only load it during a reload.
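One way to close that gap would be a backend-to-frontend event channel, as in the sketch below. The event type and names are assumptions for illustration; how the real frontend consumes such events is outside this PR.

```rust
use std::sync::mpsc;
use std::thread;

// Hypothetical event the backend could emit when the download finishes.
enum BackendEvent {
    EmbeddingModelReady,
}

fn main() {
    let (tx, rx) = mpsc::channel();
    thread::spawn(move || {
        // ... download the embedding model here ...
        tx.send(BackendEvent::EmbeddingModelReady).unwrap();
    });
    // The frontend's event loop would pick this up and load the
    // embedding model without requiring a manual reload:
    match rx.recv().unwrap() {
        BackendEvent::EmbeddingModelReady => println!("embedding model ready"),
    }
}
```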
