Chain server needs restart after initial file upload #61

MattFeinberg · 2024-11-15T12:58:54Z

On a fresh clone and first run of NIM Anywhere, the first time you upload files to the knowledge base, the chain server can't pick up the new documents until it is restarted.

Steps to reproduce:

Remove any instances of NIM Anywhere on your workbench (you might not have to do this, but it's what I've been doing to keep things simple when reproducing this error)

Remove the milvus docker volume created by NIM Anywhere

Clone the NIM Anywhere project and go through the startup/config steps

When it comes time to turn applications on, turn them on in the following order:

Milvus
Redis
Chain Server
Chat frontend
Jupyter Lab

Then, go to the Jupyter notebook and run through the steps to upload files.

I don't think it reliably uploads the same files every time, so you should copy one of the uploaded file names from the Jupyter notebook output and find that file in the /data/dataset folder. Open the file and find the name of the article's author. Try to find an article by a single person rather than a group.

You can either ask the chatbot or the chain server directly (make sure you check the use knowledge base checkbox):

"Tell me about an article in the context by AUTHOR_NAME_HERE"

You should notice that the chatbot does not correctly identify the article.

Then restart the chain server and try again. You should see that the chatbot either names the article directly or, sometimes, just describes it or tells you what it is about.

rmkraus · 2024-11-17T01:05:25Z

I've run through a couple permutations and here is what I've found.

If the Milvus collection has not been populated when the chain server starts, you currently have to restart langchain after data is added to the collection. This seems to mostly be because the collection's tables cannot be (easily) initialized until an embedded vector is provided. When the data gets added, the tables get created and the collection works, but langchain never goes back to check.
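A toy model of that behavior, for illustration only. `FakeCollection` and `build_retriever` are hypothetical stand-ins, not Milvus or LangChain APIs, and the document text is a placeholder. The retriever captures the collection's state once at startup, so data added later stays invisible until the retriever is rebuilt:

```python
class FakeCollection:
    """Hypothetical stand-in for a vector collection; not the Milvus API."""

    def __init__(self):
        self.rows = []            # empty until the first upload
        self.initialized = False  # "tables" exist only after the first insert

    def insert(self, text):
        self.rows.append(text)
        self.initialized = True


def build_retriever(collection):
    """Bind a retriever to the collection as it exists right now."""
    if not collection.initialized:
        # Nothing to bind to yet, so fall back to a retriever that finds nothing.
        return lambda query: []
    return lambda query: [r for r in collection.rows if query.lower() in r.lower()]


collection = FakeCollection()
retriever = build_retriever(collection)      # chain server starts here
collection.insert("An article by Jane Doe")  # first upload happens afterwards

stale_hits = retriever("jane doe")           # still the startup-time retriever
retriever = build_retriever(collection)      # what a restart effectively does
fresh_hits = retriever("jane doe")
```

In this sketch `stale_hits` is empty while `fresh_hits` contains the uploaded document, which mirrors the before/after-restart behavior described in the report.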

I think the best thing to do for this is to add a GUI for uploading documents and have that upload function trigger a reload of the chain if it is uploading data into an empty collection.
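A minimal sketch of that idea, under stated assumptions: `ChainHolder` and `upload_documents` are hypothetical names, a plain list stands in for the Milvus collection, and a dict stands in for the chain. It is not the project's actual code, only the "reload when the collection was empty before the upload" logic:

```python
from typing import Callable, List


class ChainHolder:
    """Owns the current chain and can rebuild it on demand (hypothetical)."""

    def __init__(self, build_chain: Callable[[], object]):
        self._build_chain = build_chain
        self.chain = build_chain()
        self.reload_count = 0

    def reload(self) -> None:
        self.chain = self._build_chain()
        self.reload_count += 1


def upload_documents(store: List[str], docs: List[str], holder: ChainHolder) -> bool:
    """Add docs to the store; reload the chain only if the store was empty before.

    Returns True when a reload was triggered.
    """
    was_empty = len(store) == 0
    store.extend(docs)
    if was_empty and docs:
        holder.reload()
        return True
    return False


store: List[str] = []  # stands in for the Milvus collection
holder = ChainHolder(lambda: {"docs_seen": len(store)})

first = upload_documents(store, ["a.txt", "b.txt"], holder)  # empty -> reload
second = upload_documents(store, ["c.txt"], holder)          # populated -> no reload
```

Checking emptiness before the insert keeps the reload a one-time cost: later uploads go into an already initialized collection and should not need it.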

MattFeinberg · 2024-11-19T08:44:35Z

I agree. I'll take a stab at implementing this.

rmkraus added the enhancement, user interface, and chain server labels on Nov 17, 2024

AmeliaYe mentioned this issue on Nov 26, 2024 in Docker Compose Integration #65 (open)
