[Version 0.2.50] Models get unloaded from memory after each query #181
Comments
+1 to this! I'm using version 0.3.1 and have experienced the same issue with models getting unloaded from memory after each query. It would be great if there was an option to keep the models loaded in memory to reduce the overhead and make it more efficient to use the web interface. Thanks for considering this feature request!
If this is true, and this currently loads the entire model into memory and then deallocates it after every prompt, I would recommend making this the #1 project priority to sort out. It is extremely inefficient and even makes the application unusable for any large weights or non-beefy PCs. Anyone have an idea how to tackle this?
I have the same problem. Could anyone open a PR to solve this? It's a major issue.
I am testing the llama 30B model and noticed that whenever I write a query it takes a long time to load the model into memory, then it writes the response, THEN it deallocates the used memory. I think this overhead will make it inefficient to use the web interface. This was also noticeable when choosing alpaca 7B, but the effect isn't as pronounced given the smaller model.
Would appreciate an option to keep the models loaded in memory to eliminate this overhead.
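To illustrate the difference being requested, here is a minimal Python sketch contrasting the reload-per-query pattern described above with a load-once, keep-resident pattern. `load_model` and `generate` are hypothetical stand-ins for the expensive weight load and the inference call — they are not llama.cpp's actual API.

```python
import time

def load_model(path):
    """Hypothetical stand-in for the expensive model load.
    A real 30B checkpoint would take many seconds here."""
    time.sleep(0.01)  # simulate the slow read of the weights from disk
    return {"path": path}

def generate(model, prompt):
    """Hypothetical stand-in for inference against loaded weights."""
    return f"response to {prompt!r} from {model['path']}"

def answer_reload_each_time(prompts, path):
    """The inefficient pattern described in the issue:
    load, answer, deallocate -- the load cost is paid per query."""
    outputs = []
    for p in prompts:
        model = load_model(path)          # paid once PER QUERY
        outputs.append(generate(model, p))
        del model                         # weights freed after every prompt
    return outputs

def answer_keep_loaded(prompts, path):
    """The requested behaviour: load once, keep the weights
    resident across all queries in the session."""
    model = load_model(path)              # paid once per session
    return [generate(model, p) for p in prompts]
```

The two functions return identical outputs; the only difference is how many times the load cost is paid, which is exactly the overhead the original report is about.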