[Version 0.2.50] Models get unloaded from memory after each query #181

Open
shebrien opened this issue Mar 21, 2023 · 3 comments

Comments

@shebrien

I am testing the LLaMA 30B model and noticed that whenever I submit a query, it takes a long time to load the model into memory, then it writes the response, and THEN it deallocates the memory it used. I think this overhead will make the web interface inefficient to use. I noticed the same behavior with Alpaca 7B, but the effect is less noticeable given the smaller model.

I would appreciate an option to keep the models loaded to eliminate this overhead.

@yosr481

yosr481 commented Mar 22, 2023

+1 to this! I'm using version 0.3.1 and have experienced the same issue with models getting unloaded from memory after each query. It would be great if there was an option to keep the models loaded in memory to reduce the overhead and make it more efficient to use the web interface. Thanks for considering this feature request!

@trevtravtrev

If this is true, and the application currently loads the entire model into memory and then deallocates it after every prompt, I would recommend making this the #1 project priority. It is extremely inefficient and makes the application unusable with large weights or on less powerful PCs. Does anyone have an idea how to tackle this?
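
One possible direction, sketched here under stated assumptions rather than taken from the project's code: keep each loaded model in a process-level cache so that only the first query pays the load cost. This thread does not show how the project actually launches the model, so `ModelCache`, `fake_loader`, and the model name below are all hypothetical stand-ins used purely for illustration.

```python
from typing import Callable, Dict, Generic, TypeVar

M = TypeVar("M")


class ModelCache(Generic[M]):
    """Keeps loaded models resident so repeated queries skip the load step."""

    def __init__(self, loader: Callable[[str], M]) -> None:
        self._loader = loader            # hypothetical: the expensive load-from-disk step
        self._models: Dict[str, M] = {}  # models that are currently resident
        self.load_count = 0              # counts real loads, for illustration only

    def get(self, name: str) -> M:
        # Load only on first use; later calls reuse the resident instance.
        if name not in self._models:
            self._models[name] = self._loader(name)
            self.load_count += 1
        return self._models[name]

    def unload(self, name: str) -> None:
        # Explicit release, e.g. before switching to a different large model.
        self._models.pop(name, None)


def fake_loader(name: str) -> dict:
    # Stand-in for reading weights from disk; not a real project or library call.
    return {"name": name}


cache = ModelCache(fake_loader)
for _ in range(3):
    model = cache.get("llama.30B")  # three queries, one actual load
```

With this shape, memory is released only when `unload` is called, which would be the natural place to hang the "keep models loaded" option requested above. The trade-off is that a resident 30B model holds its memory between queries.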

@MasMedIm

I have the same problem. Can anyone open a PR to solve it? This is a major issue.

mirroredkube pushed a commit to mirroredkube/dalai that referenced this issue Mar 26, 2023
Co-authored-by: Georgi Gerganov <ggerganov@gmail.com>