
How to make it faster? #53

Closed
theytookcake opened this issue Jan 26, 2024 · 3 comments

@theytookcake

Hello!

What settings should I tweak to make it faster? I don't know if it's even possible.

I know that smaller models answer faster, but I'm using a Mistral 7B Instruct model and it takes around 10 seconds to answer. Is there anything I can tweak to make it respond faster?

@amakropoulos
Collaborator

Hi! 10 seconds is too long.
Is this the time it takes to get the first response, or every response?
On an 8-year-old 6-core CPU, it takes ~2-3 seconds for the response to arrive.
Can you build the project and see if it takes the same amount of time?

@theytookcake
Author

Hello! I'm now running the Warmup function in Start and it's much faster. Using streaming, it takes around a second for text to start arriving!

@amakropoulos
Collaborator

Perfect! Yes, that was the reason I was asking :).

The first response needs to process the character prompt, and if the prompt is large, that takes some time.
That's why Warmup helps: it processes the character prompt once and caches the result.
I should add it to all the samples!
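For reference, a minimal sketch of the pattern discussed above: warm up in `Start` so the character prompt is processed once at startup, then use the streaming callback so text appears as it is generated. The names `LLMCharacter`, `Warmup`, and `Chat`, and their signatures, are assumptions based on the LLM for Unity package and may differ between versions — check the package's samples for the exact API.

```csharp
using UnityEngine;
using LLMUnity; // assumed namespace of the LLM for Unity package

public class ChatExample : MonoBehaviour
{
    public LLMCharacter llmCharacter; // assumed component name; assign in the Inspector

    async void Start()
    {
        // Warm up at startup: the character prompt is processed once
        // and cached, so the first user message doesn't pay that cost.
        await llmCharacter.Warmup();
        Debug.Log("Model warmed up, ready to chat.");
    }

    public void OnUserMessage(string message)
    {
        // Streaming: the partial-reply callback fires repeatedly as
        // tokens arrive, so text starts appearing almost immediately
        // instead of waiting for the full answer.
        _ = llmCharacter.Chat(message, HandlePartialReply, ReplyCompleted);
    }

    void HandlePartialReply(string partialReply) => Debug.Log(partialReply);
    void ReplyCompleted() => Debug.Log("Reply finished.");
}
```

With this in place, only the warmup call pays the prompt-processing cost; subsequent chats reuse the cached prompt state.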

I'll close the issue, let me know if you get stuck with anything else!
