Yes, this would speed up requests immensely. I don't know exactly how these models work, but it seems oddly slow to add each token to the network. Is the part of the response that echoes a copy of the input actually the first prediction pass, so the model effectively re-predicts the tokens you fed it and has to go through this process for every query? I suspect this because it emits the input tokens at the same rate as the actual response tokens.
So perhaps the model is reloaded on every request because it needs to be in a "clean state" for the next one; otherwise it would carry the previous request's state into the new response?
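As a toy illustration of the distinction being guessed at here (not any particular implementation): prompt tokens only need to be *evaluated* to fill the attention cache, which can happen in one batched pass, while response tokens must be *sampled* one at a time, each conditioned on the previous one. An implementation that echoes the prompt at the same rate as the response is likely pushing the prompt through the same one-token-at-a-time loop. The `forward` function below is a hypothetical stand-in for a transformer forward pass.

```python
def forward(tokens, cache):
    """Stand-in for a model forward pass: extend the cache with the new
    tokens and 'predict' a next token deterministically (toy logic)."""
    cache = cache + list(tokens)
    next_token = sum(cache) % 100
    return next_token, cache

def answer(prompt_tokens, n_new):
    # Fast path: one batched call evaluates the whole prompt at once,
    # so the prompt never needs to be re-predicted token by token.
    next_token, cache = forward(prompt_tokens, [])
    out = []
    for _ in range(n_new):
        # Slow path: each response token requires its own forward call,
        # because it depends on the token sampled just before it.
        out.append(next_token)
        next_token, cache = forward([next_token], cache)
    return out
```

Under this sketch, only the `n_new` response tokens pay the per-token cost; the prompt is absorbed in a single call.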
Hello
Is it possible to keep the model loaded? Every request takes longer because the model also has to be loaded each time.
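A common way to avoid the per-request load cost is a load-once, serve-many process: the weights are loaded at startup and every request reuses them, with only the per-request context reset between calls. Below is a minimal sketch of that pattern; `load_model` and `generate` are hypothetical placeholders for whatever inference API is actually in use here, not real functions from this project.

```python
from http.server import BaseHTTPRequestHandler, HTTPServer

def load_model(path):
    # Placeholder: in a real server this would load the weights once.
    return {"path": path}

def generate(model, prompt):
    # Placeholder: run inference with a fresh per-request context, so no
    # state leaks between requests (reset the context, not the weights).
    return "response for: " + prompt

MODEL = load_model("model.bin")  # loaded once, reused by every request

class Handler(BaseHTTPRequestHandler):
    def do_POST(self):
        length = int(self.headers.get("Content-Length", 0))
        prompt = self.rfile.read(length).decode()
        reply = generate(MODEL, prompt)  # no model load on this path
        self.send_response(200)
        self.end_headers()
        self.wfile.write(reply.encode())

# To run as a persistent server (commented out so the sketch stays inert):
# HTTPServer(("localhost", 8080), Handler).serve_forever()
```

The key point is only that the expensive load happens once, outside the request handler; the serving mechanism (HTTP, socket, pipe) is incidental.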