Latency high after loading a new model. #385
Comments
Some tensorflow graphs perform lazy initialization, making the first request (or few requests) to a newly-loaded model slow. The best way to handle that is to add initialization or dummy "warm-up requests" to the init op which tf-serving calls while loading the model. |
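To make the warm-up idea concrete, here is a minimal sketch in plain Python. It is not TensorFlow or tf-serving code: `LazyModel`, `load_with_warmup`, and `timed_predict` are hypothetical names, and the lazy initialization is simulated with a sleep. It only illustrates the principle that dummy requests issued during load absorb the one-time cost before real traffic arrives.

```python
import time


class LazyModel:
    """Toy stand-in for a graph that initializes lazily on first use (hypothetical)."""

    def __init__(self, init_cost_s=0.05):
        self._init_cost_s = init_cost_s
        self._initialized = False

    def predict(self, x):
        if not self._initialized:
            # Simulated one-time cost paid by whichever request arrives first.
            time.sleep(self._init_cost_s)
            self._initialized = True
        return [2 * v for v in x]


def load_with_warmup(model, warmup_requests):
    """Run dummy requests at load time, before the model sees real traffic."""
    for request in warmup_requests:
        model.predict(request)
    return model


def timed_predict(model, x):
    """Return the prediction together with its wall-clock latency in seconds."""
    start = time.perf_counter()
    result = model.predict(x)
    return result, time.perf_counter() - start
```

With this sketch, `timed_predict(LazyModel(), [1.0])` pays the simulated initialization cost on the first call, whereas a model returned by `load_with_warmup(LazyModel(), [[0.0]])` has already paid it during load, so its first real request is fast.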
@chrisolston Thank you very much for your explanation and suggestion; the problem is clear to me now. b) Add lazy-loading model: |
For (a), the recommended approach is to do it within the tf graph, triggered from tf-serving calling the init op during load. For (b), interesting idea. I would expect various I/O queues to smooth it out anyway but maybe you are hitting timeouts? You could write a custom SourceAdapter that acts as the identity function but adds a random delay -- that would do the trick. Feel free to contribute the SourceAdapter via a PR. |
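The "identity function plus random delay" suggestion for (b) can be illustrated with a small conceptual sketch. This is plain Python, not the tf-serving C++ SourceAdapter interface: the class name `JitteredIdentityAdapter`, the `adapt` method, and the `max_delay_s` parameter are all assumptions made for illustration. The sleep function and random generator are injectable so the behavior can be checked without actually waiting.

```python
import random
import time


class JitteredIdentityAdapter:
    """Passes each item through unchanged after a random delay.

    Conceptual stand-in only; this is not the tf-serving SourceAdapter API.
    """

    def __init__(self, max_delay_s, sleep=time.sleep, rng=None):
        if max_delay_s < 0:
            raise ValueError("max_delay_s must be non-negative")
        self._max_delay_s = max_delay_s
        self._sleep = sleep
        self._rng = rng or random.Random()

    def adapt(self, item):
        # Wait a uniformly random time in [0, max_delay_s], then forward the item as-is.
        delay = self._rng.uniform(0.0, self._max_delay_s)
        self._sleep(delay)
        return item
```

For example, `JitteredIdentityAdapter(5.0).adapt(new_version)` would return `new_version` untouched after waiting up to five seconds, so several servers receiving the same new version would not all start loading it at the same instant.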
Hi, I have the exact same problem. However, I do not understand how one can add initialization or dummy "warm-up requests" to the init op (I used Keras for training and the SavedModelBuilder for exporting the model). Can you please explain it in more detail, e.g. with a code example? Thanks! |
Ping @chrisolston |
same problem |
Hi @chrisolston , I have the same problem, can you provide an example on how to call the init op during load? |
Hi @chrisolston, the current version of tf serving tries to load warm-up requests from |
I'm using TensorFlow Serving to load a wide-and-deep model as an online prediction service, and the model is updated every 10 minutes. We found that the latency of the first few requests is high right after a new model is loaded. Is this a known issue, and are there any suggestions for solving this problem?