Feat: Evict requests if the client has disconnected #208
Looking good so far, @bhimrazy! We might have to run some benchmarks to verify that we don't lose performance because of multiprocessing synchronization. But really good approach 😄
Thanks for the feedback, @aniketmaurya!
@bhimrazy I'd suggest going ahead with this technique and implementing it for the single non-batched loop first, then we can run some tests. If everything goes well, we can implement it for the other loops too.
Sure, that sounds great!
Codecov Report
Attention: Patch coverage is
Additional details and impacted files
@@          Coverage Diff          @@
##           main   #208    +/-   ##
===================================
- Coverage    82%    81%    -0%
===================================
  Files        13     13
  Lines      1048   1084    +36
===================================
+ Hits        855    881    +26
- Misses      193    203    +10
for more information, see https://pre-commit.ci
…razy/LitServe into feat/evict-req-on-client-disconnect
@bhimrazy please add a proper PR description. LitServe is live now, so it needs to follow Lightning AI's guidelines for production-ready OSS software.
Thanks!
Closing this PR. Due to the complexity involved, the streaming and non-streaming cases will be handled separately in new PRs, in a better and cleaner way. You can find the streaming case PR here: #223.
Before submitting
What does this PR do?
Fixes #165.
This PR addresses the situation when client requests are disconnected before completion. It tracks canceled requests and stops ongoing/running tasks, thereby saving computational resources.
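As a rough illustration of the mechanism this PR describes (the server flags a disconnected request in a shared dictionary, and the worker loop checks that dictionary while it runs), here is a minimal sketch. Only the name request_evicted_status comes from the PR; mark_evicted, run_stream, and the request ID "req-1" are hypothetical names made up for the example, and this is not the actual LitServe code.

```python
import multiprocessing as mp
from typing import Iterable, Iterator, MutableMapping


def mark_evicted(request_evicted_status: MutableMapping, uid: str) -> None:
    """Server side (hypothetical helper): record that the client for `uid` is gone."""
    request_evicted_status[uid] = True


def run_stream(
    uid: str,
    chunks: Iterable[str],
    request_evicted_status: MutableMapping,
) -> Iterator[str]:
    """Worker side (hypothetical helper): yield chunks until `uid` is flagged."""
    for chunk in chunks:
        # Check the shared dictionary before producing each chunk.
        if uid in request_evicted_status:
            # Clean up the flag and stop working on this request.
            request_evicted_status.pop(uid, None)
            return
        yield chunk


if __name__ == "__main__":
    # A Manager dict can be shared between the server and worker processes.
    with mp.Manager() as manager:
        status = manager.dict()
        stream = run_stream("req-1", ["a", "b", "c", "d"], status)
        produced = [next(stream)]      # the worker emits the first chunk
        mark_evicted(status, "req-1")  # the client disconnects
        produced.extend(stream)        # the worker sees the flag and stops
        print(produced)
```

Every membership check on a Manager dict is a round trip to the manager process, which is the kind of synchronization overhead the reviewers ask to benchmark.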
Approach
The solution involves checking whether the client has disconnected during both predict and stream predict operations.
If a disconnection is detected, the system adds the corresponding request ID to the request_evicted_status multiprocessing dictionary.
This dictionary is then monitored by the running loops (both streaming and non-streaming) in the worker process. If the worker loop detects that the currently running request ID is present in request_evicted_status, it immediately terminates the ongoing task associated with that request.
Potential Impacts
This approach might impact performance due to the additional overhead of monitoring and terminating tasks, which could introduce minor delays in processing.
Benchmarking
TODO
run_single_loop
).
PR review
Anyone in the community is free to review the PR once the tests have passed.
If we didn't discuss your PR in GitHub issues there's a high chance it will not be merged.
Did you have fun?
Make sure you had fun coding 🙃