Around 6/14, latency for sending a request with 1MB payload through a serve DeploymentHandle increased from ~3.4s to ~4.6s.
From bisecting, d729815 seems to be the offending commit.
Versions / Dependencies
n/a
Reproduction script
Run python release/serve_tests/workloads/microbenchmarks.py.
Issue Severity
None
zcin added the bug, P0, serve, and core labels on Jul 3, 2024.
Talked to @edoakes > no leads... @kevin85421 is going to go into Ray Serve source ... this is interrupt important ... and we don't want to revert the DAG PR for this either... so no choice have to investigate.
Found that the slowness is from Router._resolve_deployment_responses which is basically pickle.dump and pickle.load. It's unclear how #45699 affects it since if we just run handle 1mb, it's the same before and after that PR. It's only slower if we run handle noop before it.
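For a rough sense of what a pickle round trip costs on a 1MB payload in isolation, here is a minimal standalone sketch. It is not Ray Serve code and does not reproduce what Router._resolve_deployment_responses does internally; the helper name pickle_round_trip_seconds and the iteration count are made up for illustration, and it uses the in-memory pickle.dumps / pickle.loads variants rather than file-based dump / load.

```python
import pickle
import time


def pickle_round_trip_seconds(payload: bytes, iterations: int = 100) -> float:
    """Return the mean wall-clock seconds for one dumps + loads round trip.

    Hypothetical helper for illustration only; not part of Ray.
    """
    restored = payload
    start = time.perf_counter()
    for _ in range(iterations):
        # Serialize and immediately deserialize the payload.
        restored = pickle.loads(pickle.dumps(payload))
    elapsed = time.perf_counter() - start
    # Sanity check that the round trip preserved the data.
    assert restored == payload
    return elapsed / iterations


if __name__ == "__main__":
    one_mb = b"0" * (1024 * 1024)  # 1MB payload, matching the benchmark size
    mean_s = pickle_round_trip_seconds(one_mb)
    print(f"mean pickle round trip for 1MB: {mean_s * 1e3:.3f} ms")
```

Comparing this number with the per-request latency from the microbenchmark may help judge how much of the overhead comes from serialization itself and how much from the surrounding code path.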
Instead of figuring out why pickle.dump is slower, we decided to remove the call to Router._resolve_deployment_responses altogether, since it turns out to contribute a large portion of the latency.