Add this suggestion to a batch that can be applied as a single commit.
This suggestion is invalid because no changes were made to the code.
Suggestions cannot be applied while the pull request is closed.
Suggestions cannot be applied while viewing a subset of changes.
Only one suggestion per line can be applied in a batch.
Add this suggestion to a batch that can be applied as a single commit.
Applying suggestions on deleted lines is not supported.
You must change the existing code in this line in order to create a valid suggestion.
Outdated suggestions cannot be applied.
This suggestion has been applied or marked resolved.
Suggestions cannot be applied from pending reviews.
Suggestions cannot be applied on multi-line comments.
Suggestions cannot be applied while the pull request is queued to merge.
Suggestion cannot be applied right now. Please check back later.
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[Serve] Avoid looping over all snapshot ids for each long poll request #45881
[Serve] Avoid looping over all snapshot ids for each long poll request #45881
Changes from 3 commits
01c8528
6bd6d6c
5eab50e
b58c40e
7ec2781
4da4794
fd37d93
2cf1ab1
1eb0af7
a7488d3
2011d76
File filter
Filter by extension
Conversations
Jump to
There are no files selected for viewing
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We can avoid raising and catching an error by using a conditional block.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Not a
defaultdict
anymore :(There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This was previously a random number https://github.com/ray-project/ray/pull/45881/files#diff-f138b21f7ddcd7d61c0b2704c8b828b9bbe7eb5021531e2c7fabeb20ec322e1aL195 but I couldn't figure out why - I can of course use a random number here but this seems simpler if that's not necessary.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@shrekris-anyscale do you know why this was a random number / is that necessary?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We set it to a random number to handle an edge case when the controller crashes and recovers. Suppose the controller always starts with ID 0, and it's currently at ID 1. Suppose all the clients are also at ID 1. Now suppose:
ServeController
(which runs theLongPollHost
) crashes and restarts. TheLongPollHost
's snapshot IDs are reset to 0.LongPollHost
on theServeController
. Their snapshot IDs are still 1, so the controller propagates an update, and all the connected clients' snapshot IDs are set back to 0.LongPollHost
usingnotify_changed
. The controller bumps the snapshot ID to 1 and updates the connected clients.To correctly avoid this edge case, we should cache the snapshot IDs and restore them when the controller recovers. That's pretty complex though, so instead we initialize the snapshot IDs to a random number between 0 and 1,000,000. That makes this edge case very unlikely.
Could we switch back to using a default dict with the random number generator as the factory function?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Oh! That totally makes sense, and I will restore that behavior.
Let me play around with this, I might be able to make it work
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Oh I just remembered why I changed it do a plain dict - the problem is that
snapshot_ids
is a defaultdict, butobject_snapshots
isn't, so I was trying to get items from one that didn't exist in the other. Now I'm wondering if those two mappings can be combined, since they have the same keys 🤔