Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Make the server reject misbehaving connections #212

Merged
merged 1 commit into from
Jun 18, 2024

Conversation

lhchavez
Copy link
Contributor

Why

We currently have a setup where the server just trusts whatever the client requests, and stomps over whatever was already there. That's okay in some cases, but not all! For example, if the client's timers are all wonky (like they are when the tab is hidden), we might get in a situation where the client tries to reconnect after its server-side grace period has been exceeded, and they don't agree on what the state should be. Also, if pid2 restarts for whatever reason, both parties should quickly agree that they're not agreeing upfront.

What changed

This change now adds a new (optional) field to the handshake, expectedState. The client should fill this up and explicitly tell the server whether this is a reconnect attempt (currently this is done by checking if the advertisedSessionId is set, but that can be done by explicitly tracking whether the session has seen the server at least once when we move away from this model), and tell the server what its expected next sequence number is (so that the server can also detect drifts in this). If the clients don't fill this information up, the server will behave in a backwards-compatible fashion for safety.

Versioning

  • Breaking protocol change 💆‍♂️ (backwards-compatible)
  • Breaking ts/js API change 💆‍♂️

@lhchavez lhchavez requested a review from a team as a code owner June 18, 2024 17:38
@lhchavez lhchavez requested review from masad-frost and jackyzha0 and removed request for a team and masad-frost June 18, 2024 17:38
@lhchavez lhchavez force-pushed the lh-make-the-server-reject-misbehaving-connections branch from aed012a to 1c24ddb Compare June 18, 2024 17:40
Copy link
Member

@masad-frost masad-frost left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yay for explicitness, much easier to reason about.

Let's update PROTOCOL.md?

We currently have a setup where the server just trusts whatever the
client requests, and stomps over whatever was already there. That's okay
in some cases, but not all! For example, if the client's timers are all
wonky (like they are when the tab is hidden), we might get in a
situation where the client tries to reconnect after its server-side
grace period has been exceeded, and they don't agree on what the state
should be. Also, if pid2 restarts for whatever reason, both parties
should quickly agree that they're not agreeing upfront.

This change now adds a new (optional) field to the handshake,
`expectedState`. The client should fill this up and explicitly tell the
server whether this is a reconnect attempt (currently this is done by
checking if the `advertisedSessionId` is set, but that can be done by
explicitly tracking whether the session has seen the server at least
once when we move away from this model), and tell the server what its
expected next sequence number is (so that the server can also detect
drifts in this). If the clients don't fill this information up, the
server will behave in a backwards-compatible fashion for safety.
@lhchavez lhchavez force-pushed the lh-make-the-server-reject-misbehaving-connections branch from 1c24ddb to 9ef652b Compare June 18, 2024 18:28
@lhchavez lhchavez enabled auto-merge (squash) June 18, 2024 18:28
@lhchavez lhchavez merged commit 9cbf62f into main Jun 18, 2024
4 checks passed
@lhchavez lhchavez deleted the lh-make-the-server-reject-misbehaving-connections branch June 18, 2024 18:28
lhchavez added a commit to replit/river-python that referenced this pull request Jun 18, 2024
Why
===

We have a few cases where the server recreates a connection that was
meant to be a reconnect.

What changed
============

This change brings the Python implementation in line with the TypeScript
one so that the server can proactively tell clients to recreate their
sessions. See replit/river#212 for reference

Test plan
=========

Logs like
https://app.datadoghq.com/logs?query=%40replid%3Ae2698176-9a19-4058-8619-7f5793b02cde&agg_m=count&agg_m_source=base&agg_t=count&cols=host%2Cservice%2C%40river.sessionId&event=AgAAAZAssKXztskj6QAAAAAAAAAYAAAAAEFaQXNzS2NBQUFBR1hKdG9Bd1NLLVFBbwAAACQAAAAAMDE5MDJjYjItYzE2Ny00YmZiLThjMjctNzdkNjI5Y2NkY2Vj&fromUser=true&messageDisplay=inline&refresh_mode=paused&storage=hot&stream_sort=desc&viz=stream&from_ts=1718734624299&to_ts=1718737623927&live=false
should completely go away
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants