Session consistency does not hold across elections #3952

Closed
achamayou opened this issue Jun 17, 2022 · 2 comments · Fixed by #4595

@achamayou
Member

If a user submits a stream of transactions across an election, they may observe an inconsistent session history as one or more transactions are rolled back.

We can mitigate this by closing user TLS sessions when an election is observed.

@eddyashton
Member

Summarising my recent thoughts on this:

Closing all user TLS sessions during an election is a bad experience (dead connection with no explanation), and pessimistic (killing sessions which would not observe any inconsistencies).

I think we can precisely track/detect inconsistencies by recording every TxID that is reported on each session. Then during request execution, we can read the TxID of the previous response, and only proceed if it is still valid. If it has been rolled back, then we know we have lost the ability to present session consistency on this session, and need to close it, but can first return a clear HTTP error for this request to the user.
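To make the per-session tracking idea concrete, here is a minimal sketch in Python. It is not the project's implementation: `Ledger`, `Session`, `handle_request`, the `(view, seqno)` shape of a TxID, and the choice of a 503 status for the error are all assumptions made for illustration. A TxID is treated as still valid if the ledger still holds that seqno in the same view.

```python
from dataclasses import dataclass, field
from typing import List, Optional, Tuple, Union


@dataclass(frozen=True)
class TxID:
    view: int   # hypothetical: term in which the transaction was produced
    seqno: int  # hypothetical: 1-based position in the ledger


@dataclass
class Ledger:
    # views[i] is the view that produced seqno i + 1 (toy model)
    views: List[int] = field(default_factory=list)

    def append(self, view: int) -> TxID:
        self.views.append(view)
        return TxID(view, len(self.views))

    def rollback(self, seqno: int) -> None:
        # Discard every entry after `seqno`, as an election might
        del self.views[seqno:]

    def is_valid(self, txid: TxID) -> bool:
        in_range = 1 <= txid.seqno <= len(self.views)
        return in_range and self.views[txid.seqno - 1] == txid.view


@dataclass
class Session:
    last_txid: Optional[TxID] = None  # TxID reported in the previous response
    closed: bool = False


def handle_request(
    ledger: Ledger, session: Session, view: int
) -> Tuple[int, Union[TxID, str]]:
    if session.closed:
        raise ConnectionError("session already closed")
    prev = session.last_txid
    if prev is not None and not ledger.is_valid(prev):
        # Session consistency is lost: report a clear error, then close
        session.closed = True
        return 503, f"previous transaction {prev.view}.{prev.seqno} was rolled back"
    txid = ledger.append(view)
    session.last_txid = txid  # record what this session has been told
    return 200, txid
```

In this model, only a session whose last reported TxID was actually rolled back gets the error; a session whose history survives the election carries on in the new view.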

We need to work out when it is safe to check the validity of the previous TxID. If we do it too late, where we currently set the new TxID response header, then we've already committed writes from this session that may have relied on (session-implicit) rolled-back state. If we do it too early, it is possible that the previous TxID in question is rolled back between the point we ask and the point this request gets a read version. I think it is safe to do at any point between the transaction getting a read version and being committed, but in practice I think that means after the user-app handler has executed.
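The ordering described above can be sketched as follows, again in Python and again as a toy model rather than the real execution path. `Store`, `execute`, and the tuple form of a TxID are hypothetical names. The point it illustrates is that the handler's writes are only buffered until the previous TxID has been checked, so a failed check discards them instead of committing them.

```python
from typing import Callable, Dict, List, Optional, Tuple

TxID = Tuple[int, int]  # hypothetical (view, seqno) pair
Writes = Dict[str, str]


class Store:
    def __init__(self) -> None:
        # log[i] holds (view, writes) for seqno i + 1
        self.log: List[Tuple[int, Writes]] = []

    def snapshot(self) -> Writes:
        # Replay the log to build the state at the current read version
        state: Writes = {}
        for _, writes in self.log:
            state.update(writes)
        return state

    def is_valid(self, txid: TxID) -> bool:
        view, seqno = txid
        return 1 <= seqno <= len(self.log) and self.log[seqno - 1][0] == view

    def commit(self, view: int, writes: Writes) -> TxID:
        self.log.append((view, dict(writes)))
        return view, len(self.log)

    def rollback(self, seqno: int) -> None:
        del self.log[seqno:]


def execute(
    store: Store,
    view: int,
    prev_txid: Optional[TxID],
    handler: Callable[[Writes], Writes],
) -> Optional[TxID]:
    state = store.snapshot()   # 1. the request gets its read version
    writes = handler(state)    # 2. the handler runs; writes are only buffered
    if prev_txid is not None and not store.is_valid(prev_txid):
        return None            # 3. check failed: buffered writes are dropped
    return store.commit(view, writes)  # 4. commit only after the check passes
```

A caller that receives `None` would return the HTTP error and close the session; nothing from that request reaches the store.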

@eddyashton
Member

Deeper discussion of this in #4401.
