Close connection if server rejects writes while maintaining write connection #1722

vladsud · 2020-04-06T23:54:19Z

+Fix Fluid debugger

…neciton Fix Fluid debugger;

arinwt · 2020-04-07T10:07:53Z

packages/loader/container-loader/src/deltaManager.ts

@@ -812,6 +812,9 @@ export class DeltaManager extends EventEmitter implements IDeltaManager<ISequenc
            if (this.readonlyPermissions) {
                this.close(createWriteError("WriteOnReadOnlyDocument"));
            }
+            if (this.connectionMode === "write") {


Does this mean that we never retry on nack now? If not, then what is the scenario?
Seems like if we do not have write permissions, then readonlyPermissions is true, and we get "WriteOnReadOnlyDocument". If we do have write permissions, then connectionMode is "write", and we get "ServerRejectsWrites".
Is the last case of checking for autoReconnect still relevant then?
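To make the question above concrete, here is a minimal sketch of the decision order as this comment reads it. It is not the actual `DeltaManager` code: `NackContext`, `NackAction`, and `decideOnNack` are hypothetical names, it assumes each branch stops after closing, and the result when `autoReconnect` is off is a placeholder because the diff does not show that path.

```typescript
// Hypothetical sketch of the nack decision order discussed in this thread.
// Field names mirror the diff; everything else is an illustrative assumption.
type ConnectionMode = "read" | "write";

type NackAction =
    | "close:WriteOnReadOnlyDocument"
    | "close:ServerRejectsWrites"
    | "reconnect"
    | "no-reconnect"; // placeholder: behavior with autoReconnect off is not shown in the diff

interface NackContext {
    readonlyPermissions: boolean;
    connectionMode: ConnectionMode;
    autoReconnect: boolean;
}

function decideOnNack(ctx: NackContext): NackAction {
    // No write permissions at all: retrying can never succeed.
    if (ctx.readonlyPermissions) {
        return "close:WriteOnReadOnlyDocument";
    }
    // Already connected in "write" mode and still nacked (the branch this PR adds).
    if (ctx.connectionMode === "write") {
        return "close:ServerRejectsWrites";
    }
    // Remaining case: a "read" connection with write permissions, where
    // reconnecting (in "write" mode) is expected to help.
    return ctx.autoReconnect ? "reconnect" : "no-reconnect";
}

console.log(decideOnNack({ readonlyPermissions: false, connectionMode: "read", autoReconnect: true }));
```

Under these assumptions the `autoReconnect` check is only reached for a "read" connection that does have write permissions, which is the case the question is about.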

So I'm not sure when a nack can happen beyond those two cases: read-only permissions and a view-only connection.
I was thinking of wrapping both `if`s in `if (target === -1)`, but the first case did not have that check, so I'm not sure when target !== -1 can happen. @GaryWilber, @tanviraumi, can you please comment?

I believe checking this.autoReconnect is still relevant: if it's neither read-only permissions nor view-only mode (an optimization on top of r/w permissions), then we can reconnect, and we are getting the nack because someone is generating an op (presumably because of user actions). In that case, if reconnect is off, that very likely points to a bug in the host code that deals with user presence. Bots will not hit it, because they start in "write" mode.

The server can NACK a client if it sends an op whose reference seq# is below the minimum seq#. We should reconnect in that case.


What is the right way to differentiate between those cases? Does testing target tell us anything?
I'd really hate to go with a solution that counts the number of disconnects per period of time as the only signal that something went wrong.

Are we trying to prevent reconnection attempts when a "read-only permission" user sends an op? So what happens when a user switches from read to write? We have never formalized nacks very well, but in r11s, alfred sends you a nack when you can't write. That means either you don't have read permission or you joined as read-only. In both cases, the value is -1 (https://github.com/microsoft/FluidFramework/blob/master/server/routerlicious/packages/lambdas/src/utils/messageGenerator.ts#L15).

Deli nacks you on different occasions, and on all of those occasions the client should always reconnect. The value in all those cases is the current min sequence number (https://github.com/microsoft/FluidFramework/blob/master/server/routerlicious/packages/lambdas/src/deli/lambda.ts#L641). The cases are:

1. There is a gap, so kafka might have missed something. The client should reconnect to resend the op.
2. refseq < minseq.
3. Stale client.
4. Tried to summarize without a summary scope. This is more future-proofing, and r11s/Push always provides a summary claim with the write claim. Maybe we should not reconnect in this case.

I am pretty sure Deli behaves similarly in Push and r11s. Not sure about the alfred part of Push but @GaryWilber will know.
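The distinction described above (alfred nacks with -1, deli nacks with the current min sequence number) could be sketched as below. This is an illustration under the assumptions stated in these comments, not an agreed protocol: `classifyNack`, `NackClassification`, and `ALFRED_WRITE_REJECTED` are hypothetical names, and the thread leaves open whether the value alone is a reliable signal.

```typescript
// Hypothetical sketch: classify a nack by its sequence number value,
// based only on the behavior described in this thread.
const ALFRED_WRITE_REJECTED = -1; // per the comment: alfred uses -1 when the client cannot write

interface NackClassification {
    source: "alfred" | "deli";
    shouldReconnect: boolean;
}

function classifyNack(nackSequenceNumber: number): NackClassification {
    if (nackSequenceNumber === ALFRED_WRITE_REJECTED) {
        // The client cannot write at all, or joined as read-only.
        return { source: "alfred", shouldReconnect: false };
    }
    // Assumption: any other value is deli's current min sequence number
    // (gap, refseq < minseq, stale client), where reconnecting is expected.
    // The "summarize without a summary scope" case may be an exception and
    // cannot be told apart by this value alone.
    return { source: "deli", shouldReconnect: true };
}

console.log(classifyNack(-1), classifyNack(120));
```

A "read" connection with write permissions that is nacked with -1 would still need to reconnect in "write" mode, so a real client could not rely on this value without also looking at its own connection state.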

We might want to add tenantId-documentId here just to be safe...

Yea it's sent to a specific room, but the above image is what's received on the websocket. Unfortunately the client doesn't know what room that was sent to.
I believe rooms are purely a server-side concept to aid in sending messages to specific clients.

Gary and I talked. If we need an urgent fix, you can check for -2 and Gary can start sending them from Push. We will work on a better way to handle all cases in the meantime.

We should go with the right fix. I can wait or make another temporary fix if needed (track the rate of disconnects). Can you please share bug / issue links to track?

vladsud · 2020-04-13T22:43:40Z

Forking the Fluid debugger fix into #1773.
Closing this PR, as an issue has been opened to give the client a proper signal for differentiating nack errors.

Close conneciton if server rejects writes while maintaining write con…

15928d4

…neciton Fix Fluid debugger;

vladsud requested review from anthony-murphy, markfields and jatgarg April 6, 2020 23:54

spaces

1c58e45

arinwt reviewed Apr 7, 2020


arinwt mentioned this pull request Apr 7, 2020

Summarize one more time before closing #643

Closed

vladsud closed this Apr 13, 2020

vladsud deleted the ConnectionBailSooner branch April 24, 2020 06:57

vladsud commented Apr 6, 2020

arinwt Apr 7, 2020

vladsud Apr 7, 2020

anthony-murphy Apr 8, 2020

vladsud Apr 9, 2020

tanviraumi Apr 9, 2020

tanviraumi Apr 9, 2020

GaryWilber Apr 9, 2020 (edited)

tanviraumi Apr 9, 2020

vladsud Apr 9, 2020

tanviraumi Apr 9, 2020

vladsud commented Apr 13, 2020
