-
Notifications
You must be signed in to change notification settings - Fork 26
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Fix: connection resume does not fire channel detached #539
Conversation
@mattheworiordan PTAL and if it's good then I fix the client. |
guard let stateChange = stateChange else { | ||
fail("ConnectionStateChange is nil"); done(); return | ||
} | ||
expect(client.connection.id).to(equal(disconnectedMessage.connectionId)) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why would this be equal to the fabricated message that was received? Realtime will assign a new connectionId
as that ID is now invalid
|
||
let disconnectedMessage = ARTProtocolMessage() | ||
disconnectedMessage.action = .Disconnected | ||
disconnectedMessage.connectionId = "xxxxxxxxx" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is a bit unusual as a connectionId
is never sent with Disconnected, only Connected
. Could you not instead simple do one or both of the following:
- Send a new
CONNECTED
message with a newconnectionId
whilst the connection isCONNECTED
. This should trigger channels to detach anyway, but that behaviour is only really expected in the 0.9 spec. - Manually terminate the connection, change the
connectionKey
in the object programatically so that it can no longer resume, check that theCONNECTED
message received when resuming has a differentconnectionId
to the original, then check that the channels have becomeDETACHED
with a suitable error message as well.
defer { client.dispose(); client.close() } | ||
let channel = client.channels.get("test") | ||
|
||
expect(client.connection.state).toEventually(equal(ARTRealtimeConnectionState.Connected), timeout: testTimeout) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is not correct, this is not the behaviour we'd expect as a channel ATTACH that does not become ATTACHED would be resent or timeout.
The process I said we needed to test is:
- Connect
- Attach a channel (i.e wait for it to be attached)
- Token expires and connection is disconnected by Ably
- Resume happens automatically, wait until connected
- Publish a message over REST on the channel
- Ensure the message is received on the channel and the channel is still attached
@@ -1949,7 +1949,7 @@ class RealtimeClientConnection: QuickSpec { | |||
context("System's response to a resume request") { | |||
|
|||
// RTN15c1 | |||
pending("CONNECTED ProtocolMessage with the same connectionId as the current client, and no error") { | |||
it("CONNECTED ProtocolMessage with the same connectionId as the current client, and no error") { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can you confirm which other tests are marked as pending
still? I was not aware we had pending tests.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@mattheworiordan Remaining pending tests:
Pending: when publishing a Message with an explicit clientId that is incompatible with the identified client’s clientId
Pending: should be used a PresenceMap to maintain a list of members
Pending: should result in an error if the channel is in the FAILED state
Pending: should result in an error if the channel moves to the FAILED state
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@ricardopereira please add an issue for those pending tests as we should look at them ASAP in case there are other spec requirements that the lib does not meet
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done #541.
} | ||
} | ||
|
||
expect(channel.state).to(equal(ARTRealtimeChannelState.Detached)) | ||
expect(client.connection.id).toNot(equal(initialConnectionId)) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
From the logs, when the token expires the connectionId
always changes, so I decided to add that and additionally it checks if the channel gets detached.
@mattheworiordan PTAL at the last commit. |
The problems I noticed while debugging:
|
0c627a1
to
61c4b8b
Compare
@ricardopereira can you address |
it("should resume the connection after an auth renewal and detach all channels if the connectionId changes") { | ||
let options = AblyTests.commonAppSetup() | ||
options.disconnectedRetryTimeout = 1.0 | ||
options.tokenDetails = getTestTokenDetails(ttl: 2.0) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This looks a bit brittle, 2s to connect & attach in CI may fail, can we not just give it a bit of buffer, say 6 seconds to make sure we don't get test failures because of delays in running code?
// Wait for token to expire | ||
waitUntil(timeout: testTimeout) { done in | ||
let partialDone = AblyTests.splitDone(2, done: done) | ||
channel.once(.Detached) { error in |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why is the channel becoming detached? If a connection is closed because of an auth failure, the channel does not become detached, it remains attached until the connection is closed or moves to the suspended state?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sorry, I answered in the other comment #539 (comment)
fail("ConnectionID is nil"); return | ||
} | ||
expect(client.connection.id).to(equal(receivedConnectionId)) | ||
expect(client.connection.id).toNot(equal(initialConnectionId)) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't follow why the connectionId has changed? Why? In a normal operation it won't, and as far as I can tell, you are not doing anything that would force it to change?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@mattheworiordan the connectionId
changes and that's why the channel gets detached.
First CONNECTED:
2016-11-23 13:16:04.508 xctest[32837:4092858] VERBOSE: RS:0x7f955c623000 ARTJsonLikeEncoder<msgpack>: protocolMessageFromDictionary {
action = 4;
connectionDetails = {
connectionKey = "e91RTOWlwA0vIq!tCwrawkZGCTS118T-ca6e91RTOWlwA0vIq";
connectionStateTtl = 120000;
maxFrameSize = 262144;
maxIdleInterval = 15000;
maxInboundRate = 250;
maxMessageSize = 16384;
serverId = "frontend.64d2.1.eu-central-1-A.i-d35f816e";
};
connectionId = tCwrawkZGC;
connectionKey = "e91RTOWlwA0vIq!tCwrawkZGCTS118T-ca6e91RTOWlwA0vIq";
connectionSerial = "-1";
}
It gets the DISCONNECTED from Ably:
2016-11-23 13:16:05.998 xctest[32837:4092858] VERBOSE: RS:0x7f955c623000 ARTJsonLikeEncoder<msgpack>: protocolMessageFromDictionary {
action = 6;
error = {
code = 40142;
message = "Key/token status changed (expire)";
statusCode = 401;
};
}
The second CONNECTED:
2016-11-23 13:16:06.614 xctest[32837:4092858] VERBOSE: RS:0x7f955c623000 ARTJsonLikeEncoder<msgpack>: protocolMessageFromDictionary {
action = 4;
connectionDetails = {
connectionKey = "e91NTgaIgA0vIq!guMApg59VswaKAc4-d86e91NTgaIgA0vIq";
connectionStateTtl = 120000;
maxFrameSize = 262144;
maxIdleInterval = 15000;
maxInboundRate = 250;
maxMessageSize = 16384;
serverId = "frontend.25af.1.eu-central-1-A.i-305dca8c";
};
connectionId = guMApg59Vs;
connectionKey = "e91NTgaIgA0vIq!guMApg59VswaKAc4-d86e91NTgaIgA0vIq";
connectionSerial = "-1";
}
Full log: https://gist.github.com/ricardopereira/15a80610b22c9216d5587f2375359921
This is not correct, please review. Also, can you confirm that we are testing:
|
@mattheworiordan Sorry, can you point me to a log entry where it is missing the URL ouput? UPDATE: Ok, I think HTTP requests are not logging everthing. |
The log I sent you had no WebSocket URLs? |
@mattheworiordan i.e. line 467 |
Ah, my bad, mind including the host in there too though? I was scanning the logs for the host, my mistake, but I am sure it will happen to others. |
7c169fa
to
66355dc
Compare
expect(channel.errorReason!.code).to(equal(80008)) | ||
expect(channel.errorReason!.message).to(contain("Unable to recover connection")) | ||
guard let channelError = channel.errorReason else { | ||
fail("Channel error is nil"); return |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Btw. Don't change this, but in future we should have an error message that explains the problem rather than the boolean logic i.e. anyone can see from this code that channel error is nil, but what the reader cares about is why nil
is a problem i.e. this should instead be fail("Connection resume failed and error should be propagated to the cannel")
. This helps the reader of the tests in future understand why this is a problem as opposed to just seeing that it is expected to be nil
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks good, 👍
Can you confirm when we're ready for the user with the issue to test this now?
@mattheworiordan Yes, sure. If it passes then it's good to go. If not then I'll see what's wrong. |
faadb45
to
286b516
Compare
#538