Exhaustive state conditions for all realtime operations. #112

tcard · 2016-04-25T17:40:28Z

When calling Connection.close() before reaching the CONNECTED state,
the iOS library was trying to send a WebSocket message before the
handshake was even finished. That caused it to crash sometimes.

That situation wasn't covered by the spec. I've gone through it,
filling in a table and filling the gaps, so that the spec is exhaustive
about all possible combinations of state and operation.

(So, yes, zwopple/PocketSocket#48 wasn't fixing
the right problem.)

tcard · 2016-04-25T17:42:08Z

This is what it looks like, which may not be obvious from the source:

mattheworiordan · 2016-04-25T18:44:04Z

Thanks @tcard, this is really good. I will double-check all the links soon, then merge and update the unified spec sheet.

* Add EventEmitter.timed method.
* Use EventEmitter instead of manual timers for Realtime timeouts. This should simplify the logic there.
* Improve error reporting of Obj-C enum in tests.
* Conform to spec changes at ably/docs#112.
* Remove outdated test.
* Replace hardcoded realtime request timeout.

mattheworiordan · 2016-04-27T23:33:04Z

content/client-lib-development-guide/features.textile

@@ -301,12 +302,16 @@ h3(#realtime-connection). Connection
 ** @(RTN11a)@ Explicitly connects to the Ably service if not already connected
 ** @(RTN11b)@ An error will be indicated if the state is @CLOSING@ as the connection must complete the close request before reconnecting
 * @(RTN12)@ @Connection#close@ function:
-** @(RTN12a)@ Sends a @CLOSE@ @ProtocolMessage@ to the server, sets the state to @CLOSING@ and waits for a @CLOSED@ @ProtocolMessage@ to be received
+** @(RTN12f)@ If the connection state is @CONNECTING@, do the operation once the connection state is @CONNECTED@


Why? Why not simply close the connection and move it to CLOSED? I propose: "If the connection state is @CONNECTING@, close all transports and immediately transition the state to @CLOSED@".

paddybyers · 2016-04-28

If the client simply transitions to CLOSED, then the realtime system will never get a close event, and the connection state will be retained for 2 minutes. It is preferable that, once a connection has been initiated, it is allowed to connect and is then closed.

It would be acceptable if there was a shorter timeout waiting for the CONNECTED state (eg the client waits for 5s) because we would still catch the majority of cases. However, I'm not sure it is worth the complexity of that special case. So if CONNECTING, wait for either CONNECTED (then do an explicit CLOSE), or any other state, in which case the client transitions to CLOSED without further interaction.

mattheworiordan · 2016-04-27T23:57:00Z

I have added a few comments, great bit of work as these were definitely areas not covered previously.

@paddybyers I would suggest you take a look to be sure.
@SimonWoolf if you can cast your eye too to spot any issues that would be good

SimonWoolf · 2016-04-28T16:20:30Z

I see the point, but the result is still quite weird. Presumably, a lot of the time if someone's trying to close while the client is CONNECTING, it's because the client has network issues and the client never had a chance of getting through to realtime at all. In which case the only effect of this would be to make close() appear to have no effect for the 10s it takes for realtimeRequestTimeout to expire, which is a bit rubbish.

(Or if the client's network connection fails after doing a /connect but before it receives the CONNECTED - in which case waiting for the realtimeRequestTimeout to expire still doesn't help anything, as realtime keeping the connection state for 2 minutes is inevitable).

So while this does improve behaviour in the case of a slow-but-reliable network (ie where there's a larger chance of the close() happening after the connect request but before the CONNECTED), it's at the expense of making it seem unresponsive when the network is failing/unreliable. :/

(Also, even if we do go with waiting-for-connected, we need wording for the case where the first state change from CONNECTING is not CONNECTED. Eg: on the first state change from @CONNECTING@: if @CONNECTED@ do the operation per RTN12a; else if @DISCONNECTED@ or @SUSPENDED@ set the state to @CLOSED@; else if @FAILED@ do nothing.
Or more elegantly, maybe: If the connection state is @CONNECTING@, wait for the first state change, then follow the corresponding RTN12 spec for the new state.)
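The per-state wording suggested above can be read as a small decision function. The following is only a sketch of that reading, not library code: the `State` enum and `resolve_deferred_close` are hypothetical names invented for illustration.

```python
from enum import Enum
from typing import Tuple


class State(Enum):
    CONNECTING = "connecting"
    CONNECTED = "connected"
    DISCONNECTED = "disconnected"
    SUSPENDED = "suspended"
    CLOSING = "closing"
    CLOSED = "closed"
    FAILED = "failed"


def resolve_deferred_close(new_state: State) -> Tuple[State, bool]:
    """For a close() deferred while CONNECTING, return the resulting state
    and whether a CLOSE ProtocolMessage is sent (hypothetical helper)."""
    if new_state is State.CONNECTED:
        # Per RTN12a: send CLOSE, move to CLOSING, await the CLOSED reply.
        return State.CLOSING, True
    if new_state in (State.DISCONNECTED, State.SUSPENDED):
        # Nothing to send CLOSE on, so close locally.
        return State.CLOSED, False
    if new_state is State.FAILED:
        # Do nothing: the connection stays FAILED.
        return State.FAILED, False
    raise ValueError(f"unexpected first state change from CONNECTING: {new_state}")
```

Returning the pair keeps the sketch free of transports and timers, so the three branches can be checked in isolation.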

paddybyers · 2016-04-28

> So while this does improve behaviour in the case of a slow-but-reliable network (ie where there's a larger chance of the close() happening after the connect request but before the CONNECTED), it's at the expense of making it seem unresponsive when the network is failing/unreliable. :/

Well, yes, but if they don't care about closing connections cleanly then they would never need to call close(); they just exit. So if they are calling close() then they are making a good faith attempt to release the resources associated with the connection, so why not do that?

At the expense of further complexity we might say that the wait only occurs if it is a resume and not a new connection?

I did also suggest the possibility of a shorter timeout as another way to catch the majority case.

mattheworiordan · 2016-04-28

I don't think it's complex to state that calling #close, if in the connecting state, has a timer for 5s then forcibly moves to the closed state. However, would this then apply in the disconnected state as well?

paddybyers · 2016-04-28

> I don't think it's complex to state that calling #close, if in the connecting state, has a timer for 5s then forcibly moves to the closed state. However, would this then apply in the disconnected state as well?

The timeout for close() could be 5s unconditionally.

tcard · 2016-04-28

I think we should keep CONNECTING and CONNECTED as semantically close as possible. It simplifies everything. Assuming the caller means something different by calling close while CONNECTING than while CONNECTED is reading too much into it, I think. We provide all this queuing before reaching CONNECTED so that operations (publishes, presence actions, etc.) called while INITIALIZED/CONNECTING transparently happen once CONNECTED, and I don't think we should depart from that behavior here.

We should tell the developer that, if she's having connection problems and wants to handle it herself instead of letting Ably's library do its reconnection/recovering thing, just call close() to clean up on Ably's side if possible, forget about that instance and make a new one.
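The queuing argument above can be illustrated with a toy model in which anything issued while INITIALIZED or CONNECTING is held and then flushed, in order, once CONNECTED. This is a sketch under assumptions: `QueueingConnection`, `perform` and `on_connected` are hypothetical names, and operations are plain strings rather than protocol messages.

```python
from collections import deque


class QueueingConnection:
    """Toy model: operations issued before CONNECTED are queued, then flushed."""

    def __init__(self):
        self.state = "initialized"
        self.pending = deque()  # operations waiting for CONNECTED
        self.sent = []          # operations handed to the transport, in order

    def perform(self, op):
        if self.state == "connected":
            self.sent.append(op)
        elif self.state in ("initialized", "connecting"):
            # Not connected yet: remember the operation instead of failing.
            self.pending.append(op)
        else:
            raise RuntimeError(f"cannot perform {op!r} while {self.state}")

    def on_connected(self):
        self.state = "connected"
        # Flush in the order the caller issued the operations.
        while self.pending:
            self.sent.append(self.pending.popleft())


conn = QueueingConnection()
conn.state = "connecting"
conn.perform("publish")
conn.perform("close")   # treated like any other queued operation
conn.on_connected()
print(conn.sent)        # ['publish', 'close']
```

Under this view close() needs no special case: it waits its turn behind earlier operations and takes effect once the connection is up.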

SimonWoolf · 2016-04-28

I'd be against adding another timer (we've had bugs in ably-js before from various timers interacting in unforeseen ways, and annoyingness-to-debug seems to be exponential with the number of timers going on at once. Also, 5s isn't that much less than 10s, which is the current realtimeRequestTimeout).

So, OK, one more suggestion: how about we follow @tcard's proposal, with the small modification that calling close() while CONNECTING will immediately put you in the CLOSING state (but will not touch the transports). Then, if a transport connects within the realtimeRequestTimeout, send a CLOSE on it, and await the CLOSED reply (and do not change state). If not, then go into the CLOSED state upon expiration of that timeout.

That allows realtime to release resources, but avoids the weirdness of staying in a CONNECTING state for a while after someone calls close(), and avoids having to go from CONNECTING to CLOSING via an instant of CONNECTED. And from the user's perspective, it makes CONNECTING behave the same as CONNECTED as calling close on each will immediately put you in CLOSING.
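A minimal event-driven sketch of this proposal follows, assuming the timeout and transport events are delivered as method calls rather than by real timers or sockets. `CloseModel` and its method names are hypothetical; treating a missing CLOSED reply the same way as a transport that never connected is also an assumption, since the comment above only describes the latter.

```python
class CloseModel:
    """Toy model of close() called while CONNECTING; not a real client."""

    def __init__(self):
        self.state = "connecting"
        self.sent = []  # protocol messages written to the transport

    def close(self):
        if self.state == "connecting":
            # Move to CLOSING at once, but do not touch the pending transport.
            self.state = "closing"
        elif self.state == "connected":
            self.sent.append("CLOSE")
            self.state = "closing"

    def on_transport_connected(self):
        if self.state == "closing":
            # The transport came up in time: ask the server to release
            # its resources, and stay in CLOSING while awaiting CLOSED.
            self.sent.append("CLOSE")
        elif self.state == "connecting":
            self.state = "connected"

    def on_closed_message(self):
        if self.state == "closing":
            self.state = "closed"

    def on_request_timeout(self):
        # realtimeRequestTimeout expired before the close completed.
        if self.state == "closing":
            self.state = "closed"


clean = CloseModel()
clean.close()                    # state: closing, nothing sent yet
clean.on_transport_connected()   # CLOSE goes out on the new transport
clean.on_closed_message()        # state: closed

stuck = CloseModel()
stuck.close()
stuck.on_request_timeout()       # no transport ever connected; state: closed
```

In both paths the caller sees CLOSING immediately after close(), which is the user-visible behaviour the proposal is after.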

tcard · 2016-05-02

I don't really see that much of a problem with what you call "weirdness"; to me, it makes for a simpler state machine, which is easier to reason about. But I'm OK with what you say if you think that's more sensible.

mattheworiordan · 2016-05-04

👍 for @SimonWoolf's proposal

tcard · 2016-05-11

Done at 4c177e8. (Sorry for the delay, I was under the impression I had already done this.)

tcard · 2016-05-04T08:20:32Z

Should we wrap this up? What do @mattheworiordan and @paddybyers think of @SimonWoolf's proposal?

tcard mentioned this pull request Apr 26, 2016

RTN17b ably/ably-cocoa#385

Merged

tcard added a commit to ably/ably-cocoa that referenced this pull request Apr 27, 2016

Conform to spec changes at ably/docs#112.

00a62ec

tcard mentioned this pull request Apr 27, 2016

Conform to spec changes at ably/docs#112. ably/ably-cocoa#441

Merged

mattheworiordan reviewed Apr 27, 2016
tcard force-pushed the exhaustive-states branch from 5a5f1dd to 0675514 Compare April 28, 2016 11:34

tcard added a commit that referenced this pull request Apr 28, 2016

https://github.com/ably/docs/pull/112#discussion_r61354799

aa9675e

SimonWoolf reviewed Apr 28, 2016
This was referenced Apr 28, 2016

RTN13b ably/ably-cocoa#436

Merged

RTN11b ably/ably-cocoa#419

Merged

tcard mentioned this pull request May 4, 2016

RTL6c ably/ably-cocoa#406

Merged

tcard and others added 24 commits June 16, 2016 10:50

Channel.attach when connection is INITIALIZED: queue operation.

51f98a0

Actually, have INITIALIZED to always queue instead of failing.

dc9f805

Channel.detach: when connection is CLOSED, channel is DETACHED.

275d1e8

detach: when connection is SUSPENDED, channel is DETACHED.

926039f

Unify RTN12d and RTN12e.

57c7f70

Refine RTL6c2 and RTP16b.

2ae305d

Attach when INITIALIZED does RTL4c.

10efbda

https://github.com/ably/docs/pull/112#discussion_r61354799

e8c490f

Change connect behavior when DISCONNECTED, SUSPENDED and FAILED.

62c43fd

RTL4e was duplicated.

457da6b

RTN11b: don't reuse the transport.

e7b3db0

Fix tables formatting and wording.

33b163f

Fix typo.

17f6407

Publish on INITIALIZED channel should queue.

8986e48

RTN12f: move immediately to CLOSING.

5d5bf7c

Incorporating @SimonWoolf's feedback from #112 (diff).

spec: Consistent use of @null@

5b417c7

spec: Allow a FAILED channel to be reattached

80c669d

spec: Few minor additions to state change effects on channel operations

150d28b

Exhaustive conditions for attach failures.

8accc74

Implicit attach only when INITIALIZED.

8620a08

spec: Simplify channel state rules with pending state vocab

768ebb3

spec: Return to previous channel state on failure

c83dc0e

Sync state ops table with spec.

74fff20

mattheworiordan force-pushed the exhaustive-states branch from aea51a4 to 74fff20 Compare June 16, 2016 09:50

mattheworiordan merged commit 74fff20 into source Jun 16, 2016

mattheworiordan deleted the exhaustive-states branch June 16, 2016 09:50

QuintinWillison mentioned this pull request May 12, 2021

RTN11c: Explicit connect should keep trying to recover connection state #1090

Merged

QuintinWillison pushed a commit to ably/specification that referenced this pull request Sep 20, 2022

https://github.com/ably/docs/pull/112#discussion_r61354799

976b524

QuintinWillison pushed a commit to ably/specification that referenced this pull request Sep 20, 2022

RTN12f: move immediately to CLOSING.

97c8ac8

Incorporating @SimonWoolf's feedback from ably/docs#112 (diff).
