-
Notifications
You must be signed in to change notification settings - Fork 341
Fix for start messages being sent before connection ack (#766) #767
Conversation
@calvernaz: Thank you for submitting a pull request! Before we can merge it, you'll need to sign the Apollo Contributor License Agreement here: https://contribute.apollographql.com/ |
The build seems to be failing from unrelated causes and previous builds had the same issues. |
We had some issues with subscriptions being "lost" sometimes. While we had a hard time reproducing the issue, this change seems to have resolved it for us. Would be great to have this merged as we're currently rolling our own package version based on a fork. |
@calvernaz can this change be dealt with? |
Hi @Gilwe, I'm not sure what you mean by that, but this has been up for a while and haven't seen any signs of movement. I keep a private fork of this, just because of that. I'd be more than happy to do anything to make this upstream, but at this stage not sure when that's gonna happen. |
Hi, is any chance this will be merged (released) soon? It also fixes #339 which is more than two years old. |
ditto on this... i think we see the same behavior |
I am currently implementing subscriptions with AWS Lambda, DynamoDB and ApiGatewayV2. Because of this bug I am getting a "race condition" where the start request tries to execute a dynamodb read to get context before the connection_init has finished and put the context into dynamodb. I do this put/get because lambdas are stateless/cannot trust its state.This causes the start to fail about 50% of the times. I am using Graphql Playground to test with. |
@@ -593,6 +591,11 @@ export class SubscriptionClient { | |||
throw new Error(`Message must be JSON-parseable. Got: ${receivedData}`); | |||
} | |||
|
|||
if (MessageTypes.GQL_CONNECTION_ACK === parsedMessage.type && !this.operations[opId]) { | |||
this.flushUnsentMessagesQueue(); | |||
return; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Returning at this point will prevent client from emitting connected/reconnected events, calling connectionCallback and do other stuff as per switch/case below on GQL_CONNECTION_ACK message.
My suggestion is to move this.flushUnsentMessagesQueue();
in some proper location (most probably, after connectionCallback() call) within switch/case below.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Just curious did you had a situation where this was a problem?
I ask because this was a fix for me and to others that clearly had experience some out of order events, but at the moment I don't have a scenario to test that and my memory is a bit foggy in the details.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Just curious did you had a situation where this was a problem?
I stumble upon this issue recently and tried your PR patch, unfortunately it did not do the trick for me. Messages were still being sent before acknoledgement So I came up with my own fix this very morning.
You can check it out here springboardVR@4fc02dc#diff-25d66d74617fe2e23d7946bd6e3ba95640ab1b9bc8947445d604fc271c7c1f12
My intention was to submit another for this
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I see, if I understand it correctly you are queuing the start message in the bootstrap process and only flush it after receiving the ACK from the server, basically postponing the start message. It makes sense for me 👍
In my case for some reason (maybe a consequence of some lazy initialization) flush whatever was queued on the ACK and returning was sufficient. I like yours better.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actually, reading through more and more issues and PR, it seems to me that issue #766 is a duplicate of #339, consequently, this PR ( #767 ) and #599 are alternative solution proposals for the same problem.
IMO #599 is better elaborated and more promising (for example, it adds test cases to check the proposed solution, as well all tests pass after this patch; although, there are some issues with tests, but basically due to my environment differences). However, I have had no chance yet to test this patch thoroughly in our project.
Although it is quite an old PR and at the time of writing this comment, it is 143 commits behind apollographql:master, it was quite easy to rebase it on master (with just one trivial conflict to resolve).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is a nice fix - tried it out and it works as expected 👍
Would be good to see this merged in 🙏 and have #339 finally closed
Hey... what's the status of this now? |
No description provided.