Recreate CXProvider if a call cannot be hung up #1420
Merged
Resolves element-hq/element-ios#5189
Ways to reproduce
As mentioned in the ticket, this issue can be reproduced more directly by:
=> The app hung up the call, but the system still shows it as ongoing.
I must have repeated this over 200 times, hitting the issue about 10% of the time without doing anything obviously different.
Root cause
It does not help that the Matrix SDK ignores all errors when requesting CallKit transactions. Once these errors are logged, two in particular show up at different times when hanging up the call: CXErrorCodeRequestTransactionErrorUnknownCallProvider and CXErrorCodeRequestTransactionErrorUnknownCallUUID. Every time I was able to reproduce the issue with a "stuck" call, one of these errors was reported.
Through extensive logging I was able to rule out a UUID issue on our side, so this must be some internal problem of CallKit that we cannot see into. Similarly, "unknown provider" suggests that our CXProvider was not correctly set up, but this is also not the case.
What I can conclude from this is that some kind of race condition happens between the different requestTransaction calls, which complete asynchronously, and this leaves the call in an inconsistent state (e.g. trying to end the call before the CXAnswerCallAction has completed, even though the call is ongoing).
Solution
The subtitle is a misnomer because I did not find an actual solution to the assumed race condition. I have, however, added logs for all the errors, and when the CXEndCallAction that performs the hangup fails, we now call resetProvider manually, which, if nothing else, does shut down the hanging call (verified experimentally). Whilst this does not address the root cause, it resolves the symptom and makes sure the call is ended.
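The control flow described above can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the code from this PR: CallSystem and HangupHandler are hypothetical names, and CallKit is hidden behind a protocol so the logic can be read and run in isolation. In the app, requestEndCall would wrap a CXCallController transaction request carrying a CXEndCallAction, and resetProvider would invalidate the current CXProvider and create a new one.

```swift
import Foundation

// Hypothetical abstraction over CallKit (not part of the SDK).
// requestEndCall stands in for requesting a CXEndCallAction transaction;
// resetProvider stands in for recreating the CXProvider.
protocol CallSystem {
    func requestEndCall(uuid: UUID, completion: @escaping (Error?) -> Void)
    func resetProvider()
}

final class HangupHandler {
    private let system: CallSystem
    // Collected here only to keep the sketch self-contained; a real app would use its logger.
    private(set) var loggedErrors: [String] = []

    init(system: CallSystem) {
        self.system = system
    }

    func endCall(uuid: UUID) {
        system.requestEndCall(uuid: uuid) { [weak self] error in
            // Nothing to do when the transaction succeeded.
            guard let self = self, let error = error else { return }

            // Log the error instead of silently ignoring it.
            self.loggedErrors.append("End call transaction failed: \(error)")

            // Fallback: the system may still show the call as ongoing,
            // so reset the provider to force the hanging call to end.
            self.system.resetProvider()
        }
    }
}
```

The reset is deliberately limited to the end-call path: errors from other transactions are only logged, since recreating the provider is a blunt instrument that should run only when a call would otherwise stay stuck.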