Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix slower active users not included in user list returned by signaling #1534

Conversation

danxuliu
Copy link
Member

@danxuliu danxuliu commented Feb 13, 2019

Follow up to #1522

When a user requests the signaling messages from the internal signaling server first the last ping of the user is updated. Then, after waiting for at most 30 seconds, the list of users active in the room is returned.

That list was based on the users whose last ping happened around 30 seconds ago or less (it could be a bit longer than 30 seconds, but the described problem remains in that case too); if other user pulled the messages slightly before the current user and that other user did not pull the messages again (or the chat messages, as that updates the last ping too) before the user list was returned that other user was not included in the list, as her last ping happened more than 30 seconds ago.

Now the elapsed time since the last ping for users returned in the list is longer than the timeout used for pulling messages (and chat messages) to ensure (up to a point) that active users will be included in the list even if it took a bit longer for them to pull messages again.

Unfortunately I have not found any easy and consistent way to reproduce the bug, so I can not provide a step by step testing scenario, except for "leave a call between two peers running for a lot of time until the bug happens" :-(

When a user requests the signaling messages from the internal signaling
server first the last ping of the user is updated. Then, after waiting
for at most 30 seconds, the list of users active in the room is
returned.

That list was based on the users whose last ping happened around 30
seconds ago or less (it could be a bit longer than 30 seconds, but the
described problem remains in that case too); if other user pulled the
messages slightly before the current user and that other user did not
pull the messages again (or the chat messages, as that updates the last
ping too) before the user list was returned that other user was not
included in the list, as her last ping happened more than 30 seconds
ago.

Now the elapsed time since the last ping for users returned in the list
is longer than the timeout used for pulling messages (and chat messages)
to ensure (up to a point) that active users will be included in the list
even if it took a bit longer for them to pull messages again.

The drawback of this approach is that the internal signaling server will
now need a few more seconds to notice when a user left a call abruptly,
but before it was not immediate anyway and it should not be a common
scenario either.

Finally, note that it is unlikely that more than 40 seconds pass between
the ping is updated for the current user and the user list is returned,
but the condition to handle that case gracefully was kept to be on the
safe side.

Signed-off-by: Daniel Calviño Sánchez <danxuliu@gmail.com>
@danxuliu
Copy link
Member Author

/backport to stable15

@nickvergessen nickvergessen merged commit 13433b1 into master Feb 13, 2019
@nickvergessen nickvergessen deleted the fix-slower-active-peers-not-included-in-user-list-returned-by-signaling branch February 13, 2019 14:43
@backportbot-nextcloud
Copy link

backport to stable15 in #1543

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
3. to review bug feature: signaling 📶 Internal and external signaling backends
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants