WebRTC Publish connection/reconnection bug possibility #4071

Closed
SelimEmre opened this issue Mar 21, 2022 · 14 comments

@SelimEmre
Contributor

Short description

It seems that the stream gets stuck in the broadcasting status.

Steps to reproduce

Start a normal streaming procedure from the Android app to Ant Media Server (receiving "publish_started" is the final notification, which means it's working).
On the Android device, turn Wi-Fi off, wait ~15s, and then turn Wi-Fi on.
After Wi-Fi is back, the "stream refresh" mechanism we've implemented kicks in. It should make the reconnect faster (since reconnecting usually takes from 30s to 3min, depending on the network).

"stream refresh": if any disruptive ant media error (e.g. "streamIdInUse", "publishTimeoutError", "already_publishing" etc.) is received, OR if ICE connection state is changed to "DISCONNECTED"/"FAILED", we react programmatically by:

  • sending "stop" message and stopping the signaling client
  • waiting 10-30s (depending on the error)
  • sending "publish" message again (to restart the streaming process, for the same stream id)
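The "stream refresh" steps above can be sketched roughly as follows. This is a minimal TypeScript sketch, not the app's actual code: the error names and ICE states come from the description, while the type names, the step names, and the concrete wait per trigger are assumptions made for illustration (the post only gives a 10-30s range).

```typescript
// Sketch of the "stream refresh" decision. Error names and ICE states come from
// the description above; the wait time chosen per trigger is hypothetical.

type Trigger =
  | { kind: "serverError"; name: string }
  | { kind: "iceState"; state: string };

type Step =
  | { action: "sendStop"; streamId: string }
  | { action: "stopSignalingClient" }
  | { action: "wait"; ms: number }
  | { action: "sendPublish"; streamId: string };

// Hypothetical waits inside the 10-30 s range mentioned in the post.
const ERROR_WAIT_MS: { [name: string]: number } = {
  streamIdInUse: 30000,
  publishTimeoutError: 20000,
  already_publishing: 10000,
};
const ICE_WAIT_MS = 10000;

// Returns the wait for a disruptive trigger, or null if no refresh is needed.
function waitFor(trigger: Trigger): number | null {
  if (trigger.kind === "serverError") {
    const known = Object.prototype.hasOwnProperty.call(ERROR_WAIT_MS, trigger.name);
    return known ? ERROR_WAIT_MS[trigger.name] : null;
  }
  const lost = trigger.state === "DISCONNECTED" || trigger.state === "FAILED";
  return lost ? ICE_WAIT_MS : null;
}

// Builds the ordered refresh steps for one stream id.
function planStreamRefresh(streamId: string, trigger: Trigger): Step[] {
  const ms = waitFor(trigger);
  if (ms === null) {
    return []; // not a disruptive event, nothing to do
  }
  return [
    { action: "sendStop", streamId: streamId },
    { action: "stopSignalingClient" },
    { action: "wait", ms: ms },
    { action: "sendPublish", streamId: streamId },
  ];
}
```

Keeping the decision separate from the code that actually sends messages makes it easier to test which events trigger a refresh and how long each one waits.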

After the "stream refresh", Ant Media Server reacts differently in different network environments. Usually, though, the Android app receives "streamIdInUse" several times before finally reconnecting and changing its ICE connection state to "CONNECTED".
Now, in 50% of cases streaming continues and everything works, but in the other 50% the "publish_started" event callback is never received and streaming never works again (the only solution is to start streaming on a different stream id).
Along with this description, I'm attaching the Ant Media Server logs. I reproduced the issue today (18.03.2022): streaming started at 13:26 CET (stream id: "AfyLXKiZbp3m1647606389752"), and around 13:27:53 the stream stopped working after I took the steps explained above.

Logs

ams-server-logs.zip

@SelimEmre
Contributor Author

I investigated this issue in depth. It seems to be related to the reconnection timeout parameter, which was 100 ms. This means that after a disconnection, Android sends a publish request after 100 ms, so some requests get streamIdInUse, already_publishing, and publishTimeoutError.
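To illustrate why a longer retry interval helps, here is a minimal TypeScript sketch of a capped exponential backoff for re-sending the publish request. It is not the SDK's actual fix: the function name, the 3s base, and the 30s cap are assumptions chosen for illustration.

```typescript
// Capped exponential backoff for publish retries.
// baseMs and maxMs are hypothetical defaults, not values taken from the SDK.
function retryDelayMs(attempt: number, baseMs: number = 3000, maxMs: number = 30000): number {
  // Guard against negative or fractional attempt counters.
  const safeAttempt = Math.max(0, Math.floor(attempt));
  const delay = baseMs * Math.pow(2, safeAttempt);
  return Math.min(delay, maxMs);
}
```

With these defaults the first retry waits 3s instead of 100 ms, which gives the server time to release the previous session before the same stream id is published again.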

@mekya mekya moved this to 👀 In review in Ant Media Server Jan 30, 2023
@rafabarros

Same here.
When I turn off Wi-Fi or change the network, it goes crazy!
It cannot re-establish the connection.
( react sdk ) - "@antmedia/react-native-ant-media": "^1.0.4",

@mekya mekya moved this from 👀 In review to Next Sprint in Ant Media Server May 2, 2023
@mekya
Contributor

mekya commented May 2, 2023

Thank you for bringing up this issue, @rafabarros.

I've created a new issue and let's follow up there.

Btw, we have a fix in the JS SDK for this issue so closing this one.

@mekya mekya closed this as completed May 2, 2023
@github-project-automation github-project-automation bot moved this from Next Sprint to ✅ Done in Ant Media Server May 2, 2023
@rafabarros

Is there a way to recreate the connection with the React SDK methods?
How can I detect that it's gone?
I am trying to retry using joinRoom, but it just doesn't respond.

@mekya
Contributor

mekya commented May 8, 2023

Hi @rafabarros,

You can listen for the callbacks about ice_connection_state and the websocket's closed parameters.
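A minimal TypeScript sketch of that idea: a pure function that classifies a callback as "connection ok" or "connection lost". The callback names "ice_connection_state_changed" and "closed" and the `obj.state` payload shape are assumptions here, so check them against the SDK version you use; the state strings are the standard WebRTC ICE connection states.

```typescript
type Health = "ok" | "lost" | "unknown";

// Classifies an SDK callback. The info names and the obj.state field are
// assumptions about the callback payload, not a confirmed SDK contract.
function classifyCallback(info: string, obj?: { state?: string }): Health {
  if (info === "closed") {
    return "lost"; // the signaling websocket was closed
  }
  if (info === "ice_connection_state_changed") {
    const state = obj && obj.state ? obj.state : "";
    if (state === "connected" || state === "completed") {
      return "ok";
    }
    if (state === "disconnected" || state === "failed" || state === "closed") {
      return "lost";
    }
  }
  return "unknown"; // unrelated callback or a transitional ICE state
}
```

An app could call this from its SDK callback handler and start its own retry logic whenever the result is "lost".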

If it's OK with you, let's continue by creating a new discussion here -> https://github.com/orgs/ant-media/discussions

@MaZZly

MaZZly commented Mar 21, 2024

There are still many problems in the JS SDK when switching from Wifi <-> Cellular...

Problem 1

webRTCAdaptor.playStreamId will start containing duplicate ids when switching network connection, which it probably shouldn't.
I'm guessing it should be kept in sync with what webRTCAdaptor.remotePeerConnection contains.
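As an application-side illustration of what "kept in sync" could mean, here is a minimal TypeScript sketch that removes duplicate ids and drops ids without a matching peer connection. It assumes playStreamId is an array of stream ids and remotePeerConnection is an object keyed by stream id; both shapes and the function name are assumptions, and this is not the SDK's fix.

```typescript
// Returns the play list without duplicates and without ids that have no
// peer connection. The argument shapes are assumed, not taken from the SDK.
function reconcilePlayStreamIds(
  playStreamId: string[],
  remotePeerConnection: { [streamId: string]: unknown }
): string[] {
  const result: string[] = [];
  for (let i = 0; i < playStreamId.length; i++) {
    const id = playStreamId[i];
    const hasPeer = Object.prototype.hasOwnProperty.call(remotePeerConnection, id);
    // Keep the first occurrence only, and only if a peer connection exists.
    if (hasPeer && result.indexOf(id) === -1) {
      result.push(id);
    }
  }
  return result;
}
```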

Problem 2

  1. Have 2 other people in the room sharing streams
  2. Toggle own network wifi/cellular and wait for streams to start showing again.
  3. Turn off one of the other 2 people's streams
  4. Toggle own network again.
  5. Starts throwing Error Callbacks for no_stream_exist
    • Even if we have webRTCAdaptor.stop(message.streamId); on the no_stream_exist-error handler, it still keeps throwing that error repeatedly, meaning it is still trying to connect to a stream that has already been stopped.

This might be related to the above error, as webRTCAdaptor.playStreamId starts containing more and more duplicates the more times you toggle network and let it recover..

Problem 3

When toggling network and having multiple streams open, it doesn't trigger newTrackAvailable for all the streams during reconnection, although the ice_connection_state_changed for the streams are switching from disconnected state to connected.
So there is no way to get a valid MediaStream to set as srcObject on a <video>-element for those streams...
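To make this state easier to detect in an app, here is a minimal TypeScript sketch of a tracker that records which streams reported a connected ICE state and which ones delivered a track. The class and method names are hypothetical, and wiring them to ice_connection_state_changed and newTrackAvailable is left to the application.

```typescript
function addOnce(list: string[], id: string): void {
  if (list.indexOf(id) === -1) {
    list.push(id);
  }
}

function without(list: string[], id: string): string[] {
  return list.filter((item) => item !== id);
}

// Hypothetical helper: tracks streams that are connected but have no track yet.
class TrackWatch {
  private connected: string[] = [];
  private withTrack: string[] = [];

  onIceConnected(streamId: string): void {
    addOnce(this.connected, streamId);
  }

  // A disconnect invalidates the earlier track, so forget both facts.
  onIceDisconnected(streamId: string): void {
    this.connected = without(this.connected, streamId);
    this.withTrack = without(this.withTrack, streamId);
  }

  onNewTrack(streamId: string): void {
    addOnce(this.withTrack, streamId);
  }

  // Streams whose ICE state is connected but that never delivered a track.
  missingTracks(): string[] {
    return this.connected.filter((id) => this.withTrack.indexOf(id) === -1);
  }
}
```

Checking missingTracks() a few seconds after a reconnect would list exactly the streams described in this problem, which could then be stopped and played again as a workaround.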


@mustafaboleken Can you get someone to prioritize this problem, and test having multiple streams in a room, and then toggle phone wifi / cellular connection and ensure that there are not any broken states and that events fire correctly for all streams? :)

This is a show-stopper for us putting ANT media server into usage in production, as being able to restore a spotty network connection is mandatory to work...

@mustafaboleken mustafaboleken moved this from ✅ Done to Next Sprint in Ant Media Server Mar 21, 2024
@mustafaboleken
Contributor

mustafaboleken commented Mar 21, 2024

Hi @MaZZly
Thanks for the report. I put it into the next sprint. I will let you know about the status.

@MaZZly

MaZZly commented Apr 9, 2024

@mustafaboleken what is the status on this one?

@mustafaboleken
Contributor

Hi @MaZZly

I'm currently working on it and there is a small amount of work left before the pull request. I'm on vacation for the rest of the week. I will probably create a pull request next week.

@MaZZly

MaZZly commented Apr 15, 2024

@mustafaboleken please let me know when a version with the fix is released and I'll do more testing to see if the odd states can still be triggered :)

@MaZZly

MaZZly commented Apr 22, 2024

@mustafaboleken I think I've found another bug where the publish_started callback doesn't always trigger, even though the ice_connection_state_changed callback triggers and goes to connected. However, in this case, it happens without any network change, and when the user has just joined and starts to publish a stream.. It might still be related to Problem 3 🤔

Will there be a testable version soon that we could verify?
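Until a fix is available, an app can at least detect the "ICE connected but no publish_started" state described above. The following is a minimal TypeScript sketch of such a watchdog; the class name, method names, and the timeout value are hypothetical, and timestamps are passed in explicitly so the logic stays testable.

```typescript
// Hypothetical watchdog: flags a publish attempt as stuck when ICE reached
// "connected" but publish_started did not arrive within timeoutMs.
class PublishWatchdog {
  private iceConnectedAt: number | null = null;
  private publishStarted = false;

  constructor(private timeoutMs: number) {}

  // Call when a new publish request is sent.
  onPublishRequested(): void {
    this.iceConnectedAt = null;
    this.publishStarted = false;
  }

  // Call when the ICE connection state changes to connected.
  onIceConnected(nowMs: number): void {
    this.iceConnectedAt = nowMs;
  }

  // Call when the publish_started callback arrives.
  onPublishStarted(): void {
    this.publishStarted = true;
  }

  isStuck(nowMs: number): boolean {
    if (this.iceConnectedAt === null || this.publishStarted) {
      return false;
    }
    return nowMs - this.iceConnectedAt >= this.timeoutMs;
  }
}
```

When isStuck returns true, the app could stop and republish, or surface an error to the user instead of waiting indefinitely.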

@MaZZly

MaZZly commented Apr 30, 2024

@mustafaboleken, what is the status here? We've been waiting on this showstopper for 3 weeks now. 😬

@mustafaboleken
Contributor

Hi @MaZZly,

I'm sorry for not finalizing it earlier. I had internal meetings with the team to finalize the changes, but it took some time. Let me talk with them to fix it in 2 days.

@MaZZly

MaZZly commented May 7, 2024

@mustafaboleken, it has now been a couple of days. Are they progressing on getting this fixed?

@burak-58 burak-58 moved this from Next Sprint to ✅ Done in Ant Media Server May 20, 2024
@burak-58 burak-58 moved this from ✅ Done to Icebox in Ant Media Server May 20, 2024
@burak-58 burak-58 moved this from Icebox to ✅ Done in Ant Media Server May 20, 2024
Status: Done