This repository has been archived by the owner on Aug 21, 2024. It is now read-only.
Fixed issues with Mediasoup port allocation. #7979
Merged
Conversation
HexaField reviewed May 12, 2023
@barankyle I tried deploying this branch to microk8s and I am still unable to connect to instance server. Do I need to update rtc port range in config?
…into mediasoup-port-refactor
hanzlamateen approved these changes May 30, 2023
Working fine on microk8s and minikube
Summary
Testing of instanceservers on microk8s on a 24-thread CPU demonstrated scaling problems with the port allocation to mediasoup. To make (data)producers consumable on every core, each (data)producer must be piped to every other core. This consumes two local ports per pair of cores, as each router must create a pipeTransport to the other router, which in turn needs a pipeTransport of its own. The port cost therefore grows quadratically with core count; if mediasoup is given 200 ports to use, it cannot support a 16-core/thread processor, since piping alone would require 16 × 15 = 240 ports, before even counting the ports needed for incoming transports from clients.
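The pairwise piping cost described above can be sketched as a quick calculation (pure arithmetic, not code from this PR):

```typescript
// Ports consumed by inter-router piping alone: every pair of routers
// needs a pipeTransport on each side, and each pipeTransport binds one
// local port, so n workers consume n * (n - 1) ports before any client
// traffic is handled.
function pipePortsNeeded(workers: number): number {
  return workers * (workers - 1);
}

// A 16-worker instanceserver already exceeds a 200-port allocation:
console.log(pipePortsNeeded(16)); // 240
console.log(pipePortsNeeded(24)); // 552
```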
The solution uses a newer mediasoup feature, WebRtcServers. Previously, each external transport consumed a port of its own, and each client needs two transports (one recvTransport and one sendTransport), so n clients consumed 2n ports. A WebRtcServer can handle a near-unlimited number of transports on a single inbound port. Now, when an instanceserver starts, it creates a WebRtcServer on each worker and saves a reference on that worker. When a (data)transport is created, the router's worker's WebRtcServer is passed to createWebRtcTransport instead of listenIps.
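A minimal sketch of the per-worker setup, assuming a 40000 base port as described below; the helper name `webRtcServerListenInfos` is hypothetical, and the mediasoup calls are shown in comments so the sketch stays self-contained:

```typescript
// One WebRtcServer per worker, each bound to a single public port.
// Only these first `numWorkers` ports ever need to be exposed.
interface ListenInfo {
  protocol: "udp" | "tcp";
  ip: string;
  announcedIp?: string;
  port: number;
}

const RTC_START_PORT = 40000; // assumed default from the PR description

function webRtcServerListenInfos(
  workerIndex: number,
  ip = "0.0.0.0",
  announcedIp?: string
): ListenInfo[] {
  const port = RTC_START_PORT + workerIndex; // 40000, 40001, ...
  return [
    { protocol: "udp", ip, announcedIp, port },
    { protocol: "tcp", ip, announcedIp, port },
  ];
}

// With mediasoup (v3.10+), this would be used roughly as:
//   const webRtcServer = await worker.createWebRtcServer({
//     listenInfos: webRtcServerListenInfos(i, ip, announcedIp)
//   });
// and later, instead of passing listenIps to the router:
//   await router.createWebRtcTransport({ webRtcServer, enableUdp: true });

console.log(webRtcServerListenInfos(3).map((l) => l.port)); // [ 40003, 40003 ]
```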
The solution also greatly expands the number of ports that mediasoup uses, while assuming that only the first 100-200 are publicly exposed. Previously, mediasoup used only the 200 (by default) ports specified in the instanceserver fleet specification in the etherealengine Helm chart, on the theory that that many were needed to adequately handle 50-100 connecting clients. With WebRtcServers, only one port per core needs to be exposed publicly, and since the WebRtcServers are the first things to start and be assigned ports, they will receive ports in the range 40000-40199, starting at 40000.
Mediasoup is now given a 10,000-port block in total to work with. The first n ports are used by the WebRtcServers, and the rest are free to be allocated to pipeTransports as requested. This supports up to a 100-core/thread CPU, which seems sufficiently future-proof. If a higher-thread-count CPU needs to be supported, setting the environment variable NUM_RTC_PORTS overrides the default of 10000.
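The override described above could look roughly like this (a hedged reconstruction; the function name and fallback behavior are assumptions, only the NUM_RTC_PORTS variable and the 10,000 default come from the description):

```typescript
// Size of the port block handed to mediasoup. NUM_RTC_PORTS, when set
// to a positive integer, overrides the 10,000-port default.
function rtcPortBlockSize(env: Record<string, string | undefined>): number {
  const parsed = Number(env.NUM_RTC_PORTS);
  return Number.isInteger(parsed) && parsed > 0 ? parsed : 10000;
}

// Typical call site would be rtcPortBlockSize(process.env).
console.log(rtcPortBlockSize({})); // 10000
console.log(rtcPortBlockSize({ NUM_RTC_PORTS: "20000" })); // 20000
```

Note that the 10,000 default is consistent with the 100-core claim: 100 WebRtcServer ports plus 100 × 99 = 9,900 pipeTransport ports fills the block exactly.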
Added an environment variable DEV_CHANNEL='true' to instanceservers' start-channel and dev-channel scripts. This lets those processes run on a different port range so that, in dev mode, the channel server does not conflict with the ports used by the world server (by default the channel server starts at port 30000 instead of 40000).
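The base-port selection this implies can be sketched as follows (the helper name is hypothetical; the two port values come from the description above):

```typescript
// In dev mode the channel server takes a separate base port so it does
// not collide with the world server's 40000 range.
function rtcBasePort(env: Record<string, string | undefined>): number {
  return env.DEV_CHANNEL === "true" ? 30000 : 40000;
}

console.log(rtcBasePort({ DEV_CHANNEL: "true" })); // 30000
console.log(rtcBasePort({})); // 40000
```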
Also made the closing of (data)producers more explicit, rather than relying solely on the closing transport to close them.
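A sketch of the pattern, using a hypothetical per-transport producer registry (mediasoup's transport.close() does cascade to producers, but closing them explicitly first makes teardown order deterministic):

```typescript
// Minimal structural type covering what this sketch needs from
// mediasoup Producers, DataProducers, and Transports.
interface Closeable {
  closed: boolean;
  close(): void;
}

// Close every (data)producer explicitly before closing the transport,
// instead of relying on the transport close to cascade.
function closeTransportExplicitly(
  transport: Closeable,
  producers: Closeable[]
): void {
  for (const producer of producers) {
    if (!producer.closed) producer.close();
  }
  transport.close();
}
```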
References
closes #insert number here
Checklist
QA Steps
List any additional steps required to QA the changes of this PR, as well as any supplemental images or videos.