Audio consumer changeProducer API implementation (PoC) #768
base: v3
@vpalmisano, what about the rest of the Consumer types? Simulcast, SVC? Since mediasoup is a library, we should try not to make changes solely for a specific use case.
Both frontend and backend assumed Something like
In those cases the implementation should also check if that switch is possible (e.g. checking codec compatibility before switching producer).
So are you proposing to implement the switch logic on the producer side, connecting the WebRTC consumer to this piped producer?
Simulcast audio? 🙃
I'm not saying that it should use a pipe transport, but I think it would be nice if the mechanism were similar. This way consumers will continue to have a fixed producer ID and the producer will contain all of the logic.
Of course, I'm not talking about audio-only Consumers :-)
Well, this PR is just for audio. Video will be more difficult and will only work for the same codec.
I know, I know this PR is just for audio...
Thanks for this effort, @vpalmisano. Similar to what Jose said, this feature (in its current state) is not suitable for mediasoup but for a super specific use case in which all participants send audio with exactly the same RTP parameters. This PR doesn't consider cases in which Alice and Bob may be producing OPUS with different We cannot make mediasoup implement these kinds of super specific use cases that make too many assumptions. Anyway I understand this PR as a PoC to be considered for future updates.
Maybe I'm saying something stupid, but what if we check whether the producers are "compatible" with regard to RTP parameters and throw an error if they are not? The application could then somehow force all participants in the same session to use the same parameters. I don't know; maybe it is not possible to force all the parameters. I don't think this is a very specific use case. It is a common way to implement sessions with a large number of participants. But I understand that you don't want it included in the mediasoup codebase as it is not generic (audio only). I suppose the current way to send a participant the audio of the last N speakers is SDP renegotiation, but the time elapsed between a participant starting to speak and their audio finally arriving at the other participants' browsers may be too long, with potential audio loss. Following @nazar-pc's comment, this dynamic routing capability could perhaps be implemented in another process that receives the audio streams of all participants and selects only one of them. The selection could also be changed dynamically. The generated audio stream can be sent to a mediasoup router and from there to the user by means of WebRTC. It is not ideal because that interprocess communication increases latency and CPU usage, but it solves the issue of renegotiation delay.
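As a rough illustration of the "check compatibility and throw an error if not" idea, here is a minimal TypeScript sketch. The `AudioCodec` shape and the functions `findCodecMismatches` and `assertSwitchable` are hypothetical names invented for this example; they are not mediasoup types or APIs, and a real check would probably also need to look at header extensions and encodings.

```typescript
// Simplified, hypothetical codec description; NOT mediasoup's RtpParameters type.
interface AudioCodec {
  mimeType: string;
  clockRate: number;
  channels: number;
  parameters: { [name: string]: string | number };
}

// Returns the names of the fields that differ; an empty array means "compatible".
function findCodecMismatches(a: AudioCodec, b: AudioCodec): string[] {
  const mismatches: string[] = [];
  // MIME types are compared case-insensitively ("audio/opus" vs "audio/OPUS").
  if (a.mimeType.toLowerCase() !== b.mimeType.toLowerCase()) mismatches.push("mimeType");
  if (a.clockRate !== b.clockRate) mismatches.push("clockRate");
  if (a.channels !== b.channels) mismatches.push("channels");
  // Compare codec-specific parameters present on either side (e.g. stereo, ptime).
  const keys = Object.keys(a.parameters).concat(Object.keys(b.parameters));
  for (let i = 0; i < keys.length; i++) {
    const label = "parameters." + keys[i];
    if (a.parameters[keys[i]] !== b.parameters[keys[i]] && mismatches.indexOf(label) === -1) {
      mismatches.push(label);
    }
  }
  return mismatches;
}

// Throws if switching from `current` to `next` would need a renegotiation.
function assertSwitchable(current: AudioCodec, next: AudioCodec): void {
  const mismatches = findCodecMismatches(current, next);
  if (mismatches.length > 0) {
    throw new Error("incompatible producers: " + mismatches.join(", "));
  }
}
```

With a check like this the application decides what to do on failure: reject the switch, or make sure beforehand that all participants produce with the same parameters.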
Considering that it is assumed (and asserted) that a Consumer will always consume from a Producer with the same RTP Parameters, I believe a more generic solution would be the following:
@jmillan I like the proposal. And we can ensure that
If we go that way we will need PluggableSimpleConsumer, PluggableSimulcastConsumer and so on. For v4 I think we should merge all XxxxConsumer classes (including PipeConsumer and DirectConsumer) into one and also merge PlainTransport and PipeTransport into one (both are already basically the same with a few differences, plus the latter uses PipeConsumers instead of normal ones). We should have a single Consumer class with capabilities to deal with simulcast, SVC, spatial and temporal layers, a "pipe" option (to behave as a PipeConsumer, which lets all simulcast streams go together to the endpoint), etc. We should make such a Consumer capable of dynamically switching codecs, RTP parameters, simulcast, SVC, etc. And we are done. Why? Because right now we have a SimpleConsumer class that cannot deal with temporal layers, and in the case of VP8 with a single stream and N temporal layers we need to use SimulcastConsumer, which makes no sense at all. We should have a single Consumer class capable of dealing with everything (spatial layers, temporal layers, etc.) and make the code behave differently (when needed) based on the codec. That's why we abstract the codec into the PayloadDescriptor class. The proposal of yet another PluggableConsumer (and friends) goes against that direction.
I think you have misunderstood it. The proposal is about
There is zero change on Consumers within the proposal. Yet, it is not 100% defined, neither is it the intention.
Still I don't understand how it solves the consuming side in which RTP parameters may be different.
Just to clarify, let me summarise from the beginning. There are N Producers in a Router. Let's imagine that each Producer represents a single endpoint. In a typical scenario each endpoint would like to consume all of the Router's Producers except its own, meaning each endpoint would have N-1 Consumers. In a Router with 100 Producers, each endpoint would have 99 Consumers. This feature aims to reduce the number of Producers that are consumed at a given time, so that the number of Consumers created for each endpoint is bounded by a known value. Imagine there are 100 Producers in a Router, but the application limits to 5 the number of consumable Producers at a given time (it decides which Producers to consume based on application logic: LastN, etc.). In this scenario Consumers will not be created out of Producers but out of PluggableProducers instead. The job of the PluggableProducer is to be consumed as if it were a real one, but a PluggableProducer is just a facade; it is not a source of RTP by any means. Its source of RTP is a real Producer which is plugged into it at a given moment. Consumers are hence created out of the PluggableProducer, whose RTP parameters will not change during its whole lifetime. The PluggableProducer needs to present its Consumers with RTP that uses the encodings they expect, so it needs to replace the SSRCs and any other needed info of the real Producer with its own. When one Producer is unplugged from the PluggableProducer and another one is plugged in, the PluggableProducer's Consumers need to be paused and resumed respectively so that the sequence number, timestamps, referenceSpatialLayer, etc. are reset too and new ones are considered from scratch (we are already ready for that).
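The facade idea described above can be sketched in a few lines of TypeScript. This is only an illustration under assumptions: `RtpPacketSketch`, `PluggableProducerSketch` and the `resyncCount` counter are hypothetical names made up for this example (mediasoup's real packet handling lives in its C++ worker), and only the SSRC and payload type are rewritten here.

```typescript
// Hypothetical, minimal view of an RTP packet; not a mediasoup type.
interface RtpPacketSketch {
  ssrc: number;
  payloadType: number;
  sequenceNumber: number;
  timestamp: number;
}

// A facade that is consumed as if it were a real Producer. Its own SSRC and
// payload type never change; the real source behind it can be swapped.
class PluggableProducerSketch {
  private facadeSsrc: number;
  private facadePayloadType: number;
  private pluggedSsrc: number | null = null;
  // How many times the Consumers would have been told to resync
  // (pause/resume) because the plugged source changed.
  public resyncCount = 0;

  constructor(facadeSsrc: number, facadePayloadType: number) {
    this.facadeSsrc = facadeSsrc;
    this.facadePayloadType = facadePayloadType;
  }

  // Plug a real Producer (identified here only by its SSRC).
  plug(sourceSsrc: number): void {
    if (this.pluggedSsrc === sourceSsrc) return;
    this.pluggedSsrc = sourceSsrc;
    this.resyncCount++;
  }

  unplug(): void {
    this.pluggedSsrc = null;
  }

  // Packets from the plugged Producer are rewritten to the facade identity;
  // packets from any other source are dropped (null).
  forward(packet: RtpPacketSketch): RtpPacketSketch | null {
    if (packet.ssrc !== this.pluggedSsrc) return null;
    return {
      ssrc: this.facadeSsrc,
      payloadType: this.facadePayloadType,
      sequenceNumber: packet.sequenceNumber,
      timestamp: packet.timestamp,
    };
  }
}
```

Sequence numbers and timestamps are passed through untouched in this sketch, on the assumption stated in the comment above that the Consumers resync them after each pause/resume.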
For some reason I read "PluggableConsumer" in your previous text, hence my confusion. It makes sense as you describe it. But, regarding RTP parameters, remember that the consuming endpoint/device receives the RTP parameters of the corresponding Producer, and those are needed in the SDP negotiation for things such as enabling stereo, ptime and so on. If we replace the producer a consumer is consuming, we must signal its RTP parameters (ok, the mapped ones, as we always do) to the consuming device so that it runs another SDP O/A and honors the new stereo, ptime settings, etc.
Only if we allow changing the real producer to one with different parameters. The initial design may not allow that and still be useful. I think we should just close this and create an issue describing
This would be one approach, which is legitimate but has its drawbacks:
Even if we went that way, the application could easily bypass it by doing the following, which IMHO is a more library-ish way of doing it: instead of renegotiating the new RTP parameters with an existing Consumer, create a new PluggableProducer and the corresponding Consumers with the new set of RTP parameters when needed. It is the application's responsibility to pause or destroy any existing Consumers that are not needed anymore, or just keep them, as in an Object Pool, for later usage at zero cost and zero setup time, instead of renegotiating the Consumers at the cost of one RTT each time a PluggableProducer gets plugged with a Producer that does not match the current Producer's RTP parameters. With this approach we pay one RTT only the first time we create a new PluggableProducer and its Consumers; after that, the Consumer can be reused instantaneously, without any delay. IMHO renegotiating brings no real benefit, only the extra RTT. And it can be bypassed by applications anyway, as described above. The described solution fits, IMHO, the real purpose of the request, which implies an instantaneous[*] transition between consuming two different Producers. [*]: An RTT is needed only upon creation of the Consumers for the PluggableProducer.
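The Object Pool approach can be sketched as follows, assuming the application keys its pooled facades by a fingerprint of the RTP parameters. `FacadePoolSketch`, `fingerprintOf` and the `negotiations` counter are hypothetical names for this illustration only; the counter stands in for "one RTT paid" each time a new PluggableProducer and its Consumers have to be created.

```typescript
// Builds a pool key from the fields that would otherwise force a renegotiation.
// Which fields matter is an assumption of this sketch.
function fingerprintOf(codec: { mimeType: string; clockRate: number; channels: number }): string {
  return [codec.mimeType.toLowerCase(), codec.clockRate, codec.channels].join("/");
}

// Keeps one facade (PluggableProducer plus its Consumers) per fingerprint.
class FacadePoolSketch<T> {
  // A prototype-less object is used as a plain string-keyed dictionary.
  private entries: { [fingerprint: string]: T } = Object.create(null);
  // Number of facades created so far, i.e. number of RTTs paid.
  public negotiations = 0;

  // Returns the pooled facade for this fingerprint, creating it on first use.
  acquire(fingerprint: string, create: () => T): T {
    let entry = this.entries[fingerprint];
    if (entry === undefined) {
      entry = create();
      this.entries[fingerprint] = entry;
      this.negotiations++;
    }
    return entry;
  }
}
```

Usage would look like `pool.acquire(fingerprintOf(codec), () => createFacade(codec))`, where `createFacade` is whatever the application uses to set up the PluggableProducer and its Consumers; only the first call per fingerprint runs the factory.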
This PR implements a proof-of-concept allowing audio consumers to dynamically switch the producer instance, without any client renegotiations:
The motivation for adding this feature is to support large-room scenarios with hundreds (or thousands) of audio producers. In this case it is difficult, on the client side, to handle such a large number of consumers, given that browsers limit the number of `<audio>` objects (and, in general, the number of peer connection transceivers) that can be created inside the same page. With this API, the client can create one single audio consumer, and on the server side we can use the active speaker data to automatically feed that single audio consumer with the active producer.

Comments are welcome!
The new API can be tested using this medianode-demo branch by enabling the `singleAudioConsumerMode` query parameter (this PR is not intended to be merged; its only purpose is to demonstrate the new feature).

Current TODO list:
`changeProducer` availability to audio-only consumers.