Wildcard SUBSCRIBE #253

Closed
afrind opened this issue Sep 27, 2023 · 10 comments · Fixed by #498
Labels
Subscribe Related to SUBSCRIBE message and subscription handling

Comments

@afrind
Collaborator

afrind commented Sep 27, 2023

SUBSCRIBE today requires a full track name, and specifies exactly one track. Do we need a subscription mechanism that matches more than one track, including tracks that may not exist currently, but may come into existence in the future?

@kixelated
Collaborator

kixelated commented Sep 29, 2023

The only use-case for wildcard subscriptions is what I'm dubbing catalog-less. A subscriber could subscribe to tracks without knowing about their existence.

  1. A subscriber would know the base path of a conference call.
  2. Each participant would ANNOUNCE base/participant.
  3. Each participant would SUBSCRIBE base/*.

This is cool at first glance, but I don't see it actually working in practice. The problem is that there's no selection mechanism; a subscriber will receive ALL possible tracks. That means stuff like simulcast or multiple codec support is basically impossible. You would absolutely need a catalog for each participant at a minimum.

3a. Each participant would SUBSCRIBE base/*/catalog

Now you could choose tracks for each participant. The benefit of catalog-less is that you don't have a central server; each participant would talk directly to the CDN only.
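The difference between `base/*` and `base/*/catalog` can be sketched in a few lines. This is only an illustration: the thread never defines wildcard semantics, so the sketch assumes that a trailing `*` matches one or more remaining path segments and that a `*` in the middle matches exactly one segment. The `matches` helper and the track names are hypothetical.

```python
def matches(pattern: str, track: str) -> bool:
    """Assumed semantics: a trailing '*' matches one or more segments,
    a '*' elsewhere matches exactly one segment."""
    p = pattern.split("/")
    t = track.split("/")
    if p[-1] == "*":
        head = p[:-1]
        # The track must extend past the literal prefix by at least one segment.
        if len(t) <= len(head):
            return False
        return all(a == "*" or a == b for a, b in zip(head, t))
    return len(p) == len(t) and all(a == "*" or a == b for a, b in zip(p, t))


# Hypothetical tracks published under the conference base path.
published = [
    "base/alice/catalog",
    "base/alice/video-720p",
    "base/alice/video-360p",
    "base/bob/catalog",
]

# SUBSCRIBE base/* delivers every track, with no way to select among them.
everything = [t for t in published if matches("base/*", t)]

# SUBSCRIBE base/*/catalog delivers only the per-participant catalogs.
catalogs_only = [t for t in published if matches("base/*/catalog", t)]
```

Under these assumed semantics the first subscription pulls in every rendition of every participant, while the second leaves track selection to the subscriber.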

...except you would still want a central server; I don't think you can get away without one. Stuff like authentication, notifications, logging, billing, validation, versioning, etc. is really difficult if participants are allowed to connect to a CDN and publish arbitrary namespaces. It's possible, but you're just going to end up exposing an API (ex. api.webex.com/join). The central server can then just publish a catalog of all participants; there's no need for catalog-less.

@hardie
Collaborator

hardie commented Sep 29, 2023 via email

@wilaw
Contributor

wilaw commented Sep 29, 2023

The wildcard approach is alluring in its simplicity. However it opens up a strong vector for unintentional (or perhaps malicious) DOS'ing of a distribution relay. For example, a reasonable wildcard'ed subscription like this

webex.com/session/gh34gh43j4/carol/*

would get you Carol's slides when she publishes them. However this subscription

webex.com/*

would likely DOS your relay and all the nodes upstream of it. In the HTTP world, it would be the equivalent to being able to ask any CDN edge node for all 22PB of assets owned by a customer, without even knowing any of their URLs.

We can counter this risk by using tokens to enforce access control. The token would presumably define some base path below which a user is allowed to wildcard. In moq-transport so far, token protection is at the discretion of the application. Since we have no standard tokenization scheme, this is difficult to enforce, brittle when we get to scale and IMO is overreaching in terms of design.
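As a rough sketch of what "a base path below which a user is allowed to wildcard" could mean, a relay might compare the literal prefix of the requested pattern against the base path carried in the token. moq-transport defines no token format, so `wildcard_allowed` and the `token_base` parameter are assumptions made for illustration only.

```python
def wildcard_allowed(pattern: str, token_base: str) -> bool:
    """Allow a wildcard SUBSCRIBE only if every segment of the token's base
    path appears literally at the start of the pattern (hypothetical policy)."""
    base = token_base.strip("/").split("/")
    literal = []
    for segment in pattern.strip("/").split("/"):
        if segment == "*":
            break  # everything from here on is wildcarded
        literal.append(segment)
    return literal[: len(base)] == base


# Hypothetical token scoped to a single session.
token_base = "webex.com/session/gh34gh43j4"

ok = wildcard_allowed("webex.com/session/gh34gh43j4/carol/*", token_base)
too_broad = wildcard_allowed("webex.com/*", token_base)
```

With this policy the Carol subscription is accepted and `webex.com/*` is rejected, because its literal prefix is shorter than the scope the token grants.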

There are two alternate solutions (beyond wildcard subscriptions) to the oft cited use case of getting the slides quickly in a conferencing use-case.

  1. The catalog for Carol (or for the web conference if the catalog is generated by a central server) can advertise a track for Carol's slides and all the other participants can subscribe to it. It doesn't mean she actually has to send out any content on that track and in fact she may never do so. However, the moment Carol shares her slides, the network is set up to instantly distribute the content to the correct recipients.
  2. Simply use a delta catalog update to describe the new slide track. It is a very small payload and would be distributed very quickly. A web conferencing application already has a roundtrip delay in sharing slides, because only one person can share at a time, and so the orchestrator needs to coordinate who has the right to share. The delta catalog update would happen in parallel to any such orchestration. By using the catalog update, you inherit the content selection, initialization, track relationship and access control built in to that solution. I think this solution is a clean one and I'd like to see evidence from some of the early conferencing PoCs that it is insufficiently performant.
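A delta catalog update of the kind described in option 2 might look like the sketch below. The catalog format is not specified in this thread, so the dictionary layout, the `add`/`remove` keys, the codec values, and `apply_delta` are all hypothetical.

```python
def apply_delta(catalog: dict, delta: dict) -> dict:
    """Return a new catalog with the delta's removals and additions applied."""
    updated = dict(catalog)
    for name in delta.get("remove", []):
        updated.pop(name, None)
    updated.update(delta.get("add", {}))
    return updated


# Hypothetical catalog for Carol before she shares her slides.
catalog = {
    "carol/audio": {"codec": "opus"},
    "carol/video": {"codec": "av1"},
}

# The delta only describes the new slide track, so the payload stays small.
delta = {"add": {"carol/slides": {"codec": "av1"}}}

catalog_v2 = apply_delta(catalog, delta)
```

Only the new track travels on the wire; subscribers that already hold the previous catalog reconstruct the full one locally.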

@hardie
Collaborator

hardie commented Sep 29, 2023 via email

@kixelated
Collaborator

kixelated commented Sep 29, 2023

This might add some complexity, but I think we will need it for the conference call case; otherwise the join latency will depend on a catalog update.

I'm proposing that the meeting host (participant or service) maintains a meeting catalog track of all participants. All participants are subscribed to this track, so when a new participant is added, the catalog update is effectively pushed.

-> SUBSCRIBE meeting/catalog
<- SUBSCRIBE_OK meeting/catalog
<- OBJECT meeting/catalog 0
<- OBJECT meeting/catalog 1
...
(alice joins)
<- OBJECT meeting/catalog 2

How Alice joins the meeting is up to the application; somehow the meeting host needs to get the catalog from Alice. It could be part of an api.webex.com/join API call that you'll need anyway for 3rd party CDN support, or it could be via an ANNOUNCE from Alice. Either way, the meeting host updates the meeting catalog and each participant can fetch Alice's media.

-> SUBSCRIBE meeting/alice/audio
-> SUBSCRIBE meeting/alice/video
<- SUBSCRIBE_OK meeting/alice/audio
<- SUBSCRIBE_OK meeting/alice/video
<- OBJECT meeting/alice/audio 0
<- OBJECT meeting/alice/video 0

This adds another RTT so the viewer can choose which tracks it wants. You're right that this RTT could be avoided if every participant produced required tracks using required codecs. However, you will eventually need to support optional codecs, optional renditions, and/or optional tracks. It's possible with catalog-less, but it would be gross as it would require unsubscribing from individual wildcard matches.
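The selection step that costs the extra RTT could be sketched as follows: when a new meeting catalog object arrives, the viewer picks the tracks it can decode and issues a SUBSCRIBE for each one it does not already have. The catalog shape, the codec names, and `new_subscriptions` are assumptions made for illustration, not part of any draft.

```python
SUPPORTED_CODECS = {"opus", "h264"}  # hypothetical viewer capabilities


def new_subscriptions(catalog: dict, subscribed: set) -> list:
    """Return the track names to SUBSCRIBE to after a catalog update."""
    wanted = []
    for participant, tracks in sorted(catalog.items()):
        for track, codec in sorted(tracks.items()):
            name = f"meeting/{participant}/{track}"
            if codec in SUPPORTED_CODECS and name not in subscribed:
                wanted.append(name)
    return wanted


subscribed = set()

# Hypothetical payload of "OBJECT meeting/catalog 2", pushed when Alice joins.
catalog_object_2 = {
    "alice": {"audio": "opus", "video": "h264", "video-av1": "av1"},
}

to_send = new_subscriptions(catalog_object_2, subscribed)
subscribed.update(to_send)
```

The AV1 rendition is simply never requested, which is the selection that a bare wildcard match cannot express without unsubscribing afterwards.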

Another issue with catalog-less is the lack of a central authority. I don't see how you could remove a participant from a meeting, or approve a participant to screen share, or force mute a participant, or really perform any sort of access control. With a catalog, the meeting host (possibly api.webex.com) can push a delta update to add/remove/modify tracks based on UI interaction. With a wildcard subscribe instead... I'm not sure what you would do actually.

@kixelated
Collaborator

kixelated commented Sep 29, 2023

(Will and Ted replied while I was drafting my latest wall of text)

> The wildcard approach is alluring in its simplicity. However it opens up a strong vector for unintentional (or perhaps malicious) DOS'ing of a distribution relay.

The auth token would definitely need to scope wildcard SUBSCRIBEs, much like it would already scope individual SUBSCRIBEs. I don't think this is a problem.

> Waiting for a catalog track and re-request will mean someone joining will take longer to be visible/audible to other participants and may "arrive" at different time scales for different participants.

I agree that a periodic catalog refresh à la HLS would be a poor experience.

Will and I want a live catalog track so there's no "re-request". Each update gets "pushed" to all participants since they will be subscribed to the meeting catalog track. There's no arrival variance other than the latency to origin.

> You could also design this so that the latecomer (Carol) can distribute a catalog track as well, so that any other tracks offered (including alternate video or audio codecs) would be available shortly.

It's just a lot of complexity to save an RTT or two on join. You could save those RTTs in other ways, like pushing the catalog update while the webcam/encoder is initializing, so subscriptions are active before any OBJECTs are generated. I think it's premature to optimize for RTTs, especially when it's not clear how you would implement standard conferencing functionality without a catalog.

@afrind afrind added the Subscribe Related to SUBSCRIBE message and subscription handling label Oct 3, 2023
@suhasHere
Collaborator

suhasHere commented Oct 4, 2023

Wildcard subscribe doesn't imply catalog-less :-) .

Use cases that need to support group semantics (not the MoQT group, but a group of users in a chat room, a group of members for whom policy (security) updates are published frequently, and so on) would benefit from something like a wildcard. In all these cases the subscriber either does not know the full track names beforehand, or a catalog-based approach would be too noisy with frequent catalog updates, or it would incur extra latency.

@afrind
Collaborator Author

afrind commented Feb 7, 2024

Individual Comment:

It seems to me that this is an optimization to save some RTTs, but that is meaningful and worth investigating.

Does anyone have a sketch of how this might work with the constructs we have in the current draft?

I keep running back into the Track Alias problem, which is that objects can suddenly arrive at a receiver who does not have any reference of what they are if the OBJECT races the OK. Maybe something like this:

WILDCARD_SUBSCRIBE {
  # matching criteria, TBD
  Num Reserved Subscription IDs (i)
  [Reserved Subscription ID (i)] +
}

When a track is available that matches the criteria, all objects get a preface with the track name / subscription ID mapping

WILDCARD_SUBSCRIBE_OK {
  Full Track Name
  Subscription ID (i)
  # other useful info
}
OBJECT_STREAM | STREAM_HEADER_TRACK | STREAM_HEADER_GROUP

Maybe the subscriber could issue a regular subscribe after seeing the mapping header or a catalog update and that would get rid of the extra overhead within a few RTTs. Another option is to send the WILDCARD_SUBSCRIBE_OK on the control stream, and the receiver may have to buffer unknown objects, but only if it's using wildcards.

The publisher/relay can also send messages to indicate it needs more subscribe IDs.
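One way to picture the reserved-ID idea on the publisher/relay side is the sketch below. It is not the proposal itself: the class name, the first-come ID assignment, and returning `None` when the pool is empty are all assumptions layered on the WILDCARD_SUBSCRIBE outline above.

```python
class WildcardSubscription:
    """Publisher-side bookkeeping for one hypothetical WILDCARD_SUBSCRIBE."""

    def __init__(self, reserved_ids):
        self.free_ids = list(reserved_ids)  # IDs reserved by the subscriber
        self.assigned = {}  # full track name -> subscription ID

    def assign(self, full_track_name):
        """Return the subscription ID for a matching track, or None when the
        reserved IDs are exhausted and more must be requested."""
        if full_track_name in self.assigned:
            return self.assigned[full_track_name]
        if not self.free_ids:
            return None
        sub_id = self.free_ids.pop(0)
        self.assigned[full_track_name] = sub_id
        return sub_id


sub = WildcardSubscription(reserved_ids=[10, 11])
audio_id = sub.assign("meeting/alice/audio")  # first match takes ID 10
video_id = sub.assign("meeting/alice/video")  # second match takes ID 11
exhausted = sub.assign("meeting/bob/audio")   # no reserved IDs left
```

The `None` result corresponds to the point where the publisher/relay would signal that it needs more subscribe IDs.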

@vasilvv
Collaborator

vasilvv commented Feb 7, 2024

One option is as follows: if you subscribe to foo/* in SUBSCRIBE identified by number 123, and you receive an object for foo/bar, you can abbreviate it as (123, bar).
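A small sketch of that abbreviation, assuming the wildcard is always a trailing `/*` and that both sides keep a table from subscribe number to pattern. `abbreviate` and `expand` are hypothetical helper names.

```python
def abbreviate(subscriptions: dict, full_track_name: str):
    """Return (subscribe number, suffix) for the first matching wildcard
    subscription, or None if no subscription covers the track."""
    for sub_id, pattern in subscriptions.items():
        if not pattern.endswith("/*"):
            continue
        prefix = pattern[:-1]  # "foo/*" -> "foo/"
        if full_track_name.startswith(prefix):
            return sub_id, full_track_name[len(prefix):]
    return None


def expand(subscriptions: dict, sub_id: int, suffix: str) -> str:
    """Inverse of abbreviate: rebuild the full track name on the receiver."""
    return subscriptions[sub_id][:-1] + suffix


subscriptions = {123: "foo/*"}
short = abbreviate(subscriptions, "foo/bar")
full = expand(subscriptions, *short)
```

Here `short` is `(123, "bar")` and `full` is `"foo/bar"`, so the receiver can recover the name without a separate mapping message.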

@afrind
Collaborator Author

afrind commented Aug 14, 2024

Proposal is to close this in favor of SUBSCRIBE_NAMESPACE #484 #498. Any thoughts or objections?

ianswett added a commit that referenced this issue Sep 4, 2024
This message suite allows subscribers to express interest in
ANNOUNCE/UNANNOUNCE for a set of namespaces that match a prefix. To
accomplish safe prefix matching, Track Namespace is redefined from a
single byte array to an N-tuple of byte arrays.

Fixes #484
Closes #253
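
One plausible reading of "safe prefix matching" is that a prefix over whole tuple elements cannot accidentally match a sibling namespace that merely shares leading bytes. The sketch below compares the two approaches; the function names and the example namespaces are illustrative assumptions, not taken from the draft or the pull request.

```python
def bytes_prefix_match(prefix: bytes, namespace: bytes) -> bool:
    """Prefix match on a single byte array."""
    return namespace.startswith(prefix)


def tuple_prefix_match(prefix: tuple, namespace: tuple) -> bool:
    """Prefix match on an N-tuple of byte arrays: whole elements must be equal."""
    return len(prefix) <= len(namespace) and namespace[: len(prefix)] == prefix


# With a single byte array, "meeting/1" also matches the unrelated "meeting/12".
leaky = bytes_prefix_match(b"meeting/1", b"meeting/12")

# With tuples, elements are compared as units, so meeting 12 no longer matches.
strict = tuple_prefix_match((b"meeting", b"1"), (b"meeting", b"12"))
nested = tuple_prefix_match((b"meeting", b"1"), (b"meeting", b"1", b"alice"))
```

The byte-array comparison matches meeting 12 when only meeting 1 was requested, while the tuple comparison matches only namespaces nested under meeting 1.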