Wildcard SUBSCRIBE #253

afrind · 2023-09-27T18:03:48Z

SUBSCRIBE today requires a full track name, and specifies exactly one track. Do we need a subscription mechanism that matches more than one track, including tracks that may not exist currently, but may come into existence in the future?

kixelated · 2023-09-29T01:18:11Z

The only use-case for wildcard subscriptions is what I'm dubbing catalog-less. A subscriber could subscribe to tracks without knowing about their existence.

1. A subscriber would know the base path of a conference call.
2. Each participant would ANNOUNCE base/participant.
3. Each participant would SUBSCRIBE base/*.

This is cool at first glance, but I don't see it actually working in practice. The problem is that there's no selection mechanism; a subscriber will receive ALL possible tracks. That means stuff like simulcast or multiple codec support is basically impossible. You would absolutely need a catalog for each participant at a minimum.

3a. Each participant would SUBSCRIBE base/*/catalog

Now you could choose tracks for each participant. The benefit of catalog-less is that you don't have a central server; each participant would talk directly to the CDN only.

...except you would still want a central server; I don't think you can get away without one. Stuff like authentication, notifications, logging, billing, validation, versioning, etc. is really difficult if participants are allowed to connect to a CDN and publish arbitrary namespaces. It's possible, but you're just going to end up exposing an API (ex. api.webex.com/join). The central server can then just publish a catalog of all participants; there's no need for catalog-less.

hardie · 2023-09-29T08:10:47Z

As a personal comment, I think you are taking the semantics of * a little too literally in your modeling of the problem. You have:

1. A subscriber would know the base path of a conference call.
2. Each participant would ANNOUNCE base/participant.
3. Each participant would SUBSCRIBE base/*.

but you could have this as

base/audio/*
base/video/*

thus ignoring text or haptics or whatever else the client couldn't handle, but getting audio and video from all participants. You could also have a semantic like

base/participant/default/*

where the subscription has the other side choose what tracks to send based on their defaults. Or even

base/*/audio-codec1
base/*/video-codec1

to get all the participants, known and unknown, to send that audio and video (note that this works best if the application has a mandatory to implement codec set, because then you can be certain this works).

This might add some complexity, but I think we will need it for conference call case; otherwise the join latency will depend on a catalog update.

regards,

Ted
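The patterns in this comment can be made concrete with a small sketch. The thread never defines wildcard semantics, so this assumes that `*` matches exactly one `/`-separated segment of a track name; the function name and the sample track names are hypothetical.

```python
def matches(pattern: str, track: str) -> bool:
    """Return True if `track` matches `pattern`.

    Assumption (not from any draft): `*` matches exactly one
    `/`-separated segment, so the segment counts must be equal.
    """
    p_parts = pattern.split("/")
    t_parts = track.split("/")
    if len(p_parts) != len(t_parts):
        return False
    return all(p == "*" or p == t for p, t in zip(p_parts, t_parts))


# Hypothetical tracks published by two participants.
tracks = [
    "base/alice/audio-codec1",
    "base/alice/video-codec1",
    "base/alice/video-codec2",
    "base/bob/audio-codec1",
    "base/bob/haptics",
]

# Subscribe to the mandatory codecs of every participant, known or unknown.
patterns = ["base/*/audio-codec1", "base/*/video-codec1"]
selected = [t for t in tracks if any(matches(p, t) for p in patterns)]
print(selected)
```

Under these assumptions the subscriber receives the `audio-codec1` and `video-codec1` tracks of every participant, while `video-codec2` and `haptics` are never delivered.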


wilaw · 2023-09-29T10:10:15Z

The wildcard approach is alluring in its simplicity. However it opens up a strong vector for unintentional (or perhaps malicious) DOS'ing of a distribution relay. For example, a reasonable wildcard'ed subscription like this

webex.com/session/gh34gh43j4/carol/*

would get you Carol's slides when she publishes them. However this subscription

webex.com/*

would likely DOS your relay and all the nodes upstream of it. In the HTTP world, it would be the equivalent to being able to ask any CDN edge node for all 22PB of assets owned by a customer, without even knowing any of their URLs.

We can counter this risk by using tokens to enforce access control. The token would presumably define some base path below which a user is allowed to wildcard. In moq-transport so far, token protection is at the discretion of the application. Since we have no standard tokenization scheme, this is difficult to enforce, brittle when we get to scale and IMO is overreaching in terms of design.
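As a rough illustration of what "a base path below which a user is allowed to wildcard" could mean, here is a minimal sketch. moq-transport defines no token format, so the `Token` class, its `wildcard_base` claim, and the check itself are all assumptions made for this example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Token:
    # Hypothetical claim: the literal path below which wildcards are permitted.
    wildcard_base: str


def wildcard_allowed(token: Token, pattern: str) -> bool:
    """Allow a pattern only if it sits strictly below the token's base path."""
    base = token.wildcard_base.strip("/").split("/")
    parts = pattern.strip("/").split("/")
    # The pattern must be deeper than the base, and its leading segments
    # must repeat the base literally (so no `*` may appear inside the base).
    if len(parts) <= len(base):
        return False
    return parts[: len(base)] == base


token = Token(wildcard_base="webex.com/session/gh34gh43j4")
print(wildcard_allowed(token, "webex.com/session/gh34gh43j4/carol/*"))
print(wildcard_allowed(token, "webex.com/*"))
```

With this token, the per-session pattern is accepted and `webex.com/*` is refused. The sketch does not address the concern raised above that such a scheme is hard to enforce without a standard token format.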

There are two alternate solutions (beyond wildcard subscriptions) to the oft cited use case of getting the slides quickly in a conferencing use-case.

1. The catalog for Carol (or for the web conference if the catalog is generated by a central server) can advertise a track for Carol's slides and all the other participants can subscribe to it. It doesn't mean she actually has to send out any content on that track and in fact she may never do so. However, the moment Carol shares her slides, the network is set up to instantly distribute the content to the correct recipients.
2. Simply use a delta catalog update to describe the new slide track. It is a very small payload and would be distributed very quickly. A web conferencing application already has a roundtrip delay in sharing slides, because only one person can share at a time, and so the orchestrator needs to coordinate who has the right to share. The delta catalog update would happen in parallel to any such orchestration. By using the catalog update, you inherit the content selection, initialization, track relationship and access control built in to that solution. I think this solution is a clean one and I'd like to see evidence from some of the early conferencing PoCs that it is insufficiently performant.

hardie · 2023-09-29T10:39:06Z

On Fri, Sep 29, 2023 at 11:10 AM Will Law wrote:

> The wildcard approach is alluring in its simplicity. However it opens up a strong vector for unintentional (or perhaps malicious) DOS'ing of a distribution relay. For example, a reasonable wildcard'ed subscription like this webex.com/session/gh34gh43j4/carol/* would get you Carol's slides when she publishes them. However this subscription webex.com/* would likely DOS your relay and all the nodes upstream of it.

It seems far more likely to simply be refused. The ability to use wildcards does not imply that they can be used everywhere; */* (any host, any session) is clearly ridiculous and would be refused. example.com/* should be out of bounds as well.

> In the HTTP world, it would be the equivalent to being able to ask any CDN edge node for all 22PB of assets owned by a customer, without even knowing any of their URLs. We can counter this risk by using tokens to enforce access control. The token would presumably define some base path below which a user is allowed to wildcard. In moq-transport so far, token protection is at the discretion of the application. Since we have no standard tokenization scheme, this is difficult to enforce, brittle when we get to scale and IMO is overreaching in terms of design. There are two alternate solutions (beyond wildcard subscriptions) to the oft cited use case of getting the slides quickly in a conferencing use-case.

This isn't the main problem I was referencing. Let's take the authors' call as an example. When I connect, I want the audio and video sent by the participants. But the participant list isn't consistent from week to week and it is very common for a participant to join later than the main set. Waiting for a catalog track and re-request will mean someone joining will take longer to be visible/audible to other participants and may "arrive" at different time scales for different participants. A subscription like:

example.com/sessionID/$ALL_Participants/video
example.com/sessionID/$ALL_Participants/audio
example.com/sessionID/$ALL_Participants/chat_text

would mean that the other participants would automatically be subscribed to chat text, audio, and video of anyone as they join. You could also design this so that the latecomer (Carol) can distribute a catalog track as well, so that any other tracks offered (including alternate video or audio codecs) would be available shortly.


kixelated · 2023-09-29T11:33:46Z

> This might add some complexity, but I think we will need it for conference call case; otherwise the join latency will depend on a catalog update.

I'm proposing that the meeting host (participant or service) maintains a meeting catalog track of all participants. All participants are subscribed to this track, so when a new participant is added, the catalog update is effectively pushed.

-> SUBSCRIBE meeting/catalog
<- SUBSCRIBE_OK meeting/catalog
<- OBJECT meeting/catalog 0
<- OBJECT meeting/catalog 1
...
(alice joins)
<- OBJECT meeting/catalog 2

How Alice joins the meeting is up to the application; somehow the meeting host needs to get the catalog from Alice. It could be part of an api.webex.com/join API call that you'll need anyway for 3rd party CDN support, or it could be via an ANNOUNCE from Alice. Either way, the meeting host updates the meeting catalog and each participant can fetch Alice's media.

-> SUBSCRIBE meeting/alice/audio
-> SUBSCRIBE meeting/alice/video
<- SUBSCRIBE_OK meeting/alice/audio
<- SUBSCRIBE_OK meeting/alice/video
<- OBJECT meeting/alice/audio 0
<- OBJECT meeting/alice/video 0

This adds another RTT so the viewer can choose which tracks it wants. You're right that this RTT could be avoided if every participant produced required tracks using required codecs. However, you will eventually need to support optional codecs, optional renditions, and/or optional tracks. It's possible with catalog-less, but it would be gross as it would require unsubscribing to individual wildcard matches.

Another issue with catalog-less is the lack of a central authority. I don't see how you could remove a participant from a meeting, or approve a participant to screen share, or force mute a participant, or really perform any sort of access control. With a catalog, the meeting host (possibly api.webex.com) can push a delta update to add/remove/modify tracks based on UI interaction. With a wildcard subscribe instead... I'm not sure what you would do actually.
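The catalog-driven flow described in this comment can be sketched roughly as follows. The catalog and delta shapes here (a dict of participant to track kinds, with `add` and `remove` keys) are invented for illustration and are not the MoQ catalog format; the function names are also hypothetical.

```python
def apply_delta(catalog: dict, delta: dict) -> dict:
    """Return a new catalog with a (hypothetical) delta update applied."""
    updated = {name: list(kinds) for name, kinds in catalog.items()}
    for name, kinds in delta.get("add", {}).items():
        updated[name] = list(kinds)
    for name in delta.get("remove", []):
        updated.pop(name, None)
    return updated


def subscriptions_for(catalog: dict, wanted: set) -> set:
    """Track names this subscriber chooses, given the kinds it can handle."""
    return {
        f"meeting/{name}/{kind}"
        for name, kinds in catalog.items()
        for kind in kinds
        if kind in wanted
    }


wanted = {"audio", "video"}
catalog = {"bob": ["audio", "video"]}
before = subscriptions_for(catalog, wanted)

# (alice joins) The host pushes a delta as the next catalog OBJECT.
catalog = apply_delta(catalog, {"add": {"alice": ["audio", "video", "slides"]}})
after = subscriptions_for(catalog, wanted)
print(sorted(after - before))   # new SUBSCRIBEs to send

# The host removes bob; the subscriber drops his tracks.
catalog = apply_delta(catalog, {"remove": ["bob"]})
print(sorted(after - subscriptions_for(catalog, wanted)))  # to unsubscribe
```

The point of the sketch is that selection (`wanted`) and removal both fall out of the catalog diff, which is the part that is hard to express with a wildcard alone.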

kixelated · 2023-09-29T12:26:12Z

(Will and Ted replied while I was drafting my latest wall of text)

> The wildcard approach is alluring in its simplicity. However it opens up a strong vector for unintentional (or perhaps malicious) DOS'ing of a distribution relay.

The auth token would definitely need to scope wildcard SUBSCRIBEs, much like it would already scope individual SUBSCRIBEs. I don't think this is a problem.

> Waiting for a catalog track and re-request will mean someone joining will take longer to be visible/audible to other participants and may "arrive" at different time scales for different participants.

I agree that a periodic catalog refresh ala HLS would be a poor experience.

Will and I want a live catalog track so there's no "re-request". Each update gets "pushed" to all participants since they will be subscribed to the meeting catalog track. There's no arrival variance other than the latency to origin.

> You could also design this so that the latecomer (Carol) can distribute a catalog track as well, so that any other tracks offered (including alternate video or audio codecs) would be available shortly.

It's just a lot of complexity to save an RTT or two on join. You could save those RTTs in other ways, like pushing the catalog update while the webcam/encoder is initializing, so subscriptions are active before any OBJECTs are generated. I think it's premature to optimize for RTTs, especially when it's not clear how you would implement standard conferencing functionality without a catalog.

suhasHere · 2023-10-04T11:54:49Z

Wildcard subscribe doesn't imply catalog-less :-) .

Use cases that need to support group semantics (not the MoQT group, but a group of users in a chat room, a group of members for whom policy (security) updates are published frequently, and so on) would benefit from something like wildcards. In all these cases the subscriber does not know the full track names beforehand, and the alternative would either be too noisy, with several catalog updates, or incur extra latency.

afrind · 2024-02-07T04:04:15Z

Individual Comment:

It seems to me that this is an optimization to save some RTTs, but that is meaningful and worth investigating.

Does anyone have a sketch of how this might work with the constructs we have in the current draft?

I keep running back into the Track Alias problem, which is that objects can suddenly arrive at a receiver who does not have any reference of what they are if the OBJECT races the OK. Maybe something like this:

WILDCARD_SUBSCRIBE {
  # matching criteria, TBD
  Num Reserved Subscription IDs (i)
  [Reserved Subscription ID (i)] +
}

When a track that matches the criteria becomes available, all objects get a preface with the track name / subscription ID mapping:

WILDCARD_SUBSCRIBE_OK {
  Full Track Name
  Subscription ID (i)
  # other useful info
}
OBJECT_STREAM | STREAM_HEADER_TRACK | STREAM_HEADER_GROUP

Maybe the subscriber could issue a regular subscribe after seeing the mapping header or a catalog update, and that would get rid of the extra overhead within a few RTTs. Another option is to send the WILDCARD_SUBSCRIBE_OK on the control stream, in which case the receiver may have to buffer unknown objects, but only if it's using wildcards.

The publisher/relay can also send messages to indicate it needs more subscribe IDs.
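To make the control-stream variant concrete, here is a minimal receiver-side sketch. WILDCARD_SUBSCRIBE and WILDCARD_SUBSCRIBE_OK are only proposed in this comment and exist in no draft; the class, method names, and buffering policy below are assumptions made for illustration.

```python
from collections import defaultdict


class WildcardReceiver:
    """Tracks reserved subscription IDs and buffers objects that race the OK."""

    def __init__(self, reserved_ids):
        self.unused_ids = set(reserved_ids)   # IDs offered in WILDCARD_SUBSCRIBE
        self.tracks = {}                      # subscription ID -> full track name
        self.pending = defaultdict(list)      # objects seen before their mapping
        self.delivered = []                   # (full track name, payload)

    def on_wildcard_subscribe_ok(self, full_track_name, subscription_id):
        if subscription_id not in self.unused_ids:
            raise ValueError("subscription ID was not reserved or is already in use")
        self.unused_ids.remove(subscription_id)
        self.tracks[subscription_id] = full_track_name
        # Flush anything that arrived before the mapping did.
        for payload in self.pending.pop(subscription_id, []):
            self.delivered.append((full_track_name, payload))

    def on_object(self, subscription_id, payload):
        name = self.tracks.get(subscription_id)
        if name is not None:
            self.delivered.append((name, payload))
        elif subscription_id in self.unused_ids:
            # Reserved but not yet mapped: the OBJECT raced the OK.
            self.pending[subscription_id].append(payload)
        else:
            raise ValueError("object for a subscription ID that was never reserved")

    def needs_more_ids(self) -> bool:
        # Would prompt the "send more subscribe IDs" exchange mentioned above.
        return not self.unused_ids


rx = WildcardReceiver(reserved_ids=[10, 11])
rx.on_object(10, b"object 0")                  # arrives first, gets buffered
rx.on_wildcard_subscribe_ok("foo/bar", 10)     # mapping arrives, buffer flushes
rx.on_object(10, b"object 1")
print(rx.delivered)
```

Because the IDs are reserved by the subscriber up front, an object with an unmapped but reserved ID can be held rather than dropped, which is one way to sidestep the Track Alias race described above.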

vasilvv · 2024-02-07T16:37:54Z

One option is as follows: if you subscribe to foo/* in SUBSCRIBE identified by number 123, and you receive an object for foo/bar, you can abbreviate it as (123, bar).
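A small sketch of this abbreviation, assuming the wildcard is a trailing `/*` and that the subscriber keeps a table of its own SUBSCRIBE numbers. The function names and the table layout are hypothetical.

```python
from typing import Optional


def abbreviate(subscriptions: dict, full_name: str) -> Optional[tuple]:
    """Sender side: replace the matched prefix with the SUBSCRIBE number."""
    for sub_id, pattern in subscriptions.items():
        if not pattern.endswith("/*"):
            continue
        prefix = pattern[:-1]  # "foo/*" -> "foo/"
        if full_name.startswith(prefix) and len(full_name) > len(prefix):
            return sub_id, full_name[len(prefix):]
    return None


def expand(subscriptions: dict, sub_id: int, suffix: str) -> str:
    """Receiver side: rebuild the full track name from (ID, suffix)."""
    return subscriptions[sub_id][:-1] + suffix


subscriptions = {123: "foo/*"}
short = abbreviate(subscriptions, "foo/bar")
print(short)                               # (123, 'bar')
print(expand(subscriptions, *short))       # foo/bar
```

In this form no reserved ID pool is needed, at the cost of carrying the variable-length suffix with each object or stream header.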

afrind · 2024-08-14T16:09:22Z

Proposal is to close this in favor of SUBSCRIBE_NAMESPACE #484 #498. Any thoughts or objections?

This message suite allows subscribers to express interest in ANNOUNCE/UNANNOUNCE for a set of namespaces that match a prefix. To accomplish safe prefix matching, Track Namespace is redefined from a single byte array to an N-tuple of byte arrays. Fixes #484 Closes #253
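The quoted description does not spell out what makes tuple matching "safe". One plausible reading, sketched below with hypothetical namespaces, is that a byte-array prefix can match across what the application considers a field boundary, while a tuple prefix compares whole fields.

```python
def namespace_has_prefix(namespace: tuple, prefix: tuple) -> bool:
    """Tuple prefix match: every field of `prefix` must equal the
    corresponding field of `namespace` exactly."""
    return namespace[: len(prefix)] == prefix


prefix = (b"example.com", b"meeting", b"12")
same_meeting = (b"example.com", b"meeting", b"12", b"alice")
other_meeting = (b"example.com", b"meeting", b"123", b"alice")

print(namespace_has_prefix(same_meeting, prefix))    # True
print(namespace_has_prefix(other_meeting, prefix))   # False

# With a single byte array, meeting 123 is matched by the prefix for
# meeting 12, which is presumably not what the subscriber intended.
print(b"example.com/meeting/123/alice".startswith(b"example.com/meeting/12"))  # True
```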

afrind mentioned this issue Sep 27, 2023

SUBSCRIBE REQUEST ambiguity #204

Closed

afrind added the Subscribe Related to SUBSCRIBE message and subscription handling label Oct 3, 2023

ianswett mentioned this issue Aug 31, 2024

SUBSCRIBE_NAMESPACE #498

Merged

ianswett closed this as completed in #498 Sep 4, 2024

This was referenced Nov 14, 2024

Support wildcard/pattern subscribes #615

Open

Simplify naming by removing the distinction between Track Namespace and Track Name #508

Closed
