Support for SVC scalability modes #40

Closed
aboba opened this issue Feb 12, 2020 · 22 comments
Assignees
Labels
extension Interface changes that extend without breaking. Ready for PR

Comments

@aboba
Collaborator

aboba commented Feb 12, 2020

The VideoEncodeLayer dictionary is using the approach to SVC used in the ORTC API. Unfortunately, this approach cannot support non-hierarchical scalability modes such as K-SVC. That is why we took a different approach in WebRTC-SVC. Can we instead use an approach based on scalabilityMode?

@sandersdan
Contributor

A few beginner questions I have about SVC:

  • What features require specific decoder support? (My understanding is that a constrained baseline H.264 decoder should be able to decode with a temporal layer stripped out, but that other modes require decoder support.)
  • How can sites query for support? Can we simplify the query to a small set of attributes the encoder/decoder has, or do we need the full SVC configuration?
  • Is there an existing compatibility matrix for hardware encoder/decoder support that I can reference? (Primarily for Android, but also Windows would be useful.)

@chcunningham
Collaborator

How can sites query for support? Can we simplify the query to a small set of attributes the encoder/decoder has, or do we need the full SVC configuration?

For H.264, rfc6381 described "svc" as a codec string prefix similar to "avc". I have no idea if this has been used in practice. It would sure be nice if the svc prefix + profile bytes described sets of svc features that must be supported.

For other codecs (VP9, AV1), I'm even less familiar. @marco99zz @jzern

@jzern

jzern commented Feb 15, 2020

@FrankGalligan

Frank, I don't think we defined any codec strings for svc.

@chcunningham
Collaborator

chcunningham commented Feb 15, 2020

Frank, I don't think we defined any codec strings for svc.

@sandersdan found that, for H264, they seem to have defined SVC profiles, so that might suggest a path for vpx/av1 (assuming it's not deemed integral to some existing profiles - could also be fine with that).

@FrankGalligan

FrankGalligan commented Feb 15, 2020 via email

@aboba
Collaborator Author

aboba commented Feb 15, 2020

@sandersdan

  1. What features require specific decoder support?

[BA] The model in WebRTC-SVC is for the decoder and encoder to advertise the scalability modes they support for each codec. For VP8, VP9 and AV1, decoder support for SVC is mandatory, so by default a decoder is assumed to be able to decode any SVC mode and there is no need for a scalabilityMode setting on the decoder. However, encoders for these codecs may support only a subset of scalabilityMode values, so these capabilities can be discovered and the desired scalabilityMode can be set for the encoder.

The situation for H.264/SVC is more complex (see below), but the only browser supporting H.264/SVC was Edge Spartan (current Edge, Chrome, Firefox and Safari only support H.264/AVC), so it's not clear how much we need to worry about H.264/SVC support.

  2. My understanding is that a constrained baseline H.264 decoder should be able to decode with a temporal layer stripped out, but that other modes require decoder support.

[BA] The H.264 encoder and decoder in Chromium apparently have some support for temporal scalability, so at one point there was a (rejected) PR to use the framemarking RTP header extension to supply the TID that is not present in H.264/AVC bitstreams. This PR and the H.264/SVC implementation that shipped in Edge Spartan were the high points of H.264/SVC support in WebRTC. With AV1 on the horizon, I'm not sure how much demand there is for supporting H.264/SVC in WebCodecs.

Note that H.264/SVC (unlike VP8 and VP9 SVC) has a history of interoperability problems. Initial H.264/SVC implementations by teleconferencing vendors could not interoperate due to differences in their bitstream and RTP implementations. The bitstream interop problem was addressed via the IMTC H.264/SVC bitstream profile, which established mandatory and optional scalability modes for H.264/SVC.

  3. How can sites query for support? Can we simplify the query to a small set of attributes the encoder/decoder has, or do we need the full SVC configuration?

[BA] In WebRTC-SVC, encoders and decoders advertise the supported scalabilityMode values.

  4. Is there an existing compatibility matrix for hardware encoder/decoder support that I can reference? (Primarily for Android, but also Windows would be useful.)

[BA] Any compliant VP8 or VP9 decoder can decode SVC. Chromium-based browsers support capability advertisement and configuration for SVC encoders via WebRTC-SVC. Currently, Chromium advertises support for the L1T2 and L1T3 scalabilityMode values, but I'm not sure whether these modes are supported on all platforms. In general, hardware encoders have not had very good support for SVC (particularly more exotic modes like spatial scalability or K-SVC).

@aboba
Collaborator Author

aboba commented Feb 15, 2020

@chcunningham

"might suggest a path for vpx/av1 (assuming it's not deemed integral to some existing profiles - could also be fine with that)."

[BA] The AV1 bitstream specification defines scalability modes in Section 6.7.5, and we borrowed the terminology in WebRTC-SVC. Currently, WebRTC-SVC re-uses the scalability modes defined in the AV1 bitstream specification.
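
To make the naming concrete, here is a rough sketch of how a mode name such as L1T3 or L3T3_KEY_SHIFT could be split into its parts. The `ParsedScalabilityMode` type, the `parseScalabilityMode` helper and the regular expression are hypothetical names invented for this illustration. The grammar is inferred from the mode names themselves, so it is more permissive than the actual list of modes and should not be treated as a validator.

```typescript
// Hypothetical parsed form of a scalabilityMode name (not a type from any spec).
interface ParsedScalabilityMode {
  // true for "S" names (simulcast-style), false for "L" names
  simulcast: boolean;
  spatialLayers: number;
  temporalLayers: number;
  // trailing "h": assumed here to mark the 1.5:1 spatial ratio variants
  reducedRatio: boolean;
  // non-hierarchical variants mentioned in this thread
  keyMode: "none" | "KEY" | "KEY_SHIFT";
}

// Assumed grammar: L or S, spatial count, "T", temporal count,
// optional "h", optional "_KEY" or "_KEY_SHIFT".
const MODE_PATTERN = /^([LS])(\d)T(\d)(h?)(_KEY(?:_SHIFT)?)?$/;

function parseScalabilityMode(mode: string): ParsedScalabilityMode | null {
  const match = MODE_PATTERN.exec(mode);
  if (match === null) {
    return null; // not a name this sketch understands
  }
  const suffix = match[5];
  return {
    simulcast: match[1] === "S",
    spatialLayers: Number(match[2]),
    temporalLayers: Number(match[3]),
    reducedRatio: match[4] === "h",
    keyMode:
      suffix === "_KEY_SHIFT" ? "KEY_SHIFT" : suffix === "_KEY" ? "KEY" : "none",
  };
}
```

With this sketch, "L1T3" parses as one spatial layer with three temporal layers, and a string that does not follow the pattern yields `null`.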

"For H.264, rfc6381 described "svc" as a codec string prefix similar to "avc". I have no idea if this has been used in practice."

[BA] In SDP, H.264/SVC (RFC 6190) is treated as a distinct codec from H.264/AVC (RFC 6184). This simplifies things somewhat.

"It would sure be nice if the svc prefix + profile bytes described sets of svc features that must be supported."

[BA] The IMTC H.264/SVC bitstream profile attempts to improve H.264/SVC interoperability by mandating that encoders and decoders support a subset of scalability modes (e.g. L1T2, L1T3, etc.). Because an H.264/SVC decoder may not be able to decode any scalability mode an H.264/SVC encoder can send, for H.264/SVC both the encoder and decoder need to advertise their supported scalability modes.

"For other codecs (VP8, VP9, AV1), I'm even less familiar."

[BA] Since VP8/VP9/AV1 decoders are required to be able to decode anything an encoder can send, there typically isn't a need to discover or configure decoder SVC capabilities, just to discover what scalabilityMode values an encoder supports and to configure an encoder to encode using a desired scalabilityMode.
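
The difference between the two cases can be sketched as a small helper. `usableModes` is a hypothetical name, and treating an absent decoder list as "can decode anything" is this sketch's way of modeling the VP8/VP9/AV1 assumption described above, not behavior defined by any API.

```typescript
// Returns the scalability modes a sender could use with a given receiver.
// decoderModes === undefined models VP8/VP9/AV1, where the decoder is assumed
// to handle any mode. A list models H.264/SVC, where both sides advertise.
function usableModes(
  encoderModes: readonly string[],
  decoderModes?: readonly string[],
): string[] {
  if (decoderModes === undefined) {
    // Only the encoder limits the choice.
    return encoderModes.slice();
  }
  // Both sides limit the choice: keep the modes that appear in both lists.
  return encoderModes.filter((mode) => decoderModes.indexOf(mode) !== -1);
}
```

For example, an encoder offering L1T2, L1T3 and L2T2_KEY paired with an H.264/SVC decoder that advertises only L1T2 and L1T3 leaves just those two modes.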

@sandersdan
Contributor

My initial impression is that capability detection is simpler to implement with the WebRTC-SVC approach than with the ORTC approach. In general I'd like to follow the WebRTC APIs where possible for consistency.

It does seem like there are a lot of modes, and I'm not sure what happens when AV2 comes out and supports an order of magnitude more things. I am in general not a fan of query APIs that list every supported feature, because it often turns out to be expensive to query platform decoders.

Is there already a mapping of this for MediaCapabilities? If so, do clients list the modes they are interested in as part of the query?

@chcunningham
Collaborator

chcunningham commented Mar 24, 2020

Is there already a mapping of this for MediaCapabilities? If so, do clients list the modes they are interested in as part of the query?

No, no facility for MC exists at this time. If I understand @aboba correctly, none is needed for vpx/av1.

[BA] Since VP8/VP9/AV1 decoders are required to be able to decode anything an encoder can send, there typically isn't a need to discover or configure decoder SVC capabilities, just to discover what scalabilityMode values an encoder supports and to configure an encoder to encode using a desired scalabilityMode.

Where can I read about this requirement? I'd like to be sure it's airtight.

@aboba
Collaborator Author

aboba commented Mar 25, 2020

@sandersdan The ORTC approach was not able to configure or advertise support for non-hierarchical modes like the KEY and KEY_SHIFT modes that are supported in AV1. It also was more complex for developers to correctly configure, as compared with the WebRTC-SVC approach.

The only downside of WebRTC-SVC as compared with ORTC was that ORTC supported arbitrary resolution ratios for spatial scalability as well as the ability to enable/disable individual layers. These features were also available in WebRTC 1.0's simulcast API (via sendEncodings), so developers may come to expect them. As discussed here, there have been some ideas about how to add some of that functionality back to WebRTC-SVC, at the cost of some increased complexity.

@aboba
Collaborator Author

aboba commented Mar 25, 2020

@chcunningham While there isn't a need to describe the scalability modes that VP8/VP9/AV1 decoders can decode (because a compliant decoder should be able to decode any mode), there is a need to describe the scalability modes that an encoder can encode. For example, in Chrome when you enable the "experimental web features" flag, you will see that calling RTCRtpSender.getCapabilities('video') indicates support for L1T2 and L1T3 with the VP8 and VP9 codecs, but not spatial scalability modes or the KEY or KEY_SHIFT modes.
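
A minimal sketch of reading those values out of a capabilities object follows. The `CodecCapabilityLike` and `CapabilitiesLike` interfaces and the `scalabilityModesByCodec` helper are hypothetical; they only mirror the part of the result that matters here, on the assumption that each codec entry may carry a `scalabilityModes` array as WebRTC-SVC describes.

```typescript
// Structural stand-ins for the relevant part of the capabilities result.
interface CodecCapabilityLike {
  mimeType: string;
  scalabilityModes?: string[]; // assumed optional: absent when not advertised
}

interface CapabilitiesLike {
  codecs: CodecCapabilityLike[];
}

// Groups advertised scalability modes by mimeType. The same mimeType can be
// listed more than once (for example with different profiles), so entries
// are merged and duplicates dropped.
function scalabilityModesByCodec(
  caps: CapabilitiesLike | null,
): Record<string, string[]> {
  const result: Record<string, string[]> = {};
  if (caps === null) {
    return result;
  }
  for (const codec of caps.codecs) {
    const merged = result[codec.mimeType] || [];
    for (const mode of codec.scalabilityModes || []) {
      if (merged.indexOf(mode) === -1) {
        merged.push(mode);
      }
    }
    result[codec.mimeType] = merged;
  }
  return result;
}
```

In a browser, the input would be the value returned by RTCRtpSender.getCapabilities('video'); depending on the TypeScript DOM typings in use, a cast may be needed because `scalabilityModes` may not be declared there.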

@jimbankoski
Contributor

As per discussions, a couple of things we'd like to see:

  • per layer level bitrate controls.
  • the ability to set min and max quantizer on a per level basis.
  • the ability to change the type of scalability on the fly.

@aboba
Collaborator Author

aboba commented Apr 15, 2020

@jimbankoski Some of these needs (and proposals for addressing them) have come up in the WebRTC-SVC specification:
w3c/webrtc-svc#14

@jimbankoski
Contributor

jimbankoski commented Apr 15, 2020

@aboba Thanks for the pointer and agreed it seems to be covered there!

@chcunningham
Collaborator

Didn't make Q4, but I intend to work on it this quarter.

@aboba
Collaborator Author

aboba commented Mar 31, 2021

Related: #9 #25 #85

@chcunningham
Collaborator

Triage note: marked 'breaking' because the PR that resolves this makes a breaking change to the encoder output callback in order to emit metadata.

@chcunningham chcunningham added the breaking Interface changes that would break current usage (producing errors or undesired behavior). label May 12, 2021
@chcunningham
Collaborator

With #187 merged, the breaking change to the encoder output callback is now resolved. We still intend to add additional descriptions of SVC metadata (frame dependencies, spatial layer id, ...), but this is done by extending the metadata dictionary with new keys for those.
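
As an illustration of how that metadata might be consumed, the sketch below decides whether an encoded chunk should be forwarded to a receiver that only wants the lower temporal layers. It assumes the metadata passed to the encoder output callback has the shape `{ svc: { temporalLayerId } }`; the interfaces and the `shouldForward` helper are hypothetical, and treating missing metadata as the base layer is a choice made for this sketch.

```typescript
// Structural stand-ins for the SVC part of the encoder output metadata.
interface SvcMetadataLike {
  temporalLayerId?: number;
}

interface ChunkMetadataLike {
  svc?: SvcMetadataLike;
}

// Returns true when a chunk belongs to a temporal layer at or below the
// requested maximum. Chunks without SVC metadata are treated as layer 0
// (an assumption of this sketch) so they are never dropped.
function shouldForward(
  metadata: ChunkMetadataLike | undefined,
  maxTemporalLayerId: number,
): boolean {
  const temporalLayerId = metadata?.svc?.temporalLayerId ?? 0;
  return temporalLayerId <= maxTemporalLayerId;
}
```

With an L1T3 stream, passing 1 as the maximum would forward layers 0 and 1 and drop layer 2.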

@chcunningham chcunningham added extension Interface changes that extend without breaking. and removed breaking Interface changes that would break current usage (producing errors or undesired behavior). labels May 21, 2021
@juhani-honkala

Has there been any progress on this? Is there a browser support matrix for the allowed SVC modes, along with some working examples of how to actually use them, or is this all still WIP?

@aboba
Collaborator Author

aboba commented Sep 23, 2022

Encoding of temporal scalability (L1T2, L1T3) is supported for VP8, VP9, H.264/AVC and AV1. A live demo is here.

GitHub repo: https://github.com/aboba/wc-demo
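
For readers who want a starting point before opening the demo, here is a small sketch that builds one encoder configuration per codec family with a temporal scalability mode. It is not taken from the demo repo. The `TemporalSvcConfig` type, the `temporalSvcConfigs` helper and the specific codec strings are illustrative assumptions, and real applications would pick profiles and levels to suit their content.

```typescript
// Minimal subset of an encoder configuration used by this sketch.
interface TemporalSvcConfig {
  codec: string;
  width: number;
  height: number;
  scalabilityMode: string;
}

// Example codec strings for VP8, VP9, H.264/AVC and AV1 (illustrative picks).
const EXAMPLE_CODECS: readonly string[] = [
  "vp8",
  "vp09.00.10.08",
  "avc1.42001f",
  "av01.0.04M.08",
];

// Builds one candidate configuration per example codec, all requesting the
// same temporal scalability mode (L1T3 unless another mode is passed).
function temporalSvcConfigs(
  width: number,
  height: number,
  scalabilityMode: string = "L1T3",
): TemporalSvcConfig[] {
  return EXAMPLE_CODECS.map((codec) => ({
    codec,
    width,
    height,
    scalabilityMode,
  }));
}
```

In a browser with WebCodecs, each candidate can be passed to VideoEncoder.isConfigSupported() and the `supported` flag of the resolved result checked before calling configure(), since support may differ by platform.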

@aboba
Collaborator Author

aboba commented Mar 16, 2023

@dalecurtis Can we close this?

@dalecurtis
Contributor

Basic support is now specified.


8 participants