-
Notifications
You must be signed in to change notification settings - Fork 27
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add description of an API for controlling SDP codec negotiation #186
base: main
Are you sure you want to change the base?
Conversation
index.bs
Outdated
#### AddSendCodecCapability #### {#add-send-codec-capability} | ||
|
||
1. Let 'capability' be the "capability" argument | ||
1. If 'capability' is a known codec, throw an IllegalModification error. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
how do you define "known"? I would suggest forbidding the same mimeType so you can't have collisions between e.g. H264 and a newly added H264 with a different profile-level. But you seem to allow same-mimeType-different-fmtp in the explainer?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What test is used to determine whether a capability is 'known"? Do we check against what would be returned by RTCRtpSender.getCapabilities()
, Media Capabilities or WebCodecs isSupported()
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The purpose of throwing here is so that you can't ask to duplicate existing codecs. If you could duplicate, how could the other side know which one to choose for sending?
So it's important on the receive side, but exists on the send side mostly for symmetry.
I think "known" should be "a codec with exactly the same name and fmtp parameters appears in the list that is generated by default" - adding special codecs for special H.264 parameters is probably a valid use case.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Trying to edit this in, I find that clock rates and channels are involved too. There can be 2 codecs that differ only in clock rate (CN codecs in particular), so I think we should throw in the kitchen sink and say that "equal if all the fields compare equal".
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
from discussion elsewhere: rtcp-fb might need consideration too (and *
does not cut it sadly)
index.bs
Outdated
#### AddSendCodecCapability #### {#add-send-codec-capability} | ||
|
||
1. Let 'capability' be the "capability" argument | ||
1. If 'capability' is a known codec, throw an IllegalModification error. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What test is used to determine whether a capability is 'known"? Do we check against what would be returned by RTCRtpSender.getCapabilities()
, Media Capabilities or WebCodecs isSupported()
?
index.bs
Outdated
|
||
1. Let 'capability' be the "capability" argument | ||
1. If 'capability' is a known codec, throw an IllegalModification error. | ||
1. If 'packetizationMode' is defined, and is not a known codec, throw a SyntaxError DOMException. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
How would one specify they want to use a generic/codec agnostic packetizer/depacketizer?
Currently, the transform must preserve whatever bits the packetizers happen to read from (example libwebrtc reference) - however, if the RTCEncodedVideoFrameMetadata is correctly filled, the packetization process could use this instead, and not need to read any bits from the frame.
Is that something we need to distinguish between - the
"I am not modifying any necessary bits in the header, go ahead and use VP8/AV1/etc. packetizer"
vs the
"I am transforming the entire frame, don't read from it, but the frame metadata is there"
cases?
Should we even support both, or could we only allow generic RTCEncodedVideoFrameMetadata packetization in the first place?
also very minor nit: the name is slightly confusing due to the matching H264 parameter 'packetization-mode'
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Once that generic / codec agnostic packetizer/depacketizer has a name, we can use it.
Once a spec exists, it's likely to have a name in it.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Added some more language about packetizers while moving to naming them explicitly. See if this conversation can be resolved, or whether comments should go elsewhere.
This issue had an associated resolution in WebRTC June 2023 meeting – 27 June 2023 (Encoded Transform Codec Negotiation):
|
index.bs
Outdated
<pre class="idl"> | ||
// New methods for RTCPeerConnection | ||
partial interface RTCPeerConnection { | ||
undefined AddSendCodecCapability(DOMString kind, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
kind
seems redundant if we throw on !mimeType.startsWith("video/") && !mimeType.startsWith("audio/")
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Do I understand correctly that once e.g this is called
const acmeCodec = {
mimeType: "video/acme-encrypted",
clockRate: 90000,
sdpFmtpLine: "encapsulated-codec=vp8",
packetizationMode: "video/vp8",
};
pc.addSenderCodecCapability(acmeCodec);
...then every negotiation from here on will offer sending of this new codec on ALL transceivers until the pc is deleted? And in first place?
This seems more limiting than e.g. setCodecPreferences
which allows negotiation of codecs per transceiver.
In fact, I was wondering, why can't we just reuse setCodecPreferences here instead?
transceiver.setCodecPreferences([acmeCodec]);
I.e. have this be how JS both adds new codecs and specifies their position among existing ones?
We already accepts dictionaries there, but have prose to throw on unknown codecs. But if we're about to allow JS to make up codecs anyway, then that limitation seems superfluous suddenly.
We perhaps adopt a different limitation then, which seems to be that setCodecPreferences sets the same codec preference for both send and receive, do I have that right? But maybe that's something we should solve anyway, or perhaps it's not a problem in practice since JS can just create one-way transceivers?
This might also sidestep input validation question, since we could throw on partially unique codecs (since we're expecting either known codecs or new codecs that can't be mistaken for known ones).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
setCodecPreferences() has the limitation that it doesn't distinguish between capacity-to-send and capacity-to-receive.
It might actually make sense to add a direction field to RTCRtpCodecCapability - possible values "send", "receive" and "sendrecv". This could be useful for the receive-only codecs that are currently a design problem (HEVC is the biggest name in town on that front currently).
There are two derived dictionaries from RTCRtpCodec: RTCRtpCodecCapability (adds nothing) and RTCRtpCodecParameters (adds payloadType). Should the direction field be in both?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
One argument against using SetCodecPreferences is that it will turn unintended changes to the capabilities (on which we currently throw) into definitions for new codecs. Is that a Bad Thing?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In reply to "All negotiations will offer..." - only if unmodified by SetCodecPreferences.
SetCodecPreferences should allow for selecting the additional codecs to offer / answer with as well as the "baseline" codecs as desired by the user.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Overall I have some questions of scope here, and I'd like to understand the use cases beyond adding SDP negotiation to JS-e2ee.
I also have some concerns about the number of nuts and bolts and application would need to keep straight here, e.g. having to modify frame data.
Also in the spec, we have the RTCRtpScriptTransform interface. I wonder if it would simplify things if we put some options on that interface instead? (but perhaps not since an app may switch transform on the fly AFAICT. @youennf is that right?
index.bs
Outdated
<pre class="idl"> | ||
// New methods for RTCPeerConnection | ||
partial interface RTCPeerConnection { | ||
undefined AddSendCodecCapability(DOMString kind, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Do I understand correctly that once e.g this is called
const acmeCodec = {
mimeType: "video/acme-encrypted",
clockRate: 90000,
sdpFmtpLine: "encapsulated-codec=vp8",
packetizationMode: "video/vp8",
};
pc.addSenderCodecCapability(acmeCodec);
...then every negotiation from here on will offer sending of this new codec on ALL transceivers until the pc is deleted? And in first place?
This seems more limiting than e.g. setCodecPreferences
which allows negotiation of codecs per transceiver.
In fact, I was wondering, why can't we just reuse setCodecPreferences here instead?
transceiver.setCodecPreferences([acmeCodec]);
I.e. have this be how JS both adds new codecs and specifies their position among existing ones?
We already accepts dictionaries there, but have prose to throw on unknown codecs. But if we're about to allow JS to make up codecs anyway, then that limitation seems superfluous suddenly.
We perhaps adopt a different limitation then, which seems to be that setCodecPreferences sets the same codec preference for both send and receive, do I have that right? But maybe that's something we should solve anyway, or perhaps it's not a problem in practice since JS can just create one-way transceivers?
This might also sidestep input validation question, since we could throw on partially unique codecs (since we're expecting either known codecs or new codecs that can't be mistaken for known ones).
index.bs
Outdated
partial interface RTCRtpSender { | ||
attribute RTCRtpTransform? transform; | ||
undefined setEncodingCodec(RTCRtpCodecParameters parameters); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This seems redundant now that we have set codec in setParameters. Mentioned in the slides.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, this will be removed, and replaced with a pointer to "set codec in setParameters" where appropriate.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm leaving this in with a note for now, when we're separating the encoding and packetization steps, I want to make sure we're controlling both appropriately.
index.bs
Outdated
1. The {{RTCPeerConnection/[[CustomSendCodecs]]}} is included in the list of codecs which the user agent | ||
is currently capable of sending; the {{RTCPeerConnection/[[CustomReceiveCodecs]]}} is included in the list of codecs which the user agent is currently prepared to receive, as referenced in "set a session description" steps. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Again I wonder if these custom codecs could instead just live in transceiver.[[PreferredCodecs]]
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Step 8 in setCodecPreferences() filters against codecCapabilities (the static).
If we modify that, it would be possible - but would we entirely stop filtering, then?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Apologies for the spam - I only thought to group these into a review late in the day.
This addresses comments on index.bs - I'll address comments on the explainer in a separate review.
index.bs
Outdated
1. The {{RTCPeerConnection/[[CustomSendCodecs]]}} is included in the list of codecs which the user agent | ||
is currently capable of sending; the {{RTCPeerConnection/[[CustomReceiveCodecs]]}} is included in the list of codecs which the user agent is currently prepared to receive, as referenced in "set a session description" steps. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Step 8 in setCodecPreferences() filters against codecCapabilities (the static).
If we modify that, it would be possible - but would we entirely stop filtering, then?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Explainer updated. PTAL!
Modified according to discussions at TPAC according to solution presented there to #199 PTAL. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Fix bikeshed markup
(the remaining CI error is pending merge of #208) |
I have read this proposal, and I can't quite understand how it is suppose to work. AFAICT the goal is to allow negotiation of custom codecs, but we still only allow standardized packetizers to be (ab)used. This part of the spec:
seems extremely complex. The user needs to know at the lowest possible level of how the packetizers work, know which bits specifically they are allowed to touch or not touch for each codec, and AFAICT it would also be borderline impossible to debug if a packetizer decided to drop the frame for you. Would something like this work instead?
When Now when a sender and receiver needs to negotiate a codec they can do string matching on the codec names, and assuming they are aware of how the bits are being transformed, then they know everything about the codec and the format. |
to @Philipel-WebRTC - I don't understand your proposal.
There is no standard at the moment for a raw packetization format. So we can't refer to that format from a standard. I thought about using a DOMString for "setPacketizer" rather than a codec description, and allowing implementations to put non-specified values in there. But I didn't find any particular advantage in doing so. |
It could be that I fundamentally misunderstand the goal of this proposal, but AFAICT the goal is to enable custom codecs to be negotiated via SDP, or am I wrong? As soon as we negotiate a non-standardized codec name then we don't have any standardized packetizer to rely on, and I get that the proposed API would allow the sender and receiver to map the payload type of a custom codec to some other know (de)packetizer, but to me that seems very limiting and complex. The way it is proposed right now, would we expect anyone to every use a packetizer other than VP9 (the only one that separates descriptor and bitstream data cleanly in the payload) for a custom codec? Again, this to me looks a bit more like the abuse side of things than a general solution (assuming I understand the goal of this PR). |
At least one external WebRTC user has made this almost work by using the H.264 packetizer - he was advised that he would have an easier time if he appended the custom data as SEI (don't know the proper term for those). So no, VP9 is not the only option, but yes, VP9 is probably the best standard one at the moment. It's always possible for implementations to define their own packetizers, but that's not useful for interoperability. |
Actually, registering new codecs in the vnd. tree with IANA is a possibility that would allow a well defined way of using nonstandard packetizers. |
@aboba you have a "request changes" marker. Can you re-review? |
Following on yesterday's editor meeting, here are some thoughts related to this proposal. AIUI, we seem to get consensus on adding support to:
For 1, we assume the UA implements the RTP payload format. For 2, we could use payload types.
On sender side, the application would set it to We could also use that API internally to further define the interaction of SFrameTransform with the RTP layers. |
Solving the SDP negotiation problem for built-in Sframe is not sufficient. We have to solve it for script transforms. The proposal in #186 (comment) will not allow us to affect SDP negotiation for script transforms; SDP negotiation is only affected by things that happen before offer/answer time - there are no frames before offer/answer. |
Do you mean any other types here, or just SFrame? Do you have an example?
What would that look like? Would it multiply the payload types in
Do you mean a RTCRtpScriptTransform here? What would that look like?
What's the advantage or use case for choosing this per frame? Apps can already specify SFrame ahead of negotiation, so doesn't the UA have all it needs to negotiate sframe? const {sender} = pc.addTransceiver(track);
sender.transform = new SFrameTransform({role: "encrypt"});
// negotiate
const {codecs} = sender.getParameters();
if (!codecs.find(findSFrame)) sender.transform = null; |
You heard the SCIP (NATO) format being mentioned on the call. That's one.
It would look like the API described in https://developer.mozilla.org/en-US/docs/Web/API/RTCRtpTransceiver/setCodecPreferences
From the explainer:
I believe Youenn described this case in details on the call; I added this FAQ question specifically to address his use case. From the explainer:
I particularly mentioned on the call (and have said in every presentation that I've given on it) that I want an interface that is able to support the existing, deployed, Javascript-based sframe format, which is not the same as the sframe transform that is supposed to use the (still not final) IETF SFrame format draft, or any other transform. |
Fair question.
The use case I know is enable/disable SFrame dynamically. Otherwise, we just want to stick to whatever packetization is associated to the encoder media content and we can already change the media content via This change can be done either at the frame level or at the transform level (similarly to It is interesting to look at receiver side though, in case a UA implements the SFrame packetization format and processing happens in a script transform. The web application might want to know:
I would tend to expose the same API receiver and sender side, hence why I would tend to go with frame level.
That is probably where we have a disconnect.
This For this |
Specific interfaces that can only be used for the SFrame transform should go into the SFrame transform specification, not here. setCodecPreferences chooses the preference by the receiver about what the sender should send, it does not enforce either what the sender sends or what the receiver will have to handle. setDepacketizer(PT, depacketizer) is an API that addresses the use case on the receiver side.
We don't agree here. There is no generic metadata mechanism, and a number of codecs lack functionality for adding metadata.
I don't agree here. I think we want to allow for experimentation of this kind, we should actively ensure that we make it possible to try out new codecs without changing the browser.
Doing this without SDP handling would make the current situation, where people send content that does not conform to the SDP describing what they claim to be sending, strictly worse. I would strongly oppose adding such an API to the PeerConnection family of APIs without solving the SDP issue first.
If we have to handle SDP negotiation and RTP packetization for this use case, and we know that we have other use cases we need to handle it for, why not accept the current proposal into the WD? |
I agree this suggests an API mismatch. I see evidence of this in the Lyra use case as well, which had to:
It's an impressive feat, but I imagine these problems would be amplified with video. It might need its own API.
This makes sense to me as well. E.g. spitballing: sender.encoder = new RTCRtpScriptEncoder(worker, options);
sender.transform = new SFrameTransform({role: "encrypt"}); ? |
My preferential API for chained transforms is readable.pipeThrough(transform1).pipeThrough(transform2).pipeTo(writable). It is possible to write chain links that permit this syntax when using RTCRtpScriptTransformer. |
My point was really that if we want to add more attributes to addTransceiver(), we should not add more arguments. |
I've filed #221. My point here was merely to highlight that a transceiver is primarily the sender and receiver. |
I agree, so we should stop adding new methods to RtpTransceiver if we can avoid it. I favor SDP-agnostic options on Is it SDP-specific for JS to declare the input/output codec of the transform it performs? — Seems useful to the worker regardless of SDP. E.g. based on the transform-based part of what @alvestrand and I have been working on: transceiver.sender.transform = new RTCRtpScriptTransform(worker, {
inputCodecs: [{mimeType: "video/vp8"}],
outputCodecs: [{mimeType: "video/custom-encrypted"}]
});
transceiver.receiver.transform = new RTCRtpScriptTransform(worker, {
inputCodecs: [{mimeType: "video/custom-encrypted"}],
outputCodecs: [{mimeType: "video/vp8"}]
}); This nicely communicates the task to the (reused) worker (removing the need to invent a |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@alvestrand Here's the transform example you asked for. Hope it looks OK and thanks for your patience!
I was able to reuse the same worker function. I didn't touch on acceptOnlyInputCodecs since the JS does that (but it might be a nice optimization).
Separate the two proposals better Co-authored-by: Jan-Ivar Bruaroey <jan-ivar@users.noreply.github.com>
Separate the two proposals better Co-authored-by: Jan-Ivar Bruaroey <jan-ivar@users.noreply.github.com>
After writing up some considerations for how codecs are described in w3c/webrtc-pc#2925 I am starting to feel that the transform proposal fits better - but it would be described as modifying the codec list when a transform is added. Comments? |
Co-authored-by: Jan-Ivar Bruaroey <jan-ivar@users.noreply.github.com>
Co-authored-by: Jan-Ivar Bruaroey <jan-ivar@users.noreply.github.com>
Co-authored-by: Jan-Ivar Bruaroey <jan-ivar@users.noreply.github.com>
Co-authored-by: Jan-Ivar Bruaroey <jan-ivar@users.noreply.github.com>
Co-authored-by: Philipp Hancke <fippo@goodadvice.pages.de>
This issue was mentioned in WEBRTCWG-2024-02-20 (Page 14) |
@jan-ivar this is incomplete, but does it so far conform with what you think we've agreed on? |
I think this is ready for an editors review now - I believe the spec changes are in a state where they are good to merge, and reflect the API used in the explainer. Please take a look. |
This issue had an associated resolution in WebRTC April 23 2024 meeting – 23 April 2024 (Custom Codecs):
|
sequence<RTCRtpCodec> inputCodecs; | ||
sequence<RTCRtpCodec> outputCodecs; | ||
boolean acceptOnlyInputCodecs = false; | ||
}; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
there are specs that take an ANY and transform them to a dictionary. @jan-ivar will find an example.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
From streams:
- "Let underlyingSourceDict be underlyingSource, converted to an IDL value of type UnderlyingSource."
In that spec, underlyingSource is an object, vs ours which is any
, so we'd probably want to check against object
before doing this to avoid a TypeError here. (more below)
Aim to set editors can integrate on next Thursday (May 2, 2024) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sorry for the delay in reviewing this. Some questions still around how this all fits together with custom codecs, sframe etc.
sequence<RTCRtpCodec> inputCodecs; | ||
sequence<RTCRtpCodec> outputCodecs; | ||
boolean acceptOnlyInputCodecs = false; | ||
}; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
From streams:
- "Let underlyingSourceDict be underlyingSource, converted to an IDL value of type UnderlyingSource."
In that spec, underlyingSource is an object, vs ours which is any
, so we'd probably want to check against object
before doing this to avoid a TypeError here. (more below)
@@ -130,6 +130,7 @@ The <dfn abstract-op>readEncodedData</dfn> algorithm is given a |rtcObject| as p | |||
1. Let |frame| be the newly produced frame. | |||
1. Set |frame|.`[[owner]]` to |rtcObject|. | |||
1. Set |frame|.`[[counter]]` to |rtcObject|.`[[lastEnqueuedFrameCounter]]`. | |||
1. If |rtcObject|'s |transform|'s |options| contains {{RTCRtpScriptTransformParameters/acceptOnlyInputCodecs}} with the value True, and the {{RTCEncodedVideoFrameMetadata/mimeType}} of |frame| does not match any codec in |transform|'s |options|' |inputCodecs|, execute the [$writeEncodedData$] algorithm given |rtcObject| and |frame|. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
First a nit: Maybe set an [[acceptOnlyInputCodecs]]
internal slot to avoid referencing the options
object which gets transferred? This is meant to be an invariant, and would avoid implying cross-thread access.
Also, while reviewing this, I realized the spec is confusing atm wrt what threads the various objects are on (rtcObject
is always on main-thread), so I've filed #229 which should help. I hope to do a PR there soon. — To not block on that, I think we can still merge this with the understanding that #229 might move things in the right place later, if that's OK with everyone.
But on to the substance of this line:
The writeEncodedData is the transformer.writable
's writeAlgorithm, i.e. it's underlying sink. The streams spec normally manages when this is called. This says to invoke that algorithm directly, which I worry might interfere with the streams spec's view of what is happening since it owns when to call its underlying sink. E.g. this might bypass any frames queued on the writable's internal queue that the streams spec maintains.
I'd be more comfortable if we could find a more WebRTC-specific shortcut that doesn't interact with the streams spec here. If I understand correctly, the goal was to bypass the streams machinery here anyway. Otherwise this optimization doesn't seem worthwhile.
Shouldn't we also abort the remaining steps in that case? I.e. don't enqueue the frame on the readable?
@@ -146,7 +147,7 @@ The <dfn abstract-op>writeEncodedData</dfn> algorithm is given a |rtcObject| as | |||
|
|||
On sender side, as part of [$readEncodedData$], frames produced by |rtcObject|'s encoder MUST be enqueued in |rtcObject|.`[[readable]]` in the encoder's output order. | |||
As [$writeEncodedData$] ensures that the transform cannot reorder frames, the encoder's output order is also the order followed by packetizers to generate RTP packets and assign RTP packet sequence numbers. | |||
The packetizer may expect the transformed data to still conform to the original format, e.g. a series of NAL units separated by Annex B start codes. | |||
The packetizer will expect the transformed data to conform to the format indicated by its {{RTCEncodedVideoFrameMetadata/mimeType}}. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think we need to define requirements on the packetizer here, e.g. what MUST happen if it doesn't or can't conform.
<dt> | ||
<dfn dict-member>frameId</dfn> <span class="idlMemberType">unsigned long long</span> | ||
</dt> | ||
<dd> | ||
<p> | ||
An identifier for the encoded frame, monotonically increasing in decode order. Its lower | ||
16 bits match the frame_number of the AV1 Dependency Descriptor Header Extension defined in Appendix A of [[?AV1-RTP-SPEC]], if present. | ||
Only present for received frames if the Dependency Descriptor Header Extension is present. | ||
</p> | ||
</dd> | ||
<dt> | ||
<dfn dict-member>dependencies</dfn> <span class="idlMemberType">sequence<unsigned long long></span> | ||
</dt> | ||
<dd> | ||
<p> | ||
List of frameIds of frames this frame references. | ||
Only present for received frames if the AV1 Dependency Descriptor Header Extension defined in Appendix A of [[?AV1-RTP-SPEC]] is present. | ||
</p> | ||
</dd> |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Maybe a separate PR for this or explain its relationship? I don't get it.
dictionary RTCRtpScriptTransformParameters { | ||
sequence<RTCRtpCodec> inputCodecs; | ||
sequence<RTCRtpCodec> outputCodecs; | ||
boolean acceptOnlyInputCodecs = false; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Bikeshed: With "accept" I worry it's not super clear to what happens to frames that are not "accepted". Especially for E2EE, we'd want to make it super clear that frames NOT accepted are sent through untransformed.
In the explainer you refer to it as a "filter". How about filterOnInputCodecs
or transformInputCodecsOnly
?
Also, I forget: why can't we default this to true or even remove the option to set it to false?
The member "packetizationMode" is added to the {{RTCRtpCodec}} definition; its value is a MIME type. When this is present, the rules for packetizing and depacketizing are those of the named MIME type. For codecs not supported by the platform, this MUST be present. | ||
|
||
<pre class='idl'> | ||
partial dictionary RTCRtpCodec { | ||
DOMString packetizationMode; | ||
}; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Wasn't the idea with having both inputCodecs
and outputCodecs
that we could infer the packetizationMode from the inputCodecs
? e.g. from #186 (comment).
RTCRtpCodec
is used all over the place, including setCodecPreferences, set/getParameters, and the codec match algorithm, and it's not super-clear to me when the user agent is supposed to populate this member or not, or even why it needs to be exposed when the application needs to know this a priori.
The only example uses packetizationMode: video/sframe
which is a bit confusing to me. Sorry if I'm a bit rusty on this.
It's also not used in any algorithm, so it's not clear what its function is.
## Operations ## {#RTCRtpScriptTransform-operations} | ||
|
||
The <dfn constructor for="RTCRtpScriptTransform" lt="RTCRtpScriptTransform(worker, options)"><code>new RTCRtpScriptTransform(|worker|, |options|, |transfer|)</code></dfn> constructor steps are: | ||
1. Set |t1| to an [=identity transform stream=]. | ||
2. Set |t2| to an [=identity transform stream=]. | ||
3. Set |this|.`[[writable]]` to |t1|.`[[writable]]`. | ||
4. Set |this|.`[[readable]]` to |t2|.`[[readable]]`. | ||
4. Set |this|.`[[options]]?` to the result of converting |options| to an RTCRtpScriptTransformParameters dictionary. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Based on comment above, something like:
4. Set |this|.`[[options]]?` to the result of converting |options| to an RTCRtpScriptTransformParameters dictionary. | |
4. Set |this|.`[[options]]` to |options|. | |
5. Let |optionsObject| be |options| if |options| is of type [=object=], otherwise `{}` . | |
6. Set |this|.`[[optionsDict]]` to the result of converting |optionsObject| to an RTCRtpScriptTransformParameters dictionary. |
Fixes #172
Preview | Diff