Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add scaleResolutionDownTo to RTCRtpEncodingParameters #221

Merged
merged 17 commits into from
Sep 12, 2024
75 changes: 73 additions & 2 deletions index.html
Original file line number Diff line number Diff line change
Expand Up @@ -359,12 +359,43 @@ <h3>
additional members to control audio packetization.
</p>
<pre class="idl">partial dictionary RTCRtpEncodingParameters {
unsigned long ptime;
boolean adaptivePtime = false;
RTCResolutionRestriction scaleResolutionDownTo;
unsigned long ptime;
boolean adaptivePtime = false;
};</pre>
<section id="rtcrtpencodingparameters-attributes">
<h2>Dictionary {{RTCRtpEncodingParameters}} Members</h2>
<dl data-link-for="RTCRtpEncodingParameters" data-dfn-for="RTCRtpEncodingParameters" class="dictionary-members">
<dt>
<dfn data-idl>scaleResolutionDownTo</dfn> of type <span class="idlMemberType">{{RTCResolutionRestriction}}</span>
</dt>
<dd>
<p>The maximum dimensions at which to restrict this encoding.</p>
<p>The default "scale resolution down by" factors (e.g. 4:2:1) are not
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

please rephrase without air quotes

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As per other comment we can remove this sentence because we'll say to ignore it later in the text

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The issue with the existing scaleResolutionDownBy values is that they effectively have default values (inserted in prose even though it's not in WebIDL since the defaults are dynamic, e.g. 4,2,1.

Unfortunately, webcompat here is poor since pasting this into web console produces 4,2,1 values in Firefox, 1,1,1, in Safari, and undefined, undefined, undefined in Chrome as defaults for scaleResolutionDownBy:

new RTCPeerConnection().addTransceiver("video", {sendEncodings: [{rid: "a"}, {rid: "b"}]}).sender.getParameters().encodings;

Hopefully this means few websites are actually relying on defaults or they're going to have a hard time.

Long-term we should fix the defaults to be better, but in the short-term we cannot rely on them. Better to ignore them I think

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is crbug 344943229 in Chrome and webkit 279202 in Safari.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done

applied when this parameter is used. Instead, frames are either sent
as-is or they are downscaled by the encoder to uphold the
restrictions. The encoder will never upscale a frame. In simulcast,
henbos marked this conversation as resolved.
Show resolved Hide resolved
not upscaling can result in multiple same-sized encodings if the size
of the frame is smaller than {{scaleResolutionDownTo}}; an encoding is
not sent if {{scaleResolutionDownTo}} is larger than the frame and
there exists another {{RTCRtpEncodingParameters/active}} encoding with
a smaller {{scaleResolutionDownTo}} density resulting in the same
size.</p>
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Does this mean that if we have a 480x360 stream, and there are two encodings:
scaleResolutionDownTo(480x360)
scaleResolutionDownTo(1024x768)
the first one (the "smaller density") is the one that will be sent, no matter which order they occur in?
This makes intuitive sense to me. Be careful of the case where the densities are equal - in that case, the current language says that both will be sent.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Correct! I figured if scaleBy allows any order then scaleTo should too. In practice the libwebrtc bitrate allocation logic probably assumes smaller-to-larger but I'm not aware of anything in the spec that says you have to use this order, so I made this "don't send" logic agnostic towards order too

<p>When setting parameters, if {{scaleResolutionDownTo}} is specified
henbos marked this conversation as resolved.
Show resolved Hide resolved
on any encoding, check that the following is true or else return a
promise [= rejected =] with a newly [= exception/created =]
{{InvalidModificationError}}:</p>
<ul>
<li>
<p>{{scaleResolutionDownTo}} is specified on all encodings and has
values in both dimensions that are greater than 0.</p>
</li>
<li>
<p>{{RTCRtpEncodingParameters/scaleResolutionDownBy}} is not
specified on any encoding.</p>
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Instead of throwing when this is specified, say to ignore it. This is the way to make get+setParameters compatible with the default factors which today are different on different browsers (due to bugs), i.e. undefined on Chrome, 4:2:1 on Firefox and 1:1:1 on Safari

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done

</li>
</ul>
</dd>
<dt>
<dfn data-idl>ptime</dfn> of type <span class="idlMemberType">unsigned long</span>
</dt>
Expand Down Expand Up @@ -403,6 +434,46 @@ <h2>Dictionary {{RTCRtpEncodingParameters}} Members</h2>
</dl>
</section>
</section>
<section id="rtcresolutionrestriction">
<h3>
The {{RTCResolutionRestriction}} dictionary.
</h3>
<pre class="idl">dictionary RTCResolutionRestriction {
unsigned long maxWidth;
unsigned long maxHeight;
};</pre>
<section id="rtcresolutionrestriction-attributes">
<h2>Dictionary {{RTCResolutionRestriction}} Members</h2>
<dl data-link-for="RTCResolutionRestriction" data-dfn-for="RTCResolutionRestriction" class="dictionary-members">
<dt>
<dfn data-idl>maxWidth</dfn> of type <span class="idlMemberType">unsigned long</span>
</dt>
<dd>
<p>The maximum width that frames will be encoded with without getting
downscaled. The restrictions are orientation agnostic, see note below.
henbos marked this conversation as resolved.
Show resolved Hide resolved
When scaling is applied, both dimensions of the frame are downscaled
using the same factor.</p>
</dd>
<dt>
<dfn data-idl>maxHeight</dfn> of type <span class="idlMemberType">unsigned long</span>
</dt>
<dd>
<p>The maximum height that frames will be encoded with without getting
downscaled. The restrictions are orientation agnostic, see note below.
henbos marked this conversation as resolved.
Show resolved Hide resolved
When scaling is applied, both dimensions of the frame are downscaled
using the same factor.</p>
henbos marked this conversation as resolved.
Show resolved Hide resolved
</dd>
<p class="note">
The restrictions being orientation agnostic means that they will
automatically be adjusted to the orientation of the frame being
restricted (portrait mode or landscape mode) by swapping width and
height if necessary. This means that it does not matter if 1280x720 or
720x1280 is specified, both always result in the exact same scaling
factor regardless of the orientation of the frame.
Copy link
Collaborator Author

@henbos henbos Aug 30, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I confirmed that we need to do this and that the C++ implementation already does this.

</p>
</dl>
</section>
</section>
<section id="rtcrtpsender-setparameters-keyframe">
<h3>
{{RTCRtpSender}} {{RTCRtpSender/setParameters()}} extensions for requesting the generation of a key frame.
Expand Down
Loading