-
Notifications
You must be signed in to change notification settings - Fork 27
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add presentationTimestamp to RTCEncodedVideoFrameMetadata #173
Conversation
@alvestrand Uploaded as we had discussed. Needs some editorial work to add links in the right places, but the logic is there. PTAL |
Adds a new field captureTimestamp to the RTCEncodedVideoFrameMetadata idl to expose this at the JS level. The field gets its value from |capture_time_identifier| in its corresponding webrtc::VideoFrame. Spec change: w3c/webrtc-encoded-transform#173 Bug: webrtc:14878 Change-Id: If0af3fce6d76c1aa299fa3e0c98be397af970696 Reviewed-on: https://chromium-review.googlesource.com/c/chromium/src/+/4218374 Reviewed-by: Guido Urdaneta <guidou@chromium.org> Commit-Queue: Palak Agarwal <agpalak@chromium.org> Cr-Commit-Position: refs/heads/main@{#1110320}
e5b0a7f
to
867d5b8
Compare
I released after some more testing that what we actually want, to be able to match the WebCodecs VideoFrame.timestamp is to have the presentationTimestamp included with the encodedFrame - this will also match the requestVideoFrameCallback's mediaTime field. Updated this PR to match this. PTAL @alvestrand |
index.bs
Outdated
<p> | ||
The media presentation timestamp (PTS) in microseconds of raw frame, matching the | ||
{{VideoFrame/timestamp}} for raw frames which correspond to this frame and the | ||
{{VideoFrameCallbackMetadata/mediaTime}} given if this frame is decoded and rendererd. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
VideoFrameCallbackMetadata/mediaTime is a double in seconds, maybe we should remove the reference to this one.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
VideoFrameCallbackMetadata/mediaTime is a double in seconds, maybe we should remove the reference to this one.
Do you mean milliseconds? I also found w3c/webcodecs#122 on why WebCodecs uses microseconds for timestamps.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ah, it uses seconds to match mediaElement.currentTime. 🤦
While I think it makes sense for a method on HTMLVideoElement to stay consistent with its attributes, this here is arguably a lower-level API.
To make matters complicated, W3C design principle § 8.3. Use milliseconds for time measurement says... that.
And in w3c/webcodecs#122 they use microseconds, out of concern for audio AV drift.
I see the original PR was for captureTimestamp
and then it was changed to presentationTimestamp
... What's the use case for this value? We may need to consider this to figure out which unit to stay consistent with.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Our model is VideoFrame -> encoder -> encoded chunk -> encoded transform.
Seems best to be consistent with VideoFrame here, and remove the reference to rvfc
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I see the OP was edited and originally used microseconds. Can someone clarify what changed? Sorry if I missed it, I find the github thread here a bit hard to follow.
Normally, an issue is opened first for discussion, which tends to leave a thread that is more easy to follow, then a PR is opened later once discussion solidifies (github's PR workflow tends to hide things once resolved, which suits a review process more than a discussion imho).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for the discussion and links!
Consistency with VideoFrame is indeed the motivating usecase here as Youenn said - so that an app is able to associate a raw frame before encoding with an encoded frame afterwards, when applying transforms on both sides. I'm happy to just remove the reference to rVFC here - would definitely simplify the text.
Apologies for editing the in-flight PR. The intention was always to have a field here which matched VideoFrame.timestamp, but the previous PR #137 that I was asked to adopt seemed to mistake which timestamp this corresponded to. Given noone but Harald had commented here yet, I thought it easier to just modify in place before kicking off the review more widely.
Why is |
The primary motivator is for being able to match with VideoFrame.timestamp, which uses a long long timestamp. I was trying to also pull in the rVFC field, per the discussion towards the end of #137, but that was probably a mistake. Per the comment thread with Jan-Ivar and Youenn, it seems to make sense to define this only in reference with the VideoFrame timestamp - I've pushed a commit removing the ref from the text at least. Does that make sense to you @aboba? |
Thanks, microseconds makes sense now. However, I remain concerned about the name. Consistency with VideoFrame and EncodedVideoChunk would mean calling it " However, in RTCEncodedVideoFrame If that is still our intent, we would appear to be at a crossroads. Is it too late to align this? |
How about us using This moves us into more alignment with the WebCodecs type by having a consistent |
This issue was discussed in WebRTC WG 2023-04-18 – (PR 173 adding presentationTime to RTCEncodedVideoFrameMetadata (Tony)) |
I've pushed a commit implementing the name suggestion from my previous comment - calling this new field just PTAL @aboba, @jan-ivar, @alvestrand et al |
This was discussed during the last Virtual Interim and there was support for it |
SHA: 442e4c2 Reason: push, by henbos Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
The old name was based on a misconception by us when initially implementing, clarified during the spec discussions and landed there in w3c/webrtc-encoded-transform#173. Driveby fix removing the incorrect comment on the value of rtpTimestamp. Note: still guarded by the as yet unlaunched feature RTCEncodedVideoFrameAdditionalMetadata Bug: 1441825 Change-Id: I388b197e93858957a79ed11c43d7a474f80c56a7 Reviewed-on: https://chromium-review.googlesource.com/c/chromium/src/+/5110960 Reviewed-by: Guido Urdaneta <guidou@chromium.org> Auto-Submit: Tony Herre <toprice@chromium.org> Commit-Queue: Guido Urdaneta <guidou@chromium.org> Cr-Commit-Position: refs/heads/main@{#1236889}
Add a captureTimestamp to RTCEncodedVideoFrameMetadata, defined only for local device captures to match the mediaTime as defined in requestVideoFrameCallback() and also match the
timestamp
in WebCodecs VideoFrame and EncodedVideoChunk. This allows apps to match encoded frames after encoding/before decoding with the corresponding raw frames in the mediacapture-transform APIs.Essentially an adoption of the open #137, incorporating the comments there.
Preview | Diff