
video.requestVideoFrameCallback() #250

Closed
tguilbert-google opened this issue Jan 15, 2020 · 33 comments · Fixed by #311
Labels: position: positive · venue: W3C CG Specifications in W3C Community Groups (e.g., WICG, Privacy CG)

Comments

tguilbert-google commented Jan 15, 2020

Request for Mozilla Position on an Emerging Web Specification

Other information

HTMLVideoElement.requestAnimationFrame() provides a reliable way for web authors to know when a frame has been presented for composition, and to provide metadata for that frame.
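
For illustration, a minimal usage sketch (using requestVideoFrameCallback, the name the API eventually shipped under; this issue was filed while it was still called HTMLVideoElement.requestAnimationFrame):

  // Register a per-frame callback; like window.requestAnimationFrame,
  // callbacks are one-shot and must be re-registered.
  const video = document.querySelector('video');

  function onFrame(now, metadata) {
    // `metadata` describes the frame most recently presented for composition.
    console.log(metadata.mediaTime, metadata.presentedFrames);
    video.requestVideoFrameCallback(onFrame);
  }

  video.requestVideoFrameCallback(onFrame);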

@dbaron dbaron added the venue: W3C CG Specifications in W3C Community Groups (e.g., WICG, Privacy CG) label Jan 16, 2020

dbaron commented Jan 16, 2020

cc @jyavenard @padenot for their opinions.

smfr commented Feb 21, 2020

@tguilbert-google (Author)

This link has the latest information:
https://wicg.github.io/video-raf/

padenot commented Mar 9, 2020

Some assorted comments on this, sorry for the delay:

  • The name needs to be changed; taking the same name for a completely different concept is not a good idea. requestAnimationFrame is a pull model (telling you when one might want to do something, because the compositor is about to composite), while this particular API is push (telling you when something happens and pushing the frame to the compositor). Those are opposite things, and naming them in a similar way is problematic.
  • It's unclear if this API is for:
    • knowing about composition events of video frames at the video frame-rate?
    • triggering texture uploads of particular video frames with a certain
      guarantee in terms of timing?
    • synchronizing clock domains between remote sources, decoders, and screens, by observing the timing of window.requestAnimationFrame events in relation to HTMLVideoElement.requestAnimationFrame events, to be able to choose the best content for the display and/or current situation?
  • In the abstract, past tense is used (has been presented for composition), but in the intro it says which will run when a new frame is presented for composition, and then immediately switches back to the past (most recently presented for composition). It's a bit confusing, because those callbacks run before regular requestAnimationFrame, which allows scheduling changes to the page. Frame submission to canvas seems to have been dropped between the first explainer and https://wicg.github.io/video-raf/; this would have been useful.
  • In relation to the previous point, what happens if the main thread has a high load? The frame indices look like a bolted-on solution on top of the design, and there is no way to signal back-pressure from the compositor back to the decoder. Is this supposed to be handled by the implementation, in its preferred style (pausing/buffering vs. frame dropping, which are the two commonly deployed strategies here), and reflected in the callback metadata?
  • There are additional problems with the naming of the attributes in VideoFrameMetadata. presentation means a lot of different things in this text: it is the time the frame is handed off to the compositor (which one? The OS compositor, the browser compositor?), the time the frame is displayed on the display, and also the time in the media timeline (in a different clock domain, even, with a different type).
  • No worker support: how does it work in a Web Worker with an OffscreenCanvas? There was a mention of being able to do DOM-less operations, but an HTMLVideoElement was passed in the callback? This bit also appears to have been dropped, but it could have been super useful.
  • In addition to the previous point, it is known that major websites can't ensure a steady requestAnimationFrame or anything iso-synchronous, without jitter, and requiring a requestAnimationFrame-style call each time doesn't work well in a scenario like this. This is not a problem if this API is now just for observing video composition events, in the sense that composition will still happen, but the other things that were set to happen in relation to composition events won't happen.
  • How to prevent doing wasteful texture uploads when the video has a higher frame-rate than the display? This seems to be answered via the frame indices?
  • How does it work with adaptive sync (FreeSync/G-Sync/ProMotion/QSync, etc.)? Presumably both requestAnimationFrame variants (this new one and the normal one) start being called at the same rate; does the browser decide based on what is going on in the page?
  • Why have a callback at all, and not a way to get this metadata on the video, if this is now just for observing presentation/composition events?
  • How do all those attributes work when playbackRate is not 1.0? Presumably everything is scaled appropriately.

Also, it feels a bit weird to have capabilities like this that tie frame availability (at the decoding and/or network level) to info about presentation, without a way to handle back-pressure or under-runs. In terms of the extensible web, this would go alongside Web Codecs (but probably wouldn't be needed if Web Codecs existed), and not be on HTMLMediaElement, unless this spec has changed in scope between the request for comments and what is currently in the WICG text.

Generally, having a way to observe the fact that new frames are available on a video element is a good idea, and is necessary for the platform, but a number of initiatives that can solve this are also underway (Web Codecs, Video Editing), and they allow doing more than observing video frame availability, in a way that is useful. It would be best to coordinate so that we don't end up with multiple ways of solving the same problem at different layers. The simplicity of this new API is appealing, but I'm afraid it's a bit ad-hoc, solving one particular problem.

This is non-harmful, but to be more specific: the capability is worth having, but the design needs work and/or clarification. (We're in the process of adjusting our labels so that our position is clearer; in the meantime, I'm being specific here.)

foolip commented Mar 9, 2020

@padenot were the "update the rendering" bits in https://wicg.github.io/video-raf/#video-raf-procedures when you reviewed this? I think that implicitly answers certain questions like "what happens if the main thread has a high load" although it doesn't say anything about back pressure.

padenot commented Mar 9, 2020

Yes, I had indeed written a lot of this against the previous state of the proposal (when it was an explainer in a markdown file), but I updated everything this morning, looking at the newer proposal, where the intent seems to have been a reduction in scope. But maybe not, since this is still before FrameRequestCallback as part of the rendering, which seems to imply a certain impact of this API on the rendering itself, which has its own set of problems. For example, it puts a dependency on main thread scheduling into the compositor, where the source for the composited frame is usually not on the main thread (these days).

The back-pressure is the important bit: it is clear what happens, but it is not clear whether what happens is desirable; the question is mostly rhetorical. This all depends on the purpose of the API (observation of frame compositing, scheduling of frame compositing, or both), and the current text doesn't make this clear; it has a bit of everything.

The problem is that both goals are useful, so I don't know whether this reduction in scope was intentional, incidental, or exists at all: it's possible that the author implicitly relies on the fact that everything will happen on the main thread, as part of the "Update the rendering" algorithm. The issue is that the ordering of video frame painting is not specified there (for good reasons: it's usually asynchronous w.r.t. the main thread).

@tguilbert-google (Author)

Thanks for the thorough comments. I've tried to address them as best I can inline below.

  • The name needs to be changed; taking the same name for a completely different concept is not a good idea. requestAnimationFrame is a pull model (telling you when one might want to do something, because the compositor is about to composite), while this particular API is push (telling you when something happens and pushing the frame to the compositor). Those are opposite things, and naming them in a similar way is problematic.

The original "explainer API" had some push characteristics. I don't think the current API does. It runs like requestAnimationFrame, but only if a new frame has been presented since the callback was registered. The API doesn't interact in any way with the composition path; it only observes composition events.

  • It's unclear if this API is for:

    • knowing about composition events of video frames at the video frame-rate?

Yes.

  • triggering texture uploads of particular video frames with a certain
    guarantee in terms of timing?

It's useful to trigger texture uploads at the video frame-rate. There is timing information, but there are no timing guarantees.
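
To make the texture-upload case concrete, a sketch (assuming an existing WebGL context `gl` and texture `tex`; neither is part of the proposal itself):

  // Upload the video frame to a WebGL texture only when a new frame has
  // actually been presented, instead of once per display refresh.
  function uploadFrame(now, metadata) {
    gl.bindTexture(gl.TEXTURE_2D, tex);
    gl.texImage2D(gl.TEXTURE_2D, 0, gl.RGBA, gl.RGBA, gl.UNSIGNED_BYTE, video);
    video.requestVideoFrameCallback(uploadFrame);
  }
  video.requestVideoFrameCallback(uploadFrame);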

  • synchronizing clock domains between remote sources, decoders, and screens, by observing the timing of window.requestAnimationFrame events in relation to HTMLVideoElement.requestAnimationFrame events, to be able to choose the best content for the display and/or current situation?

I don't fully understand this example, but from what I understand, yes. For example, WebRTC applications could use the metadata from receiveTime, captureTime and elapsedProcessingTime to adjust the content they serve and reduce end-to-end latency.

  • In the abstract, past tense is used (has been presented for composition), but in the intro it says which will run when a new frame is presented for composition, and then immediately switches back to the past (most recently presented for composition). It's a bit confusing, because those callbacks run before regular requestAnimationFrame, which allows scheduling changes to the page. Frame submission to canvas seems to have been dropped between the first explainer and https://wicg.github.io/video-raf/; this would have been useful.

I will be updating the abstract/intro text to be more consistent, and closer to "runs after a frame has been presented".
The API can still be used to draw into a canvas. The WICG text didn't include any canvas mention, because there is nothing spec-related to say about it. We did drop frame preservation guarantees, in moving the callbacks to the "Update the rendering" steps.
Can you expand on why it's confusing to have the video callbacks run before regular requestAnimationFrame, and what scheduling changes you are mentioning? This could still be moved around in the spec.

  • In relation to the previous point, what happens if the main thread has a high load? The frame indices look like a bolted-on solution on top of the design, and there is no way to signal back-pressure from the compositor back to the decoder. Is this supposed to be handled by the implementation, in its preferred style (pausing/buffering vs. frame dropping, which are the two commonly deployed strategies here), and reflected in the callback metadata?

If the main thread has a high load, the video will still play smoothly, and we will get fewer callbacks. Web authors can detect how many frames' worth of info they've missed, as sketched below. I am not following what you mean by signalling back-pressure from the compositor to the decoder.
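
A sketch of that detection, using the presentedFrames counter from VideoFrameMetadata (which, per the spec, increments once per frame submitted for composition):

  // Diff the monotonically increasing presentedFrames counter between
  // callbacks to count frames whose metadata was never delivered.
  let lastPresented = null;

  function onFrame(now, metadata) {
    if (lastPresented !== null) {
      const missed = metadata.presentedFrames - lastPresented - 1;
      if (missed > 0) console.warn(`missed metadata for ${missed} frame(s)`);
    }
    lastPresented = metadata.presentedFrames;
    video.requestVideoFrameCallback(onFrame);
  }
  video.requestVideoFrameCallback(onFrame);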

  • There are additional problems with the naming of the attributes in VideoFrameMetadata. presentation means a lot of different things in this text: it is the time the frame is handed off to the compositor (which one? The OS compositor, the browser compositor?), the time the frame is displayed on the display, and also the time in the media timeline (in a different clock domain, even, with a different type).

I was considering renaming presentationTime to timePresented, and expectedPresentationTime to expectedDisplayTime (keeping presentationTime the same). Any thoughts?

  • No worker support: how does it work in a Web Worker with an OffscreenCanvas? There was a mention of being able to do DOM-less operations, but an HTMLVideoElement was passed in the callback? This bit also appears to have been dropped, but it could have been super useful.

Are you referring to the mention of WebGLVideoTexture in the original explainer (e.g. here's the commit that removes it)?

  • In addition to the previous point, it is known that major websites can't ensure a steady requestAnimationFrame or anything iso-synchronous, without jitter, and requiring a requestAnimationFrame-style call each time doesn't work well in a scenario like this. This is not a problem if this API is now just for observing video composition events, in the sense that composition will still happen, but the other things that were set to happen in relation to composition events won't happen.

The API is about observing composition events, but work can be done in the callbacks. The current implementation isn't perfect (when drawing from a 25fps video to a canvas, the canvas is sometimes 1 v-sync behind the video), but it's less wasteful than doing unnecessary work via window.requestAnimationFrame.
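
The canvas case under discussion, sketched:

  // Repaint a 2D canvas only when the video presents a new frame, rather
  // than on every window.requestAnimationFrame tick.
  const canvas = document.querySelector('canvas');
  const ctx = canvas.getContext('2d');

  function paint(now, metadata) {
    canvas.width = metadata.width;   // dimensions of the presented frame
    canvas.height = metadata.height;
    ctx.drawImage(video, 0, 0, canvas.width, canvas.height);
    video.requestVideoFrameCallback(paint);
  }
  video.requestVideoFrameCallback(paint);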

  • How to prevent doing wasteful texture uploads when the video has a higher frame-rate than the display? This seems to be answered via the frame indices?

Callbacks will never be fired more often than the "browser's frame rate". 120fps video on a 60Hz screen should fire video.rAF callbacks at 60Hz.

  • How does it work with adaptive sync (FreeSync/G-Sync/ProMotion/QSync, etc.)? Presumably both requestAnimationFrame variants (this new one and the normal one) start being called at the same rate; does the browser decide based on what is going on in the page?

I haven't looked into this, nor do I have a way of testing this. Video.rAF should behave the same as window.rAF in those cases.

  • Why have a callback at all, and not a way to get this metadata on the video, if this is now just for observing presentation/composition events?

One of the intended uses of the API is still WebGL and canvas applications. We considered making the API an event that runs as a microtask. Painting a 60fps video on a 60Hz screen from microtasks was definitely not as smooth as running callbacks in the rendering steps.

  • How do all those attributes work when playbackRate is not 1.0? Presumably everything is scaled appropriately.

It should work as expected. There is nothing directly dependent on playbackRate, but the rate at which we present frames might differ, and we would have a different spread of presentationTime.

[...] since this is still before FrameRequestCallback as part of the rendering, which seems to imply a certain impact of this API on the rendering itself, which has its own set of problems. For example, it puts a dependency on main thread scheduling into the compositor, where the source for the composited frame is usually not on the main thread (these days).

E.g. pulling info from the compositor to the main thread during the rendering might have a performance impact?

Also, it feels a bit weird to have capabilities like this that tie frame availability (at the decoding and/or network level) to info about presentation, without a way to handle back-pressure or under-runs. [...]
The back-pressure is the important bit: it is clear what happens, but it is not clear whether what happens is desirable; the question is mostly rhetorical. This all depends on the purpose of the API (observation of frame compositing, scheduling of frame compositing, or both), and the current text doesn't make this clear; it has a bit of everything.

Could you expand on what back-pressure/under-runs would look like for this API? The API isn't proposing any scheduling of frame compositing (unless you include drawing to a canvas from inside a callback). Whether the callbacks run on time or not does not affect the smoothness of the HTMLVideoElement. If the main thread is under a heavy load and there are fewer rendering steps, the callbacks won't be run as often, and won't be re-registered as often (if the callbacks queue up another video.rAF).

padenot commented Mar 16, 2020

I'm copying here the important points I had made in private (reformatted/rewritten/expanded so they make sense in context, and because I've thought a bit more about this).

  • triggering texture uploads of particular video frames with a certain
    guarantee in terms of timing?

It's useful to trigger texture uploads at the video frame-rate. There is timing information, but there are no timing guarantees.

This is the main problem with this proposal.

This is also a fundamental contradiction with the way rendering works on the web. The text says that there are no guarantees when it comes to scheduling, but there are, and they're quite central: they allow synchronizing canvas and CSS, for example (the general term is something like "atomicity of rendering"; I find Chromium docs when I search for this). Roughly, everything that happens in the same requestAnimationFrame must be displayed at the same time (CSS, canvas, etc.). The exception is videos, because there is no mechanism to know when new frames are available. However, we're adding one here, yet we keep saying there are no guarantees. It's unclear what happens with this new requestAnimationFrame. Now, the name itself is really problematic. Defining what should happen here is really essential, I think. And I don't think it should be different from the rest of the web platform.

The problem here is the contradiction between goal and approach: it is said that this API is to observe composition events of videos, but those callbacks happen before window.requestAnimationFrame, which is the thing that allows specifying what should happen during the next composition. Then it is said that it's OK to have a frame of latency, but this is a low-level API, and this frame of latency is unnecessary. Anybody that does video and overlays in any serious capacity (like video game developers, TV broadcasters, or Twitch) wants synchronization. Additionally, this forces the engine to keep the frame around one frame too long, which is also unnecessary.

  • synchronizing clock domains between remote sources, decoders, and screens, by observing the timing of window.requestAnimationFrame events in relation to HTMLVideoElement.requestAnimationFrame events, to be able to choose the best content for the display and/or current situation?

I don't fully understand this example, but from what I understand, yes. For example, WebRTC applications could use the metadata from receiveTime, captureTime and elapsedProcessingTime to adjust the content they serve and reduce end-to-end latency.

It feels really weird to tie everything together here. As an author, what choice would you make based on those numbers? Deciding to drop a frame because it's too late?

Can you expand on why it's confusing to have the video callbacks run before regular requestAnimationFrame, and what scheduling changes you are mentioning? This could still be moved around in the proposal.

I've answered above; let me know if it's unclear. It's not confusing anymore (it's now clear what happens), but it's technically problematic for the reasons outlined above.

  • In relation to the previous point, what happens if the main thread has a high load? The frame indices look like a bolted-on solution on top of the design, and there is no way to signal back-pressure from the compositor back to the decoder. Is this supposed to be handled by the implementation, in its preferred style (pausing/buffering vs. frame dropping, which are the two commonly deployed strategies here), and reflected in the callback metadata?

If the main thread has a high load, the video will still play smoothly, and we will get fewer callbacks. Web authors can detect how many frames' worth of info they've missed. I am not following what you mean by signalling back-pressure from the compositor to the decoder.

How can one tell that the decoder is struggling, vs. the compositor? This can be observed via the video quality metrics, but how can authors decide what to do here?
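
(The "video quality metrics" mentioned here are presumably HTMLVideoElement.getVideoPlaybackQuality(), which exposes decoder-side drops as opposed to missed callbacks:)

  // Decoder-side statistics, distinct from compositor callback misses.
  const q = video.getVideoPlaybackQuality();
  console.log(`${q.droppedVideoFrames} dropped of ${q.totalVideoFrames}`);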

  • There are additional problems with the naming of the attributes in VideoFrameMetadata. presentation means a lot of different things in this text: it is the time the frame is handed off to the compositor (which one? The OS compositor, the browser compositor?), the time the frame is displayed on the display, and also the time in the media timeline (in a different clock domain, even, with a different type).

I was considering renaming presentationTime to timePresented, and expectedPresentationTime to expectedDisplayTime (keeping presentationTime the same). Any thoughts?

The current version of the attribute descriptions is good (and solves the ambiguity for sure), but the names are not uniform. Depending on other factors, such as the scheduling of the callbacks, they will need to be renamed.

  • No worker support: how does it work in a Web Worker with an OffscreenCanvas? There was a mention of being able to do DOM-less operations, but an HTMLVideoElement was passed in the callback? This bit also appears to have been dropped, but it could have been super useful.

Are you referring to the mention of WebGLVideoTexture in the original explainer (e.g. here's the commit that removes it)?

Yes, this was nice and useful.

  • In addition to the previous point, it is known that major websites can't ensure a steady requestAnimationFrame or anything iso-synchronous, without jitter, and requiring a requestAnimationFrame-style call each time doesn't work well in a scenario like this. This is not a problem if this API is now just for observing video composition events, in the sense that composition will still happen, but the other things that were set to happen in relation to composition events won't happen.

The API is about observing composition events, but work can be done in the callbacks. The current implementation isn't perfect (when drawing from a 25fps video to a canvas, the canvas is sometimes 1 v-sync behind the video), but it's less wasteful than doing unnecessary work via window.requestAnimationFrame.

It's problematic to ship an API while stating that one of its core use-cases cannot be properly solved. I certainly agree with not doing any wasteful work, however.

We can make it so that the video and canvas are perfectly in sync. The video frame is already in memory anyway when that happens (video memory or regular memory, depending).

  • How to prevent doing wasteful texture uploads when the video has a higher frame-rate than the display? This seems to be answered via the frame indices?

Callbacks will never be fired more often than the "browser's frame rate". 120fps video on a 60Hz screen should fire video.rAF callbacks at 60Hz.

Sounds good.

  • How does it work with adaptive sync (FreeSync/G-Sync/ProMotion/QSync, etc.)? Presumably both requestAnimationFrame variants (this new one and the normal one) start being called at the same rate; does the browser decide based on what is going on in the page?

I haven't looked into this, nor do I have a way of testing this. Video.rAF should behave the same as window.rAF in those cases.

Adaptive sync is the most important thing that has happened to video playback since battery-powered video-playing devices came into existence. We can't simply ignore it here. We might find that it just works, but we need to look into it and allow for it to kick in.

  • Why have a callback at all, and not a way to get this metadata on the video, if this is now just for observing presentation/composition events?

One of the intended uses of the API is still WebGL and canvas applications. We considered making the API an event that runs as a microtask. Painting a 60fps video on a 60Hz screen from microtasks was definitely not as smooth as running callbacks in the rendering steps.

I don't understand what microtasks have to do with this or how they can help. We can't artificially reduce the usefulness of the API at the design stage for one of its intended use-cases.

  • How do all those attributes work when playbackRate is not 1.0? Presumably everything is scaled appropriately.

It should work as expected. There is nothing directly dependent on playbackRate, but the rate at which we present frames might differ, and we would have a different spread of presentationTime.

Sounds good, but maybe this needs to be said.

[...] since this is still before FrameRequestCallback as part of the rendering, which seems to imply a certain impact of this API on the rendering itself, which has its own set of problems. For example, it puts a dependency on main thread scheduling into the compositor, where the source for the composited frame is usually not on the main thread (these days).

E.g. pulling info from the compositor to the main thread during the rendering might have a performance impact?

Not if it's just signaling composition, but just signaling composition is in opposition with the stated use-case of the API.

Also, it feels a bit weird to have capabilities like this that tie frame availability (at the decoding and/or network level) to info about presentation, without a way to handle back-pressure or under-runs. [...]
The back-pressure is the important bit: it is clear what happens, but it is not clear whether what happens is desirable; the question is mostly rhetorical. This all depends on the purpose of the API (observation of frame compositing, scheduling of frame compositing, or both), and the current text doesn't make this clear; it has a bit of everything.

Could you expand on what back-pressure/under-runs would look like for this API? The API isn't proposing any scheduling of frame compositing (unless you include drawing to a canvas from inside a callback). Whether the callbacks run on time or not does not affect the smoothness of the HTMLVideoElement. If the main thread is under a heavy load and there are fewer rendering steps, the callbacks won't be run as often, and won't be re-registered as often (if the callbacks queue up another video.rAF).

Yes, this is the main contradiction again: this API, as it stands, has artificial limitations for one of its use-cases, and it doesn't have to be like that. Having this in a worker with an OffscreenCanvas would be useful (but a main-thread counterpart is needed).

Changing the semantics to an event that signals that a new frame is available and is about to be submitted to the compositor for display (i.e. it would fire after the next window.rAF), with the same mechanism that is in the proposal right now, would be more useful (getting rid of the latency and the memory implication), with the main-thread load issues not quite mitigated (but if we have worker support this can be fine).

In any case, it's better to work on this in the Media WG, which has been created for the purpose of doing this kind of work without fragmenting the discourse.

padenot commented Mar 17, 2020

We had a call just now with the Chromium people involved in this. Most of my criticism here was about the presence of a non-compressible frame of latency, which seemed to be implied by wording like this (the text at https://wicg.github.io/video-raf/#introduction as of commit WICG/video-rvfc@97577fb):

This method allows web authors to register a VideoFrameRequestCallback which runs after a
new frame is presented for composition. [...] The VideoFrameRequestCallback also provides
useful metadata about the video frame that was most recently presented for composition.

In particular, the use of past tense seemed to imply that this happens after the video frame has been sent ("presented") for composition. In fact, this is not quite what happens in practice. This callback happens before window.requestAnimationFrame, so a drawing operation made during this new callback will be synchronized (in a best-effort fashion, because this is the main thread on the web, after all) with the video playing, without latency. If the main thread is not loaded too much, a video being played back on the page in a regular

We discussed the naming of the main callback of the API. I don't like it, because I think it expresses something other than requestAnimationFrame (which requests the new frame of animation), and using familiar names that have a different meaning is problematic when trying to convey meaning: here the meaning is that a new video frame is available and will be pushed to the compositor. Nothing is requested (and it has nothing to do with animation either).

Some attribute names are not aligned with what we usually see on the web platform: something with Time in the name is in fact a duration, and the two important and related attributes:

  required DOMHighResTimeStamp timePresented;
  required DOMHighResTimeStamp expectedDisplayTime;

are constructed differently from each other.

It wasn't clear what the last three attributes precisely mean; they were lifted from WebRTC:

  DOMHighResTimeStamp captureTime;
  DOMHighResTimeStamp receiveTime;
  unsigned long rtpTimestamp;

I think the first and second are extremely useful in live-stream scenarios (Twitch, Mixer, Periscope, etc.); the last one I'm not sure about, as I'm not familiar enough to comment. Those should be defined with enough care to allow another implementation to work in a compatible manner, as usual.
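
A hedged sketch of how those attributes could be used in a live-stream scenario (member names as in the current spec; captureTime and receiveTime are optional and only present for applicable sources, and cross-machine clock comparability is the user agent's concern):

  // Estimate pipeline delays for a WebRTC-sourced video.
  function onFrame(now, metadata) {
    if (metadata.captureTime !== undefined &&
        metadata.receiveTime !== undefined) {
      const networkDelay = metadata.receiveTime - metadata.captureTime;
      const localDelay = metadata.expectedDisplayTime - metadata.receiveTime;
      console.log({ networkDelay, localDelay }); // feed into latency/ABR logic
    }
    video.requestVideoFrameCallback(onFrame);
  }
  video.requestVideoFrameCallback(onFrame);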

We talked about scriptable extension points to be able to do this from a worklet-like scope, which would free it from main-thread scheduling hazards, and about worker uses, which need not block this, because it's a separate API.

I'll now move over to the repo (https://github.com/wicg/video-raf/) to continue feedback and bikeshedding on naming, etc.

dalecurtis commented Mar 17, 2020

Thanks Paul! Per our VC it seemed you were okay with Chrome shipping with the name as-is, but now it sounds like you want to discuss the naming more. Can you confirm your stance?

Sorry if we misunderstood, and happy to continue discussion in an issue on the video.rAF spec; the salient points from our discussion which I thought swayed you were:

  • Given that this runs with the rendering steps and has a subscription model like window.rAF, calling it something else seems confusing. See preliminary TAG text for naming consistency here: w3ctag/design-principles@a4636fc
  • video.rAF nor window.rAF always request something and the context in which they do so is application specific. Both could be used simply as telemetry or as a request for some sort of work from the browser.

Let me know if I missed anything.

smfr commented Mar 17, 2020

video.rAF nor window.rAF always request something

Did you mean neither video.rAF nor window.rAF always request something?

window.rAF does always request something. It triggers an "update the rendering" step, which causes the rAF callback to fire (even if there is no other rendering to do). This is not true of video.rAF.

@dalecurtis

Yes, I did mean neither. Thanks for the correction! Both window.rAF and video.rAF trigger the "update the rendering" step per current spec language. My statement was more in the vein of a conceptual description, since that's how Paul was discussing it.

padenot commented Mar 26, 2020

Thanks Paul! Per our VC it seemed you were okay with Chrome shipping with the name as-is, but now it sounds like you want to discuss the naming more. Can you confirm your stance?

My argument is the following:

  • window.requestAnimationFrame tells you that you need to do something in the function you passed, because it wants the next frame of animation. It's effectively "requesting an animation frame"; this is pull: the request comes from something, and information flows from the author's code to the implementation (in this case, a series of drawing commands)
  • HTMLVideoElement.requestAnimationFrame informs you that a new video frame is now available and is going to be composited, telling you a bunch of metrics. Incidentally, a drawImage can be done there as well. Nothing is "requesting an animation frame"; that is what window.requestAnimationFrame is for.

I'd prefer that the name of this new feature convey what it does, but I can live with the current name (we're discussing other naming issues on the HTMLVideoElement.requestAnimationFrame repo) if we don't find something better.

@tguilbert-google (Author)

I opened an issue on the WICG repo. I propose we move the API naming discussion there.

smfr commented Mar 27, 2020

Link?

@tguilbert-google (Author)

Sorry: WICG/video-rvfc#44

dbaron commented Apr 7, 2020

@padenot do you think we can make a statement now about whether this is something that is valuable to add to the web, or that it is likely harmful? Or do you think we need to wait for more spec discussion to happen before we can make that call?

padenot commented Apr 8, 2020

It's best to wait; the discussion is ongoing.

@dalecurtis

@padenot is there any discussion remaining besides the naming (which now seems resolved)?

padenot commented Apr 10, 2020

Yes indeed, I should have written here after reading the other thread and agreeing.

I'd say this is worth prototyping.

@dbaron dbaron changed the title video.requestAnimationFrame() video.requestVideoFrameCallback() Apr 10, 2020

dbaron commented Apr 10, 2020

@padenot The one thing that would be helpful to me (for adding this to the positions table) is a one or two sentence explanation of why this would be a useful addition to the web platform, i.e., what value it adds.

dbaron added a commit to dbaron/standards-positions that referenced this issue Apr 10, 2020
@dbaron dbaron added the ready to add Appears ready to add to the table of positions. label Apr 10, 2020
@tguilbert-google (Author)

Updated repo link if anyone is looking for it: https://github.com/WICG/video-rvfc

padenot commented Apr 14, 2020

@padenot The one thing that would be helpful to me (for adding this to the positions table) is a one or two sentence explanation of why this would be a useful addition to the web platform, i.e., what value it adds.

Historically, there has been no way of knowing the frame rate of an HTMLVideoElement, or of knowing when video frames are painted (for variable frame-rate videos, one example would be something coming from a webcam, where the frame rate depends on the available light). We have the proprietary mozPaintedFrames that can be used as a hack, but it's not very good: it's just a number, so it requires polling, and the number is updated after painting, so this adds latency.

This means that authors who want to do any processing on a video are bound to do one of the following (depending on the scenario):

  • process too many frames: if the display is clocked at 60Hz, requestAnimationFrame is called at 60Hz, and videos often have fewer frames per second (29.97fps, 30fps and 24fps are common numbers); see the sketch just after this list
  • get out-of-band information about the frame rate of a video (for a playback-type scenario, where the frame rate is assumed to be constant), and do the work in requestAnimationFrame. This requires cooperation with the provider of the media, which is quite a problematic constraint
  • decode a bit of the media in script to get this piece of information (which is wasteful)

And there is no real answer for variable frame-rate video, or any sensible handling of dropped frames.
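
For concreteness, the first workaround sketched (processFrame is a hypothetical per-frame routine):

  // rAF polling: runs at the display rate (e.g. 60Hz) and guesses at new
  // frames via currentTime, which is low-resolution and misses drops.
  let lastTime = -1;

  function poll() {
    if (video.currentTime !== lastTime) {
      lastTime = video.currentTime;
      processFrame(video); // hypothetical per-frame work
    }
    requestAnimationFrame(poll); // fires at 60Hz even for 24fps content
  }
  requestAnimationFrame(poll);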

Some quick examples of scenarios enabled by this new API (not exhaustive, of course):

  • real-time video processing and painting on a canvas (2d or gl) at the correct rate (any VR program that wants to display videos, in-game video content in non-VR games, 360° video footage display)
  • real-time video analysis on any video source at the correct rate, potentially allowing a lot more time to compute the analysis (if the analysis runs at 30Hz instead of 60Hz, twice as much CPU time can be spent on each frame)
  • frame-accurate synchronization of content outside of a video with the video (DOM, CSS, canvas overlays, etc.), for example https://w3c.github.io/danmaku/api.html or any annotation system that needs to be tighter than checking HTMLVideoElement.currentTime, which has a rather low resolution and varies between implementations; see the sketch at the end of this comment

Those scenarios are ubiquitous on platforms other than the web platform, especially (but not only) in the mobile application ecosystem.
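
A sketch of the annotation scenario from the last bullet, keyed off metadata.mediaTime, the presentation timestamp of the frame actually on screen (`overlay` and the annotation data are hypothetical):

  // Frame-accurate overlay: pick annotations by the on-screen frame's
  // media time instead of the lower-resolution video.currentTime.
  const annotations = [{ start: 1.0, end: 2.5, text: 'Hello' }];

  function syncOverlay(now, metadata) {
    const t = metadata.mediaTime;
    overlay.textContent = annotations
      .filter(a => t >= a.start && t < a.end)
      .map(a => a.text)
      .join('\n');
    video.requestVideoFrameCallback(syncOverlay);
  }
  video.requestVideoFrameCallback(syncOverlay);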

therealadityashankar commented Mar 23, 2021

Hi Mozilla devs,
I've been wondering if any work has been done on this issue; I believe such a function would be extremely helpful for playing videos in-browser with synchronicity in a canvas. Any updates would be appreciated! 😃

jaybe78 commented Jun 9, 2021

+1 any update on this one ???

@andre-kornetzky

Now that Safari has already provided this feature, only Firefox is missing...

Is there a way to test the feature? Firefox Nightly does not deliver the feature yet.
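
Support can be feature-detected with the usual pattern:

  if ('requestVideoFrameCallback' in HTMLVideoElement.prototype) {
    // supported
  } else {
    // fall back to a requestAnimationFrame loop or a polyfill
  }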

tthef commented Jul 11, 2022

Any update? This is really critical for applications that need to accurately sync a canvas with a video, and now that the API is provided by both Chrome and Safari it's not practical to be working around the missing API for Firefox alone ...

@smaug---- (Collaborator)

I could note there isn't a proper spec for this, so implementing it is a tad hard.

tomayac commented Sep 20, 2022

The currently working WICG spec link is https://wicg.github.io/video-rvfc/.

ThaUnknown commented Oct 2, 2022

I created a polyfill for this! https://github.com/ThaUnknown/rvfc-polyfill/ It runs close to the video framerate, but can sometimes be off by a few milliseconds or up to a frame. It's still better than not being able to use this API at all, and way better than just running rAF in a loop.
It's also on NPM: https://www.npmjs.com/package/rvfc-polyfill
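
A hedged usage sketch, assuming the package installs itself globally when imported for its side effects (as polyfills conventionally do):

  import 'rvfc-polyfill'; // assumed to patch HTMLVideoElement.prototype if needed

  document.querySelector('video').requestVideoFrameCallback((now, metadata) => {
    console.log(metadata.mediaTime);
  });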

grigb commented Jan 2, 2023

I have started using requestVideoFrameCallback on Chrome to convert video to GIFs. Is that possible with the proposed solutions above?

Here's a working version for Chrome:
https://javascript.plainenglish.io/how-to-convert-a-video-clip-to-a-gif-file-with-client-side-javascript-56575d093191

thx!

@tguilbert-google (Author)

requestVideoFrameCallback does not guarantee that you will get a callback for every frame of your video. You should consider using the WebCodecs API instead, if this is one of your requirements. WebCodecs also allows converting videos into GIFs faster than realtime.

@zcorpan zcorpan added position: positive and removed ready to add Appears ready to add to the table of positions. labels Sep 13, 2024