
Support sample accurate audio splicing using timestampOffset/appendWindowStart/appendWindowEnd #37

Open
Melatonin64 opened this issue Nov 9, 2015 · 14 comments
Labels
feature request · needs author input · TPAC-2022-discussion (Marked for discussion at TPAC 2022 Media WG meeting)

Comments

@Melatonin64

One of the use cases for sample-accurate-audio-splicing is gapless audio playback (by removing the excess front/back padding added by most audio codecs).

Step 9 of the for loop in the Coded Frame Processing algorithm states:
If frame end timestamp is greater than appendWindowEnd, then set the need random access point flag to true, drop the coded frame, and jump to the top of the loop to start processing the next coded frame.

For audio, this means that a complete coded frame (e.g., 1024 audio samples for AAC) will be dropped even if some of its audio samples would have fallen within the append window.
This coded-frame granularity is not sufficient to achieve gapless audio playback.
It would be great if, instead, we could keep the frame and mark a range of its samples to be discarded, so that once the frame is decoded only the samples that fall within the append window are used.

It's possible that this change alone is not sufficient to support sample-accurate-audio-splicing...

A test webpage can be found here (based on Dale Curtis' article).
Further discussion can be found here.
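To make the use case concrete, here is a minimal sketch of the app-side arithmetic behind the gapless approach in Dale Curtis' article: computing `timestampOffset`, `appendWindowStart`, and `appendWindowEnd` from a track's gapless metadata so the codec's front/back padding is trimmed at the splice. The helper name, input structure, and the specific sample counts are illustrative assumptions, not anything defined by the spec.

```javascript
// Hypothetical helper (names are illustrative): given gapless metadata for
// one track, compute the MSE attribute values an app would set on the
// SourceBuffer before appending that track's media segments.
function gaplessWindow({ frontPaddingSamples, endPaddingSamples,
                         totalSamples, sampleRate, spliceTime }) {
  const audibleDuration =
    (totalSamples - frontPaddingSamples - endPaddingSamples) / sampleRate;
  return {
    // Shift media time so the first *audible* sample lands at spliceTime.
    timestampOffset: spliceTime - frontPaddingSamples / sampleRate,
    // Trim encoder priming at the front of the track...
    appendWindowStart: spliceTime,
    // ...and encoder padding at the back.
    appendWindowEnd: spliceTime + audibleDuration,
  };
}

// Example: a 10-second 44.1 kHz AAC track with typical priming (2112
// samples) and some end padding (576 samples here), spliced at t = 0.
const w = gaplessWindow({
  frontPaddingSamples: 2112, endPaddingSamples: 576,
  totalSamples: 441000 + 2112 + 576, sampleRate: 44100, spliceTime: 0,
});
// An app would then set, per track:
//   sourceBuffer.timestampOffset   = w.timestampOffset;
//   sourceBuffer.appendWindowStart = w.appendWindowStart;
//   sourceBuffer.appendWindowEnd   = w.appendWindowEnd;
```

The issue is that the priming (2112 samples) is not a whole number of 1024-sample AAC frames, so the append window boundaries necessarily cut through the middle of coded frames - exactly the case step 9 handles by dropping the whole frame.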

@wolenetz
Member

Does the suggested change not fall within the non-normative note already in the spec? (pasted below)
NOTE
Some implementations may choose to collect some of these coded frames that are outside the append window and use them to generate a splice at the first coded frame that has a presentation timestamp greater than or equal to appendWindowStart even if that frame is not a random access point. Supporting this requires multiple decoders or faster than real-time decoding so for now this behavior will not be a normative requirement.

@wolenetz
Member

On reread, this looks like a request to make gapless behavior (as is in Chrome) normative, not non-normative. For v1, this remains a quality-of-implementation issue. @Melatonin64 do you have a better non-normative note that we could put into v1?

Per triage process, marking V1NonBlocking to resolve any non-normative note fixes.
Gapless support, per reasons in the existing non-normative note after step 8, is non-normative for V1.

@wolenetz wolenetz added this to the V1NonBlocking milestone Mar 16, 2016
@Melatonin64
Author

Thanks for your comments.
You're correct - this is basically a request to make gapless audio playback normative.

I'm not entirely sure why this requires multiple decoders (for audio codecs, where every frame is a random access point AFAIK).
It seems to me this could be easily implemented by including the coded frames that sit on the append window boundaries, while retaining some metadata for the splice.
Once the audio samples have been decoded, the excess samples (those that fall outside the append window) can just be discarded.
I might be missing something here though...

Otherwise, I don't have anything to add to the note.
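The post-decode discard being proposed here can be sketched as a small pure function: keep the boundary coded frame, decode it, then drop the PCM samples that fall outside the append window. This is only an illustration of the idea (function and parameter names are hypothetical), not how any browser's pipeline is actually structured.

```javascript
// Illustrative sketch of the proposed post-decode discard: given one
// decoded frame's PCM samples (one channel), its presentation start time,
// and the append window, return only the samples inside the window.
function trimDecodedFrame(pcm, frameStartTime, sampleRate,
                          appendWindowStart, appendWindowEnd) {
  // Index of the first sample at or after appendWindowStart.
  const first = Math.max(0,
    Math.ceil((appendWindowStart - frameStartTime) * sampleRate));
  // Index one past the last sample before appendWindowEnd.
  const last = Math.min(pcm.length,
    Math.floor((appendWindowEnd - frameStartTime) * sampleRate));
  return pcm.subarray(first, Math.max(first, last)); // kept samples only
}

// Example: a 1024-sample AAC frame starting at t = 0.99 s, with
// appendWindowEnd = 1.0 s. Step 9 as written drops all 1024 samples;
// sample-accurate trimming keeps the 480 samples before the boundary.
const frame = new Float32Array(1024);
const kept = trimDecodedFrame(frame, 0.99, 48000, 0, 1.0);
// kept.length === 480
```

This is the "easily implemented" part; the spec note's concern is the decoder-side cost of producing those boundary frames' output in time, not the trimming itself.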

@jdsmith3000
Contributor

It's my understanding that this would be implemented post-decode. I'm not sure either why the note mentions multiple decoders, but it is not a simple change.

@wolenetz
Member

It is not a simple change. Chrome does this post-decode, with parser-time marking of the encoded frames to assist the post-decode step. With respect to multiple decoders: at minimum, a faster-than-realtime decoder would probably be necessary if the splicing for gapless is done post-decode at playback time, because some of the decoder output from before and after the splice is extra decoder work relative to the decoded samples actually kept. Some implementations may not be able to do this faster-than-realtime decode without more than one decoder (this is my educated guess as to why that non-normative note is phrased the way it is).

@Melatonin64
Author

Ok, thanks for your comments.

I think it's a shame that implementers are not required to implement this.
Also, since Step 9 in Coded Frame Processing explicitly instructs implementers to:
drop the coded frame ... If frame end timestamp is greater than appendWindowEnd,
implementing this might not even occur to some.

Are there any plans to make this behavior normative in subsequent versions of the spec?

@wolenetz
Member

@Melatonin64 at the moment we are focused on getting MSE v1 spec across the line. A feature request like this imposes constraints that may be too much for some implementations, especially at this point in the spec process. I propose we move this to VNext.
If we keep this in V1, the change is substantive, so the milestone shouldn't be V1NonBlocking.

@wolenetz wolenetz modified the milestones: VNext, V1NonBlocking May 17, 2016
@Melatonin64
Author

@wolenetz Ok, thanks.

@wolenetz wolenetz removed this from the VNext milestone Jun 9, 2020
@mwatson2
Contributor

mwatson2 commented Sep 8, 2020

IIUC, this is asking for PCM sample-accurate audio splicing, which is definitely a desirable feature.

@Melatonin64
Author

Yup, correct!

@mwatson2 mwatson2 added this to the V2 milestone Sep 21, 2020
@wolenetz
Member

See also #165

@mwatson2
Contributor

This feature needs to cover the various ways the audio could be spliced after decoding and after applying the append window in the decoded domain. For example, if overlap remains between the old and new audio, what are the options for browsers? Cross-fading should remain an option; this would allow sites to specify the exact position of the start of the cross-fade within an audio frame.
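For illustration, the cross-fade option over a decoded-domain overlap can be sketched as a sample-accurate linear fade between the old and new PCM. This is one possible rendering of the overlap, not specified behavior, and the function name is hypothetical.

```javascript
// Sketch of one overlap-rendering option: a sample-accurate linear
// cross-fade between the tail of the old audio and the head of the new
// audio (one channel each), over their overlapping region.
function crossFade(oldPcm, newPcm) {
  const n = Math.min(oldPcm.length, newPcm.length); // overlap length
  const out = new Float32Array(n);
  for (let i = 0; i < n; i++) {
    const t = n > 1 ? i / (n - 1) : 1;            // fade position in [0, 1]
    out[i] = oldPcm[i] * (1 - t) + newPcm[i] * t; // fade old out, new in
  }
  return out;
}

// Example: fading a constant 1.0 signal into silence over 5 samples.
const faded = crossFade(new Float32Array(5).fill(1), new Float32Array(5));
// faded ramps from 1.0 down to 0.0 across the overlap
```

Since the overlap position is determined by the append window values, a site could place the start of the cross-fade at an arbitrary sample within an audio frame, as suggested above.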

@wolenetz
Member

wolenetz commented Aug 10, 2021

This was discussed on today's Media Workgroup call.

Since not all implementations may be capable of supporting sample-accurate splicing, perhaps the best approach is a normative feature-detection mechanism, so apps can adapt appropriately, plus normative behavior defining what an implementation must do to support interoperable gapless/sample-accurate splicing. This would enable interop tests, and ideally promote more consistent interpretation of bytestream-specific metadata such as negative timestamps, edit lists, decoder preroll, and decoder delay - all to improve interoperability of gapless/sample-accurate implementations in particular.

Note that cross-fade at whatever audio splice points result may depend on implementation capability for doing that (for example, currently Chrome doesn't cross-fade, yet does do sample-accurate audio splicing). See #165 for related issue around splice rendering behavior.

@wolenetz wolenetz added the TPAC-2022-discussion Marked for discussion at TPAC 2022 Media WG meeting Sep 16 label Sep 16, 2022
@chrisn
Member

chrisn commented Dec 7, 2022

Minutes from 8 Nov 2022 Media WG meeting: https://www.w3.org/2022/11/08-mediawg-minutes.html#t03


5 participants