AudioBufferSourceNode: How to interpolate after last sample? #2032

collares · 2019-08-20T16:42:26Z

Describe the issue
The spec requires interpolation for playhead positions not corresponding to sampled times. Therefore, after each sample, there is an interval of duration 1/buffer.sampleRate corresponding to valid playhead positions which must be interpolated. This is also true for the interval after the last sample, but the spec doesn't specify how to produce the interpolated values for this region (since there is no next point to interpolate with).

This affects existing WPT tests: buffer-resampling.html seems to assume that the value of the last sample should be used for the whole interval, but another option is to interpolate with silence.

Where Is It
https://webaudio.github.io/web-audio-api/#playback-AudioBufferSourceNode -- more precisely, "Sub-sample start offsets or loop points may require additional interpolation between sample frames" does not cover the last interval described above, because it is not between sample frames.

Additional Information
This is not related to looping, and in fact only applies when looping is disabled.

rtoy · 2019-08-22T15:43:46Z

Teleconf: Firefox and Chrome pass the buffer-resampling test, so we should spec that behavior

karlt · 2019-08-22T23:11:00Z

This affects existing WPT tests: buffer-resampling.html seems to assume that the value of the last sample should be used for the whole interval, but another option is to interpolate with silence.

buffer-resampling.html doesn't expect that the value of the last sample is used for the whole subsequent interval.

It expects that interpolation after the last sample of one buffer is consistent with the interpolation before the first sample of the subsequent (adjacent) buffer.

Interpolation after the last sample should assume that the imaginary next sample in the buffer is zero, just like interpolation before the first sample should assume that the imaginary previous sample in the buffer was zero. i.e. it would be correct to interpolate with silence.

collares · 2019-08-22T23:50:40Z

I understand where your intuition comes from, but there is an inherent asymmetry: Whenever playback starts, it starts at least from the first sample in the buffer (maybe a bit later if the start time is subsample-accurate). Therefore, there is no situation in which we interpolate with an imaginary previous sample if we are linearly interpolating.

But the plot thickens! I just implemented "use the last sample for the whole subsequent interval" in Servo and this does not make the test pass:

Interpolating with silence, I get 10 wrong samples for buffer-resampling.html. The SNR is 20.05dB, below the 37.17dB threshold.
Using the last sample value for the interval, I get 5 wrong samples. The SNR is 32.76dB, below the 37.17dB threshold.

Apparently Blink extrapolates from the last two samples (this behavior was added along with buffer-resampling.html in [0]), and Firefox uses an interpolation algorithm that is better than linear (via libspeex). So buffer-resampling.html seems to require a better algorithm for the last interval.

[0] https://chromium.googlesource.com/chromium/src.git/+/feaba58ccedc657f5d4ee23c5b11825de876bf0f%5E%21/#F1

karlt · 2019-08-23T01:50:42Z

That asymmetry is a problem when upsampling as in this test (and also if applied to non-sample-aligned start times). The noise threshold of 0.09 seems fairly high, so I wonder if that is involved.

In buffer-resampling.html, the buffer has a sample rate of 8000 Hz while the context is rendering at 48000 Hz.

The first sample in the buffer represents a sinc function with frequency 8000Hz. If that is centered on the start time, then there would be a few significantly non-zero 48000 Hz rendering samples before the start time.
An interpolation that chopped off those samples would not be ideal.

I found https://www.psaudio.com/article/cardinal-sinc/ helpful.

collares · 2019-08-23T14:23:57Z

The first sample in the buffer represents a sinc function with frequency 8000Hz. If that is centered on the start time, then there would be a few significantly non-zero 48000 Hz rendering samples before the start time.
An interpolation that chopped off those samples would not be ideal.

This is definitely an issue from an audio quality standpoint, but outputting samples before the Node's start time violates lines 95--96 of the playback algorithm (besides being a bit counter-intuitive). This would be impossible to implement if the Node is to start playing immediately, but I agree with you: this should be considered for nodes scheduled for playing in the future. I filed issue #2047 for tracking a specific way of implementing this.

I found https://www.psaudio.com/article/cardinal-sinc/ helpful.

Personally, I agree that linear interpolation is a bad idea due to physical/DSP considerations (and I plan on using libspeex in Servo too once I decide on how to handle the case where the loop length is not a multiple of the "buffer offset per tick"), and I would be happy to see the spec mandating better-than-linear interpolation.

But I feel the issue you raise is mostly orthogonal to the present issue, because the spec strongly suggests linear interpolation is a valid implementation strategy. The spec says "may require additional interpolation between sample frames" (emphasis mine), which in my opinion requires clarification. From reading the spec, it didn't occur to me that linear extrapolation (or better, such as sinc interpolation) would be required instead of just desirable.

rtoy · 2019-08-27T22:14:35Z

I had to check the code to see what Chrome is doing. See https://cs.chromium.org/chromium/src/third_party/blink/renderer/modules/webaudio/audio_buffer_source_node.cc?rcl=d0788ba8029af2c73443ef598ed5871f1cc44450&l=348

Based on the comment there, it's linearly extrapolating the last two samples to find the output sample. I guess that's kind of reasonable since you don't know what the following value would be since you're at the end of the buffer.

rtoy · 2019-09-26T17:24:08Z

See also WebAudio/web-audio-api-v2#38.

padenot · 2019-10-31T15:53:14Z

But I feel the issue you raise is mostly orthogonal to the present issue, because the spec strongly suggests linear interpolation is a valid implementation strategy. The spec says "may require additional interpolation between sample frames" (emphasis mine), which in my opinion requires clarification. From reading the spec, it didn't occur to me that linear extrapolation (or better, such as sinc interpolation) would be required instead of just desirable.

AudioWG call: We're going to do a fix for this bit in bold, but it is indeed related to WebAudio/web-audio-api-v2#38, that we'll get clarified in V2.

rtoy · 2019-10-31T16:42:48Z

A little more info from the call. We'll probably say it's extrapolated, but leave the extrapolation method unspecified.

Simple justification: if you're doing buffer stitching and have ABSNs that are contiguous parts of a large audio source where all the pieces are basically continuous, then extrapolation will produce a value that is close to the next value from the next buffer. If you interpolate between the last sample and zero, the output will probably differ quite a bit from the next buffer value unless it happened to be 0.

haywirez · 2020-06-09T15:49:29Z

(Also) related v2 issue: https://github.com/WebAudio/web-audio-api-v2/issues/25

rtoy · 2020-09-11T15:57:35Z

Let's see how this goes. Assume an AudioBuffer with one channel of length 3 with the same sample rate as the AudioContext. For simplicity, we'll be working with frames, not seconds.

Let interp(n,m) be some function to interpolate between frames n and m of the AudioBuffer. (This isn't specified; it could be simple linear interpolation or a more complicated sinc interpolator.)

Assume the user calls start(0.5). Let out[n] be the output at frame n. Then,

out[0] = 0;  // because we haven't started yet
out[1] = interp(0, 1);
out[2] = interp(1, 2);
out[3] = ?;
out[4] = 0;

I think the above is straightforward. But what about out[3]? We can't do interp(2, 3) because there is no frame 3 in the buffer.

There were two options here:

Interpolate between frame 2 and a value of 0.
Extraplate

The conclusion from the teleconf was to extrapolate to produce this output. So frames 1 and 2 (and possibly more) are used to extrapolate an appropriate output value.

Whenever we run out of data but need one more sample we extrapolate from previous values. This includes the case where the sample rates are different or the playbackRate is not 1.

I don't intend to put this much detail into the spec; I think we can just say that if any of the following holds for a non-looping source

the start time is not on a frame boundary
the sample rates differ
the playbackRate is not 1

then last output value is extrapolated from the last values of the buffer. The extrapolation method is implementation-dependent.

Update `playbackSignal` to mention that extrapolation is used to compute the output sample when the buffer is not looping and we're at the end of the buffer but need output a sample after the end of the buffer but before the next output sample frame.

karlt · 2020-10-15T04:59:55Z

Assume the user calls start(0.5). Let out[n] be the output at frame n. Then,

out[0] = 0; // because we haven't started yet

interp(-1, 0) would usually give a better result here, as an interpolation between the value zero (at frame -1) and the value at frame 0 of the buffer.

The case for this is even stronger with start((n + ε)/F) and 0 < ε ≪ 1, which is a likely scenario given the limited precision of double start times. In this case, using out[n] = 0 would be skipping the first frame in the buffer. The wish to fabricate an additional sample at the end of the buffer would motivated by the missing first sample in the subsequent contiguous buffer being stitched.

If, however, the first sample is interpolated with zero, then the last sample can be interpolated with zero, which provides consistent interpolation between contiguous buffers.

If the playback algorithm doesn't support this, then it is not conforming to the stated principles "Sub-sample start offsets or loop points may require additional interpolation between sample frames" and "Resampling of the buffer may be performed arbitrarily by the UA at any desired point to increase the efficiency or quality of the output."

collares mentioned this issue Aug 20, 2019

AudioBufferSourceNode: Implement linear extrapolation for endpoints servo/media#300

Open

rtoy added the Needs Edits Decision has been made, the issue can be fixed. https://speced.github.io/spec-maintenance/about/ label Aug 22, 2019

collares mentioned this issue Aug 22, 2019

AudioBufferSourceNode: Hold last sample instead of interpolating with 0 servo/media#302

Closed

collares mentioned this issue Aug 25, 2019

AudioBufferSourceNode: Allow for (small) negative offsets in Start() when subsampling #2047

Closed

mdjp added this to the Web Audio V1 milestone Sep 16, 2019

karlt mentioned this issue Oct 15, 2020

Fix #2032: ABSN extrapolates the last output #2256

Merged

rtoy closed this as completed in 23502fc Dec 15, 2020

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

AudioBufferSourceNode: How to interpolate after last sample? #2032

AudioBufferSourceNode: How to interpolate after last sample? #2032

collares commented Aug 20, 2019

rtoy commented Aug 22, 2019

karlt commented Aug 22, 2019

collares commented Aug 22, 2019 •

edited

Loading

karlt commented Aug 23, 2019

collares commented Aug 23, 2019 •

edited

Loading

rtoy commented Aug 27, 2019

rtoy commented Sep 26, 2019

padenot commented Oct 31, 2019

rtoy commented Oct 31, 2019

haywirez commented Jun 9, 2020 •

edited

Loading

rtoy commented Sep 11, 2020

karlt commented Oct 15, 2020

AudioBufferSourceNode: How to interpolate after last sample? #2032

AudioBufferSourceNode: How to interpolate after last sample? #2032

Comments

collares commented Aug 20, 2019

rtoy commented Aug 22, 2019

karlt commented Aug 22, 2019

collares commented Aug 22, 2019 • edited Loading

karlt commented Aug 23, 2019

collares commented Aug 23, 2019 • edited Loading

rtoy commented Aug 27, 2019

rtoy commented Sep 26, 2019

padenot commented Oct 31, 2019

rtoy commented Oct 31, 2019

haywirez commented Jun 9, 2020 • edited Loading

rtoy commented Sep 11, 2020

karlt commented Oct 15, 2020

collares commented Aug 22, 2019 •

edited

Loading

collares commented Aug 23, 2019 •

edited

Loading

haywirez commented Jun 9, 2020 •

edited

Loading