Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add missing Rubber Band padding, preventing it from eating the initial transient #11120

Merged
merged 13 commits into from
Dec 21, 2022
Merged
Show file tree
Hide file tree
Changes from 8 commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
125 changes: 94 additions & 31 deletions src/engine/bufferscalers/enginebufferscalerubberband.cpp
Original file line number Diff line number Diff line change
@@ -1,7 +1,5 @@
#include "engine/bufferscalers/enginebufferscalerubberband.h"

#include <rubberband/RubberBandStretcher.h>

#include <QtDebug>

#include "control/controlobject.h"
Expand All @@ -19,7 +17,7 @@ namespace {
// TODO (XXX): this should be removed. It is only needed to work around
// a Rubberband 1.3 bug.
// This is the default increment from RubberBand 1.8.1.
size_t kRubberBandBlockSize = 256;
constexpr size_t kRubberBandBlockSize = 256;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Are you able to update the comment?
Our oldest supported version is 1.8.2 from Ubuntu Focal

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As in, completely remove the block size limit?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes.
I have double checked, the bug has been fixed in Rubberband 1.7
breakfastquay/rubberband@c26dc1d

By the way, Rubberband will never request more than getPreferredStartPad() * 2
https://github.com/breakfastquay/rubberband/blob/de56cd114a678003dfef17abbd74ebd9203964eb/src/finer/R3Stretcher.cpp#L565
Maybe we can use this info to improve our buffers.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I reverted most of adb034b in 9b8e82a. It Works On My System ™️ with librubberband 3.1.2.


#define RUBBERBANDV3 (RUBBERBAND_API_MAJOR_VERSION >= 2 && RUBBERBAND_API_MINOR_VERSION >= 7)

Expand All @@ -28,22 +26,16 @@ size_t kRubberBandBlockSize = 256;
EngineBufferScaleRubberBand::EngineBufferScaleRubberBand(
ReadAheadManager* pReadAheadManager)
: m_pReadAheadManager(pReadAheadManager),
m_buffer_back(SampleUtil::alloc(MAX_BUFFER_LEN)),
m_buffers{mixxx::SampleBuffer(MAX_BUFFER_LEN), mixxx::SampleBuffer(MAX_BUFFER_LEN)},
m_bufferPtrs{m_buffers[0].data(), m_buffers[1].data()},
m_interleavedReadBuffer(MAX_BUFFER_LEN),
m_bBackwards(false),
m_useEngineFiner(false) {
m_retrieve_buffer[0] = SampleUtil::alloc(MAX_BUFFER_LEN);
m_retrieve_buffer[1] = SampleUtil::alloc(MAX_BUFFER_LEN);
// Initialize the internal buffers to prevent re-allocations
// in the real-time thread.
onSampleRateChanged();
}

EngineBufferScaleRubberBand::~EngineBufferScaleRubberBand() {
SampleUtil::free(m_buffer_back);
SampleUtil::free(m_retrieve_buffer[0]);
SampleUtil::free(m_retrieve_buffer[1]);
}

void EngineBufferScaleRubberBand::setScaleParameters(double base_rate,
double* pTempoRatio,
double* pPitchRatio) {
Expand Down Expand Up @@ -142,31 +134,46 @@ void EngineBufferScaleRubberBand::clear() {
VERIFY_OR_DEBUG_ASSERT(m_pRubberBand) {
return;
}
m_pRubberBand->reset();
reset();
}

SINT EngineBufferScaleRubberBand::retrieveAndDeinterleave(
CSAMPLE* pBuffer,
SINT frames) {
SINT frames_available = m_pRubberBand->available();
SINT frames_to_read = math_min(frames_available, frames);
const SINT frames_available = m_pRubberBand->available();
const SINT frames_to_read = math_min(frames_available, frames);
SINT received_frames = static_cast<SINT>(m_pRubberBand->retrieve(
m_retrieve_buffer, frames_to_read));
m_bufferPtrs.data(), frames_to_read));
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We need to read here frames_to_read + frame_offset;
Even though the buffer is more than big enough, it would be correct to check for its size.
We may consider to use a way smaller buffer than MAX_BUFFER_LEN b.t.w.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No, did you miss to push a commit?

The function expects frames and we need to discard frame_offset from the retrieved buffer. So we need to retrieve frames + frame_offset.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

See 755a4a4.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The linked code makes that the function returns less frames than it might could have read. So we should not clamp the value to frames , but to frames + frame_offset in line 132

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No we should not. I can't check what m_pRubberBand->available() returns here, but logically it would return 1024 if it's doing overlap-add with four times overlap. At a 1.0 pitch ratio/when the time stretcher should not be making any adjustments, we need to throw away half a window's worth of samples, which is 2048 in this case. So in that situation the first two calls to this function must read and return 0 samples or else we've been reading the processed padding that we're trying to throw away here.

The linked code makes that the function returns less frames than it might could have read.

It can only read a small window of output at a time. Presumably this is 1/4th of the FFT window size. So 1024 samples. Which means that the first two calls to this function (each producing 1024 samples of garbage output) should be thrown away.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I added a bit debug code to shows the issue after a seek:
qDebug() << "padding = " << m_remainingPaddingInOutput << "frames = " << frames << "available = " << frames_available;

debug [Engine] padding  =  1025 frames  =  256 available  =  0
debug [Engine] padding  =  1025 frames  =  256 available  =  512
debug [Engine] padding  =  769 frames  =  256 available  =  768                 
debug [Engine] padding  =  513 frames  =  256 available  =  1024
debug [Engine] padding  =  257 frames  =  256 available  =  1280
debug [Engine] padding  =  1 frames  =  256 available  =  1536
debug [Engine] padding  =  0 frames  =  1 available  =  1792
debug [Engine] padding  =  0 frames  =  256 available  =  1791
debug [Engine] padding  =  0 frames  =  256 available  =  1535
debug [Engine] padding  =  0 frames  =  256 available  =  1279

In the second call, we read only 256 frames from the available 512. The available frames are stacked up before they are reduced again. We need to read all available frames in that case. This needs to be fixed.

Copy link
Contributor Author

@robbert-vdh robbert-vdh Dec 21, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh yeah I see what you mean now! Fixed in d3aa210.

Somewhat related, would there be any interest in exposing the Rubberband R3 + short window option? I added it for myself here: robbert-vdh@af570f4 Its performance is in line with R2 (and also no huge spikes when searching), and quality wise it's more or less a sidegrade to R2. It doesn't have R2's time domain transient preservation, but it also doesn't introduce spurious transients and it does still sound better than R2 without transient preservation. Hmm maybe not. It usually does better than R2, but it sometimes also transforms kickdrums into smeary messes.

SINT frame_offset = 0;

// As explained below in `reset()`, the first time this is called we need to
// drop the silence we fed into the time stretcher as padding from the
// output
if (m_remainingPaddingInOutput > 0) {
const SINT drop_num_frames = std::min(received_frames, m_remainingPaddingInOutput);

m_remainingPaddingInOutput -= drop_num_frames;
received_frames -= drop_num_frames;
frame_offset += drop_num_frames;
}

DEBUG_ASSERT(received_frames <= static_cast<ssize_t>(m_buffers[0].size()));
SampleUtil::interleaveBuffer(pBuffer,
m_retrieve_buffer[0],
m_retrieve_buffer[1],
received_frames);
m_buffers[0].data() + frame_offset,
m_buffers[1].data() + frame_offset,
received_frames);

return received_frames;
}

void EngineBufferScaleRubberBand::deinterleaveAndProcess(
const CSAMPLE* pBuffer, SINT frames, bool flush) {
DEBUG_ASSERT(frames <= static_cast<ssize_t>(m_buffers[0].size()));
Copy link
Member

@JoergAtGithub JoergAtGithub Dec 22, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

When I try to compile this locally on Windows with VS2022, it fails with an error, that identifier ssize_t is undefined.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh I didn't realize ssize_t isn't from the standard library, and it compiles fine with MSVC on the CI. Probably through some transitive dependency. I guess ptrdiff_t would be the portable alternative that comes closest to ssize_t.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Afaik its part of the C standard library. I guess you could also #include <cstddef>?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It doesn't appear to be. Libusb and the other libraries Mixxx uses that also use ssize_t seem to explicitly define it on Windows.


SampleUtil::deinterleaveBuffer(
m_retrieve_buffer[0], m_retrieve_buffer[1], pBuffer, frames);
m_buffers[0].data(), m_buffers[1].data(), pBuffer, frames);

m_pRubberBand->process(m_retrieve_buffer,
m_pRubberBand->process(m_bufferPtrs.data(),
frames,
flush);
}
Expand All @@ -192,17 +199,20 @@ double EngineBufferScaleRubberBand::scaleBuffer(
// enough calls to retrieveAndDeinterleave because CachingReader returns
// zeros for reads that are not in cache. So it's safe to loop here
// without any checks for failure in retrieveAndDeinterleave.
// If the time stretcher has just been reset then this will throw away
// the first `m_remainingPaddingInOutput` samples of silence padding
// from the output.
SINT received_frames = retrieveAndDeinterleave(
read, remaining_frames);
remaining_frames -= received_frames;
total_received_frames += received_frames;
read += getOutputSignal().frames2samples(received_frames);

if (break_out_after_retrieve_and_reset_rubberband) {
//qDebug() << "break_out_after_retrieve_and_reset_rubberband";
// qDebug() << "break_out_after_retrieve_and_reset_rubberband";
// If we break out early then we have flushed RubberBand and need to
// reset it.
m_pRubberBand->reset();
reset();
break;
}

Expand All @@ -221,26 +231,26 @@ double EngineBufferScaleRubberBand::scaleBuffer(
iLenFramesRequired = kRubberBandBlockSize;
}
}
//qDebug() << "iLenFramesRequired" << iLenFramesRequired;
// qDebug() << "iLenFramesRequired" << iLenFramesRequired;

if (remaining_frames > 0 && iLenFramesRequired > 0) {
SINT iAvailSamples = m_pReadAheadManager->getNextSamples(
// The value doesn't matter here. All that matters is we
// are going forward or backward.
(m_bBackwards ? -1.0 : 1.0) * m_dBaseRate * m_dTempoRatio,
m_buffer_back,
getOutputSignal().frames2samples(iLenFramesRequired));
// The value doesn't matter here. All that matters is we
// are going forward or backward.
(m_bBackwards ? -1.0 : 1.0) * m_dBaseRate * m_dTempoRatio,
m_interleavedReadBuffer.data(),
getOutputSignal().frames2samples(iLenFramesRequired));
SINT iAvailFrames = getOutputSignal().samples2frames(iAvailSamples);

if (iAvailFrames > 0) {
last_read_failed = false;
deinterleaveAndProcess(m_buffer_back, iAvailFrames, false);
deinterleaveAndProcess(m_interleavedReadBuffer.data(), iAvailFrames, false);
} else {
if (last_read_failed) {
// Flush and break out after the next retrieval. If we are
// at EOF this serves to get the last samples out of
// RubberBand.
deinterleaveAndProcess(m_buffer_back, 0, true);
deinterleaveAndProcess(m_interleavedReadBuffer.data(), 0, true);
break_out_after_retrieve_and_reset_rubberband = true;
}
last_read_failed = true;
Expand Down Expand Up @@ -279,10 +289,63 @@ void EngineBufferScaleRubberBand::useEngineFiner(bool enable) {
}
}

// See
// https://github.com/breakfastquay/rubberband/commit/72654b04ea4f0707e214377515119e933efbdd6c
// for how these two functions were implemented within librubberband itself
size_t EngineBufferScaleRubberBand::getPreferredStartPad() const {
#if RUBBERBANDV3
return m_pRubberBand->getPreferredStartPad();
#else
// `getPreferredStartPad()` returns `window_size / 2`, while with
// `getLatency()` both time stretching engines return `window_size / 2 /
// pitch_scale`
return static_cast<size_t>(std::ceil(
m_pRubberBand->getLatency() * m_pRubberBand->getPitchScale()));
#endif
}

size_t EngineBufferScaleRubberBand::getStartDelay() const {
#if RUBBERBANDV3
return m_pRubberBand->getStartDelay();
#else
// In newer Rubber Band versions `getLatency()` is a deprecated alias for
// `getStartDelay()`, so they should behave the same. In the commit linked
// above the behavior was different for the R3 stretcher, but that was only
// during the initial betas of Rubberband 3.0 so we shouldn't have to worry
// about that.
return m_pRubberBand->getLatency();
#endif
}

int EngineBufferScaleRubberBand::runningEngineVersion() {
#if RUBBERBANDV3
return m_pRubberBand->getEngineVersion();
#else
return 2;
#endif
}

void EngineBufferScaleRubberBand::reset() {
m_pRubberBand->reset();

// As mentioned in the docs (https://breakfastquay.com/rubberband/code-doc/)
// and FAQ (https://breakfastquay.com/rubberband/integration.html#faqs), you
// need to run some silent samples through the time stretching engine first
// before using it. Otherwise it will eat add a short fade-in, destroying
// the initial transient.
size_t remaining_padding = getPreferredStartPad();
std::fill_n(m_buffers[0].span().begin(), kRubberBandBlockSize, 0.0f);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please add a comment that this filling zeros is the correct solution here, because we always fade in from zero after seek to avoid click sounds. This means that the first half of the buffer needs to be zero anyway.

I have verified it with a recording in Audacity that this looks good.

My fist assumption to use here real samples is correct but they have zero gain = 0.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes the idea of the zero padding is that if the STFT process is working correctly, then those samples will not have any impact on the rest of the output. See my comment above. I added a link to the comment in the code.

std::fill_n(m_buffers[1].span().begin(), kRubberBandBlockSize, 0.0f);
while (remaining_padding > 0) {
const size_t pad_samples = std::min<size_t>(remaining_padding, kRubberBandBlockSize);
m_pRubberBand->process(m_bufferPtrs.data(), pad_samples, false);

remaining_padding -= pad_samples;
}

// The silence we just added covers half a window (see the last paragraph of
// https://github.com/mixxxdj/mixxx/pull/11120#discussion_r1050011104). This
// silence should be dropped from the result when the `retrieve()` in
// `retrieveAndDeinterleave()` first starts producing audio.
m_remainingPaddingInOutput = static_cast<SINT>(getStartDelay());
}
45 changes: 37 additions & 8 deletions src/engine/bufferscalers/enginebufferscalerubberband.h
Original file line number Diff line number Diff line change
@@ -1,21 +1,27 @@
#pragma once

#include <rubberband/RubberBandStretcher.h>

#include <array>

#include "engine/bufferscalers/enginebufferscale.h"
#include "util/memory.h"

namespace RubberBand {
class RubberBandStretcher;
} // namespace RubberBand
#include "util/samplebuffer.h"

class ReadAheadManager;

// Uses librubberband to scale audio. This class is not thread safe.
class EngineBufferScaleRubberBand : public EngineBufferScale {
class EngineBufferScaleRubberBand final : public EngineBufferScale {
Q_OBJECT
public:
explicit EngineBufferScaleRubberBand(
ReadAheadManager* pReadAheadManager);
~EngineBufferScaleRubberBand() override;

Swiftb0y marked this conversation as resolved.
Show resolved Hide resolved
EngineBufferScaleRubberBand(const EngineBufferScaleRubberBand&) = delete;
EngineBufferScaleRubberBand& operator=(const EngineBufferScaleRubberBand&) = delete;

EngineBufferScaleRubberBand(EngineBufferScaleRubberBand&&) = delete;
EngineBufferScaleRubberBand& operator=(EngineBufferScaleRubberBand&&) = delete;

// Let EngineBuffer know if engine v3 is available
static bool isEngineFinerAvailable();
Expand All @@ -38,7 +44,17 @@ class EngineBufferScaleRubberBand : public EngineBufferScale {
// Reset RubberBand library with new audio signal
void onSampleRateChanged() override;

/// Calls `m_pRubberBand->getPreferredStartPad()`, with backwards
/// compatibility for older librubberband versions.
size_t getPreferredStartPad() const;
/// Calls `m_pRubberBand->getStartDelay()`, with backwards compatibility for
/// older librubberband versions.
size_t getStartDelay() const;
int runningEngineVersion();
/// Reset the rubberband instance and run the prerequisite amount of padding
/// through it. This should be used instead of calling
/// `m_pRubberBand->reset()` directly.
void reset();

void deinterleaveAndProcess(const CSAMPLE* pBuffer, SINT frames, bool flush);
SINT retrieveAndDeinterleave(CSAMPLE* pBuffer, SINT frames);
Expand All @@ -48,11 +64,24 @@ class EngineBufferScaleRubberBand : public EngineBufferScale {

std::unique_ptr<RubberBand::RubberBandStretcher> m_pRubberBand;

CSAMPLE* m_retrieve_buffer[2];
CSAMPLE* m_buffer_back;
/// The audio buffers samples used to send audio to Rubber Band and to
/// receive processed audio from Rubber Band. This is needed because Mixxx
/// uses interleaved buffers in most other places.
std::array<mixxx::SampleBuffer, 2> m_buffers;
/// These point to the buffers in `m_buffers`. They can be defined here
/// since this object cannot be moved or copied.
std::array<float*, 2> m_bufferPtrs;

/// Contains interleaved samples read from `m_pReadAheadManager`. These need
/// to be deinterleaved before they can be passed to Rubber Band.
mixxx::SampleBuffer m_interleavedReadBuffer;

// Holds the playback direction
bool m_bBackwards;
/// The amount of silence padding that still needs to be dropped from the
/// retrieve samples in `retrieveAndDeinterleave()`. See the `reset()`
/// function for an explanation.
SINT m_remainingPaddingInOutput = 0;

bool m_useEngineFiner;
};