float sample_rate #891

vincentqb · 2020-08-26T15:44:36Z

torchaudio currently treats sample_rate as an integer. In #890, it was mentioned that float may be more appropriate.

Librairies we use for saving/loading may not support float sample_rate.
C++ extensions currently use int.
torchaudio signatures use int.

Thoughts?

The text was updated successfully, but these errors were encountered:

faroit · 2020-08-26T16:09:23Z

ping @f0k @bmcfee @justinsalamon @nils-werner to get some more dsp guys on board here...

bmcfee · 2020-08-26T16:27:55Z

💯 for floating point. There's no good reason to require integer sampling rates, and any interoperability requirements (eg C++ extensions) should have an explicit cast so it's clear what's going on.

psmaragdis · 2020-08-26T17:03:10Z

Floating point. How would you decimate 44,100 by a factor of 8 otherwise?!

justinsalamon · 2020-08-26T19:18:17Z

Looks like there's no need for further comments on my part :)

faroit · 2020-08-27T11:24:48Z

Looks like there's no need for further comments on my part :)

any comment by @psmaragdis should indeed taken seriously :-D

Here is a list of how some popular audio packages handle sample rate types

soxr: double
libsndfile: not consistent, mostly int
ffmpeg: not consistent, depends on decoder/encoder submodule
tensorflow: int32
tfio: int64/int32
aubio: int
portaudio: double
libpd: inconsistent, mostly int
opus: int32
flac: int

so the majority of c/c++ libs use int, also one more compelling argument is that the sample rate in WAV files is stored as an integer.

While I would love to use float as its physically more correct, I guess it is too far out of scope to make a change here... @vincentqb what do you think?

vincentqb · 2020-08-27T21:00:01Z

@faroit -- just to make sure I understand: your point is that because of other libraries, int should be the used, right?

Let's say someone resamples a float 44100 hz waveform to 441000/8 hz = 5512.5 hz as suggested above. :) What would be the expected behavior when saving the signal in a file format or i/o library that only supports int? Should an error be raised because the sample rate is not convertible to int, int(5512.5) != 5512.5? The user would then have to explicitly resample to an int-valued sample rate to continue?

faroit · 2020-08-27T21:10:32Z

@faroit -- just to make sure I understand: your point is that because of other libraries, int should be the used, right?

i am just saying that its more common. Actually why don't we take int and floats for the resample function and use int everywhere else?

Let's say someone resamples a float 44100 hz waveform to 441000/8 hz = 5512.5 hz as suggested above. :) What would be the expected behavior when saving the signal in a file format or i/o library that only supports int? Should an error be raised because the sample rate is not convertible to int, int(5512.5) != 5512.5? The user would then have to explicitly resample to an int-valued sample rate to continue?

Interesting ;-) I had no idea, so I tried this and resampled a 44.1k file using sox and ffmpeg

sox 441.wav  -r 5512.5 out.wav
ffmpeg -i 441.wav -ar 5512.5 out.wav

sox didn't complain but saved the file as with 5512.0 Hz
ffmpeg complaint that Expected int64 for ar but found 5512.9

psmaragdis · 2020-08-27T21:21:16Z

I wouldn't limit this software based on what I/O libraries do. We don't use 16bit samples in our algorithms because some formats only support that, right? It shouldn't be the same for the sample rate.

Keep in mind that some audio people work at very different scales (e.g. infrasound or ultrasound). Rounding a fractional sample rate for them can completely ruin their algorithms.

@faroit textbook resampling is only defined for integer upsampling/decimation that's why the resampling function inputs are integers, they actually define the flow graph that's being used. Their ratio though can be fractional, and lead to a fractional value, e.g. 44100/8.

psmaragdis · 2020-08-27T21:22:09Z

Of course the real hot question is whether the sample rate should be complex valued :)

vincentqb · 2020-09-02T20:12:30Z

offline discussion with @sw005320 who votes for integer, though I'll let him chime in if he wants to elaborate :)

f0k · 2020-09-06T17:29:20Z

I'm leaning towards float -- it seems a bit arbitrary to require an even number of samples in a second. If we're afraid of rounding errors, we could optionally support fractions.Fraction.

Taking this from the other side, what are the obstacles we can expect with non-integer sample rates? Are there serious problems? Would be curious to hear @sw005320's arguments!

sw005320 · 2020-09-07T19:46:49Z

After I read through the discussion, I agree with the other people's consensus, which is float. I was thinking that we could always cast it to float internally if needed, and we can keep the original integer representation when we load the WAV file, but it would be tricky. I think we should just use float.

nils-werner · 2020-09-17T06:30:14Z

Just a sidenote, because this originated in a discussion about type hints: for type hints, int is included in float, so people can continue using ints as before, and only switch to floats if needed.

aliutkus · 2020-09-22T11:53:57Z

float, and round if needed, possibly issuing a warning ?

vincentqb · 2020-09-25T15:33:12Z

Alrighty, I'm convinced -- we'll go ahead with float for sample rate :)

We'll start with handling sample rate on the python side.

as a first step, we'll take the integer part of a sample_rate before sending it to any backend for saving or C++ extensions for processing. and load sample rate as an int and convert to float afterwards.
we'll add a note to inform the user on top of the release notes: either a warning, or a sentence in the documentation.
for algorithms (e.g. resampling) that only support integer sample rates, we'll add a note as above, and take the integer part
for algorithms that silently were doing integer division (e.g. fix resample type hint #888, updating silent integer division #890) where it was not needed, we'll migrate explicitly to float division.
we'll change the python type hints to float.

Once we have that, I'd suggest a version later than the above,

if we decide to make all backends behave the same for the end user, my recommendation is to add a flag to the load/save function, force_float with default to False, and make backends that do not support it raise an error when True. we can revisit this at that point.

Co-authored-by: holly1238 <77758406+holly1238@users.noreply.github.com>

[FX] Add example of tracer that records module qualname for each node

vincentqb mentioned this issue Aug 26, 2020

fix resample type hint #888

Closed

vincentqb mentioned this issue Dec 17, 2020

Simpler and faster resampling #1087

Merged

carolineechen mentioned this issue May 5, 2021

Resampling #1487

Closed

6 tasks

mthrok pushed a commit to mthrok/audio that referenced this issue Dec 13, 2022

Fix typo (pytorch#891)

4bd1164

Co-authored-by: holly1238 <77758406+holly1238@users.noreply.github.com>

mpc001 pushed a commit to mpc001/audio that referenced this issue Aug 4, 2023

Merge pull request pytorch#891 from jamesr66a/fx_module_tracer

4db1116

[FX] Add example of tracer that records module qualname for each node

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

float sample_rate #891

float sample_rate #891

vincentqb commented Aug 26, 2020 •

edited

Loading

faroit commented Aug 26, 2020

bmcfee commented Aug 26, 2020

psmaragdis commented Aug 26, 2020

justinsalamon commented Aug 26, 2020 •

edited

Loading

faroit commented Aug 27, 2020 •

edited

Loading

vincentqb commented Aug 27, 2020

faroit commented Aug 27, 2020

psmaragdis commented Aug 27, 2020

psmaragdis commented Aug 27, 2020

vincentqb commented Sep 2, 2020

f0k commented Sep 6, 2020

sw005320 commented Sep 7, 2020

nils-werner commented Sep 17, 2020 •

edited

Loading

aliutkus commented Sep 22, 2020

vincentqb commented Sep 25, 2020 •

edited

Loading

float sample_rate #891

float sample_rate #891

Comments

vincentqb commented Aug 26, 2020 • edited Loading

faroit commented Aug 26, 2020

bmcfee commented Aug 26, 2020

psmaragdis commented Aug 26, 2020

justinsalamon commented Aug 26, 2020 • edited Loading

faroit commented Aug 27, 2020 • edited Loading

vincentqb commented Aug 27, 2020

faroit commented Aug 27, 2020

psmaragdis commented Aug 27, 2020

psmaragdis commented Aug 27, 2020

vincentqb commented Sep 2, 2020

f0k commented Sep 6, 2020

sw005320 commented Sep 7, 2020

nils-werner commented Sep 17, 2020 • edited Loading

aliutkus commented Sep 22, 2020

vincentqb commented Sep 25, 2020 • edited Loading

vincentqb commented Aug 26, 2020 •

edited

Loading

justinsalamon commented Aug 26, 2020 •

edited

Loading

faroit commented Aug 27, 2020 •

edited

Loading

nils-werner commented Sep 17, 2020 •

edited

Loading

vincentqb commented Sep 25, 2020 •

edited

Loading