float sample_rate #891
ping @f0k @bmcfee @justinsalamon @nils-werner to get some more DSP guys on board here...
💯 for floating point. There's no good reason to require integer sampling rates, and any interoperability requirements (e.g. C++ extensions) should have an explicit cast so it's clear what's going on.
Floating point. How would you decimate 44,100 by a factor of 8 otherwise?!
Looks like there's no need for further comments on my part :)
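To make the decimation point concrete, here is a small arithmetic sketch. It is only an illustration; the variable names are made up and nothing here is torchaudio API. It shows the timing drift that appears when the decimated rate is truncated to an integer.

```python
# Illustrative arithmetic only; these names are not part of torchaudio.
orig_rate = 44100   # Hz
factor = 8          # keep every 8th sample

true_rate = orig_rate / factor   # fractional rate after decimation
int_rate = int(true_rate)        # what an integer-only field would hold

# One hour of audio at the true decimated rate.
num_samples = int(3600 * true_rate)

# Duration implied by each sample-rate value, in seconds.
true_duration = num_samples / true_rate
int_duration = num_samples / int_rate
drift = int_duration - true_duration

print(f"true rate: {true_rate} Hz, truncated rate: {int_rate} Hz")
print(f"timing drift after one hour: {drift:.3f} s")
```

The drift is about a third of a second per hour here, which may or may not matter depending on the application.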
Any comment by @psmaragdis should indeed be taken seriously :-D Here is a list of how some popular audio packages handle sample rate types
So the majority of C/C++ libs use int. One more compelling argument is that the sample rate in WAV files is stored as an integer. While I would love to use float as it's physically more correct, I guess it is too far out of scope to make a change here... @vincentqb what do you think?
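For reference, the WAV constraint mentioned above can be sketched with the standard `struct` module. The PCM `fmt ` chunk stores the sample rate as a 32-bit unsigned integer, so a fractional rate cannot be packed as-is. `pack_fmt_chunk` is a hypothetical helper written for this illustration, not a function from any audio library.

```python
import struct

def pack_fmt_chunk(sample_rate, num_channels=1, bits_per_sample=16):
    """Pack the body of a PCM 'fmt ' chunk (hypothetical helper)."""
    block_align = num_channels * bits_per_sample // 8
    byte_rate = sample_rate * block_align
    # Layout: format tag (uint16), channels (uint16), sample rate (uint32),
    # byte rate (uint32), block align (uint16), bits per sample (uint16).
    return struct.pack(
        "<HHIIHH", 1, num_channels, sample_rate, byte_rate,
        block_align, bits_per_sample,
    )

chunk = pack_fmt_chunk(44100)
print(len(chunk), "bytes")

try:
    pack_fmt_chunk(5512.5)
except struct.error as exc:
    # struct refuses to pack a float into an integer field.
    print("cannot store 5512.5 Hz:", exc)
```

So any float-based API would still need an explicit conversion step at the point where a WAV header is written.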
@faroit -- just to make sure I understand: your point is that because of other libraries, int should be used, right? Let's say someone resamples a float 44100 Hz waveform to 44100/8 Hz = 5512.5 Hz as suggested above. :) What would be the expected behavior when saving the signal in a file format or I/O library that only supports
I am just saying that it's more common. Actually, why don't we accept both int and float for the resample function and use int everywhere else?
Interesting ;-) I had no idea, so I tried this and resampled a 44.1k file using sox and ffmpeg
I wouldn't limit this software based on what I/O libraries do. We don't use 16-bit samples in our algorithms just because some formats only support that, right? It should be the same for the sample rate. Keep in mind that some audio people work at very different scales (e.g. infrasound or ultrasound). Rounding a fractional sample rate for them can completely ruin their algorithms. @faroit textbook resampling is only defined for integer upsampling/decimation; that's why the resampling function inputs are integers: they actually define the flow graph that's being used. Their ratio, though, can be fractional and lead to a fractional sample rate, e.g. 44100/8.
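The relationship between integer resampling factors and a fractional output rate can be sketched as follows. `resample_factors` is a hypothetical helper for illustration only and does not reflect how torchaudio derives its factors.

```python
from fractions import Fraction

def resample_factors(orig_rate, new_rate):
    """Return integer (up, down) with new_rate == orig_rate * up / down.

    Hypothetical helper. Fraction(float) is exact, so rates that are not
    exactly representable in binary can yield very large factors; a real
    implementation would probably want Fraction.limit_denominator.
    """
    ratio = Fraction(new_rate) / Fraction(orig_rate)
    return ratio.numerator, ratio.denominator

# Decimating 44100 Hz by 8: integer factors, fractional output rate.
up, down = resample_factors(44100, 5512.5)
print(up, down, 44100 * up / down)

# A common integer-to-integer conversion for comparison.
print(resample_factors(44100, 48000))
```

The factors stay integers in both cases; only the resulting rate needs a float.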
Of course the real hot question is whether the sample rate should be complex valued :)
offline discussion with @sw005320 who votes for integer, though I'll let him chime in if he wants to elaborate :)
I'm leaning towards float -- it seems a bit arbitrary to require an even number of samples in a second. If we're afraid of rounding errors, we could optionally support Taking this from the other side, what are the obstacles we can expect with non-integer sample rates? Are there serious problems? Would be curious to hear @sw005320's arguments!
After reading through the discussion, I agree with the consensus, which is float. I was thinking that we could always cast to float internally if needed and keep the original integer representation when we load the WAV file, but that would be tricky. I think we should just use float.
Just a sidenote, because this originated in a discussion about type hints: for type hints,
Float, and round if needed, possibly issuing a warning?
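The "round if needed, possibly issuing a warning" idea could look roughly like this. `to_int_sample_rate` is a hypothetical name used for illustration; it is not an existing torchaudio function.

```python
import warnings

def to_int_sample_rate(sample_rate):
    """Round a float sample rate for an integer-only backend (hypothetical).

    Warns when rounding actually changes the value. Note that Python's
    round() rounds halves to the nearest even integer.
    """
    rounded = int(round(sample_rate))
    if rounded != sample_rate:
        warnings.warn(
            f"sample rate {sample_rate} Hz rounded to {rounded} Hz",
            stacklevel=2,
        )
    return rounded

print(to_int_sample_rate(44100.0))   # exact, no warning
print(to_int_sample_rate(5512.5))    # lossy, emits a warning
```

Keeping the cast in one explicit helper makes the lossy step visible at the I/O boundary, in line with the earlier comment about explicit casts.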
Alrighty, I'm convinced -- we'll go ahead with float for sample rate :) We'll start with handling sample rate on the python side.
Once we have that, I'd suggest a version later than the above,
torchaudio currently treats sample_rate as an integer. In #890, it was mentioned that float may be more appropriate. Thoughts?