-
Notifications
You must be signed in to change notification settings - Fork 1k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Support files with frame rates over 48KHz and don't use temporary filenames #262
Conversation
This is great! Would you mind adding some short clips to the test data at these higher sample rates and a few short tests verifying they can be read and exported correctly? Also, if you pull from master it should fix your test failures here. |
Sure, I'll add some tests for high sample rates in the following days. I haven't worked on export yet since I don't need to export audio on my workflow with pydub, but I'll try to work on that too. Btw, I just noticed you just dropped python 2.6 support. The python2 travis build is failing because I tested on python2.7 (where struct.unpack_from accepts a bytearray parameter) and I was wondering how to test such an old python version to fix it. So dropping 2.6 support is good news :) |
This allows pydub to open files having more than 48KHz and/or 32-bit data. If scipy is not available, falls back to using the standard wave module as before. Fixes jiaaro#134
…m_file Rename from_file to from_file_using_temporary_files just in case there's any case in which the new from_file doesn't work (I couldn't find any, but just in case, I guess it would be nice to keep it maybe as deprecated). Add a new from_file function that does all the reading on memory with pipes, not using any temporary file, which is faster and doesn't wear down disks for heavy usages. The new from_file function reads the input file and passes it to ffmpeg using a pipe and then reads ffmpeg output using another pipe directly to memory. Since wav files have the file length in the header and ffmpeg can't write it since it's working on a stream, we modify the resulting raw data from ffmpeg before reading it using the standard method. Fixes jiaaro#237 Might also fix jiaaro#209
Instead of always using pipes to feed ffmpeg the input file, if we have a file on disk, just let ffmpeg open and read it by itself. ffmpeg needs to be able to seek on certain formats to read them, so we can't use pipes. This commit fixes reading those formats.
scipy only supports wav files with a bit depth in (8, 16, 32, 64, 96, 128) so in other cases (most notably, 24), it raises a ValueError("Unsupported " " bit depth: the wav file has {}-bit data.".format(bit_depth)) exception. Also, When the data doesn't start with the standard RIFF header, scipy raises a ValueError("File format {}... not understood.".format(repr(data))) exception. In those cases, as well as when scipy is not available, fallback to using the python standard wave module, just in case it does support reading that data (for some cases of 24 bit depth data, it does).
avconv doesn't seem to set the return code correctly and might return no data, print an error message and still give a return code of 0, so we handle having "no data" also as an error
It seems avconv is lazier than ffmpeg when initializing the wav headers. ffmpeg initialized the data subchunk header and only the RIFF chunk had to be tampered, but avconv also leaves the data subchunk header uninitialized, so we also have to set it to the correct value.
Running the tests under python3 shows over 60 warnings like ResourceWarning: unclosed file <_io.BufferedRandom name='/tmp/tmpwff3iqay.mp3'> With this commit, the warnings are reduced to 33.
As opposed to unpack, which works on strings in python2
Awesome! So could we have a new release of this version in pip as soon as possible? |
Glad that you like it, but note I still have to add tests and export support :) |
I will be happy to issue a release as soon as it's ready :) |
The current mediainfo function mixes information from different streams so mediainfo_json is a replacement which just asks avprobe/ffprobe to return the information as a json dictionary and parses it, thus giving separate information for all streams Additionally, avprobe doesn't return as much information as ffprobe on the json data, so we extract that missing information from its stderr output. Finally, also, remove an unneeded import from the code of mediainfo.
By default, avconv/ffmpeg generates pcm_s16le data when converting to a wav file, even if the input is using 24/32 bits. For this reason, we have to find out the bits_per_sample of the input data and then request the same value in the output file. Note that the wave module doesn't support a frequency over 44 KHz and scipy doesn't support 24 bits, so we convert to 32 bits in that case (sample_fmt would be 's32' while bits_per_sample would be 24) to not lose any information.
Instead of using sample_fmt, we just always generate the codec name from the bits_per_sample value (and hardcode it to be signed and little-endian)
ffmpeg might output text not using utf-8, in that case we shouldn't be parsing it anyway, so we just set decode to ignore the wrong encoded characters.
So let's not try to parse the output if there's no output from ffmpeg. This can happen for example for non-existant files.
For format_test.m4a, ffprobe is returning: Stream #0:0(und): Audio: aac (LC) (mp4a / 0x6134706D), 44100 Hz, mono, fltp, 63 kb/s (default) So in order to get the stream id, we have to split on '(' and ')' characters too
Test files in 192kHz having 16, 24, 32 and 64 bit depth I also added a 192kHz/32bit flac file for completion, since it's a common format for that kind of files.
There are certain file formats which use float samples, so now we use 32 bits for those, and thus, require scipy to use those tests (using m4a, ogg, webm extensions).
The new 192kHz tests require scipy, but also other tests that worked before, just because ffmpeg/avconv converted files automatically to 16 bits samples, but now, if the file format uses float samples, we use 32 bit samples.
3.3.1 is not <= 3.3, so better compare for < 3.4
It seems the output of avprobe is slightly different in ubuntu than what I'm getting in openSUSE, so let's try if we can parse it like this.
Would like to help you with testing. I benefit from getting it working. I work with High Rez audio from 24bit 48000 96000 176400. I also deal with DSF audio files The Flac 24bit 48000 96000 176400 files converted with Ffmpeg and Sox to Wav give all the Error: unknown format: 65534 of wave.py. As determined by @antlarr: By default, avconv / ffmpeg generates pcm_s16le data when converting to a wav file, even if the input is using 24/32 bits. The working solution is: For dsf to flac: For flac to wav |
Hi @Crojav, I don't have any dsf file to test, but it would be nice if you could test my changes with your files. Please, check that you have scipy installed, as that's required for >48kHz support. Also, there are file formats that use a float as sample_fmt (like m4a or ogg vorbis). Those formats were supported because ffmpeg automatically converted them to 16 bits, but are actually 32 bits. My patches currently convert them to pcm_s32le (or pcm_s64le in case of samples using double as sample_fmt) which I think can be improved in the future to use actual float/double values, but for now I think it's already an improvement. The problem is that this means that to open those formats, now we require scipy. @jiaaro, is that ok for you? |
Btw, I don't know why the appveyor tests don't seem to honor the skipIf decorators (or maybe it's check_module_availability which is failing), but I don't have any windows machine to test, so I'd need help with that. Also, I wouldn't know how to install scipy on windows (is just using pip ok?) which would fix it, but still, it would be nice to get the skipIf decorators working (as it does on Linux) |
numpy.ndarray.tobytes was included in 1.9.0, so in case it's not available, just use bytes(array)
This way we don't require any external dependency and still can read all kind of bit depths and frame rates. I tested that all format fields and raw_data read using the scipy/wave modules and this new function are the same for test data. Since we now don't require scipy for >48kHz files anymore, I removed all conditions to skip tests, since we can run all of them without any condition.
This way we treat the same all kind of data and recognize 24/32 bit audio either if it's read from a file or if it's stored in memory.
Test with 24b - 32b / 44100 48000 88000 96000 176400 192000 hz samples trim_01a_32-44100.wav play trim_03d_32-176400.wav wave.Error: unknown format: 65534 |
@Crojav: Please, make sure you're using the latest code from my fork or wait until this pull request is accepted to test it. |
There are certain ffprobe versions (mostly on windows) that show that mp3/mp4/aac/webm/ogg files contain fltp samples. But afaik, those formats always use 16 bit samples, so this commit add a workaround to force those cases to use 16 bit depth. If for some reason, those format contain s24 or s32 samples, they'll be used correctly, but until ffprobe is "fixed", let's just convert fltp samples to s16le.
O.k now the latest code from @antlarr fork: 01a_32-44100.wav not playing |
Thanks for testing! Exactly how are you processing those files in your tests? Can you send me those files (or a download link) to test them myself? Are you testing on linux or windows? if linux, what distribution? using ffmpeg or avconv? |
https://github.com/Crojav/pytori/tree/master/test-wav_24b-32b https://github.com/Crojav/pytori/blob/master/pytori-playlist-pitch.py I am not aware of using ffmpeg or avconv for playing the test-wav_24b-32b ffmpeg is installed LinuxMint |
os.fsdecode was introduced in python 3.2, so I implemented fsdecode in utils.py to be able to use it on previous versions. fsdecode thus returns the filename passed as parameter as a unicode string (either if it's str or bytes), and if a PathLike object is used, it returns its filename. In other case, it raises a TypeError exception. Changed code in from_file and mediainfo_json to use it, which simplifies the code a bit and hopefully fixes the last windows appveyor error.
@Crojav, I did a quick test opening the files and it seems they're read correctly. Your script doesn't work since you first open them with stdlib's wave module but it raises an "Error: unknown format: 65534" exception, so it doesn't even reach the code that uses pydub. |
O.k good to know! I will leave out Wave to get framerate sampwidth and handle this with Scipy |
This allows pydub to open files having more than 48KHz
and/or 32-bit data if scipy is available.
If scipy is not available, it falls back to using the standard
wave module as before.
Also, add a new from_file function that does all the reading on memory with
pipes, not using any temporary file, which is faster and doesn't wear
down disks for heavy usages.
The new from_file function reads the input file and passes it to ffmpeg
using a pipe and then reads ffmpeg output using another pipe directly
to memory.
Since wav files have the file length in the header and ffmpeg can't
write it since it's working on a stream, we modify the resulting raw data
from ffmpeg before reading it using the standard method.
Also, rename from_file to from_file_using_temporary_files just in case there's
any case in which the new from_file function doesn't work (I couldn't find any,
but just in case, I guess it would be nice to keep it maybe as
deprecated).
Fixes #134
Fixes #237
Might also fix #209