WAV.jl not reading wav files with incorrect file size in header #109
If the original file was 0x61f92214 + 8 = 1_643_717_148 bytes long, then the first chunk size in the file was clearly wrong, and this would not be a valid RIFF/WAV file. Some implementations may simply ignore the outermost chunk size, but as you can see in function wavread, this implementation starts by reading

You could write yourself a very simple tool to correct such broken files, or even better fix the source of these files. How was it produced? I guess to decide whether it is worth to make
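The "very simple tool" mentioned above could look roughly like the following. This is a minimal sketch in Python rather than Julia, and the function name `fix_riff_size` is made up for illustration. It assumes the only problem is the outer RIFF chunk size, which it overwrites with the file length minus 8. It does not touch the size field of the inner `data` chunk, which may also be stale in an interrupted recording.

```python
import os
import struct


def fix_riff_size(path):
    """Overwrite the outer RIFF chunk size with (file length - 8).

    Returns (old_size, new_size). Hypothetical helper; modifies the
    file in place, so run it on a copy.
    """
    file_len = os.path.getsize(path)
    with open(path, "r+b") as f:
        header = f.read(8)
        if len(header) < 8 or header[:4] != b"RIFF":
            raise ValueError("not a RIFF file: " + path)
        # Offsets 4..7 hold the chunk size as a little-endian uint32.
        (old_size,) = struct.unpack("<I", header[4:8])
        new_size = file_len - 8
        if new_size > 0xFFFFFFFF:
            raise ValueError("file too large for a 32-bit RIFF size field")
        if old_size != new_size:
            f.seek(4)
            f.write(struct.pack("<I", new_size))
    return old_size, new_size
```

Calling `fix_riff_size("copy_of_recording.wav")` on a copy of the file returns the old and new size values, which makes it easy to log which files in a dataset were actually affected.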
@mgkuhn Thank you very much for the response. The source was the internal audio storage of a hydrophone. As to whether this is a common issue, I have no idea. It seems the typical applications in my field (marine bioacoustics) ignore the outer chunk size, but I have only tried one group of files from a particular hydrophone. I will contact the developer and let them know. I have several datasets from different sources that I can check to see if this is an isolated case and let you know. Should WAV.jl produce some kind of error message when this occurs? It currently just produces an empty array. Or again, this may depend on whether this is a common issue.
I got the impression that

If a process writes a WAV file while data is being recorded, it may not know from the beginning how long the file is going to be. In such situations, the process needs to seek at the end of the recording back to near the start of the file to update the chunk size there. If a recording was aborted in some uncontrolled way (e.g., your hydrophone lost power or someone pulled out the storage medium without stopping the recording first), that final adjustment of the chunk size may never have happened, and you ended up with a slightly corrupted file.
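The write-then-patch sequence described above can be sketched as follows. This is an illustration in Python, not the hydrophone's actual firmware logic. The function name `record_wav`, the `finalize` flag, and the choice of 0 as the placeholder size are all assumptions. It writes 16-bit mono PCM with the canonical 44-byte header and expects blocks of even length.

```python
import struct


def record_wav(path, blocks, sample_rate=8000, finalize=True):
    """Stream 16-bit mono PCM blocks to a WAV file.

    The size fields are written as placeholders first and patched at the
    end. With finalize=False the patch step is skipped, which imitates a
    recording that was aborted before the header could be updated.
    """
    with open(path, "wb") as f:
        # Total length is unknown while recording: write placeholder sizes.
        f.write(b"RIFF" + struct.pack("<I", 0) + b"WAVE")
        # fmt chunk: size 16, PCM (1), 1 channel, rate, byte rate,
        # block align 2, 16 bits per sample.
        f.write(b"fmt " + struct.pack("<IHHIIHH", 16, 1, 1, sample_rate,
                                      sample_rate * 2, 2, 16))
        f.write(b"data" + struct.pack("<I", 0))
        data_len = 0
        for block in blocks:
            f.write(block)
            data_len += len(block)
        if finalize:
            # Seek back and patch both size fields.
            f.seek(4)   # outer RIFF chunk size = file length - 8
            f.write(struct.pack("<I", 36 + data_len))
            f.seek(40)  # data chunk size
            f.write(struct.pack("<I", data_len))
    return data_len
```

With `finalize=False` the audio samples are all on disk, but both size fields still hold the placeholder. A strict reader may then see no data at all, while a lenient reader that ignores the sizes and reads to end of file recovers the audio.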
I have come across a wav file that I cannot read using WAV.jl (or LibSndFile.jl). However, I can open it in just about any other audio program or library (Matlab, R, Raven, Audacity, …) that I have tried. If I read the wav file into Matlab and just write it back out, then it reads fine with WAV.jl. This latter process seems to correct the reading issue using WAV.jl.
I compared the problem wav file with a corrected version created by reading into matlab and writing back out. I am calling the original wav file "mdoc" and the corrected version "mdoc_rewrite". I looked for differences using hexdump. This is the result (*.txt is hexdump output):
I believe positions 5-8 are the file chunk size (little endian) and there is a difference. The file size rewritten from Matlab is correct (if the position 8 value is 61, the chunk size correctly corresponds to the wav file size minus 8 bytes). I've also tried putting arbitrary numbers in positions 5-8, and Matlab can still read the file.
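The comparison described above (bytes 5-8 read as a little-endian integer, checked against the file size minus 8) can be done without a hex dump. The following is a minimal Python sketch. The helper name `riff_size_check` is hypothetical, and it only inspects the outer RIFF header.

```python
import os
import struct


def riff_size_check(path):
    """Return (declared, expected) outer chunk sizes for a RIFF file.

    declared: the little-endian uint32 stored in bytes 5-8 (offsets 4..7).
    expected: the actual file length minus the 8-byte chunk header.
    """
    with open(path, "rb") as f:
        header = f.read(12)
    if len(header) < 12 or header[:4] != b"RIFF":
        raise ValueError("not a RIFF file: " + path)
    (declared,) = struct.unpack("<I", header[4:8])
    expected = os.path.getsize(path) - 8
    return declared, expected


# Byte order check using the value quoted in this thread: 0x61f92214
# is stored on disk as the bytes 14 22 f9 61 (least significant first),
# so the byte at position 8 is 61.
assert int.from_bytes(bytes.fromhex("1422f961"), "little") == 0x61F92214
```

If the two returned numbers differ, the header does not match the file on disk, and a reader that trusts the header may behave differently from one that ignores it.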
Is this an issue with WAV.jl or an incorrectly written wav file? Does it not matter if the file size in the header is incorrect with most programs? Or is there something I’m missing in my understanding of WAV.jl or wav file structure?
Thank you - Robert