Audio file creation tests. #72
LGTM
Besides the recommendation about using pytest fixtures, I'm wondering where you got the various hard-coded tolerance / divergence values that you are using, especially in test_wav_file, test_mp3_file, and test_ogg_file.
This isn't required but would clean this up somewhat. That is, if we're going to be touching the audio file tests, I would recommend taking this opportunity to also move the contents of this file out of an offline script and use the fixture creation functionality provided by pytest, so that we don't have to commit the audio files.
This can be done by following these steps:
- Move all helper methods to tests/conftest.py.
- Still in tests/conftest.py or in the test module itself, the actual fixture methods can use the pytest-provided tmp_path fixture, which provides a temporary directory that is unique to each test invocation. For example:
from pathlib import Path

import pytest


@pytest.fixture
def mp3_filepath(tmp_path: Path) -> Path:
    # Encode the in-memory test signal and write it into the per-test temp directory.
    stereo_audio, sr = create_test_audio_stereo()
    mp3_buffer = numpy_to_mp3(stereo_audio, sr)
    save_path = tmp_path / "test_stereo.mp3"
    save_path.write_bytes(mp3_buffer.getvalue())
    return save_path
- Use mp3_filepath as input to test_mp3_file to test the file loading functionality.
Some of the older tests for LoadWithLibrosa do use the files that were created manually via create_test_audio_files.py. These tests could now be replaced with the fixture method, which would allow us to remove some of the wavs from this repo. That would be nice, but I'm going to treat it as a separate issue.
The tests in this PR operate on two kinds of audio:
- in-memory test signals
- audio files with music content (on disk, in repo) which are used as a known baseline. These shouldn't be automatically generated, because we need a baseline to test against (otherwise our test just verifies that the audio file encoders output the same thing when called in succession, as opposed to verifying that they output the correct thing).
All the hard-coded values are based on my empirical observations of values that reflected a known-working condition. For wav files, we could manually calculate the expected delta between floating point audio and discrete 16/24 bit PCM audio. That seems like overkill to me, especially when it's the compressed file format writers that come with the most risk. Can you think of a better way to do this? If not, I think the current implementation is a reasonable balance of safety and effort, but I welcome suggestions. The main danger with the
With that in mind, the next steps I'm imagining after confirming these tests are satisfactory:
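For reference, the wav delta mentioned above can be bounded analytically rather than observed. The following is a minimal sketch, not the code in this PR: it assumes samples are scaled by 2^(bits-1) and rounded to the nearest integer, which is one common convention (some writers scale by 2^(bits-1) - 1 instead, which changes the bound slightly). The function names are hypothetical. Under that assumption the worst-case round-trip error for unclipped samples is half a quantization step, i.e. 2^-16 (about 1.5e-5) for 16-bit and 2^-24 (about 6.0e-8) for 24-bit.

```python
import math


def pcm_roundtrip(sample: float, bits: int) -> float:
    """Quantize a float sample in [-1.0, 1.0) to signed PCM and decode it again."""
    full_scale = 2 ** (bits - 1)  # 32768 for 16-bit, 8388608 for 24-bit
    quantized = round(sample * full_scale)
    # Clip to the representable signed integer range.
    quantized = max(-full_scale, min(full_scale - 1, quantized))
    return quantized / full_scale


def pcm_tolerance(bits: int) -> float:
    """Worst-case round-trip error for unclipped samples: half a quantization step."""
    return 0.5 / 2 ** (bits - 1)


def max_roundtrip_error(signal, bits: int) -> float:
    """Largest absolute difference between the signal and its PCM round trip."""
    return max(abs(s - pcm_roundtrip(s, bits)) for s in signal)


# Hypothetical test signal: one second of a 440 Hz sine at half scale (no clipping).
sample_rate = 44100
signal = [
    0.5 * math.sin(2 * math.pi * 440 * n / sample_rate) for n in range(sample_rate)
]
```

A wav test could then assert that the observed error stays at or below pcm_tolerance(16) or pcm_tolerance(24), possibly with a small safety margin if the writer uses a different scaling convention. This does not help with the lossy formats, where an empirical baseline still seems necessary.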
We want to add support for writing data-compressed audio formats. However, there are some obstacles associated with reliably writing compressed audio from Python.
Avoid MP3s
I would like to avoid mp3s because mp3 compression introduces enough delay to make files out-of-sync. It's not suitable for multitrack audio or audio that needs to loop seamlessly. We are working with multitracks (producer model) so this is a non-starter.
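One way a test could detect this kind of offset is to cross-correlate the decoded audio against the original and report the lag with the highest score. The sketch below is only an illustration under simplifying assumptions: it is pure Python, it only searches non-negative lags, and the "decoded" signal is simulated by prepending silence rather than by running a real encoder. The names estimate_delay and simulated_delay are hypothetical and are not part of this PR.

```python
import random


def estimate_delay(reference, decoded, max_lag: int) -> int:
    """Return the lag (in samples) at which decoded best matches reference."""
    best_lag, best_score = 0, float("-inf")
    for lag in range(max_lag + 1):
        # Correlate reference[i] with decoded[i + lag] over the overlapping region.
        overlap = min(len(reference), len(decoded) - lag)
        score = sum(reference[i] * decoded[i + lag] for i in range(overlap))
        if score > best_score:
            best_lag, best_score = lag, score
    return best_lag


# Simulated example: a noise burst delayed by a known number of samples.
rng = random.Random(0)
reference = [rng.uniform(-1.0, 1.0) for _ in range(500)]
simulated_delay = 37
decoded = [0.0] * simulated_delay + reference

measured = estimate_delay(reference, decoded, max_lag=100)
```

For multitrack material, a nonzero measured lag on any one stem is enough to put it out of sync with the others, which is the concern described above.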
Viable alternatives:
- .ogg (vorbis or opus)
- .opus (opus)
We need a Python library that can write either format reliably in-memory. Tragically, librosa and torchaudio are not good at this. Using the disk is too slow for the scale that we need.
Option 1. Python soundfile
The Python soundfile package can write .ogg files with the vorbis codec, but
numpy_to_ogg implementation)
Option 2. Python pydub
apt install or compiling ffmpeg from scratch in the docker image.
Testing
This PR adds support for testing that files encoded with lossy formats work correctly. Once this is finalized we are set up to reliably pursue one of the options above.
The current implementation uses a few different mechanisms for testing audio file encoding. These mechanisms are ready for review.