cache background_noise rms data #145

Open
wants to merge 1 commit into
base: main
from

Conversation

fantasyRqg

Boost background_noise performance.

  1. Reduce audio decoding and file I/O
  2. Reduce RMS computation. There may be a difference between rms(partial audio) and rms(full audio)
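
As a rough illustration of the two points above (not the code in this PR), the sketch below caches decoded samples together with an RMS value computed once over the whole file. The names `NoiseCache`, `calculate_rms` and `fake_load` are hypothetical, and plain Python lists stand in for the NumPy arrays a real pipeline would use. It also shows why the cached full-file RMS can differ from the RMS of the slice that actually gets mixed in.

```python
import math


def calculate_rms(samples):
    """Root mean square of a non-empty sequence of float samples."""
    return math.sqrt(sum(s * s for s in samples) / len(samples))


class NoiseCache:
    """Hypothetical in-memory cache: file path -> (samples, full-file RMS)."""

    def __init__(self, load_fn):
        self._load_fn = load_fn  # assumed decoder: path -> list of floats
        self._cache = {}
        self.misses = 0  # number of times a file had to be decoded

    def get(self, path):
        if path not in self._cache:
            self.misses += 1
            samples = self._load_fn(path)
            # Decode and compute the RMS once; later calls reuse both
            self._cache[path] = (samples, calculate_rms(samples))
        return self._cache[path]


def fake_load(path):
    # Stand-in for real decoding: a loud first half and a quiet second half
    return [0.5] * 100 + [0.1] * 100


cache = NoiseCache(fake_load)
samples, full_rms = cache.get("noise.wav")
cache.get("noise.wav")  # second lookup is served from memory

# RMS of only the first half, as if a partial segment had been selected
partial_rms = calculate_rms(samples[:100])
print(cache.misses, full_rms, partial_rms)
```

In this toy signal the partial RMS is higher than the full-file RMS, so a gain derived from the cached value would not match one derived from the selected segment. Whether that mismatch matters in practice depends on how uniform the noise files are.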

@iver56
Collaborator

iver56 commented Jun 20, 2022

Hi fantasyRqg, and thanks for your PR 😃

Just for context, so I understand the problem you're proposing to solve, I want to ask some questions:

  • How large is your background noise dataset?
  • If you are training a model, how many workers do you use for preparing the audio examples that go into the training batches?
  • How much memory (RAM) is there on the computer where you are doing the training?
  • What audio file format are your background noise files? And do they have the same sample rate as the "clean" input audios that the noises get added to?
  • Are you using an SSD or an HDD?

Ideally, a good solution would work well in all kinds of combinations of answers to those questions.

@fantasyRqg
Author

  • How large is your background noise dataset?

    About 2k records

  • If you are training a model, how many workers do you use for preparing the audio examples that go into the training batches?

    Only one worker. I tried multiple workers, but it was not fast enough.

  • How much memory (RAM) is there on the computer where you are doing the training?

    I cached samples and noises. Samples took 7 GB, noises took 1.5 GB.

  • What audio file format are your background noise files? And do they have the same sample rate as the "clean" input audios that the noises get added to?

    I don't think audio format and sample rate are a problem. The audio: Audio parameter will take care of all of that.

  • Are you using an SSD or an HDD?

    HDD

@iver56
Collaborator

iver56 commented Jun 29, 2022

Thanks for the insight :) Indeed, in your case it makes sense to apply caching like this.

  • HDD
  • Not very large dataset - fits in RAM
  • Single worker

My own use case is quite different, and would actually be best without caching:

  • SSD
  • Very large dataset, cannot fit in RAM
  • Many workers
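
One possible way to cover both profiles, sketched here purely as an assumption and not as anything in this PR or in the library, is a size-bounded cache: a small dataset ends up fully cached, a large one only keeps the most recently used files, and a size of 0 turns caching off. `decode_file` and `make_cached_loader` are hypothetical names. With several worker processes, each process would hold its own copy of such a cache.

```python
from functools import lru_cache

DECODE_CALLS = []  # records every real decode, for illustration only


def decode_file(path):
    # Stand-in for expensive decoding and disk I/O
    DECODE_CALLS.append(path)
    return [0.0] * 4


def make_cached_loader(max_items):
    """Return a loader that keeps at most max_items decoded files in memory.

    max_items=0 disables caching, so every call decodes again.
    """

    @lru_cache(maxsize=max_items)
    def load(path):
        # Return a tuple so callers cannot mutate the cached samples
        return tuple(decode_file(path))

    return load


load = make_cached_loader(max_items=2)
for p in ["a.wav", "b.wav", "a.wav", "c.wav", "b.wav"]:
    load(p)

info = load.cache_info()
print(info.hits, info.misses, DECODE_CALLS)
```

With room for two files, only the repeated "a.wav" lookup is a hit; "b.wav" has been evicted by the time it is requested again, so it is decoded a second time.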

I don't think audio format and sample rate are a problem. The audio: Audio parameter will take care of all of that.

The reason why I asked is that resampling (in case of mismatch) may take a significant amount of CPU time, slowing down the model training.

I'm currently wrapping up the 0.11 release, and then I'll have some work preparing a few new transforms. After that I'll hopefully have more time to consider this caching feature. In the meantime, thanks for your patience, and I hope you're okay with using your own fork for now.

2 participants