Augmentation refactoring and torchaudio SoX effects support #124

Merged
merged 17 commits into from
Nov 13, 2020

Conversation

pzelasko
Collaborator

@pzelasko pzelasko commented Nov 11, 2020

TL;DR

  • Changing the data augmentation APIs in Lhotse to accept a callable with a signature like: def augment_fn(audio: Union[torch.Tensor, np.ndarray], sampling_rate: int) -> np.ndarray
  • Mirroring WavAugment capabilities with torchaudio.sox_effects
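
A minimal sketch of a callable that fits the signature described above, assuming only NumPy is available. The `AugmentFn` alias, `random_gain`, and `apply_augmentation` are hypothetical names used for illustration and are not taken from Lhotse itself.

```python
from typing import Callable

import numpy as np

# Hypothetical alias for illustration; Lhotse's own AugmentFn may be defined differently.
AugmentFn = Callable[[np.ndarray, int], np.ndarray]


def random_gain(audio, sampling_rate: int) -> np.ndarray:
    """Scale the signal by a random gain between -6 dB and +6 dB."""
    # np.asarray also converts CPU torch tensors to NumPy arrays.
    samples = np.asarray(audio, dtype=np.float32)
    gain_db = np.random.uniform(-6.0, 6.0)
    return samples * np.float32(10.0 ** (gain_db / 20.0))


def apply_augmentation(audio, sampling_rate: int, augment_fn: AugmentFn) -> np.ndarray:
    """Apply any callable with the (audio, sampling_rate) -> np.ndarray signature."""
    return augment_fn(audio, sampling_rate)
```

Because the caller only depends on the call signature, any function or callable object with the same shape could be swapped in without changing the calling code.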

@pzelasko pzelasko changed the title [WIP] Augmentation refactoring and torchaudio SoX effects support Augmentation refactoring and torchaudio SoX effects support Nov 11, 2020
@pzelasko
Collaborator Author

It's ready for review.

@pzelasko pzelasko added this to the v0.2 milestone Nov 11, 2020
@pzelasko
Collaborator Author

@freewym please check this out. It is a small but breaking change for the recipe you're creating, but it should make the whole setup much easier (no need to compile libsox, WavAugment, etc.; just install the latest PyTorch + torchaudio with Anaconda) and rid us of various quirks with multiprocessing.

@mthrok mthrok left a comment

Looks good regarding the usage of torchaudio's Sox Effects.

@vincentqb

(glad to see the migration here! cc facebookresearch/WavAugment#16)

@freewym
Contributor

freewym commented Nov 11, 2020

> @freewym please check this out. It is a small but breaking change for the recipe you're creating, but it should make the whole setup much easier (no need to compile libsox, WavAugment, etc.; just install the latest PyTorch + torchaudio with Anaconda) and rid us of various quirks with multiprocessing.

Cool. I think I need to install PyTorch 1.7 in order to test it. I will do that later today.

from .common import AugmentFn
from .wavaugment import *

if str(_torchaudio.__version__) >= '0.7.0':

FYI: If torchaudio hits 0.10.0 release in the future (we do not know if we will move to the major 1.0 release or not), this could produce a wrong result.

$ python
>>> '0.7.0' > '0.10.0'
True

A future-proof way would be to split the version string and compare the major and minor versions as numbers.
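
One possible way to do the numeric comparison, as a sketch using only the standard library. The helper names `parse_major_minor` and `is_at_least` are hypothetical and not part of the PR.

```python
import re
from typing import Tuple


def parse_major_minor(version: str) -> Tuple[int, int]:
    """Extract (major, minor) as integers from strings like '0.7.0' or '0.10.0a0+abc'."""
    match = re.match(r"(\d+)\.(\d+)", version)
    if match is None:
        raise ValueError(f"Unrecognized version string: {version!r}")
    return int(match.group(1)), int(match.group(2))


def is_at_least(version: str, minimum: str) -> bool:
    """Compare versions numerically (tuple comparison) rather than as strings."""
    return parse_major_minor(version) >= parse_major_minor(minimum)


print(is_at_least("0.10.0", "0.7.0"))  # numeric comparison: True
print("0.10.0" >= "0.7.0")             # string comparison: False
```

If an extra dependency is acceptable, `packaging.version.parse` is a commonly used alternative that also handles pre-release and local version suffixes.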

Collaborator Author

Well spotted!

@@ -10,7 +10,7 @@
import torch

from lhotse.audio import Recording
-from lhotse.augmentation import WavAugmenter
+from lhotse.augmentation import AugmentFn, WavAugmenter
Contributor

Is WavAugmenter no longer useful? If so, all occurrences of WavAugmenter in this file should be removed.

Collaborator Author

Good point. I want to keep it for now so that people with PyTorch older than 1.7 can also use it, but I will probably add a deprecation warning...
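
A deprecation warning of the kind mentioned here could look roughly like the sketch below. `LegacyAugmenter` is a hypothetical stand-in class, not the actual WavAugmenter implementation, and the message text is only an example.

```python
import warnings


class LegacyAugmenter:
    """Hypothetical stand-in for a class kept only for backward compatibility."""

    def __init__(self, *args, **kwargs):
        warnings.warn(
            "LegacyAugmenter is deprecated; consider the torchaudio-based "
            "augmentation available with PyTorch 1.7 or newer.",
            category=DeprecationWarning,
            stacklevel=2,  # attribute the warning to the caller's line
        )
        self.args = args
        self.kwargs = kwargs
```

Note that Python hides DeprecationWarning by default outside of `__main__` and test runners, so users may need `-W default` or a warnings filter to see it.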

]


def reverb(sampling_rate: int) -> List[List[str]]:
Contributor

Is it possible to make such functions more general so that they accept more arguments? For example, the lower and upper bounds of the room size could be passed into this function.

Collaborator Author

Sure, I'll change that

Collaborator Author

@pzelasko pzelasko Nov 12, 2020

Actually I'd rather make a follow-up PR later on, as I'm not sure which parameters it makes sense to tweak and how general they should be. If we want to tweak everything it's simpler to just write your own chain...

(I'm open to suggestions)
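
For discussion, here is a rough sketch of what a more general reverb chain might look like. It is not the PR's implementation: the parameter names and defaults are assumptions, and the body of the original function is not shown in this thread. The only thing relied on is the argument order of the SoX `reverb` effect (reverberance, HF damping, room scale), each given as a percentage.

```python
import random
from typing import List


def reverb(
    sampling_rate: int,
    reverberance: float = 50.0,    # assumed default, in percent
    hf_damping: float = 50.0,      # assumed default, in percent
    room_scale_min: float = 0.0,   # hypothetical lower bound of room size, in percent
    room_scale_max: float = 100.0, # hypothetical upper bound of room size, in percent
) -> List[List[str]]:
    """Build a SoX effect chain with a randomly sampled room scale."""
    if not 0.0 <= room_scale_min <= room_scale_max <= 100.0:
        raise ValueError("Expected 0 <= room_scale_min <= room_scale_max <= 100")
    room_scale = random.uniform(room_scale_min, room_scale_max)
    # sampling_rate is unused here; it is kept to match the signature under review.
    return [
        ["reverb", str(reverberance), str(hf_damping), str(room_scale)],
        ["remix", "-"],  # mix down to mono in case the effect produced more channels
    ]
```

Exposing only the room-scale bounds keeps the function small; anyone who needs to tweak every parameter can still write their own chain, as suggested above.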

    end: Union[int, float]

    def sample(self):
        return random.uniform(self.start, self.end)
Contributor

Can we use numpy random functions? It may be easier for me to seed it from outside.

Collaborator Author

Would using this function help (note that it also makes the random cut ID creation deterministic)?

https://github.com/lhotse-speech/lhotse/blob/master/lhotse/utils.py#L33

Contributor

OK, never mind. I will seed it in my own code.

Collaborator Author

I'll change it to numpy anyway; I guess more people are used to seeding numpy than random in training loops.
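
A sketch of how a NumPy-based version could look, assuming the class is a plain dataclass with `start` and `end` fields as in the excerpt above. The final code in the PR may differ.

```python
from dataclasses import dataclass
from typing import Union

import numpy as np


@dataclass
class RandomValue:
    """A value drawn uniformly from [start, end) each time sample() is called."""

    start: Union[int, float]
    end: Union[int, float]

    def sample(self) -> float:
        # Uses NumPy's global generator, so np.random.seed() controls the result.
        return float(np.random.uniform(self.start, self.end))


# Seeding from outside makes the sampled values reproducible.
np.random.seed(0)
first = RandomValue(0.9, 1.1).sample()
np.random.seed(0)
second = RandomValue(0.9, 1.1).sample()
print(first == second)
```

Seeding the global generator is what most training loops already do; passing an explicit `np.random.Generator` would be an alternative design if isolation between components matters.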

    return [
        # Random speed perturbation factor between 0.9x and 1.1x the original speed
        ['speed', RandomValue(0.9, 1.1)],
        ['rate', sampling_rate],  # Resample back to the original sampling rate (speed changes it)
Contributor

@freewym freewym Nov 12, 2020

It looks like this line makes the run hang. It works without this line.

Edit: actually, it does not hang. It terminated with:

  File "/export/fs04/a07/ywang/fairseq4/espresso/tools/lhotse/lhotse/cut.py", line 1311, in compute_and_store_features
    executor.submit(
  File "/export/b03/ywang/anaconda3/lib/python3.8/concurrent/futures/process.py", line 629, in submit
    raise BrokenProcessPool(self._broken)
concurrent.futures.process.BrokenProcessPool: A child process terminated abruptly, the process pool is not usable anymore

Collaborator Author

😫

I'll have a look

Collaborator Author

I was able to replicate the issue and add a unit test that reproduces it. I reported the issue to torchaudio here: pytorch/audio#1021

Contributor

@freewym freewym Nov 12, 2020

BTW, if I replace RandomValue() above with a function _get_value(factor) that simply returns factor, the run hangs as well. Do you have any idea what causes it?

Collaborator Author

Is the function defined as a closure (i.e. within another function) and captures some variable outside of its scope? That could explain it... Otherwise, I don't know.
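
To illustrate why a closure could be a problem in this setting: work submitted to a process pool is pickled, and the standard pickle module cannot serialize functions defined inside other functions. The sketch below only demonstrates that general Python behavior; it does not claim to be the cause of the hang reported here. All names in it are hypothetical.

```python
import functools
import operator
import pickle


def make_scaler_closure(factor: float):
    """Return a nested function that captures `factor` from the enclosing scope."""

    def scale(value: float) -> float:
        return value * factor

    return scale


def can_pickle(obj) -> bool:
    """Report whether the standard pickle module can serialize obj."""
    try:
        pickle.dumps(obj)
        return True
    except (AttributeError, TypeError, pickle.PicklingError):
        return False


closure = make_scaler_closure(1.1)
# functools.partial over an importable function carries the same state but pickles fine.
partial = functools.partial(operator.mul, 1.1)

print("closure picklable:", can_pickle(closure))
print("partial picklable:", can_pickle(partial))
```

A `functools.partial` over a module-level function, or a small class with `__call__`, is a common way to pass parameterized callables to worker processes.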

Collaborator Author

@freewym some good news: if you create the executor like ProcessPoolExecutor(..., mp_context=multiprocessing.get_context("spawn")), it solves the segfault/hanging problem. Could you try it? If it works, I'll go ahead and merge this.
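
Spelled out, the workaround might look like the sketch below. The helper name `map_in_spawned_pool` is hypothetical, and `math.sqrt` merely stands in for the real per-item work.

```python
import math
import multiprocessing
from concurrent.futures import ProcessPoolExecutor
from typing import Callable, Iterable, List


def map_in_spawned_pool(fn: Callable, items: Iterable, max_workers: int = 2) -> List:
    """Run fn over items in worker processes started with 'spawn' instead of 'fork'."""
    context = multiprocessing.get_context("spawn")
    with ProcessPoolExecutor(max_workers=max_workers, mp_context=context) as executor:
        return list(executor.map(fn, items))


if __name__ == "__main__":
    # The guard matters with 'spawn': workers re-import the main module on startup.
    print(map_in_spawned_pool(math.sqrt, [1.0, 4.0, 9.0]))
```

With "spawn", each worker starts from a fresh interpreter rather than inheriting the parent's memory, at the cost of slower startup and the requirement that submitted callables and arguments be picklable.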

Collaborator Author

Credits to @mthrok for suggesting this

Collaborator Author

(to make it clear: it works for me on the grid, on my mac, and in GitHub CI, so it should be okay)

Contributor

Yeah, it works!

Collaborator Author

🎉

@pzelasko pzelasko merged commit d977170 into master Nov 13, 2020
@pzelasko pzelasko deleted the feature/augmentation-refactoring branch July 1, 2021 01:22
Successfully merging this pull request may close these issues.

Data augmentation with Torchaudio
5 participants