This repository has been archived by the owner on Oct 31, 2023. It is now read-only.

Use torchaudio for sox effects #16

Merged on Dec 2, 2020 (2 commits)

15 changes: 3 additions & 12 deletions README.md
@@ -17,9 +17,9 @@ Among others, it implements the augmentations that we found to be most useful fo
 Internally, WavAugment uses [libsox](http://sox.sourceforge.net/libsox.html) and allows interleaving of libsox- and pytorch-based effects.

 ### Requirements
-* Linux (WavAugment is not tested under MacOS and might not work properly);
-* [pytorch](pytorch.org) (however, there is also an option of using WavAugment directly from C++ w/o torch, see below);
-* `libsox`: if you have [torchaudio](https://github.com/pytorch/audio) installed, most likely you already have `libsox`. Otherwise, you need to install it, e.g. by running `sudo apt-get install sox libsox-dev libsox-fmt-all`
+* Linux or MacOS
+* [pytorch](pytorch.org) >= 1.7
+* [torchaudio](pytorch.org/audio) >= 0.7

 ### Installation
 To install WavAugment, run the following command:
@@ -113,15 +113,6 @@ WavAugment remains explicit and doesn't add effects under the hood.
 If you want to emulate a Sox command that decomposes into several effects, we advise to consult `sox -V` and apply the effects manually.
 Try it out on some files before running a heavy machine-learning job.

-## But I want to use it from C++
-### Installation
-It is possible to use directly WavAugment's C++ interface to libsox.
-You will need to install `libsox`, e.g. by running
-```bash
-sudo apt-get install sox libsox-dev libsox-fmt-all
-```
-The C++ interface is provided as a single-header library, so you only need to include [this file](./augment/speech_augment.h).
-
 ## Citation
 If you find WavAugment useful in your research, please consider citing:
 ```
135 changes: 0 additions & 135 deletions augment/augment.cpp

This file was deleted.

29 changes: 19 additions & 10 deletions augment/effects.py
@@ -7,16 +7,30 @@

 import torch
 import numpy as np
-import augment_cpp as _augment
+import torchaudio
+from torchaudio.sox_effects.sox_effects import effect_names as get_effect_names

 def shutdown_sox() -> None:
-    _augment.shutdown_sox()
+    pass

Review comment by @vincentqb (Contributor Author), Oct 28, 2020:
torchaudio now takes care of automatically initializing and shutting down sox. I kept it here to minimize the changes to the tests.


 # Arguments that we can pass to effects
 SoxArg = Optional[List[Union[str, int, Callable]]]


+class _PyEffectChain:
+
+    def __init__(self):
+        self._effects = []
+
+    def add_effect(self, effect_name, effect_params):
+        params = [str(e) for e in effect_params]
+        self._effects.append([effect_name, *params])
+
+    def apply_flow_effects(self, tensor, src_info, target_info):
+        return torchaudio.sox_effects.apply_effects_tensor(tensor, int(src_info['rate']), self._effects)
+
+
 class SoxEffect:
     def __init__(self, name: str, args: SoxArg):
         self.name = name
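
The `_PyEffectChain` shim in this hunk stores each effect as a list of strings, `[name, arg1, arg2, ...]`, and hands the collected list to `torchaudio.sox_effects.apply_effects_tensor` as its `effects` argument. A minimal sketch of that conversion step, kept free of torch so it runs anywhere; `build_effect_list` and the sample chain are hypothetical names used only for illustration, not part of the PR:

```python
from typing import List, Sequence, Tuple, Union

Arg = Union[str, int, float]


def build_effect_list(chain: Sequence[Tuple[str, Sequence[Arg]]]) -> List[List[str]]:
    """Turn (name, args) pairs into a list of string lists.

    Hypothetical helper mirroring what _PyEffectChain.add_effect does in the
    diff: every argument is passed through str().
    """
    effects = []
    for name, args in chain:
        effects.append([name, *[str(a) for a in args]])
    return effects


# Illustrative chain; the names are ordinary sox effects.
effects = build_effect_list([("pitch", [200]), ("rate", [16000]), ("reverb", [50, 50, 100])])
print(effects)

# With torchaudio >= 0.7 installed, the list would be used roughly like this
# (not executed here; `waveform` is an assumed 2-D float tensor):
#   out, sr = torchaudio.sox_effects.apply_effects_tensor(waveform, 16000, effects)
```

Stringifying the arguments up front keeps the Python-side chain free to accept ints and floats while still producing the all-strings form the sox backend takes.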
@@ -82,18 +96,13 @@ def _apply_sox_effects(chain: List[SoxEffect],
                            src_info: Dict,
                            target_info: Dict) -> Tuple[torch.Tensor, int]:
         instantiated_chain = [x.instantiate() for x in chain]
-        sox_chain = _augment.PyEffectChain()
+        sox_chain = _PyEffectChain()
         for effect_name, effect_args in instantiated_chain:
             sox_chain.add_effect(effect_name, effect_args)

-        input_tensor.mul_(EffectChain._NORMALIZER)
-
-        out = input_tensor
-        sr = sox_chain.apply_flow_effects(input_tensor,
-                                          out,
+        out, sr = sox_chain.apply_flow_effects(input_tensor,
                                           src_info,
                                           target_info)
-        out.div_(EffectChain._NORMALIZER)
         return out, sr

     def apply(self,
@@ -279,5 +288,5 @@ def create_method(name):
     return lambda s, *x: s._append_effect_to_chain(name, list(x))  # pylint: disable=protected-access


-for _effect_name in _augment.get_effect_names():
+for _effect_name in get_effect_names():
     setattr(EffectChain, _effect_name, create_method(_effect_name))
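
The final hunk keeps the existing pattern of attaching one method per sox effect name to `EffectChain` and only swaps where the names come from. A self-contained sketch of that registration pattern follows; `MiniChain`, its `_append` method, and the hard-coded name list are stand-ins invented for illustration (the PR takes the names from torchaudio's `effect_names`), and returning `self` from `_append` is an assumption made here so that calls can be chained:

```python
class MiniChain:
    """Toy stand-in for EffectChain (hypothetical, for illustration only)."""

    def __init__(self):
        self.effects = []

    def _append(self, name, args):
        self.effects.append((name, args))
        return self  # assumption: returning self is what allows chained calls


def create_method(name):
    # Passing `name` as a function argument binds it per method; a bare lambda
    # written inside the loop would only see the last loop value.
    return lambda self, *args: self._append(name, list(args))


# Assumed name list; the PR obtains it from torchaudio instead.
for effect_name in ["pitch", "rate", "reverb"]:
    setattr(MiniChain, effect_name, create_method(effect_name))

chain = MiniChain().pitch(200).rate(16000)
print(chain.effects)
```

Generating the methods from a name list means that any effect the backend reports becomes callable without a hand-written wrapper.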
168 changes: 0 additions & 168 deletions augment/speech_augment.h

This file was deleted.

2 changes: 1 addition & 1 deletion requirements.txt
@@ -1,3 +1,3 @@
 torch
-torchaudio
+torchaudio>=0.7
 pytest