How to write a custom CutTransform ? #1262

Closed
kobenaxie opened this issue Jan 16, 2024 · 2 comments

@kobenaxie

I want to use SoxEffectTransform to perform augmentations such as low_pass or pitch shift from torchaudio. How can I write a custom cut transform that can be passed in cut_transforms, the same way CutMix is?

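from lhotse.dataset import CutMix, K2SpeechRecognitionDataset

# CutCustom is the hypothetical custom transform this question is about.
transforms = []
transforms.append(
    CutCustom(...)
)
transforms.append(
    CutMix(...)
)
dataset = K2SpeechRecognitionDataset(
    cut_transforms=transforms,
)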
@pzelasko
Collaborator

You can check this PR to see the steps needed to add e.g. volume perturbation #382

However, since in your case you're unlikely to modify either the sampling rate or num_samples/duration (so the metadata is unaffected), it might be a good idea to implement those as signal transforms instead. See the ones implemented here: https://github.com/lhotse-speech/lhotse/blob/master/lhotse/dataset/signal_transforms.py

They are supported e.g. in K2SpeechRecognitionDataset via input_transforms https://github.com/lhotse-speech/lhotse/blob/master/lhotse/dataset/speech_recognition.py#L65
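
To illustrate why a signal transform is enough here, below is a minimal, dependency-free sketch of a transform that leaves the number of samples unchanged and is applied through a list of callables. Everything in it is an assumption made for illustration: `RandomLowPass` and `apply_transforms` are hypothetical names, plain Python lists stand in for the batched tensor, and a simple one-pole filter is swapped in for the SoX low-pass effect. The transforms in signal_transforms.py operate on torch tensors instead, so treat this as a sketch of the contract rather than lhotse's API.

```python
import math
import random
from typing import Callable, List, Optional

# Stand-in for a (batch, num_samples) tensor; an assumption for this sketch.
Batch = List[List[float]]


class RandomLowPass:
    """Hypothetical transform: one-pole low-pass applied to each sequence with probability p."""

    def __init__(
        self,
        cutoff_hz: float,
        sampling_rate: int,
        p: float = 0.5,
        seed: Optional[int] = None,
    ) -> None:
        # Smoothing coefficient of a one-pole RC low-pass filter.
        dt = 1.0 / sampling_rate
        rc = 1.0 / (2.0 * math.pi * cutoff_hz)
        self.alpha = dt / (rc + dt)
        self.p = p
        self.rng = random.Random(seed)

    def __call__(self, batch: Batch) -> Batch:
        out = []
        for samples in batch:
            if self.rng.random() >= self.p:
                out.append(list(samples))  # sequence left untouched
                continue
            filtered, prev = [], 0.0
            for x in samples:
                # y[n] = y[n-1] + alpha * (x[n] - y[n-1])
                prev = prev + self.alpha * (x - prev)
                filtered.append(prev)
            out.append(filtered)  # same length as the input sequence
        return out


def apply_transforms(batch: Batch, transforms: List[Callable[[Batch], Batch]]) -> Batch:
    """Apply the transforms in order, the way a dataset would walk a list of transforms."""
    for transform in transforms:
        batch = transform(batch)
    return batch


if __name__ == "__main__":
    batch = [[1.0, -1.0] * 4, [0.0, 1.0, 0.0, -1.0] * 2]
    low_pass = RandomLowPass(cutoff_hz=1000, sampling_rate=16000, p=1.0, seed=0)
    augmented = apply_transforms(batch, [low_pass])
    print([len(samples) for samples in augmented])
```

Because the output always has the same length and sampling rate as the input, nothing in the cut metadata needs to be updated, which is what makes this kind of augmentation a fit for a signal transform rather than a cut transform.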

@kobenaxie
Author

Thank you for your reply and suggestions. I also found input_transforms in K2SpeechRecognitionDataset(), which is what I need in my case.

@kobenaxie kobenaxie changed the title How to write a custom CutTransoform ? How to write a custom CutTransform ? Mar 25, 2024