
Add Unispeech & Unispeech-SAT #13963

Merged
merged 30 commits into huggingface:master on Oct 26, 2021

Conversation

patrickvonplaten
Contributor

@patrickvonplaten patrickvonplaten commented Oct 11, 2021

What does this PR do?

This PR adds UniSpeech from Microsoft: https://github.com/microsoft/UniSpeech

TODOs:

Future PR:

  • Correct pretraining loss
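For context on the deferred loss: UniSpeech pretraining builds on a wav2vec 2.0-style contrastive (InfoNCE) objective, where the context vector at each masked step must identify its true quantized target among sampled distractors. Below is a minimal self-contained numpy sketch of that objective; the function name and sampling details are illustrative, not the PR's (still unported) implementation.

```python
import numpy as np

def contrastive_loss(context, quantized, num_negatives=4, temperature=0.1, seed=0):
    """Wav2vec 2.0-style InfoNCE sketch: for each time step, score the true
    quantized target against `num_negatives` distractors drawn from other
    time steps, and penalize failing to rank the true target highest."""
    rng = np.random.default_rng(seed)
    T, _ = context.shape
    losses = []
    for t in range(T):
        # sample distractor indices from the other time steps
        candidates = [i for i in range(T) if i != t]
        negatives = rng.choice(candidates, size=num_negatives, replace=False)
        targets = np.vstack([quantized[t:t + 1], quantized[negatives]])  # (K+1, D)
        # cosine similarity between the context vector and each candidate
        sims = targets @ context[t] / (
            np.linalg.norm(targets, axis=1) * np.linalg.norm(context[t]) + 1e-8)
        logits = sims / temperature
        log_probs = logits - np.log(np.sum(np.exp(logits)))  # log-softmax
        losses.append(-log_probs[0])  # the true target sits at index 0
    return float(np.mean(losses))
```

A perfectly predictive model (context equal to the quantized targets) drives this loss toward zero, while random context vectors land near log(num_negatives + 1). Note that the actual UniSpeech objective additionally mixes in a phoneme-CTC term, which is part of what the future PR needs to get right.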

@patrickvonplaten
Contributor Author

Wait until #13877 is merged

@patrickvonplaten
Contributor Author

PR is good for review IMO.

@patrickvonplaten
Contributor Author

I think we can merge the pretrained models now. To make them "promotable" we should still do 2 things:

  • UniSpeech: add a phoneme <-> text tokenizer; we still need some feedback from the authors here
  • UniSpeech-SAT: the model should work very well for speaker verification and speaker diarization. We should add those two tasks and then promote the model on them, as it performs very well there
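For the speaker-verification task mentioned above, the downstream decision step typically compares fixed-size speaker embeddings (x-vectors) by cosine similarity against a tuned threshold. A self-contained numpy sketch of just that decision step, with random stand-in embeddings rather than real model outputs, and a hypothetical threshold value:

```python
import numpy as np

def cosine_similarity(a, b):
    """Cosine similarity between two speaker embedding vectors."""
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b) + 1e-8))

def same_speaker(emb_a, emb_b, threshold=0.7):
    """Verification decision: accept the pair as the same speaker if the
    embedding similarity exceeds a threshold tuned on held-out data."""
    return cosine_similarity(emb_a, emb_b) >= threshold
```

In practice the embeddings would come from a pooling head on top of UniSpeech-SAT hidden states, and the threshold would be calibrated for a target equal-error rate on a verification benchmark.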

Collaborator

@sgugger sgugger left a comment


Thanks a lot for adding those two models!

Review threads (all resolved) on:
  • README.md
  • docs/source/model_doc/unispeech.rst
  • src/transformers/models/unispeech/modeling_unispeech.py
  • tests/test_modeling_unispeech.py
  • tests/test_modeling_unispeech_sat.py
Member

@anton-l anton-l left a comment


Looks good, thanks a lot for debugging the original models!

src/transformers/models/unispeech/modeling_unispeech.py (review thread, resolved):

# quantize all (unmasked) extracted features and project to final vq dim
extract_features = self.dropout_features(outputs[1])
quantized_features, codevector_perplexity = self.quantizer(extract_features)
Member

@anton-l anton-l Oct 25, 2021


Doesn't UniSpeech use the same masking strategy for quantization as Wav2Vec? Or did you remove masking just for debugging purposes?

Contributor Author


Pretraining is quite different and not really implemented yet - this code should not be used yet.
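For background on the masking strategy referenced in the question: wav2vec 2.0 samples span start indices with a per-frame probability and masks a fixed number of consecutive frames from each start. A simplified numpy sketch of that sampling (illustrative only, not Transformers' actual `compute_mask_indices`):

```python
import numpy as np

def sample_span_mask(seq_len, mask_prob=0.065, mask_length=10, seed=0):
    """Simplified wav2vec 2.0-style span masking: sample start indices so
    that roughly `mask_prob` of frames become span starts, then mask
    `mask_length` consecutive frames from each sampled start."""
    rng = np.random.default_rng(seed)
    num_starts = max(1, int(round(seq_len * mask_prob)))
    # keep every span fully inside the sequence
    starts = rng.choice(seq_len - mask_length, size=num_starts, replace=False)
    mask = np.zeros(seq_len, dtype=bool)
    for s in starts:
        mask[s:s + mask_length] = True
    return mask
```

During pretraining, only the masked positions feed the contrastive objective, while the quantizer runs on the unmasked extracted features; overlapping spans mean the total masked fraction is at most, not exactly, `num_starts * mask_length` frames.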

@patrickvonplaten patrickvonplaten changed the title Add Unispeech Add Unispeech & Unispeech-SAT Oct 26, 2021
@patrickvonplaten patrickvonplaten merged commit 9f3aa46 into huggingface:master Oct 26, 2021
@patrickvonplaten patrickvonplaten deleted the add_unispeech branch October 26, 2021 17:00
Albertobegue pushed a commit to Albertobegue/transformers that referenced this pull request Jan 27, 2022
* unispeech

* add copy from

* remove hubert copy from

* finish for today

* add unispeech-sat

* adapt more

* up

* up

* up

* up

* add modeling

* add tests

* up

* up

* finish

* up

* Apply suggestions from code review

* up

* up

* Apply suggestions from code review

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

* up

* up

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>