Commit

update docs
Caroline Chen committed Mar 15, 2022
1 parent 4b47412 commit 28bbde2
Showing 3 changed files with 25 additions and 22 deletions.
6 changes: 6 additions & 0 deletions docs/source/refs.bib
@@ -261,3 +261,9 @@ @article{capon1969high
year={1969},
publisher={IEEE}
}
@article{kahn2022flashlight,
title={Flashlight: Enabling Innovation in Tools for Machine Learning},
author={Kahn, Jacob and Pratap, Vineel and Likhomanenko, Tatiana and Xu, Qiantong and Hannun, Awni and Cai, Jeff and Tomasello, Paden and Lee, Ann and Grave, Edouard and Avidov, Gilad and others},
journal={arXiv preprint arXiv:2201.12465},
year={2022}
}
@@ -21,7 +21,10 @@
# highest scores at each time step. A language model can be incorporated into
# the scoring computation, and adding a lexicon constraint restricts the
# next possible tokens for the hypotheses so that only words from the lexicon
# can be generated. A mathematical formula for the decoder optimization can be
# can be generated.
#
# The underlying implementation is ported from `Flashlight <https://arxiv.org/pdf/2201.12465.pdf>`__'s
# beam search decoder. A mathematical formula for the decoder optimization can be
# found in the `Wav2Letter paper <https://arxiv.org/pdf/1609.03193.pdf>`__, and
# a more detailed algorithm can be found in this `blog
# <https://towardsdatascience.com/boosting-your-sequence-generation-performance-with-beam-search-language-model-decoding-74ee64de435a>`__.
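The lexicon constraint described in the comment above can be illustrated with a small prefix trie. This is only a minimal sketch of the idea, not the Flashlight or torchaudio implementation: the toy lexicon, the `TrieNode` class, and the helpers `build_trie` and `allowed_next_tokens` are all hypothetical names introduced here for illustration.

```python
from typing import Dict, List, Set


class TrieNode:
    """One node of a token-level prefix trie (hypothetical helper)."""

    def __init__(self) -> None:
        self.children: Dict[str, "TrieNode"] = {}
        self.words: List[str] = []  # words whose spelling ends at this node


def build_trie(lexicon: Dict[str, List[List[str]]]) -> TrieNode:
    """Insert every spelling of every word into the trie."""
    root = TrieNode()
    for word, spellings in lexicon.items():
        for spelling in spellings:
            node = root
            for token in spelling:
                node = node.children.setdefault(token, TrieNode())
            node.words.append(word)
    return root


def allowed_next_tokens(root: TrieNode, prefix: List[str]) -> Set[str]:
    """Return the tokens that may follow `prefix` inside some lexicon word."""
    node = root
    for token in prefix:
        if token not in node.children:
            return set()  # prefix is not part of any lexicon word
        node = node.children[token]
    return set(node.children)


# Toy lexicon (an assumption for this sketch): word -> list of token spellings.
toy_lexicon = {
    "cat": [["c", "a", "t"]],
    "car": [["c", "a", "r"]],
    "dog": [["d", "o", "g"]],
}
trie = build_trie(toy_lexicon)

# A hypothesis that has emitted "c", "a" may only continue with "t" or "r".
next_tokens = allowed_next_tokens(trie, ["c", "a"])
```

During beam search, a decoder built on this idea would expand each hypothesis only with tokens from `allowed_next_tokens`, which is how the lexicon keeps the output restricted to known words.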
36 changes: 15 additions & 21 deletions torchaudio/prototype/ctc_decoder/ctc_decoder.py
@@ -39,10 +39,21 @@ class Hypothesis(NamedTuple):
class LexiconDecoder:
"""torchaudio.prototype.ctc_decoder.LexiconDecoder()
Lexically constrained CTC Beam Search Decoder from *Flashlight* [:footcite:`kahn2022flashlight`]
Note:
To build the decoder, please use factory function
:py:func:`lexicon_decoder`.
To build the decoder, please use the factory function :py:func:`lexicon_decoder`.
Args:
nbest (int): number of best decodings to return
lexicon (Dict): lexicon mapping of words to spellings
word_dict (_Dictionary): dictionary of words
tokens_dict (_Dictionary): dictionary of tokens
lm (_LM): language model
decoder_options (_LexiconDecoderOptions): parameters used for beam search decoding
blank_token (str): token corresponding to blank
sil_token (str): token corresponding to silence
unk_word (str): word corresponding to unknown
"""

def __init__(
@@ -57,24 +68,6 @@ def __init__(
sil_token: str,
unk_word: str,
) -> None:
"""
CTC Decoder with Lexicon constraint.
Note:
To build the decoder, please use the factory function lexicon_decoder.
Args:
nbest (int): number of best decodings to return
lexicon (Dict): lexicon mapping of words to spellings
word_dict (_Dictionary): dictionary of words
tokens_dict (_Dictionary): dictionary of tokens
lm (_LM): language model
decoder_options (_LexiconDecoderOptions): parameters used for beam search decoding
blank_token (str): token corresponding to blank
sil_token (str): token corresponding to silence
unk_word (str): word corresponding to unknown
"""

self.nbest = nbest
self.word_dict = word_dict
self.tokens_dict = tokens_dict
@@ -196,7 +189,8 @@ def lexicon_decoder(
unk_word: str = "<unk>",
) -> LexiconDecoder:
"""
Builds Ken LM CTC Lexicon Decoder with given parameters
Builds a lexically constrained CTC beam search decoder from
*Flashlight* [:footcite:`kahn2022flashlight`].
Args:
lexicon (str): lexicon file containing the possible words and corresponding spellings.
