[WIP] add buffered chunked streaming for nemo force aligner #6185

Slyne · 2023-03-13T20:42:23Z

What does this PR do ?

Adds support for buffered chunked streaming inference in the NeMo Forced Aligner (NFA).

Collection: ASR

Changelog

Add a keep_logits argument (default: False) to the transcribe function in FrameBatchASR to keep the inference logits.
Return the logits (log probs) if keep_logits is set to True.

Usage

python align.py \
        pretrained_name="stt_en_citrinet_1024_gamma_0_25" \
        batch_size=32 \
        model_downsample_factor=8 \
        manifest_filepath=$manifest \
        use_buffered_chunked_streaming=true \
        output_dir=./buffered_streaming_test_pred \
        total_buffer_in_secs=10 \
        align_using_pred_text=true
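
To illustrate what buffered chunked inference means in general, here is a minimal sketch that splits a signal into fixed-size chunks and surrounds each chunk with extra context up to a total buffer length. It is a generic illustration and not the code in streaming_utils.py; the function name buffered_windows and the chunk_secs parameter are assumptions made for this example, and clipping the buffer at the signal edges is a simplification.

```python
from typing import Iterator, Tuple


def buffered_windows(
    num_samples: int, sample_rate: int, chunk_secs: float, total_buffer_secs: float
) -> Iterator[Tuple[int, int, int, int]]:
    """Yield (buf_start, buf_end, chunk_start, chunk_end) sample indices.

    Hypothetical helper: each chunk sits in the middle of a larger buffer,
    and the buffer is clipped at the start and end of the signal.
    """
    chunk = int(chunk_secs * sample_rate)
    buffer = int(total_buffer_secs * sample_rate)
    if chunk <= 0 or buffer < chunk:
        raise ValueError("need 0 < chunk_secs <= total_buffer_secs")
    # Extra samples of context on each side of the chunk.
    context = (buffer - chunk) // 2
    for chunk_start in range(0, num_samples, chunk):
        chunk_end = min(chunk_start + chunk, num_samples)
        buf_start = max(chunk_start - context, 0)
        buf_end = min(chunk_end + context, num_samples)
        yield buf_start, buf_end, chunk_start, chunk_end
```

Under this sketch, a 10-second total buffer with a hypothetical 2-second chunk would give each chunk 4 seconds of context on either side, except near the edges of the recording.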

Before your PR is "Ready for review"

Pre checks:

Make sure you read and followed Contributor guidelines
Did you write any new necessary tests?
Did you add or update any necessary documentation?
Does the PR affect components that are optional to install? (Ex: Numba, Pynini, Apex etc)
- Reviewer: Does the PR have correct import guards for all optional libraries?

PR Type:

New Feature
Bugfix
Documentation

If you haven't finished some of the above items you can still open "Draft" PR.

Who can review?

Anyone in the community is free to review the PR once the checks have passed.
Contributor guidelines contains specific people who can review PRs to various areas.

Additional Information

Related to # (issue)

tools/nemo_forced_aligner/align.py

erastorgueva-nv

Thank you for this PR Slyne. Please modify the FrameBatchASR parameter as described in the comment.

erastorgueva-nv · 2023-03-29T21:22:37Z

cc @jbalam-nv for visibility on changes to streaming_utils. I don't think they will break anything.
We need to add self.decoder, self.cfg and self.preprocessor attributes to FrameBatchASR so we can use them downstream in NFA. We also add some keep_logits flags but they are False by default.

jbalam-nv

LGTM

Signed-off-by: Slyne Deng <slyned@nvidia.com>

erastorgueva-nv

LGTM, this is the same as before

titu1994

Overall it looks awesome! Minor comments, feel free to ignore for now.

titu1994 · 2023-04-04T05:52:49Z

nemo/collections/asr/parts/utils/streaming_utils.py

        frame_buffers = self.frame_bufferer.get_buffers_batch()

        while len(frame_buffers) > 0:
            self.frame_buffers += frame_buffers[:]
            self.data_layer.set_signal(frame_buffers[:])
-            self._get_batch_preds()
+            self._get_batch_preds(keep_logits)


Is it possible to avoid changing the signatures of these functions? I.e., set a bool value from the config or some other way (class arg or setter function) so that the rest of the functions just use that?
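
One way to read this suggestion is sketched below: the flag lives on the instance (set through the constructor or a setter), so internal methods such as _get_batch_preds keep their original signatures. ToyFrameBatcher, set_keep_logits, and infer are hypothetical names used only for illustration; this is not NeMo's FrameBatchASR.

```python
from typing import List


class ToyFrameBatcher:
    """Hypothetical stand-in for a frame-batched ASR wrapper (not NeMo's class)."""

    def __init__(self, keep_logits: bool = False) -> None:
        self.keep_logits = keep_logits
        self.all_preds: List[int] = []
        self.all_logits: List[List[float]] = []

    def set_keep_logits(self, keep_logits: bool) -> None:
        # Setter alternative to passing the flag through every call.
        self.keep_logits = keep_logits

    def _get_batch_preds(self, batch: List[List[float]]) -> None:
        # No keep_logits parameter here; the flag is read from self.
        for frame in batch:
            best = max(range(len(frame)), key=frame.__getitem__)  # argmax
            self.all_preds.append(best)
            if self.keep_logits:
                self.all_logits.append(frame)

    def infer(self, batches: List[List[List[float]]]) -> None:
        for batch in batches:
            self._get_batch_preds(batch)
```

Callers that do not need logits are unaffected, and callers that do can opt in once with set_keep_logits(True) before running inference.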

titu1994 · 2023-04-04T06:00:07Z

nemo/collections/asr/parts/utils/streaming_utils.py

+            return hypothesis
+
+        all_logits = []
+        for log_prob in self.all_logits:


Could you put this in a function? I feel like it gets repeated a lot.
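
A sketch of what such a helper might look like follows. The diff above only shows the loop header, so the slicing rule used here (keep a fixed number of frames from each buffer, ending a fixed delay before the buffer's last frame) is an assumption for illustration, as are the names select_chunk_frames, delay, and frames_per_chunk. Plain Python lists stand in for tensors.

```python
from typing import List, Sequence

Frame = Sequence[float]


def select_chunk_frames(
    per_buffer_logits: Sequence[Sequence[Frame]], delay: int, frames_per_chunk: int
) -> List[Frame]:
    """Hypothetical helper: trim each buffer's frames and concatenate the results.

    For every buffer, keep `frames_per_chunk` frames starting `delay` frames
    before the buffer's last frame (assumed rule, not taken from the PR).
    """
    kept: List[Frame] = []
    for log_prob in per_buffer_logits:
        start = len(log_prob) - 1 - delay
        if start < 0:
            raise ValueError("delay is larger than the buffer length")
        kept.extend(log_prob[start : start + frames_per_chunk])
    return kept
```

With the loop factored out, each place that currently repeats it could call the helper once with its own list of per-buffer outputs.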

)
* add nfa buffered streaming
  Signed-off-by: Slyne Deng <slyned@nvidia.com>
* restore to previous __iter__ function
  Signed-off-by: Slyne Deng <slyned@nvidia.com>
---------
Signed-off-by: Slyne Deng <slyned@nvidia.com>
Co-authored-by: Slyne Deng <slyned@nvidia.com>
Co-authored-by: Elena Rastorgueva <80532067+erastorgueva-nv@users.noreply.github.com>
Signed-off-by: hsiehjackson <c2hsieh@ucsd.edu>

github-actions bot added the ASR label Mar 13, 2023

erastorgueva-nv reviewed Mar 29, 2023


tools/nemo_forced_aligner/align.py (outdated, resolved)

erastorgueva-nv requested changes Mar 29, 2023


erastorgueva-nv requested a review from jbalam-nv March 29, 2023 22:05

jbalam-nv previously approved these changes Mar 30, 2023


erastorgueva-nv previously approved these changes Mar 30, 2023


Slyne dismissed stale reviews from erastorgueva-nv and jbalam-nv via 6c0c064 April 3, 2023 19:12

github-actions bot added the CI, common, core (Changes to NeMo Core), NLP, Speaker Tasks, and TTS labels and removed the same labels Apr 3, 2023

Slyne Deng added 2 commits April 3, 2023 13:02

add nfa buffered streaming

79bd9d8

Signed-off-by: Slyne Deng <slyned@nvidia.com>

restore to previous __iter__ function

7d79e5d

Signed-off-by: Slyne Deng <slyned@nvidia.com>

erastorgueva-nv approved these changes Apr 4, 2023


titu1994 reviewed Apr 4, 2023


erastorgueva-nv added 2 commits April 4, 2023 10:20

Merge branch 'main' into nfa_buffered_streaming

8fa3be3

Merge branch 'main' into nfa_buffered_streaming

5f0dae4

erastorgueva-nv merged commit 0285bee into NVIDIA:main Apr 4, 2023
