[WIP] add buffered chunked streaming for nemo force aligner #6185

Merged
merged 4 commits into from
Apr 4, 2023

Conversation

Slyne
Contributor

@Slyne Slyne commented Mar 13, 2023

What does this PR do?

Adds support for buffered chunked streaming inference to the NeMo Forced Aligner (NFA).

Collection: ASR

Changelog

  • Add a keep_logits argument (default: False) to the transcribe function in FrameBatchASR so that the inference logits can be kept.
  • Return the logits (log probs) when keep_logits is set to True.
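
A minimal sketch of the opt-in flag described in the changelog, written against a toy decoder rather than the real FrameBatchASR. The class ToyFrameDecoder, its vocabulary handling, and the greedy CTC decoding are assumptions made for illustration only; the only detail taken from this PR is that keep_logits defaults to False and that the log probs are returned when it is True.

```python
from typing import List, Sequence, Tuple, Union


class ToyFrameDecoder:
    """Hypothetical stand-in for a frame-batched ASR decoder (not the NeMo class)."""

    def __init__(self, vocab: List[str]):
        self.vocab = vocab
        self.blank_id = len(vocab)  # assumption: the CTC blank is the last class

    def transcribe(
        self, log_probs: Sequence[Sequence[float]], keep_logits: bool = False
    ) -> Union[str, Tuple[str, Sequence[Sequence[float]]]]:
        # Greedy CTC decode: argmax per frame, collapse repeats, drop blanks.
        tokens = []
        prev = self.blank_id
        for frame in log_probs:
            best = max(range(len(frame)), key=frame.__getitem__)
            if best != prev and best != self.blank_id:
                tokens.append(self.vocab[best])
            prev = best
        hypothesis = "".join(tokens)
        # Existing callers keep getting a plain string; only opt-in callers
        # receive the per-frame log probs as well.
        if keep_logits:
            return hypothesis, log_probs
        return hypothesis


decoder = ToyFrameDecoder(["a", "b"])
frames = [
    [-0.2, -2.3, -2.3],  # "a"
    [-0.2, -2.3, -2.3],  # "a" again (collapsed)
    [-2.3, -2.3, -0.2],  # blank
    [-2.3, -0.2, -2.3],  # "b"
]
text_only = decoder.transcribe(frames)
text, kept = decoder.transcribe(frames, keep_logits=True)
```

Because the default stays False, callers that never pass the flag see no change in return type.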

Usage

python align.py \
        pretrained_name="stt_en_citrinet_1024_gamma_0_25" \
        batch_size=32 \
        model_downsample_factor=8 \
        manifest_filepath=$manifest \
        use_buffered_chunked_streaming=true \
        output_dir=./buffered_streaming_test_pred \
        total_buffer_in_secs=10 \
        align_using_pred_text=true
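
For orientation, here is a rough sketch of how the buffer length in the command above translates into model output frames. It assumes a 10 ms feature stride and a 1.6 s chunk length, neither of which is stated in this PR; only model_downsample_factor=8 and total_buffer_in_secs=10 come from the usage example. The function name output_frames is hypothetical.

```python
def output_frames(window_ms: int, feature_stride_ms: int, downsample_factor: int) -> int:
    """Number of model output frames covering a window (ceiling division).

    Integer milliseconds are used to avoid floating-point rounding surprises.
    """
    model_stride_ms = feature_stride_ms * downsample_factor
    return -(-window_ms // model_stride_ms)


FEATURE_STRIDE_MS = 10     # assumption: 10 ms between feature frames
DOWNSAMPLE_FACTOR = 8      # model_downsample_factor=8 from the command
TOTAL_BUFFER_MS = 10_000   # total_buffer_in_secs=10 from the command
CHUNK_MS = 1_600           # assumption: hypothetical chunk length

frames_per_buffer = output_frames(TOTAL_BUFFER_MS, FEATURE_STRIDE_MS, DOWNSAMPLE_FACTOR)
frames_per_chunk = output_frames(CHUNK_MS, FEATURE_STRIDE_MS, DOWNSAMPLE_FACTOR)
print(frames_per_buffer, frames_per_chunk)  # 125 20
```

Under these assumptions each 10 s buffer yields 125 output frames, of which only the 20 belonging to the current chunk would be kept; the rest serve as context.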

Before your PR is "Ready for review"

Pre checks:

  • Make sure you read and followed Contributor guidelines
  • Did you write any new necessary tests?
  • Did you add or update any necessary documentation?
  • Does the PR affect components that are optional to install? (Ex: Numba, Pynini, Apex etc)
    • Reviewer: Does the PR have correct import guards for all optional libraries?

PR Type:

  • New Feature
  • Bugfix
  • Documentation

If you haven't finished some of the above items you can still open "Draft" PR.

Who can review?

Anyone in the community is free to review the PR once the checks have passed.
Contributor guidelines contains specific people who can review PRs to various areas.

Additional Information

  • Related to # (issue)

@github-actions github-actions bot added the ASR label Mar 13, 2023
Collaborator

@erastorgueva-nv erastorgueva-nv left a comment

Thank you for this PR Slyne. Please modify the FrameBatchASR parameter as described in the comment.

@erastorgueva-nv
Collaborator

cc @jbalam-nv for visibility on changes to streaming_utils. I don't think they will break anything.
We need to add self.decoder, self.cfg and self.preprocessor attributes to FrameBatchASR so we can use them downstream in NFA. We also add some keep_logits flags but they are False by default.

jbalam-nv
jbalam-nv previously approved these changes Mar 30, 2023
Collaborator

@jbalam-nv jbalam-nv left a comment

LGTM

Slyne Deng added 2 commits April 3, 2023 13:02
Signed-off-by: Slyne Deng <slyned@nvidia.com>
Signed-off-by: Slyne Deng <slyned@nvidia.com>
Collaborator

@erastorgueva-nv erastorgueva-nv left a comment

LGTM, this is the same as before

Collaborator

@titu1994 titu1994 left a comment

Overall it looks awesome! Minor comments; feel free to ignore them for now.

  frame_buffers = self.frame_bufferer.get_buffers_batch()

  while len(frame_buffers) > 0:
      self.frame_buffers += frame_buffers[:]
      self.data_layer.set_signal(frame_buffers[:])
-     self._get_batch_preds()
+     self._get_batch_preds(keep_logits)
Collaborator

Is it possible to avoid changing the signatures of these functions? I.e., set a bool value from the config or in some other way (a class argument or a setter function) and have the rest of the functions just use that?
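
One way to read this suggestion, sketched on a toy class: the flag lives on the instance (set through the constructor or a setter), and the internal method keeps its original signature. ToyBatchRunner, set_keep_logits, and the argmax "prediction" are all hypothetical names and logic for illustration; they are not the FrameBatchASR code from this PR.

```python
from typing import List, Sequence


class ToyBatchRunner:
    """Hypothetical runner showing a flag held as instance state."""

    def __init__(self, keep_logits: bool = False):
        self.keep_logits = keep_logits
        self.all_preds: List[int] = []
        self.all_logits: List[Sequence[float]] = []

    def set_keep_logits(self, value: bool) -> None:
        # Setter alternative to the constructor argument.
        self.keep_logits = value

    def _get_batch_preds(self, batch: Sequence[Sequence[float]]) -> None:
        # Signature carries no keep_logits parameter; the method reads
        # the instance attribute instead.
        for scores in batch:
            self.all_preds.append(max(range(len(scores)), key=scores.__getitem__))
            if self.keep_logits:
                self.all_logits.append(scores)

    def infer(self, batches: Sequence[Sequence[Sequence[float]]]) -> List[int]:
        for batch in batches:
            self._get_batch_preds(batch)
        return self.all_preds


batches = [[[0.1, 0.9], [0.7, 0.3]], [[0.2, 0.8]]]

plain = ToyBatchRunner()
plain_preds = plain.infer(batches)

keeping = ToyBatchRunner()
keeping.set_keep_logits(True)
keeping_preds = keeping.infer(batches)
```

The trade-off is that behavior now depends on object state rather than on an explicit argument at each call site.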

return hypothesis

all_logits = []
for log_prob in self.all_logits:
Collaborator

Could you put this in a function? I feel like it gets repeated a lot
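
A possible shape for such a helper, as a sketch only. The body of the loop is not shown in this excerpt, so the example assumes it slices a fixed window of frames out of each buffer's log probs and concatenates the results; the function name collect_logits and the start/length parameters are hypothetical.

```python
from typing import List, Sequence


def collect_logits(
    per_buffer_logits: Sequence[Sequence[Sequence[float]]],
    start: int,
    length: int,
) -> List[Sequence[float]]:
    """Keep frames [start, start + length) from each buffer and concatenate them.

    Assumption: every buffer is trimmed with the same window, so the loop
    can live in one place instead of being repeated at each call site.
    """
    collected: List[Sequence[float]] = []
    for log_prob in per_buffer_logits:
        collected.extend(log_prob[start:start + length])
    return collected


# Two buffers of four frames each, one score per frame for readability.
buffers = [
    [[0.0], [1.0], [2.0], [3.0]],
    [[4.0], [5.0], [6.0], [7.0]],
]
merged = collect_logits(buffers, start=1, length=2)
print(merged)  # [[1.0], [2.0], [5.0], [6.0]]
```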

@erastorgueva-nv erastorgueva-nv merged commit 0285bee into NVIDIA:main Apr 4, 2023
hsiehjackson pushed a commit to hsiehjackson/NeMo that referenced this pull request Jun 2, 2023
)

* add nfa buffered streaming

Signed-off-by: Slyne Deng <slyned@nvidia.com>

* restore to previous __iter__ function

Signed-off-by: Slyne Deng <slyned@nvidia.com>

---------

Signed-off-by: Slyne Deng <slyned@nvidia.com>
Co-authored-by: Slyne Deng <slyned@nvidia.com>
Co-authored-by: Elena Rastorgueva <80532067+erastorgueva-nv@users.noreply.github.com>
Signed-off-by: hsiehjackson <c2hsieh@ucsd.edu>
4 participants