Accelerate transcribe_speech.py for short-form data: pre-sorting support #8564
Signed-off-by: Piotr Żelasko <petezor@gmail.com>
5b00dbf to 3282a5b
Title changed: "transcribe_speech.py for short-form data: sorting and bucketing support" to "transcribe_speech.py for short-form data: pre-sorting support"
jenkins
Minor comments, otherwise looks good
@@ -389,6 +389,7 @@ def transcribe(
    augmentor: DictConfig = None,
    verbose: bool = True,
    override_config: Optional[MultiTaskTranscriptionConfig] = None,
    **config_kwargs,
No **kwargs in public APIs, please. Only the base class has it, to allow passing subclass args into the function via super.
These were leftovers from the addition of the bucketing args. There were six of them, so adding each one explicitly was cumbersome. Will remove.
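A minimal sketch of the convention discussed in this thread, for illustration only. The class names `BaseTranscriber` and `ExplicitTranscriber` are hypothetical and are not NeMo classes; the sketch only shows the pattern where the base class accepts `**config_kwargs` and the public subclass method names every argument.

```python
from typing import Any, Dict, List


class BaseTranscriber:
    """Hypothetical base class: the only level that accepts **config_kwargs."""

    def transcribe(self, audio: List[str], batch_size: int = 4, **config_kwargs: Any) -> Dict[str, Any]:
        # Gather all arguments into a single config dict for downstream use.
        return {"audio": audio, "batch_size": batch_size, **config_kwargs}


class ExplicitTranscriber(BaseTranscriber):
    """Hypothetical subclass: the public signature lists each argument explicitly."""

    def transcribe(self, audio: List[str], batch_size: int = 4, verbose: bool = True) -> Dict[str, Any]:
        # Subclass-specific arguments reach the base class through super().
        return super().transcribe(audio, batch_size=batch_size, verbose=verbose)
```

With an explicit signature, a misspelled argument fails immediately with a `TypeError` instead of being silently absorbed by `**kwargs`.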
@@ -852,16 +856,15 @@ def _setup_transcribe_dataloader(self, config: Dict) -> 'torch.utils.data.DataLo
    dl_config = {
        'manifest_filepath': os.path.join(config['temp_dir'], 'manifest.json'),
        'sample_rate': self.preprocessor._sample_rate,
        'batch_size': batch_size,
Intentional change?
no, good catch
@@ -125,6 +125,7 @@ def transcribe(
    augmentor: DictConfig = None,
    verbose: bool = True,
    override_config: Optional[TranscribeConfig] = None,
    **config_kwargs,
Revert please
…into transcribe-bucketing
jenkins
@@ -387,6 +387,8 @@ def transcribe(
    num_workers: int = 0,
    channel_selector: Optional[ChannelSelectorType] = None,
    augmentor: DictConfig = None,
    text_field: str = "text",
This isn't very good: adding such fields opens the way for dozens of others (src_text field, dest_text_field, pnc, etc.) on all models, even those that don't use them.
A better way is to hide them inside the data classes of these models. If the user needs to override the default, they can instantiate an object of the dataclass and pass it via override_config.
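The suggestion above could look roughly like the following sketch. All names here (`SketchTranscribeConfig`, `SketchMultiTaskConfig`, and the standalone `transcribe` function) are hypothetical stand-ins rather than the NeMo classes; only `text_field`, `lang_field`, and `override_config` come from this PR.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class SketchTranscribeConfig:
    """Hypothetical generic config shared by all models."""

    batch_size: int = 4
    verbose: bool = True


@dataclass
class SketchMultiTaskConfig(SketchTranscribeConfig):
    """Hypothetical model-specific config: only multi-task models see these fields."""

    text_field: str = "text"
    lang_field: str = "lang"


def transcribe(audio: List[str], override_config: Optional[SketchMultiTaskConfig] = None) -> List[dict]:
    # Fall back to the model's defaults unless the caller supplies a config object.
    cfg = override_config if override_config is not None else SketchMultiTaskConfig()
    return [
        {"audio_filepath": path, "text_field": cfg.text_field, "batch_size": cfg.batch_size}
        for path in audio
    ]
```

A caller who needs a non-default field name would write something like `transcribe(paths, override_config=SketchMultiTaskConfig(text_field="answer"))`, leaving the public signature unchanged for models that do not use the field.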
d3183b0 to 73c5c30
jenkins
minor comment, LGTM
Looks much, much better now. Thanks for the awesome work!
audio_key = cfg.get('audio_key', 'audio_filepath')
audio_file = get_full_path(audio_file=item[audio_key], manifest_file=cfg.dataset_manifest)
filepaths.append(audio_file)
has_two_fields = []
Add a comment that the two fields you mean are offset and duration.
done
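For readers without the full diff, a hedged sketch of what a `has_two_fields` check over a JSON-lines manifest might look like. The helper name `entries_have_offset_and_duration` is hypothetical, and the surrounding logic in `transcribe_speech.py` is not shown in this thread; only the names `has_two_fields`, `offset`, and `duration` come from the review.

```python
import json
from typing import List


def entries_have_offset_and_duration(manifest_lines: List[str]) -> List[bool]:
    """Return, per manifest entry, whether it carries both fields."""
    # "Two fields" here means `offset` and `duration`, as requested in the review.
    has_two_fields = []
    for line in manifest_lines:
        item = json.loads(line)
        has_two_fields.append("offset" in item and "duration" in item)
    return has_two_fields
```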
jenkins
124bb31 to 9629b2b
jenkins
Looks great!
…pport (#8564)
* POC using bucketing in transcribe_speech.py
* extend to multi task aed
* fixes for aed multi task text/lang field selectors
* remove assert
* fix
* expose option for bucket buffer size
* fixes, ctc support
* support pre-sorting manifests in transcribe_speech.py
* cleanup
* reorder transcriptions back to original manifest order
* remove bucketing entirely
* code review changes
* code review changes--amend
* refactor text_field/lang_field passing
* Fix reordering bug; disable presorting for multi task for now
* Add support for presort + multi task model
* Code reviews
* Fix jenkins tests, add user-friendly error msg for canary
Signed-off-by: Piotr Żelasko <petezor@gmail.com>
assert cfg.dataset_manifest is not None
if cfg.presort_manifest:
    with NamedTemporaryFile("w", suffix=".json", delete=False) as f:
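A self-contained sketch of the pre-sorting idea named in the PR title and commit list: write a duration-sorted copy of the manifest to a temporary file, then map the results back to the original manifest order. This is not the code from the PR. The helper names `presort_manifest` and `restore_original_order`, the use of the `duration` key, and the longest-first ordering are assumptions made for illustration.

```python
import json
from tempfile import NamedTemporaryFile
from typing import List, Tuple


def presort_manifest(manifest_path: str) -> Tuple[str, List[int]]:
    """Write a duration-sorted copy of a JSON-lines manifest (hypothetical helper).

    Returns the path of the sorted copy and, for each sorted row,
    the index that row had in the original manifest.
    """
    with open(manifest_path, encoding="utf-8") as f:
        items = [json.loads(line) for line in f if line.strip()]
    # Assumption: sort longest-first so similar-length utterances share a batch.
    order = sorted(range(len(items)), key=lambda i: items[i]["duration"], reverse=True)
    with NamedTemporaryFile("w", suffix=".json", delete=False, encoding="utf-8") as f:
        for i in order:
            f.write(json.dumps(items[i]) + "\n")
        sorted_path = f.name
    return sorted_path, order


def restore_original_order(results: List[str], order: List[int]) -> List[str]:
    """Undo the sort: results[k] belongs to original row order[k]."""
    restored = [""] * len(results)
    for sorted_pos, original_idx in enumerate(order):
        restored[original_idx] = results[sorted_pos]
    return restored
```

Grouping utterances of similar length reduces padding inside each batch, which is where the speedup for short-form data would come from. Keeping the index mapping lets the output follow the original manifest order, matching the "reorder transcriptions back to original manifest order" commit.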
…t-only) dataloading (#8581) * wip Signed-off-by: Piotr Żelasko <petezor@gmail.com> * Partially working config groups Signed-off-by: Piotr Żelasko <petezor@gmail.com> * Working test with abasic group in the input config Signed-off-by: Piotr Żelasko <petezor@gmail.com> * Working test with nested groups in input config Signed-off-by: Piotr Żelasko <petezor@gmail.com> * Working test with specifying a YAML path for input_cfg Signed-off-by: Piotr Żelasko <petezor@gmail.com> * a very rough example of text dataloading via lhotse Signed-off-by: Piotr Żelasko <petezor@gmail.com> * Cleaner integration of multimodal audio/text loading that allows to control the effective audio vs text size (requires latest lhotse) Signed-off-by: Piotr Żelasko <petezor@gmail.com> * remove obsolete test Signed-off-by: Piotr Żelasko <petezor@gmail.com> * Fix an import in export_utils.py (#8571) Signed-off-by: w4-jinhyeonkim <131935801+w4-jinhyeonkim@users.noreply.github.com> Signed-off-by: Piotr Żelasko <petezor@gmail.com> * Yttm deprecation (#8322) * yttm deprecation init commit Signed-off-by: AlexGrinch <grinchuk.alexey@gmail.com> * removed tests Signed-off-by: AlexGrinch <grinchuk.alexey@gmail.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * bug fix Signed-off-by: AlexGrinch <grinchuk.alexey@gmail.com> * path fix Signed-off-by: AlexGrinch <grinchuk.alexey@gmail.com> * fixing path Signed-off-by: AlexGrinch <grinchuk.alexey@gmail.com> * updated tests to spm Signed-off-by: AlexGrinch <grinchuk.alexey@gmail.com> * updated Jenkinsfile Signed-off-by: AlexGrinch <grinchuk.alexey@gmail.com> * new model with spm in tests Signed-off-by: AlexGrinch <grinchuk.alexey@gmail.com> * yttm removed Signed-off-by: AlexGrinch <grinchuk.alexey@gmail.com> * updated aayn config Signed-off-by: AlexGrinch <grinchuk.alexey@gmail.com> --------- Signed-off-by: AlexGrinch <grinchuk.alexey@gmail.com> Co-authored-by: pre-commit-ci[bot] 
<66853113+pre-commit-ci[bot]@users.noreply.github.com> Signed-off-by: Piotr Żelasko <petezor@gmail.com> * Fixed missing copy import in rnnt_decoder.py (#8580) * Added copy import to rnnt_decoding.py Signed-off-by: Isaac McFadyen <isaac@imcf.me> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci --------- Signed-off-by: Isaac McFadyen <isaac@imcf.me> Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com> Signed-off-by: Piotr Żelasko <petezor@gmail.com> * Fix bug in RNNT Joint WER calculation for fused batch (#8587) Signed-off-by: smajumdar <titu1994@gmail.com> Signed-off-by: Piotr Żelasko <petezor@gmail.com> * Fixed Context Parallel HtoD sync (#8557) * Fixed cp HtoD sync Signed-off-by: Selvaraj Anandaraj <selvaraja@login-eos02.eos.clusters.nvidia.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci --------- Signed-off-by: Selvaraj Anandaraj <selvaraja@login-eos02.eos.clusters.nvidia.com> Co-authored-by: Selvaraj Anandaraj <selvaraja@login-eos02.eos.clusters.nvidia.com> Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com> Signed-off-by: Piotr Żelasko <petezor@gmail.com> * change default and add key to config files (#8594) Signed-off-by: Chen Cui <chcui@nvidia.com> Signed-off-by: Piotr Żelasko <petezor@gmail.com> * Fix triton import guards (#8552) * Fix triton import guards Signed-off-by: Michal Futrega <mfutrega@nvidia.com> * Update attention.py Signed-off-by: Michal Futrega <mfutrega@nvidia.com> --------- Signed-off-by: Michal Futrega <mfutrega@nvidia.com> Signed-off-by: Piotr Żelasko <petezor@gmail.com> * Add config key for dropout position in LoRA adapter (#8583) Signed-off-by: Michal Futrega <mfutrega@nvidia.com> Signed-off-by: Piotr Żelasko <petezor@gmail.com> * fix ia3 mlp infused adapter (#8597) Signed-off-by: Chen Cui <chcui@nvidia.com> Signed-off-by: Piotr Żelasko 
<petezor@gmail.com> * Prevent Redundant Gather for LoRA Sequence Parallel (#8602) * enable layernorm output gathered Signed-off-by: Chen Cui <chcui@nvidia.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci --------- Signed-off-by: Chen Cui <chcui@nvidia.com> Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com> Signed-off-by: Piotr Żelasko <petezor@gmail.com> * Accelerate `transcribe_speech.py` for short-form data: pre-sorting support (#8564) * POC using bucketing in transcribe_speech.py Signed-off-by: Piotr Żelasko <petezor@gmail.com> * extend to multi task aed Signed-off-by: Piotr Żelasko <petezor@gmail.com> * fixes for aed multi task text/lang field selectors Signed-off-by: Piotr Żelasko <petezor@gmail.com> * remove assert Signed-off-by: Piotr Żelasko <petezor@gmail.com> * fix Signed-off-by: Piotr Żelasko <petezor@gmail.com> * expose option for bucket buffer size Signed-off-by: Piotr Żelasko <petezor@gmail.com> * fixes, ctc support Signed-off-by: Piotr Żelasko <petezor@gmail.com> * support pre-sorting manifests in transcribe_speech.py Signed-off-by: Piotr Żelasko <petezor@gmail.com> * cleanup Signed-off-by: Piotr Żelasko <petezor@gmail.com> * reorder transcriptions back to original manifest order Signed-off-by: Piotr Żelasko <petezor@gmail.com> * remove bucketing entirely Signed-off-by: Piotr Żelasko <petezor@gmail.com> * code review changes Signed-off-by: Piotr Żelasko <petezor@gmail.com> * code review changes--amend Signed-off-by: Piotr Żelasko <petezor@gmail.com> * refactor text_field/lang_field passing Signed-off-by: Piotr Żelasko <petezor@gmail.com> * Fix reordering bug; disable presorting for multi task for now Signed-off-by: Piotr Żelasko <petezor@gmail.com> * Add support for presort + multi task model Signed-off-by: Piotr Żelasko <petezor@gmail.com> * Code reviews Signed-off-by: Piotr Żelasko <petezor@gmail.com> * Fix jenkins tests, add user-friendly error msg 
for canary Signed-off-by: Piotr Żelasko <petezor@gmail.com> --------- Signed-off-by: Piotr Żelasko <petezor@gmail.com> * fix tests Signed-off-by: Piotr Żelasko <petezor@gmail.com> * Bump min required lhotse version Signed-off-by: Piotr Żelasko <petezor@gmail.com> * Add some documentation about this config format and the multimodal features Signed-off-by: Piotr Żelasko <petezor@gmail.com> * Add caution about multiple shards Signed-off-by: Piotr Żelasko <petezor@gmail.com> * Address Tom's code review Signed-off-by: Piotr Żelasko <petezor@gmail.com> * Add copyright header Signed-off-by: Piotr Żelasko <petezor@gmail.com> * Fix (hopefully) issue with forced ascii encoding in CI Signed-off-by: Piotr Żelasko <petezor@gmail.com> * Support resolving input_cfg path into config contents Signed-off-by: Piotr Żelasko <petezor@gmail.com> * Code review changes in docs Signed-off-by: Piotr Żelasko <petezor@gmail.com> * Fix unicode decode error Signed-off-by: Piotr Żelasko <petezor@gmail.com> --------- Signed-off-by: Piotr Żelasko <petezor@gmail.com> Signed-off-by: w4-jinhyeonkim <131935801+w4-jinhyeonkim@users.noreply.github.com> Signed-off-by: AlexGrinch <grinchuk.alexey@gmail.com> Signed-off-by: Isaac McFadyen <isaac@imcf.me> Signed-off-by: smajumdar <titu1994@gmail.com> Signed-off-by: Selvaraj Anandaraj <selvaraja@login-eos02.eos.clusters.nvidia.com> Signed-off-by: Chen Cui <chcui@nvidia.com> Signed-off-by: Michal Futrega <mfutrega@nvidia.com> Co-authored-by: w4-jinhyeonkim <131935801+w4-jinhyeonkim@users.noreply.github.com> Co-authored-by: Aleksey Grinchuk (Oleksii Hrinchuk) <grinchuk.alexey@gmail.com> Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com> Co-authored-by: Isaac McFadyen <isaac@imcf.me> Co-authored-by: Somshubra Majumdar <titu1994@gmail.com> Co-authored-by: Selvaraj Anandaraj <anandaraj@wisc.edu> Co-authored-by: Selvaraj Anandaraj <selvaraja@login-eos02.eos.clusters.nvidia.com> Co-authored-by: Chen Cui <chcui@nvidia.com> 
Co-authored-by: Michal Futrega <mfutrega@nvidia.com> Co-authored-by: Pablo Garay <palenq@gmail.com>
jenkins tests, add user-friendly error msg for canary Signed-off-by: Piotr Żelasko <petezor@gmail.com> --------- Signed-off-by: Piotr Żelasko <petezor@gmail.com> * fix tests Signed-off-by: Piotr Żelasko <petezor@gmail.com> * Bump min required lhotse version Signed-off-by: Piotr Żelasko <petezor@gmail.com> * Add some documentation about this config format and the multimodal features Signed-off-by: Piotr Żelasko <petezor@gmail.com> * Add caution about multiple shards Signed-off-by: Piotr Żelasko <petezor@gmail.com> * Address Tom's code review Signed-off-by: Piotr Żelasko <petezor@gmail.com> * Add copyright header Signed-off-by: Piotr Żelasko <petezor@gmail.com> * Fix (hopefully) issue with forced ascii encoding in CI Signed-off-by: Piotr Żelasko <petezor@gmail.com> * Support resolving input_cfg path into config contents Signed-off-by: Piotr Żelasko <petezor@gmail.com> * Code review changes in docs Signed-off-by: Piotr Żelasko <petezor@gmail.com> * Fix unicode decode error Signed-off-by: Piotr Żelasko <petezor@gmail.com> --------- Signed-off-by: Piotr Żelasko <petezor@gmail.com> Signed-off-by: w4-jinhyeonkim <131935801+w4-jinhyeonkim@users.noreply.github.com> Signed-off-by: AlexGrinch <grinchuk.alexey@gmail.com> Signed-off-by: Isaac McFadyen <isaac@imcf.me> Signed-off-by: smajumdar <titu1994@gmail.com> Signed-off-by: Selvaraj Anandaraj <selvaraja@login-eos02.eos.clusters.nvidia.com> Signed-off-by: Chen Cui <chcui@nvidia.com> Signed-off-by: Michal Futrega <mfutrega@nvidia.com> Co-authored-by: w4-jinhyeonkim <131935801+w4-jinhyeonkim@users.noreply.github.com> Co-authored-by: Aleksey Grinchuk (Oleksii Hrinchuk) <grinchuk.alexey@gmail.com> Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com> Co-authored-by: Isaac McFadyen <isaac@imcf.me> Co-authored-by: Somshubra Majumdar <titu1994@gmail.com> Co-authored-by: Selvaraj Anandaraj <anandaraj@wisc.edu> Co-authored-by: Selvaraj Anandaraj <selvaraja@login-eos02.eos.clusters.nvidia.com> 
Co-authored-by: Chen Cui <chcui@nvidia.com> Co-authored-by: Michal Futrega <mfutrega@nvidia.com> Co-authored-by: Pablo Garay <palenq@gmail.com>
…pport (NVIDIA#8564) * POC using bucketing in transcribe_speech.py Signed-off-by: Piotr Żelasko <petezor@gmail.com> * extend to multi task aed Signed-off-by: Piotr Żelasko <petezor@gmail.com> * fixes for aed multi task text/lang field selectors Signed-off-by: Piotr Żelasko <petezor@gmail.com> * remove assert Signed-off-by: Piotr Żelasko <petezor@gmail.com> * fix Signed-off-by: Piotr Żelasko <petezor@gmail.com> * expose option for bucket buffer size Signed-off-by: Piotr Żelasko <petezor@gmail.com> * fixes, ctc support Signed-off-by: Piotr Żelasko <petezor@gmail.com> * support pre-sorting manifests in transcribe_speech.py Signed-off-by: Piotr Żelasko <petezor@gmail.com> * cleanup Signed-off-by: Piotr Żelasko <petezor@gmail.com> * reorder transcriptions back to original manifest order Signed-off-by: Piotr Żelasko <petezor@gmail.com> * remove bucketing entirely Signed-off-by: Piotr Żelasko <petezor@gmail.com> * code review changes Signed-off-by: Piotr Żelasko <petezor@gmail.com> * code review changes--amend Signed-off-by: Piotr Żelasko <petezor@gmail.com> * refactor text_field/lang_field passing Signed-off-by: Piotr Żelasko <petezor@gmail.com> * Fix reordering bug; disable presorting for multi task for now Signed-off-by: Piotr Żelasko <petezor@gmail.com> * Add support for presort + multi task model Signed-off-by: Piotr Żelasko <petezor@gmail.com> * Code reviews Signed-off-by: Piotr Żelasko <petezor@gmail.com> * Fix jenkins tests, add user-friendly error msg for canary Signed-off-by: Piotr Żelasko <petezor@gmail.com> --------- Signed-off-by: Piotr Żelasko <petezor@gmail.com>
What does this PR do ?
Adds a manifest pre-sorting option (new default) and options to enable and control dynamic bucketing for inference.
Depending on the data distribution, pre-sorting can significantly increase inference speed (we observed as much as 2.5x throughput).
Bucketing is approximate, so expect smaller gains (we observed a 1.7-2x increase in throughput, depending on the model and decoding algorithm selection).
EDIT: I removed bucketing support after realizing it would require another refactoring of the transcribe internals to be able to restore the original order.
Also fixes a bug in processing manifests for multi-task AED models (the text/lang fields can now be overridden from the transcribe_speech CLI).
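To illustrate why pre-sorting helps and why the original order has to be restored afterwards, here is a minimal, self-contained sketch. It is not the code from this PR: the helper names (`presort_by_duration`, `restore_order`, `padded_seconds`) are hypothetical, the manifest entries are made up, and sorting longest-first is an arbitrary choice for the example. It only assumes that each manifest entry carries a `duration` field and that every batch is padded to its longest utterance.

```python
from typing import Any, Dict, List, Sequence, Tuple


def presort_by_duration(entries: Sequence[Dict[str, Any]]) -> Tuple[List[Dict[str, Any]], List[int]]:
    """Sort manifest entries by duration (longest first).

    Returns the sorted entries and, for each sorted position,
    the index that entry had in the original manifest.
    """
    order = sorted(range(len(entries)), key=lambda i: entries[i]["duration"], reverse=True)
    return [entries[i] for i in order], order


def restore_order(results: Sequence[Any], order: Sequence[int]) -> List[Any]:
    """Put per-utterance results back into the original manifest order."""
    restored: List[Any] = [None] * len(results)
    for sorted_pos, original_pos in enumerate(order):
        restored[original_pos] = results[sorted_pos]
    return restored


def padded_seconds(durations: Sequence[float], batch_size: int) -> float:
    """Total audio seconds processed when each batch is padded to its longest item."""
    total = 0.0
    for start in range(0, len(durations), batch_size):
        batch = durations[start:start + batch_size]
        total += max(batch) * len(batch)
    return total


# Hypothetical manifest with mixed short and long utterances.
manifest = [
    {"audio_filepath": "a.wav", "duration": 1.0},
    {"audio_filepath": "b.wav", "duration": 9.0},
    {"audio_filepath": "c.wav", "duration": 2.0},
    {"audio_filepath": "d.wav", "duration": 8.0},
]

sorted_manifest, order = presort_by_duration(manifest)

# Compare the padded workload with and without pre-sorting.
unsorted_cost = padded_seconds([e["duration"] for e in manifest], batch_size=2)
sorted_cost = padded_seconds([e["duration"] for e in sorted_manifest], batch_size=2)
print(f"padded seconds: unsorted={unsorted_cost}, pre-sorted={sorted_cost}")

# Stand-in for model outputs, produced in sorted order, then mapped back.
fake_transcripts = [f"text for {e['audio_filepath']}" for e in sorted_manifest]
print(restore_order(fake_transcripts, order))
```

Batches of similar-length utterances waste less compute on padding, which is where a throughput gain would come from. The saved `order` list is what lets the outputs be written back in the order of the input manifest.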
Collection: ASR
Changelog
transcribe_speech.py for short-form data: pre-sorting support
Usage
# Add a code snippet demonstrating how to use this
Jenkins CI
To run Jenkins, a NeMo User with write access must comment jenkins on the PR.
Before your PR is "Ready for review"
Pre checks:
PR Type:
If you haven't finished some of the above items, you can still open a "Draft" PR.
Who can review?
Anyone in the community is free to review the PR once the checks have passed.
Contributor guidelines contains specific people who can review PRs to various areas.
Additional Information