Highlights
Large Language Models & Multimodal
- Training
  - Long context recipe
  - PyTorch Native FSDP 1
- Models
  - Llama 3
  - Mixtral
  - Nemotron
- NeMo 1.0
Export
- TensorRT-LLM v0.12 integration
- LoRA support for vLLM
- FP8 checkpoint
ASR
- Parakeet large (ASR with PnC model); see the usage sketch below
- Added Uzbek offline and Georgian streaming models
- Optimization feature for efficient bucketing to improve batch size consumption on GPUs
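As a quick orientation for the ASR highlights, below is a minimal sketch of transcribing audio with the new Parakeet model through NeMo's ASR API. The checkpoint id and audio path are illustrative placeholders; any compatible pretrained ASR model can be substituted.

```python
import nemo.collections.asr as nemo_asr

# Load a pretrained Parakeet checkpoint (placeholder id; the PnC-capable
# parakeet-tdt_ctc-110m model is part of this release).
asr_model = nemo_asr.models.ASRModel.from_pretrained(
    model_name="nvidia/parakeet-tdt_ctc-110m"
)

# transcribe() accepts a list of audio file paths and returns one result per
# file; PnC models emit punctuation and capitalization in the output text.
transcripts = asr_model.transcribe(["sample.wav"])
print(transcripts[0])
```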
Detailed Changelogs
ASR
Changelog
- add parakeet-tdt_ctc-110m model by @nithinraok :: PR: #10461
- fix asr finetune by @stevehuang52 :: PR: #10508
- replace unbiased with correction by @nithinraok :: PR: #10555
- Update Multi_Task_Adapters.ipynb by @pzelasko :: PR: #10600
- Fix asr warnings by @nithinraok :: PR: #10469
- Fix typo in ASR RNNT BPE model by @pzelasko :: PR: #10742
- TestEncDecMultiTaskModel for canary parallel by @karpnv :: PR: #10740
- fix chunked infer by @stevehuang52 :: PR: #10581
- training code for hybrid-autoregressive inference model by @hainan-xv :: PR: #10841
- remove stacking operation from batched functions by @lilithgrigoryan :: PR: #10524
- Add lhotse fixes for rnnt model training and WER hanging issue with f… by @nithinraok :: PR: #10821
- Fix ASR tests by @artbataev :: PR: #10794
- [Fix] Fixed sampler override and audio_key in prepare_audio_data by @anteju :: PR: #10980
- [WIP] Add docs for NEST SSL by @stevehuang52 :: PR: #10804
- Akoumparouli/mixtral recipe fix r2.0.0 by @akoumpa :: PR: #10994
- TDT compute timestamps option and Extra Whitespace handling for SPE by @monica-sekoyan :: PR: #10875
- ci: Switch to CPU only runner by @ko3n1g :: PR: #11035
- Fix timestamps tests by @monica-sekoyan :: PR: #11053
- ci: Pin release freeze by @ko3n1g :: PR: #11143
- Fix RNN-T loss memory usage by @artbataev :: PR: #11144
- Added deprecation notice by @Ssofja :: PR: #11133
- Fixes for Canary adapters tutorial by @pzelasko :: PR: #11184
- add ipython import guard by @nithinraok :: PR: #11191
- Self Supervised Pre-Training tutorial Fix by @monica-sekoyan :: PR: #11206
- update the return type by @nithinraok :: PR: #11210
- Timestamps to transcribe by @nithinraok :: PR: #10950 (see the sketch below)
- [Doc fixes] update file names, installation instructions, bad links by @erastorgueva-nv :: PR: #11045
- Beam search algorithm implementation for TDT models by @lilithgrigoryan :: PR: #10903
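A minimal sketch of the "Timestamps to transcribe" change (PR #10950), assuming transcribe() accepts a timestamps flag and the returned hypotheses expose word- and segment-level timing; the model id and audio path are placeholders.

```python
import nemo.collections.asr as nemo_asr

# Placeholder checkpoint; any timestamp-capable ASR model should behave similarly.
asr_model = nemo_asr.models.ASRModel.from_pretrained(
    model_name="stt_en_fastconformer_hybrid_large_pc"
)

# Request timing information alongside the decoded text (assumed flag from PR #10950).
hypotheses = asr_model.transcribe(["sample.wav"], timestamps=True)

# Word-level entries for the first utterance; segment/char granularities are
# assumed to be available under the same timestamp mapping.
for word_entry in hypotheses[0].timestamp["word"]:
    print(word_entry)
```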
TTS
Changelog
- Fix asr warnings by @nithinraok :: PR: #10469
- Make nemo text processing optional in TTS by @blisc :: PR: #10584
- [Doc fixes] update file names, installation instructions, bad links by @erastorgueva-nv :: PR: #11045
NLP / NMT
Changelog
- MCORE interface for TP-only FP8 AMAX reduction by @erhoo82 :: PR: #10437
- Remove Apex dependency if not using MixedFusedLayerNorm by @cuichenx :: PR: #10468
- Add missing import guards for causal_conv1d and mamba_ssm dependencies by @janekl :: PR: #10429
- Update doc for fp8 trt-llm export by @Laplasjan107 :: PR: #10444
- Remove running validating after finetuning by @huvunvidia :: PR: #10560
- Extending modelopt spec for TEDotProductAttention by @janekl :: PR: #10523
- Fix mb_calculator import in lora tutorial by @BoxiangW :: PR: #10624
- .nemo conversion bug fix by @dimapihtar :: PR: #10598
- Require setuptools>=70 and update deprecated api by @thomasdhc :: PR: #10659
- Akoumparouli/fix get tokenizer list by @akoumpa :: PR: #10596
- [McoreDistOptim] fix the naming to match apex.dist by @gdengk :: PR: #10707
- [fix] Ensures disabling exp_manager with exp_manager=null does not error by @terrykong :: PR: #10651
- [feat] Update get_model_parallel_src_rank to support tp-pp-dp ordering by @terrykong :: PR: #10652
- feat: Migrate GPTSession refit path in Nemo export to ModelRunner for Aligner by @terrykong :: PR: #10654
- [MCoreDistOptim] Add assertions for McoreDistOptim and fix fp8 arg specs by @gdengk :: PR: #10748
- Fix for crashes with tensorboard_logger=false and VP + LoRA by @vysarge :: PR: #10792
- Adding init_model_parallel to FabricMegatronStrategy by @marcromeyn :: PR: #10733
- Moving steps to MegatronParallel to improve UX for Fabric by @marcromeyn :: PR: #10732
- Adding setup_megatron_optimizer to FabricMegatronStrategy by @marcromeyn :: PR: #10833
- Make FabricMegatronMixedPrecision match MegatronMixedPrecision by @marcromeyn :: PR: #10835
- Fix VPP bug in MegatronStep by @marcromeyn :: PR: #10847
- Expose drop_last in MegatronDataSampler by @farhadrgh :: PR: #10837
- Move collectiob.nlp imports inline for t5 by @marcromeyn :: PR: #10877
- Use a context-manager when opening files by @akoumpa :: PR: #10895
- ckpt convert bug fixes by @dimapihtar :: PR: #10878
- remove deprecated ci tests by @dimapihtar :: PR: #10922
- Update T5 tokenizer (adding additional tokens to tokenizer config) by @huvunvidia :: PR: #10972
- Add support and recipes for HF models via AutoModelForCausalLM by @akoumpa :: PR: #10962
- gpt3 175b cli by @malay-nagda :: PR: #10985
- Fix for crash with LoRA + tp_overlap_comm=false + sequence_parallel=true by @vysarge :: PR: #10920
- Update BaseMegatronSampler for compatibility with PTL's _BatchProgress by @ashors1 :: PR: #11016
- add deprecation note by @dimapihtar :: PR: #11024
- Update ModelOpt Width Pruning example defaults by @kevalmorabia97 :: PR: #10902
- switch to NeMo 2.0 recipes by @dimapihtar :: PR: #10948
- NeMo 1.0: upcycle dense to moe by @akoumpa :: PR: #11002
- Update mcore parallelism initialization in nemo2 by @yaoyu-33 :: PR: #10643
- Gemma2 in Nemo2 with Recipes by @suiyoubi :: PR: #11037
- Add Packed Seq option to GPT based models by @suiyoubi :: PR: #11100
- Fix MCoreGPTModel import in llm.gpt.model.base by @hemildesai :: PR: #11109
- TP+MoE peft fix by @akoumpa :: PR: #11114
- GPT recipes to use full te spec by @JimmyZhang12 :: PR: #11119
- Virtual pipeline parallel support for LoRA in NLPAdapterModelMixin by @vysarge :: PR: #11128
- update nemo args for mcore flash decode arg change by @HuiyingLi :: PR: #11138
- Call ckpt_to_weights_subdir from MegatronCheckpointIO by @ashors1 :: PR: #10897
- fix typo by @dimapihtar :: PR: #11234
- [Doc fixes] update file names, installation instructions, bad links by @erastorgueva-nv :: PR: #11045
- fix(export): GPT models w/ bias=False convert properly by @terrykong :: PR: #11255
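Several entries above touch the export path (the fp8 trt-llm export doc update, the GPTSession-to-ModelRunner migration, the bias=False export fix), so here is a rough sketch of exporting a .nemo checkpoint to a TensorRT-LLM engine with the nemo.export exporter. Paths are placeholders, and the keyword arguments (model_type, tensor_parallelism_size) are assumptions that may differ between releases.

```python
from nemo.export.tensorrt_llm import TensorRTLLM

# Directory where the TensorRT-LLM engine will be written (placeholder path).
exporter = TensorRTLLM(model_dir="/tmp/trtllm_engine")

# Build the engine from a NeMo checkpoint; argument names are assumptions
# and may vary across NeMo/TensorRT-LLM versions.
exporter.export(
    nemo_checkpoint_path="/path/to/model.nemo",
    model_type="llama",
    tensor_parallelism_size=1,
)
```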