Commit
[NeMo-UX] Adding MegatronParallel (#8987)
* Adding MegatronParallel
  Signed-off-by: Marc Romeyn <marcromeyn@gmail.com>
* Minor quantization pipeline updates (#8924)
  * Detect 'arcname' prefix in utils when handling .nemo tarball
    Signed-off-by: Jan Lasek <janek.lasek@gmail.com>
  * Address megatron_amp_O2 = True case in quantization
    Signed-off-by: Jan Lasek <janek.lasek@gmail.com>
  * Add Megatron-LM to PYTHONPATH correctly in Jenkinsfile
    Signed-off-by: Jan Lasek <janek.lasek@gmail.com>
  ---------
  Signed-off-by: Jan Lasek <janek.lasek@gmail.com>
  Signed-off-by: Marc Romeyn <marcromeyn@gmail.com>
* Fix converter (#8960)
  Signed-off-by: yaoyu-33 <yaoyu.094@gmail.com>
  Signed-off-by: Marc Romeyn <marcromeyn@gmail.com>
* Fix memory leak at loss func (#8868)
  * PR #8803: Update embedding init prototype to match mc
    Signed-off-by: Jaemin Choi <jaeminc@nvidia.com>
  * PR #8810: Fix import of get_gpt_layer_ammo_spec
    Signed-off-by: Jaemin Choi <jaeminc@nvidia.com>
  * PR #8853: Fix memory leak at loss func
    Signed-off-by: Jaemin Choi <jaeminc@nvidia.com>
  ---------
  Signed-off-by: Jaemin Choi <jaeminc@nvidia.com>
  Signed-off-by: Shriya Palsamudram <69161273+ShriyaPalsamudram@users.noreply.github.com>
  Co-authored-by: Jaemin Choi <jaeminc@nvidia.com>
  Co-authored-by: Eric Harper <complex451@gmail.com>
  Co-authored-by: Shriya Palsamudram <69161273+ShriyaPalsamudram@users.noreply.github.com>
  Co-authored-by: Pablo Garay <palenq@gmail.com>
  Signed-off-by: Marc Romeyn <marcromeyn@gmail.com>
* PP support in LoRA merge script (#8934)
  * initial commit
    Signed-off-by: Chen Cui <chcui@nvidia.com>
  * enable pp support for merge script and fix output precision
    Signed-off-by: Chen Cui <chcui@nvidia.com>
  * [pre-commit.ci] auto fixes from pre-commit.com hooks
    for more information, see https://pre-commit.ci
  * remove incomplete script for next release
    Signed-off-by: Chen Cui <chcui@nvidia.com>
  ---------
  Signed-off-by: Chen Cui <chcui@nvidia.com>
  Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
  Co-authored-by: Adi Renduchintala <adithya.r@gmail.com>
  Co-authored-by: Eric Harper <complex451@gmail.com>
  Signed-off-by: Marc Romeyn <marcromeyn@gmail.com>
* Mingyuanm/sdxl export (#8926)
  * Move cached embedding devices and dtype for onnx export consistency
    Signed-off-by: Mingyuan Ma <mingyuanm@nvidia.com>
  * Add old trt export/inference script, currently not working in latest container.
    Signed-off-by: Mingyuan Ma <mingyuanm@nvidia.com>
  * Add NeMo TRT inference pipeline and quatization workflow
    Signed-off-by: Mingyuan Ma <mingyuanm@nvidia.com>
  * [pre-commit.ci] auto fixes from pre-commit.com hooks
    for more information, see https://pre-commit.ci
  * Add guards to avoid undefined variables
    Signed-off-by: Mingyuan Ma <mingyuanm@nvidia.com>
  * [pre-commit.ci] auto fixes from pre-commit.com hooks
    for more information, see https://pre-commit.ci
  * minor fix
    Signed-off-by: Mingyuan Ma <mingyuanm@nvidia.com>
  * [pre-commit.ci] auto fixes from pre-commit.com hooks
    for more information, see https://pre-commit.ci
  * Add conversion script from hf sdxl to nemo sdxl
    Signed-off-by: Mingyuan Ma <mingyuanm@nvidia.com>
  * Update quantize pipeline to adapt to variable image dimension
    Signed-off-by: Mingyuan Ma <mingyuanm@nvidia.com>
  * update sdxl pipeline to be aware of additional emb channels
    Signed-off-by: Mingyuan Ma <mingyuanm@nvidia.com>
  * [pre-commit.ci] auto fixes from pre-commit.com hooks
    for more information, see https://pre-commit.ci
  * add guards for potential local var
    Signed-off-by: Mingyuan Ma <mingyuanm@nvidia.com>
  * [pre-commit.ci] auto fixes from pre-commit.com hooks
    for more information, see https://pre-commit.ci
  * copyright header
    Signed-off-by: Mingyuan Ma <mingyuanm@nvidia.com>
  * [pre-commit.ci] auto fixes from pre-commit.com hooks
    for more information, see https://pre-commit.ci
  * Update calib prompt file path
    Signed-off-by: Mingyuan Ma <mingyuanm@nvidia.com>
  * Update file paths
    Signed-off-by: Mingyuan Ma <mingyuanm@nvidia.com>
  * minor update
    Signed-off-by: Mingyuan Ma <mingyuanm@nvidia.com>
  * [pre-commit.ci] auto fixes from pre-commit.com hooks
    for more information, see https://pre-commit.ci
  * Update default quantization config
    Signed-off-by: Mingyuan Ma <mingyuanm@nvidia.com>
  * remove unused imports/vars
    Signed-off-by: Mingyuan Ma <mingyuanm@nvidia.com>
  * [pre-commit.ci] auto fixes from pre-commit.com hooks
    for more information, see https://pre-commit.ci
  * remove unused imports
    Signed-off-by: Mingyuan Ma <mingyuanm@nvidia.com>
  ---------
  Signed-off-by: Mingyuan Ma <mingyuanm@nvidia.com>
  Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
  Signed-off-by: Marc Romeyn <marcromeyn@gmail.com>
* Avoid unpacking NeMo checkpoints before exporting to TRT-LLM (#8866)
  * Replaced unpacking of nemo checkpoints on export with a VFS-like TarPath object.
    Signed-off-by: Alexey Panteleev <alpanteleev@nvidia.com>
  * [pre-commit.ci] auto fixes from pre-commit.com hooks
    for more information, see https://pre-commit.ci
  * Fixed the signature of ZarrPathStore.__delitem__
    Signed-off-by: Alexey Panteleev <alpanteleev@nvidia.com>
  ---------
  Signed-off-by: Alexey Panteleev <alpanteleev@nvidia.com>
  Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
  Co-authored-by: Onur Yilmaz <35306097+oyilmaz-nvidia@users.noreply.github.com>
  Signed-off-by: Marc Romeyn <marcromeyn@gmail.com>
* update (#8978)
  Signed-off-by: eharper <eharper@nvidia.com>
  Signed-off-by: Marc Romeyn <marcromeyn@gmail.com>
* change the condition for get qkv tensor from linear_qkv output (#8965)
  Signed-off-by: HuiyingLi <willwin.lee@gmail.com>
  Co-authored-by: Adi Renduchintala <adithya.r@gmail.com>
  Signed-off-by: Marc Romeyn <marcromeyn@gmail.com>
* Update Latest News (#8837)
  * Update Latest News
    Adds links to articles on
    * NeMo framework on GKE
    * Responsible Gen AI using NeMo and Picasso
    * NeMo powering Amazon Titan foundation models
    Signed-off-by: Shashank Verma <shashankv@nvidia.com>
  * Minor updates to latest news in README
    * Remove bullets
    * Editing text for clarity
    Signed-off-by: Shashank Verma <shashankv@nvidia.com>
  * Format latest news as a dropdown list
    * Uses embedded html to format news to dropdown, hiding lengthy details
    * Fixes formatting of the title
    Signed-off-by: Shashank Verma <shashankv@nvidia.com>
  * Add break to improve readability of latest news image
    Signed-off-by: Shashank Verma <shashankv@nvidia.com>
  * Add LLM and MM section in latest news
    Signed-off-by: Shashank Verma <shashankv@nvidia.com>
  * Add margin in latest news expandable lists
    Signed-off-by: Shashank Verma <shashankv@nvidia.com>
  * Remove styling of expandable list
    * Github appears to not render styled elements when embedded as raw html in rst
    Signed-off-by: Shashank Verma <shashankv@nvidia.com>
  * Fold the first news item by default
    Signed-off-by: Shashank Verma <shashankv@nvidia.com>
  ---------
  Signed-off-by: Shashank Verma <shashankv@nvidia.com>
  Signed-off-by: Shashank Verma <shashank3959@gmail.com>
  Signed-off-by: Marc Romeyn <marcromeyn@gmail.com>
* Fix incorrect link to latest news in README (#8985)
  Signed-off-by: Shashank Verma <shashankv@nvidia.com>
  Signed-off-by: Marc Romeyn <marcromeyn@gmail.com>
* [pre-commit.ci] auto fixes from pre-commit.com hooks
  for more information, see https://pre-commit.ci
  Signed-off-by: Marc Romeyn <marcromeyn@gmail.com>
* make unit tests works
  Signed-off-by: Chen Cui <chcui@nvidia.com>
  Signed-off-by: Marc Romeyn <marcromeyn@gmail.com>
* [pre-commit.ci] auto fixes from pre-commit.com hooks
  for more information, see https://pre-commit.ci
  Signed-off-by: Marc Romeyn <marcromeyn@gmail.com>
* add pytest-mock to unit test reqs
  Signed-off-by: Chen Cui <chcui@nvidia.com>
  Signed-off-by: Marc Romeyn <marcromeyn@gmail.com>
* Enable using hybrid asr models in CTC Segmentation tool (#8828)
  * enable using hybrid asr models in ctc segmentation tool
    Signed-off-by: Elena Rastorgueva <erastorgueva@nvidia.com>
  * [pre-commit.ci] auto fixes from pre-commit.com hooks
    for more information, see https://pre-commit.ci
  ---------
  Signed-off-by: Elena Rastorgueva <erastorgueva@nvidia.com>
  Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
  Signed-off-by: Marc Romeyn <marcromeyn@gmail.com>
* Add safety checks for 'data' key in MegatronGPTModel cfg (#8991)
  Signed-off-by: HuiyingLi <willwin.lee@gmail.com>
  Signed-off-by: Marc Romeyn <marcromeyn@gmail.com>
* address some comments
  Signed-off-by: Chen Cui <chcui@nvidia.com>
  Signed-off-by: Marc Romeyn <marcromeyn@gmail.com>
* TDT confidence fix (#8982)
  * tdt confidence fix
  ---------
  Signed-off-by: Aleksandr Laptev <alaptev@nvidia.com>
  Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
  Signed-off-by: Marc Romeyn <marcromeyn@gmail.com>
* Address PR comments
  Signed-off-by: Marc Romeyn <marcromeyn@gmail.com>

---------

Signed-off-by: Marc Romeyn <marcromeyn@gmail.com>
Signed-off-by: Jan Lasek <janek.lasek@gmail.com>
Signed-off-by: yaoyu-33 <yaoyu.094@gmail.com>
Signed-off-by: Jaemin Choi <jaeminc@nvidia.com>
Signed-off-by: Shriya Palsamudram <69161273+ShriyaPalsamudram@users.noreply.github.com>
Signed-off-by: Chen Cui <chcui@nvidia.com>
Signed-off-by: Mingyuan Ma <mingyuanm@nvidia.com>
Signed-off-by: Alexey Panteleev <alpanteleev@nvidia.com>
Signed-off-by: eharper <eharper@nvidia.com>
Signed-off-by: HuiyingLi <willwin.lee@gmail.com>
Signed-off-by: Shashank Verma <shashankv@nvidia.com>
Signed-off-by: Shashank Verma <shashank3959@gmail.com>
Signed-off-by: Elena Rastorgueva <erastorgueva@nvidia.com>
Signed-off-by: Aleksandr Laptev <alaptev@nvidia.com>
Co-authored-by: Marc Romeyn <marcromeyn@gmail.com>
Co-authored-by: Jan Lasek <janek.lasek@gmail.com>
Co-authored-by: yaoyu-33 <54727607+yaoyu-33@users.noreply.github.com>
Co-authored-by: Jaemin Choi <minitu77@gmail.com>
Co-authored-by: Jaemin Choi <jaeminc@nvidia.com>
Co-authored-by: Eric Harper <complex451@gmail.com>
Co-authored-by: Shriya Palsamudram <69161273+ShriyaPalsamudram@users.noreply.github.com>
Co-authored-by: Pablo Garay <palenq@gmail.com>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
Co-authored-by: Adi Renduchintala <adithya.r@gmail.com>
Co-authored-by: Ming <111467530+Victor49152@users.noreply.github.com>
Co-authored-by: Alexey Panteleev <apanteleev87@gmail.com>
Co-authored-by: Onur Yilmaz <35306097+oyilmaz-nvidia@users.noreply.github.com>
Co-authored-by: Huiying <willwin.lee@gmail.com>
Co-authored-by: Shashank Verma <shashank3959@gmail.com>
Co-authored-by: Shashank Verma <shashankv@nvidia.com>
Co-authored-by: Elena Rastorgueva <80532067+erastorgueva-nv@users.noreply.github.com>
Co-authored-by: Aleksandr Laptev <alaptev@nvidia.com>