NeMo Multimodal Docs and Tests Initial PR #8028

Merged Jan 16, 2024 (561 commits)
Commits
febcab0
[TTS] Fix FastPitch data prep tutorial (#7524)
rlangman Sep 27, 2023
147e7ac
add italian tokenization (#7486)
GiacomoLeoneMaria Sep 27, 2023
301a266
Replace None strategy with auto in tutorial notebooks (#7521) (#7527)
github-actions[bot] Sep 27, 2023
9bee661
unpin setuptools (#7534) (#7535)
github-actions[bot] Sep 27, 2023
b546643
remove auto generated examples (#7510)
arendu Sep 27, 2023
08e91e1
Add the `strategy` argument to `MegatronGPTModel.generate()` (#7264)
odelalleau Sep 27, 2023
677960d
Fix PTL2.0 related ASR bugs in r1.21.0: Val metrics logging, None dat…
github-actions[bot] Sep 27, 2023
3b435ed
gpus -> devices (#7542) (#7545)
github-actions[bot] Sep 28, 2023
2bb56f7
Update FFMPEG version to fix issue with torchaudio (#7551) (#7553)
github-actions[bot] Sep 28, 2023
82f547f
PEFT GPT & T5 Refactor (#7308)
meatybobby Sep 28, 2023
1689790
fix a typo (#7496)
BestJuly Sep 28, 2023
3d28306
[TTS] remove curly braces from ${BRANCH} in jupyer notebook cell. (#7…
github-actions[bot] Sep 28, 2023
b38c28a
add youtube embed url (#7570)
XuesongYang Sep 29, 2023
b9033f2
Remap speakers to continuous range of speaker_id for dataset AISHELL3…
RobinDong Sep 29, 2023
62097e5
fix validation_step_outputs initialization for multi-dataloader (#754…
github-actions[bot] Sep 29, 2023
fe50fa3
Append output of val step to self.validation_step_outputs (#7530) (#7…
github-actions[bot] Sep 29, 2023
bf88a23
[TTS] fixed trainer's accelerator and strategy. (#7569) (#7574)
github-actions[bot] Sep 29, 2023
7987c21
Append val/test output to instance variable in EncDecSpeakerLabelMode…
github-actions[bot] Sep 29, 2023
50ab483
Fix CustomProgressBar for resume (#7427) (#7522)
github-actions[bot] Sep 30, 2023
2cb9e4c
fix typos in nfa and speech enhancement tutorials (#7580) (#7583)
github-actions[bot] Sep 30, 2023
2295e44
Add strategy as ddp_find_unused_parameters_true for glue_benchmark.py…
github-actions[bot] Sep 30, 2023
1be5988
update strategy (#7577) (#7578)
github-actions[bot] Sep 30, 2023
8f36214
Fix typos (#7581)
Kipok Oct 2, 2023
f29a917
Change hifigan finetune strategy to ddp_find_unused_parameters_true (…
github-actions[bot] Oct 2, 2023
dc60a47
[BugFix] Add missing quotes for auto strategy in tutorial notebooks (…
github-actions[bot] Oct 2, 2023
879047e
add build os key (#7596) (#7599)
github-actions[bot] Oct 2, 2023
0b1ea36
StarCoder SFT test + bump PyT NGC image to 23.09 (#7540)
janekl Oct 2, 2023
703d2e8
defaults changed (#7600)
arendu Oct 3, 2023
8b77683
add ItalianPhonemesTokenizer (#7587)
GiacomoLeoneMaria Oct 3, 2023
e603cad
best ckpt fix (#7564) (#7588)
github-actions[bot] Oct 3, 2023
4d5184c
Add files via upload (#7598)
Jorjeous Oct 3, 2023
f10f93b
Fix validation in G2PModel and ThutmoseTaggerModel (#7597) (#7606)
github-actions[bot] Oct 3, 2023
a12835e
Broadcast loss only when using pipeline parallelism and within the pi…
github-actions[bot] Oct 3, 2023
5211e5b
Safeguard nemo_text_processing installation on ARM (#7485)
blisc Oct 3, 2023
9590c3c
Bound transformers version in requirements (#7620)
athitten Oct 4, 2023
fe5af22
fix llama2 70b lora tuning bug (#7622)
cuichenx Oct 4, 2023
381d84e
Fix import error no module name model_utils (#7629)
menon92 Oct 4, 2023
19f32c5
add fc large ls models (#7641)
nithinraok Oct 4, 2023
329bd3c
bugfix: trainer.gpus, trainer.strategy, trainer.accelerator (#7621) (…
github-actions[bot] Oct 5, 2023
e109c6e
fix ssl models ptl monitor val through logging (#7608) (#7614)
github-actions[bot] Oct 5, 2023
b36555b
Fix metrics for SE tutorial (#7604) (#7612)
github-actions[bot] Oct 5, 2023
a0053a6
Add ddp_find_unused_parameters=True and change accelerator to auto (#…
github-actions[bot] Oct 5, 2023
358f5c6
Fix py3.11 dataclasses issue (#7616)
github-actions[bot] Oct 5, 2023
6aff5e9
[Stable Diffusion/ControlNet] Enable O2 training for SD and Fix Contr…
Victor49152 Oct 9, 2023
37e9706
Merge branch 'mingyuanm/sd_o2' into 'internal/main'
Victor49152 Oct 9, 2023
32e4fba
Mingyuanm/dreambooth fix
Victor49152 Oct 10, 2023
3bf91c3
Merge branch 'mingyuanm/dreambooth_fix' into 'internal/main'
Victor49152 Oct 10, 2023
173c468
Fix NeMo CI Infer Issue
suiyoubi Oct 10, 2023
32af5bc
Merge branch 'aot/imagen_fix' into 'internal/main'
Oct 10, 2023
3e038fd
DreamFusion
ahmadki Oct 11, 2023
981fca5
Move neva export changes
meatybobby Oct 12, 2023
ed895d0
Add Imagen Synthetic Dataloader
suiyoubi Oct 13, 2023
ad7ef5a
Merge branch 'aot/syn_dataset_imagen' into 'internal/main'
Oct 13, 2023
fd7c1d3
Add VITWrapper and export stuff to wrapper
meatybobby Oct 13, 2023
dd67c95
Update neva with megatron-core support
Oct 13, 2023
90c0559
Merge branch 'yuya/neva_mcore2' into 'internal/main'
Oct 13, 2023
1e4c2b2
Fix issues with Dockerfile (#7650) (#7652)
github-actions[bot] Oct 6, 2023
798f6fc
[ASR] RNN-T greedy decoding max_frames fix for alignment and confiden…
GNroy Oct 6, 2023
d9861d1
[ASR] Fix type error in jasper (#7636) (#7653)
github-actions[bot] Oct 6, 2023
3e38b79
[TTS] Add STFT and SI-SDR loss to audio codec recipe (#7468)
rlangman Oct 6, 2023
2209a30
Create per.py (#7538)
ssh-meister Oct 7, 2023
70c0a37
conversion issue fix (#7648) (#7668)
github-actions[bot] Oct 10, 2023
b7bcf08
layernorm1p fix (#7523) (#7567)
github-actions[bot] Oct 10, 2023
b3da442
generalized chat sft prompt (#7655)
yidong72 Oct 10, 2023
188f0a1
Fix vad & speech command tutorial - onnx (#7671) (#7672)
github-actions[bot] Oct 10, 2023
33d04b2
Fix in the confidence ensemble test (#7682)
Kipok Oct 11, 2023
40f8256
PEFT eval fix (#7626) (#7638)
github-actions[bot] Oct 11, 2023
79c3703
propagate mp config (#7637) (#7639)
github-actions[bot] Oct 11, 2023
aba4a00
Add find_unused_parameters_true for text_classiftn and punctuation_ca…
github-actions[bot] Oct 11, 2023
503301b
Hotfix (#7501) (#7568)
github-actions[bot] Oct 11, 2023
98e6ffe
Avoid duplicated checkpoint save (#7555) (#7566)
github-actions[bot] Oct 11, 2023
b6fecc5
Cache FP8 weight and transpose only at the first micro-batch in each …
github-actions[bot] Oct 11, 2023
292d232
Add an option to disable manual GC in validation (#7467) (#7476)
github-actions[bot] Oct 11, 2023
9c48ce1
Remove PUBLICATIONS.md, point to github.io NeMo page instead (#7694) …
github-actions[bot] Oct 11, 2023
762b5ca
Fix multi rank finetune for ASR (#7684) (#7699)
github-actions[bot] Oct 11, 2023
7755c17
Update docs: readme, getting started, ASR intro (#7679)
erastorgueva-nv Oct 11, 2023
5f35a8c
fix onnx (#7703) (#7704)
github-actions[bot] Oct 12, 2023
29910cd
move core install to /workspace (#7706)
aklife97 Oct 12, 2023
aa3a977
Fix typo in audio codec config, encoder target (#7697)
anteju Oct 12, 2023
eab0f54
Replace strategy='dp'/None with 'auto' (#7681) (#7696)
github-actions[bot] Oct 13, 2023
233e62b
[ASR] Multichannel mask estimator with flex number of channels (#7317)
anteju Oct 13, 2023
3cd9fbd
fix ptl_bugs in slu_models.py (#7689) (#7712)
github-actions[bot] Oct 13, 2023
ddf546d
fix code block typo (#7717)
erastorgueva-nv Oct 13, 2023
ff7154d
Update key mapping logic
Victor49152 Oct 16, 2023
f73180d
Merge branch 'main' into internal/main
yaoyu-33 Oct 16, 2023
0087ee3
Few merge fixes
yaoyu-33 Oct 16, 2023
8bdbd47
[pre-commit.ci] auto fixes from pre-commit.com hooks
pre-commit-ci[bot] Oct 16, 2023
7be8108
Fix diff for non-mm models
yaoyu-33 Oct 16, 2023
aab3c40
Fix diff for non-mm models
yaoyu-33 Oct 16, 2023
38dc290
[pre-commit.ci] auto fixes from pre-commit.com hooks
pre-commit-ci[bot] Oct 16, 2023
563cadb
Remove deployment and export scripts
yaoyu-33 Oct 16, 2023
9a566be
Improve the unet ckpt loading logic.
Victor49152 Oct 16, 2023
7a0ae36
Improve the unet ckpt loading logic.
Victor49152 Oct 16, 2023
576c652
Add checkpoint_averaging script
yaoyu-33 Oct 17, 2023
d6900f9
Hide multimodal code changes
yaoyu-33 Oct 17, 2023
3b1b802
[pre-commit.ci] auto fixes from pre-commit.com hooks
pre-commit-ci[bot] Oct 17, 2023
526924d
Merge branch 'main' into multimodal_merge
ericharper Oct 19, 2023
a1f7296
Fix Eric's comments
yaoyu-33 Oct 23, 2023
41632c6
Revert "Hide multimodal code changes"
yaoyu-33 Oct 23, 2023
f40b56e
[pre-commit.ci] auto fixes from pre-commit.com hooks
pre-commit-ci[bot] Oct 23, 2023
c032a6d
Merge branch 'multimodal/merge_mm_code' into internal/main
yaoyu-33 Oct 24, 2023
ec8256b
Fix configs
yaoyu-33 Oct 24, 2023
5dad277
Fix neva model
yaoyu-33 Oct 24, 2023
c1c5981
Fix neva casting
yaoyu-33 Oct 24, 2023
b0c5320
Fix neva LoRA non MCore version
yaoyu-33 Oct 25, 2023
14cf3bd
Merge branch 'main' into multimodal_merge
ericharper Oct 25, 2023
4e178e3
Fix neva LoRA MCore
yaoyu-33 Oct 25, 2023
cacf9a8
[SD] group norm fixes
sjmikler Oct 25, 2023
2da64db
Fix neva cfg merge
yaoyu-33 Oct 26, 2023
fba2548
remove groupnorm dependency
suiyoubi Oct 27, 2023
a2da20d
Merge branch 'main' into multimodal_merge
ericharper Oct 30, 2023
41b1b51
Fix copyright headers
yaoyu-33 Oct 30, 2023
438617e
Merge branch 'aot/apex_gn' into 'internal/main'
Oct 30, 2023
7422dbe
LLaVA 1_5 and LORA update
Oct 30, 2023
de405b9
Merge branch 'yuya/llava_1_5_update' into 'internal/main'
Oct 30, 2023
5965a5f
Fix logs
yaoyu-33 Oct 30, 2023
26ee7dc
Fix neva mcore infernece
yaoyu-33 Oct 31, 2023
7356b1c
Fix ema
yaoyu-33 Oct 31, 2023
93e4f99
Fix ema
yaoyu-33 Oct 31, 2023
ca3d8f9
Address Somshubra comments
yaoyu-33 Nov 1, 2023
544e5ea
[pre-commit.ci] auto fixes from pre-commit.com hooks
pre-commit-ci[bot] Nov 1, 2023
8493a8a
Fix NeVA
yaoyu-33 Nov 1, 2023
ea3d4fc
Remove llama tricks since we are padding the embedding weights direct…
yaoyu-33 Nov 1, 2023
2d5f5ab
Merge branch 'multimodal/merge' into multimodal/merge_mm_code
yaoyu-33 Nov 1, 2023
6f5df3f
Update Dockerfile and mm requirements
meatybobby Nov 1, 2023
65bcec3
Merge branch 'bobchen/nemo_toolkit' into 'internal/main'
Nov 1, 2023
4dff83f
Multimodal unit and jenkins tests
Nov 1, 2023
02cc05d
Merge branch 'mm_tests' into 'internal/main'
Nov 1, 2023
724c956
Add Multimodal Docs
Nov 1, 2023
4951f4f
Merge branch 'mm_docs' into 'internal/main'
Nov 1, 2023
6beaa50
update default conv_template
yaoyu-33 Nov 1, 2023
2f4e334
Merge branch 'internal/main' into multimodal/merge_mm_code
yaoyu-33 Nov 1, 2023
c083f0f
[pre-commit.ci] auto fixes from pre-commit.com hooks
pre-commit-ci[bot] Nov 1, 2023
367723f
Merge branch 'main' into multimodal_merge
ericharper Nov 1, 2023
2840014
Fix neva evaluation
yaoyu-33 Nov 1, 2023
97d9bf9
Update Dockerfile
yaoyu-33 Nov 1, 2023
9173dc2
Merge branch 'internal/main' into multimodal/merge_mm_code
yaoyu-33 Nov 1, 2023
149cdde
Merge branch 'main' into multimodal_merge
ericharper Nov 2, 2023
6b84cef
Fix evaluation loading
yaoyu-33 Nov 2, 2023
ccd6cb5
Fix evaluation API
yaoyu-33 Nov 2, 2023
e0a74da
Merge branch 'internal/main' into multimodal/merge_mm_code
yaoyu-33 Nov 2, 2023
85bd797
[pre-commit.ci] auto fixes from pre-commit.com hooks
pre-commit-ci[bot] Nov 2, 2023
9b4c9c2
Change quick-gelu to approx-gelu
yaoyu-33 Nov 2, 2023
e2ccc88
hide multimodal
yaoyu-33 Nov 2, 2023
1057139
Merge branch 'multimodal/merge' into multimodal/merge_mm_code
yaoyu-33 Nov 2, 2023
7ed6283
Revert "hide multimodal"
yaoyu-33 Nov 2, 2023
f6ef703
REstructure
yaoyu-33 Nov 2, 2023
9751d10
REstructure again
yaoyu-33 Nov 3, 2023
9ac6102
Update neva evalution code
yaoyu-33 Nov 3, 2023
d4fe16c
Merge branch 'internal/main_change_structure' into multimodal/merge_m…
yaoyu-33 Nov 3, 2023
488d7e9
[pre-commit.ci] auto fixes from pre-commit.com hooks
pre-commit-ci[bot] Nov 3, 2023
2f29c5d
Merge branch 'main' into multimodal/merge_mm_code
yaoyu-33 Nov 3, 2023
0e9c30c
Fix neva model after merging
yaoyu-33 Nov 3, 2023
f68ba2c
[pre-commit.ci] auto fixes from pre-commit.com hooks
pre-commit-ci[bot] Nov 3, 2023
5df0c40
Restructure
yaoyu-33 Nov 6, 2023
b1555a6
Restructure, rename
yaoyu-33 Nov 6, 2023
71141c5
[pre-commit.ci] auto fixes from pre-commit.com hooks
pre-commit-ci[bot] Nov 6, 2023
87b724e
Restructure
yaoyu-33 Nov 6, 2023
e9ba432
[pre-commit.ci] auto fixes from pre-commit.com hooks
pre-commit-ci[bot] Nov 6, 2023
76df1d8
Merge branch 'main' into multimodal/merge_mm_code
yaoyu-33 Nov 6, 2023
b49f12b
Remove package requirement
meatybobby Nov 3, 2023
d2c200c
hide docs and artifacts
yaoyu-33 Nov 6, 2023
8007765
Merge remote-tracking branch 'github/multimodal/merge_mm_code' into m…
yaoyu-33 Nov 6, 2023
72c683e
Rename Nerf
yaoyu-33 Nov 7, 2023
782316f
Hide Nerf and text to image
yaoyu-33 Nov 7, 2023
d24f74d
Merge branch 'main' into multimodal/merge_mm_code
ericharper Nov 10, 2023
66d42be
Merge branch 'main' into multimodal/merge_mm_code
ericharper Nov 16, 2023
c8dd7e3
Update examples/multimodal/multimodal_llm/neva/convert_hf_llava_to_ne…
yaoyu-33 Nov 16, 2023
565e617
Update examples/multimodal/multimodal_llm/neva/convert_hf_llava_to_ne…
yaoyu-33 Nov 16, 2023
fd1ada8
Fix PR comments, clean comments, move to torch_dtype_from_precision
yaoyu-33 Nov 16, 2023
bccb0ea
[pre-commit.ci] auto fixes from pre-commit.com hooks
pre-commit-ci[bot] Nov 16, 2023
d596f59
Update to torch_dtype_from_precision
yaoyu-33 Nov 16, 2023
ed9145c
[pre-commit.ci] auto fixes from pre-commit.com hooks
pre-commit-ci[bot] Nov 16, 2023
084ff69
Merge branch 'main' into multimodal/merge_mm_code
ericharper Nov 21, 2023
993f969
Merge branch 'main' into multimodal/merge_mm_code
ericharper Nov 27, 2023
2d3a6b7
Fix PR comments
yaoyu-33 Dec 4, 2023
00ef2b4
Fix copyright and docstrings
yaoyu-33 Dec 4, 2023
0ccf916
Update docstrings
yaoyu-33 Dec 4, 2023
3574590
Optimize imports
yaoyu-33 Dec 4, 2023
90d08a8
[pre-commit.ci] auto fixes from pre-commit.com hooks
pre-commit-ci[bot] Dec 4, 2023
e14f713
Revert "Hide Nerf and text to image"
yaoyu-33 Dec 4, 2023
4d94fef
Add copyright information
yaoyu-33 Dec 4, 2023
50e8871
Optimize imports
yaoyu-33 Dec 4, 2023
2ce8e36
[pre-commit.ci] auto fixes from pre-commit.com hooks
pre-commit-ci[bot] Dec 4, 2023
96b20b2
Merge branch 'multimodal/merge_mm_code' into multimodal/merge_mm_text…
yaoyu-33 Dec 4, 2023
2418283
Optimize imports
yaoyu-33 Dec 4, 2023
cd56c9e
[pre-commit.ci] auto fixes from pre-commit.com hooks
pre-commit-ci[bot] Dec 4, 2023
63304d3
Merge branch 'main' into multimodal/merge_mm_text2img_nerf
yaoyu-33 Dec 13, 2023
a0b4861
Update multimodal docs and tests
yaoyu-33 Dec 13, 2023
660f657
Update multimodal jenkins
yaoyu-33 Dec 13, 2023
490cd1b
Merge branch 'main' into multimodal/merge_mm_text2img_nerf
yaoyu-33 Dec 13, 2023
a1ae609
Update unit test
yaoyu-33 Dec 13, 2023
4cf447c
Update docs
yaoyu-33 Dec 13, 2023
94ef0fb
Update docs
yaoyu-33 Dec 13, 2023
7165ada
[pre-commit.ci] auto fixes from pre-commit.com hooks
pre-commit-ci[bot] Dec 14, 2023
b677858
Merge branch 'main' into multimodal/merge_mm_text2img_nerf
ericharper Dec 15, 2023
5835c8a
Address comments
yaoyu-33 Dec 15, 2023
5ef5dc2
Merge branch 'multimodal/merge_mm_text2img_nerf' into multimodal/merg…
yaoyu-33 Dec 15, 2023
cf6a075
Bug fix due to restructure
yaoyu-33 Dec 15, 2023
eae4edc
Fix unit test
yaoyu-33 Dec 15, 2023
76d07fc
Bug fix due to restructure
yaoyu-33 Dec 15, 2023
21df321
remove color map detector
Victor49152 Dec 15, 2023
bdf1dc6
[pre-commit.ci] auto fixes from pre-commit.com hooks
pre-commit-ci[bot] Dec 15, 2023
032a819
[pre-commit.ci] auto fixes from pre-commit.com hooks
pre-commit-ci[bot] Dec 15, 2023
77f3b40
Merge remote-tracking branch 'github/multimodal/merge_mm_text2img_ner…
yaoyu-33 Dec 15, 2023
28102bc
Merge remote-tracking branch 'github/multimodal/merge_mm_docs_tests' …
yaoyu-33 Dec 15, 2023
a26caa2
Dreambooth loading fix
yaoyu-33 Dec 16, 2023
c9a2d6c
Fix Jenkinsfile
yaoyu-33 Dec 18, 2023
01d1862
copyright
yaoyu-33 Dec 18, 2023
d0b30ef
Fix jekins tests for sd/controlnet/dreambooth
Victor49152 Dec 18, 2023
7ed9491
[pre-commit.ci] auto fixes from pre-commit.com hooks
pre-commit-ci[bot] Dec 18, 2023
a39bf94
Merge branch 'main' into multimodal/merge_mm_text2img_nerf
ericharper Dec 19, 2023
7165348
Merge branch 'multimodal/merge_mm_text2img_nerf' into multimodal/merg…
ericharper Jan 2, 2024
3887a76
neva api bug fix
yaoyu-33 Jan 9, 2024
5688ea7
Merge branch 'main' into multimodal/merge_mm_docs_tests
yaoyu-33 Jan 11, 2024
f810c42
code scan clean
yaoyu-33 Jan 11, 2024
d4d44a9
Fix Jenkinsfile
yaoyu-33 Jan 11, 2024
866c6dc
Fix copyright
yaoyu-33 Jan 11, 2024
10bd961
Update requirements
yaoyu-33 Jan 11, 2024
4c3c9bc
[pre-commit.ci] auto fixes from pre-commit.com hooks
pre-commit-ci[bot] Jan 11, 2024
7966fd1
Remove git versioning in requirements_test.txt
yaoyu-33 Jan 11, 2024
bd7705c
[pre-commit.ci] auto fixes from pre-commit.com hooks
pre-commit-ci[bot] Jan 11, 2024
716f912
Update requirements
yaoyu-33 Jan 11, 2024
0ff66c2
Fix imports
yaoyu-33 Jan 11, 2024
fd3609e
[pre-commit.ci] auto fixes from pre-commit.com hooks
pre-commit-ci[bot] Jan 11, 2024
ced5795
Fix imports
yaoyu-33 Jan 11, 2024
1e3856f
[pre-commit.ci] auto fixes from pre-commit.com hooks
pre-commit-ci[bot] Jan 11, 2024
37fa699
Fix imports
yaoyu-33 Jan 11, 2024
0a8ea16
[pre-commit.ci] auto fixes from pre-commit.com hooks
pre-commit-ci[bot] Jan 12, 2024
0a39a55
Add a guard to warn users always to use spartial transformer
Victor49152 Jan 5, 2024
dc1f064
[pre-commit.ci] auto fixes from pre-commit.com hooks
pre-commit-ci[bot] Jan 12, 2024
d08d844
Fix logging
yaoyu-33 Jan 12, 2024
5a84a5e
Fix jenkins
yaoyu-33 Jan 12, 2024
c22cc9f
Hide multimodal unit test
yaoyu-33 Jan 12, 2024
408d539
Remove flash attn requirement
yaoyu-33 Jan 12, 2024
e37c8c5
Hide vision test
yaoyu-33 Jan 12, 2024
04b234e
Fix requirements for multimodal
yaoyu-33 Jan 12, 2024
9025ffa
[pre-commit.ci] auto fixes from pre-commit.com hooks
pre-commit-ci[bot] Jan 12, 2024
3232d42
Fix requirements for multimodal
yaoyu-33 Jan 12, 2024
ba250bc
Fix readme
yaoyu-33 Jan 12, 2024
a63f38d
Mv Multimodal jenkins earlier
yaoyu-33 Jan 12, 2024
ec05686
Fix controlnet
yaoyu-33 Jan 12, 2024
89623cd
[pre-commit.ci] auto fixes from pre-commit.com hooks
pre-commit-ci[bot] Jan 12, 2024
3140877
Remove some requirements
yaoyu-33 Jan 12, 2024
311cd49
Don't use flash attn in jenkins
yaoyu-33 Jan 12, 2024
65770a1
Turn off flash-attn
yaoyu-33 Jan 13, 2024
f0b5bce
Hide MM jenkins tests
yaoyu-33 Jan 13, 2024
bfaade8
replaced pymeshlab library with trimesh
ahmadki Jan 16, 2024
914c7ed
dropped pymeshlab from multimodal requirements file
ahmadki Jan 16, 2024
a3d0ac3
Update imageio requirements
yaoyu-33 Jan 16, 2024
273 changes: 272 additions & 1 deletion Jenkinsfile
@@ -115,6 +115,256 @@ pipeline {
sh 'CUDA_VISIBLE_DEVICES="" NEMO_NUMBA_MINVER=0.53 pytest -m "not pleasefixme" --cpu --with_downloads --relax_numba_compat'
}
}
//
// stage('L2: Multimodal Imagen Train') {
// when {
// anyOf {
// branch 'main'
// changeRequest target: 'main'
// }
// }
// failFast true
// steps {
// sh "rm -rf /home/TestData/multimodal/imagen_train"
// sh "pip install webdataset==0.2.48"
// sh "python examples/multimodal/text_to_image/imagen/imagen_training.py \
// trainer.precision=16 \
// trainer.num_nodes=1 \
// trainer.devices=1 \
// ++exp_manager.max_time_per_run=00:00:03:00 \
// trainer.max_steps=20 \
// model.micro_batch_size=1 \
// model.global_batch_size=1 \
// model.data.synthetic_data=True \
// exp_manager.exp_dir=/home/TestData/multimodal/imagen_train \
// model.inductor=False \
// model.unet.flash_attention=False \
// "
// sh "pip install 'webdataset>=0.1.48,<=0.1.62'"
// sh "rm -rf /home/TestData/multimodal/imagen_train"
// }
// }
//
// stage('L2: Multimodal Stable Diffusion Train') {
// when {
// anyOf {
// branch 'main'
// changeRequest target: 'main'
// }
// }
// failFast true
// steps {
// sh "rm -rf /home/TestData/multimodal/stable_diffusion_train"
// sh "pip install webdataset==0.2.48"
// sh "python examples/multimodal/text_to_image/stable_diffusion/sd_train.py \
// trainer.precision=16 \
// trainer.num_nodes=1 \
// trainer.devices=1 \
// ++exp_manager.max_time_per_run=00:00:03:00 \
// trainer.max_steps=20 \
// model.micro_batch_size=1 \
// model.global_batch_size=1 \
// model.data.synthetic_data=True \
// exp_manager.exp_dir=/home/TestData/multimodal/stable_diffusion_train \
// model.inductor=False \
// model.cond_stage_config._target_=nemo.collections.multimodal.modules.stable_diffusion.encoders.modules.FrozenCLIPEmbedder \
// ++model.cond_stage_config.version=openai/clip-vit-large-patch14 \
// ++model.cond_stage_config.max_length=77 \
// ~model.cond_stage_config.restore_from_path \
// ~model.cond_stage_config.freeze \
// ~model.cond_stage_config.layer \
// model.unet_config.from_pretrained=null \
// model.first_stage_config.from_pretrained=null \
// model.unet_config.use_flash_attention=False \
// "
// sh "pip install 'webdataset>=0.1.48,<=0.1.62'"
// sh "rm -rf /home/TestData/multimodal/stable_diffusion_train"
// }
// }
// stage('L2: Multimodal ControlNet Train') {
// when {
// anyOf {
// branch 'main'
// changeRequest target: 'main'
// }
// }
// failFast true
// steps {
// sh "rm -rf /home/TestData/multimodal/controlnet_train"
// sh "pip install webdataset==0.2.48"
// sh "python examples/multimodal/text_to_image/controlnet/controlnet_train.py \
// trainer.precision=16 \
// trainer.num_nodes=1 \
// trainer.devices=1 \
// ++exp_manager.max_time_per_run=00:00:03:00 \
// trainer.max_steps=20 \
// model.micro_batch_size=1 \
// model.global_batch_size=1 \
// model.data.synthetic_data=True \
// exp_manager.exp_dir=/home/TestData/multimodal/controlnet_train \
// model.inductor=False \
// model.image_logger.max_images=0 \
// model.control_stage_config.params.from_pretrained_unet=null \
// model.unet_config.from_pretrained=null \
// model.first_stage_config.from_pretrained=null \
// model.unet_config.use_flash_attention=False \
// "
// sh "pip install 'webdataset>=0.1.48,<=0.1.62'"
// sh "rm -rf /home/TestData/multimodal/controlnet_train"
// }
// }
// stage('L2: Multimodal DreamBooth Train') {
// when {
// anyOf {
// branch 'main'
// changeRequest target: 'main'
// }
// }
// failFast true
// steps {
// sh "rm -rf /home/TestData/multimodal/dreambooth_train"
// sh "pip install webdataset==0.2.48"
// sh "python examples/multimodal/text_to_image/dreambooth/dreambooth.py \
// trainer.precision=16 \
// trainer.num_nodes=1 \
// trainer.devices=1 \
// ++exp_manager.max_time_per_run=00:00:03:00 \
// trainer.max_steps=20 \
// model.micro_batch_size=1 \
// model.global_batch_size=1 \
// exp_manager.exp_dir=/home/TestData/multimodal/dreambooth_train \
// model.inductor=False \
// model.cond_stage_config._target_=nemo.collections.multimodal.modules.stable_diffusion.encoders.modules.FrozenCLIPEmbedder \
// ++model.cond_stage_config.version=openai/clip-vit-large-patch14 \
// ++model.cond_stage_config.max_length=77 \
// ~model.cond_stage_config.restore_from_path \
// ~model.cond_stage_config.freeze \
// ~model.cond_stage_config.layer \
// model.unet_config.from_pretrained=null \
// model.first_stage_config.from_pretrained=null \
// model.data.instance_dir=/home/TestData/multimodal/tiny-dreambooth \
// model.unet_config.use_flash_attention=False \
// "
// sh "pip install 'webdataset>=0.1.48,<=0.1.62'"
// sh "rm -rf /home/TestData/multimodal/dreambooth_train"
// }
// }
// stage('L2: Vision ViT Pretrain TP=1') {
// when {
// anyOf {
// branch 'main'
// changeRequest target: 'main'
// }
// }
// failFast true
// steps {
// sh "rm -rf /home/TestData/vision/vit_pretrain_tp1"
// sh "pip install webdataset==0.2.48"
// sh "python examples/vision/vision_transformer/megatron_vit_classification_pretrain.py \
// trainer.precision=16 \
// model.megatron_amp_O2=False \
// trainer.num_nodes=1 \
// trainer.devices=1 \
// trainer.val_check_interval=5 \
// ++exp_manager.max_time_per_run=00:00:03:00 \
// trainer.max_steps=20 \
// model.micro_batch_size=2 \
// model.global_batch_size=4 \
// model.tensor_model_parallel_size=1 \
// model.pipeline_model_parallel_size=1 \
// model.data.num_workers=0 \
// exp_manager.create_checkpoint_callback=False \
// model.data.data_path=[/home/TestData/multimodal/tiny-imagenet/train,/home/TestData/multimodal/tiny-imagenet/val] \
// exp_manager.exp_dir=/home/TestData/vision/vit_pretrain_tp1 "
// sh "pip install 'webdataset>=0.1.48,<=0.1.62'"
// sh "rm -rf /home/TestData/vision/vit_pretrain_tp1"
// }
// }
//
// stage('L2: Multimodal CLIP Pretrain TP=1') {
// when {
// anyOf {
// branch 'main'
// changeRequest target: 'main'
// }
// }
// failFast true
// steps {
// sh "rm -rf /home/TestData/multimodal/clip_pretrain_tp1"
// sh "pip install webdataset==0.2.48"
// sh "python examples/multimodal/vision_language_foundation/clip/megatron_clip_pretrain.py \
// trainer.precision=16 \
// model.megatron_amp_O2=False \
// trainer.num_nodes=1 \
// trainer.devices=1 \
// trainer.val_check_interval=10 \
// ++exp_manager.max_time_per_run=00:00:03:00 \
// trainer.max_steps=20 \
// model.micro_batch_size=1 \
// model.global_batch_size=1 \
// model.tensor_model_parallel_size=1 \
// model.pipeline_model_parallel_size=1 \
// exp_manager.create_checkpoint_callback=False \
// model.data.num_workers=0 \
// model.vision.num_layers=2 \
// model.text.num_layers=2 \
// model.vision.patch_dim=32 \
// model.vision.encoder_seq_length=49 \
// model.vision.class_token_length=7 \
// model.data.train.dataset_path=[/home/TestData/multimodal/tiny-clip/00000.tar] \
// model.data.validation.dataset_path=[/home/TestData/multimodal/tiny-clip/00000.tar] \
// model.data.webdataset.local_root_path=/ \
// exp_manager.exp_dir=/home/TestData/multimodal/clip_pretrain_tp1 "
// sh "pip install 'webdataset>=0.1.48,<=0.1.62'"
// sh "rm -rf /home/TestData/multimodal/clip_pretrain_tp1"
// }
// }
//
// stage('L2: Multimodal NeVA Pretrain TP=1') {
// when {
// anyOf {
// branch 'main'
// changeRequest target: 'main'
// }
// }
// failFast true
// steps {
// sh "rm -rf /home/TestData/multimodal/neva_pretrain_tp1"
// sh "pip install webdataset==0.2.48"
// sh "python examples/multimodal/multimodal_llm/neva/neva_pretrain.py \
// trainer.precision=bf16 \
// model.megatron_amp_O2=False \
// trainer.num_nodes=1 \
// trainer.devices=1 \
// trainer.val_check_interval=10 \
// trainer.limit_val_batches=5 \
// trainer.log_every_n_steps=1 \
// ++exp_manager.max_time_per_run=00:00:03:00 \
// trainer.max_steps=20 \
// model.micro_batch_size=2 \
// model.global_batch_size=4 \
// model.tensor_model_parallel_size=1 \
// model.pipeline_model_parallel_size=1 \
// exp_manager.create_checkpoint_callback=False \
// model.data.data_path=/home/TestData/multimodal/tiny-neva/dummy.json \
// model.data.image_folder=/home/TestData/multimodal/tiny-neva/images \
// model.tokenizer.library=sentencepiece \
// model.tokenizer.model=/home/TestData/multimodal/tiny-neva/tokenizer_add_special.model \
// model.num_layers=2 \
// model.hidden_size=5120 \
// model.ffn_hidden_size=13824 \
// model.num_attention_heads=40 \
// model.normalization=rmsnorm \
// model.data.num_workers=0 \
// model.data.conv_template=llama_2 \
// model.mm_cfg.vision_encoder.from_pretrained='openai/clip-vit-large-patch14' \
// model.mm_cfg.llm.from_pretrained=null \
// model.use_flash_attention=false \
// exp_manager.exp_dir=/home/TestData/multimodal/neva_pretrain_tp1 "
// sh "pip install 'webdataset>=0.1.48,<=0.1.62'"
// sh "rm -rf /home/TestData/multimodal/neva_pretrain_tp1"
// }
// }

// TODO: this requires TE >= v0.11 which is not available in 23.06.
// please uncomment this test once mcore CI is ready.
@@ -4815,6 +5065,7 @@ assert_frame_equal(training_curve, gt_curve, rtol=1e-3, atol=1e-3)"'''
}
}
}

stage('L2: TTS Fast dev runs 1') {
when {
anyOf {
@@ -4960,7 +5211,27 @@ assert_frame_equal(training_curve, gt_curve, rtol=1e-3, atol=1e-3)"'''
}
}
}

stage('L2: NeRF') {
when {
anyOf {
branch 'r1.21.0'
changeRequest target: 'r1.21.0'
}
}
parallel {
stage('DreamFusion') {
steps {
sh 'python examples/multimodal/text_to_image/nerf/main.py \
trainer.num_nodes=1 \
trainer.devices="[0]" \
trainer.max_steps=1000 \
model.prompt="a DSLR photo of a delicious hamburger" \
exp_manager.exp_dir=examples/multimodal/text_to_image/nerf/dreamfusion_results'
sh 'rm -rf examples/multimodal/text_to_image/nerf/dreamfusion_results'
}
}
}
}
stage('L??: Speech Checkpoints tests') {
when {
anyOf {
5 changes: 5 additions & 0 deletions docs/source/conf.py
@@ -61,6 +61,9 @@
'ipadic',
'psutil',
'regex',
'PIL',
'boto3',
'taming',
]

_skipped_autodoc_mock_imports = ['wrapt', 'numpy']
@@ -125,6 +128,8 @@
'tts/tts_all.bib',
'text_processing/text_processing_all.bib',
'core/adapters/adapter_bib.bib',
'multimodal/mm_all.bib',
'vision/vision_all.bib',
]

intersphinx_mapping = {
19 changes: 18 additions & 1 deletion docs/source/index.rst
@@ -47,7 +47,7 @@ NVIDIA NeMo User Guide
nlp/api
nlp/megatron_onnx_export
nlp/models


.. toctree::
:maxdepth: 1
@@ -71,6 +71,23 @@ NVIDIA NeMo User Guide
text_processing/g2p/g2p
common/intro

.. toctree::
:maxdepth: 3
:caption: Multimodal (MM)
:name: Multimodal

multimodal/mllm/intro
multimodal/vlm/intro
multimodal/text2img/intro
multimodal/nerf/intro
multimodal/api

.. toctree::
:maxdepth: 2
:caption: Vision
:name: vision

vision/intro

.. toctree::
:maxdepth: 3