Commit 7e0ea69
HF Trainer: ALST/Ulysses sequence parallelism integration via HF Accelerate (#41832)
* HF Trainer: ALST/Ulysses sequence parallelism integration via HF Accelerate
Signed-off-by: Stas Bekman <stas.bekman@snowflake.com>
* make it work + tests
Signed-off-by: Stas Bekman <stas.bekman@snowflake.com>
* cleanup
Signed-off-by: Stas Bekman <stas.bekman@snowflake.com>
* undo
Signed-off-by: Stas Bekman <stas.bekman@snowflake.com>
* normalize
Signed-off-by: Stas Bekman <stas.bekman@snowflake.com>
* always return cp_size
Signed-off-by: Stas Bekman <stas.bekman@snowflake.com>
* cleanup
Signed-off-by: Stas Bekman <stas.bekman@snowflake.com>
* extract code into _deepspeed_cp_compute_loss
Signed-off-by: Stas Bekman <stas.bekman@snowflake.com>
* fix
Signed-off-by: Stas Bekman <stas.bekman@snowflake.com>
* ALST/Ulysses sequence parallelism docs
* typo
* add link to UlyssesSPDataLoaderAdapter
* adapt to renaming to SP
Signed-off-by: Stas Bekman <stas.bekman@snowflake.com>
* improve
Signed-off-by: Stas Bekman <stas.bekman@snowflake.com>
* fix
Signed-off-by: Stas Bekman <stas.bekman@snowflake.com>
* Update docs/source/en/deepspeed.md
Co-authored-by: Kashif Rasul <kashif.rasul@gmail.com>
* address comments
Signed-off-by: Stas Bekman <stas.bekman@snowflake.com>
* address comments
Signed-off-by: Stas Bekman <stas.bekman@snowflake.com>
* Update src/transformers/trainer.py
Co-authored-by: Kashif Rasul <kashif.rasul@gmail.com>
* address comments
Signed-off-by: Stas Bekman <stas.bekman@snowflake.com>
* address comments
Signed-off-by: Stas Bekman <stas.bekman@snowflake.com>
* Update src/transformers/trainer.py
Co-authored-by: Kashif Rasul <kashif.rasul@gmail.com>
* Update src/transformers/trainer.py
Co-authored-by: Kashif Rasul <kashif.rasul@gmail.com>
* style
Signed-off-by: Stas Bekman <stas.bekman@snowflake.com>
* Update docs/source/en/deepspeed.md
Co-authored-by: Kashif Rasul <kashif.rasul@gmail.com>
* Update docs/source/en/deepspeed.md
Co-authored-by: Kashif Rasul <kashif.rasul@gmail.com>
* Account for Sequence Parallelism (SP) dataloader adapter effect
* Update src/transformers/trainer.py
Co-authored-by: Kashif Rasul <kashif.rasul@gmail.com>
* Update docs/source/en/deepspeed.md
Co-authored-by: Kashif Rasul <kashif.rasul@gmail.com>
* Update docs/source/en/deepspeed.md
Co-authored-by: Kashif Rasul <kashif.rasul@gmail.com>
* model_accepts_loss_kwargs to False
* better comment
* Apply suggestion from @kashif
* Apply suggestion from @kashif
* Apply suggestions from code review
* Apply suggestion from @kashif
* Apply suggestion from @kashif
* Apply suggestion from @kashif
* Update src/transformers/trainer.py
Co-authored-by: Marc Sun <57196510+SunMarc@users.noreply.github.com>
* Update src/transformers/training_args.py
Co-authored-by: Marc Sun <57196510+SunMarc@users.noreply.github.com>
* Apply suggestion from @kashif
* Apply suggestion from @kashif
---------
Signed-off-by: Stas Bekman <stas.bekman@snowflake.com>
Co-authored-by: Stas Bekman <stas.bekman@snowflake.com>
Co-authored-by: Kashif Rasul <kashif.rasul@gmail.com>
Co-authored-by: Marc Sun <57196510+SunMarc@users.noreply.github.com>
1 parent afdc40d commit 7e0ea69
File tree
5 files changed, +453 -21 lines changed
- docs/source/en
- src/transformers
- tests/deepspeed