Commit

add config
li126com committed Jul 25, 2024
1 parent 6ffca0e commit 2438b80
Showing 1 changed file with 3 additions and 2 deletions.
5 changes: 3 additions & 2 deletions configs/7B_sft.py
@@ -181,9 +181,10 @@
fsp: tensor parallel by flash-attn with sequence parallel, sequence parallel size = tensor parallel size.
isp: customed intern sequence parallel without tensor parallel, can be used with weight parallel.
pipeline parallel (dict):
- 1. size: int, the size of pipeline parallel.
+ 1. size: int, the size of pipeline parallel (Default is 1F1B).
2. interleaved_overlap: bool, enable/disable communication overlap when using interleaved pipeline scheduler,
defaults to False.
+ 3. zero_bubble: bool, enable/disable zero bubble pipeline parallelism (ZB-H1), defaults to False.
weight parallel (dict):
1. size: int, the size of weight parallel.
2. overlap: bool, enable/disable all_gather/reduce_scatter communication overlap, defaults to False.
@@ -192,7 +193,7 @@
parallel = dict(
zero1=dict(size=-1),
tensor=dict(size=1, mode="mtp"),
- pipeline=dict(size=1, interleaved_overlap=True),
+ pipeline=dict(size=1, interleaved_overlap=True, zero_bubble=False),
weight=dict(size=1, overlap=True, memory_pool=True),
)
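
For readers who want to see how the new flag fits with the documented defaults, here is a minimal sketch. It assumes only what the docstring in this diff states: `interleaved_overlap` and `zero_bubble` default to False, the default schedule is 1F1B, and `zero_bubble=True` selects ZB-H1. The helper `resolve_pipeline_options`, the `PIPELINE_DEFAULTS` table, the `size` default of 1, and the `schedule` key are hypothetical names for illustration and are not part of the repository.

```python
# Illustrative sketch only: this helper does not exist in the repository.
# Defaults for the two flags follow the docstring in the diff; the default
# size of 1 is an assumption taken from the example config.
PIPELINE_DEFAULTS = {
    "size": 1,
    "interleaved_overlap": False,
    "zero_bubble": False,
}


def resolve_pipeline_options(pipeline_cfg):
    """Fill in documented defaults and label the resulting schedule."""
    unknown = set(pipeline_cfg) - set(PIPELINE_DEFAULTS)
    if unknown:
        raise KeyError(f"unknown pipeline options: {sorted(unknown)}")
    resolved = {**PIPELINE_DEFAULTS, **pipeline_cfg}
    # Per the docstring: 1F1B is the default, zero_bubble enables ZB-H1.
    resolved["schedule"] = "ZB-H1" if resolved["zero_bubble"] else "1F1B"
    return resolved


# Same shape as the config in the diff, with zero bubble switched on.
parallel = dict(
    zero1=dict(size=-1),
    tensor=dict(size=1, mode="mtp"),
    pipeline=dict(size=4, interleaved_overlap=False, zero_bubble=True),
    weight=dict(size=1, overlap=True, memory_pool=True),
)

print(resolve_pipeline_options(parallel["pipeline"]))
```

Whether `zero_bubble` can be combined with `interleaved_overlap=True` is not stated in this commit, so the sketch leaves that combination unchecked.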
