[Random] Random state management #38
LGTM. Thanks @comaniac.
# Note 1: We assume no DP and PP in this script.
# Note 2: This overrides Megatron random seed management, so we only use
# this script for benchmarking.
slapo.set_random_seed(2013, None, None, sch.rank)
If I understand correctly, all the DP ranks also use the same seed, so the loss wouldn't be right, but we only use this script for benchmarking.
LGTM
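To illustrate the point raised in the review comment above, here is a minimal sketch of why replicas that are seeded without their DP rank draw identical random streams. `derive_seed` and `dropout_mask` are hypothetical helpers written for this example; they are not slapo or Megatron functions, and the stride values are arbitrary. The argument order (seed, DP rank, PP rank, TP rank) is assumed from the call in the diff and is not confirmed here.

```python
import random


def derive_seed(base_seed, dp_rank=None, pp_rank=None, tp_rank=None):
    """Hypothetical helper: offset the base seed by every rank that is given.

    A rank of None means that parallelism axis does not influence the seed.
    """
    seed = base_seed
    # Arbitrary strides; offsets stay collision-free while
    # pp_rank < 100 and tp_rank < 100.
    for rank, stride in ((dp_rank, 10_000), (pp_rank, 100), (tp_rank, 1)):
        if rank is not None:
            seed += rank * stride
    return seed


def dropout_mask(seed, n=8, p=0.5):
    """Draw a toy keep/drop mask from a generator seeded with `seed`."""
    rng = random.Random(seed)
    return [rng.random() >= p for _ in range(n)]


# Benchmark-style seeding: only the TP rank is passed, so two DP replicas
# holding the same TP rank end up with the same seed and the same mask.
seed_dp0 = derive_seed(2013, None, None, tp_rank=0)
seed_dp1 = derive_seed(2013, None, None, tp_rank=0)
assert seed_dp0 == seed_dp1
assert dropout_mask(seed_dp0) == dropout_mask(seed_dp1)

# Passing the DP rank as well gives each replica its own seed.
seed_a = derive_seed(2013, dp_rank=0, tp_rank=0)
seed_b = derive_seed(2013, dp_rank=1, tp_rank=0)
assert seed_a != seed_b
```

With identical seeds, every DP replica applies the same random decisions, which is harmless for throughput measurements but is why the resulting loss should not be trusted.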
Description
DropoutWithTensorParallel that can be replaced by users when writing a schedule.
Notes:
set_random_seed for users to call in the training script. Users have to manually call it and specify the rank of 3D parallelism.
set_random_seed is not called in advance.
Checklist
cc @szhengac @chhzh123