[refactor] dynamically import TrainSpec #1740
Conversation
fegin left a comment:
left some comments, stamp to unblock critical fix
from torchtitan.experiments import _supported_experiments
from torchtitan.models import _supported_models
We should do a sanity check here:

assert _supported_models.isdisjoint(_supported_experiments)

This allows us to avoid having duplicated names. You can change _supported_models and _supported_experiments to be sets.
Both test failures are real.
wwwjn left a comment:
Need to fix model name here:

"--model.name llama3_simple_fsdp",
hmmm, I have a question for this PR: I'm trying to import either deepseek_simple_fsdp or llama_simple_fsdp. Wonder if there is a way that I could specify one of the two models (deepseek or llama).
which handles imports more efficiently and avoids accidental failures. After this PR, each new model or experiment needs to define a get_train_spec function in its __init__.py file.
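A hypothetical sketch of the dynamic-import pattern this PR describes: a loader resolves a model name to a package path, imports it lazily with the standard-library importlib, and calls the package's get_train_spec entry point. The registry mapping and function name load_train_spec are illustrative assumptions, not torchtitan's actual API:

```python
import importlib

def load_train_spec(model_name: str, registry: dict[str, str]):
    """Lazily import the package registered for model_name and return
    its train spec.

    registry maps model names to importable package paths; each target
    package's __init__.py is expected to expose a get_train_spec()
    function, as described above. (Hypothetical helper for illustration.)
    """
    if model_name not in registry:
        raise ValueError(f"Unknown model name: {model_name}")
    # The import happens only when the model is actually requested,
    # so a broken experiment cannot break unrelated training runs.
    module = importlib.import_module(registry[model_name])
    return module.get_train_spec()
```

The benefit over eager imports is that only the selected model's package is ever imported, so an import-time error in one experiment does not take down every training entry point.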