@tianyu-l

which handles import more efficiently and avoids accidental failure.

After this PR, each new model or experiment needs to define a get_train_spec function in its __init__.py file.
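A minimal sketch of how a per-package get_train_spec could be looked up by name, assuming the loader imports a package only when it is requested. The TrainSpec dataclass, the demo_models package, the _SUPPORTED set, and load_train_spec are all hypothetical stand-ins for illustration; none of them is torchtitan's actual code.

```python
import importlib
import sys
import types
from dataclasses import dataclass


@dataclass
class TrainSpec:
    """Hypothetical stand-in; not torchtitan's real TrainSpec."""
    name: str


def _install_fake_package():
    # Simulate a package "demo_models.llama_demo" whose __init__.py
    # defines get_train_spec(), so the sketch runs without torchtitan.
    pkg = types.ModuleType("demo_models")
    pkg.__path__ = []  # an empty __path__ marks the module as a package
    sub = types.ModuleType("demo_models.llama_demo")
    sub.get_train_spec = lambda: TrainSpec(name="llama_demo")
    sys.modules["demo_models"] = pkg
    sys.modules["demo_models.llama_demo"] = sub


# Hypothetical registry of known names (assumption for this sketch).
_SUPPORTED = frozenset({"llama_demo"})


def load_train_spec(name):
    """Import only the requested package, then call its get_train_spec()."""
    if name not in _SUPPORTED:
        raise ValueError(f"Unknown model or experiment: {name!r}")
    module = importlib.import_module(f"demo_models.{name}")
    return module.get_train_spec()


_install_fake_package()
spec = load_train_spec("llama_demo")
```

With this shape, a broken or heavyweight package affects only the runs that ask for it by name, which is one way an import scheme can avoid accidental failures.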

@meta-cla meta-cla bot added the CLA Signed This label is managed by the Meta Open Source bot. label Sep 23, 2025
@tianyu-l tianyu-l mentioned this pull request Sep 23, 2025

@fegin fegin left a comment

left some comments, stamp to unblock critical fix


from torchtitan.experiments import _supported_experiments
from torchtitan.models import _supported_models

We should do a sanity check here.

assert _supported_models.isdisjoint(_supported_experiments)

This lets us avoid duplicate names. You can change _supported_models and _supported_experiments to be sets.
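The suggested check can be sketched as follows, assuming both registries are stored as sets. The names inside the two sets and the find_duplicate_names helper are hypothetical placeholders; the real registry contents are not shown in this thread.

```python
# Hypothetical registry contents, for illustration only.
_supported_models = frozenset({"model_a", "model_b"})
_supported_experiments = frozenset({"experiment_x", "experiment_y"})


def find_duplicate_names(models, experiments):
    """Return the sorted names that appear in both registries."""
    return sorted(set(models) & set(experiments))


# The sanity check from the review comment: no name may be in both sets.
assert _supported_models.isdisjoint(_supported_experiments)

# The helper reports which names collide, which is handy in an error message.
duplicates = find_duplicate_names(_supported_models, _supported_experiments)
```

Using sets makes the check a single method call, and the intersection gives the offending names if the assertion ever needs a more descriptive message.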

@fegin
fegin commented Sep 23, 2025

Both test failures are real.

@wwwjn wwwjn left a comment

Need to fix model name here:

@tianyu-l tianyu-l merged commit 5a8256c into main Sep 23, 2025
11 checks passed
@tianyu-l tianyu-l deleted the import branch September 23, 2025 06:40
tianyu-l added a commit that referenced this pull request Sep 23, 2025
tianyu-l added a commit that referenced this pull request Sep 23, 2025
My bad, I forgot to update qwen3.
@ruisizhang123

ruisizhang123 commented Sep 24, 2025

Hmm, I have a question about this PR: I'm trying to import either deepseek_simple_fsdp or llama_simple_fsdp from simple_fsdp/__init__.py. I understand you want a simple_fsdp identifier for importing this folder in torchtitan/protocols/train_spec.py.

I wonder if there is a way to specify the Train_Spec of one of the two models (deepseek or llama) in the simple_fsdp folder without significantly refactoring the codebase?
