PaliC
Contributor

@PaliC PaliC commented Oct 2, 2025

Summary:

Here we introduce the model suite (model.py). The idea is to start codifying the ideas from jiannanWang/BackendBenchExamples. Specifically, this PR adds some example models and configs to be loaded, plus a README. (It may be useful to look at the PR above this one as well, since it contains the model loading logic.)

This PR adds two toy models to the model suite:

SmokeTestModel - A simple model that uses aten.ops.mm, since we can implement a correct version of this op.
ToyCoreOpsModel - A model that explicitly calls the backward passes that are in both torchbench and core.
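
For readers without the diff at hand, here is a rough sketch of what a smoke-test model of this kind could look like. It is not the code from this PR: the class name `SmokeTestModelSketch`, the layer sizes, and the reading of "aten.ops.mm" as PyTorch's `torch.ops.aten.mm` are all assumptions made for illustration.

```python
import torch
import torch.nn as nn


class SmokeTestModelSketch(nn.Module):
    """Hypothetical stand-in for a smoke-test model: a single matmul."""

    def __init__(self, in_features: int = 4, out_features: int = 3):
        super().__init__()
        # Sizes are arbitrary and chosen only to keep the example small.
        self.weight = nn.Parameter(torch.randn(in_features, out_features))

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # Call the aten-level 2-D matrix multiply directly so the op under
        # test is unambiguous.
        return torch.ops.aten.mm(x, self.weight)


torch.manual_seed(0)
model = SmokeTestModelSketch()
x = torch.randn(2, 4)
out = model(x)

# Run a backward pass so gradients for the weight are populated.
out.sum().backward()
```

A model this small is useful mainly because its expected output is easy to verify independently (here, against `x @ model.weight`).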

Test Plan:
The test infra is in the PR above, so tests passing on the PR above should be sufficient here.

Future work with Model Suite

#181


Stack created with Sapling. Best reviewed with ReviewStack.

@PaliC PaliC changed the title from Split of "[ModelSuite] Add model loading infrastructure" to [ModelSuite] Add Toy Models Oct 2, 2025
@PaliC PaliC marked this pull request as ready for review October 2, 2025 08:33
@PaliC PaliC requested review from jiannanWang, markkm and msaroufim and removed request for markkm October 2, 2025 09:54
@msaroufim
Member

Should we really be having model definitions in this repo? It seems like this is very much the scope of projects like torchbench that are much further along

@jiannanWang
Contributor

> Should we really be having model definitions in this repo? It seems like this is very much the scope of projects like torchbench that are much further along

I was thinking maybe we can do something similar to this:
https://github.com/pytorch/benchmark/blob/main/torchbenchmark/models/hf_T5/install.py

Instead of saving the model files, we can import them from HF.

@msaroufim
Member

Longer form answer from internal chat

I guess your code looks fine, but in my mind it's solving a problem that is very tangential to the goal stated in your proposal, which is training support in backendbench. I don't think it's a good idea for us to get into the model zoo business; there are repos better suited for this, like torchbench. Torchbench has other problems, like not being particularly up to date, so if we want yet another zoo, that's the problem I'd like to see solved head on.

Ideally our scope should be focused on kernel binding and correctness checks, and we've been heavily reliant on other repos to determine what is important, whether that's OpInfo or torchbench. It's totally valid to say those projects have deficiencies we'd like to see solved, but then that's a very separate discussion.

A more minimal PR we could merge fast would be whatever we need to run correctness checks on backward passes, using the same input/output pair idea we've been using already.
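
To make the input/output pair idea concrete for a backward pass, here is a minimal sketch. It does not use BackendBench's actual API: the helper names `reference_mm_backward`, `candidate_mm_backward`, and `check_backward`, as well as the tolerances, are hypothetical. The reference gradients come from PyTorch autograd, and the candidate stands in for a hand-written backward kernel for `mm`.

```python
import torch


def reference_mm_backward(grad_out, a, b):
    """Reference gradients of C = A @ B w.r.t. A and B, via autograd."""
    a = a.detach().clone().requires_grad_(True)
    b = b.detach().clone().requires_grad_(True)
    out = torch.mm(a, b)
    return torch.autograd.grad(out, (a, b), grad_out)


def candidate_mm_backward(grad_out, a, b):
    """Hypothetical kernel under test: dA = dC @ B^T, dB = A^T @ dC."""
    return grad_out @ b.t(), a.t() @ grad_out


def check_backward(candidate, inputs, expected, rtol=1e-4, atol=1e-5):
    """Compare the candidate's outputs against a recorded expected output."""
    got = candidate(*inputs)
    if len(got) != len(expected):
        return False
    return all(
        torch.allclose(g, e, rtol=rtol, atol=atol) for g, e in zip(got, expected)
    )


torch.manual_seed(0)
a, b = torch.randn(3, 4), torch.randn(4, 5)
grad_out = torch.randn(3, 5)

# One recorded (input, output) pair for the backward op.
inputs = (grad_out, a, b)
expected = reference_mm_backward(*inputs)

passed = check_backward(candidate_mm_backward, inputs, expected)
```

The point of the sketch is that a backward op can be treated like any other op: its inputs are the upstream gradient plus the saved forward tensors, and its outputs are the input gradients, so no model definition is required to check it.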

meta-cla bot commented Oct 17, 2025

Hi @PaliC!

Thank you for your pull request.

We require contributors to sign our Contributor License Agreement, and yours needs attention.

You currently have a record in our system, but the CLA is no longer valid, and will need to be resubmitted.

Process

In order for us to review and merge your suggested changes, please sign at https://code.facebook.com/cla. If you are contributing on behalf of someone else (eg your employer), the individual CLA may not be sufficient and your employer may need to sign the corporate CLA.

Once the CLA is signed, our tooling will perform checks and validations. Afterwards, the pull request will be tagged with CLA signed. The tagging process may take up to 1 hour after signing. Please give it that time before contacting us about it.

If you have received this in error or have any questions, please contact us at cla@meta.com. Thanks!

Labels

CLA Signed This label is managed by the Meta Open Source bot.
