Adding training support #160

ParamThakkar123 · 2025-09-11T12:35:26Z

Currently WIP. On completion, fixes #42.


ParamThakkar123 · 2025-10-23T08:18:45Z

This PR adds training support to BackendBench, moving it from inference-only to supporting training as well. It adds the following:

Prompts for LLMs to generate the backward passes of kernels, via a new create_backward_prompt() method on KernelTemplate, TritonKernelTemplate, PyTorchKernelTemplate, and CuTeDSLKernelTemplate.
A register_kernel() method on the OpRegistry class to register forward and backward pass kernels.
A new file, train.py, with the following:
BackendBench.train.TrainingTestCase: container for inputs, target, params, and an optional loss_fn.
BackendBench.train.TrainingTestSuite: collection type for multiple test cases.
BackendBench.train._mse_loss: default MSE loss, used when no loss_fn is provided.
BackendBench.train._compute_numerical_grads: finite-difference numerical gradient calculator, used as a fallback.
BackendBench.train.train_one_op: main training loop that:

Prepares inputs/params (device/grad handling).
Runs forward via provided kernel implementation.
Synthesizes a target via a reference operator if needed (resolved via the op registry).
Attempts to compute gradients for the kernel via autograd; falls back to numerical gradients if autograd is unavailable.
Computes reference gradients via autograd on a reference op (if available).
Compares gradients (relative error threshold) and applies simple SGD updates to params or inputs.
Returns metrics: grad_correct, grad_rel_error, step_time_ms, final_loss.
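
The gradient check and update in the steps above can be sketched roughly as follows. This is a dependency-free illustration, not the code in train.py: it uses plain Python floats instead of tensors, and the helper names (numerical_grads, relative_error, sgd_step), the toy linear op, the 1e-4 threshold, and the learning rate are all assumptions made for the example. The central-difference estimate stands in for the kernel's gradients, and a hand-derived gradient stands in for the autograd reference.

```python
from typing import Callable, List

Params = List[float]


def numerical_grads(loss_fn: Callable[[Params], float], params: Params,
                    eps: float = 1e-6) -> Params:
    """Central-difference estimate of d(loss)/d(param) for each scalar param."""
    grads = []
    for i in range(len(params)):
        plus = list(params)
        minus = list(params)
        plus[i] += eps
        minus[i] -= eps
        grads.append((loss_fn(plus) - loss_fn(minus)) / (2 * eps))
    return grads


def relative_error(grads: Params, ref_grads: Params, tiny: float = 1e-12) -> float:
    """L2 norm of the difference divided by the L2 norm of the reference."""
    diff = sum((g - r) ** 2 for g, r in zip(grads, ref_grads)) ** 0.5
    ref = sum(r ** 2 for r in ref_grads) ** 0.5
    return diff / (ref + tiny)


def sgd_step(params: Params, grads: Params, lr: float) -> Params:
    """Plain SGD update: p <- p - lr * g."""
    return [p - lr * g for p, g in zip(params, grads)]


# Toy op (hypothetical): y = w * x + b, trained with an MSE loss.
xs = [1.0, 2.0, 3.0]
targets = [3.0, 5.0, 7.0]  # produced by w = 2, b = 1


def loss_fn(params: Params) -> float:
    w, b = params
    return sum((w * x + b - t) ** 2 for x, t in zip(xs, targets)) / len(xs)


def reference_grads(params: Params) -> Params:
    """Hand-derived MSE gradients, standing in for autograd on a reference op."""
    w, b = params
    n = len(xs)
    dw = sum(2 * (w * x + b - t) * x for x, t in zip(xs, targets)) / n
    db = sum(2 * (w * x + b - t) for x, t in zip(xs, targets)) / n
    return [dw, db]


params = [0.0, 0.0]
initial_loss = loss_fn(params)
ref = reference_grads(params)
num = numerical_grads(loss_fn, params)

grad_rel_error = relative_error(num, ref)
grad_correct = grad_rel_error < 1e-4  # threshold is an assumption
new_params = sgd_step(params, ref, lr=0.05)
final_loss = loss_fn(new_params)
```

A relative rather than absolute error keeps the check meaningful when gradient magnitudes vary widely between operators.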

Tests for each of the new functions, using dummy kernel implementations.
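
As a rough illustration of how a test case container and a dummy kernel might fit together, here is a minimal sketch. It does not reproduce the real BackendBench.train.TrainingTestCase: the class name ToyTrainingTestCase, the field types, the loss() helper, and dummy_scale_kernel are hypothetical, and plain Python lists stand in for tensors. Only the field names (inputs, target, params, loss_fn) and the MSE default come from the description above.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Optional

Vector = List[float]


def mse_loss(output: Vector, target: Vector) -> float:
    """Mean squared error, used when the test case supplies no loss_fn."""
    return sum((o - t) ** 2 for o, t in zip(output, target)) / len(output)


@dataclass
class ToyTrainingTestCase:
    """Hypothetical stand-in for a training test case container."""
    inputs: Vector
    target: Vector
    params: Dict[str, float] = field(default_factory=dict)
    loss_fn: Optional[Callable[[Vector, Vector], float]] = None

    def loss(self, output: Vector) -> float:
        # Fall back to MSE when no custom loss was provided.
        fn = self.loss_fn if self.loss_fn is not None else mse_loss
        return fn(output, self.target)


def dummy_scale_kernel(inputs: Vector, params: Dict[str, float]) -> Vector:
    """Dummy forward kernel: multiplies every input by params['scale']."""
    return [params["scale"] * x for x in inputs]


# Case 1: default MSE loss, kernel output matches the target exactly.
case = ToyTrainingTestCase(inputs=[1.0, 2.0], target=[2.0, 4.0],
                           params={"scale": 2.0})
perfect_loss = case.loss(dummy_scale_kernel(case.inputs, case.params))

# Case 2: custom L1 loss, kernel output is off by the input values.
l1_case = ToyTrainingTestCase(
    inputs=[1.0, 2.0],
    target=[2.0, 4.0],
    params={"scale": 1.0},
    loss_fn=lambda out, tgt: sum(abs(o - t) for o, t in zip(out, tgt)),
)
l1_loss = l1_case.loss(dummy_scale_kernel(l1_case.inputs, l1_case.params))
```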

[WIP] Adding training support

dfda4f5

meta-cla bot added the CLA Signed label Sep 11, 2025

ParamThakkar123 marked this pull request as draft September 11, 2025 12:35

ParamThakkar123 added 3 commits October 17, 2025 11:38

Merge branch 'main' of https://github.com/meta-pytorch/BackendBench into training

9f36d3d

Merge branch 'main' of https://github.com/meta-pytorch/BackendBench into training

4660141

Added complete training support

44da410

ParamThakkar123 marked this pull request as ready for review October 23, 2025 08:19

ParamThakkar123 changed the title ~~[WIP] Adding training support~~ Adding training support Oct 23, 2025
