
Conversation

@msaroufim (Member) commented Aug 18, 2025

Alright, lots of moving parts here and apologies for the lengthy PR, but I'm opening this primarily to communicate with @Laurawly and @shaahins. This PR is in the same vein as #79.

The goal here is to have high confidence that DirectoryBackend is correct, which means:

  1. We aren't missing any TorchBench operators (we will expand to others after launch)
  2. For any ops that we do implement, we're confident they get overridden by our custom implementations
  3. We don't stretch to new features if we have low confidence that they're working

To ensure we're not missing any TorchBench operators, I have two scripts that generate a fully functional DirectoryBackend with a root generated_kernels/ directory and each op name underneath it as a separate folder. The mapping between a PyTorch operator and a folder name can sometimes be tricky; for example, what folder would you expect relu.default to map to? If you're unsure, you can generate the mapping yourself, and the full result is here. In addition, as I went through this exercise some ops didn't seem to have much documentation despite being in TorchBench, so I made a list of ops with no docstrings here.
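
As a rough sketch of the kind of mapping involved (hypothetical helper name; the PR's actual rules live in clean_op_name_for_directory and the full generated result linked above), the .default overload maps to the bare op name while other overload suffixes stay in the folder name:

```python
def op_to_folder_name(op_name: str) -> str:
    # Hypothetical illustration of the op-to-folder mapping, e.g.
    # "relu.default" -> "relu", "add.out" -> "add_out".
    base, _, suffix = op_name.partition(".")
    if not suffix or suffix == "default":
        return base
    return f"{base}_{suffix.lower()}"
```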

For any ops that we do implement, we're confident they get overridden: the problem is that if we generate a correct operator and rely on a logging statement, we might still be loading the eager function, since a correct custom kernel is hard to distinguish from eager output. Instead, I purposefully load kernels that are incorrect, watermarked with the value 42; that way we know the wrong kernel was loaded by us.
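
A minimal sketch of the watermark idea, with illustrative names (this is not the PR's actual generator): each generated kernel ignores the real computation and returns a tensor filled with 42, so seeing 42 during eval proves the directory kernel, not eager, was dispatched.

```python
import torch

WATERMARK = 42.0

def relu_kernel(x: torch.Tensor) -> torch.Tensor:
    # Deliberately wrong: ignore the input math and return the watermark.
    # Tensors full of 42 mean the DirectoryBackend kernel was dispatched;
    # correct relu outputs mean eager leaked through.
    return torch.full_like(x, WATERMARK)
```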

Stretching to new features: one of the pitches of BackendBench was that this is a package you could install. I'd rather we tone down that pitch until we figure out the details of the IntraKernel dispatcher, so I'm deleting most of our code in __init__.py.

TODO in future PRs

  • Add the real shapes to the README of each op @msaroufim
  • Make it clearer what the expected dtype is; for now we're shipping fp16, but we should do bf16 instead @PaliC

@msaroufim msaroufim marked this pull request as draft August 18, 2025 17:21
@meta-cla meta-cla bot added the CLA Signed label Aug 18, 2025
@msaroufim msaroufim marked this pull request as ready for review August 19, 2025 19:55
@msaroufim msaroufim requested a review from Laurawly August 19, 2025 22:46
@msaroufim msaroufim changed the title from "Add Directory STructure for BackendBench" to "DirectoryBackend op to folder mapping" Aug 20, 2025
@PaliC PaliC mentioned this pull request Aug 20, 2025
@msaroufim (Member Author)

@PaliC @jiannanWang could I please get a review?

@Laurawly (Contributor) commented Aug 20, 2025

The watermark strategy is clever. Just wondering: should we consider using different watermark values per operator (e.g., 42 + hash(op_name) % 100) to catch cross-contamination, and adding a flag to easily switch between watermark mode and correctness mode?
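
A small sketch of that suggestion (hypothetical helper name); one caveat is that Python's built-in hash() is salted per process for strings, so a stable digest keeps the per-op value reproducible across runs:

```python
import zlib

def per_op_watermark(op_name: str) -> int:
    # "42 + hash(op_name) % 100" as suggested; crc32 is used here instead of
    # hash() so the watermark is stable across Python processes.
    return 42 + zlib.crc32(op_name.encode()) % 100
```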

@jiannanWang (Contributor) left a comment

Nice scripts! Generally look good to me.


```python
# For common suffixes, we might want to keep them to distinguish overloads
if suffix in ["out", "inplace", "scalar"]:
    op_name = f"{base}_{suffix}"
```
(Contributor)
So what is the policy here with overloads + DirectoryBackend? What happens if there is both a base-name implementation in the directory and an overload?

(Member Author)
They're the same op fundamentally, just with different inputs. For example, if an input is a scalar then you expect it to broadcast. PyTorch ops are general, so I'm leaning toward making them all one kernel.
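
As a rough illustration of that stance (hypothetical names, not code from this PR): a single kernel can accept either a tensor or a scalar for its second argument and rely on broadcasting, rather than shipping separate Tensor/Scalar overload implementations.

```python
import torch
from typing import Union

def add_kernel(x: torch.Tensor,
               other: Union[torch.Tensor, float],
               alpha: float = 1.0) -> torch.Tensor:
    # One implementation covers both add.Tensor and add.Scalar: a Python
    # scalar simply broadcasts against x, so no separate overload kernel
    # is needed.
    return x + alpha * other
```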

@PaliC (Contributor) commented Aug 20, 2025

Generally LGTM; however, I would consult / get a review from @Laurawly before merging.

@msaroufim msaroufim merged commit 1729da7 into main Aug 20, 2025
3 checks passed
Laurawly added a commit that referenced this pull request Aug 24, 2025
- Move scripts to scripts/ folder as requested by PaliC
- Replace shell scripts with Python implementation using logging
- Reuse PR #90's clean_op_name_for_directory function
- Keep TORCHBENCH_CORE_OPS list but document it better
- Remove hardcoded shell scripts in favor of Python script

This addresses Mark's comment about reusing PR #90's work and
PaliC's suggestions for better code organization.
Laurawly added a commit that referenced this pull request Aug 24, 2025
…tegration

- Categorize all 143 TorchBench operations into Triton-friendly (88), capable (34), and challenging (21)
- Add FP16/BF16 filtering to eval.py for better Triton compatibility
- Update KernelAgent backend to use PR #90 directory structure
- Consolidate scripts and move to BackendBench/scripts/
- Replace print statements with proper logging
- Remove experimental kernel_agent_fp16 backend in favor of filtering flag
- Add comprehensive operation classification based on Triton compiler analysis