[Executorch] parallelize op_choose_qparams #15607

kimishpatel · 2025-11-05T18:45:00Z

Stack from ghstack (oldest at bottom):

When doing prefill for quantized kv cache, with large prefill length, parallelizing this op helps.

Differential Revision: D84962234

NOTE FOR REVIEWERS: This PR has internal Meta-specific changes or comments, please review them on Phabricator!

When doing prefill for quantized kv cache, with large prefill length, parallelizing this op helps. Differential Revision: [D84962234](https://our.internmc.facebook.com/intern/diff/D84962234/) **NOTE FOR REVIEWERS**: This PR has internal Meta-specific changes or comments, please review them on [Phabricator](https://our.internmc.facebook.com/intern/diff/D84962234/)! [ghstack-poisoned]

pytorch-bot · 2025-11-05T18:45:04Z

🔗 Helpful Links

🧪 See artifacts and rendered test results at hud.pytorch.org/pr/pytorch/executorch/15607

📄 Preview Python docs built from this PR

Note: Links to docs will display an error until the docs builds have been completed.

✅ You can merge normally! (6 Unrelated Failures)

As of commit c69dd4b with merge base 15a0fcd ():

FLAKY - The following job failed but was likely due to flakiness present on trunk:

Test Metal Backend / test-voxtral-metal-e2e / macos-job (gh) (matched macos rule in flaky-rules.json)
File doesn't exist

BROKEN TRUNK - The following jobs failed but was present on the merge base:

👉 Rebase onto the `viable/strict` branch to avoid these failures

pull / test-binary-size-linux-gcc / linux-job (gh) (trunk failure)
##[error]The operation was canceled.
pull / test-openvino-linux / linux-job (gh) (trunk failure)
##[error]The operation was canceled.
pull / test-setup-linux-gcc / linux-job (gh) (trunk failure)
##[error]The operation was canceled.
pull / unittest / windows / windows-job (gh) (trunk failure)
backends/xnnpack/test/ops/test_conv1d.py::TestConv1d::test_qs8_conv1d_batchnorm_seq
pull / unittest-editable / windows / windows-job (gh) (trunk failure)
backends/xnnpack/test/ops/test_conv1d.py::TestConv1d::test_qs8_conv1d_batchnorm_seq

This comment was automatically generated by Dr. CI and updates every 15 minutes.

github-actions · 2025-11-05T18:46:08Z

This PR needs a `release notes:` label

If your change should be included in the release notes (i.e. would users of this library care about this change?), please use a label starting with release notes:. This helps us keep track and include your important work in the next release notes.

To add a label, you can comment to pytorchbot, for example
@pytorchbot label "release notes: none"

For more information, see
https://github.com/pytorch/pytorch/wiki/PyTorch-AutoLabel-Bot#why-categorize-for-release-notes-and-how-does-it-work.

When doing prefill for quantized kv cache, with large prefill length, parallelizing this op helps. Differential Revision: [D84962234](https://our.internmc.facebook.com/intern/diff/D84962234/) **NOTE FOR REVIEWERS**: This PR has internal Meta-specific changes or comments, please review them on [Phabricator](https://our.internmc.facebook.com/intern/diff/D84962234/)! [ghstack-poisoned]

kimishpatel requested a review from manuelcandales as a code owner November 5, 2025 18:45

meta-cla bot added the CLA Signed This label is managed by the Facebook bot. Authors need to sign the CLA before a PR can be reviewed. label Nov 5, 2025

This was referenced Nov 5, 2025

[Executorch] Add simd path for op quantize #15608

Open

[Executorch] Add multithreading for op_quantize #15609

Open

Reduce allocation overhead in quantized sdpa #15610

Open

[Executorch] Introduce caching cpu memory allocator #15611

Open

meta-codesync bot added fb-exported meta-exported labels Nov 5, 2025

metascroy approved these changes Nov 6, 2025

View reviewed changes

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

[Executorch] parallelize op_choose_qparams #15607

[Executorch] parallelize op_choose_qparams #15607

kimishpatel commented Nov 5, 2025 •

edited

Loading

Uh oh!

pytorch-bot bot commented Nov 5, 2025 •

edited

Loading

Uh oh!

github-actions bot commented Nov 5, 2025

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

3 participants

[Executorch] parallelize op_choose_qparams #15607

Are you sure you want to change the base?

[Executorch] parallelize op_choose_qparams #15607

Conversation

kimishpatel commented Nov 5, 2025 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

pytorch-bot bot commented Nov 5, 2025 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

🔗 Helpful Links

🧪 See artifacts and rendered test results at hud.pytorch.org/pr/pytorch/executorch/15607

✅ You can merge normally! (6 Unrelated Failures)

Uh oh!

github-actions bot commented Nov 5, 2025

This PR needs a release notes: label

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

3 participants

kimishpatel commented Nov 5, 2025 •

edited

Loading

pytorch-bot bot commented Nov 5, 2025 •

edited

Loading

This PR needs a `release notes:` label