…ods compile spec

The cross-method constant sharing code in CudaBackend::init() was running unconditionally for all multi-method models, corrupting weights for models like Parakeet where different methods have different sub-models (encoder, decoder, joint) that should NOT share constants.

This change:
- Adds a new `share_kv_cache_across_methods` compile spec that must be explicitly set to enable cross-method constant sharing
- Guards the sharing logic behind this compile spec (previously it ran for all models with the required AOTI APIs)
- Makes sharing failures return Error::Internal instead of just logging
- Adds generate_share_kv_cache_compile_spec() to the AotiBackend Python API
- Updates the Qwen3.5 MoE export to opt in to sharing for prefill/decode

Without this spec set, each method gets its own independent constants, fixing the Parakeet CUDA CI regression.
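The opt-in behavior described above can be sketched in plain Python. This is a minimal stand-in, not ExecuTorch's actual implementation: the `CompileSpec` dataclass mirrors the key/value pairs passed to a backend, and the exact key string and the `generate_share_kv_cache_compile_spec()` return value are assumptions based on the names in this PR.

```python
from dataclasses import dataclass

# Minimal stand-in for ExecuTorch's CompileSpec (a key/value pair handed to a backend).
@dataclass(frozen=True)
class CompileSpec:
    key: str
    value: bytes

# Hypothetical helper mirroring the generate_share_kv_cache_compile_spec()
# API added by this PR; the key string and encoded value are assumptions.
def generate_share_kv_cache_compile_spec() -> CompileSpec:
    return CompileSpec(key="share_kv_cache_across_methods", value=b"1")

def should_share_constants(specs: list[CompileSpec]) -> bool:
    # Sharing is opt-in: absent the spec, each method keeps its own constants.
    return any(
        s.key == "share_kv_cache_across_methods" and s.value == b"1"
        for s in specs
    )

# A model like Parakeet passes no sharing spec, so its encoder/decoder/joint
# methods each get independent constants; a prefill/decode LLM export opts in.
print(should_share_constants([]))                                        # False
print(should_share_constants([generate_share_kv_cache_compile_spec()]))  # True
```

The point of the sketch is the default: with no spec in the list, sharing never runs, which is exactly the behavior that fixes Parakeet.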
🔗 Helpful Links

🧪 See artifacts and rendered test results at hud.pytorch.org/pr/pytorch/executorch/18864

Note: Links to docs will display an error until the docs builds have been completed.

❗ 1 Active SEV — There is 1 currently active SEV. If your PR is affected, please view it below.

❌ 1 New Failure, 3 Unrelated Failures — as of commit c3456e6 with merge base 7e099b4:

NEW FAILURE — The following job has failed:
FLAKY — The following job failed but was likely due to flakiness present on trunk:
BROKEN TRUNK — The following jobs failed but were present on the merge base. 👉 Rebase onto the `viable/strict` branch to avoid these failures.

This comment was automatically generated by Dr. CI and updates every 15 minutes.
Currently we blindly share the KV cache across all prefill + decode methods, causing the Parakeet model to generate garbage output.
This PR adds a CUDA backend compile spec to control KV cache sharing across methods. The default is not to share.
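The other behavioral change in this PR is how failures are handled: sharing is only attempted when the spec is set, and a failure now surfaces as Error::Internal instead of a log line. A hedged Python sketch of that init-time guard, with illustrative names standing in for the C++ backend code:

```python
# Hypothetical sketch of the CudaBackend::init() guard described in this PR.
# All names here are illustrative stand-ins, not ExecuTorch's real C++ API.

class InternalError(Exception):
    """Stand-in for Error::Internal."""

def try_share_constants(method_names):
    # Placeholder for the AOTI cross-method constant-sharing call.
    if len(method_names) < 2:
        raise RuntimeError("need at least two methods to share constants")
    return {"shared_across": list(method_names)}

def backend_init(method_names, share_kv_cache_across_methods=False):
    if not share_kv_cache_across_methods:
        # Default: sharing is skipped and every method keeps its own constants.
        return {name: "own constants" for name in method_names}
    try:
        return try_share_constants(method_names)
    except RuntimeError as e:
        # Previously a sharing failure was only logged; now it is a hard error.
        raise InternalError(str(e)) from e
```

With the flag off (the default), `backend_init(["encoder", "decoder", "joint"])` never touches the sharing path, which is the Parakeet case; only an export that explicitly passes the spec, such as prefill/decode, takes the shared branch.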