Replace arbitrary sequence length kernel wrapper with a padding wrapper. #14

Merged
maximilianmbeck merged 4 commits into main from add_varlen_state_passing_test on Mar 1, 2026

@maximilianmbeck
Collaborator

This PR replaces the default wrap_chunkwise__arbitrary_sequence_length kernel wrapper for inference (more precisely, for prefill) with the wrap_chunkwise__arbitrary_sequence_length_with_padding kernel wrapper.

We keep the old wrapper functions for reference.

In short, the new wrapper relies on the insight that we can right-pad the q, k, and v sequences with zeros and, on the padded time steps, set the forget gate to 1 and the input gate to 0, so that the final memory state is carried unchanged across the padding.

This is more efficient than the previous arbitrary sequence length kernel wrapper, which relied on a sequence of chunkwise and step kernel calls to precisely match the arbitrary sequence length.
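
The padding insight can be checked with a minimal, pure-Python sketch. It is not the code from this PR: it assumes a simplified gated recurrence `C_t = f_t * C_{t-1} + i_t * outer(v_t, k_t)` with the gate values applied directly (no gate pre-activations, normalizer, or stabilizer), and the helper names `final_state` and `pad_right` are hypothetical.

```python
def final_state(keys, values, forget_gates, input_gates):
    """Run C_t = f_t * C_{t-1} + i_t * outer(v_t, k_t); return the last C.

    Simplified stand-in for the memory-state update (assumption: gate
    values are used directly, without any stabilization).
    """
    dim_v, dim_k = len(values[0]), len(keys[0])
    state = [[0.0] * dim_k for _ in range(dim_v)]
    for k, v, f, i in zip(keys, values, forget_gates, input_gates):
        state = [
            [f * state[r][c] + i * v[r] * k[c] for c in range(dim_k)]
            for r in range(dim_v)
        ]
    return state


def pad_right(keys, values, forget_gates, input_gates, chunk_size):
    """Right-pad all inputs to the next multiple of chunk_size.

    Padded steps use zero k/v, forget gate 1 and input gate 0, so each
    padded step computes C_t = 1 * C_{t-1} + 0 = C_{t-1}.
    """
    pad_len = (-len(keys)) % chunk_size
    zero_k = [0.0] * len(keys[0])
    zero_v = [0.0] * len(values[0])
    return (
        keys + [zero_k] * pad_len,
        values + [zero_v] * pad_len,
        forget_gates + [1.0] * pad_len,
        input_gates + [0.0] * pad_len,
        pad_len,
    )


# Toy sequence of length 3, padded to a chunk size of 4.
keys = [[1.0, 2.0], [0.5, -1.0], [3.0, 0.0]]
values = [[1.0, 0.0], [2.0, 1.0], [-1.0, 4.0]]
forget_gates = [0.9, 0.8, 0.7]
input_gates = [1.0, 0.5, 0.25]

reference = final_state(keys, values, forget_gates, input_gates)
padded = pad_right(keys, values, forget_gates, input_gates, chunk_size=4)
padded_state = final_state(*padded[:4])

# The final memory state is identical with and without padding.
assert padded_state == reference
```

In this sketch the padded sequence length is a multiple of the chunk size, so a single chunkwise call could process it; any outputs at the padded positions would simply be discarded.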

@github-actions

github-actions bot commented Mar 1, 2026


Thank you for your submission, we really appreciate it. Like many open-source projects, we ask that you sign our Contributor License Agreement before we can accept your contribution. You can sign the CLA by posting a Pull Request comment in the format below.


I have read the CLA Document and I hereby sign the CLA


You can retrigger this bot by commenting recheck in this Pull Request. Posted by the CLA Assistant Lite bot.

@maximilianmbeck
Collaborator Author

I have read the CLA Document and I hereby sign the CLA

@maximilianmbeck
Collaborator Author

recheck

@maximilianmbeck maximilianmbeck merged commit b8c4817 into main Mar 1, 2026
1 of 2 checks passed
@github-actions github-actions bot locked and limited conversation to collaborators Mar 1, 2026