Pull requests: quic/efficient-transformers
Rewrite layer-wise ONNX export as an API -> adds CustomLoader and Loop inside export
#1048
opened Jun 5, 2026 by
ochougul
Contributor
feat(0506): Layerwise export: API-driven, env-var-free, opt-in flag
1.22
Release 1.22 candidate
enhancement
New feature or request
#1047
opened Jun 5, 2026 by
vbaddi
Contributor
1 task done
feat(0506): Add optional KV-cache buffer-name prefix for vLLM disaggr…
#1046
opened Jun 5, 2026 by
vbaddi
Contributor
3 tasks done
[Nightly-CI]: Adding nightly validation in release/1.22.0_tmp branch
#1044
opened Jun 5, 2026 by
abukhoy
Contributor
KV handoff with DMA slicing APIs to avoid KV input/output copies.
#1039
opened Jun 4, 2026 by
quic-akuruvil
Contributor
[EB] Qwen_3_5_Moe
1.22
Release 1.22 candidate
#1038
opened Jun 4, 2026 by
mohiso22
Contributor
Repeatkv transform
1.22
Release 1.22 candidate
#1037
opened Jun 4, 2026 by
quic-dhirajku
Contributor
feat(0406): Add Gemma4 Unified vision-language support
enhancement
New feature or request
#1036
opened Jun 4, 2026 by
vbaddi
Contributor
Reranker & Embedding: Qwen3-VL single-shot inference with single-specialization compile
1.22
Release 1.22 candidate
embedding
This label is for all the PR related to embedding model.
reranker
This label is for all the PR related to reranker model.
#1031
opened Jun 3, 2026 by
quic-amitraj
Contributor
fix(0306): MoE prefill reductions for subfunction export
1.22
Release 1.22 candidate
bugfix
#1028
opened Jun 3, 2026 by
vbaddi
Contributor
ci(0306): speed up QAIC PR tests with safe parallelism
enhancement
New feature or request
#1025
opened Jun 2, 2026 by
vbaddi
Contributor
Added multi specialization for Qwen2.5-VL, Qwen3-VL and Qwen3_VL_MOE models as per reference from #909.
1.22
Release 1.22 candidate
#1021
opened Jun 2, 2026 by
quic-dhirajku
Contributor
Draft
fix: rename device_id → device_ids for API consistency
#1020
opened Jun 2, 2026 by
shagsood
[gh-pages]: Release/v1.21.6 Github page Added
#1015
opened Jun 1, 2026 by
abukhoy
Contributor
feat(skip-softmax): Add skip-softmax support for KV-blocked attention
enhancement
New feature or request
Qwen image with magcache
Diffusers
Use for PR related to diffusers in efficient-transformers.
performance
#998
opened May 20, 2026 by
quic-amitraj
Contributor
Magcache support for Use for PR related to diffusers in efficient-transformers.
performance
Diffuser
Diffusers
#993
opened May 18, 2026 by
quic-amitraj
Contributor