flashinfer-ai / flashinfer Public

Fork 914
Star 5.5k



Pull requests: flashinfer-ai/flashinfer

Labels 67 Milestones 0


233 Open 2,059 Closed


Pull requests list

fix(mla): widen page index to int64_t to avoid 32-bit overflow op: attention

#3136 opened Apr 21, 2026 by Tracin


1 of 5 tasks

fix(jit): decouple FLASHINFER_JIT_VERBOSE from debug compilation flags

#3135 opened Apr 21, 2026 by leonardozcm


feat: Add a dsa_graph_safe flag to topk

#3133 opened Apr 21, 2026 by zianglih Contributor • Draft

5 tasks

[CuTe DSL] Fix FP8 MLA persistent perf regression and ProxyKind cu13 wheel breakage

#3132 opened Apr 21, 2026 by pgera Contributor


6 tasks done

feat: Enable FP8 (E4M3/E5M2) in concat_mla_k to optimize long-context prefill performance and refactor type dispatch for BF16/FP16

#3129 opened Apr 21, 2026 by qiching Collaborator


4 tasks done

chore: Address non-blocking review feedback for #3051 / #3080 op: gemm op: moe run-ci

#3128 opened Apr 21, 2026 by bkryu Collaborator


3 of 5 tasks

optimize gdn decode bf16 state kernel for mtp with caching. model: qwen3.5

#3127 opened Apr 20, 2026 by ameynaik-hub Contributor


5 tasks

autotuner: check cache before synthesizing profile input tensors run-ci

#3126 opened Apr 20, 2026 by leejnau Contributor


5 tasks done

docs(pod): flesh out POD-attention wrapper docstrings

#3124 opened Apr 20, 2026 by Zlatanwic


5 tasks done

bump version to 0.6.9

#3123 opened Apr 20, 2026 by aleozlx Collaborator • Draft

feat(sampling): add fused top-k/top-p sampling with filtered probabil… op: misc run-ci

#3121 opened Apr 20, 2026 by haozhihuiYmh150 • Draft

2 of 5 tasks

feat(gdn): add Triton WY-parallel MTP decode kernels model: qwen3.5

#3120 opened Apr 20, 2026 by ameynaik-hub Contributor • Draft

5 tasks

perf(gdn): fix bf16_state T=1 per-call overhead and add pool+padding … model: qwen3.5

#3118 opened Apr 19, 2026 by ameynaik-hub Contributor


5 tasks

Port TRT-LLM fused qk norm rope op: attention op: comm op: gemm op: moe

#3117 opened Apr 18, 2026 by murphymatt Contributor


5 tasks done

Reland support lse in trtllm paged attn kernels op: attention

#3116 opened Apr 18, 2026 by murphymatt Contributor


5 tasks done

perf(autotuner): replace power-of-2 token buckets with hybrid spacing & fix missing routing_replay_out arg op: gemm op: moe run-ci

#3115 opened Apr 18, 2026 by StudyingShao


5 tasks done

Normalize modular attention to CUTLASS DSL APIs shipped on PyPI op: attention run-ci

#3112 opened Apr 17, 2026 by saltyminty Collaborator


5 tasks

fix(fmha_v2): enable flash_attention for Q_PAGED_KV regardless of s_kv

#3106 opened Apr 17, 2026 by blake-snc Contributor


Report unit test files with no result

#3105 opened Apr 17, 2026 by dierksen Collaborator


2 of 5 tasks

Fix multi-instances using same random seed op: comm

#3102 opened Apr 17, 2026 by guyuankan


4 of 5 tasks

Add int8 paged KV support to main paths op: attention

#3100 opened Apr 17, 2026 by lesj0610


5 tasks done

Add int4 paged KV support to main paths op: attention

#3101 opened Apr 17, 2026 by lesj0610


5 tasks done

WIP: B12x micro kernel merged op: gemm op: moe

#3098 opened Apr 17, 2026 by askliar Contributor


Support NVFP4 KV for prefill and batch attention kernels op: attention run-ci

#3097 opened Apr 17, 2026 by Tom-Zheng Contributor


5 tasks done

feat: implement configurable tie_break for filtered topk op: misc run-ci

#3095 opened Apr 17, 2026 by zianglih Contributor


5 tasks done
