Issues
is:issue state:open
Open issues in openinfer-project/openinfer:
- #461. Status: Open.
- #459. Status: Open.
- #456 Tracking: Qwen3-4B kernel-list performance vs roofline. Labels: qwen3, roadmap. Status: Open.
- #453. Status: Open.
- #452. Status: Open.
- #446 qwen35: design tensor parallelism for hybrid recurrent state. Labels: qwen35. Status: Open.
- #443 [Roadmap] Speculative decoding (qwen3 first, shared primitives). Labels: qwen3, roadmap. Status: Open.
- #435. Status: Open.
- #434 qwen35: explore DFlash speculative decoding. Labels: enhancement, qwen35. Status: Open.
- #414 qwen3: batched decode is not batch-invariant — a request's tokens depend on its batch-mates. Labels: question, qwen3. Status: Open.
- #408. Status: Open.
- #403. Status: Open.

Labels:
- qwen3: Qwen3-4B model crate (pegainfer-qwen3-4b)
- qwen35: Qwen3.5-4B model crate (pegainfer-qwen35-4b)
- roadmap: Tracks features, enhancements, or milestones planned as part of the project roadmap
- enhancement: New feature or request
- question: Further information is requested