Pull requests: aws-neuron/neuronx-distributed-inference
Fix all four scaling multipliers for Granite
#48 opened Feb 18, 2026 by sdeeptan-aws (11 of 14 tasks)
Add ShardedRMSNorm for Q-K normalization under tensor parallelism
#47 opened Feb 18, 2026 by sdeeptan-aws (11 of 14 tasks)
Add NoPE layer support and tied embeddings for SmolLM3-3B
#46 opened Feb 18, 2026 by sdeeptan-aws (11 of 14 tasks)
Add Q-K normalization and scaled embeddings for Gemma-3-1b-it
#45 opened Feb 18, 2026 by sdeeptan-aws (11 of 14 tasks)
Add LongRoPE and fix state dict conversion for Phi-3.5-mini-instruct
#44 opened Feb 18, 2026 by sdeeptan-aws (11 of 14 tasks)
Fix state dict mapping and add partial RoPE for Phi-1.5
#43 opened Feb 18, 2026 by sdeeptan-aws (11 of 14 tasks)
Update GPT-2 with Conv1D transposition and vocab padding
#42 opened Feb 18, 2026 by sdeeptan-aws (11 of 14 tasks)
Update Pythia-2.8B GPTNeoX model with validated accuracy
#41 opened Feb 18, 2026 by sdeeptan-aws (11 of 14 tasks)
Update StableLM-2-1.6B with partial RoPE and LayerNorm
#40 opened Feb 18, 2026 by sdeeptan-aws (11 of 14 tasks)
Fix interleaved RoPE and partial rotary factor for GLM-4
#39 opened Feb 18, 2026 by sdeeptan-aws (11 of 14 tasks)
Update OLMo-2-1B-Instruct with ShardedRMSNorm for TP Q-K norm
#38 opened Feb 17, 2026 by sdeeptan-aws (11 of 14 tasks)
Enable OnDeviceSamplingConfig for compiler accuracy fix
#37 opened Feb 17, 2026 by sdeeptan-aws (11 of 14 tasks)
feat: add expert_wise_scale support for per-expert FP8 quantization in MoE models
#35 opened Feb 13, 2026 by lifelongeeek (8 of 10 tasks)