SwiGLU gating, QAT, Residual Attention Scaling, EMA, Sliding window: Optimizations for 10min/16MB Track #2159
Open
visin109 wants to merge 3 commits into