03 Mar 03:32

v0.1.1 Latest

Twinkle Version 0.1.1 Released

Adds support for the Qwen3.5 dense model series, from Qwen3.5-2B to Qwen3.5-9B

Full Changelog: v0.1...v0.1.1

02 Mar 14:26

v0.1

Twinkle Framework Version 0.1 Released!

New Features

🎉 Full support for components including Dataset, DataLoader, Loss, Transformers and Megatron models, Advantage, Sampler, and more
🎉 Support for multiple training stages (PT, SFT, and RL) and multiple training modes (single-GPU, multi-node multi-GPU, Ray, and Client-Server)
🎉 First version of multi-tenant shared training, with the server-side implementation fully open-sourced. Ray Serve provides scalable multi-replica deployment, with support for sticky routing
🎉 An online service is now available on the official ModelScope website, where users can train Qwen/Qwen3-30B-A3B-Instruct-2507 for free and push models to ModelHub

What's Changed

Squash to main by @tastelikefeet in #46
rename cmb by @tpx818 in #65
docs: update README and remove ulysses_size from ep_fsdp_qwen3_moe.py by @meichangsu1 in #64
add contributors by @yingdachen in #66
fix lora fetch by @tastelikefeet in #67
Update documentation links in README.md by @wangxingjun778 in #68
Fix router by @tastelikefeet in #69
Fix doc links and add tests by @tastelikefeet in #70
Refactor code by @tastelikefeet in #72
Fix compat tinker and update doc by @Yunnglin in #73
[compat] gpt_bridge compat transformers_5 by @Jintao-Huang in #75
Fix server state adapter limit by @Yunnglin in #74
Fix some bugs by @tastelikefeet in #77
[model] support Qwen3.5 series models by @hjh0119 in #76
fix single gpu bug by @tastelikefeet in #78
[bugfix] fix dense model get layer spec by @hjh0119 in #80
fix grad norm bug by @tastelikefeet in #81
Update readme by @yingdachen in #83
Add custom route for sticky session by @Yunnglin in #82
[bugfix] fix 4d attention mask device by @hjh0119 in #85
add more comment for node resources by @tastelikefeet in #79
Update doc and fix bugs by @tastelikefeet in #84
Fix logps by @tastelikefeet in #86
recover cp sequence before loss by @hjh0119 in #88
[bugfix] fix logps with PP by @hjh0119 in #89
Fix megatron loss by @tastelikefeet in #90
Dev feature by @hzher in #92
Fix proxy by @Yunnglin in #87
fix TEGroupedLinear by @tastelikefeet in #94
[bugfix] fix grpo loss by @hjh0119 in #93
fix numpy version by @tastelikefeet in #95
[bugfix] fix contiguous by @hjh0119 in #96
Add a sample script by @tastelikefeet in #97

New Contributors

@tastelikefeet made their first contribution in #46
@tpx818 made their first contribution in #65
@wangxingjun778 made their first contribution in #68
@hzher made their first contribution in #92

Full Changelog: https://github.com/modelscope/twinkle/commits/v0.1

Contributors

yingdachen, wangxingjun778, and 7 other contributors
