Releases: modelscope/twinkle
v0.1.1
Twinkle version 0.1.1 release
- Support for dense models in the Qwen3.5 series, from Qwen3.5-2B to Qwen3.5-9B
Full Changelog: v0.1...v0.1.1
v0.1
Twinkle Framework version 0.1 released!
New Features
- 🎉 Full support for components including Dataset, DataLoader, Loss, Transformers and Megatron models, Advantage, Sampler, and more
- 🎉 Support for multiple training stages such as PT, SFT, and RL, and for training modes including single-GPU, multi-node multi-GPU, Ray, and Client-Server
- 🎉 First version of multi-tenant shared training, with the server-side implementation fully open-sourced. Ray Serve provides a scalable multi-replica deployment with support for sticky routing
- 🎉 An online service is now available on the official ModelScope website, where users can train Qwen/Qwen3-30B-A3B-Instruct-2507 for free and push models to ModelHub
What's Changed
- Squash to main by @tastelikefeet in #46
- rename cmb by @tpx818 in #65
- docs: update README and remove ulysses_size from ep_fsdp_qwen3_moe.py by @meichangsu1 in #64
- add contributors by @yingdachen in #66
- fix lora fetch by @tastelikefeet in #67
- Update documentation links in README.md by @wangxingjun778 in #68
- Fix router by @tastelikefeet in #69
- Fix doc links and add tests by @tastelikefeet in #70
- Refactor code by @tastelikefeet in #72
- Fix compat tinker and update doc by @Yunnglin in #73
- [compat] gpt_bridge compat transformers_5 by @Jintao-Huang in #75
- Fix server state adapter limit by @Yunnglin in #74
- Fix some bugs by @tastelikefeet in #77
- [model] support Qwen3.5 series models by @hjh0119 in #76
- fix single gpu bug by @tastelikefeet in #78
- [bugfix] fix dense model get layer spec by @hjh0119 in #80
- fix grad norm bug by @tastelikefeet in #81
- Update readme by @yingdachen in #83
- Add custom route for sticky session by @Yunnglin in #82
- [bugfix] fix 4d attention mask device by @hjh0119 in #85
- add more comments for node resources by @tastelikefeet in #79
- Update doc and fix bugs by @tastelikefeet in #84
- Fix logps by @tastelikefeet in #86
- recover cp sequence before loss by @hjh0119 in #88
- [bugfix] fix logps with PP by @hjh0119 in #89
- Fix megatron loss by @tastelikefeet in #90
- Dev feature by @hzher in #92
- Fix proxy by @Yunnglin in #87
- fix TEGroupedLinear by @tastelikefeet in #94
- [bugfix] fix grpo loss by @hjh0119 in #93
- fix numpy version by @tastelikefeet in #95
- [bugfix] fix contiguous by @hjh0119 in #96
- Add a sample script by @tastelikefeet in #97
New Contributors
- @tastelikefeet made their first contribution in #46
- @tpx818 made their first contribution in #65
- @wangxingjun778 made their first contribution in #68
- @hzher made their first contribution in #92
Full Changelog: https://github.com/modelscope/twinkle/commits/v0.1