Refactor code, add auto research and openenv by tastelikefeet · Pull Request #225 · modelscope/twinkle

tastelikefeet · 2026-06-15T03:13:47Z

PR type

PR information

Features

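Auto Research terminal agent: adds the twinkle_client/auto module (AgentLoop + TrainingMonitor + Tools), so training can be controlled entirely from a terminal chat session
Skills knowledge injection: adds twinkle_client/skills with three provider tiers (bundled / local / ModelScope) that automatically inject training-domain knowledge
OpenEnv environment abstraction: adds twinkle_agentic/envs, a unified interface for injecting tools across multiple environments, with support for plugging in custom environment packages
ChunkedCrossEntropyLoss optimization: computes the loss in chunks to lower peak device memory, and fixes the require_entropy logic
Grad clip refactor: platform-aware gradient clipping that adapts to GPU/NPU/MPS
Multi-turn RL cookbook: adds a complete example of GRPO training with multi-turn tool calling
Cookbook directory reorganization: rl/ is split into rl/dpo/, rl/grpo/, and rl/gkd/, each shipped with a .sh launch script
Platform detection module: adds utils/platforms/, a unified device abstraction for GPU/NPU/MPS
Template enhancements: DeepSeekV4 template optimization, a refactored tool-call parser, and Qwen3.5-VL support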
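
To illustrate the chunking idea behind the ChunkedCrossEntropyLoss item, here is a minimal pure-Python sketch. It is not the PR's implementation: the function names and the list-of-lists logits layout are assumptions made for illustration. It only shows that accumulating the loss chunk by chunk gives the same mean as computing it over all tokens at once, so a tensor implementation would need to hold the log-softmax of just one chunk at a time.

```python
import math


def token_nll(logits, target):
    """Negative log-likelihood of `target` under softmax(logits).

    Uses the log-sum-exp trick (subtract the max) for numerical stability.
    """
    m = max(logits)
    log_z = m + math.log(sum(math.exp(x - m) for x in logits))
    return log_z - logits[target]


def full_cross_entropy(all_logits, targets):
    """Reference: mean token loss computed over every position at once."""
    losses = [token_nll(row, t) for row, t in zip(all_logits, targets)]
    return sum(losses) / len(losses)


def chunked_cross_entropy(all_logits, targets, chunk_size):
    """Same mean loss, but only `chunk_size` positions are processed per step."""
    total, count = 0.0, 0
    for start in range(0, len(targets), chunk_size):
        chunk_logits = all_logits[start:start + chunk_size]
        chunk_targets = targets[start:start + chunk_size]
        # Only this chunk's per-token losses exist at this point.
        total += sum(token_nll(r, t) for r, t in zip(chunk_logits, chunk_targets))
        count += len(chunk_targets)
    return total / count
```

The chunk size does not change the result, only how many positions are live at once; the last chunk may be shorter than `chunk_size`.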

Bug Fixes

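Numerical stability fixes for the DPO / GKD / GRPO losses
Corrected in-batch negative sampling in the InfoNCE loss
Removed redundant require_entropy computation in Processor
Fixed path handling for Megatron multi-LoRA
Cleaned up legacy code in the vLLM Sampler
Fixed session management in the server state
Handled edge cases in Template tool parsing
Made the packing logic in IterablePackingDataset more robust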

Tests

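Added tests/loss/: full coverage of CE/MSE, DPO, and GRPO/GKD
Added tests/advantage/: unit tests for advantage functions
Added tests/metric/: coverage of all metrics
Added tests/utils/: tests for utility functions
Added tests/twinkle_agentic/test_tools.py: tests for agentic tools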

Docs

New bilingual docs: Agentic (Envs/Protocol/Rollout/Tools/Preprocessor), Auto (Auto-Research/SkillProvider), and the CLI components

Experiment results

Paste your experiment results here (if needed).

gemini-code-assist

Code Review

This pull request cleans up unused imports across several files, including removing time in manager.py, os in mixin.py, and various types in cli.py. It also removes from __future__ import annotations from cli.py and quotes the ConfigRegistry type annotation as a result. The review feedback points out that removing from __future__ import annotations will break compatibility with Python < 3.10 due to the use of PEP 604 union types and PEP 585 generic collections. It is recommended to restore this import, which also allows using ConfigRegistry directly without quotes.
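
The compatibility point in this review can be sketched as follows. The class below is a hypothetical stand-in that borrows the name ConfigRegistry from the review; its methods and the `lookup` helper are assumptions, not code from cli.py. With the future import in place, annotations are stored as strings instead of being evaluated, so PEP 604 unions and an unquoted self-reference in annotations do not raise on interpreters older than 3.10.

```python
from __future__ import annotations  # must come before any other statement


class ConfigRegistry:
    """Hypothetical stand-in for the class named in the review."""

    def __init__(self) -> None:
        self.values = {}

    def merge(self, other: ConfigRegistry | None) -> ConfigRegistry:
        # `ConfigRegistry` appears unquoted inside its own class body and
        # `X | None` is PEP 604 syntax; neither is evaluated at definition
        # time because the future import defers annotation evaluation.
        if other is not None:
            self.values.update(other.values)
        return self


def lookup(registry: ConfigRegistry, key: str) -> int | None:
    """Return the stored value for `key`, or None when it is absent."""
    return registry.values.get(key)
```

The deferral covers annotations only. An expression such as `isinstance(x, int | None)` is still evaluated at runtime and needs Python 3.10 or later.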

Important

The consumer version of Gemini Code Assist on GitHub is being sunset. Starting June 18, 2026, new organization installations will be blocked, and all code review activity will officially cease on July 17, 2026.
For more details on the timeline and next steps, please review the Help Documentation.

tastelikefeet · 2026-06-28T08:28:30Z

/gemini review

gemini-code-assist

Code Review

This pull request introduces significant updates to the Twinkle framework, including the addition of a new agentic module (twinkle_agentic) with support for multi-turn rollouts, environment pools, and tool use, alongside comprehensive documentation. It refactors training and sampling cookbooks to use a unified CLI configuration, adds Muon optimizer support, and optimizes the ChunkedCrossEntropyLoss backward pass for memory efficiency. Key feedback highlights several critical issues: a missing batched=True parameter in dataset_index.py that would cause type errors, dangerous in-place mutation of logits in the chunked cross-entropy backward pass, missing L2 normalization for cosine similarity calculations in build_thinking_rag_index.py, and missing process group parameters in EmbeddingMetric's distributed gather. Additionally, improvements are suggested to resolve hardcoded absolute paths in evaluation scripts, correct step counter misalignment on training resume, and prevent negative local pool sizes in EnvPool.
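
The L2-normalization point can be illustrated with a small sketch. This is not the code in build_thinking_rag_index.py; the helper names are assumptions. A plain dot product only equals cosine similarity when both vectors have unit length, so unnormalized embeddings are ranked partly by magnitude rather than by direction alone.

```python
import math


def l2_normalize(vec):
    """Scale `vec` to unit Euclidean length; a zero vector is returned unchanged."""
    norm = math.sqrt(sum(x * x for x in vec))
    if norm == 0.0:
        return list(vec)
    return [x / norm for x in vec]


def dot(a, b):
    """Plain inner product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))


def cosine_similarity(a, b):
    """Cosine similarity computed as the dot product of L2-normalized vectors."""
    return dot(l2_normalize(a), l2_normalize(b))
```

For example, `[3.0, 4.0]` and `[6.0, 8.0]` point in the same direction: their raw dot product is 50, while their cosine similarity is 1 up to floating-point rounding.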

tastelikefeet added 2 commits June 14, 2026 22:39

fix

05f84d9

wip

28ab28f

gemini-code-assist Bot reviewed Jun 15, 2026

Comment thread src/twinkle/cli/cli.py

Comment thread src/twinkle/cli/cli.py

tastelikefeet added 27 commits June 15, 2026 11:43

wip

30e8412

wip

6909de3

wip

c7a8b88

wip

47a8320

wip

18526e5

wip

34a3e9a

wip

420d0da

wip

5f093e5

wip

520f797

fix

a79fb3b

wip

df88556

wip

28ee773

wip

ffeba0b

wip

a762f0f

wip

3f4b9ca

fix

5a765ba

fix

d409fc3

wip

d3eb7c4

wip

24df8d1

wip

ee6ba73

fix

12471dc

fix

e23c212

fix

d6d4896

Merge commit 'd6d4896cd15669902bec654bdbdb7c8100ad1a82' into feat/ref…

eafb978

…actor-1

add unittests

5253fbd

fix

cc7f339

tui

2604e77

tastelikefeet added 21 commits June 21, 2026 17:20

fix

f93bf95

fix

dbe4cb7

fix

f75f20e

fix

182bba1

fix

e9f505b

wip

e25c459

fix

ea80684

fix

1c29af2

wip

4c93d3f

wip

220947d

fix

bcfb206

wip

1c48804

fix

1a17061

fix

405aec0

fix

2a76112

fix

5b30a93

fix

6da60e0

wip

8d05cd4

wip

92faa1c

fix

34d8d13

wip

6bb8416

tastelikefeet changed the title ~~[WIP] Refactor~~ Refactor code, add auto research and openenv Jun 28, 2026

tastelikefeet added 3 commits June 28, 2026 14:46

fix

d8ec5fd

fix

582a04e

Merge commit '1d35323b976c51692830d211969a6429f6c89214' into feat/ref…

e785a17

…actor-1 # Conflicts: # pyproject.toml # src/twinkle/model/megatron/megatron.py

gemini-code-assist Bot reviewed Jun 28, 2026

tastelikefeet added 3 commits June 28, 2026 18:47

fix

f08d958

lint

418dce4

lint

9a7af7f

Refactor code, add auto research and openenv #225
tastelikefeet wants to merge 66 commits into modelscope:main from tastelikefeet:feat/refactor-1
