Refactor code, add auto research and openenv #225
Code Review
This pull request cleans up unused imports across several files, including removing time in manager.py, os in mixin.py, and various types in cli.py. It also removes from __future__ import annotations from cli.py and quotes the ConfigRegistry type annotation as a result. The review feedback points out that removing from __future__ import annotations will break compatibility with Python < 3.10 due to the use of PEP 604 union types and PEP 585 generic collections. It is recommended to restore this import, which also allows using ConfigRegistry directly without quotes.
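The effect the review describes can be checked with a small sketch. It is not the code from cli.py: the `load` function and the inline source string are hypothetical, and only the `ConfigRegistry` name comes from the review. With the future import in place, annotations are stored as strings and never evaluated at definition time, so union syntax, built-in generics, and an unquoted forward reference are all accepted.

```python
import textwrap

# Hypothetical stand-in for a module such as cli.py; only the name
# ConfigRegistry is taken from the review, everything else is illustrative.
SOURCE = textwrap.dedent('''
    from __future__ import annotations

    # ConfigRegistry is used unquoted before it is defined, and the
    # signature uses PEP 604 unions and PEP 585 generics.
    def load(registry: ConfigRegistry, name: str | None = None) -> dict[str, int]:
        return {}

    class ConfigRegistry:
        pass
''')

namespace = {}
exec(compile(SOURCE, "<cli_sketch>", "exec"), namespace)

# With the future import, each annotation is kept as its source text.
annotations = namespace["load"].__annotations__
print(annotations)
```

Because the annotations stay unevaluated strings, the quotes around ConfigRegistry become unnecessary once the import is restored, which matches the reviewer's recommendation.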
Important
The consumer version of Gemini Code Assist on GitHub is being sunset. Starting June 18, 2026, new organization installations will be blocked, and all code review activity will officially cease on July 17, 2026.
For more details on the timeline and next steps, please review the Help Documentation.
/gemini review
Code Review
This pull request introduces significant updates to the Twinkle framework, including the addition of a new agentic module (twinkle_agentic) with support for multi-turn rollouts, environment pools, and tool use, alongside comprehensive documentation. It refactors training and sampling cookbooks to use a unified CLI configuration, adds Muon optimizer support, and optimizes the ChunkedCrossEntropyLoss backward pass for memory efficiency. Key feedback highlights several critical issues: a missing batched=True parameter in dataset_index.py that would cause type errors, dangerous in-place mutation of logits in the chunked cross-entropy backward pass, missing L2 normalization for cosine similarity calculations in build_thinking_rag_index.py, and missing process group parameters in EmbeddingMetric's distributed gather. Additionally, improvements are suggested to resolve hardcoded absolute paths in evaluation scripts, correct step counter misalignment on training resume, and prevent negative local pool sizes in EnvPool.
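To illustrate why the missing L2 normalization matters, here is a minimal sketch in plain Python. It is not taken from build_thinking_rag_index.py; the helper names and the sample vectors are assumptions made for illustration. A raw dot product grows with vector length, while the dot product of unit-length vectors is the cosine similarity.

```python
import math


def l2_normalize(vec):
    """Scale vec to unit L2 norm; a zero vector is returned unchanged."""
    norm = math.sqrt(sum(x * x for x in vec))
    if norm == 0.0:
        return list(vec)
    return [x / norm for x in vec]


def dot(a, b):
    """Plain inner product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))


# Two hypothetical embeddings pointing in the same direction,
# the second one twice as long as the first.
query = [3.0, 4.0]
doc = [6.0, 8.0]

# Without normalization the score reflects magnitude, not just direction.
raw_score = dot(query, doc)

# After normalization the inner product equals the cosine similarity.
cosine_score = dot(l2_normalize(query), l2_normalize(doc))

print(raw_score, cosine_score)
```

If an index ranks by inner product, normalizing both the stored vectors and the query vectors is one common way to make that ranking equivalent to cosine similarity.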
PR type
PR information
Features
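- twinkle_client/auto module (AgentLoop + TrainingMonitor + Tools): training can be controlled entirely from a terminal chat session
- twinkle_client/skills: supports three provider tiers (bundled / local / ModelScope) and automatically injects training domain knowledge
- twinkle_agentic/envs: a unified tool-injection interface across multiple environments, with support for plugging in custom environment packages
- rl/ split into rl/dpo/, rl/grpo/, and rl/gkd/, each shipped with a .sh launch script
- utils/platforms/: a unified GPU/NPU/MPS device abstraction
Bug Fixes
- Removed the redundant require_entropy computation
Tests
- tests/loss/: full coverage of CE/MSE, DPO, and GRPO/GKD
- tests/advantage/: unit tests for the advantage functions
- tests/metric/: coverage of all metrics
- tests/utils/: tests for the utility functions
- tests/twinkle_agentic/test_tools.py: tests for the agentic tools
Docs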