Zishang Jiang (Jiangzs1028)

MENTOR (Public)

(ICLR2026) Selective Expert Guidance For Effective and Diverse Exploration in Reinforcement Learning of LLMs

Python, 8 stars

DUCL (Public)

(AAAI26 Oral) Difficulty Is Not Enough: Curriculum Learning for LLMs Fine-tuning Must Consider Utility

Python
lsdefine/simple_GRPO (Public)

A very simple GRPO implementation for reproducing R1-like LLM thinking.

Python, 1.7k stars, 132 forks