
Pinned

  1. understand-r1-zero Public

    Understanding R1-Zero-Like Training: A Critical Perspective

    Python 1.2k 58

  2. zero-bubble-pipeline-parallelism Public

    Forked from NVIDIA/Megatron-LM

    Zero Bubble Pipeline Parallelism

    Python 453 34

  3. lorahub Public

    [COLM 2024] LoraHub: Efficient Cross-Task Generalization via Dynamic LoRA Composition

    Python 668 42

  4. oat Public

    🌾 OAT: A research-friendly framework for LLM online alignment, including reinforcement learning, preference learning, etc.

    Python 648 63

  5. stde Public

    Official implementation of Stochastic Taylor Derivative Estimator (STDE) NeurIPS2024

    Python 128 10

  6. feedback-conditional-policy Public

    Code for "Language Models Can Learn from Verbal Feedback Without Scalar Rewards"

    Python 62 2

Repositories

Showing 10 of 101 repositories
  • jrystal Public

    A JAX-based Differentiable Density Functional Theory Framework for Materials

    Python 48 Apache-2.0 1 5 3 Updated Apr 17, 2026
  • envpool Public

    C++-based high-performance parallel environment execution engine (vectorized env) for general RL environments.

    C++ 1,327 Apache-2.0 130 19 1 Updated Apr 17, 2026
  • odc Public

    On demand communication

    Python 32 2 1 4 Updated Apr 16, 2026
  • TeamHOI Public

    [CVPR 2026] TeamHOI: Learning a Unified Policy for Cooperative Human-Object Interactions with Any Team Size

    Python 30 MIT 0 0 0 Updated Mar 12, 2026
  • Stable-RL Public

    Rethinking the Trust Region in LLM Reinforcement Learning

    Python 52 Apache-2.0 5 0 5 Updated Mar 2, 2026
  • oat Public

    🌾 OAT: A research-friendly framework for LLM online alignment, including reinforcement learning, preference learning, etc.

    Python 648 Apache-2.0 63 6 1 Updated Jan 29, 2026
  • LifelongSafetyAlignment
    Python 11 0 0 0 Updated Jan 13, 2026
  • feedback-conditional-policy Public

    Code for "Language Models Can Learn from Verbal Feedback Without Scalar Rewards"

    Python 62 2 0 0 Updated Jan 5, 2026
  • InfNeRF Public

    InfNeRF: Towards Infinite Scale NeRF Rendering with O(log n) Space Complexity

    Python 12 Apache-2.0 1 1 0 Updated Jan 3, 2026
  • SkyLadder Public

    Forked from jzhang38/TinyLlama

    The official repository for SkyLadder: Better and Faster Pretraining via Context Window Scheduling

    Python 42 Apache-2.0 615 1 0 Updated Dec 29, 2025
