Build, extend, and experiment with scientific agents. One platform. Zero friction.
Recent Updates
- [2026.02] Major Update: More benchmark support, multi-agent architecture (CodeAgent + search agent), refactored run scripts (run_multi_agent.py/run_single_agent.py), and streamlined project structure.
- [2025.08] Science-Star Init: We release Science-Star, an open platform for building, extending, and experimenting with scientific agents.
Science-Star is your open-source platform for scientific AI agents. Built on the ReAct engine with Planning, Action, Memory, and Reflection, it ships with Humanity's Last Exam (HLE) and GAIA benchmarks, rich tooling (search, crawl, PDF, browser, RAG), and end-to-end visualization. One command to run, one platform to extend—whether you're a researcher or a developer, Science-Star gets you from idea to experiment, fast.
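To make the Planning, Action, Memory, and Reflection loop concrete, here is a minimal sketch of a ReAct-style agent. It is not Science-Star's implementation: every name in it (`plan`, `reflect`, `react`, `Memory`, `Step`, `TOOLS`) is hypothetical, and a stub policy that reads the tool name from the first word of each sub-goal stands in for the language model.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List

# Illustrative only: every name here is hypothetical, not Science-Star's API.
ToolFn = Callable[[str], str]


@dataclass
class Step:
    thought: str
    tool: str
    argument: str
    observation: str


@dataclass
class Memory:
    """Memory: the ordered record of everything the agent has done so far."""
    steps: List[Step] = field(default_factory=list)


def plan(task: str) -> List[str]:
    """Planning: split the task into sub-goals (a real agent would ask the model)."""
    return [part.strip() for part in task.split(" then ") if part.strip()]


def reflect(memory: Memory) -> bool:
    """Reflection: decide whether to continue, based on the latest observation."""
    return not memory.steps[-1].observation.startswith("error")


def react(task: str, tools: Dict[str, ToolFn], max_steps: int = 8) -> Memory:
    memory = Memory()
    for subgoal in plan(task)[:max_steps]:
        # Reason: a stub policy reads the tool name from the first word.
        name, _, argument = subgoal.partition(" ")
        thought = f"use {name} for: {subgoal}"
        # Act: call the tool and capture its observation.
        tool = tools.get(name)
        observation = tool(argument) if tool else f"error: unknown tool {name}"
        memory.steps.append(Step(thought, name, argument, observation))
        if not reflect(memory):
            break
    return memory


# Two toy tools so the loop can run without network access or API keys.
TOOLS: Dict[str, ToolFn] = {
    "add": lambda arg: str(sum(int(x) for x in arg.split())),
    "upper": lambda arg: arg.upper(),
}

if __name__ == "__main__":
    for step in react("add 2 3 then upper done", TOOLS).steps:
        print(step.tool, "->", step.observation)
```

In a real agent the model would produce the plan, the thought, and the tool choice. The sketch only shows how the four stages hand data to one another through a shared memory.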
Also check out Awesome-Agent-Craft: Our curated collection of papers and benchmarks on unlocking the potential of Scientific AI Agents.
- 🔧 Rich Toolbox — Search (SerpAPI, Tavily, DuckDuckGo, Wayback), crawl (Jina, crawl4ai), PDF parser, browser use, inspector (doc/audio/visual), retriever (RAG), code execution. Add yours via a clean interface.
- 🤖 Multi-Agent & Modular — CodeAgent + search agent out of the box. One config to switch single ↔ multi. Swap loaders, models, tools—core logic stays untouched.
- 📊 HLE & GAIA + Viz — One-click benchmarks with loaders and scorers. Streamlit dashboards for data exploration and result analysis. Zero setup.
- Domain Tools: Chemistry, Biology, and other scientific toolkits
- More Architectures: Beyond ReAct—new agent paradigms
- More Benchmarks: Additional datasets and evaluation suites
- Configure environment — conda, smolagents, API keys, and tests. → Installation
- Run experiments — HLE & GAIA in one command, customize configs, launch visualization dashboards. → Quick Start
- Explore the architecture — tools, loaders, RAG, configs, and how they connect. → Project Structure
- Tools: science_star/tools/ (search, crawl, PDF, browser, retriever)
- Loaders: data/hle_loader.py, data/gaia_loader.py
- Viz: visualization/vis_dataset.py, visualization/vis_output.py
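As a rough illustration of how a pluggable tool interface and a single/multi-agent switch can fit together, the sketch below uses only the Python standard library. It is an assumption-laden example, not the code under science_star/tools/: `Tool`, `ToolRegistry`, `RunConfig`, `build_agents`, and the two toy tools are hypothetical names chosen for this example.

```python
from dataclasses import dataclass
from typing import Dict, List, Protocol, Tuple


class Tool(Protocol):
    """Hypothetical tool contract: a name plus a single run() method."""
    name: str

    def run(self, query: str) -> str: ...


@dataclass
class EchoSearch:
    """Stand-in for a search tool; returns a canned string, no network calls."""
    name: str = "search"

    def run(self, query: str) -> str:
        return f"results for: {query}"


@dataclass
class WordCount:
    """Toy analysis tool used to show that new tools plug in the same way."""
    name: str = "word_count"

    def run(self, query: str) -> str:
        return str(len(query.split()))


class ToolRegistry:
    def __init__(self) -> None:
        self._tools: Dict[str, Tool] = {}

    def register(self, tool: Tool) -> None:
        if tool.name in self._tools:
            raise ValueError(f"duplicate tool name: {tool.name}")
        self._tools[tool.name] = tool

    def get(self, name: str) -> Tool:
        return self._tools[name]

    def names(self) -> List[str]:
        return sorted(self._tools)


@dataclass
class RunConfig:
    mode: str = "single"                        # "single" or "multi"
    search_tools: Tuple[str, ...] = ("search",)  # tools owned by the search agent


def build_agents(config: RunConfig, registry: ToolRegistry) -> Dict[str, List[str]]:
    """Map each agent role to the tool names it may call."""
    names = registry.names()
    if config.mode == "single":
        return {"code_agent": names}
    if config.mode == "multi":
        return {
            "code_agent": [n for n in names if n not in config.search_tools],
            "search_agent": [n for n in names if n in config.search_tools],
        }
    raise ValueError(f"unknown mode: {config.mode}")
```

Because the agents only see tool names resolved through the registry, changing `mode` or registering another tool leaves the core loop untouched, which is the property the modular design aims for.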
Using o4-mini-2025-04-16 as the base model, we have achieved leading results on a small-scale HLE dataset with an end-to-end pipeline built on the ReAct framework and its integrated planning, action, memory, and reflection modules. The project still requires further testing and refinement, and we invite the open-source community to join us in shaping its future. Let's build together!
We welcome all forms of feedback! Please raise an issue for bugs, questions, or suggestions. This helps our team address common problems efficiently and builds a more productive community.
Student Contributors: Daoyu Wang, Qingchuan Li, Tian Gao, Shuo Yu, Xiaoyu Tao, Ze Guo
Supervisors: Qi Liu, Mingyue Cheng
Affiliation: State Key Laboratory of Cognitive Intelligence, University of Science and Technology of China
We extend our gratitude to the OAgent team for providing OAgent and for their hard engineering work. We also thank the smolagents team for their foundational support. Lastly, we deeply appreciate the insightful discussions and contributions from Daoyu Wang, Qingchuan Li, Tian Gao, Shuo Yu, Xiaoyu Tao, and Ze Guo.
Science-Star
@misc{Science-Star,
author = {Daoyu Wang and Qingchuan Li and Tian Gao and Shuo Yu and Xiaoyu Tao and Mingyue Cheng and Qi Liu},
title = {Science-Star: An Open Platform for Building, Extending, and Experimenting with Scientific Agents.},
year = {2025},
organization = {GitHub},
url = {https://github.com/Melmaphother/Science-Star},
}
