prateekg7/SMART-LLM


SMART-LLM: Gemini 2.0 Enhanced Implementation

Smart Multi-Agent Robot Task Planning using Large Language Models

Note

Project Context: This repository is an enhanced implementation of the framework proposed in the research paper "SMART-LLM: Smart Multi-Agent Robot Task Planning using Large Language Models" (ICRA 2024).

This fork improves upon the original baselines by integrating Google's Gemini 2.0 Flash/Pro models, offering superior reasoning capabilities and cost-efficiency compared to the original GPT-based implementation.


📄 Original Research Reference

This project builds directly on the foundational work of Shyam Sundar Kannan, Vishnunandan L. N. Venkatesh, and Byung-Cheol Min.

  • Paper Submitted to: IEEE International Conference on Robotics and Automation (ICRA), 2024
  • Original Source: Project Page | arXiv | Video

Abstract of Original Work: SMART-LLM utilizes Large Language Models (LLMs) to convert high-level instructions (e.g., "Prepare breakfast") into multi-robot task plans. It employs a multi-stage process involving Task Decomposition, Coalition Formation, and Task Allocation, validated within the AI2-THOR environment.
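To make the three stages concrete, here is a toy Python sketch. It is not the paper's method: SMART-LLM carries out each stage by prompting an LLM, whereas this example uses a hard-coded lookup and a brute-force search. The robot names, skill names, subtasks, and the "smallest covering team" rule are all assumptions made for illustration.

```python
from itertools import combinations

# Hypothetical robots and their skills (illustrative names only).
ROBOTS = {
    "robot1": {"GoToObject", "PickupObject", "PutObject"},
    "robot2": {"GoToObject", "SwitchOn", "SwitchOff"},
    "robot3": {"GoToObject", "SliceObject"},
}


def decompose(instruction):
    """Stage 1 stand-in: map an instruction to (subtask, required skills) pairs.

    In SMART-LLM an LLM produces this decomposition; here it is a fixed lookup.
    """
    plans = {
        "Prepare breakfast": [
            ("slice bread", {"GoToObject", "PickupObject", "SliceObject"}),
            ("turn on toaster", {"GoToObject", "SwitchOn"}),
        ]
    }
    return plans[instruction]


def form_coalition(required, robots):
    """Stage 2 stand-in: smallest team whose combined skills cover `required`."""
    names = sorted(robots)
    for size in range(1, len(names) + 1):
        for team in combinations(names, size):
            covered = set().union(*(robots[name] for name in team))
            if required <= covered:
                return team
    return None  # no team can perform this subtask


def allocate(instruction, robots):
    """Stage 3 stand-in: assign each subtask to the coalition found for it."""
    return {
        subtask: form_coalition(skills, robots)
        for subtask, skills in decompose(instruction)
    }


plan = allocate("Prepare breakfast", ROBOTS)
for subtask, team in plan.items():
    print(subtask, "->", team)
```

In this toy setup no single robot can slice bread, so that subtask goes to a two-robot team, while switching on the toaster needs only one robot.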


🚀 Key Improvements in This Repository

We have modernized and optimized the codebase to improve benchmark results:

  1. Gemini 2.0 Integration: Replaced OpenAI calls with Google's gemini-2.0-flash, reducing latency and improving plan accuracy.
  2. Code Quality: Refactored the scripts to remove redundant code and improve maintainability.

🛠️ Setup & Installation

1. Environment Setup

Create a conda environment:

conda create -n smartllm python==3.9
conda activate smartllm

2. Install Dependencies

pip install -r requirments.txt

3. API Configuration

This project reads its API key from a .env file, which keeps the key out of the source code. Do not commit this file to version control.

  1. Get a Key: Obtain a Google API Key from Google AI Studio.
  2. Create Config: Create a file named .env in the root directory.
  3. Add Key:
    GOOGLE_API_KEY="your_api_key"
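For readers wondering how a script might pick up this key at runtime, the following is a minimal standard-library sketch. The README does not say which loader the scripts actually use (a package such as python-dotenv is a common choice), so `load_env_file` and `get_google_api_key` are hypothetical helpers, not functions from this repository.

```python
import os
from pathlib import Path


def load_env_file(path=".env"):
    """Hypothetical helper: parse KEY=VALUE lines from a .env file into a dict."""
    values = {}
    env_path = Path(path)
    if not env_path.exists():
        return values
    for raw_line in env_path.read_text(encoding="utf-8").splitlines():
        line = raw_line.strip()
        # Skip blank lines, comments, and lines without an assignment.
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        # Drop surrounding whitespace and optional quotes around the value.
        values[key.strip()] = value.strip().strip('"').strip("'")
    return values


def get_google_api_key(path=".env"):
    """Prefer an exported environment variable, then fall back to the .env file."""
    key = os.environ.get("GOOGLE_API_KEY") or load_env_file(path).get("GOOGLE_API_KEY")
    if not key:
        raise RuntimeError("GOOGLE_API_KEY is not set; add it to the .env file")
    return key
```

Failing early with a clear error when the key is missing tends to be easier to debug than an authentication failure deep inside a model call.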

💻 Usage

Generating Plans

Run the script to generate executable robot plans for a specific floor plan.

python3 scripts/run_llm.py --floor-plan {floor_plan_no} --model gemini-2.0-flash
  • Result: Executable code will be generated in the logs/gemini_runs/ directory.
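As a rough picture of how the two flags in the command above could be parsed, here is an `argparse` sketch. Only the flag names `--floor-plan` and `--model` come from this README; the integer type, the default model, and the example floor plan number are assumptions, and the real `scripts/run_llm.py` may accept other options.

```python
import argparse


def build_parser():
    """Illustrative parser; not copied from scripts/run_llm.py."""
    parser = argparse.ArgumentParser(
        description="Generate multi-robot plans (illustrative sketch)."
    )
    parser.add_argument(
        "--floor-plan",
        type=int,  # assumption: floor plans are identified by an integer
        required=True,
        help="AI2-THOR floor plan number",
    )
    parser.add_argument(
        "--model",
        default="gemini-2.0-flash",  # assumption: default model name
        help="Gemini model used for planning",
    )
    return parser


# Example invocation with a hypothetical floor plan number.
args = build_parser().parse_args(["--floor-plan", "6", "--model", "gemini-2.0-flash"])
print(args.floor_plan, args.model)
```

Note that `argparse` exposes `--floor-plan` as `args.floor_plan`, converting the hyphen to an underscore.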

Executing Plans in AI2-THOR

To visualize and verify the generated plan:

  1. Identify the generated folder in logs/gemini_runs/.
  2. Run the execution script:
    python3 scripts/execute_plan.py --command {generated_folder_name}
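Step 1 can be automated when you simply want the most recent run. The sketch below picks the most recently modified subfolder of `logs/gemini_runs/`; `latest_run_folder` is a hypothetical helper, and it assumes each run is written to its own subfolder, which this README implies but does not spell out.

```python
from pathlib import Path


def latest_run_folder(log_dir="logs/gemini_runs"):
    """Return the name of the most recently modified subfolder, or None."""
    root = Path(log_dir)
    if not root.is_dir():
        return None
    folders = [entry for entry in root.iterdir() if entry.is_dir()]
    if not folders:
        return None
    newest = max(folders, key=lambda entry: entry.stat().st_mtime)
    return newest.name


if __name__ == "__main__":
    print(latest_run_folder())
```

The printed name can then be passed as the value of `--command` in the execution script.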

📚 Dataset & Resources

  • Task Data: Located in data/final_test/ (Instructions, robot capabilities, ground truth).
  • Robot Skills: Defined in resources/robots.py.
  • AI2-THOR: Layouts can be previewed at AI2-THOR Demo.

📝 Citation

If you use this work, please cite the original authors who established this framework:

@article{kannan2023smart,
    title={SMART-LLM: Smart Multi-Agent Robot Task Planning using Large Language Models},
    author={Kannan, Shyam Sundar and Venkatesh, Vishnunandan LN and Min, Byung-Cheol},
    journal={arXiv preprint arXiv:2309.10062},
    year={2023}
}

