A modular Python library for building LLM-powered agents with tool execution, web browsing capabilities, and observability. This library provides a flexible framework for creating autonomous agents that can interact with files, execute code, browse the web, and perform complex reasoning tasks.
MiniMinds is a lightweight framework that abstracts the complexity of building generative AI agents. It provides:
- Pluggable LLM Support: Abstract interface for multiple LLM providers (Groq implemented; OpenAI and Gemini planned)
- Tool System: Decorator-based tool registration with automatic schema generation
- Agent Framework: Base classes for implementing stateful, iterative agent workflows
- Web Automation: Browser control via Playwright for web-based tasks
- Observability: Built-in Langfuse integration for tracing and debugging
- Context Management: Session-based resource management for multi-agent scenarios
The library demonstrates practical agent patterns including automated unit test generation, web exploration, and file manipulation.
- Multi-Provider LLM Support: Abstracted client interface supporting Groq, with extensibility for OpenAI and Gemini
- Declarative Tool System: Define tools using Python decorators; automatic conversion to LLM-compatible schemas
- Built-in Tool Suites:
  - File operations (read, write, list, create/remove directories)
  - Code execution (run Python files, execute pytest)
  - Web automation (navigate, click, fill forms, screenshot)
  - Utility tools (JSON validation, string manipulation, math operations)
- Agent Lifecycle Management: Base agent class with configurable iteration limits and state tracking
- Persistent Sessions: Context managers for managing browser instances and agent state
- Context Optimization: Scratchpad pattern for pruning conversation history to reduce token usage
- Tracing & Observability: Langfuse decorators for LLM call and tool execution monitoring
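The "Persistent Sessions" item can be pictured as a context manager that creates per-session resources on entry and releases them on exit. The following is a minimal sketch under that assumption only: `DemoSession`, `remember`, and `_RESOURCES` are hypothetical names, and the library's own `Session` class may work differently.

```python
import uuid
from typing import Any

# Hypothetical module-level store of per-session resources (not the library's).
_RESOURCES: dict[str, dict[str, Any]] = {}


class DemoSession:
    """Illustrative session: owns a resource slot for the duration of a with-block."""

    def __init__(self, name: str) -> None:
        self.session_id = f"{name}-{uuid.uuid4().hex[:8]}"

    def __enter__(self) -> "DemoSession":
        _RESOURCES[self.session_id] = {}
        return self

    def __exit__(self, exc_type, exc, tb) -> None:
        # Release everything tied to this session, even if the body raised.
        _RESOURCES.pop(self.session_id, None)


def remember(session_id: str, key: str, value: Any) -> None:
    """Store a value under the given session."""
    _RESOURCES[session_id][key] = value


with DemoSession("web-session") as session:
    remember(session.session_id, "current_url", "https://example.com")
    inside = dict(_RESOURCES[session.session_id])

still_registered = session.session_id in _RESOURCES
print(inside, still_registered)
```

Keying resources by session ID is what would let several agents share one process without seeing each other's state.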
          User Query
               ↓
 [Agent State Initialization]
               ↓
┌─────────────────────────────────┐
│    Iterative Reasoning Loop     │
│  ┌─────────────────────────┐    │
│  │ 1. LLM Generate         │    │
│  │   (with tool schemas)   │    │
│  └───────────┬─────────────┘    │
│              ↓                  │
│  ┌─────────────────────────┐    │
│  │ 2. Parse Response       │    │
│  │   (content + tool calls)│    │
│  └───────────┬─────────────┘    │
│              ↓                  │
│  ┌─────────────────────────┐    │
│  │ 3. Execute Tools        │    │
│  │   (via registry)        │    │
│  └───────────┬─────────────┘    │
│              ↓                  │
│  ┌─────────────────────────┐    │
│  │ 4. Update State         │    │
│  │   (messages + status)   │    │
│  └───────────┬─────────────┘    │
│              ↓                  │
│    [Check Stop Condition]       │
│              ↓                  │
└──────────────┬──────────────────┘
               ↓
          Final Result
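The loop in the diagram can be sketched in a few lines. This is a self-contained illustration, not the library's implementation: `AgentState`, `fake_llm`, `TOOLS`, and `run_loop` are hypothetical names, and the LLM is replaced by a scripted stub so the example runs offline.

```python
from dataclasses import dataclass, field
from typing import Any, Callable

# Hypothetical tool table standing in for the library's registry.
TOOLS: dict[str, Callable[..., Any]] = {"add": lambda a, b: a + b}


@dataclass
class AgentState:
    messages: list[dict[str, Any]] = field(default_factory=list)
    finished: bool = False
    iterations: int = 0


def fake_llm(messages: list[dict[str, Any]]) -> dict[str, Any]:
    """Scripted stand-in for an LLM: request one tool call, then answer."""
    tool_results = [m for m in messages if m["role"] == "tool"]
    if not tool_results:
        return {"content": "", "tool_calls": [{"name": "add", "args": {"a": 2, "b": 3}}]}
    return {"content": f"The result is {tool_results[-1]['content']}", "tool_calls": []}


def run_loop(user_query: str, max_iterations: int = 5) -> AgentState:
    state = AgentState(messages=[{"role": "user", "content": user_query}])
    while not state.finished and state.iterations < max_iterations:
        state.iterations += 1
        response = fake_llm(state.messages)                    # 1. LLM generate
        content = response["content"]                          # 2. parse response
        calls = response["tool_calls"]
        state.messages.append({"role": "assistant", "content": content})
        for call in calls:                                     # 3. execute tools
            result = TOOLS[call["name"]](**call["args"])
            state.messages.append({"role": "tool", "content": str(result)})
        state.finished = not calls                             # 4. update state, check stop
    return state


final_state = run_loop("What is 2 + 3?")
print(final_state.messages[-1]["content"])
```

In this sketch the stop condition is "the model made no tool calls"; the iteration limit guards against a model that never stops calling tools.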
- Tool Definition: Functions decorated with @tool() are converted to Tool objects
- Schema Generation: Tools are serialized to OpenAI/Gemini-compatible function schemas
- Registry Management: ToolRegistry aggregates tools and manages session injection
- LLM Integration: Tool schemas passed to LLM; tool calls parsed from responses
- Execution: Tools invoked via registry with automatic session ID injection
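To make the definition and schema-generation steps concrete, here is a rough sketch of how a decorator could wrap a function and derive an OpenAI-style function schema from its signature. `simple_tool` and `SimpleTool` are hypothetical stand-ins; the library's actual `@tool()` decorator and `Tool` class may be structured differently and handle more annotation types.

```python
import inspect
from typing import Any, Callable

# Mapping from Python annotations to JSON Schema type names.
_JSON_TYPES = {int: "integer", float: "number", str: "string",
               bool: "boolean", dict: "object", list: "array"}


class SimpleTool:
    """Hypothetical wrapper: keeps the function callable and exposes a schema."""

    def __init__(self, func: Callable[..., Any]) -> None:
        self.func = func
        self.name = func.__name__
        self.description = inspect.getdoc(func) or ""

    def __call__(self, *args: Any, **kwargs: Any) -> Any:
        return self.func(*args, **kwargs)

    def to_openai_format(self) -> dict[str, Any]:
        properties: dict[str, Any] = {}
        required: list[str] = []
        for name, param in inspect.signature(self.func).parameters.items():
            # Unknown or missing annotations fall back to "string" in this sketch.
            properties[name] = {"type": _JSON_TYPES.get(param.annotation, "string")}
            if param.default is inspect.Parameter.empty:
                required.append(name)
        return {
            "type": "function",
            "function": {
                "name": self.name,
                "description": self.description,
                "parameters": {
                    "type": "object",
                    "properties": properties,
                    "required": required,
                },
            },
        }


def simple_tool() -> Callable[[Callable[..., Any]], SimpleTool]:
    """Decorator factory, used as @simple_tool()."""
    return SimpleTool


@simple_tool()
def word_count(text: str, min_length: int = 1) -> int:
    """Count the words in a text that have at least min_length characters."""
    return sum(1 for word in text.split() if len(word) >= min_length)


schema = word_count.to_openai_format()
print(schema["function"]["name"], schema["function"]["parameters"]["required"])
```

Parameters without a default are treated as required, and the docstring doubles as the description the LLM sees.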
The library implements two agent patterns for handling conversation context:
- Simple Agent (v1_simple.py): Retains full message history; suitable for short tasks
- Scratchpad Agent (v2_scratchpad.py): Prunes tool outputs after each iteration, maintaining only system/user messages and latest assistant state
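The pruning idea behind the scratchpad pattern can be illustrated as follows. This is a sketch that assumes messages are plain role/content dictionaries; `prune_history` is a hypothetical helper, and the logic in v2_scratchpad.py may differ in detail.

```python
from typing import Any

Message = dict[str, Any]


def prune_history(messages: list[Message]) -> list[Message]:
    """Keep system/user messages plus the most recent assistant message."""
    kept = [m for m in messages if m["role"] in ("system", "user")]
    assistants = [m for m in messages if m["role"] == "assistant"]
    if assistants:
        kept.append(assistants[-1])
    return kept


history: list[Message] = [
    {"role": "system", "content": "You write unit tests."},
    {"role": "user", "content": "Test math_tools.py"},
    {"role": "assistant", "content": "Listing files first."},
    {"role": "tool", "content": "math_tools.py\nstring_tools.py"},
    {"role": "assistant", "content": "scratchpad: files listed; next read math_tools.py"},
    {"role": "tool", "content": "<long file content>"},
]

pruned = prune_history(history)
print([m["role"] for m in pruned])
```

Because tool outputs are dropped, anything the agent still needs has to be carried forward in the latest assistant message, which is why that message acts as the scratchpad.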
- Python: 3.11+
- Pydantic: Data validation and settings management
- Loguru: Structured logging
- Groq: Primary LLM provider (llama-3.3-70b-versatile, llama-3.1-70b-versatile)
- OpenAI: Not yet implemented; can be added through the abstract client interface
- Gemini: Not yet implemented; can be added through the abstract client interface
- Playwright: Headless browser automation for web interaction
- Pytest: Test execution and validation
- Langfuse: Distributed tracing for LLM calls and tool executions
- UV: Fast Python package installer and environment manager
- Python 3.11 or higher
- UV package manager (recommended)
pip install uv
git clone https://github.com/kariem-magdy/GenAI-Agent-Lab-Library-MiniMinds.git
cd GenAI-Agent-Lab-Library-MiniMinds
uv sync
Linux/Mac:
source .venv/bin/activate
Windows:
.venv\Scripts\activate
Create a .env file in the project root:
GROQ_API_KEY=your_groq_api_key_here
# Optional: for observability
LANGFUSE_PUBLIC_KEY=your_langfuse_public_key
LANGFUSE_SECRET_KEY=your_langfuse_secret_key
Get your Groq API key from console.groq.com
playwright install chromium
from tools.toolkit.builtin.math_tools import add, multiply
# Tools are callable
result = add(5, 3) # Returns 8
# Get tool description for LLM
print(add.to_string())
# Output: Tool Name: add, Description: Adding Numbers, Arguments: a: int|float, b: int|float, Outputs: int|float
from tools.decorator import tool
@tool()
def fetch_weather(city: str) -> dict:
"""Fetch weather information for a given city."""
# Implementation here
return {"city": city, "temp": 22}
# Tool is now registered and has schema generation
print(fetch_weather.to_openai_format())
from llm.groq_client import GroqClient, LLMConfig
from tools.registry import ToolRegistry
from tools.toolkit.builtin import math_tools, string_tools
# Configure LLM
config = LLMConfig(
model_name="llama-3.3-70b-versatile",
temperature=0.7,
max_tokens=2048
)
client = GroqClient(config)
# Register tools
registry = ToolRegistry()
registry.register_from_module(math_tools)
registry.register_from_module(string_tools)
# Create messages
messages = [
{"role": "system", "content": f"You are a helpful assistant. Available tools:\n{registry.to_string()}"},
{"role": "user", "content": "Calculate 15 * 7 and convert the result to uppercase string"}
]
# Generate with tools
response = client.generate(messages, tools=registry.to_client_tools(config.provider))
print(response)
The library includes a practical example agent that generates and executes unit tests:
from agent.unit_tester.v2_scratchpad import ScratchpadUnitTesterAgent
from llm.groq_client import GroqClient, LLMConfig
config = LLMConfig(
model_name="llama-3.3-70b-versatile",
temperature=1.0,
max_tokens=5000
)
client = GroqClient(config)
agent = ScratchpadUnitTesterAgent(client, max_iterations=20)
user_query = """
Write unit tests for tools/toolkit/web_explorer.py and run them.
Output test results in tools/llm_tests/ directory.
"""
state = agent.iterate(user_query=user_query)
print(state.messages[-1]) # Final report
Run the example:
python -m agent.examples.03_use_v2_agent
from session import Session
from tools.registry import ToolRegistry
import tools.toolkit.web_explorer as web_tools
with Session("web-session") as session:
registry = ToolRegistry(session.session_id)
registry.register_from_module(web_tools)
# Navigate to a URL
status = registry.get("goto_url")("https://example.com")
print(status)
# Extract page content
content = registry.get("get_page_content")(mode="text")
print(content)
# Take screenshot
screenshot_data = registry.get("screenshot")(full_page=True)
# Clean up
registry.get("end_browsing_page")()
Scenario: Generate pytest tests for a Python module, execute them, and report results.
Agent Workflow:
- File Discovery: Agent lists project files using the list_directory_files tool
- Code Reading: Reads target module using the read_file tool
- Test Generation: Uses LLM to generate pytest test cases
- Test Writing: Writes tests to file using the write_file tool
- Execution: Runs tests using the run_pytest_tests tool
- Analysis: Parses test results (pass/fail counts)
- Iteration: If failures detected, reads error output and regenerates tests
- Reporting: Returns structured JSON report with test summary
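The analysis and iteration steps depend on extracting pass/fail counts from pytest's output. The sketch below parses the final summary line that pytest prints (for example `2 failed, 10 passed in 1.86s`). `summarize_pytest_output` is a hypothetical helper; the source does not state what format the run_pytest_tests tool returns, so treating it as raw pytest text is an assumption.

```python
import re

# Matches fragments such as "10 passed", "2 failed", "1 error", "3 skipped".
_COUNT = re.compile(r"(\d+) (passed|failed|skipped|errors?)")


def summarize_pytest_output(output: str) -> dict[str, int]:
    """Extract outcome counts from the last line of pytest's console output."""
    counts = {"passed": 0, "failed": 0, "skipped": 0, "error": 0}
    lines = output.strip().splitlines()
    if not lines:
        return counts
    for number, label in _COUNT.findall(lines[-1]):
        key = "error" if label.startswith("error") else label
        counts[key] += int(number)
    return counts


sample_output = "....F.F.....\n========= 2 failed, 10 passed in 1.86s ========="
summary = summarize_pytest_output(sample_output)

# The agent would regenerate tests only when something failed or errored.
needs_another_iteration = summary["failed"] > 0 or summary["error"] > 0
print(summary, needs_another_iteration)
```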
Example Output:
{
"finished": true,
"message": "10 tests passed, 2 failed in 1.86s",
"scratchpad": "Generated tests for web_explorer.py. All core functions covered. Two edge case failures require mock adjustment."
}
Run Example:
python -m agent.examples.00_raw_unit_tester
With tracing:
python -m agent.examples.01_raw_traced_unit_tester
GenAI-Agent-Lab-Library-MiniMinds/
├── agent/ # Agent framework
│ ├── base.py # Base agent class with iteration logic
│ ├── examples/ # Example agent implementations
│ │ ├── 00_raw_unit_tester.py # Raw agent without framework
│ │ ├── 01_raw_traced_unit_tester.py # With Langfuse tracing
│ │ ├── 02_use_v1_agent.py # Using simple agent class
│ │ └── 03_use_v2_agent.py # Using scratchpad agent
│ └── unit_tester/ # Unit tester agent implementations
│ ├── v1_simple.py # Full history retention
│ └── v2_scratchpad.py # Context pruning strategy
├── llm/ # LLM client abstractions
│ ├── base.py # Abstract LLM client interface
│ ├── config.py # Configuration with Pydantic
│ └── groq_client.py # Groq implementation
├── tools/ # Tool system
│ ├── base.py # Tool class definition
│ ├── decorator.py # @tool() decorator
│ ├── registry.py # Tool registry and management
│ ├── main.py # Tool testing script
│ └── toolkit/ # Tool collections
│ ├── web_explorer.py # Browser automation tools
│ └── builtin/ # Built-in utility tools
│ ├── code_tools.py # Python/pytest execution
│ ├── file_tools.py # File system operations
│ ├── json_tools.py # JSON validation
│ ├── math_tools.py # Mathematical operations
│ └── string_tools.py # String manipulation
├── prompts/ # System prompts
│ ├── unit_tester_v1.txt # Prompt for simple agent
│ └── unit_tester_v2.txt # Prompt for scratchpad agent
├── browser_manager.py # Playwright browser lifecycle
├── session.py # Session context manager
├── pyproject.toml # Project dependencies
└── README.md # This file
- LLM Provider Support: Only Groq is fully implemented; OpenAI/Gemini require client implementation
- Error Recovery: Limited retry logic for tool execution failures
- Parallel Execution: Tools execute sequentially; no concurrent tool calls
- Memory Management: Context pruning is manual; no automatic summarization
- Tool Validation: No runtime validation of tool outputs against schemas
- Browser Isolation: Single browser instance per session; no headful mode option
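Since only the Groq client is implemented, adding a provider means writing a new class against the abstract interface in llm/base.py. The sketch below shows the general shape such an adapter might take. `BaseClient` and `EchoClient` are hypothetical, and the `generate(messages, tools=None)` signature is inferred from the usage examples in this README rather than copied from the library.

```python
from abc import ABC, abstractmethod
from typing import Any, Optional


class BaseClient(ABC):
    """Hypothetical provider-agnostic interface with a single generate() method."""

    @abstractmethod
    def generate(
        self,
        messages: list[dict[str, Any]],
        tools: Optional[list[dict[str, Any]]] = None,
    ) -> dict[str, Any]:
        """Return a dict with the model's text and any requested tool calls."""


class EchoClient(BaseClient):
    """Offline stand-in for a real provider adapter: echoes the last user message."""

    def generate(
        self,
        messages: list[dict[str, Any]],
        tools: Optional[list[dict[str, Any]]] = None,
    ) -> dict[str, Any]:
        last_user = next(m for m in reversed(messages) if m["role"] == "user")
        return {"content": last_user["content"], "tool_calls": []}


client: BaseClient = EchoClient()
reply = client.generate([{"role": "user", "content": "ping"}])
print(reply)
```

A real adapter would replace the echo logic with a call to the provider's SDK and translate its response into the same return shape.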
- Implement OpenAI and Gemini client adapters
- Add automatic context summarization using LLM
- Support streaming responses for real-time agent output
- Implement tool output validation layer
- Add multi-agent coordination primitives
- Develop memory module for long-term agent state persistence
- Create tool marketplace for community-contributed tools
- Add structured logging for tool execution timeline
- Implement automatic test generation for custom tools
Contributions are welcome. Please follow these guidelines:
- Code Style: Follow PEP 8; use Pydantic for configuration models
- Type Hints: All functions must have type annotations
- Documentation: Docstrings required for all public methods
- Testing: Add tests for new tools in tools/llm_tests/
- Logging: Use Loguru for structured logging; avoid print statements
- Tracing: Decorate new LLM/tool wrappers with @observe
- Fork the repository
- Create a feature branch: git checkout -b feature/your-feature
- Commit changes with clear messages
- Add tests and ensure existing tests pass
- Submit a pull request with description of changes
When creating new tools:
- Use the @tool() decorator
- Provide clear docstrings (used as LLM descriptions)
- Return JSON-serializable types (str, dict, list, int, bool)
- Handle exceptions gracefully; return error dictionaries
- Support session_id parameter if the tool requires state
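A tool following these guidelines might look like the sketch below. The `@tool()` decorator is left as a comment so the snippet runs without the library installed. `read_json_file` is a hypothetical tool, and the `ok`/`error` keys and the way `session_id` is passed through are illustrative assumptions, not a documented convention.

```python
import json
from pathlib import Path
from typing import Any, Optional


# In the library, this function would be decorated with @tool().
def read_json_file(path: str, session_id: Optional[str] = None) -> dict[str, Any]:
    """Read a JSON file and return its parsed content."""
    try:
        data = json.loads(Path(path).read_text(encoding="utf-8"))
    except OSError as exc:
        # Missing file, permission problem, path is a directory, and so on.
        return {"ok": False, "error": f"Cannot read file: {exc}"}
    except json.JSONDecodeError as exc:
        return {"ok": False, "error": f"Invalid JSON: {exc.msg}"}
    # Only JSON-serializable values are returned.
    return {"ok": True, "data": data, "session_id": session_id}


missing = read_json_file("definitely_missing_file.json")
print(missing["ok"])
```

Returning an error dictionary instead of raising lets the LLM read the failure and decide how to recover in the next iteration.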
- HuggingFace Agent Course - Introduction to AI agents
- Context Engineering by Langchain - Managing agent context
- Multi-Agent Architectures - Conceptual overview
- Advanced Context Engineering - Context optimization techniques
- How Long Contexts Fail - Understanding context window limitations
This project is part of a GenAI educational lab. License information not specified.