GenAI Agent Library - MiniMinds

A modular Python library for building LLM-powered agents with tool execution, web browsing capabilities, and observability. This library provides a flexible framework for creating autonomous agents that can interact with files, execute code, browse the web, and perform complex reasoning tasks.

Overview

MiniMinds is a lightweight framework that abstracts away the complexity of building generative AI agents. It provides:

Pluggable LLM Support: Abstract client interface for LLM providers; Groq is implemented, and OpenAI and Gemini can be added as adapters
Tool System: Decorator-based tool registration with automatic schema generation
Agent Framework: Base classes for implementing stateful, iterative agent workflows
Web Automation: Browser control via Playwright for web-based tasks
Observability: Built-in Langfuse integration for tracing and debugging
Context Management: Session-based resource management for multi-agent scenarios

The library demonstrates practical agent patterns including automated unit test generation, web exploration, and file manipulation.

Key Features

Multi-Provider LLM Support: Abstracted client interface supporting Groq, with extensibility for OpenAI and Gemini
Declarative Tool System: Define tools using Python decorators; automatic conversion to LLM-compatible schemas
Built-in Tool Suites:
- File operations (read, write, list, create/remove directories)
- Code execution (run Python files, execute pytest)
- Web automation (navigate, click, fill forms, screenshot)
- Utility tools (JSON validation, string manipulation, math operations)
Agent Lifecycle Management: Base agent class with configurable iteration limits and state tracking
Persistent Sessions: Context managers for managing browser instances and agent state
Context Optimization: Scratchpad pattern for pruning conversation history to reduce token usage
Tracing & Observability: Langfuse decorators for LLM call and tool execution monitoring

System Architecture

Agent Execution Flow

User Query
    ↓
[Agent State Initialization]
    ↓
┌─────────────────────────────────┐
│  Iterative Reasoning Loop       │
│  ┌──────────────────────────┐   │
│  │ 1. LLM Generate          │   │
│  │    (with tool schemas)   │   │
│  └───────────┬──────────────┘   │
│              ↓                  │
│  ┌──────────────────────────┐   │
│  │ 2. Parse Response        │   │
│  │    (content + tool calls)│   │
│  └───────────┬──────────────┘   │
│              ↓                  │
│  ┌──────────────────────────┐   │
│  │ 3. Execute Tools         │   │
│  │    (via registry)        │   │
│  └───────────┬──────────────┘   │
│              ↓                  │
│  ┌──────────────────────────┐   │
│  │ 4. Update State          │   │
│  │    (messages + status)   │   │
│  └───────────┬──────────────┘   │
│              ↓                  │
│      [Check Stop Condition]     │
│              ↓                  │
└──────────────┬──────────────────┘
               ↓
          Final Result
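
The loop in the diagram can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the code in agent/base.py: fake_llm, TOOLS, and run_agent are hypothetical names, and the LLM is replaced by a scripted stub so the example runs offline.

```python
import json
from typing import Any, Callable

# Hypothetical tool table standing in for the library's ToolRegistry.
TOOLS: dict[str, Callable[..., Any]] = {
    "multiply": lambda a, b: a * b,
}


def fake_llm(messages: list[dict]) -> dict:
    """Scripted stand-in for an LLM call: request one tool, then finish."""
    tool_results = [m for m in messages if m["role"] == "tool"]
    if not tool_results:
        call = {"name": "multiply", "arguments": {"a": 15, "b": 7}}
        return {"content": "", "tool_calls": [call]}
    answer = tool_results[-1]["content"]
    return {"content": f"The answer is {answer}", "tool_calls": []}


def run_agent(user_query: str, max_iterations: int = 5) -> list[dict]:
    # Agent state initialization: the state here is just the message list.
    messages = [{"role": "user", "content": user_query}]
    for _ in range(max_iterations):
        response = fake_llm(messages)            # 1. LLM generate
        tool_calls = response["tool_calls"]      # 2. parse response
        messages.append({"role": "assistant", "content": response["content"]})
        if not tool_calls:                       # stop condition: no tool requested
            break
        for call in tool_calls:                  # 3. execute tools
            result = TOOLS[call["name"]](**call["arguments"])
            # 4. update state with the tool output
            messages.append({"role": "tool", "content": json.dumps(result)})
    return messages


history = run_agent("What is 15 * 7?")
print(history[-1]["content"])
```

The max_iterations bound plays the role of the configurable iteration limit: even if the model keeps requesting tools, the loop terminates.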

Tool Registration & Execution

Tool Definition: Functions decorated with @tool() are converted to Tool objects
Schema Generation: Tools are serialized to OpenAI/Gemini-compatible function schemas
Registry Management: ToolRegistry aggregates tools and manages session injection
LLM Integration: Tool schemas passed to LLM; tool calls parsed from responses
Execution: Tools invoked via registry with automatic session ID injection
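
The first two steps above (decorating a function and generating a schema from it) can be illustrated with the standard library alone. This is a sketch, not the library's tools/decorator.py: SketchTool, sketch_tool, and the JSON_TYPES mapping are hypothetical, and the output shape follows the function-tool format used by OpenAI-style chat completion APIs.

```python
import inspect
from typing import Any, Callable, get_type_hints

# Assumed mapping from Python annotations to JSON Schema type names.
JSON_TYPES = {
    int: "integer", float: "number", str: "string",
    bool: "boolean", dict: "object", list: "array",
}


class SketchTool:
    """Hypothetical stand-in for the library's Tool class."""

    def __init__(self, func: Callable[..., Any]):
        self.func = func
        self.name = func.__name__
        self.description = inspect.getdoc(func) or ""

    def __call__(self, *args: Any, **kwargs: Any) -> Any:
        # The wrapped function stays directly callable.
        return self.func(*args, **kwargs)

    def to_openai_format(self) -> dict:
        hints = get_type_hints(self.func)
        hints.pop("return", None)
        params = inspect.signature(self.func).parameters
        # Unannotated or unmapped parameters fall back to "string".
        properties = {
            name: {"type": JSON_TYPES.get(hints.get(name), "string")}
            for name in params
        }
        # Parameters without a default value are required.
        required = [
            name for name, p in params.items()
            if p.default is inspect.Parameter.empty
        ]
        return {
            "type": "function",
            "function": {
                "name": self.name,
                "description": self.description,
                "parameters": {
                    "type": "object",
                    "properties": properties,
                    "required": required,
                },
            },
        }


def sketch_tool() -> Callable[[Callable[..., Any]], SketchTool]:
    """Decorator factory mirroring the @tool() call style."""
    return SketchTool


@sketch_tool()
def add(a: int, b: int = 0) -> int:
    """Add two integers."""
    return a + b


schema = add.to_openai_format()
print(schema["function"]["parameters"])
```

Deriving the schema from type hints and the docstring keeps the tool definition in one place, which is why the contributing guidelines ask for annotations and docstrings on every tool.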

Context Management Strategies

The library implements two agent patterns for handling conversation context:

Simple Agent (v1_simple.py): Retains full message history; suitable for short tasks
Scratchpad Agent (v2_scratchpad.py): Prunes tool outputs after each iteration, retaining only the system and user messages and the latest assistant state; suitable for longer tasks where full history would inflate token usage
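
As an illustration of the scratchpad strategy, the pruning step might look like the following. The function name prune_history and the message contents are hypothetical; the actual logic in v2_scratchpad.py may differ.

```python
def prune_history(messages: list[dict]) -> list[dict]:
    """Keep system/user messages plus only the most recent assistant message."""
    kept = [m for m in messages if m["role"] in ("system", "user")]
    assistant_msgs = [m for m in messages if m["role"] == "assistant"]
    if assistant_msgs:
        # The latest assistant message carries the scratchpad, i.e. the
        # agent's own summary of what it has learned so far.
        kept.append(assistant_msgs[-1])
    return kept


history = [
    {"role": "system", "content": "You write unit tests."},
    {"role": "user", "content": "Test math_tools.py"},
    {"role": "assistant", "content": '{"scratchpad": "Need to read the file first."}'},
    {"role": "tool", "content": "<long file content>"},
    {"role": "assistant", "content": '{"scratchpad": "File read; add() and multiply() need tests."}'},
]

pruned = prune_history(history)
print([m["role"] for m in pruned])
```

The bulky tool output is dropped, so the prompt size stays roughly constant across iterations; the trade-off is that anything the agent did not write into its scratchpad is lost.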

Technologies & Models Used

Core Dependencies

Python: 3.11+
Pydantic: Data validation and settings management
Loguru: Structured logging

LLM Integration

Groq: Primary LLM provider (llama-3.3-70b-versatile, llama-3.1-70b-versatile)
OpenAI: Not yet implemented; can be added through the abstract client interface
Gemini: Not yet implemented; can be added through the abstract client interface

Tools & Automation

Playwright: Headless browser automation for web interaction
Pytest: Test execution and validation

Observability

Langfuse: Distributed tracing for LLM calls and tool executions

Package Management

UV: Fast Python package installer and environment manager

Installation & Setup

Prerequisites

Python 3.11 or higher
UV package manager (recommended)

Step 1: Install UV

pip install uv

Step 2: Clone Repository

git clone https://github.com/kariem-magdy/GenAI-Agent-Lab-Library-MiniMinds.git
cd GenAI-Agent-Lab-Library-MiniMinds

Step 3: Install Dependencies

uv sync

Step 4: Activate Virtual Environment

Linux/Mac:

source .venv/bin/activate

Windows:

.venv\Scripts\activate

Step 5: Configure API Keys

Create a .env file in the project root:

GROQ_API_KEY=your_groq_api_key_here

# Optional: for observability
LANGFUSE_PUBLIC_KEY=your_langfuse_public_key
LANGFUSE_SECRET_KEY=your_langfuse_secret_key

Get your Groq API key from console.groq.com
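
A quick way to confirm the keys are visible before running an agent is sketched below. It uses only the standard library; missing_keys is a hypothetical helper, and the sketch assumes the variables from .env have already been loaded into the process environment (how the library loads them is not shown in this README).

```python
import os
from collections.abc import Mapping

REQUIRED = ("GROQ_API_KEY",)
OPTIONAL = ("LANGFUSE_PUBLIC_KEY", "LANGFUSE_SECRET_KEY")


def missing_keys(env: Mapping[str, str], names: tuple[str, ...]) -> list[str]:
    """Return the names that are unset or blank in the given environment."""
    return [name for name in names if not env.get(name, "").strip()]


missing = missing_keys(os.environ, REQUIRED)
if missing:
    print("Missing required keys:", ", ".join(missing))

unset_optional = missing_keys(os.environ, OPTIONAL)
if unset_optional:
    # Assumption: without these, Langfuse tracing will not be available.
    print("Optional keys not set:", ", ".join(unset_optional))
```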

Step 6: Install Playwright Browsers (for web tools)

playwright install chromium

Usage Instructions

Basic Tool Usage

from tools.toolkit.builtin.math_tools import add, multiply

# Tools are callable
result = add(5, 3)  # Returns 8

# Get tool description for LLM
print(add.to_string())
# Output: Tool Name: add, Description: Adding Numbers, Arguments: a: int|float, b: int|float, Outputs: int|float

Creating Custom Tools

from tools.decorator import tool

@tool()
def fetch_weather(city: str) -> dict:
    """Fetch weather information for a given city."""
    # Implementation here
    return {"city": city, "temp": 22}

# The function is now a Tool object and can generate its own schema
print(fetch_weather.to_openai_format())

Building a Simple Agent

from llm.groq_client import GroqClient, LLMConfig
from tools.registry import ToolRegistry
from tools.toolkit.builtin import math_tools, string_tools

# Configure LLM
config = LLMConfig(
    model_name="llama-3.3-70b-versatile",
    temperature=0.7,
    max_tokens=2048
)
client = GroqClient(config)

# Register tools
registry = ToolRegistry()
registry.register_from_module(math_tools)
registry.register_from_module(string_tools)

# Create messages
messages = [
    {"role": "system", "content": f"You are a helpful assistant. Available tools:\n{registry.to_string()}"},
    {"role": "user", "content": "Calculate 15 * 7 and convert the result to uppercase string"}
]

# Generate with tools
response = client.generate(messages, tools=registry.to_client_tools(config.provider))
print(response)

Using the Unit Tester Agent

The library includes a practical example agent that generates and executes unit tests:

from agent.unit_tester.v2_scratchpad import ScratchpadUnitTesterAgent
from llm.groq_client import GroqClient, LLMConfig

config = LLMConfig(
    model_name="llama-3.3-70b-versatile",
    temperature=1.0,
    max_tokens=5000
)
client = GroqClient(config)

agent = ScratchpadUnitTesterAgent(client, max_iterations=20)

user_query = """
Write unit tests for tools/toolkit/web_explorer.py and run them.
Output test results in tools/llm_tests/ directory.
"""

state = agent.iterate(user_query=user_query)
print(state.messages[-1])  # Final report

Run the example:

python -m agent.examples.03_use_v2_agent

Web Automation with Browser Tools

from session import Session
from tools.registry import ToolRegistry
import tools.toolkit.web_explorer as web_tools

with Session("web-session") as session:
    registry = ToolRegistry(session.session_id)
    registry.register_from_module(web_tools)
    
    # Navigate to a URL
    status = registry.get("goto_url")("https://example.com")
    print(status)
    
    # Extract page content
    content = registry.get("get_page_content")(mode="text")
    print(content)
    
    # Take screenshot
    screenshot_data = registry.get("screenshot")(full_page=True)
    
    # Clean up
    registry.get("end_browsing_page")()

Example Workflow

Automated Unit Test Generation

Scenario: Generate pytest tests for a Python module, execute them, and report results.

Agent Workflow:

File Discovery: Agent lists project files using list_directory_files tool
Code Reading: Reads target module using read_file tool
Test Generation: Uses LLM to generate pytest test cases
Test Writing: Writes tests to file using write_file tool
Execution: Runs tests using run_pytest_tests tool
Analysis: Parses test results (pass/fail counts)
Iteration: If failures detected, reads error output and regenerates tests
Reporting: Returns structured JSON report with test summary

Example Output:

{
  "finished": true,
  "message": "10 tests passed, 2 failed in 1.86s",
  "scratchpad": "Generated tests for web_explorer.py. All core functions covered. Two edge case failures require mock adjustment."
}
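
The analysis and iteration steps depend on extracting pass/fail counts from a summary line such as the message above. A minimal sketch of that parsing, using a hypothetical parse_pytest_summary helper (the library's own parsing is not shown in this README):

```python
import json
import re

# Matches "10 passed", "10 tests passed", "2 failed", "1 error", "3 skipped".
SUMMARY_RE = re.compile(r"(\d+)\s+(?:tests?\s+)?(passed|failed|errors?|skipped)")


def parse_pytest_summary(line: str) -> dict[str, int]:
    """Extract outcome counts from a pytest-style summary line."""
    counts = {"passed": 0, "failed": 0, "error": 0, "skipped": 0}
    for number, label in SUMMARY_RE.findall(line):
        key = "error" if label.startswith("error") else label
        counts[key] = int(number)
    return counts


report = json.loads(
    '{"finished": true, "message": "10 tests passed, 2 failed in 1.86s"}'
)
counts = parse_pytest_summary(report["message"])

# Failures or errors would send the agent back to regenerate tests.
needs_retry = counts["failed"] > 0 or counts["error"] > 0
print(counts, needs_retry)
```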

Run Example:

python -m agent.examples.00_raw_unit_tester

With tracing:

python -m agent.examples.01_raw_traced_unit_tester

Project Structure

GenAI-Agent-Lab-Library-MiniMinds/
├── agent/                          # Agent framework
│   ├── base.py                     # Base agent class with iteration logic
│   ├── examples/                   # Example agent implementations
│   │   ├── 00_raw_unit_tester.py  # Raw agent without framework
│   │   ├── 01_raw_traced_unit_tester.py  # With Langfuse tracing
│   │   ├── 02_use_v1_agent.py     # Using simple agent class
│   │   └── 03_use_v2_agent.py     # Using scratchpad agent
│   └── unit_tester/                # Unit tester agent implementations
│       ├── v1_simple.py            # Full history retention
│       └── v2_scratchpad.py        # Context pruning strategy
├── llm/                            # LLM client abstractions
│   ├── base.py                     # Abstract LLM client interface
│   ├── config.py                   # Configuration with Pydantic
│   └── groq_client.py              # Groq implementation
├── tools/                          # Tool system
│   ├── base.py                     # Tool class definition
│   ├── decorator.py                # @tool() decorator
│   ├── registry.py                 # Tool registry and management
│   ├── main.py                     # Tool testing script
│   └── toolkit/                    # Tool collections
│       ├── web_explorer.py         # Browser automation tools
│       └── builtin/                # Built-in utility tools
│           ├── code_tools.py       # Python/pytest execution
│           ├── file_tools.py       # File system operations
│           ├── json_tools.py       # JSON validation
│           ├── math_tools.py       # Mathematical operations
│           └── string_tools.py     # String manipulation
├── prompts/                        # System prompts
│   ├── unit_tester_v1.txt          # Prompt for simple agent
│   └── unit_tester_v2.txt          # Prompt for scratchpad agent
├── browser_manager.py              # Playwright browser lifecycle
├── session.py                      # Session context manager
├── pyproject.toml                  # Project dependencies
└── README.md                       # This file

Limitations & Future Improvements

Current Limitations

LLM Provider Support: Only Groq is fully implemented; OpenAI/Gemini require client implementation
Error Recovery: Limited retry logic for tool execution failures
Parallel Execution: Tools execute sequentially; no concurrent tool calls
Memory Management: Context pruning is manual; no automatic summarization
Tool Validation: No runtime validation of tool outputs against schemas
Browser Isolation: Single browser instance per session; no headful mode option

Planned Improvements

Implement OpenAI and Gemini client adapters
Add automatic context summarization using LLM
Support streaming responses for real-time agent output
Implement tool output validation layer
Add multi-agent coordination primitives
Develop memory module for long-term agent state persistence
Create tool marketplace for community-contributed tools
Add structured logging for tool execution timeline
Implement automatic test generation for custom tools

Contributing Guidelines

Contributions are welcome. Please follow these guidelines:

Code Style: Follow PEP 8; use Pydantic for configuration models
Type Hints: All functions must have type annotations
Documentation: Docstrings required for all public methods
Testing: Add tests for new tools in tools/llm_tests/
Logging: Use Loguru for structured logging; avoid print statements
Tracing: Decorate new LLM/tool wrappers with @observe

Submitting Changes

Fork the repository
Create a feature branch: git checkout -b feature/your-feature
Commit changes with clear messages
Add tests and ensure existing tests pass
Submit a pull request with description of changes

Tool Development

When creating new tools:

Use the @tool() decorator
Provide clear docstrings (used as LLM descriptions)
Return JSON-serializable types (str, dict, list, int, bool)
Handle exceptions gracefully; return error dictionaries
Support session_id parameter if the tool requires state
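
A tool body that follows these guidelines might look like the sketch below. count_lines is a hypothetical example, and the "ok"/"error" keys are an assumed convention for error dictionaries, not one the library prescribes.

```python
import json
from pathlib import Path


# In the library this function would carry the @tool() decorator; it is left
# off here so the sketch runs without the package installed.
def count_lines(path: str) -> dict:
    """Count the lines in a UTF-8 text file."""
    try:
        text = Path(path).read_text(encoding="utf-8")
    except (OSError, UnicodeDecodeError) as exc:
        # Return an error dictionary instead of raising, so the agent loop can
        # pass the failure back to the LLM and keep going.
        return {"ok": False, "error": f"{type(exc).__name__}: {exc}"}
    return {"ok": True, "path": path, "lines": len(text.splitlines())}


result = count_lines("definitely_missing_file.txt")
print(json.dumps(result))  # the result is JSON-serializable by construction
```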

References

Learning Resources

HuggingFace Agent Course - Introduction to AI agents
Context Engineering by LangChain - Managing agent context
Multi-Agent Architectures - Conceptual overview
Advanced Context Engineering - Context optimization techniques

Research & Papers

How Long Contexts Fail - Understanding context window limitations

License

This project is part of a GenAI educational lab. License terms are not stated in this README; see the LICENSE file in the repository root.
