PromptGraph

A multi-agent AI system that transforms natural-language project ideas into complete prompt architectures. Given a description like "a quizzing platform for medical students", Prompter decomposes the project into modules, generates tailored system prompts with techniques like chain-of-thought and few-shot learning, designs inter-agent communication, scores quality across five dimensions, and outputs production-ready configuration files.

Built with LangGraph for orchestration and Groq for fast LLM inference.

Architecture

```text
Project Idea
     │
     ▼
┌──────────┐    ┌───────────┐    ┌────────────────────────┐
│ Analyzer │───▶│ Architect │───▶│ Communication Designer │
└──────────┘    └───────────┘    └────────────────────────┘
                                          │
                                          ▼
                                    ┌──────────┐
                              ┌────▶│ Packager │──▶ Output
                              │     └──────────┘
                              │
                         ┌────┴───┐     ┌─────────┐
                         │ Critic │◀───▶│ Refiner │
                         └────────┘     └─────────┘
                          (quality loop)
```

6-agent pipeline:

| Agent | Role |
| --- | --- |
| Analyzer | Decomposes the idea into a `ModuleMap` with domain classification and interaction model |
| Architect | Generates a `PromptArtifact` per AI module: technique selection, context slots, eval criteria |
| Communication Designer | Creates an `InterAgentMap`: shared memory schema, handoff conditions, triggers |
| Critic | Scores each prompt on 5 dimensions (clarity, compliance, robustness, creativity, measurability) |
| Refiner | Iteratively revises prompts that score below the quality threshold |
| Packager | Assembles final JSON config, Markdown spec, and Python scaffolding |

The Critic-Refiner loop runs up to 3 iterations (configurable), only revising prompts that haven't passed.
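
The loop described above can be sketched in plain Python. This is a minimal illustration, not the project's LangGraph implementation: `quality_loop`, `score`, and `revise` are hypothetical names, and the sketch assumes that one "iteration" means one round of revising and re-scoring the prompts that are still below the threshold.

```python
from typing import Callable


def quality_loop(
    prompts: dict[str, str],
    score: Callable[[str], float],
    revise: Callable[[str, float], str],
    threshold: float = 7.0,
    max_iterations: int = 3,
) -> tuple[dict[str, str], dict[str, float]]:
    """Score every prompt once, then revise only the ones below the threshold."""
    prompts = dict(prompts)  # work on a copy so the caller's dict is untouched
    scores = {name: score(text) for name, text in prompts.items()}

    for _ in range(max_iterations):
        failing = [name for name, value in scores.items() if value < threshold]
        if not failing:
            break  # everything passed, so stop early
        for name in failing:
            # Prompts that already passed are never sent back to the refiner.
            prompts[name] = revise(prompts[name], scores[name])
            scores[name] = score(prompts[name])

    return prompts, scores
```

With toy stand-ins such as `score=lambda text: float(len(text))` and `revise=lambda text, s: text + "xx"`, a prompt that starts at or above the threshold is returned unchanged, while a failing prompt is revised until it passes or the iteration cap is reached.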

Installation

```bash
# Clone
git clone https://github.com/nvmaditya/Prompter.git
cd Prompter

# Install with dev dependencies
pip install -e ".[dev]"

# Configure
cp .env.example .env
# Edit .env and add your Groq API key
```

Requirements: Python >= 3.11

Configuration

Create a `.env` file in the project root:

```bash
GROQ_API_KEY=your-api-key-here
GROQ_MODEL=llama-3.3-70b-versatile
```

Optional overrides:

| Variable | Default | Description |
| --- | --- | --- |
| `GROQ_MODEL` | `llama-3.3-70b-versatile` | Groq model to use (`llama-3.1-8b-instant` also supported) |
| `PROMPTER_QUALITY_THRESHOLD` | `7.0` | Minimum critic score (0-10) for a prompt to pass |
| `PROMPTER_MAX_ITERATIONS` | `3` | Max critic-refiner loop iterations |
| `PROMPTER_RATE_LIMIT_TIER` | `free` | `free` (adds delays between requests) or `paid` |
| `PROMPTER_LLM_MAX_TOKENS` | `4096` | Max response tokens per LLM call |
| `PROMPTER_VERBOSE` | `false` | Enable debug logging |
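
To make the defaults above concrete, here is a standard-library sketch of how these variables could be parsed. The project itself loads settings through pydantic-settings in `prompter/config.py`; the `Settings` dataclass, `load_settings` function, and the accepted boolean spellings below are assumptions made for illustration only, and the sketch does not read the `.env` file itself.

```python
import os
from dataclasses import dataclass
from typing import Mapping


@dataclass(frozen=True)
class Settings:
    """Hypothetical container mirroring the variables in the table above."""

    groq_api_key: str
    groq_model: str = "llama-3.3-70b-versatile"
    quality_threshold: float = 7.0
    max_iterations: int = 3
    rate_limit_tier: str = "free"
    llm_max_tokens: int = 4096
    verbose: bool = False


def load_settings(env: Mapping[str, str] = os.environ) -> Settings:
    """Build Settings from a mapping of environment variables."""
    tier = env.get("PROMPTER_RATE_LIMIT_TIER", "free")
    if tier not in ("free", "paid"):
        raise ValueError(f"PROMPTER_RATE_LIMIT_TIER must be 'free' or 'paid', got {tier!r}")

    threshold = float(env.get("PROMPTER_QUALITY_THRESHOLD", "7.0"))
    if not 0.0 <= threshold <= 10.0:
        raise ValueError("PROMPTER_QUALITY_THRESHOLD must be between 0 and 10")

    return Settings(
        groq_api_key=env["GROQ_API_KEY"],  # required: raises KeyError when missing
        groq_model=env.get("GROQ_MODEL", "llama-3.3-70b-versatile"),
        quality_threshold=threshold,
        max_iterations=int(env.get("PROMPTER_MAX_ITERATIONS", "3")),
        rate_limit_tier=tier,
        llm_max_tokens=int(env.get("PROMPTER_LLM_MAX_TOKENS", "4096")),
        # Assumed spellings for "true"; the real parser may accept others.
        verbose=env.get("PROMPTER_VERBOSE", "false").lower() in ("1", "true", "yes"),
    )
```

Passing a plain dict, for example `load_settings({"GROQ_API_KEY": "k"})`, returns the defaults listed in the table, which makes the override behaviour easy to check without touching the real environment.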

Usage

Generate prompts from an idea

```bash
prompter generate "a quizzing platform for medical students"
```

Read idea from a file

```bash
prompter generate path/to/idea.txt
```

Custom output directory

```bash
prompter generate "your project idea" -o ./my-output
```

Interactive mode (review after analysis)

```bash
prompter interactive "your project idea"
```

This pauses after the Analyzer stage to show you the module breakdown. You can approve, modify, or cancel before the pipeline continues.

Resume from checkpoint

If the pipeline fails partway through (e.g., rate limit), resume from the saved checkpoint:

```bash
prompter generate "your project idea" --resume .prompter_state/<run-id>
```

Output

Prompter generates three artifacts in the output directory:

`prompt_config.json`

Complete machine-readable configuration: all modules, prompts, context slots, communication maps, and quality scores. The file validates against the `FinalOutputArtifact` Pydantic schema.

`architecture_spec.md`

Human-readable Markdown document with:

- Project overview and module breakdown
- Full system prompts with technique explanations
- Inter-agent communication design
- Quality scores and improvement history

`scaffolding/`

A starter Python project:

```text
scaffolding/
├── prompts/           # System prompt .txt files per module
├── agents/            # Agent stub Python files
├── config.py          # Configuration template
├── main.py            # Orchestration entry point
└── README.md          # Setup instructions
```

Project Structure

```text
prompter/
├── prompter/
│   ├── agents/              # 6 agent implementations
│   ├── llm/                 # LLM client with 3-layer retry
│   ├── models/              # Pydantic data models
│   ├── output/              # JSON, Markdown, scaffold writers
│   ├── prompt_templates/    # System prompts for each agent
│   ├── utils/               # Checkpoint, logging, token counting
│   ├── cli.py               # Typer CLI
│   ├── config.py            # Settings via pydantic-settings
│   ├── graph.py             # LangGraph StateGraph builder
│   └── state.py             # PipelineState TypedDict
├── tests/
│   ├── unit/                # 5 test files (model, config, state, LLM, tokens)
│   ├── integration/         # 9 test files (all agents, graph, checkpoint, CLI)
│   └── regression/          # Real API smoke tests
├── docs/                    # PRD, SRS, architecture docs
├── .env.example
└── pyproject.toml
```

Testing

```bash
# Run all mocked tests (fast, no API calls): 190 tests
pytest tests/ -m "not slow" -v

# Run real API tests (requires valid GROQ_API_KEY): 4 tests
pytest tests/regression/ -m slow -v

# Run everything
pytest tests/ -v
```

The test suite covers:

- Unit tests: Pydantic models, LLM client retry layers, JSON extraction, token counting, state creation
- Integration tests: each agent end-to-end (mocked LLM), LangGraph routing, checkpoint round-trip, CLI validation
- Regression tests: full pipeline against the live Groq API

Technical Details

LLM Client (`prompter/llm/client.py`)

Three-layer retry:

1. HTTP transport: Tenacity exponential backoff for transient failures
2. Rate limiting: enforced delays between requests for the Groq free tier
3. Schema self-healing: when the LLM returns invalid JSON or the wrong structure, the client automatically re-prompts with the validation error and the schema
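
The way the three layers nest can be sketched with the standard library alone. This is not the code in `prompter/llm/client.py`: a hand-written backoff loop stands in for Tenacity, the rate limiter is simplified to a fixed pause before every request, and `call_with_retries`, `TransientError`, `send`, and `validate` are hypothetical names.

```python
import json
import time
from typing import Callable


class TransientError(Exception):
    """Hypothetical stand-in for a retryable transport failure."""


def call_with_retries(
    send: Callable[[str], str],
    prompt: str,
    validate: Callable[[dict], None],
    schema: str,
    *,
    min_interval: float = 2.0,
    max_transport_attempts: int = 3,
    max_heal_attempts: int = 2,
    sleep: Callable[[float], None] = time.sleep,
) -> dict:
    """Return a validated JSON object, retrying at three levels."""
    current = prompt
    for heal in range(max_heal_attempts + 1):
        raw = ""
        for attempt in range(max_transport_attempts):
            sleep(min_interval)  # layer 2: spacing between requests (simplified)
            try:
                raw = send(current)
                break
            except TransientError:
                if attempt == max_transport_attempts - 1:
                    raise  # transport retries exhausted
                sleep(2 ** attempt)  # layer 1: exponential backoff (1s, 2s, ...)
        try:
            data = json.loads(raw)  # JSONDecodeError is a subclass of ValueError
            validate(data)          # assumed to raise ValueError on a bad structure
            return data
        except ValueError as exc:
            if heal == max_heal_attempts:
                raise
            # Layer 3: re-prompt with the validation error and the schema.
            current = (
                f"{prompt}\n\nYour previous reply was rejected: {exc}\n"
                f"Return only JSON matching this schema:\n{schema}"
            )
    raise RuntimeError("unreachable: the loop either returns or raises")
```

Injecting `sleep` keeps the sketch testable without real delays. A production rate limiter would more likely track the time elapsed since the last request rather than always pausing for the full interval.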

The client uses a compact schema representation (`_compact_schema`) that resolves `$ref` references and strips metadata, reducing token usage by ~59% compared to raw `model_json_schema()`.
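
A rough idea of what such a compaction step involves is sketched below. It is not the project's `_compact_schema`: the function name `compact_schema`, the set of keys treated as metadata, and the assumption that references have the form `#/$defs/Name` (as Pydantic v2 emits them) are all illustrative choices, and recursive models are not handled.

```python
from typing import Any

# Assumed set of annotation-only keys; the real client may strip a different set.
METADATA_KEYS = {"title", "description", "default", "examples"}


def compact_schema(schema: dict[str, Any]) -> dict[str, Any]:
    """Inline $ref targets and drop metadata keys from a JSON Schema dict."""
    defs = schema.get("$defs", {})

    def walk(node: Any) -> Any:
        if isinstance(node, list):
            return [walk(item) for item in node]
        if not isinstance(node, dict):
            return node
        if "$ref" in node:
            # "#/$defs/Slot" -> "Slot"; recursive references would loop forever.
            name = node["$ref"].rsplit("/", 1)[-1]
            return walk(defs[name])
        out: dict[str, Any] = {}
        for key, value in node.items():
            if key == "$defs" or key in METADATA_KEYS:
                continue
            if key == "properties" and isinstance(value, dict):
                # Property names are data, not metadata: a field called
                # "title" must survive even though the "title" keyword is dropped.
                out[key] = {prop: walk(sub) for prop, sub in value.items()}
            else:
                out[key] = walk(value)
        return out

    return walk(schema)
```

The special case for `properties` matters because a model may legitimately have a field named `title` or `description`; filtering keys blindly would silently delete it from the schema sent to the LLM.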

Checkpoint & Resume

Pipeline state is checkpointed to `.prompter_state/<run-id>/pipeline_state.json` after each agent stage. All Pydantic models survive serialization round-trips via a model registry with `__pydantic__` markers.
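
The registry-plus-marker idea can be illustrated without any dependencies. In the sketch below, dataclasses stand in for Pydantic models, and the marker layout (`{"__pydantic__": <class name>, "data": {...}}`) as well as the names `MODEL_REGISTRY`, `register`, `dumps_state`, and `loads_state` are assumptions; the project's actual checkpoint format may differ. Nested models are not handled here.

```python
import json
from dataclasses import asdict, dataclass, is_dataclass
from typing import Any

# Maps a class name stored in the checkpoint back to the class itself.
MODEL_REGISTRY: dict[str, type] = {}


def register(cls: type) -> type:
    """Class decorator that makes a model restorable by name."""
    MODEL_REGISTRY[cls.__name__] = cls
    return cls


@register
@dataclass
class ModuleMap:  # dataclass stand-in for a Pydantic model
    domain: str
    modules: list[str]


def _encode(value: Any) -> Any:
    """json.dumps hook: wrap registered models in a marker object."""
    if is_dataclass(value) and not isinstance(value, type):
        return {"__pydantic__": type(value).__name__, "data": asdict(value)}
    raise TypeError(f"Cannot serialize {type(value).__name__}")


def _decode(obj: dict) -> Any:
    """json.loads hook: rebuild a model when the marker names a known class."""
    name = obj.get("__pydantic__")
    if name in MODEL_REGISTRY:
        return MODEL_REGISTRY[name](**obj["data"])
    return obj


def dumps_state(state: dict) -> str:
    return json.dumps(state, default=_encode)


def loads_state(text: str) -> dict:
    return json.loads(text, object_hook=_decode)
```

Because the marker records only the class name, a checkpoint can be restored only by a process in which every model it references has been registered, which is why a central registry is useful.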

License

MIT
