VibeFlow

🦌 Autonomous Engineering. Manifested Intent. 🌊

VibeFlow Logo

"We are no longer just programmers; we are Vibe Coders and Agentic Orchestrators." — Inspired by Andrej Karpathy, March 2026

VibeFlow is an experimental take on an autonomous engineering system that implements the Vibe Coding paradigm via DeerFlow. You write high-level intent in a vibe.md manifest (or vibe one up with your favorite assistant) — VibeFlow decomposes it into tasks, executes them via MCP tool calls, and self-corrects using the AutoResearch ratchet loop. It keeps iterating until your success metrics are met.

To work with Snowflake authentication and the Cortex REST API, we've forked the main DeerFlow project here: https://github.com/patreilly/deer-flow. If you're not using Snowflake for inference, use the main DeerFlow repository instead.


Directory Layout

home/vibeflow/
├── README.md                  ← you are here
├── .gitignore
├── vibeflow/                  ← VibeFlow source (package.json lives HERE)
│   ├── src/
│   ├── vibe.md
│   ├── math-helper-vibe.md
│   ├── docker-compose.deerflow.yml
│   └── ...
└── ../deer-flow/              ← patreilly/deer-flow fork (cloned alongside)

Common mistake: running npm start from vibeflow/ instead of vibeflow/vibeflow/.


Architecture

┌─────────────────────────────────────────────────────────────────────┐
│                          npm start                                   │
│                       (tsx src/index.ts)                            │
└────────────────────────────┬────────────────────────────────────────┘
                             │
              ┌──────────────▼──────────────┐
              │        index.ts             │
              │   Pipeline orchestration    │
              │   Interactive prompts       │
              │   Env var configuration     │
              └──┬──────────┬──────────┬───┘
                 │          │          │
    ┌────────────▼──┐  ┌────▼──────┐  └──────────────┐
    │ mcp-client.ts │  │ratchet.ts │         ┌────────▼────────┐
    │               │  │           │         │ orchestrator.ts  │
    │ Connects MCP  │  │ Quality   │         │                  │
    │ servers at    │  │ gate —    │         │ Task tree mgmt   │
    │ startup:      │  │ only      │         │ Checkpoint save/ │
    │  • filesystem │  │ commits   │         │   restore        │
    │  • context7   │  │ if pass   │         │ Short/long path  │
    │  • fetch      │  │ rate      │         │   routing        │
    │  • duckduckgo │  │ improved  │         │ File writes via  │
    │               │  │           │         │   fs.promises    │
    │ Audit log →   │  └───────────┘         └────────┬────────┘
    │ .vibeflow/    │                                  │
    │ logs/         │         ┌────────────────────────┘
    └───────────────┘         │
                   ┌──────────▼──────────┐
                   │    verifier.ts      │
                   │                     │
                   │ generateCriteria()  │
                   │  → LLM derives      │
                   │    AcceptanceCriteria│
                   │    from manifest    │
                   │                     │
                   │ verify()            │
                   │  • executable:      │
                   │    bash cmd exit 0  │
                   │  • evaluative:      │
                   │    LLM judge        │
                   │                     │
                   │ verifyAndFix()      │
                   │  Option B fix loop  │
                   │  (one persistent    │
                   │   DeerFlow thread)  │
                   └──────────┬──────────┘
                              │
                   ┌──────────▼──────────┐
                   │  deerflow-client.ts │
                   │                     │
                   │ HTTP client for     │
                   │ ByteDance DeerFlow: │
                   │  • POST /threads    │
                   │  • /runs/stream     │
                   │    (short path)     │
                   │  • /runs + polling  │
                   │    (long path)      │
                   └─────────────────────┘

Source files:

| File | Role |
|---|---|
| src/index.ts | Entry point — prompts, pipeline sequencing, env config |
| src/orchestrator.ts | Task tree, DeerFlow dispatch, checkpoint save/restore, file writes |
| src/verifier.ts | AcceptanceCriteria generation, criterion evaluation, fix loop |
| src/deerflow-client.ts | DeerFlow HTTP client (short-path streaming + long-path polling) |
| src/mcp-client.ts | MCP server connections, tool dispatch, audit logging |
| src/ratchet.ts | Quality gate — tracks pass rate history, prevents regressions |

Prerequisites

| Requirement | Version | Notes |
|---|---|---|
| Node.js | >= 20.x | ESM support required |
| Python | >= 3.9 | For research_loop.py |
| Docker | >= 24.x | Required to run DeerFlow |
| Anthropic API key | — | Or Snowflake Cortex credentials |

First-time Setup

VibeFlow runs on top of DeerFlow — ByteDance's orchestration engine.

To work with Snowflake authentication and the Cortex REST API, we've forked the main DeerFlow project here: https://github.com/patreilly/deer-flow.

The fork is built locally from source; docker-compose.deerflow.yml handles this automatically. If you have your own Anthropic or OpenAI API keys, clone the original repository and use vanilla DeerFlow.

1. Clone the DeerFlow fork (or the main repo if you're not using Snowflake!)

# Run from the directory that contains vibeflow/
git clone https://github.com/patreilly/deer-flow ../deer-flow

# The Snowflake Cortex provider lives on a feature branch — check it out
cd ../deer-flow
git checkout feature/snowflake-cortex-provider

2. Configure DeerFlow

# still in deer-flow directory

# Generate config.yaml from the template
# macOS: must use python3, not python
make config PYTHON=python3

3. Add a model to config.yaml

Open ../deer-flow/config.yaml and add at least one active model under models:. Without this the langgraph container crashes at startup.

Option A — Direct Anthropic API:

# ../deer-flow/config.yaml
models:
  - name: claude-sonnet-4-6
    display_name: Claude Sonnet 4.6
    use: deerflow.models.claude_provider:ClaudeChatModel
    model: claude-sonnet-4-6
    api_key: $ANTHROPIC_API_KEY
    max_tokens: 16384
    supports_vision: true
    supports_thinking: true
    when_thinking_enabled:
      thinking:
        type: enabled
export ANTHROPIC_API_KEY=sk-ant-...

Option B — Claude via Snowflake Cortex (PAT auth):

models:
  - name: snowflake-claude-sonnet
    display_name: Claude Sonnet 4.6 (Snowflake Cortex)
    use: deerflow.models.snowflake_claude_provider:SnowflakeClaudeChatModel
    model: claude-sonnet-4-6
    snowflake_account: $SNOWFLAKE_ACCOUNT
    snowflake_pat_token: $SNOWFLAKE_PAT_TOKEN
    max_tokens: 16384
    supports_vision: true
    supports_thinking: true
    when_thinking_enabled:
      thinking:
        type: enabled
export SNOWFLAKE_ACCOUNT=myorg-myaccount   # first segment only, no region suffix
export SNOWFLAKE_PAT_TOKEN=pat-...

Option C — Claude via Snowflake Cortex (key-pair JWT, auto-refreshes):

models:
  - name: snowflake-claude-sonnet
    display_name: Claude Sonnet 4.6 (Snowflake Cortex)
    use: deerflow.models.snowflake_claude_provider:SnowflakeClaudeChatModel
    model: claude-sonnet-4-6
    snowflake_account: $SNOWFLAKE_ACCOUNT
    snowflake_user: $SNOWFLAKE_USER
    snowflake_private_key_path: $SNOWFLAKE_PRIVATE_KEY_PATH
    snowflake_private_key_passphrase: $SNOWFLAKE_PRIVATE_KEY_PASSPHRASE  # omit if unencrypted
    max_tokens: 16384
    supports_vision: true
export SNOWFLAKE_ACCOUNT=myorg-myaccount
export SNOWFLAKE_USER=SERVICE_USER
export SNOWFLAKE_PRIVATE_KEY_PATH=/path/to/rsa_key.p8

4. Build and start DeerFlow

cd ../vibeflow/vibeflow    # where package.json lives

# Create extensions_config.json from the example BEFORE starting Docker.
# If this file is missing, Docker will create it as a directory and both
# containers will fail to start with "[Errno 21] Is a directory" errors.
cp ../../deer-flow/extensions_config.example.json ../../deer-flow/extensions_config.json

# First run — builds Docker images from source (takes a few minutes)
docker compose -f docker-compose.deerflow.yml up -d --build

# Verify both services are healthy
curl http://localhost:2024/ok    # → {"status":"ok"}
curl http://localhost:8001/health

5. Install VibeFlow dependencies

# Still in vibeflow/vibeflow/
npm ci    # clean install from package-lock.json (use npm install only when adding/updating packages)
uv sync   # installs Python deps from uv.lock (anthropic, etc.)

6. Run VibeFlow

VIBE_MANIFEST=math-helper-vibe.md npm start

Subsequent Runs

cd vibeflow/vibeflow

# Start DeerFlow (no rebuild needed unless code changed)
docker compose -f docker-compose.deerflow.yml up -d

# Rebuild after pulling fork updates
docker compose -f docker-compose.deerflow.yml up -d --build

# Run VibeFlow
VIBE_MANIFEST=math-helper-vibe.md npm start

Quick Start

# Run against the default vibe.md
npm start

# Custom manifest
VIBE_MANIFEST=my-project-vibe.md npm start

# Dry run — prints the task plan without executing anything
VIBE_DRY_RUN=1 VIBE_MANIFEST=my-project-vibe.md npm start

# Force a fresh run — clears checkpoints AND output directory without prompting
VIBE_FRESH=1 VIBE_MANIFEST=my-project-vibe.md npm start

# Skip the output directory prompt (useful in CI or scripted workflows)
VIBE_OUTPUT_DIR=/path/to/my-project VIBE_MANIFEST=my-project-vibe.md npm start

# Point at a remote DeerFlow instance
DEERFLOW_URL=http://my-server:2024 npm start

VibeFlow checks that DeerFlow is reachable on startup and exits immediately with an error if it isn't.

Python Research Loop

npm run research

# Or directly
python3 research_loop.py --manifest vibe.md --max-iterations 5
python3 research_loop.py --dry-run

The Vibe Manifest

vibe.md is the source of truth for your project. VibeFlow reads it before every task cycle. You can start with something vague and let VibeFlow make reasonable assumptions, or ask your favorite coding assistant to generate one for you with this minimum useful structure:

## Project Goal
Build a REST API for user authentication with JWT tokens.

## Success Metrics
| Metric | Target | Measurement Method |
|---|---|---|
| Tests pass | 100% | `npm test` |
| Builds clean | yes | `npm run build` |

## Deterministic Guardrails
1. No deleting files without confirmation
2. All changes within the project directory

## Current Phase
- [ ] Scaffold the project
- [ ] Implement /login endpoint
- [ ] Add JWT middleware

The unchecked items in Current Phase are the active work queue. See math-helper-vibe.md for a full example.
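As a sketch of how that active work queue might be extracted from the manifest — `parseActiveItems` is an illustrative helper, not VibeFlow's actual parser:

```typescript
// Illustrative sketch: extract unchecked "Current Phase" items from a manifest.
function parseActiveItems(manifest: string): string[] {
  // Take everything after the "## Current Phase" heading...
  const section = manifest.split(/^## Current Phase$/m)[1] ?? "";
  // ...and stop at the next "## " heading, if any.
  const body = section.split(/^## /m)[0];
  const items: string[] = [];
  for (const line of body.split("\n")) {
    const m = line.match(/^- \[ \] (.+)$/); // unchecked checkbox only
    if (m) items.push(m[1].trim());
  }
  return items;
}

const sample = `## Project Goal
Build a REST API.

## Current Phase
- [x] Scaffold the project
- [ ] Implement /login endpoint
- [ ] Add JWT middleware
`;
// parseActiveItems(sample) → ["Implement /login endpoint", "Add JWT middleware"]
```

Checked items (`- [x]`) are skipped, so completed phases drop out of the queue automatically.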


Acceptance Criteria

Before writing a single line of code, VibeFlow attempts to answer the question: "What does success look like for this project?" The answer becomes the AcceptanceCriteria — the specification that every task is built to satisfy, and that the verification phase evaluates after the build completes.

This works for any project type, not just code:

| Project type | Verification example |
|---|---|
| Web app | `npm run build` exits 0; `curl localhost:PORT` returns HTTP 200 |
| dbt pipeline | `dbt run && dbt test` both exit 0 |
| ML training script | `python train.py --smoke-test` exits 0; metrics.json has `loss` key |
| Rust CLI | `cargo build --release` exits 0; `./binary --help` exits 0 |
| Blog post | Word count ≥ 800; LLM judge confirms topic coverage and reading level |
| Research report | LLM judge confirms argument coherence and citation quality |

How criteria are generated

VibeFlow calls DeerFlow with the full manifest and asks it to derive criteria. Whether the manifest is a one-liner ("build me a todo app") or a comprehensive spec, the same process runs:

  1. Explicit criteria are extracted verbatim from the manifest (success metrics, guardrails, output requirements).
  2. Implicit criteria are inferred from domain knowledge — things any reasonable user would expect. These are flagged [inferred] and shown with the assumption the LLM made.
  3. Each criterion is assigned a weight (0–1, summing to 1.0) and a verification mode:
    • executable — verified by running a bash command; exit 0 = pass
    • evaluative — verified by an LLM judge reading the output files against a rubric
    • hybrid — both
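The weighting scheme above can be sketched in TypeScript — the type and field names here are assumptions based on this description, not VibeFlow's actual AcceptanceCriteria definition:

```typescript
// Hypothetical shape of a criterion — field names are illustrative,
// derived from the description above, not VibeFlow's actual types.
type VerificationMode = "executable" | "evaluative" | "hybrid";

interface AcceptanceCriterion {
  description: string;
  mode: VerificationMode;
  weight: number;               // 0–1; all weights sum to 1.0
  origin: "explicit" | "inferred";
  assumption?: string;          // shown for [inferred] criteria
  command?: string;             // bash command for executable/hybrid; exit 0 = pass
}

// Weighted pass rate: sum of the weights of all passing criteria.
function weightedPassRate(
  criteria: AcceptanceCriterion[],
  passed: (c: AcceptanceCriterion) => boolean,
): number {
  return criteria.reduce((acc, c) => acc + (passed(c) ? c.weight : 0), 0);
}

const criteria: AcceptanceCriterion[] = [
  { description: "npm run build exits 0", mode: "executable", weight: 0.4, origin: "explicit" },
  { description: "All core features implemented", mode: "evaluative", weight: 0.6, origin: "explicit" },
];
// If only the build passes, the weighted pass rate is 0.4, not 0.5 —
// weights, not counts, drive the score.
```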

The criteria review gate

Criteria are shown before anything is built. On an interactive terminal you'll see:

Acceptance Criteria
───────────────────
  Intent: Build a browser-based math practice app for 2nd-grade students using vanilla TypeScript and Vite

  1. [executable] npm install exits 0  w=0.10 [explicit]
  2. [executable] npm run build exits 0  w=0.15 [explicit]
  3. [executable] Dev server responds HTTP 200 on localhost:5173  w=0.20 [inferred]
       Assumed: a Vite project is expected to be locally runnable
  4. [evaluative] App implements all 4 core features from the manifest  w=0.30 [explicit]
  5. [evaluative] Guardrails respected (no negative language, no persistent storage)  w=0.25 [explicit]

Build to these criteria? [Y/n]

If the inferred intent is wrong, exit here, edit vibe.md, and re-run. This catches bad inference cheaply — before the build, not after.

Set VIBE_AUTO_APPROVE_CRITERIA=1 to skip the prompt in CI or scripted workflows.

The fix loop (Option B)

After all tasks complete, VibeFlow evaluates every criterion. If any fail, it opens one persistent DeerFlow thread and iterates:

Attempt 1: "These criteria failed: [details + file listing]. Fix them."
           → DeerFlow updates files
           → re-verify failed criteria

Attempt 2: "Attempt 1 still has failures: [new details]. Fix remaining."
           → same thread — LLM sees its previous attempt in context
           → re-verify

Attempt 3: (same pattern, last chance)

Because the thread persists across attempts, the LLM has full history of what it already tried and won't repeat the same approach. The final weighted pass rate goes into the ratchet as the real testPassRate.


Configuration

Environment Variables

DeerFlow connection:

| Variable | Default | Description |
|---|---|---|
| DEERFLOW_URL | http://localhost:2024 | LangGraph server |
| DEERFLOW_GATEWAY_URL | http://localhost:8001 | Gateway API |
| DEERFLOW_ASSISTANT_ID | lead_agent | Agent name to target |
| DEERFLOW_POLL_MS | 3000 | Background run poll interval |

VibeFlow behaviour:

| Variable | Default | Description |
|---|---|---|
| VIBE_MANIFEST | vibe.md | Path to intent manifest |
| VIBE_OUTPUT_DIR | (prompted) | Output directory — skips the interactive prompt |
| VIBE_DRY_RUN | false | Print task plan without executing |
| VIBE_MAX_TASKS | 20 | Max tasks per run |
| VIBE_FRESH | false | Clear checkpoints and output directory without prompting, then start fresh |
| VIBE_AUTO_APPROVE_CRITERIA | false | Skip criteria review prompt (CI / non-interactive use) |
| VIBE_SKIP_VERIFY | false | Skip the verify+fix phase entirely |
| VIBE_MAX_FIX_RETRIES | 10 | Max iterations of the criteria fix loop |
| VIBE_FEEDBACK | (unset) | Free-form feedback string — VibeFlow asks DeerFlow to update the manifest before running; shows a Refinement Log for confirmation |
| MCP_FILESYSTEM_SERVER | (unset) | Command to launch MCP filesystem server (legacy fallback) |
| MCP_SSE_URL | (unset) | URL for SSE-transport MCP server (legacy fallback) |

Snowflake Cortex (optional):

| Variable | Purpose |
|---|---|
| SNOWFLAKE_ACCOUNT | Account identifier — first segment only, e.g. myorg-myaccount |
| SNOWFLAKE_PAT_TOKEN | Programmatic Access Token (simplest auth) |
| SNOWFLAKE_USER | Username for key-pair JWT auth |
| SNOWFLAKE_PRIVATE_KEY_PATH | Path to PEM private key |
| SNOWFLAKE_PRIVATE_KEY_PASSPHRASE | Passphrase if key is encrypted |

Snowflake credentials are passed into the Docker containers automatically — set them in your shell before running docker compose.

Connecting MCP Servers

VibeFlow connects to MCP servers at startup for DeerFlow's tool use (fetching docs, searching the web, etc.). Configure them in vibeflow/mcp.config.json.

Output directory

VibeFlow prompts you for an output directory at the start of every run:

Where should VibeFlow write the generated project files?
  Enter an absolute path (/Users/you/projects/my-app)
  or a path relative to the current directory (../my-app, ./output).
  Press Enter to use the default: /Users/you/code/math-helper

Output directory:

The default is derived from your manifest filename — math-helper-vibe.md → ../math-helper. Press Enter to accept it, or type any path.
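That derivation can be sketched as follows — `defaultOutputDir` is an illustrative name, and the real logic may handle more cases:

```typescript
// Illustrative sketch of the default-output-directory derivation:
// strip an optional "-vibe" suffix and the ".md" extension, then
// place the directory alongside the current one.
function defaultOutputDir(manifestFilename: string): string {
  const base = manifestFilename.replace(/(-vibe)?\.md$/, "");
  return `../${base}`;
}
// defaultOutputDir("math-helper-vibe.md") → "../math-helper"
```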

To skip the prompt entirely (CI or scripted runs), set VIBE_OUTPUT_DIR:

VIBE_OUTPUT_DIR=../my-app VIBE_MANIFEST=my-project-vibe.md npm start

The {OUTPUT_DIR} token in mcp.config.json is substituted with the resolved path at runtime, so the filesystem MCP server always points at the same directory VibeFlow is writing to:

{
  "servers": [
    {
      "id": "filesystem",
      "type": "stdio",
      "command": "npx",
      "args": ["-y", "@modelcontextprotocol/server-filesystem", "{OUTPUT_DIR}"]
    }
  ]
}

VibeFlow creates the output directory automatically before starting MCP servers. No manual mkdir needed.
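The substitution itself amounts to a simple string replacement over the config before the servers launch; this sketch uses an illustrative `substituteOutputDir` helper, not VibeFlow's actual code:

```typescript
// Illustrative sketch: replace every {OUTPUT_DIR} token in the raw
// mcp.config.json text with the resolved output path.
function substituteOutputDir(rawConfig: string, outputDir: string): string {
  return rawConfig.split("{OUTPUT_DIR}").join(outputDir);
}

const raw = JSON.stringify({
  servers: [
    {
      id: "filesystem",
      type: "stdio",
      command: "npx",
      args: ["-y", "@modelcontextprotocol/server-filesystem", "{OUTPUT_DIR}"],
    },
  ],
});

const resolved = JSON.parse(substituteOutputDir(raw, "/Users/you/code/math-helper"));
// resolved.servers[0].args[2] → "/Users/you/code/math-helper"
```

Doing the replacement on the raw text (rather than walking the parsed object) means the token works anywhere in the config, not just in `args`.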

Popular servers pre-configured in the example:

| Server | Package | What it gives VibeFlow |
|---|---|---|
| filesystem | @modelcontextprotocol/server-filesystem | Read/write files in the project (uses {OUTPUT_DIR}) |
| context7 | @upstash/context7-mcp | Up-to-date library docs (resolves package → versioned docs) |
| fetch | @modelcontextprotocol/server-fetch | Fetch URLs and convert to Markdown |
| duckduckgo | mcp/duckduckgo (Docker) | Web search |

All npx-based servers run via npx -y — no global installs needed.

mcp.config.json is gitignored; the committed example config uses {OUTPUT_DIR} and works for any project out of the box.


How It Works

What DeerFlow provides

VibeFlow delegates all LLM work to DeerFlow, which runs as a local Docker service. Understanding what DeerFlow adds — versus what a direct API call would give you — explains why the architecture is structured the way it is.

Multi-step agent loop, not a single LLM call

When VibeFlow sends a task to DeerFlow it doesn't get back a single response. DeerFlow runs a LangGraph lead_agent graph that can make multiple model calls, reason through intermediate steps, use tools between steps, and decide when it's actually done. For a task like "implement the practice screen," the agent may read existing files, look up library docs, generate code, review it, and revise — all before returning.

Tool use during execution

DeerFlow can invoke MCP servers mid-task: reading and writing files via the filesystem server, fetching up-to-date library documentation via context7, and searching the web via DuckDuckGo. This means the agent isn't limited to its training data — it can look things up as it works.

Thread persistence = memory across retries

DeerFlow threads accumulate conversation history. The Option B fix loop relies on this: by the third fix attempt, the LLM has its two previous attempts and their failure details in context, so it doesn't repeat the same approach. Stateless API calls can't do this without manually managing and truncating conversation history.

Model provider abstraction

VibeFlow has no direct dependency on Anthropic, OpenAI, or Snowflake. DeerFlow's config.yaml handles authentication and provider-specific API formats. Switching providers is a config change — no VibeFlow code changes needed.

Where DeerFlow adds less value

For simple, deterministic tasks ("write a tsconfig.json") the multi-step agent loop is overhead you don't need. The short-path / long-path split exists partly for this reason — short tasks get a streaming response with a hard timeout; only tasks that genuinely benefit from the agent loop run as long-path background jobs.


Pipeline overview

VibeFlow runs a six-phase pipeline on every npm start:

manifest
  │
  ├─ 0. OUTPUT DIRECTORY
  │      Prompts for (or reads VIBE_OUTPUT_DIR / derives from manifest name).
  │      Creates the directory. Starts MCP servers with the path substituted in.
  │      VIBE_FRESH=1 wipes both checkpoints and the output directory first.
  │
  ├─ 1. DEFINE CRITERIA
  │      DeerFlow reads the manifest and produces AcceptanceCriteria.
  │      Explicit goals are extracted; implicit expectations are inferred and flagged.
  │      User reviews and confirms before anything is built.
  │      Criteria are stored in the checkpoint — resumed runs skip re-generation.
  │
  ├─ 2. DECOMPOSE
  │      DeerFlow decomposes the manifest into a task tree, with the acceptance
  │      criteria injected as context so tasks are built to satisfy known standards.
  │
  ├─ 3. EXECUTE TASKS
  │      Each task runs via DeerFlow (short path: streaming; long path: background poll).
  │      Files emitted as <vibeflow:file> blocks are written directly to the output
  │      directory via fs.promises. Checkpoints are saved between tasks.
  │
  ├─ 4. VERIFY + FIX
  │      Every acceptance criterion is evaluated against the output:
  │        • executable criteria → bash command (cwd = output dir), exit 0 = pass
  │        • evaluative criteria → LLM judge reads artifacts, returns PASS/FAIL + reason
  │      If any fail, a single DeerFlow thread is opened for the fix loop.
  │      Each fix attempt appends to the same thread (Option B) — the LLM sees
  │      its full history and doesn't repeat failed approaches.
  │      Up to VIBE_MAX_FIX_RETRIES (default 10) iterations.
  │
  └─ 5. RATCHET
         The weighted pass rate from verification (not fake task completion %)
         is proposed to the ratchet. Only commits if metrics improved.

Bifurcated execution

Tasks are routed by the LLM's estimated duration:

  • Short path — tasks expected to finish in < 30 s run synchronously with a streaming timeout guard
  • Long path — longer tasks run async with periodic checkpointing; send SIGINT to pause cleanly and resume on next start
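The routing rule above reduces to a single threshold check — `routeTask` is a hypothetical name for illustration:

```typescript
// Minimal sketch of the short/long routing rule described above.
type ExecutionPath = "short" | "long";

function routeTask(estimatedSeconds: number): ExecutionPath {
  // Tasks expected to finish in under 30 s take the streaming short path;
  // everything else runs as a long-path background job with checkpointing.
  return estimatedSeconds < 30 ? "short" : "long";
}
// routeTask(5) → "short"; routeTask(120) → "long"
```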

Retries

VibeFlow has six distinct retry and recovery layers — each targeting a different failure mode.

1. Short-path task timeout → long-path promotion

Short-path tasks (estimated < 30 s) run synchronously with a 120 s wall-clock timeout. If a task exceeds that, retrying with the same approach would just time out again. Instead VibeFlow promotes the task to the long path, switching to a background run with polling and no hard deadline:

Short-path task starts (streaming, 120 s limit)
  ↓ timeout fires
  ↑ promoted to long-path background run (no hard deadline, exponential backoff)

You will see timed out — promoting to long-path background run… in the console when this happens.

2. Short-path task non-timeout error → immediate retry

For errors that are not timeouts (bad response, parse failure, HTTP error), the short-path retries up to task.maxRetries times on the same streaming approach. These are transient failures where re-trying the same method is reasonable.

3. Long-path task error → exponential backoff retry

Long-path background runs retry on any failure with exponential backoff:

| Attempt | Wait before retry |
|---|---|
| 1 → 2 | 2 s |
| 2 → 3 | 4 s |
| 3 → 4 | 8 s |

After task.maxRetries attempts the task is marked FAILED.
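The schedule above doubles the wait on each attempt. A minimal sketch — `withRetries` is an assumed wrapper for illustration, not VibeFlow's actual implementation:

```typescript
// Backoff schedule from the table above:
// attempt 1 → 2000 ms, attempt 2 → 4000 ms, attempt 3 → 8000 ms.
function backoffMs(attempt: number): number {
  return 2000 * 2 ** (attempt - 1);
}

// Illustrative retry wrapper: retry on any failure with the backoff above,
// then give up (the task would be marked FAILED) after maxRetries attempts.
async function withRetries<T>(fn: () => Promise<T>, maxRetries = 3): Promise<T> {
  for (let attempt = 1; ; attempt++) {
    try {
      return await fn();
    } catch (err) {
      if (attempt >= maxRetries) throw err;
      await new Promise((resolve) => setTimeout(resolve, backoffMs(attempt)));
    }
  }
}
```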

4. Mandatory dependency failure → one upstream retry

If a mandatory upstream task fails and a downstream task is about to run, VibeFlow retries the upstream task once before deciding whether to skip the dependent. This handles the case where an intermittent failure earlier in the graph would otherwise cascade into skipping large parts of the tree.

Task B depends on Task A (mandatory)
  Task A → FAILED
  VibeFlow retries Task A
    → COMPLETED: Task B runs normally
    → FAILED again: Task B is skipped with [SKIPPED] prefix

5. Inactivity detection on streaming connections

The short-path streaming connection has a secondary check independent of the wall-clock timeout: if no SSE chunk arrives from DeerFlow for 60 seconds, the stream is considered stalled and aborted. This catches silently dropped TCP connections quickly rather than waiting for the full 120 s limit.

A slow-but-active stream (tokens arriving at any rate) keeps the timer resetting and is never interrupted by this check.

6. Criteria fix loop → progress-gated retries

After all tasks complete, failed acceptance criteria enter the fix loop (see The fix loop). This loop is not purely count-driven — it also has stuck detection:

  • Keeps retrying as long as the weighted pass rate improves between attempts
  • Stops early after 2 consecutive attempts with no improvement (stuck)
  • Hard cap at VIBE_MAX_FIX_RETRIES (default: 10) attempts

This means a run that steadily improves from 40% → 60% → 80% → 100% uses all the attempts it needs, while a run that hits a ceiling at 60% after 2 tries stops spending token budget on a dead end.
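A minimal sketch of that progress-gated loop — the names here are illustrative, not VibeFlow's actual code:

```typescript
// Illustrative sketch: retry while the weighted pass rate improves,
// stop after 2 consecutive attempts with no improvement, with a hard cap.
// `attemptFix` stands in for one full fix-loop iteration (fix + re-verify).
function runFixLoop(
  attemptFix: () => number, // returns weighted pass rate after the attempt
  maxRetries = 10,          // VIBE_MAX_FIX_RETRIES
): number {
  let best = 0;
  let stalled = 0;
  for (let i = 0; i < maxRetries; i++) {
    const rate = attemptFix();
    if (rate >= 1) return rate;      // all criteria pass — done
    if (rate > best) {
      best = rate;
      stalled = 0;                   // progress resets the stuck counter
    } else if (++stalled >= 2) {
      break;                         // stuck: no improvement twice in a row
    }
  }
  return best;
}

// A run that plateaus at 0.6 stops after 4 attempts instead of burning all 10:
const rates = [0.4, 0.6, 0.6, 0.6, 0.6];
let calls = 0;
const finalRate = runFixLoop(() => rates[Math.min(calls++, rates.length - 1)]);
// finalRate → 0.6 after 4 attempts (improve, improve, stall, stall)
```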


The Ratchet

A physical ratchet only turns one direction. VibeFlow's ratchet applies the same constraint to progress: each run can only advance the baseline — it can never silently make things worse.

This matters because autonomous agents are generative. Without a gate, a run that produces lower-quality output than the previous one would overwrite the good state and nobody would notice until much later. The ratchet makes regression impossible to ignore.

Lifecycle

Every run goes through the same three steps:

1. PROPOSE   After verification, VibeFlow submits the measured pass rate
             as a candidate MetricSnapshot. Nothing is committed yet.

2. EVALUATE  The ratchet compares the proposal against the last committed
             baseline across all tracked metrics.

3. COMMIT    If at least one metric improved and none regressed beyond the
  or         tolerance → the proposal becomes the new baseline.
  ROLLBACK   If any metric regressed → the proposal is discarded and the
             previous baseline is restored.

The first run always commits — there is no prior baseline to regress from.

What is tracked

| Metric | Direction | What it measures |
|---|---|---|
| testPassRate | higher is better | Weighted pass rate from criteria evaluation |
| apiLatencyP95Ms | lower is better | 95th-percentile latency of DeerFlow calls |
| criticalSecurityFindings | lower is better | Security issues found in generated output |
| ratchetCommitRate | higher is better | % of all proposals that have been committed |

All metrics are normalised internally so the comparison logic is uniform regardless of direction.

Regression tolerance

A 2% tolerance prevents the ratchet from blocking on noise. If testPassRate drops from 95 to 94.1 between runs, that is within tolerance and does not count as a regression. A drop from 95 to 80 does.
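A sketch of that check, assuming the tolerance is applied relative to the baseline value and per metric (the actual normalisation may differ):

```typescript
// Illustrative regression check: a metric regresses only if it moves in
// the "worse" direction by more than `tolerance` (2%) of the baseline.
function isRegression(
  baseline: number,
  proposal: number,
  higherIsBetter: boolean,
  tolerance = 0.02,
): boolean {
  // "drop" is movement in the bad direction, whichever direction that is.
  const drop = higherIsBetter ? baseline - proposal : proposal - baseline;
  return drop > Math.abs(baseline) * tolerance;
}

// testPassRate 95 → 94.1: drop of 0.9 is under 95 × 0.02 = 1.9 → not a regression
// testPassRate 95 → 80: drop of 15 exceeds 1.9 → regression, proposal discarded
```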

The commit rate as a health signal

ratchetCommitRate is itself a tracked metric — it measures what fraction of all proposals have been committed over the lifetime of the project. A high commit rate means the agent is consistently making progress. A declining commit rate is an early warning that the system is struggling to improve and warrants investigation.

Rollback stack

The ratchet keeps the last 20 committed baselines on a rollback stack. If a run commits and is later found to be bad, you can restore a known-good state without losing the full history.

State is persisted to .vibeflow/ratchet/ratchet_state.json after every propose, commit, and rollback, so the baseline survives process restarts.

MCP Tool Calls

All file and code operations go through the MCP client:

  • Retry logic — 3 attempts with exponential backoff
  • Rate limiting — token-bucket (configurable RPS)
  • Audit logging — every call appended to .vibeflow/mcp-audit.jsonl
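The rate limiter can be sketched as a classic token bucket — illustrative only, VibeFlow's actual limiter may differ in detail:

```typescript
// Minimal token-bucket sketch: `capacity` tokens, refilled at `rps` per second.
// Each MCP call consumes one token; a call with no token available is throttled.
class TokenBucket {
  private tokens: number;
  private last: number;

  constructor(private capacity: number, private rps: number, now = Date.now()) {
    this.tokens = capacity;
    this.last = now;
  }

  tryAcquire(now = Date.now()): boolean {
    // Refill proportionally to elapsed time, capped at capacity.
    const elapsedSec = (now - this.last) / 1000;
    this.tokens = Math.min(this.capacity, this.tokens + elapsedSec * this.rps);
    this.last = now;
    if (this.tokens >= 1) {
      this.tokens -= 1;
      return true;
    }
    return false;
  }
}

// A bucket of capacity 3 at 1 rps allows a burst of 3, then throttles;
// after 1000 ms one token has refilled and a call is allowed again.
```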

Dependency classification

When DeerFlow decomposes a manifest into tasks it classifies each one as mandatory or optional.

  • Mandatory (default) — the task's output is required for the overall goal. If a mandatory task fails, any downstream task that depends on it will attempt one full retry of the blocking task before deciding whether to skip.
  • Optional — the task is an enhancement or polish step. If it fails or is skipped, downstream tasks are told the impact and continue anyway.

At runtime you will see one of three prefixes in the console output:

| Prefix | Meaning |
|---|---|
| [MANDATORY DEP RETRY] | A required upstream task failed; VibeFlow is re-running it before proceeding |
| [MANDATORY DEP RETRY OK] | The retry succeeded; the dependent task will now run |
| [OPTIONAL DEP SKIPPED] | A non-critical task was skipped; the impact is printed and execution continues |
| [SKIPPED] | A task was skipped because one or more mandatory dependencies could not be resolved even after a retry |

Starting fresh vs. resuming

Checkpoints are written to .vibeflow/checkpoints/. On the next npm start, VibeFlow detects them and offers to resume or start fresh.

Use VIBE_FRESH=1 to skip the prompt and always start clean — it wipes both the checkpoints and the output directory:

VIBE_FRESH=1 VIBE_MANIFEST=my-project-vibe.md npm start

Or answer y at the interactive prompt when asked "Clear checkpoints and start fresh?".

Because npm start runs tsx src/index.ts directly (no compiled binary), source changes take effect immediately on the next run — no build step needed.

Stopping a long-path run

# Graceful stop — finishes current task, saves checkpoint
kill -SIGINT <pid>

The next npm start restores from the latest checkpoint automatically.

For a deeper dive into the source code internals, see vibeflow/ARCHITECTURE.md.


Troubleshooting

| Symptom | Fix |
|---|---|
| `make config` fails: `python: No such file or directory` | Use `make config PYTHON=python3` |
| npm error: `Could not read package.json` | Wrong directory — `cd vibeflow/vibeflow/` first |
| deer-flow-langgraph crashes on startup | `models:` in config.yaml is empty — add at least one model entry |
| Containers fail with `[Errno 21] Is a directory: '.../extensions_config.json'` | extensions_config.json was missing when Docker first ran, so Docker created it as a directory. Fix: `rm -rf ../deer-flow/extensions_config.json && cp ../deer-flow/extensions_config.example.json ../deer-flow/extensions_config.json`, then `docker compose -f docker-compose.deerflow.yml down && docker compose -f docker-compose.deerflow.yml up -d` |
| `[SKIPPED]` shown for many tasks | A mandatory upstream task failed and its retry also failed — check the `[MANDATORY DEP RETRY]` error above it for the root cause |
| `[OPTIONAL DEP SKIPPED]` shown | Expected behaviour — the optional task didn't produce output but execution continues; check the impact line to decide if you need to fix it |
| First `--build` hangs for minutes | Normal — Docker is installing Python deps; wait it out |
