Bugsterapp · faculopezscala · Apr 6, 2026
diff --git a/CHANGELOG.md b/CHANGELOG.md
@@ -0,0 +1,117 @@
+# Changelog
+
+All notable changes to sow are documented here. The format is loosely based on
+[Keep a Changelog](https://keepachangelog.com/en/1.1.0/), and the project follows
+[Semantic Versioning](https://semver.org/spec/v2.0.0.html).
+
+## [Unreleased]
+
+The next release lands the launch positioning ("Stop letting Claude touch your
+prod database") plus the rest of the eng-review plan as five parallel PRs:
+
+### Added (planned)
+
+- **`sow sandbox`** — flagship zero-config command. Auto-detects your project's
+  Postgres source, samples + sanitizes, spins up a local sandbox, and patches
+  `.env.local` with the new `DATABASE_URL`. One command from clone to working
+  sandbox. (PR #4)
+- **`sow env revert`** — restores `.env.local` from the `.env.local.sow.bak`
+  backup that `sow sandbox` writes. (PR #4)
+- **JSONB sanitization.** sow now walks JSONB columns recursively and replaces
+  values whose key matches a PII pattern. Closes the biggest PII leak vector in
+  modern Postgres schemas. (PR #3)
+- **Postgres type coverage.** Built-in transformers for `inet`, `cidr`,
+  `macaddr`, `macaddr8`, plus passthrough handling for `bytea`, `xml`, `money`,
+  `interval`, range types, array types, and custom enums. (PR #3)
+- **`--allow-unsafe` flag.** sow's sanitizer is now fail-closed: it aborts
+  `sow connect` if it sees a Postgres type it can't verify. Pass `--allow-unsafe`
+  to NULL out unhandled columns instead. (PR #3)
+- **`sow doctor <connector>`** — drill into a single connector's referential
+  integrity warnings. Surfaces orphaned FKs, transient read errors, and
+  sanitization warnings. (PR #6)
+- **Tag-driven release workflow.** New `version-bump.yml` workflow lets you cut
+  a major/minor/patch/prerelease via the GitHub Actions UI; the existing
+  `release.yml` is now triggered only by tag pushes (not every merge to main).
+  Prevents accidental releases on README typos. (PR #5)
+
+### Changed (planned)
+
+- **`sow branch reset` is now sub-second** on a 10k-row dataset. Refactored the
+  Docker provider to use Postgres template databases (one long-lived container
+  per connector, N branch databases inside). Old reset path was 5-15s; new path
+  is ~200-800ms. Enables tight agent reset loops (50 iterations in a minute).
+  (PR #2)
+- **Sampler integrity warnings** — the referential-integrity pass now collects
+  structured warnings (`parent_fetch_failed`, `parent_not_found`,
+  `child_fetch_failed`, `implicit_ref_fetch_failed`) instead of silently
+  swallowing them in `catch {}` blocks. Surfaced via `sow doctor <connector>`.
+  (PR #6)
+- **Implicit reference resolution is now batched.** The sampler used to fire
+  one query per (source_table, source_column) pair when resolving implicit FKs;
+  it now collects missing ids by target table across all sources and fires one
+  `IN (...)` query per target. ~10x reduction in `sow connect` round-trips on a
+  50-table schema. (PR #6)
+- **Skip-list for implicit references is now dynamic.** The old hardcoded
+  English-only `["id", "user_id", "owner_id", "created_by"]` set is replaced
+  with a dynamic check against the actual formal Relationships from the
+  schema. Works for non-English column names and unusual FK layouts. (PR #6)
+- **MCP tool count corrected.** Package descriptions now state 22 tools
+  (previously listed as 15).
+- **README repositioned** around "Stop letting Claude touch your prod database"
+  with new sections on the agent reset loop, the cookbook of three workflows,
+  and a docs index.
+
+## [0.1.14] — 2026-04-06
+
+### Fixed
+
+- **SQL injection across the sampler and branching layer (security).** A class
+  of bugs where dynamic SQL was built by string-interpolating values from
+  sampled source data has been closed. Seven call sites parameterized:
+  - `packages/core/src/sampler/referential.ts` — three formal-FK and
+    implicit-reference call sites (a text PK like `O'Brien` used to fail
+    silently and drop the parent row; now covered by a regression test)
+  - `packages/core/src/branching/manager.ts:getBranchSample` — the `table`
+    argument from user/agent input is now `quoteIdent`-quoted, the `limit` is
+    bound via `$1`
+  - `packages/core/src/branching/providers/supabase.ts:fetchAuthUserMappings`
+    — the `IN (...)` clause now uses `$1, $2, ...` placeholders, batched at
+    1000 ids per query, with UUID-shape pre-filter
+  - `packages/core/src/branching/supabase.ts` — eight RLS DDL and auth-user
+    INSERT/DELETE sites now use parameterized values and `quoteIdent`
+    identifiers
+- **`packages/core/src/adapters/postgres.ts`** — the `query()` method's
+  `params` argument was previously declared in the interface but silently
+  dropped at runtime (`_params?: unknown[]`). Now actually passes through to
+  `postgres@3`'s `sql.unsafe(query, parameters)` for real bind-parameter
+  safety.
+- **Fail-safe RLS setup in the Supabase provider.** A previous structure
+  could DISABLE row-level security on a table when a transient introspection
+  error occurred during sandbox setup. RLS introspection now lives in its own
+  per-table try block that `continue`s on error rather than falling into the
+  policy-disable fallback path.
+- **Identifier quoting helper** — new `packages/core/src/sql/identifiers.ts`
+  exports `quoteIdent()`, the SQL-standard double-quote escape used wherever
+  table or column names are interpolated into dynamic SQL. Throws on empty
+  identifiers and embedded NUL bytes.
+- **`sow branch sample` limit clamping** — accepts `LIMIT 0` (a valid request
+  for an empty result set), falls back to the documented default of 5 for
+  non-finite inputs, and clamps the upper bound at 100.
+
+### Tests
+
+- 89 unit tests passing. 10 new regression tests in
+  `packages/core/src/sampler/referential.test.ts` covering `quoteIdent`
+  edge cases, the `O'Brien` single-quote regression, composite FK
+  parameterization, and hostile-payload defense.
+- Cross-model adversarial review (Claude + Codex) — both passes clean,
+  Codex structured P1 gate passed.
+
+## [0.1.13] — earlier
+
+Initial public release. Functional CLI, MCP server, Docker-backed branches,
+deterministic PII sanitization, schema introspection, edge-case sampling,
+checkpoint save/load, branch diff. Auto-detection from env files and the
+common ORMs (Prisma, Drizzle, Knex, TypeORM, Sequelize, Docker Compose).
+Provider hints for Supabase, Neon, Vercel Postgres, and Railway.
+</content>
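Note on the 0.1.14 entry above: the `quoteIdent` helper and the `sow branch sample` limit clamping are described only in prose. The following is a minimal TypeScript sketch of how the two could combine in a `getBranchSample`-style query builder. It is not the code from this PR: `clampLimit`, `buildSampleQuery`, and the error messages are hypothetical names chosen for illustration, the treatment of negative limits is an assumption, and the real helpers in `packages/core/src/sql/identifiers.ts` and `packages/core/src/branching/manager.ts` may differ.

```typescript
// Hypothetical sketch, not the PR's implementation.

/**
 * Quote a SQL identifier the SQL-standard way: wrap it in double quotes and
 * double any embedded double quote. Rejects empty names and NUL bytes, as the
 * changelog entry describes.
 */
function quoteIdent(name: string): string {
  if (name.length === 0) {
    throw new Error("identifier must not be empty");
  }
  if (name.includes("\0")) {
    throw new Error("identifier must not contain NUL bytes");
  }
  return `"${name.replace(/"/g, '""')}"`;
}

const DEFAULT_LIMIT = 5; // documented default for non-finite input
const MAX_LIMIT = 100; // documented upper bound

/**
 * Clamp a requested row limit. 0 is allowed (empty result set), non-finite
 * input falls back to the default, and the upper bound is 100.
 * Assumption: negative values clamp to 0 and fractions are truncated.
 */
function clampLimit(raw: number): number {
  if (!Number.isFinite(raw)) {
    return DEFAULT_LIMIT;
  }
  return Math.min(MAX_LIMIT, Math.max(0, Math.trunc(raw)));
}

interface SampleQuery {
  text: string;
  params: number[];
}

/**
 * Build the sample query: the table name is quoted as an identifier (it
 * cannot be a bind parameter), the limit is bound as $1.
 * Note: a schema-qualified name like "public.users" would need each part
 * quoted separately; this sketch treats the input as a single identifier.
 */
function buildSampleQuery(table: string, limit: number): SampleQuery {
  return {
    text: `SELECT * FROM ${quoteIdent(table)} LIMIT $1`,
    params: [clampLimit(limit)],
  };
}
```

The split matters because Postgres bind parameters can carry values but not identifiers, so the table name has to be escaped while the limit can travel as `$1`. A hostile table name such as `users; DROP TABLE x` ends up inside one quoted identifier rather than being executed as a second statement.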
diff --git a/README.md b/README.md
@@ -9,7 +9,7 @@
   ╚══════╝ ╚═════╝  ╚══╝╚══╝
 ```
 
-**Safe test databases from production Postgres.**
+**Stop letting Claude touch your prod database.**
 
 [![GitHub stars](https://img.shields.io/github/stars/Bugsterapp/sow)](https://github.com/Bugsterapp/sow)
 [![npm version](https://img.shields.io/npm/v/@sowdb/cli)](https://www.npmjs.com/package/@sowdb/cli)
@@ -20,46 +20,52 @@
 
 </div>
 
-sow connects to your production Postgres, samples representative data with edge cases, replaces all PII with realistic fakes, and gives you isolated database branches that start in seconds. 100% local, zero API calls, zero cost.
+You're using Claude Code or Cursor against a real codebase with a real database. Every time the agent is about to do something database-adjacent, you feel that quiet pang of "wait, should I let it do that?"
+
+sow is the safety layer. One command points it at your prod Postgres, samples the data, replaces detected PII with realistic fakes, and gives your coding agent a sandboxed local copy to hammer. Prod is only ever read, never written. The sandbox starts in seconds and resets in under one. 100% local. Zero API calls. Zero cost.
 
 ## Install & First Use
 
 ```bash
 npm install -g @sowdb/cli
-sow connect postgresql://user:pass@host:5432/mydb
-sow branch create my-feature
-# -> postgresql://sow:sow@localhost:54320/sow
+cd your-project
+sow sandbox
 ```
 
-## Why sow?
+`sow sandbox` auto-detects your database from your project's env files, samples it, sanitizes PII, and patches `.env.local` with a safe `DATABASE_URL`. Now any coding agent on your laptop talks to the sandbox instead of prod.
+
+## Why sow
 
-- **PII Safe** — All personal data is detected and replaced with realistic fakes.
-- **Agent-First** — MCP server, `--json` mode, SKILL.md for agent context.
-- **Fast** — First snapshot in 30-60s. Branches in ~5s. Resets in ~1s.
-- **Checkpoints** — Save and restore branch state instantly.
-- **Diff** — See exactly what changed: rows added, deleted, modified, schema changes.
-- **Deterministic** — Same seed produces identical output every time.
-- **Read-Only** — sow never writes to your source database.
-- **Auto-Detect** — Scans .env files, Prisma, Drizzle, Knex, TypeORM, Sequelize, Docker Compose.
+- **Built for coding agents.** MCP server with 22 tools, `--json` mode for every command, `SKILL.md` for agent context, deterministic seeds so bugs reproduce across sessions.
+- **PII-safe by default.** Detects emails, phones, names, addresses, SSNs, and PII embedded in JSONB fields. Fail-closed: aborts if it sees a Postgres type it can't verify, with `--allow-unsafe` to override explicitly.
+- **Reset in under 1 second.** Postgres template-database backed. Your agent can try a destructive change, verify the result, reset, try again — 50 iterations in a minute.
+- **Zero config.** Auto-detects env files, Prisma, Drizzle, Knex, TypeORM, Sequelize, Docker Compose. Identifies Supabase, Neon, Vercel Postgres, and Railway projects.
+- **Read-only on the source.** sow never writes to your production database. Parameterized queries, identifier escaping, and a security-audited code path verified by both Claude and Codex adversarial review.
+- **100% local.** No cloud round-trip, no third party holding your sanitized data, no account, no API key. The sandbox lives on your laptop.
 
 ## Quick Start
 
 ```bash
+# Zero-config: detect your DB, sample, sanitize, patch .env.local
+sow sandbox
+
+# Or do it explicitly
 sow connect postgresql://user:pass@host:5432/mydb   # analyze, sample, sanitize
 sow branch create my-feature                         # isolated Postgres in ~5s
 DATABASE_URL=postgresql://sow:sow@localhost:54320/sow npm run dev
-sow branch diff my-feature                           # see what changed
+sow branch reset my-feature                          # back to seed state in <1s
+sow branch diff my-feature                           # see what your agent changed
 sow branch delete my-feature                         # clean up
 ```
 
 ## For AI Agents
 
 ```bash
 npm install -g @sowdb/mcp
-sow mcp --agent cursor          # or claude-code, windsurf, codex
+sow mcp --agent claude-code          # or cursor, windsurf, codex
 ```
 
-Or add manually to your MCP config:
+Or add to your MCP config manually:
 
 ```json
 {
@@ -75,26 +81,50 @@ Install the agent skill for context:
 npx skills add Bugsterapp/sow
 ```
 
+The MCP server exposes 22 tools: `sow_sandbox`, `sow_connect`, `sow_detect`, `sow_branch_create`, `sow_branch_reset`, `sow_branch_diff`, `sow_branch_save`, `sow_branch_load`, `sow_branch_exec`, `sow_branch_users`, `sow_branch_tables`, `sow_branch_sample`, and more. Every tool returns structured JSON. Agents drive the full sample → branch → exec → diff → reset loop without a human in the middle.
+
 ## How It Works
 
 ```
-Production DB          sow Pipeline              Local Branches
+Production DB          sow Pipeline              Local Sandbox
 
  ┌──────────┐     ┌──────────────────────┐     ┌──────────────┐
  │ Schema   │     │  1. Analyze          │     │ Branch A     │
- │ Stats    │────>│  2. Sample (N rows)  │────>│  :54320      │
+ │ Stats    │────>│  2. Sample (N rows)  │────>│  :54320/A    │
  │ Data     │     │  3. Sanitize PII     │     │              │
  │ (read    │     │  4. Save snapshot    │     │ Branch B     │
- │  only)   │     │     (~2 MB)          │     │  :54321      │
- └──────────┘     └──────────────────────┘     └──────────────┘
-                                                Provider-managed
+ │  only)   │     │     (~2 MB)          │     │  :54320/B    │
+ └──────────┘     └──────────────────────┘     │              │
+                                                │ Branch C     │
+                                                │  :54320/C    │
+                                                └──────────────┘
+                                                 One container
+                                                 per connector,
+                                                 N branch DBs,
+                                                 reset in <1s.
 ```
 
+## Cookbook
+
+Three workflows that show the full agent loop. See [`docs/cookbook.md`](docs/cookbook.md) for the prompts and full walkthrough.
+
+1. **Let Claude refactor your schema without fear** — `sow sandbox`, then ask Claude to add a column, drop an index, rename a table. Verify, reset, try a different approach.
+2. **Let Cursor generate seed data for a new feature** — point your agent at the sandbox and ask for "100 realistic users with orders." Inspect with `sow branch sample`. Reset and ask for a different distribution.
+3. **Let your coding agent debug a failing migration** — replay your last migration on the sandbox. If it fails, reset and try a fix. No prod risk.
+
+## Documentation
+
+- [`docs/sandbox.md`](docs/sandbox.md) — the `sow sandbox` flagship command, flags, and `.env.local` patching with backup/revert
+- [`docs/sanitization.md`](docs/sanitization.md) — what sow sanitizes, the fail-closed gate, JSONB handling, and the `--allow-unsafe` flag
+- [`docs/cookbook.md`](docs/cookbook.md) — three end-to-end workflows for coding agents
+- [`CHANGELOG.md`](CHANGELOG.md) — release history
+- [`CONTRIBUTING.md`](CONTRIBUTING.md) — building from source, running tests, the lane structure
+
 ## sow Cloud — coming soon
 
 sow CLI is free, open source, and works 100% locally. Always will be.
 
-sow Cloud is for teams: shared connectors, CI/CD without Docker-in-Docker, compliance (data never touches dev laptops), and a team dashboard.
+sow Cloud is for teams: shared connectors, CI/CD without Docker-in-Docker, compliance (sanitized data never touches dev laptops), and a team dashboard.
 
 [Join the waitlist →](https://tally.so/r/0QvzZN)
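Note on the `.env.local` patching that the README and CHANGELOG hunks above attribute to `sow sandbox`: the diff does not show how the file is rewritten. The following is a rough TypeScript sketch of the text transformation only, under stated assumptions. `patchDatabaseUrl` is a hypothetical name, the handling of `export` prefixes and of files without a `DATABASE_URL` line is guessed, and the backup to `.env.local.sow.bak` that the changelog mentions is left out because it is plain file copying.

```typescript
// Hypothetical sketch, not the PR's implementation.

/**
 * Return the contents of an env file with DATABASE_URL pointing at the
 * sandbox. An existing DATABASE_URL line (optionally prefixed with
 * `export`) is replaced in place; if none exists, one is appended.
 * All other lines are left untouched.
 */
function patchDatabaseUrl(envContent: string, sandboxUrl: string): string {
  const newLine = `DATABASE_URL=${sandboxUrl}`;
  const pattern = /^\s*(export\s+)?DATABASE_URL\s*=/;

  let replaced = false;
  const patched = envContent.split("\n").map((line) => {
    if (pattern.test(line)) {
      replaced = true;
      return newLine;
    }
    return line;
  });

  if (replaced) {
    return patched.join("\n");
  }

  // No existing entry: append one, making sure the previous line is terminated.
  const needsNewline = envContent.length > 0 && !envContent.endsWith("\n");
  return envContent + (needsNewline ? "\n" : "") + newLine + "\n";
}

// Example usage with the sandbox URL shown in the README's Quick Start.
const before = "FOO=1\nDATABASE_URL=postgresql://user:pass@host:5432/mydb\n";
const after = patchDatabaseUrl(before, "postgresql://sow:sow@localhost:54320/sow");
console.log(after);
```

Keeping the transformation a pure function of the file contents makes it easy to test without touching disk. A caller would copy the original file to the backup path before writing the patched text, so that a revert is a straight copy back.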