11 changes: 7 additions & 4 deletions EXAMPLES.md
@@ -19,7 +19,7 @@ End-to-end walkthroughs for common seedstorm workflows. All examples assume loca
## 1. Basic Seeding (no AI)

<details>
-<summary><strong>Demo</strong> — introspect + seed a 28-table schema</summary>
+<summary><strong>Demo</strong> — introspect + seed a 29-table schema</summary>

<img src="docs/gifs/basic-seed.gif" alt="seedstorm introspect + seed" width="720" />

@@ -55,7 +55,7 @@ seedstorm seed \
14:12:15 INFO Seeding table table=users rows=150
14:12:15 INFO Seeding table table=companies rows=50
...
-14:12:16 INFO Seeding complete tables=28 total_rows=1515 duration=316ms
+14:12:16 INFO Seeding complete tables=29 total_rows=1540 duration=316ms
```

</details>
@@ -398,7 +398,7 @@ Select which tables to seed. Tables are shown in FK-safe order with their depend
[ ] employees → departments
[✓] wishlists → users

-7 of 28 tables selected
+7 of 29 tables selected
↑/↓ navigate • space toggle • a all • n none • enter confirm • q quit
```

@@ -426,6 +426,7 @@ Set seeding parameters. Tab between fields, space to toggle the truncate checkbo
▸ Rows per table: [50]
Batch size: [100]
Enum rows (0 = use rows): [0]
+Self-ref depth: [2]
[ ] Truncate before seeding

tab/↑↓ navigate • space toggle • enter confirm • b back • q quit
@@ -436,6 +437,7 @@ Set seeding parameters. Tab between fields, space to toggle the truncate checkbo
| Rows per table | How many rows to generate for each selected table |
| Batch size | Rows per INSERT statement (higher = faster, default 100) |
| Enum rows | Rows per enum value for enum tables (0 = use rows count) |
+| Self-ref depth | Maximum generated depth for self-referential FK chains |
| Truncate | Delete all existing data before seeding (shows warning in review) |

</details>
@@ -598,14 +600,15 @@ A 3-step wizard: **Tables → Config → Generate**

**Step 1 — Table picker:** Same as `seed -i` — select which tables to include.

-**Step 2 — Config:** Set rows, choose format (yaml/json/sql with `←`/`→`), and optionally set an output file path.
+**Step 2 — Config:** Set rows, self-reference depth, choose format (yaml/json/sql with `←`/`→`), and optionally set an output file path.

```
seedstorm generate interactive ✓ Tables ● Config ○ Generate

Configure generation

▸ Rows per table: [10]
+Self-ref depth: [2]
Format: [yaml] json sql
Output file: [data.json]

13 changes: 12 additions & 1 deletion Makefile
@@ -50,7 +50,18 @@ test:

test-integration: dev-up
@echo "Waiting for databases to be healthy..."
-@docker compose wait mysql postgres 2>/dev/null || sleep 10
+@for i in $$(seq 1 60); do \
+pg=$$(docker inspect -f '{{.State.Health.Status}}' seedstorm-postgres-1 2>/dev/null || true); \
+my=$$(docker inspect -f '{{.State.Health.Status}}' seedstorm-mysql-1 2>/dev/null || true); \
+if [ "$$pg" = "healthy" ] && [ "$$my" = "healthy" ]; then \
+break; \
+fi; \
+if [ "$$i" = "60" ]; then \
+docker compose ps; \
+exit 1; \
+fi; \
+sleep 1; \
+done
cd integration && go test -v -tags integration -count=1 ./... -timeout 300s

lint:
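
The health-check loop added to `test-integration` polls both containers once per second and fails after 60 attempts. The same bounded-polling pattern can be sketched in Go. This is only an illustration: `healthFunc` and `waitHealthy` are hypothetical names, and the checker here is a fake standing in for `docker inspect -f '{{.State.Health.Status}}'`.

```go
package main

import (
	"errors"
	"fmt"
	"time"
)

// healthFunc reports the health status of a named container.
// Hypothetical stand-in for `docker inspect -f '{{.State.Health.Status}}'`.
type healthFunc func(container string) string

// waitHealthy polls until every container reports "healthy" and gives up
// after maxAttempts polls, mirroring the check-then-sleep shell loop.
func waitHealthy(check healthFunc, containers []string, maxAttempts int, interval time.Duration) error {
	for attempt := 1; attempt <= maxAttempts; attempt++ {
		allHealthy := true
		for _, c := range containers {
			if check(c) != "healthy" {
				allHealthy = false
				break
			}
		}
		if allHealthy {
			return nil
		}
		if attempt == maxAttempts {
			break // do not sleep after the final failed attempt
		}
		time.Sleep(interval)
	}
	return errors.New("containers did not become healthy in time")
}

func main() {
	// Fake checker: postgres is healthy at once, mysql on its third poll.
	mysqlPolls := 0
	check := func(c string) string {
		if c == "seedstorm-mysql-1" {
			mysqlPolls++
			if mysqlPolls < 3 {
				return "starting"
			}
		}
		return "healthy"
	}
	names := []string{"seedstorm-postgres-1", "seedstorm-mysql-1"}
	err := waitHealthy(check, names, 60, time.Millisecond)
	fmt.Println("error:", err, "mysql polls:", mysqlPolls)
}
```

Compared with the previous `docker compose wait ... || sleep 10` line, a bounded loop fails loudly when a container never becomes healthy instead of continuing after a fixed delay.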
6 changes: 3 additions & 3 deletions README.md
@@ -67,14 +67,14 @@ seedstorm gaps \
## Features

- **Schema self-discovery** — introspects tables, columns, PKs, FKs, enum values, UNIQUE and CHECK constraints; no manual editing required
-- **FK-aware seeding** — topological sort guarantees parent tables are seeded before children; handles self-referential FKs, near-cycles, junction tables, and deep multi-level chains
+- **FK-aware seeding** — topological sort guarantees parent tables are seeded before children; handles nullable and non-nullable self-referential FKs with bounded depth, near-cycles, junction tables, and deep multi-level chains
- **Constraint-aware faker mapping** — UNIQUE → `uuid`, CHECK IN → `randomstring(a,b,c)`, CHECK range → `number(min,max)`; seed data always satisfies your constraints
- **Semantic faker** — maps column names (`email`, `first_name`, `price`, `city`…) to realistic `gofakeit` generators automatically
- **Enum coverage** — every enum value appears at least `--rows` times, independently per column
- **AI enrichment** — Gemini rewrites faker hints for domain-meaningful data; supply `--prompt` for richer context
- **Gap analysis** — `gaps` shows which tables are empty with row counts and FK context; `--fill` seeds only the empty ones
-- **Interactive TUI** — wizard for table selection, global config, per-table row volumes, and review before seeding
-- **Web UI** — `seedstorm serve` exposes an interactive graph workspace with click-to-select tables, per-table row overrides, live SSE job logs, multi-DB session switcher, and connection presets in `localStorage`
+- **Interactive TUI** — wizard for table selection, global config, self-reference depth, per-table row volumes, and review before seeding
+- **Web UI** — `seedstorm serve` exposes an interactive graph workspace with click-to-select tables, self-reference depth, per-table row overrides, live SSE job logs, multi-DB session switcher, and connection presets in `localStorage`
- **Dry-run** — preview the seed plan and INSERT SQL without touching the database
- **Export** — generate fake data as YAML, JSON, or SQL without a live connection
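
The FK-aware seeding bullet above relies on a topological sort that places parent tables before their children. A minimal sketch of that idea using Kahn's algorithm follows. It is not seedstorm's actual implementation: `seedOrder` and the `deps` map are hypothetical, and skipping self-referential edges during ordering is an assumption about how such tables could be treated.

```go
package main

import (
	"fmt"
	"sort"
)

// seedOrder returns tables in parent-before-child order (Kahn's algorithm).
// deps maps each table to the tables it references through foreign keys.
// Self-references are skipped: they cannot be ordered between tables and
// are assumed to be resolved while generating rows for that table.
func seedOrder(deps map[string][]string) ([]string, error) {
	inDegree := make(map[string]int)
	children := make(map[string][]string)
	for table, parents := range deps {
		if _, ok := inDegree[table]; !ok {
			inDegree[table] = 0
		}
		for _, p := range parents {
			if p == table {
				continue // self-referential FK
			}
			if _, ok := inDegree[p]; !ok {
				inDegree[p] = 0
			}
			inDegree[table]++
			children[p] = append(children[p], table)
		}
	}

	// Start with tables that reference nothing; sort for a stable result.
	var ready []string
	for t, d := range inDegree {
		if d == 0 {
			ready = append(ready, t)
		}
	}
	sort.Strings(ready)

	var order []string
	for len(ready) > 0 {
		t := ready[0]
		ready = ready[1:]
		order = append(order, t)
		for _, c := range children[t] {
			inDegree[c]--
			if inDegree[c] == 0 {
				ready = append(ready, c)
			}
		}
		sort.Strings(ready)
	}
	if len(order) != len(inDegree) {
		return nil, fmt.Errorf("cycle detected: ordered %d of %d tables", len(order), len(inDegree))
	}
	return order, nil
}

func main() {
	order, err := seedOrder(map[string][]string{
		"users":               {},
		"products":            {},
		"orders":              {"users"},
		"order_items":         {"orders", "products"},
		"hard_self_employees": {"hard_self_employees"},
	})
	fmt.Println(order, err)
}
```

A true cycle between two tables still produces an error in this sketch; the near-cycle handling mentioned in the feature list (breaking the cycle at a nullable FK) is not covered here.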

26 changes: 25 additions & 1 deletion docs/commands.md
@@ -108,6 +108,22 @@ seedstorm seed \
--schema schema.yaml \
--enum-rows 10

+# Bound generated self-referential chains to 2 levels
+seedstorm seed \
+--db postgres \
+--dsn "postgres://..." \
+--schema schema.yaml \
+--self-ref-depth 2
+
+# Override specific table volumes from a scripted run
+seedstorm seed \
+--db postgres \
+--dsn "postgres://..." \
+--schema schema.yaml \
+--rows 20 \
+--table-rows users=200,orders=500 \
+--table-rows order_items=1000
+
# Interactive TUI — pick tables, configure options, review, then seed
seedstorm seed \
--db postgres \
@@ -126,7 +142,9 @@ The interactive TUI includes a **Volumes** step after global config. Each select
| `--db` / `$SEEDSTORM_DB` | `postgres` | Database type |
| `--dsn` / `$SEEDSTORM_DSN` | — | Connection string (required) |
| `--rows` / `-r` | `100` | Rows per table |
+| `--table-rows` | — | Per-table row override, repeatable or comma-separated (`table=rows`) |
| `--enum-rows` | `0` | Rows per enum value (0 = use `--rows`) |
+| `--self-ref-depth` | `2` | Maximum generated depth for self-referential FK chains |
| `--disable-fk` | false | Skip FK ordering |
| `--dry-run` / `-n` | false | Print seed plan + SQL, do not execute |
| `--truncate` | false | Truncate all tables before seeding (prompts for confirmation) |
@@ -196,7 +214,9 @@ Gap Analysis
| `--db` / `$SEEDSTORM_DB` | `postgres` | Database type |
| `--dsn` / `$SEEDSTORM_DSN` | — | Connection string (required) |
| `--rows` / `-r` | `100` | Rows per empty table (when `--fill` is set) |
+| `--table-rows` | — | Per-table row override for fill, repeatable or comma-separated (`table=rows`) |
| `--enum-rows` | `0` | Rows per enum value for empty enum tables (0 = use `--rows`) |
+| `--self-ref-depth` | `2` | Maximum generated depth for self-referential FK chains |
| `--fill` | false | Seed all empty tables |
| `--dry-run` / `-n` | false | Print SQL without executing (requires `--fill`) |
| `--yes` / `-y` | false | Skip confirmation prompt |
@@ -213,6 +233,8 @@ Generates fake data without connecting to a database. Outputs YAML, JSON, or SQL
seedstorm generate --schema schema.yaml --rows 10 --format json --out data.json
seedstorm generate --schema schema.yaml --rows 5 --format sql --db postgres
seedstorm generate --schema schema.yaml --rows 20 --format yaml
+seedstorm generate --schema schema.yaml --rows 20 --self-ref-depth 3
+seedstorm generate --schema schema.yaml --rows 20 --table-rows users=200,orders=500

# Interactive TUI
seedstorm generate --schema schema.yaml --interactive
@@ -226,6 +248,8 @@ In interactive mode, the **Volumes** step can override row counts per selected t
|------|---------|-------------|
| `--schema` / `-s` | `schema.yaml` | Schema file |
| `--rows` / `-r` | `100` | Rows per table |
+| `--table-rows` | — | Per-table row override, repeatable or comma-separated (`table=rows`) |
+| `--self-ref-depth` | `2` | Maximum generated depth for self-referential FK chains |
| `--format` / `-f` | `yaml` | Output format: `yaml`, `json`, `sql` |
| `--out` / `-o` | stdout | Output file (omit for stdout) |
| `--db` | `postgres` | DB type (affects SQL placeholder style) |
@@ -259,7 +283,7 @@ SEEDSTORM_ADDR=127.0.0.1:9000 seedstorm serve

What the UI gives you:

-- **Workspace** — Cytoscape DAG of every table; click to select, non-nullable parents auto-lock as a dependency closure (mirrors the TUI). The selected-table panel lets you override row counts per table for **Seed**, **Fill empty**, and workspace **Generate** runs while `Rows` remains the default. Live SSE log stream + status pill.
+- **Workspace** — Cytoscape DAG of every table; click to select, non-nullable parents auto-lock as a dependency closure (mirrors the TUI). The selected-table panel lets you override row counts per table for **Seed**, **Fill empty**, and workspace **Generate** runs while `Rows` remains the default. `Self-ref` controls bounded generated depth for self-referential FK chains. Live SSE log stream + status pill.
- **Connection management** — multi-session: hold several DBs open in one browser and switch from a topbar dropdown. Saved connection presets in `localStorage` with optional password (eye-icon reveal, closed by default). Passwords are kept in process memory only on the server.
- **Standalone tools** — `/generate`, `/enrich`, `/export` mirror the CLI commands as forms.
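
The `--table-rows` flag documented in this file is described as repeatable or comma-separated, with each entry in `table=rows` form and `--rows` acting as the default. One way such values could be merged is sketched below. `parseTableRows` and `rowsFor` are hypothetical helpers written for illustration, and the rule that a later entry overrides an earlier one for the same table is an assumption, not documented behavior.

```go
package main

import (
	"fmt"
	"strconv"
	"strings"
)

// parseTableRows merges repeated flag values such as
// ["users=200,orders=500", "order_items=1000"] into a single map.
// Assumption: a later entry for the same table replaces an earlier one.
func parseTableRows(values []string) (map[string]int, error) {
	out := make(map[string]int)
	for _, v := range values {
		for _, pair := range strings.Split(v, ",") {
			pair = strings.TrimSpace(pair)
			if pair == "" {
				continue
			}
			name, count, ok := strings.Cut(pair, "=")
			name = strings.TrimSpace(name)
			if !ok || name == "" {
				return nil, fmt.Errorf("invalid override %q: want table=rows", pair)
			}
			n, err := strconv.Atoi(strings.TrimSpace(count))
			if err != nil || n < 0 {
				return nil, fmt.Errorf("invalid row count in %q", pair)
			}
			out[name] = n
		}
	}
	return out, nil
}

// rowsFor returns the per-table override, falling back to the --rows default.
func rowsFor(table string, overrides map[string]int, defaultRows int) int {
	if n, ok := overrides[table]; ok {
		return n
	}
	return defaultRows
}

func main() {
	overrides, err := parseTableRows([]string{"users=200,orders=500", "order_items=1000"})
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	for _, t := range []string{"users", "orders", "order_items", "products"} {
		fmt.Printf("%s: %d rows\n", t, rowsFor(t, overrides, 20))
	}
}
```

With the inputs from the scripted example in this file (`--rows 20` plus the two `--table-rows` flags), the sketch resolves `products` to the default of 20 and the three named tables to their overrides.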

11 changes: 6 additions & 5 deletions docs/development.md
@@ -62,12 +62,13 @@ go test ./... -v

### Integration tests

-Integration tests run the full pipeline against a 28-table real-world schema on both MySQL and PostgreSQL, covering:
+Integration tests run the full pipeline against a 29-table real-world schema on both MySQL and PostgreSQL, covering:

| Edge case | Tables |
|-----------|--------|
| Self-referential FK | `categories`, `departments`, `employees` |
| Near-cycle (nullable FK breaks it) | `departments.head_employee_id ↔ employees.department_id` |
+| Hard self-reference | `hard_self_employees.manager_id → hard_self_employees.id` |
| Deep FK chain (5 levels) | `return_requests → order_items → orders → users` |
| Many-to-many junctions | `product_tags`, `project_assignments`, `wishlist_items` |
| Multiple enums per table | `support_tickets` (status + priority) |
@@ -78,8 +79,8 @@ Integration tests run the full pipeline against a 28-table real-world schema on
| CHECK range constraint → `number(min,max)` faker | `products.rating` (1–5) |

Tests verify:
-- All 28 tables receive exactly the requested number of rows
-- 38 FK relationships have zero orphans
+- All 29 tables receive rows, with enum-coverage tables allowed to exceed the base request so every enum value is represented
+- 39 FK relationships have zero orphans, including nullable and non-nullable self-references
- 6 value constraints hold (ratings 1–5, prices > 0, quantities ≥ 1, salaries > 0)
- Enum values, UNIQUE columns, and CHECK constraints are auto-detected correctly
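
The zero-orphan check for self-references and the `--self-ref-depth` bound can be illustrated together. The sketch below shows one possible way to assign parent IDs inside a single self-referential table so that no chain exceeds a maximum depth. It is not taken from seedstorm: `assignParents` and `depthOf` are hypothetical, a root is counted as depth 1, and letting a NOT NULL root reference itself is an assumption about how a table like `hard_self_employees` could be satisfied.

```go
package main

import "fmt"

// assignParents returns parent[id] for rows with IDs 1..n (index 0 is unused).
// Rows are laid out level by level: the first ceil(n/maxDepth) rows are roots,
// and every later row points at the row `roots` positions before it.
// A nullable root gets parent 0 (standing in for NULL). A NOT NULL root
// references itself, which is an assumption made for this sketch.
func assignParents(n, maxDepth int, nullable bool) []int {
	if n < 0 {
		n = 0
	}
	if maxDepth < 1 {
		maxDepth = 1
	}
	parent := make([]int, n+1)
	roots := (n + maxDepth - 1) / maxDepth
	for id := 1; id <= n; id++ {
		switch {
		case id > roots:
			parent[id] = id - roots // always an earlier row, so never an orphan
		case !nullable:
			parent[id] = id
		}
	}
	return parent
}

// depthOf counts the rows on the chain from id up to its root (a root has depth 1).
func depthOf(parent []int, id int) int {
	depth := 1
	for parent[id] != 0 && parent[id] != id {
		id = parent[id]
		depth++
	}
	return depth
}

func main() {
	// Seven rows, depth bound 2, NOT NULL manager_id.
	parent := assignParents(7, 2, false)
	for id := 1; id <= 7; id++ {
		fmt.Printf("id=%d manager_id=%d depth=%d\n", id, parent[id], depthOf(parent, id))
	}
}
```

Because every non-root row points at a row with a smaller ID, inserting rows in ID order keeps each `manager_id` resolvable, which is the property an orphan check would verify.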

@@ -100,7 +101,7 @@ Expected output:
brands 25 rows
...
audit_logs 25 rows
-Total: 700 rows across 28 tables (4.43s)
+Total: 1600+ rows across 29 tables (11.30s)
--- PASS: TestPostgresIntegration (6.87s)
```

@@ -117,7 +118,7 @@ All tests run automatically on every PR via GitHub Actions (`.github/workflows/p
| `validate` | Directory/file structure via structlint |
| `test` | `go test ./...` + `make build` |
| `lint` | `golangci-lint` |
-| `integration` | Full 28-table suite on Postgres 15 + MySQL 8 |
+| `integration` | Full 29-table suite on Postgres 15 + MySQL 8 |

The integration job in CI uses `--timeout 120s`. Use `300s` locally when running both engines back-to-back.

11 changes: 10 additions & 1 deletion init/init.mysql.sql
@@ -1,4 +1,5 @@
DROP TABLE IF EXISTS order_items;
+DROP TABLE IF EXISTS hard_self_employees;
DROP TABLE IF EXISTS orders;
DROP TABLE IF EXISTS products;
DROP TABLE IF EXISTS users;
@@ -16,6 +17,14 @@ CREATE TABLE products (
price DECIMAL(10, 2) NOT NULL
);

+CREATE TABLE hard_self_employees (
+id INT AUTO_INCREMENT PRIMARY KEY,
+manager_id INT NOT NULL,
+name VARCHAR(255) NOT NULL,
+title VARCHAR(255),
+FOREIGN KEY (manager_id) REFERENCES hard_self_employees(id)
+);
+
CREATE TABLE orders (
id INT AUTO_INCREMENT PRIMARY KEY,
user_id INT NOT NULL,
@@ -31,4 +40,4 @@ CREATE TABLE order_items (
quantity INT NOT NULL,
FOREIGN KEY (order_id) REFERENCES orders(id),
FOREIGN KEY (product_id) REFERENCES products(id)
-);
+);
10 changes: 9 additions & 1 deletion init/init.sql
@@ -1,4 +1,5 @@
DROP TABLE IF EXISTS order_items;
+DROP TABLE IF EXISTS hard_self_employees;
DROP TABLE IF EXISTS orders;
DROP TYPE IF EXISTS order_status;
CREATE TYPE order_status AS ENUM ('pending', 'processing', 'shipped', 'delivered', 'cancelled');
@@ -16,6 +17,13 @@ CREATE TABLE IF NOT EXISTS products (
price NUMERIC(10, 2) NOT NULL
);

+CREATE TABLE IF NOT EXISTS hard_self_employees (
+id SERIAL PRIMARY KEY,
+manager_id INTEGER NOT NULL REFERENCES hard_self_employees(id),
+name VARCHAR(255) NOT NULL,
+title VARCHAR(255)
+);
+
CREATE TABLE IF NOT EXISTS orders (
id SERIAL PRIMARY KEY,
user_id INTEGER NOT NULL REFERENCES users(id),
@@ -28,4 +36,4 @@ CREATE TABLE IF NOT EXISTS order_items (
order_id INTEGER NOT NULL REFERENCES orders(id),
product_id INTEGER NOT NULL REFERENCES products(id),
quantity INTEGER NOT NULL
-);
+);