Skip to content
Merged
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
106 changes: 106 additions & 0 deletions apps/docs/content/docs/examples/committed-never-pushed.mdx
Original file line number Diff line number Diff line change
@@ -0,0 +1,106 @@
---
title: "The Phantom Deploy"
description: Code is committed locally but never pushed. The agent marks the task done. Production silently diverges.
---

# The Phantom Deploy

An agent commits code, runs `git status`, sees a clean working tree, and marks the task as done. But `git status` only shows local state. The commit never reached the remote. Production is running old code, and nobody knows — until something breaks days later.

## The Situation

An agent was implementing a new service file for a backend API. After writing the code and running tests locally, it committed:

```bash
git add backend/services/payment_handler.py
git commit -m "feat: add payment handler service"
```

The agent checked its work:

```bash
$ git status
On branch feature/payment-handler
nothing to commit, working tree clean
```

Clean working tree. Tests pass. The agent updated the task status to "Done" and moved on.

Two days later, the team deployed from the remote repository. The deployment failed — `payment_handler.py` didn't exist. The file was committed locally but never pushed to the remote. The `feature/payment-handler` branch existed on the agent's machine but nowhere else.

The fix was simple:

```bash
git push origin feature/payment-handler
```

But the damage was already done — two days of "done" work that wasn't actually deployed, blocking downstream tasks that depended on the payment handler being available.

## The Scar

The incident produced this scar:

```json
{
"learning_type": "scar",
"title": "Files can be committed but never pushed — verify deployment end-to-end",
"description": "Done does not equal Deployed does not equal Verified Working. After any significant commit: 1) git push and verify on remote, 2) Pull on target environment, 3) Verify file exists on target, 4) Test functionality end-to-end.",
"severity": "critical",
"scar_type": "process",
"applies_when": [
"any new file creation",
"service or infrastructure changes",
"before marking any issue as Done"
]
}
```

This scar is notable for two reasons:

1. **It's a process scar** — it never decays. The gap between "committed" and "deployed" is a permanent property of distributed version control, not a temporary situation.

2. **The `applies_when` field is broad** — "before marking any issue as Done" means this scar surfaces on almost every task completion. That's intentional. The cost of checking `git push` status is seconds; the cost of a phantom deploy is days.

## The Save

Weeks later, an agent finished implementing a database migration and was about to mark the task complete. It called `recall`:

```
recall({ plan: "mark migration task as done" })
```

GitMem surfaced the scar:

```
🧠 INSTITUTIONAL MEMORY ACTIVATED

Found 3 relevant scars for your plan:

🔴 Files can be committed but never pushed — verify
deployment end-to-end (critical, score: 0.79)
Done does not equal Deployed does not equal Verified
Working. After any significant commit...
```

The agent confirmed:

```
confirm_scars([{
scar_id: "...",
decision: "APPLYING",
evidence: "Ran git push origin feature/user-migration and verified commit appears on remote with git log origin/feature/user-migration --oneline -1. Branch is ahead by 0 commits after push.",
relevance: "high"
}])
```

Before marking the task done, the agent pushed, verified the commit on the remote, and confirmed zero divergence. The phantom deploy pattern was broken.

## The Takeaway

This example demonstrates three properties of effective institutional memory:

**The most dangerous bugs are state mismatches, not crashes.** The agent's local environment was perfectly correct — tests passed, files existed, `git status` was clean. The bug was the gap between local state and remote state. Scars are particularly valuable for state-mismatch bugs because they can't be caught by local testing alone.

**Process scars earn their keep through frequency.** This scar fires on nearly every task completion. That sounds noisy, but the confirmation takes seconds ("Verified push to remote"), and the downside of missing it is days of wasted work. High-frequency, low-cost scars are the backbone of institutional memory.

**"Done" is a loaded word.** In solo development, "committed" might mean "done." In distributed systems with multiple agents, "done" requires proof that the work is accessible to others. This scar redefines "done" from a local assertion to a distributed verification — and the `confirm_scars` protocol makes that redefinition actionable, not just philosophical.
107 changes: 107 additions & 0 deletions apps/docs/content/docs/examples/credential-exposure-near-miss.mdx
Original file line number Diff line number Diff line change
@@ -0,0 +1,107 @@
---
title: "The Credential Leak"
description: An AI agent almost prints API keys to a recorded terminal session. Institutional memory blocks it.
---

# The Credential Leak

An agent needs to check whether environment variables are configured. The fastest way? `echo $API_KEY`. But in a recorded session, that value is now permanently stored — in transcripts, recordings, and conversation history. This is the kind of mistake that's invisible until it's catastrophic.

## The Situation

An agent was setting up a deployment pipeline inside a container. It needed to verify that three API keys were available before proceeding:

```bash
# The agent's first instinct
echo $PAYMENT_API_KEY
echo $STORAGE_SECRET
echo $AUTH_TOKEN
```

This would work — the values would print, the agent would confirm they exist, and the deployment would proceed. But the session was being recorded. Every command and its output was captured in:

1. A session transcript stored in a database
2. An asciinema recording (`.cast` file) synced to shared storage
3. The conversation history persisted across sessions

If those `echo` commands ran, the actual secret values would be permanently embedded in at least three storage systems. Rotating the keys wouldn't help — the old values would still exist in historical records.

The correct approach uses existence checks that never expose values:

```bash
# Safe: only checks if the variable is set
[ -n "$PAYMENT_API_KEY" ] && echo "SET" || echo "MISSING"
[ -n "$STORAGE_SECRET" ] && echo "SET" || echo "MISSING"
[ -n "$AUTH_TOKEN" ] && echo "SET" || echo "MISSING"
```

## The Scar

After a near-miss where an agent almost printed credentials during a debugging session, this scar was created:

```json
{
"learning_type": "scar",
"title": "Never print environment variable VALUES — only check existence",
"description": "When checking if environment variables exist, NEVER run commands that output the actual values (e.g., echo $SECRET, env | grep KEY). Secret values will be captured in session recordings, transcripts, and conversation history. Instead, check existence only: test with [ -n \"$VAR\" ] && echo \"SET\" || echo \"MISSING\".",
"severity": "critical",
"scar_type": "incident",
"counter_arguments": [
"You might think showing a prefix is safe — but partial keys are still sensitive and the full value is in the raw output before truncation",
"You might think CLI sessions are ephemeral — but recordings and transcripts persist them permanently",
"You might think it's fine because it's a local container — but recordings sync to shared directories and transcripts go to the database"
]
}
```

Three things make this scar effective:

1. **Critical severity** ensures it surfaces with high priority during recall — it won't be buried below medium-severity results.

2. **Three counter-arguments** address three distinct rationalizations. Each one is a real thought process that would lead an agent to print secrets anyway: "partial keys are fine," "sessions are temporary," "containers are isolated."

3. **The scar type is `incident`** (not `process`), meaning it decays over 180 days. If the team later builds infrastructure-level protection (like a pre-execution hook that blocks secret-printing commands), the scar becomes less critical and naturally fades.

## The Save

In a later session, an agent was debugging a failing API integration and needed to verify credentials were loaded. It called `recall` before running diagnostic commands:

```
recall({ plan: "check environment variables for API configuration" })
```

GitMem surfaced the scar:

```
🧠 INSTITUTIONAL MEMORY ACTIVATED

Found 2 relevant scars for your plan:

🔴 Never print environment variable VALUES — only check
existence (critical, score: 0.82)
When checking if environment variables exist, NEVER run
commands that output the actual values...
```

The agent confirmed:

```
confirm_scars([{
scar_id: "...",
decision: "APPLYING",
evidence: "Used [ -n \"$VAR\" ] && echo SET || echo MISSING for all three API keys. No secret values were printed to the session.",
relevance: "high"
}])
```

Instead of `echo $API_KEY`, the agent ran safe existence checks. The credentials stayed secret. The session recording contained only "SET" or "MISSING" — no actual values.

## The Takeaway

This example demonstrates three properties of effective institutional memory:

**Critical scars justify infrastructure investment.** This scar was so important that it eventually led to a `PreToolUse` hook — an automated gate that blocks commands matching credential-exposure patterns before they execute. The scar came first, proved the risk was real, and justified building the permanent fix. Scars can be the evidence that earns engineering investment.

**Counter-arguments address rationalizations, not just facts.** "Partial keys are safe" and "containers are isolated" aren't factual errors — they're reasonable-sounding arguments that happen to be wrong in this context. By documenting them explicitly, the scar prevents the agent from independently re-deriving the same flawed reasoning.

**Incident scars have built-in expiration.** Unlike process scars (which never decay), this incident scar fades over 180 days. If the team ships a hook that structurally prevents credential exposure, the scar becomes redundant — and the system knows it. This prevents scar accumulation from becoming a burden.
2 changes: 2 additions & 0 deletions apps/docs/content/docs/examples/index.mdx
Original file line number Diff line number Diff line change
Expand Up @@ -23,3 +23,5 @@ These stories show that loop in action.
## Available Examples

- [The Silent Migration](/docs/examples/silent-migration-wrong-directory) — A CLI tool silently reads the wrong directory, and institutional memory catches it before a database gets corrupted.
- [The Credential Leak](/docs/examples/credential-exposure-near-miss) — An agent almost prints API keys to a recorded session. A critical scar blocks it before secrets are permanently stored.
- [The Phantom Deploy](/docs/examples/committed-never-pushed) — Code is committed locally but never pushed. The agent marks it done. Production silently diverges for days.
2 changes: 1 addition & 1 deletion apps/docs/content/docs/examples/meta.json
Original file line number Diff line number Diff line change
@@ -1,4 +1,4 @@
{
"title": "Examples",
"pages": ["index", "silent-migration-wrong-directory"]
"pages": ["silent-migration-wrong-directory", "credential-exposure-near-miss", "committed-never-pushed"]
}