harness: add targeted tests for dedup sensitivity, multi-module section splitting, and security boundary routing #36

## Context

Running the full scenario harness and analyzing the results surfaced the test gaps below. Each is a high-value addition for improving classifier quality.

## Proposed additional test scenarios

### 1. Dedup sensitivity — near-duplicate variants

The current dedup test (`edge-duplicate-injection`) injects identical text only. We also need tests for:
- **Rephrased duplicates**: "Never commit secrets" vs "Do not commit secrets to the repository". Jaccard similarity for such pairs may fall below the 0.8 threshold, so the rephrased rule is not recognized as a duplicate (a false negative).
- **Partial duplicates**: a new session adds 3 rules, 2 of which already exist in ADF; only 1 should migrate.
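
To make the rephrased-duplicate case concrete, here is a minimal sketch of a word-level Jaccard comparison. The `tokenize` and `jaccard` helpers are hypothetical and the harness may tokenize differently (for example with stop-word removal); only the 0.8 threshold comes from this issue.

```typescript
// Hypothetical word-level tokenizer: lowercases and keeps alphanumeric runs.
// The real classifier's tokenization is not shown in this issue.
function tokenize(text: string): Set<string> {
  return new Set(text.toLowerCase().match(/[a-z0-9]+/g) ?? []);
}

// Jaccard similarity: |A ∩ B| / |A ∪ B| over the two token sets.
function jaccard(a: string, b: string): number {
  const setA = tokenize(a);
  const setB = tokenize(b);
  if (setA.size === 0 && setB.size === 0) return 1;
  let shared = 0;
  setA.forEach((token) => {
    if (setB.has(token)) shared++;
  });
  return shared / (setA.size + setB.size - shared);
}

const DEDUP_THRESHOLD = 0.8; // threshold mentioned above

const similarity = jaccard(
  "Never commit secrets",
  "Do not commit secrets to the repository",
);
console.log(similarity); // 0.25 (2 shared tokens out of 8 distinct)
console.log(similarity >= DEDUP_THRESHOLD); // false: not flagged as a duplicate
```

Under this tokenization the pair shares only `commit` and `secrets`, so a scenario built from it would pin down the false negative.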

### 2. Multi-module section splitting — heading dominates all items

Current classifier: once a heading routes to a module, ALL items in that section go to the same module. Items with keywords for other modules are ignored.

Example failure:
```
## Database
- D1 bound as `DB` in wrangler.toml        → backend.adf (heading wins)
- Run migrations with `wrangler d1 migrate` → backend.adf (should be infra.adf!)
```

A test that verifies cross-keyword items within a section would expose this and track when it's fixed.
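
The gap between the current and desired behavior can be sketched as two routing strategies. Everything here is an illustrative assumption: the `KEYWORDS` table, `routeText`, `headingWins`, and `perItem` are hypothetical names, not the classifier's actual API.

```typescript
type ModuleName = "backend.adf" | "infra.adf";

// Hypothetical keyword table, invented for this example. Order matters:
// the first matching pattern decides the module.
const KEYWORDS: Array<[RegExp, ModuleName]> = [
  [/\bmigrat/i, "infra.adf"],
  [/\bdatabase\b|\bD1\b/i, "backend.adf"],
];

function routeText(text: string): ModuleName | undefined {
  return KEYWORDS.find(([pattern]) => pattern.test(text))?.[1];
}

// Behavior described above: the heading's module applies to every item.
function headingWins(heading: string, items: string[]): Array<ModuleName | undefined> {
  const mod = routeText(heading);
  return items.map(() => mod);
}

// Behavior the test would track: item keywords override the heading,
// and the heading is only a fallback.
function perItem(heading: string, items: string[]): Array<ModuleName | undefined> {
  return items.map((item) => routeText(item) ?? routeText(heading));
}

const items = [
  "D1 bound as `DB` in wrangler.toml",
  "Run migrations with `wrangler d1 migrate`",
];

console.log(headingWins("Database", items)); // [ 'backend.adf', 'backend.adf' ]
console.log(perItem("Database", items)); // [ 'backend.adf', 'infra.adf' ]
```

A scenario asserting the second result would fail against the current classifier and start passing once item-level routing lands.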

### 3. Security boundary routing — auth in backend vs security modules

Auth-related rules appear in two contexts:
- **Implementation rules** (how to write auth code): belong in `backend.adf`
- **Security policy rules** (what must be enforced): belong in `security.adf`

The current `## Auth` heading maps everything to `security.adf`. A test with mixed implementation + policy rules under one heading would expose the lack of sub-heading routing.

### 4. Trigger prefix collision — short triggers matching unrelated content

The prefix match fix (removing trailing `\b`) introduced a potential over-matching risk. Example:
- Trigger `auth` now matches "authority", "author", "authentic"
- Trigger `api` matches "apiary", "apiVersion"

A test with content containing "the author of this library" or "apiary endpoint" should verify these don't accidentally route to security/backend modules.
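
The over-matching can be reproduced with plain regular expressions. This sketch assumes triggers compile to a pattern with a leading `\b` and, before the fix, a trailing `\b` as well; `prefixTrigger` and `wholeWordTrigger` are hypothetical helpers, and triggers are assumed to contain no regex metacharacters.

```typescript
// After the fix (assumed shape): leading word boundary only.
function prefixTrigger(trigger: string): RegExp {
  return new RegExp(`\\b${trigger}`, "i");
}

// Before the fix (assumed shape): word boundary on both sides.
function wholeWordTrigger(trigger: string): RegExp {
  return new RegExp(`\\b${trigger}\\b`, "i");
}

const samples = ["the author of this library", "apiary endpoint"];

// Prefix matching fires on unrelated words.
console.log(prefixTrigger("auth").test(samples[0])); // true
console.log(prefixTrigger("api").test(samples[1])); // true

// Whole-word matching does not, but it also misses longer forms.
console.log(wholeWordTrigger("auth").test(samples[0])); // false
console.log(wholeWordTrigger("auth").test("authentication required")); // false
console.log(prefixTrigger("auth").test("authentication required")); // true
```

The collision scenario can therefore assert that neither sample produces an extraction routed to the security or backend modules.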

### 5. Large injection — 20+ items in one session

Current tests max at ~13 items per session. A stress test with 25+ items would:
- Test dedup performance (O(n²) Jaccard comparisons)
- Verify routing accuracy doesn't degrade at scale
- Surface any ADF write failures for large patch sets
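
For a sense of scale, the number of pairwise comparisons can be worked out directly. This assumes every item is compared with every other item exactly once, which may not match how the dedup pass is actually structured; `pairwiseComparisons` is a hypothetical helper.

```typescript
// Number of unordered pairs among n items: n * (n - 1) / 2.
function pairwiseComparisons(n: number): number {
  return (n * (n - 1)) / 2;
}

console.log(pairwiseComparisons(13)); // 78  (roughly the current maximum)
console.log(pairwiseComparisons(25)); // 300 (proposed stress test)
```

Going from 13 to 25 items roughly quadruples the comparison count, which is small in absolute terms but enough to reveal quadratic growth in timing data.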

### 6. Empty/minimal injection — just a heading, no items

Edge case: AI adds `## Auth\n\n` (heading with no content). Should produce 0 extractions cleanly without errors.
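
The expected behavior can be sketched with a toy extractor. `extractItems` and the `Extraction` shape are hypothetical stand-ins for whatever the harness uses; the sketch only treats `-` or `*` bullets under a Markdown heading as items.

```typescript
interface Extraction {
  heading: string;
  text: string;
}

// Toy extractor: tracks the most recent heading and collects bullet lines.
function extractItems(markdown: string): Extraction[] {
  const extractions: Extraction[] = [];
  let heading = "";
  for (const line of markdown.split("\n")) {
    const headingMatch = /^#{1,6}\s+(.*)$/.exec(line);
    if (headingMatch) {
      heading = headingMatch[1].trim();
      continue;
    }
    const bulletMatch = /^\s*[-*]\s+(.*)$/.exec(line);
    if (bulletMatch) {
      extractions.push({ heading, text: bulletMatch[1].trim() });
    }
  }
  return extractions;
}

console.log(extractItems("## Auth\n\n").length); // 0
console.log(extractItems("## Auth\n\n- Rotate keys\n").length); // 1
```

The scenario itself would assert the first result, plus the absence of thrown errors, against the real extraction path.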

## Implementation

Add these to `harness/corpus/edge-cases.ts` as additional `Scenario` objects. The trigger prefix collision test (scenario 4) is particularly important to add before the prefix-match change ships in a release.
