From 6d783a904b8a4f9326e8982d1dbb3bd1256c73b1 Mon Sep 17 00:00:00 2001 From: =?UTF-8?q?Patrick=20Rossell=C3=B3=20Colom?= <74001504+PatrickSys@users.noreply.github.com> Date: Sun, 3 May 2026 13:34:59 +0200 Subject: [PATCH] Add agentic research MVP harness and templates --- README.md | 1 + .../templates/research/agentic-harness.md | 98 +++++++++++++++++++ docs/AGENTIC-RESEARCH-MVP.md | 93 ++++++++++++++++++ 3 files changed, 192 insertions(+) create mode 100644 distilled/templates/research/agentic-harness.md create mode 100644 docs/AGENTIC-RESEARCH-MVP.md diff --git a/README.md b/README.md index 2d3ddafa..b5c62447 100644 --- a/README.md +++ b/README.md @@ -122,6 +122,7 @@ Start with the public proof pack: - [Exported consumer proof pack](docs/proof/consumer-node-cli/README.md) - [Runtime support matrix](docs/RUNTIME-SUPPORT.md) - [Verification discipline](docs/VERIFICATION-DISCIPLINE.md) +- [Agentic research MVP harness](docs/AGENTIC-RESEARCH-MVP.md) ### Quickstart (after init) diff --git a/distilled/templates/research/agentic-harness.md b/distilled/templates/research/agentic-harness.md new file mode 100644 index 00000000..fb5acd0a --- /dev/null +++ b/distilled/templates/research/agentic-harness.md @@ -0,0 +1,98 @@ +# Agentic Research Harness (MVP) + +Use this when the task requires substantial web-enabled research and evidence-driven synthesis. + +## 0) Run metadata + +- Date: +- Operator: +- Query: +- Tier (`light|full`): +- Response format (`short|structured|argumentative`): +- Citation style: + +## 1) Decompose prompt + +Write `prompt-decomposition.json` with: + +- `sub_questions` +- `entities` +- `required_formats` +- `required_sections` +- `required_section_headings` +- `time_horizons` +- `time_periods` +- `scope_conditions` +- `pipeline_tier` +- `response_format` +- `citation_style` + +## 2) Coverage matrix + +Build `coverage-matrix.md` mapping each significant phrase in the query to decomposition items. + +**Gate:** zero `Gap? = YES` rows before moving on. + +## 3) Width sweep + +Collect broad candidate sources and cluster by theme: + +- canonical/primary +- secondary synthesis +- dissenting or contradictory evidence +- recent updates + +Log results in `width-sweep.md`. + +## 4) Depth passes + +Create one note file per major sub-question under `depth-notes/`. + +Each file should include: + +- claims supported +- conflicting evidence +- open uncertainties +- confidence level + +## 5) Synthesis draft + +Write `synthesis.md` with: + +- thesis +- claim graph +- tensions and tradeoffs +- implications + +## 6) Adversarial critique + +Write `critique.md`: + +- unsupported claims +- weak evidence links +- alternative interpretations +- overreach risks + +## 7) Patch and finalize + +- log revisions in `patches.md` +- produce `final-report.md` + +## 8) Verification + +Write `verification.md` with pass/fail for: + +- decomposition completeness +- coverage gap status +- source quality threshold +- citation traceability +- unresolved risk disclosure + +## 9) Scratchpad discipline + +Maintain `scratchpad.md` as append-only notes for long jobs: + +- timestamp +- action +- outcome +- next step diff --git a/docs/AGENTIC-RESEARCH-MVP.md b/docs/AGENTIC-RESEARCH-MVP.md new file mode 100644 index 00000000..47cbd57b --- /dev/null +++ b/docs/AGENTIC-RESEARCH-MVP.md @@ -0,0 +1,93 @@ +# Agentic Research MVP for Workspine + +This MVP distills HyperResearch-style decomposition + multi-pass research into a Workspine-native, repo-first flow usable by any developer when web research is enabled. + +## Goals + +- Add a deterministic, artifact-first **agentic web research harness** to Workspine. +- Preserve Workspine's existing evidence/verification philosophy. +- Keep the first version lightweight enough to run today from existing skills. + +## Architecture (MVP) + +```text +query intake + -> decomposition contract + -> width sweep (parallel source collection) + -> depth passes (focused sub-questions) + -> synthesis + -> adversarial critique + -> patch + finalize + -> verification pack +``` + +### Core artifacts + +All artifacts live under `.planning/research/agentic/`: + +- `scaffold.md` — run config, model/tooling assumptions, citation style. +- `prompt-decomposition.json` — atomic asks, entities, section contract, tier/format. +- `coverage-matrix.md` — phrase-to-atomic coverage audit, zero known gaps requirement. +- `width-sweep.md` — broad source harvest and clustering. +- `depth-notes/*.md` — deep dives by sub-question. +- `synthesis.md` — thesis, claims, confidence, tensions. +- `critique.md` — adversarial reviewer findings. +- `patches.md` — revision deltas applied after critique. +- `final-report.md` — publishable output. +- `verification.md` — evidence gates: source quality, instruction coverage, unresolved risks. +- `scratchpad.md` — append-only operator log for long-running jobs. + +## Workflow mapping into Workspine + +This MVP maps to existing workflow primitives instead of introducing a new binary command: + +1. Use `gsdd-plan` to define research objective/scope in phase plan. +2. Execute using the new harness instructions in `.planning/templates/research/agentic-harness.md`. +3. Verify with `gsdd-verify` plus research-specific checklist from template. + +## Tradeoffs considered + +### 1) Full orchestration engine vs template-first harness + +- **Chosen:** template-first harness. +- **Why:** lowest integration risk; no new runtime dependency or scheduler needed. +- **Tradeoff:** less automation than a bespoke orchestrator; more manual operator discipline. + +### 2) Single-pass writeup vs decomposition-driven pipeline + +- **Chosen:** decomposition-driven pipeline. +- **Why:** better instruction-following and coverage control. +- **Tradeoff:** more artifacts and overhead for simple queries. + +### 3) Maximal source count vs quality-gated source sets + +- **Chosen:** quality-gated source sets. +- **Why:** avoids noisy synthesis and citation bloat. +- **Tradeoff:** may miss fringe signals unless explicitly widened. + +### 4) Purely neutral summary vs adversarial critique pass + +- **Chosen:** adversarial critique required before finalization. +- **Why:** catches unsupported claims and overconfident framing. +- **Tradeoff:** extra latency per report. + +## Operating rules for web-enabled research + +- Prefer primary sources (official docs, papers, filings) where possible. +- Record retrieval date in artifact headers. +- Every major claim in `final-report.md` must trace to at least one source in synthesis notes. +- No unresolved `Gap? = YES` rows in coverage matrix before synthesis. + +## Definition of done (MVP) + +A research run is done when: + +1. Required artifacts exist and are non-empty. +2. Coverage matrix has zero known gaps. +3. Critique findings are either fixed or explicitly accepted with rationale. +4. Final report includes clear assumptions, confidence notes, and source list. +5. Verification file passes all checklist items. + +## Integration note: "latest agentic harness" + +For this MVP, "latest harness" is represented as a **pipeline contract** and artifact schema, not a hard-coded external dependency. This keeps Workspine stable while allowing future upgrades to deeper automation in a backward-compatible way.