From 544021093dd8fe4245117b5ce65568516bed85de Mon Sep 17 00:00:00 2001
From: =?UTF-8?q?Patrick=20Rossell=C3=B3=20Colom?= <74001504+PatrickSys@users.noreply.github.com>
Date: Sun, 3 May 2026 13:35:10 +0200
Subject: [PATCH] Add agentic web research MVP contracts and templates

---
 README.md                                     |  1 +
 agents/researcher.md                          | 13 ++++
 .../templates/research/agentic-report.md      | 44 +++++++++++++
 .../templates/research/agentic-scratchpad.md  | 23 +++++++
 docs/AGENTIC-RESEARCH-MVP.md                  | 65 +++++++++++++++++++
 5 files changed, 146 insertions(+)
 create mode 100644 distilled/templates/research/agentic-report.md
 create mode 100644 distilled/templates/research/agentic-scratchpad.md
 create mode 100644 docs/AGENTIC-RESEARCH-MVP.md

diff --git a/README.md b/README.md
index 2d3ddafa..8527d7d0 100644
--- a/README.md
+++ b/README.md
@@ -122,6 +122,7 @@ Start with the public proof pack:
 - [Exported consumer proof pack](docs/proof/consumer-node-cli/README.md)
 - [Runtime support matrix](docs/RUNTIME-SUPPORT.md)
 - [Verification discipline](docs/VERIFICATION-DISCIPLINE.md)
+- [Agentic research MVP](docs/AGENTIC-RESEARCH-MVP.md)
 
 ### Quickstart (after init)
 
diff --git a/agents/researcher.md b/agents/researcher.md
index 083e3e50..9bbbf236 100644
--- a/agents/researcher.md
+++ b/agents/researcher.md
@@ -82,3 +82,16 @@ Same algorithm, different scope. The scope is a context input, not a different r
 - **Tools required:** Web search, URL fetch, file read, file write; authoritative documentation API strongly recommended
 - **Parallelizable:** Yes -- 4 researchers (one per domain: stack, features, architecture, pitfalls) can run simultaneously
 - **Context budget:** High -- research is read-heavy with many external fetches. Keep output files focused to avoid downstream bloat.
+
+## Agentic Research Mode (MVP)
+
+When web research is enabled and question complexity is high, run a decomposition-first agentic loop:
+
+1. Decompose into sub-questions with completion criteria.
+2. Run parallel worker tracks for each sub-question.
+3. Maintain append-only scratchpad trace for long jobs.
+4. Synthesize into claim-level evidence + confidence.
+
+Use templates:
+- `distilled/templates/research/agentic-report.md`
+- `distilled/templates/research/agentic-scratchpad.md`
diff --git a/distilled/templates/research/agentic-report.md b/distilled/templates/research/agentic-report.md
new file mode 100644
index 00000000..f3664f09
--- /dev/null
+++ b/distilled/templates/research/agentic-report.md
@@ -0,0 +1,44 @@
+# Agentic Research Report
+
+**Date:** [YYYY-MM-DD]
+**Question:** [Primary research question]
+**Scope:** [project | phase]
+**Timebox:** [e.g., 45m]
+
+## 1) Executive answer
+
+- [Direct answer in 3-7 bullets]
+
+## 2) Decomposition map
+
+| Sub-question | Why it matters | Completion criteria | Status |
+|---|---|---|---|
+| [Q1] | [reason] | [done means] | [done/incomplete] |
+
+## 3) Recommendations
+
+1. [Recommendation]
+   - Confidence: [verified | likely | uncertain]
+   - Rationale: [short]
+   - Evidence: [URL1, URL2]
+   - Tradeoff: [cost/risk]
+
+## 4) Contradictions and ambiguity
+
+- [Conflicting evidence and how it was resolved]
+
+## 5) Non-goals / defer list
+
+- [What we explicitly do not recommend in MVP]
+
+## 6) Planner handoff
+
+- Suggested milestone/phase impact:
+- Required technical spikes:
+- Open questions that block implementation:
+
+## 7) Source index
+
+| Claim | Source URL | Accessed date |
+|---|---|---|
+| [claim] | [url] | [YYYY-MM-DD] |
diff --git a/distilled/templates/research/agentic-scratchpad.md b/distilled/templates/research/agentic-scratchpad.md
new file mode 100644
index 00000000..ef206f33
--- /dev/null
+++ b/distilled/templates/research/agentic-scratchpad.md
@@ -0,0 +1,23 @@
+# Agentic Research Scratchpad
+
+**Job:** [identifier]
+**Started:** [timestamp]
+**Owner:** [agent/runtime]
+
+## Running log (append-only)
+
+| Time | Sub-question | Action | Observation | Next step |
+|---|---|---|---|---|
+| [hh:mm] | [Q1] | [search/query/read] | [finding] | [follow-up] |
+
+## Dead ends / discarded paths
+
+- [Path] -> [Why discarded]
+
+## Confidence shifts
+
+- [Claim] moved from [uncertain] to [likely] because [new evidence]
+
+## Pending validation
+
+- [Claim requiring verification]
diff --git a/docs/AGENTIC-RESEARCH-MVP.md b/docs/AGENTIC-RESEARCH-MVP.md
new file mode 100644
index 00000000..9c15efb1
--- /dev/null
+++ b/docs/AGENTIC-RESEARCH-MVP.md
@@ -0,0 +1,65 @@
+# Agentic Research MVP (Hyperresearch Distillation)
+
+**Status:** MVP-ready in Workspine
+**Date:** 2026-04-29
+
+## Objective
+
+Provide a practical, repo-native agentic web research harness that any developer can run end-to-end with evidence, scratchpad traceability, and handoff quality suitable for planning/execution workflows.
+
+## Scope
+
+This MVP adds:
+
+1. A dedicated **agent contract** for decomposition-first web research.
+2. A **scratchpad protocol** for long-running jobs.
+3. A canonical **report template** that is strict about claims/evidence/confidence.
+4. A suggested **execution lifecycle** that plugs into existing Workspine planning and verification workflows.
+
+This MVP intentionally does **not** add a new CLI command yet; it is a workflow/contract integration that can be invoked via existing agent surfaces.
+
+## Distilled Architecture
+
+### Components
+
+- **Orchestrator agent**: owns question framing, decomposition, synthesis, and final report quality.
+- **Research workers**: parallel sub-agents that run targeted web/document research.
+- **Scratchpad**: append-only task log for hypotheses, query paths, dead ends, and evidence links.
+- **Evidence registry**: report-level table mapping each claim to source URLs and confidence.
+
+### Data flow
+
+1. Define research question and constraints.
+2. Decompose into answerable sub-questions with explicit completion criteria.
+3. Run workers in parallel against web-enabled tools.
+4. Persist intermediate findings and uncertainty in scratchpad.
+5. Merge, dedupe, contradiction-check, and confidence-score.
+6. Emit final report in repository template.
+
+## Tradeoffs (explicit)
+
+- **Speed vs reliability:** parallel search is faster but increases duplicate/noisy evidence; mitigation is mandatory claim-level provenance.
+- **Breadth vs depth:** broad decomposition catches blindspots but can dilute deep technical validation; mitigation is tiered pass (breadth first, depth second for decisive claims).
+- **Autonomy vs determinism:** higher autonomy can improve discovery but harm reproducibility; mitigation is strict scratchpad + evidence contracts.
+- **Token cost vs auditability:** richer traces cost more, but make handoff and review much stronger; MVP prefers auditability.
+
+## Integration with existing Workspine lifecycle
+
+- During **new-project** or **plan** phases, invoke the new agent contract when web research is available.
+- Store outputs under `.planning/research/` (project) or phase directory (phase-specific).
+- Use verification workflow discipline to check that recommendations are source-backed and actionable.
+
+## MVP Operating Contract
+
+1. Never publish a recommendation without at least one linked source.
+2. Mark every major claim with confidence (`verified`, `likely`, `uncertain`).
+3. Record unresolved questions and contradictions explicitly.
+4. Keep scratchpad as durable artifact when total runtime exceeds 10 minutes or multi-agent fanout > 3 workers.
+5. Include concrete next-step recommendations for planner/roadmapper consumption.
+
+## Suggested rollout path
+
+1. **Now (this MVP):** markdown contracts + templates + agent role.
+2. **Next:** add `gsdd-research-agentic` workflow wrapper command.
+3. **Later:** evidence schema linting and freshness TTL checks for cited web sources.
+
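
The patch ships the MVP Operating Contract as prose only. As a rough illustration of how rules 1, 2, and 4 might later be checked mechanically (the "evidence schema linting" step in the rollout path), here is a minimal Python sketch. It is not part of the patch: `Recommendation`, `check_contract`, and `needs_durable_scratchpad` are hypothetical names, and the in-memory shape is an assumption, since the patch defines only markdown templates and no parser.

```python
from dataclasses import dataclass, field

# Confidence labels listed in rule 2 of the MVP Operating Contract.
ALLOWED_CONFIDENCE = {"verified", "likely", "uncertain"}


@dataclass
class Recommendation:
    """Hypothetical in-memory form of one entry under '3) Recommendations'."""
    text: str
    confidence: str
    evidence: list[str] = field(default_factory=list)  # source URLs


def check_contract(recs: list[Recommendation]) -> list[str]:
    """Return human-readable violations of contract rules 1 and 2."""
    problems = []
    for i, rec in enumerate(recs, start=1):
        # Rule 1: every recommendation needs at least one linked source.
        if not rec.evidence:
            problems.append(f"recommendation {i}: no linked source")
        # Rule 2: confidence must be one of the three contract labels.
        if rec.confidence not in ALLOWED_CONFIDENCE:
            problems.append(
                f"recommendation {i}: unknown confidence {rec.confidence!r}"
            )
    return problems


def needs_durable_scratchpad(runtime_minutes: float, workers: int) -> bool:
    """Rule 4: runtime above 10 minutes or fanout above 3 workers."""
    return runtime_minutes > 10 or workers > 3


if __name__ == "__main__":
    recs = [
        Recommendation("Adopt the report template", "likely",
                       ["https://example.com/a"]),
        Recommendation("Skip the scratchpad", "probably"),
    ]
    for problem in check_contract(recs):
        print(problem)
    print(needs_durable_scratchpad(runtime_minutes=12, workers=2))
```

Both thresholds are read as strict inequalities, matching the contract's wording ("exceeds 10 minutes", "> 3 workers"), so a 10-minute job with exactly 3 workers would not require a durable scratchpad under this reading.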