1 change: 1 addition & 0 deletions README.md
Original file line number Diff line number Diff line change
@@ -122,6 +122,7 @@ Start with the public proof pack:
- [Exported consumer proof pack](docs/proof/consumer-node-cli/README.md)
- [Runtime support matrix](docs/RUNTIME-SUPPORT.md)
- [Verification discipline](docs/VERIFICATION-DISCIPLINE.md)
- [Agentic research MVP](docs/AGENTIC-RESEARCH-MVP.md)

### Quickstart (after init)

13 changes: 13 additions & 0 deletions agents/researcher.md
@@ -82,3 +82,16 @@ Same algorithm, different scope. The scope is a context input, not a different r
- **Tools required:** Web search, URL fetch, file read, file write; authoritative documentation API strongly recommended
- **Parallelizable:** Yes -- 4 researchers (one per domain: stack, features, architecture, pitfalls) can run simultaneously
- **Context budget:** High -- research is read-heavy with many external fetches. Keep output files focused to avoid downstream bloat.

## Agentic Research Mode (MVP)

When web research is enabled and question complexity is high, run a decomposition-first agentic loop:

1. Decompose into sub-questions with completion criteria.
2. Run parallel worker tracks for each sub-question.
3. Maintain an append-only scratchpad trace for long jobs.
4. Synthesize findings into claim-level evidence with a confidence label per claim.

Use templates:
- `distilled/templates/research/agentic-report.md`
- `distilled/templates/research/agentic-scratchpad.md`
Comment on lines +95 to +97

medium

While the templates are specified, the instructions do not define the expected output filenames for the resulting artifacts. To ensure consistency and allow downstream workflows to locate these files, please specify the target filenames (e.g., AGENTIC-REPORT.md and SCRATCHPAD.md) and their locations (e.g., .planning/research/ or the phase directory).

Suggested change
Use templates:
- `distilled/templates/research/agentic-report.md`
- `distilled/templates/research/agentic-scratchpad.md`
Write output files (e.g., AGENTIC-REPORT.md and SCRATCHPAD.md) using templates:
- distilled/templates/research/agentic-report.md
- distilled/templates/research/agentic-scratchpad.md

44 changes: 44 additions & 0 deletions distilled/templates/research/agentic-report.md
@@ -0,0 +1,44 @@
# Agentic Research Report

**Date:** [YYYY-MM-DD]
**Question:** [Primary research question]
**Scope:** [project | phase]
**Timebox:** [e.g., 45m]

## 1) Executive answer

- [Direct answer in 3-7 bullets]

## 2) Decomposition map

| Sub-question | Why it matters | Completion criteria | Status |
|---|---|---|---|
| [Q1] | [reason] | [done means] | [done/incomplete] |

## 3) Recommendations

1. [Recommendation]
- Confidence: [verified | likely | uncertain]

medium

To maintain consistency with the Researcher role contract and the Synthesizer expectations, the confidence labels should use the established HIGH/MEDIUM/LOW scale instead of the new verified/likely/uncertain labels.

Suggested change
- Confidence: [verified | likely | uncertain]
- Confidence: [HIGH | MEDIUM | LOW]

- Rationale: [short]
- Evidence: [URL1, URL2]
- Tradeoff: [cost/risk]

## 4) Contradictions and ambiguity

- [Conflicting evidence and how it was resolved]

## 5) Non-goals / defer list

- [What we explicitly do not recommend in MVP]

## 6) Planner handoff

- Suggested milestone/phase impact:
- Required technical spikes:
- Open questions that block implementation:

## 7) Source index

| Claim | Source URL | Accessed date |
|---|---|---|
| [claim] | [url] | [YYYY-MM-DD] |
Comment on lines +42 to +44

medium

The Source index table is missing the Confidence column, which is explicitly required by the 'Evidence registry' definition in the MVP documentation (docs/AGENTIC-RESEARCH-MVP.md, line 28). Adding this column ensures that every claim is directly associated with its source and its verified confidence level.

Suggested change
| Claim | Source URL | Accessed date |
|---|---|---|
| [claim] | [url] | [YYYY-MM-DD] |
| Claim | Source URL | Confidence | Accessed date |
|---|---|---|---|
| [claim] | [url] | [HIGH \| MEDIUM \| LOW] | [YYYY-MM-DD] |

23 changes: 23 additions & 0 deletions distilled/templates/research/agentic-scratchpad.md
@@ -0,0 +1,23 @@
# Agentic Research Scratchpad

**Job:** [identifier]
**Started:** [timestamp]
**Owner:** [agent/runtime]

## Running log (append-only)

| Time | Sub-question | Action | Observation | Next step |
|---|---|---|---|---|
| [hh:mm] | [Q1] | [search/query/read] | [finding] | [follow-up] |

## Dead ends / discarded paths

- [Path] -> [Why discarded]

## Confidence shifts

- [Claim] moved from [uncertain] to [likely] because [new evidence]

## Pending validation

- [Claim requiring verification]
65 changes: 65 additions & 0 deletions docs/AGENTIC-RESEARCH-MVP.md
@@ -0,0 +1,65 @@
# Agentic Research MVP (Hyperresearch Distillation)

**Status:** MVP-ready in Workspine
**Date:** 2026-04-29

## Objective

Provide a practical, repo-native agentic web research harness that any developer can run end-to-end. Each run should produce source-backed evidence, a traceable scratchpad, and a handoff suitable for planning and execution workflows.

## Scope

This MVP adds:

1. A dedicated **agent contract** for decomposition-first web research.
2. A **scratchpad protocol** for long-running jobs.
3. A canonical **report template** that is strict about claims/evidence/confidence.
4. A suggested **execution lifecycle** that plugs into existing Workspine planning and verification workflows.

This MVP intentionally does **not** add a new CLI command yet; it is a workflow/contract integration that can be invoked via existing agent surfaces.

## Distilled Architecture

### Components

- **Orchestrator agent**: owns question framing, decomposition, synthesis, and final report quality.
- **Research workers**: parallel sub-agents that run targeted web/document research.
- **Scratchpad**: append-only task log for hypotheses, query paths, dead ends, and evidence links.
- **Evidence registry**: report-level table mapping each claim to source URLs and confidence.
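
The scratchpad and evidence registry described above could be modeled roughly as follows. This is a minimal sketch, not part of the MVP: the class and field names (`ScratchpadEntry`, `Scratchpad`, `EvidenceRow`) are hypothetical and only mirror the columns in the two templates. The confidence field is left as a plain string because the label set is still under review.

```python
from dataclasses import dataclass, field
from datetime import date


@dataclass(frozen=True)
class ScratchpadEntry:
    """One row of the running log (hypothetical shape, mirrors the template columns)."""
    time: str
    sub_question: str
    action: str
    observation: str
    next_step: str


@dataclass
class Scratchpad:
    """Append-only log: entries can be added but not edited or removed via this API."""
    _entries: list[ScratchpadEntry] = field(default_factory=list)

    def append(self, entry: ScratchpadEntry) -> None:
        self._entries.append(entry)

    @property
    def entries(self) -> tuple[ScratchpadEntry, ...]:
        # Return an immutable snapshot so callers cannot rewrite history.
        return tuple(self._entries)


@dataclass(frozen=True)
class EvidenceRow:
    """One evidence registry row: a claim, its source URLs, and a confidence label."""
    claim: str
    source_urls: tuple[str, ...]
    confidence: str  # label set is an assumption left to the caller
    accessed: date
```

Freezing the entry type and exposing only `append` is one way to approximate "append-only" in memory; a durable artifact would still need to be written to the scratchpad file.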

### Data flow

1. Define research question and constraints.
2. Decompose into answerable sub-questions with explicit completion criteria.
3. Run workers in parallel against web-enabled tools.
4. Persist intermediate findings and uncertainty in the scratchpad.
5. Merge, dedupe, contradiction-check, and confidence-score the findings.
6. Emit the final report using the repository template.
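
Steps 3 and 5 of the data flow can be sketched as a parallel fan-out followed by a merge. This is an illustrative sketch under stated assumptions: `run_research`, the `worker` callable, and the `{"claim", "url"}` finding shape are hypothetical and not defined by the MVP. Contradiction checking and confidence scoring are omitted.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable

Finding = dict  # hypothetical shape: {"claim": str, "url": str}


def run_research(
    sub_questions: list[str],
    worker: Callable[[str], list[Finding]],
    max_workers: int = 4,
) -> list[Finding]:
    """Run one worker per sub-question in parallel, then merge and dedupe."""
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # map() preserves input order, so results stay aligned with sub_questions.
        per_question = list(pool.map(worker, sub_questions))

    merged: list[Finding] = []
    seen: set[tuple[str, str]] = set()
    for findings in per_question:
        for finding in findings:
            key = (finding["claim"], finding["url"])
            if key not in seen:  # drop exact duplicate claim/source pairs
                seen.add(key)
                merged.append(finding)
    return merged
```

Deduplicating on the exact (claim, URL) pair is the simplest possible rule; real runs would likely need fuzzier matching, since parallel workers tend to phrase the same claim differently.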

## Tradeoffs (explicit)

- **Speed vs reliability:** parallel search is faster but increases duplicate/noisy evidence; mitigation is mandatory claim-level provenance.
- **Breadth vs depth:** broad decomposition catches blind spots but can dilute deep technical validation; mitigation is a tiered pass (breadth first, depth second for decisive claims).
- **Autonomy vs determinism:** higher autonomy can improve discovery but harm reproducibility; mitigation is strict scratchpad + evidence contracts.
- **Token cost vs auditability:** richer traces cost more, but make handoff and review much stronger; MVP prefers auditability.

## Integration with existing Workspine lifecycle

- During **new-project** or **plan** phases, invoke the new agent contract when web research is available.
- Store outputs under `.planning/research/` (project scope) or the phase directory (phase scope).
- Use verification workflow discipline to check that recommendations are source-backed and actionable.
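
The output-location rule above could be expressed as a small helper. This is a sketch only: `research_output_dir` is a hypothetical name, and because the document does not say where a phase directory lives, the caller is assumed to pass it in.

```python
from pathlib import Path
from typing import Optional


def research_output_dir(
    repo_root: Path, scope: str, phase_dir: Optional[Path] = None
) -> Path:
    """Pick the output directory for a research run based on its scope."""
    if scope == "project":
        return repo_root / ".planning" / "research"
    if scope == "phase":
        if phase_dir is None:
            # The phase directory location is not specified here, so require it.
            raise ValueError("phase scope requires a phase directory")
        return phase_dir
    raise ValueError(f"unknown scope: {scope!r}")
```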

## MVP Operating Contract

1. Never publish a recommendation without at least one linked source.
2. Mark every major claim with confidence (`verified`, `likely`, `uncertain`).

medium

The confidence labels verified, likely, uncertain introduced here conflict with the HIGH, MEDIUM, LOW labels defined in the Researcher role contract (agents/researcher.md, line 53) and used by the Synthesizer (agents/synthesizer.md, line 173). Using inconsistent labels across different research modes will make it harder for downstream agents to process and aggregate findings. Please harmonize these to use the existing HIGH/MEDIUM/LOW scale.

Suggested change
2. Mark every major claim with confidence (`verified`, `likely`, `uncertain`).
2. Mark every major claim with confidence (HIGH, MEDIUM, LOW).

3. Record unresolved questions and contradictions explicitly.
4. Keep the scratchpad as a durable artifact when total runtime exceeds 10 minutes or multi-agent fanout exceeds 3 workers.
5. Include concrete next-step recommendations for planner/roadmapper consumption.
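
Rules 1, 2, and 4 of the contract are mechanical enough to check in code. The sketch below is hypothetical: `Recommendation`, `contract_violations`, and `must_keep_scratchpad` are illustrative names, and the allowed confidence labels are passed in by the caller rather than hard-coded, since the label set is still being discussed in review.

```python
from dataclasses import dataclass


@dataclass
class Recommendation:
    """Hypothetical shape for one recommendation in the report."""
    text: str
    confidence: str
    sources: list[str]


def contract_violations(
    recs: list[Recommendation], allowed_labels: set[str]
) -> list[str]:
    """Return human-readable violations of rules 1 and 2."""
    problems: list[str] = []
    for i, rec in enumerate(recs, start=1):
        if not rec.sources:  # rule 1: at least one linked source
            problems.append(f"recommendation {i}: no linked source")
        if rec.confidence not in allowed_labels:  # rule 2: known confidence label
            problems.append(
                f"recommendation {i}: unknown confidence {rec.confidence!r}"
            )
    return problems


def must_keep_scratchpad(runtime_minutes: float, worker_count: int) -> bool:
    """Rule 4: keep the scratchpad if runtime exceeds 10 minutes or fanout exceeds 3."""
    return runtime_minutes > 10 or worker_count > 3
```

A check like this could run as part of the verification workflow before a report is handed to the planner; rules 3 and 5 concern content quality and are not covered here.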

## Suggested rollout path

1. **Now (this MVP):** markdown contracts + templates + agent role.
2. **Next:** add `gsdd-research-agentic` workflow wrapper command.
3. **Later:** evidence schema linting and freshness TTL checks for cited web sources.