diff --git a/plugins/asta-preview/skills/research-step/SKILL.md b/plugins/asta-preview/skills/research-step/SKILL.md index 0d2fcee..2d6c3da 100644 --- a/plugins/asta-preview/skills/research-step/SKILL.md +++ b/plugins/asta-preview/skills/research-step/SKILL.md @@ -1,18 +1,18 @@ --- name: research-step -description: Plan and execute autonomous research as a graph of typed tasks tracked in beads. Use when working from a mission.md to drive multi-step research with explicit dependencies and structured outputs. -allowed-tools: Bash(bd:*) Bash(date:*) Bash(scripts/*) Read(assets/**) Read(workflows/**) Read(scripts/**) Skill(asta:*) Skill(asta-preview:*) Skill(asta-plugins:*) +description: Plan and execute autonomous research as a graph of typed tasks tracked in beads, driven by a YAML template (`hypothesis_driven_research` or `grounded_theory_generation`). Use when working from a `mission.md` to drive multi-step research with explicit dependencies and structured outputs. +allowed-tools: Bash(bd:*) Bash(date:*) Bash(scripts/*) Bash(asta autodiscovery *) Bash(asta literature *) Bash(asta generate-theories *) Bash(jsonschema:*) Read(assets/**) Read(workflows/**) Read(scripts/**) Read(templates/**) Skill(asta:*) Skill(asta-preview:*) Skill(asta-plugins:*) --- # Research Step -Models a research session as a beads epic. Each unit of work is a typed sub-issue whose `metadata.research_step.output` matches a JSON schema in `assets/schemas.yaml`. +Models a research session as a beads epic. Each unit of work is a typed sub-issue whose structured output (`.asta/tasks//output.json`) matches a JSON schema in `assets/schemas.yaml`. This skill is a **router**. Inspect the working directory and the user's request, pick one workflow, then read its `.md` file in `workflows/` and follow it. Do not execute a workflow from memory — always open the file first. ## Setup -There are no hard preconditions. If `mission.md` does not exist, the **brainstorm** workflow will help the user draft one. 
+There are no hard preconditions. If `mission.md` does not exist, the **brainstorm** workflow will help the user draft one and pick a template. Installing `bd` and `jq`, running `bd init`, and verifying `scripts/summary-check.sh` works are the responsibility of the **init** workflow. Once `init` has run, subsequent workflows assume the environment is ready. @@ -20,22 +20,23 @@ Installing `bd` and `jq`, running `bd init`, and verifying `scripts/summary-chec | Path | Role | |---|---| -| `mission.md` | Input. The research task. | +| `mission.md` | Input. The research task. May carry a `template:` hint in frontmatter (chosen by brainstorm). | | `.beads/` | Source of truth for state. | | `summary.md` | Derived view of the session, regenerated by **update-summary**. Beads is the source of truth; this file is just a digest for humans and for **brainstorm**. Frontmatter `beads_snapshot` records the state it was rendered from. | -| `background_knowledge.txt` | Optional. Long-form context referenced from issue metadata via `summary_path`. | +| `templates/` | YAML plan templates (`hypothesis_driven_research.yaml`, `grounded_theory_generation.yaml`, plus optional `strategies/`). The epic's `metadata.research_step.template` field names which one drives the session. | +| `.asta/tasks//` | Per-task working directory. Holds `input.md`, `input.json`, `output.md`, `output.json`, and any task-type-specific sidecar files (e.g., `extraction_schema.json`, `theories.json`). | ## Workflows | Name | Purpose | Detailed instructions | |---|---|---| -| **brainstorm** | Default. Conversational exploration of current state; drafts/refines `mission.md`; hands off to other workflows when the user is ready to act. | `workflows/brainstorm.md` | -| **init** | Set up the environment: install `bd`/`jq`, run `bd init`, verify `scripts/summary-check.sh`. Hands off to **plan**. | `workflows/init.md` | -| **plan** | Create or extend the graph. 
Bootstraps the epic + initial frontier from `mission.md`, or replans downstream tasks after a closed task. | `workflows/plan.md` | -| **execute** | Run one ready task end-to-end. Hands off to **plan** when the closed task type unlocks new structure; otherwise to **update-summary**. | `workflows/execute.md` | +| **brainstorm** | Default. Conversational exploration of current state; drafts/refines `mission.md`; selects a template; hands off to other workflows when the user is ready to act. | `workflows/brainstorm.md` | +| **init** | Set up the environment: install `bd`/`jq`, run `bd init`, create `.asta/` skeleton, verify `scripts/summary-check.sh`. Hands off to **plan**. | `workflows/init.md` | +| **plan** | Create or extend the graph. Reads the epic's chosen template and walks its YAML to bootstrap or replan downstream tasks. The agent is the walker — no separate walker script. | `workflows/plan.md` | +| **execute** | Run one ready task end-to-end. Renders `input.md` / `input.json` from the issue's metadata + upstream task outputs, invokes the agent, validates the result, and closes the issue. Hands off to **plan** when the closed task type unlocks new structure; otherwise to **update-summary**. | `workflows/execute.md` | | **update-summary** | Regenerate `summary.md` from beads. Idempotent — no-op when `scripts/summary-check.sh` reports `status: fresh`. | `workflows/update-summary.md` | -Task-type schemas live in `assets/schemas.yaml`. +Output schemas live in `assets/schemas.yaml`. Output schemas for theorizer-derived task types are referenced via `bash_ref:` (the validator runs the upstream CLI at validate time to fetch the canonical shape; see §6.3 of `spec.md`). ## Routing @@ -45,14 +46,31 @@ If the user named a workflow ("init the research", "refresh the summary", "run t ### 2. Otherwise → brainstorm -If the user did not name a workflow, run **brainstorm**. 
It inspects the working directory, answers the user's question, drafts or refines `mission.md` when appropriate, and hands off to `init` / `plan` / `execute` / `update-summary` once the user is ready to act. +If the user did not name a workflow, run **brainstorm**. It inspects the working directory, answers the user's question, drafts or refines `mission.md` when appropriate, **selects a plan template** when one isn't yet chosen, and hands off to `init` / `plan` / `execute` / `update-summary` once the user is ready to act. ### 3. Chaining - **init** → always run **plan** afterwards (which then chains to **update-summary**). - **plan** → always run **update-summary** afterwards so the digest reflects the new graph. -- **execute** → if the closed task type is `literature_review`, `hypothesis`, `analysis`, or `synthesis`, chain to **plan** (which chains to **update-summary**); otherwise chain directly to **update-summary**. +- **execute** → if the closed task type is `literature_review`, `hypothesis`, `analysis`, `synthesis`, `auto_discovery`, `extraction_schema_design`, `theorizer_extraction`, `theory_generation`, `grounded_theory_generation`, or `novelty_assessment`, chain to **plan** (which chains to **update-summary**); for `scope`, `definitions`, `experiment_design`, or `evidence_gathering`, chain directly to **update-summary**. - **update-summary** and **brainstorm** → never chain. +- **brainstorm** performs template selection via `AskUserQuestion` once a `mission.md` exists but no epic does; the chosen template name is stored on the epic at bootstrap time and read by `plan.md` on every invocation. + +## Conventions for `input.md` / `output.md` files + +Both files live in each task's working dir at `.asta/tasks//`. They are the human-readable surface of the task; the structured surface lives in `input.json` / `output.json`. The following conventions apply to whoever writes them: + +- **`input.md` is a short brief** (~2-4 sentences). 
Plan writes it at task-create time, once the upstream `output.json` files are on disk and the task's own `input_instructions` have been interpolated. It says what this task is about and where its inputs come from. Not the full prompt — the full prompt lives in `metadata.research_step.input_instructions` and is what the executing agent works against. +- **`output.md` is the narrative artifact** for the task. Written by the executing agent during the work step. Length depends on the task. +- **Data-file references in markdown links are file-relative.** Any markdown link to a local data file (CSV, JSON, log, figure, notebook) in `input.md` or `output.md` must be relative to the file containing the link — the same convention standard markdown viewers and the asta-flows web UI use. From a task's `output.md` at `.asta/tasks//output.md`, link the AutoDS metadata as `[label](../../autods-run/metadata.json)`, not `[label](.asta/autods-run/metadata.json)` (the second form resolves to `.asta/tasks//.asta/autods-run/...` and breaks). Absolute paths under `/Users/`, `/home/`, or `/private/`, and `file://` URLs, are rejected by `scripts/validate-output.sh` (the asta-flows `/api/artifact` endpoint refuses anything outside the run dir). If an upstream tool emits a file outside the run dir, the importing task is responsible for copying or symlinking it under `.asta/` before referencing it. +- **Path field values inside `.json` files are run-root-relative.** Fields like `metadata_path`, `nodes_path`, `log_path`, `schema_path`, `extraction_results_path`, etc. inside `output.json` (and any sidecar) take paths relative to the run root (e.g. `.asta/autods-run/metadata.json`). These are machine-read; the consumer joins them against the run root. +- **Citation strategy (applies to both `input.md` and `output.md`).** Back up every non-obvious statement with a hyperlink to the file that grounds it. Two kinds: + - **Task-internal citations** — references to another task in this epic. 
Form: `` [`bd-id`](.asta/tasks//output.md) ``. Prefer the task's `output.md`. Use a sidecar (`theories.json`, `novelty_results.json`, `extraction_schema.json`, `extraction_results.json`) only when the claim lives only in structured form. + - **Literature citations** — references to a published paper. Form: `[Author Year](url)` for single-author cites (`[Bahr 1997](https://…)`), `[Author1, Author2 & Author3 Year](url)` for two-or-three-author cites (`[Bahr, Pfeffer & Kaser 2015](https://…)`), and `[Author1, Author2 et al. Year](url)` for four-or-more (`[Christian, Whorton et al. 2022](https://…)`). The `url` comes from the citing task's `output.json` `citations[].url`; the literature_review task that surfaced the paper is the canonical source for that URL. If a local copy of the paper exists (e.g. under `.asta/literature/`), link to that path instead; otherwise use the DOI, arXiv URL, or Semantic Scholar URL recorded in the citation. + + What counts as "non-obvious": any quantitative claim, comparison to a published result, methodological choice, or domain assertion that isn't shared common knowledge in the field. Framing sentences ("This section is about X") don't need citations; specific claims about how things work or what the data shows do. + +The citation convention itself is enforced by example, not by `validate-output.sh`; the run-relative-path rule above **is** enforced — see `scripts/validate-output.sh`. ## Boundaries diff --git a/plugins/asta-preview/skills/research-step/assets/schemas.yaml b/plugins/asta-preview/skills/research-step/assets/schemas.yaml index b840628..4b85766 100644 --- a/plugins/asta-preview/skills/research-step/assets/schemas.yaml +++ b/plugins/asta-preview/skills/research-step/assets/schemas.yaml @@ -1,80 +1,200 @@ -# Output schemas for research-step task types. -# Each task issue stores its realized output at metadata.research_step.output, -# matching the shape under `output:` for its task_type. 
+# Output schemas for research-step task types — v2. +# +# Each task type's `output:` entry describes what `.asta/tasks//output.json` +# must contain when the task closes. `validate-output.sh` reads this file. +# +# Two output shapes are supported: +# 1. Inline shape — `output:` is a mapping of field-name -> type/description. +# validate-output.sh requires every top-level key to be present. +# 2. CLI reference — `output:` is `{bash_ref: ""}`. validate-output.sh +# runs the command at validate time, expects a JSON Schema on stdout, and +# validates output.json against it. Used for theorizer / autodiscovery +# derived task types whose shape is owned upstream. +# +# `inputs:` documents which upstream task types feed this one. It's +# informational — the runtime envelope's `inputs[]` (a list of beads IDs) is +# what's enforced. +# +# `config:` documents which flat key/value pairs are recognized on +# metadata.research_step.config. Templates use these for `when:` branching and +# loop variable propagation. +# +# `execute_ref:` is an optional asta CLI subcommand to invoke for this task. +# When present, the executing agent uses exactly that subcommand. Absent +# means there is no canonical CLI surface; the agent does the work itself. -schema_version: 1 +schema_version: 2 task_types: scope: inputs: [] output: - question: string # the precise research question - boundaries: [string] # what is in / out of scope - success_criteria: [string] # how we know we have answered it + question: string + boundaries: array + success_criteria: array definitions: inputs: [scope] output: - terms: - - name: string - operational_definition: string - rationale: string + terms: array literature_review: + # Reused in two modes: + # (a) v1 / hypothesis_driven_research: scope + definitions → key_findings + gaps + citations. + # (b) lane: scope + definitions + auto_discovery + themes_and_gaps_synthesis → + # same shape; gaps[] is typically empty (the theme/gap is already upstream). 
inputs: [scope, definitions] + config: + lane: "enum [null, theme, gap]" + theme: dict + gap: dict + thread_dir: string + execute_ref: "asta literature interactive" output: - summary_path: string # relative path; long-form context - key_findings: [string] # 3-10 bullets readable without opening summary_path - gaps: [string] # gaps that motivate hypotheses - citations: - - id: string - title: string - url: string - relevance: string + key_findings: array + gaps: array + citations: array hypothesis: + # Theory-shaped (see spec.md §5.1). Used by hypothesis_driven_research + # template only. inputs: [scope, literature_review] + config: + mode: "enum [literature_fanout]" output: - statement: string # H_n: ... + statement: string rationale: string falsifiable_prediction: string - expected_evidence: [string] + expected_evidence: array experiment_design: inputs: [hypothesis] + config: + lane: "enum [null, theme, gap]" + reproduction: boolean + theme: dict + gap: dict output: method: string - procedure: [string] # ordered steps - variables: - independent: [string] - dependent: [string] - controls: [string] - artifacts_expected: [string] # paths the gathering step will produce + procedure: array + variables: dict + artifacts_expected: array evidence_gathering: inputs: [experiment_design] + config: + lane: "enum [null, theme, gap]" + reproduction: boolean output: - artifacts: - - path: string - kind: string # data | log | figure | code | other - description: string - log_path: string # what was actually run - deviations: [string] # ways execution diverged from design + artifacts: array + log_path: string + deviations: array analysis: + # Reused for the per-hypothesis-law reproduction step in lane mode. + # In lane mode, beads metadata.research_step.config.target_law carries + # {node_id, hypothesis, original_analysis}. Reproduction code, figures, + # and log files are referenced via markdown links in output.md. 
inputs: [hypothesis, evidence_gathering] + config: + lane: "enum [null, theme, gap]" + target_law: dict output: - verdict: enum [supported, refuted, inconclusive] - confidence: number # 0.0 - 1.0 + verdict: "enum [supported, refuted, inconclusive]" + confidence: number reasoning: string - caveats: [string] + caveats: array synthesis: - inputs: [scope, analysis_*] # all analysis issues in the epic + # Multiple run_kinds, distinguished by config.run_kind: + # - themes_and_gaps: after auto_discovery; outputs themes[] + gaps[]. + # - per_theme_gap_lane: closes one theme/gap lane. + # - across_lanes: rolls up every per-lane synthesis. + # - lit_fanout: closes the hypothesis_driven_research lit-fanout phase. + # - report: final closing artifact (output.md is the user-facing report). + # + # NOTE: validate-output.sh only enforces the four base fields below + # (answer, supporting_hypotheses, refuted_hypotheses, open_questions). + # run_kind-specific output fields (themes/gaps/candidate_papers) are + # documented here and required by the template's output_instructions + # but are NOT structurally enforced. If you need stricter validation + # per run_kind, extend validate-output.sh to read config.run_kind. 
+ inputs: [scope, "analysis_*"] + config: + run_kind: "enum [themes_and_gaps, per_theme_gap_lane, across_lanes, lit_fanout, report]" + lane: "enum [null, theme, gap]" output: - answer: string # answer to scope.question - supporting_hypotheses: [bd_id] - refuted_hypotheses: [bd_id] - open_questions: [string] # become discovered-from edges on re-plan - report_path: string # generated markdown report + answer: string + supporting_hypotheses: array + refuted_hypotheses: array + open_questions: array + # Optional fields, populated only by specific run_kinds: + # - themes: array of {name, description, supporting_laws} (themes_and_gaps only) + # - gaps: array of {summary, why_open, related_laws} (themes_and_gaps only) + # - candidate_papers: array of bd-ids / paper refs (per_theme_gap_lane, across_lanes) + + auto_discovery: + inputs: [scope, definitions] + config: + run_pointer: string + execute_ref: "asta autodiscovery run" + output: + bash_ref: "asta autodiscovery --help" + + # Theorizer task types. execute_ref names the typed asta subcommand to + # invoke; input_schema_ref names the describe subcommand that returns + # that skill's resolved input JSON Schema (run it to discover the + # flags + types accepted by execute_ref). Both rely on + # `asta generate-theories card` having been fetched at least once + # (see `asta generate-theories --refresh-card card`). 
+ + extraction_schema_design: + inputs: ["synthesis_across_lanes"] + config: + phase: "enum [theorizer]" + theorizer_config: dict + execute_ref: "asta generate-theories build-extraction-schema" + input_schema_ref: "asta generate-theories describe build-extraction-schema" + output: + bash_ref: "asta generate-theories card" + + theorizer_extraction: + inputs: ["synthesis_across_lanes", "extraction_schema_design"] + config: + theorizer_config: dict + execute_ref: "asta generate-theories find-and-extract" + input_schema_ref: "asta generate-theories describe find-and-extract" + output: + bash_ref: "asta generate-theories card" + + theory_generation: + inputs: ["synthesis_across_lanes", "extraction_schema_design", "theorizer_extraction"] + config: + phase: "enum [theorizer]" + theorizer_config: dict + execute_ref: "asta generate-theories form-theory" + input_schema_ref: "asta generate-theories describe form-theory" + output: + bash_ref: "asta generate-theories card" + + grounded_theory_generation: + inputs: ["auto_discovery", "theory_generation", "synthesis_across_lanes"] + + novelty_assessment: + inputs: ["grounded_theory_generation"] + config: + theorizer_config: dict + execute_ref: "asta generate-theories evaluate-novelty" + input_schema_ref: "asta generate-theories describe evaluate-novelty" + output: + bash_ref: "asta generate-theories card" + + resume_extraction: + inputs: ["theorizer_extraction"] + config: + run_id: string + execute_ref: "asta generate-theories resume-extraction" + input_schema_ref: "asta generate-theories describe resume-extraction" + output: + bash_ref: "asta generate-theories card" diff --git a/plugins/asta-preview/skills/research-step/scripts/validate-output.sh b/plugins/asta-preview/skills/research-step/scripts/validate-output.sh index 0f5a84e..8f9287d 100755 --- a/plugins/asta-preview/skills/research-step/scripts/validate-output.sh +++ b/plugins/asta-preview/skills/research-step/scripts/validate-output.sh @@ -1,28 +1,29 @@ #!/usr/bin/env 
bash -# validate-output.sh — structural validation of a research_step output JSON. +# validate-output.sh — structural validation of a research_step output.json. # -# Usage: validate-output.sh +# Usage: validate-output.sh # -# Verifies that the JSON file: -# 1. parses -# 2. carries the canonical metadata envelope -# ({research_step: {task_type, inputs, output_schema_version, output}}) -# 3. has every required `output.` for the given per -# assets/schemas.yaml (schema_version: 1) +# Behavior: +# - Look up the task_type in assets/schemas.yaml. +# - If the entry's `output:` has a `bash_ref:` field, run the command, +# pipe stdout into a JSON Schema validator, and check output.json +# against it. If the command fails or stdout is not valid JSON, +# warn and treat the output as opaque (exit 0). +# - Otherwise (inline output schema): verify every required field +# listed in schemas.yaml; preserve the v1 type spot-checks +# (analysis.verdict, analysis.confidence, etc.). # # Exit codes: -# 0 — valid -# 2 — JSON parse error -# 3 — unknown task_type -# 4 — missing required field -# 5 — task_type mismatch with envelope -# -# This is structural validation only. Quality validation (sound prediction, -# sane confidence, valid citations) is out of scope per execute.md. +# 0 — valid (or bash_ref unavailable + warning emitted) +# 2 — JSON parse error on output.json +# 3 — unknown task_type +# 4 — missing required field (inline schema mode) +# 5 — bash_ref command failed in an unexpected way (e.g. permission denied) +# 6 — sibling output.md has an absolute path or file:// URL set -euo pipefail if [[ $# -ne 2 ]]; then - echo "usage: validate-output.sh " >&2 + echo "usage: validate-output.sh " >&2 exit 1 fi @@ -34,69 +35,122 @@ if ! jq -e . "$file" > /dev/null 2>&1; then exit 2 fi -# Required output fields, mirroring assets/schemas.yaml (schema_version: 1). 
-case "$task_type" in - scope) required="question boundaries success_criteria" ;; - definitions) required="terms" ;; - literature_review) required="summary_path key_findings gaps citations" ;; - hypothesis) required="statement rationale falsifiable_prediction expected_evidence" ;; - experiment_design) required="method procedure variables artifacts_expected" ;; - evidence_gathering) required="artifacts log_path deviations" ;; - analysis) required="verdict confidence reasoning caveats" ;; - synthesis) required="answer supporting_hypotheses refuted_hypotheses open_questions report_path" ;; - *) - echo "validate-output: unknown task_type '$task_type'" >&2 - echo "validate-output: expected one of scope|definitions|literature_review|hypothesis|experiment_design|evidence_gathering|analysis|synthesis" >&2 - exit 3 - ;; -esac - -# Envelope must carry the matching task_type so we don't validate scope JSON -# against an analysis schema by accident. -envelope_type=$(jq -r '.research_step.task_type // empty' "$file") -if [[ -z "$envelope_type" ]]; then - echo "validate-output: $file missing .research_step.task_type" >&2 - exit 5 +# Resolve the path to schemas.yaml relative to this script. +script_dir="$(cd "$(dirname "$0")" && pwd)" +schemas="$script_dir/../assets/schemas.yaml" +if [[ ! -f "$schemas" ]]; then + echo "validate-output: schemas.yaml not found at $schemas" >&2 + exit 3 fi -if [[ "$envelope_type" != "$task_type" ]]; then - echo "validate-output: envelope task_type='$envelope_type' but expected '$task_type'" >&2 - exit 5 + +# Pull the task_type's `output:` entry from schemas.yaml. We use a tiny +# python shim to keep us out of yq-flavor land. Three possible shapes +# come back on stdout: +# - "bash_ref:" — resolve via the referenced CLI +# - "required: ..." 
— inline required-field list +# - "missing" — unknown task_type +output_spec=$(python3 - "$schemas" "$task_type" <<'PY' +import sys, yaml +schemas_path, task_type = sys.argv[1], sys.argv[2] +with open(schemas_path) as f: + data = yaml.safe_load(f) +entry = (data.get("task_types") or {}).get(task_type) +if not entry: + print("missing") + sys.exit(0) +output = entry.get("output") +if isinstance(output, dict) and "bash_ref" in output: + print(f"bash_ref:{output['bash_ref']}") +elif isinstance(output, dict): + # Inline output schema: top-level dict keys are required field names + # (a small departure from a strict JSON Schema, matching what the + # spec writes). + required = " ".join(output.keys()) + print(f"required:{required}") +elif isinstance(output, list): + # Free-form list shape — treat as opaque. + print("required:") +else: + print("required:") +PY +) + +if [[ "$output_spec" == "missing" ]]; then + echo "validate-output: unknown task_type '$task_type'" >&2 + exit 3 fi -# Envelope shape sanity. -for key in inputs output_schema_version output; do - if ! jq -e ".research_step | has(\"$key\")" "$file" >/dev/null; then - echo "validate-output: $file missing .research_step.$key" >&2 - exit 5 +if [[ "$output_spec" == bash_ref:* ]]; then + cmd="${output_spec#bash_ref:}" + # Try to run the command and validate output.json against its JSON Schema. + if schema_stdout=$($cmd 2>/dev/null); then + if echo "$schema_stdout" | jq -e . >/dev/null 2>&1; then + # Try jsonschema if available; else just check the output parses. 
+ if command -v jsonschema >/dev/null 2>&1; then + if echo "$schema_stdout" | jsonschema -i "$file" /dev/stdin >/dev/null 2>&1; then + echo "ok (validated against bash_ref: $cmd)" + exit 0 + else + echo "validate-output: output.json failed schema from '$cmd'" >&2 + exit 4 + fi + else + echo "warn: jsonschema CLI not installed; bash_ref schema parsed but unchecked" + exit 0 + fi + else + echo "warn: bash_ref '$cmd' stdout is not JSON; treating output.json as opaque" + exit 0 + fi + else + echo "warn: bash_ref command '$cmd' failed; treating output.json as opaque" + exit 0 fi -done +fi + +# Inline schema mode — output_spec is "required: ..." +required="${output_spec#required:}" -# Required output fields. +# Check every required top-level field. for key in $required; do - if ! jq -e ".research_step.output | has(\"$key\")" "$file" >/dev/null; then - echo "validate-output: missing required field 'output.$key' for task_type '$task_type'" >&2 + if ! jq -e "has(\"$key\")" "$file" >/dev/null 2>&1; then + echo "validate-output: missing required field '$key' for task_type '$task_type'" >&2 exit 4 fi done -# Type spot-checks for the high-leverage cases. Not exhaustive — just the -# fields where a wrong type at this layer would silently break update-summary rendering -# or downstream tasks. +# Type spot-checks for high-leverage cases. 
case "$task_type" in literature_review) - jq -e '.research_step.output.key_findings | type == "array"' "$file" >/dev/null \ - || { echo "validate-output: output.key_findings must be an array" >&2; exit 4; } - jq -e '.research_step.output.gaps | type == "array"' "$file" >/dev/null \ - || { echo "validate-output: output.gaps must be an array" >&2; exit 4; } - jq -e '.research_step.output.citations | type == "array"' "$file" >/dev/null \ - || { echo "validate-output: output.citations must be an array" >&2; exit 4; } + jq -e '.key_findings | type == "array"' "$file" >/dev/null \ + || { echo "validate-output: key_findings must be an array" >&2; exit 4; } + jq -e '.gaps | type == "array"' "$file" >/dev/null \ + || { echo "validate-output: gaps must be an array" >&2; exit 4; } + jq -e '.citations | type == "array"' "$file" >/dev/null \ + || { echo "validate-output: citations must be an array" >&2; exit 4; } ;; analysis) - jq -e '.research_step.output.verdict | IN("supported", "refuted", "inconclusive")' "$file" >/dev/null \ - || { echo "validate-output: output.verdict must be one of supported|refuted|inconclusive" >&2; exit 4; } - jq -e '.research_step.output.confidence | type == "number" and . >= 0 and . <= 1' "$file" >/dev/null \ - || { echo "validate-output: output.confidence must be a number in [0, 1]" >&2; exit 4; } + jq -e '.verdict | IN("supported", "refuted", "inconclusive")' "$file" >/dev/null \ + || { echo "validate-output: verdict must be one of supported|refuted|inconclusive" >&2; exit 4; } + jq -e '.confidence | type == "number" and . >= 0 and . <= 1' "$file" >/dev/null \ + || { echo "validate-output: confidence must be a number in [0, 1]" >&2; exit 4; } + ;; + synthesis) + # answer is required by the schema; supporting_hypotheses / refuted_hypotheses / + # open_questions are arrays. themes / gaps / candidate_papers are optional. 
+ jq -e '.supporting_hypotheses | type == "array"' "$file" >/dev/null \ + || { echo "validate-output: supporting_hypotheses must be an array" >&2; exit 4; } + jq -e '.refuted_hypotheses | type == "array"' "$file" >/dev/null \ + || { echo "validate-output: refuted_hypotheses must be an array" >&2; exit 4; } + jq -e '.open_questions | type == "array"' "$file" >/dev/null \ + || { echo "validate-output: open_questions must be an array" >&2; exit 4; } ;; esac +md_file="$(dirname "$file")/output.md" +if grep -qE '/Users/|/home/|/private/|file://' "$md_file" 2>/dev/null; then + echo "validate-output: $md_file has absolute paths; use .asta/-relative paths" >&2 + exit 6 +fi + echo "ok" diff --git a/plugins/asta-preview/skills/research-step/templates/grounded_theory_generation.yaml b/plugins/asta-preview/skills/research-step/templates/grounded_theory_generation.yaml new file mode 100644 index 0000000..9286fc3 --- /dev/null +++ b/plugins/asta-preview/skills/research-step/templates/grounded_theory_generation.yaml @@ -0,0 +1,414 @@ +name: grounded_theory_generation +description: > + Synthesize literature-grounded and parametric cross-cutting theories + anchored on an external AutoDS run, then propose follow-up experiments. + +mission_scaffold: + goal: "Generate theories and propose concrete next-step experiments." + required_sections: + - autods_run_pointer + - datasets + - focus + +declares: + task_types: + - scope + - definitions + - auto_discovery + - literature_review + - experiment_design + - evidence_gathering + - analysis + - synthesis + - extraction_schema_design + - theorizer_extraction + - theory_generation + - grounded_theory_generation + - novelty_assessment + +bootstrap: + - id: scope + task_type: scope + title: "Scope" + inputs: [] + input_instructions: | + Frame the precise research question this run answers. List in-scope + and out-of-scope boundaries, plus success criteria the closing report + must satisfy. 
Anchor the question on the AutoDS run referenced in + mission.md. + output_instructions: | + Populate question (one sentence), boundaries[] (3-7 bullets, + explicit in/out lists), success_criteria[] (3-5 bullets — the + closing report's mission_satisfied check will reference these). + + - id: definitions + task_type: definitions + title: "Definitions" + inputs: [scope] + blocked_by: [scope] + input_instructions: | + Operationalize the domain terms used in scope.question and + scope.boundaries. Each term must be testable against data. + output_instructions: | + Populate terms[] with {name, operational_definition, rationale}. + + - id: auto_discovery + task_type: auto_discovery + title: "Import AutoDS run" + inputs: [scope, definitions] + blocked_by: [definitions] + input_instructions: | + Import the AutoDS run referenced in mission.md (run_pointer in + config). Materialize the export into `.asta/autods-run/` under the + run dir so downstream tasks (and the asta-flows web UI) can serve + its files: + .asta/autods-run/metadata.json + .asta/autods-run/mcts_nodes_all.json + .asta/autods-run/data/.csv + Copy small files (<50 MB) and prefer a relative symlink for larger + ones (`/api/artifact` follows symlinks within the run dir). + Then read mcts_nodes_all.json and emit a curated list of laws — + favor surprising / high-normalized_surprisal nodes. + output_instructions: | + Populate run_id, metadata_path, nodes_path, laws[] (each with + node_id, hypothesis, analysis (excerpt), review (excerpt), + surprising, normalized_surprisal, supporting_data[]). The JSON + path fields (`metadata_path`, `nodes_path`, and every + `supporting_data` entry) are **run-root-relative** (e.g. + `.asta/autods-run/mcts_nodes_all.json`); machine-read. Record the + upstream absolute path in `source_path` for provenance only — do + not reference it from output.md. 
In output.md, link each artifact + with a **file-relative** path: from this auto_discovery task's + output.md the link is `[label](../../autods-run/...)`. + +replan: + auto_discovery: + - create: + task_type: evidence_gathering + inputs: [scope, definitions, auto_discovery] + edges: + parent_child: epic + blocks_from: auto_discovery + config: + lane: source + input_instructions: | + Data provenance for the AutoDS run. For each input dataset in + auto_discovery's metadata.json, find the foundational paper + and a public copy of the data. Materialize downloaded data + under `.asta/autods-run/data/`. Use kind "dataset_uri" + for data and kind "paper" for foundational papers. + + synthesis: + - when: source.config.run_kind == "themes_and_gaps" + foreach_union: + - foreach: theme in source.output_json.themes + create: + task_type: literature_review + inputs: [scope, definitions, auto_discovery, source] + edges: + parent_child: epic + blocks_from: source + config: + lane: theme + theme: theme + thread_dir: ".asta/literature/threads/theme-${theme.name}" + input_instructions: | + Multi-turn paper finder for theme '${theme.name}' + (${theme.description}). Use `asta literature interactive` + with the thread_dir in config. Find papers that confirm, + complicate, or extend the auto_discovery laws under this + theme (supporting_laws: ${theme.supporting_laws}). + output_instructions: | + Populate key_findings[] (3-10 bullets), citations[] + ({id, title, url, relevance}). Leave gaps[] empty — the + theme already came from upstream. + - foreach: gap in source.output_json.gaps + create: + task_type: literature_review + inputs: [scope, definitions, auto_discovery, source] + edges: + parent_child: epic + blocks_from: source + config: + lane: gap + gap: gap + thread_dir: ".asta/literature/threads/gap-${gap.summary}" + input_instructions: | + Multi-turn paper finder for gap '${gap.summary}' + (${gap.why_open}). Use `asta literature interactive` with + a fresh thread_dir. 
Find papers that close, complicate, or + shed light on this gap (related_laws: + ${gap.related_laws}). + output_instructions: | + Populate key_findings[] and citations[]. Leave gaps[] + empty. + + - when: source.config.run_kind == "per_theme_gap_lane" and all_lane_syntheses_closed + create: + task_type: synthesis + inputs: [auto_discovery, all_lane_syntheses] + edges: + parent_child: epic + blocks_from: all_lane_syntheses + config: + run_kind: across_lanes + input_instructions: | + Roll up every per-lane synthesis (themes + gaps) into a single + cross-lane view. Reconcile contradictions between lanes, + identify auto_discovery laws confirmed by multiple lanes, and + surface the union of candidate papers that should feed the + theorizer pipeline. + output_instructions: | + Populate answer, supporting_hypotheses (bd-ids of confirmed + laws / analyses), refuted_hypotheses, open_questions, and + candidate_papers[] (paper refs or bd-ids). + + - when: source.config.run_kind == "across_lanes" + create: + task_type: extraction_schema_design + inputs: [source] + edges: + parent_child: epic + blocks_from: source + config: + phase: theorizer + input_instructions: | + Theorizer pipeline starts here. No additional literature_review + is needed — the across_lanes synthesis's candidate_papers + already constitute the theorizer corpus. + + Invoke the theorizer's Build-Extraction-Schema A2A skill: + asta generate-theories build-extraction-schema \ + --theory-query "" + (skill_id: build-extraction-schema; runs only the schema-build + step, not the full pipeline.) + output_instructions: | + Output matches `asta generate-theories card` extraction-schema + shape (validated via bash_ref). Write the schema JSON to + extraction_schema.json in the task dir; reference it from + output.json via schema_path. 
+ + - when: source.config.run_kind == "report" + noop: true + + literature_review: + - when: source.config.lane in ["theme", "gap"] + create: + task_type: experiment_design + inputs: [source, prior_synthesis] + edges: + parent_child: epic + blocks_from: source + config: + lane: source.config.lane + reproduction: true + input_instructions: | + Design a reproduction study for this ${source.config.lane} + lane. The lane's literature_review corpus is in hand. The laws + to reproduce are listed under the lane's spec (supporting_laws + for themes, related_laws for gaps). Specify a method that can + be applied to each law independently — evidence_gathering will + locate data via the lane's literature; analysis then fans out + one task per law to reproduce as many as possible. + output_instructions: | + Populate method, procedure[] (ordered steps), variables + (independent / dependent / controls), and artifacts_expected[]. + + experiment_design: + - when: source.config.lane in ["theme", "gap"] + create: + task_type: evidence_gathering + inputs: [literature_review, source] + edges: + parent_child: epic + blocks_from: source + config: + lane: source.config.lane + reproduction: true + input_instructions: | + Find the data sets required by the lane's experiment_design. + Use the lane's literature_review corpus (supplementary tables, + linked repositories, dataset DOIs) and follow citations as + needed. The AutoDS run already materialized its datasets under + `.asta/autods-run/data/` — prefer those when a law's + reproduction needs them. Mark complete when every targetable + law has at least one candidate dataset or further search is + judged unproductive. + output_instructions: | + Record every located URI in artifacts[] with + kind: "dataset_uri"; the `path` JSON field is the URL/URI + string (or, for files stored locally, a **run-root-relative** + path under `.asta/`); description carries human context. The + `log_path` JSON field is also run-root-relative. 
Markdown + links to these artifacts from output.md must be + **file-relative** (e.g. `[csv](../../autods-run/data/foo.csv)` + from this task's output.md). + + evidence_gathering: + - when: source.config.lane == "source" + create: + task_type: synthesis + inputs: [scope, definitions, auto_discovery, source] + edges: + parent_child: epic + blocks_from: source + config: + run_kind: themes_and_gaps + input_instructions: | + Cluster the auto_discovery laws into themes and surface the + gaps. Use the upstream data-provenance evidence (foundational + papers + dataset URIs) as additional context. + + - when: source.config.lane in ["theme", "gap"] + foreach: law in upstream.lane_spec.laws + create: + task_type: analysis + inputs: [source] + edges: + parent_child: epic + blocks_from: source + config: + lane: source.config.lane + target_law: law + input_instructions: | + Data reproduction step of the AutoDS run. Reproduce law + '${law.node_id}' (${law.hypothesis}) using the datasets + located by the lane's evidence_gathering. Execute the + reproduction code (fetch URIs, run pandas / sklearn / etc.); + mark inconclusive with caveat `data_unavailable` if the data + for this law was not located. Original AutoDS analysis for + context: ${law.original_analysis}. + output_instructions: | + Populate verdict, confidence (0-1), reasoning, and caveats[] + (vocabulary: data_unavailable, code_failed, …). Include + markdown links in output.md to the reproduction code, output + figures, and log files so a human can trace what ran — all + link targets must be **file-relative** from this task's + output.md: `[code](script.py)` or `[fig](fig1.png)` for files + the task wrote alongside its own output.md; + `[csv](../../autods-run/data/foo.csv)` for AutoDS-materialized + sources. Never link to absolute paths under /Users/, /home/, + /private/, or to `file://` URLs. 
+ + analysis: + - when: source.config.lane in ["theme", "gap"] and all_lane_analyses_closed + create: + task_type: synthesis + inputs: [literature_review, evidence_gathering, all_lane_analyses] + edges: + parent_child: epic + blocks_from: all_lane_analyses + config: + run_kind: per_theme_gap_lane + lane: source.config.lane + input_instructions: | + Close this ${source.config.lane} lane. Roll up the per-law + analysis verdicts: count reproduced / refuted / inconclusive. + Surface candidate papers worth re-using in the theorizer + pipeline. Note caveats (e.g. data_unavailable laws). + output_instructions: | + Populate answer, supporting_hypotheses (bd-ids of supported + analyses), refuted_hypotheses, open_questions, and + candidate_papers[] (paper refs worth re-using). + + extraction_schema_design: + - when: source.config.phase == "theorizer" + create: + task_type: theorizer_extraction + inputs: [synthesis_across_lanes, extraction_schema_design] + edges: + parent_child: epic + blocks_from: source + input_instructions: | + Extract evidence from the candidate_papers corpus using the + schema from the upstream extraction_schema_design. Pass the + schema so this step does not re-build one. + output_instructions: | + Write extraction_results.json and paper_store.json sidecars. + + theorizer_extraction: + - create: + task_type: theory_generation + inputs: [synthesis_across_lanes, extraction_schema_design, theorizer_extraction] + edges: + parent_child: epic + blocks_from: source + config: + phase: theorizer + input_instructions: | + Generate 4-8 literature-grounded theories from the upstream + extraction results. Re-use theorizer_extraction's run_id so + this step does not re-run extraction. + output_instructions: | + Write theories.json sidecar. 
+ + theory_generation: + - when: source.config.phase == "theorizer" + create: + task_type: grounded_theory_generation + inputs: [auto_discovery, source, synthesis_across_lanes] + edges: + parent_child: epic + blocks_from: source + input_instructions: | + Cross-cut the auto_discovery laws (confirmed/undisputed by + per-lane reproduction, summarized in the across_lanes + synthesis) and the literature-grounded theories from theorizer. + Use parametric memory only — do NOT call theorizer. + output_instructions: | + Output matches theorizer's Theory shape (validated via + bash_ref). Each theory carries theory_statements[], + supporting_evidence[] (cite bd-ids and auto_discovery + node_ids), negative_experiments[]. + + grounded_theory_generation: + - create: + task_type: novelty_assessment + inputs: [source] + edges: + parent_child: epic + blocks_from: source + input_instructions: | + Score each theory_statement in the upstream + grounded_theory_generation output against the retrieved + literature, across the seven theorizer dimensions. + output_instructions: | + Write a results sidecar with per_statement entries. + + novelty_assessment: + - create: + task_type: synthesis + inputs: [auto_discovery, prior_synthesis, novelty_assessment] + edges: + parent_child: epic + blocks_from: source + config: + run_kind: report + input_instructions: | + Resolve mission.md. The user-facing deliverable is `report.md` + (this task's output.md). Pull surviving cross-cutting theories + from grounded_theory_generation, novelty scores from + novelty_assessment, and reproduction verdicts from the per-lane + syntheses (summarized in the across_lanes synthesis). + + Voice and structure (style guide for output.md): + - Audience: college-educated reader who knows the domain but + not its jargon. Define terms inline on first use. + - Direct prose. No filler ("essentially", "fundamentally", + "actually", "interestingly"); no hedge adverbs ("fairly", + "quite", "rather"); no throat-clearing windups. 
+ - Open with a three-paragraph TL;DR, each labeled in bold: + **Key findings** (what we did and what we found), + **The theories** (one or two sentences per surviving theory, + terms defined as they appear), + **Next steps** (the prioritized experiments, in priority tiers). + - Inline citation convention applies (see SKILL.md). Every + quantitative claim and every theory statement gets a + markdown hyperlink to its grounding task's `output.md`. + output_instructions: | + Populate answer (mission-satisfaction narrative), + supporting_hypotheses, refuted_hypotheses, open_questions. The + output.md body follows the section order in the style guide + above and includes: surviving theories, supporting laws + + analyses, proposed experiments, and proposed NEGATIVE + experiments derived from each theory's negative_experiments[]. diff --git a/plugins/asta-preview/skills/research-step/templates/hypothesis_driven_research.yaml b/plugins/asta-preview/skills/research-step/templates/hypothesis_driven_research.yaml new file mode 100644 index 0000000..a32a7a4 --- /dev/null +++ b/plugins/asta-preview/skills/research-step/templates/hypothesis_driven_research.yaml @@ -0,0 +1,157 @@ +name: hypothesis_driven_research +description: > + Hypothesis-driven literature reconciliation without an AutoDS anchor. + Survey literature → fan out hypotheses from gaps → design / gather / + analyze each → synthesize a closing report. + +declares: + task_types: + - scope + - definitions + - literature_review + - hypothesis + - experiment_design + - evidence_gathering + - analysis + - synthesis + +bootstrap: + - id: scope + task_type: scope + title: "Scope" + inputs: [] + input_instructions: | + Frame the precise research question this run answers. List in-scope + and out-of-scope boundaries, plus success criteria the closing + report must satisfy. + output_instructions: | + Populate question, boundaries[], success_criteria[]. 
+ + - id: definitions + task_type: definitions + title: "Definitions" + inputs: [scope] + blocked_by: [scope] + input_instructions: | + Operationalize the domain terms used in scope.question. Each term + must be testable against data. + output_instructions: | + Populate terms[] with {name, operational_definition, rationale}. + + - id: literature_review + task_type: literature_review + title: "Literature review" + inputs: [scope, definitions] + blocked_by: [scope, definitions] + input_instructions: | + Survey literature framed by scope and definitions. Produce + key_findings, gaps (these seed the next fanout), and ranked + citations. + output_instructions: | + Populate key_findings[] (3-10 bullets), gaps[] (each a one-line + gap statement; each will become a hypothesis), citations[] + ({id, title, url, relevance}). + +replan: + literature_review: + - foreach: gap in source.output_json.gaps + create: + task_type: hypothesis + inputs: [scope, source] + edges: + parent_child: epic + blocks_from: source + config: + mode: literature_fanout + input_instructions: | + Convert literature_review gap '${gap}' into a falsifiable + hypothesis with a concrete prediction. + output_instructions: | + Populate statement (H_n: …), rationale, + falsifiable_prediction, expected_evidence[]. + + hypothesis: + - when: source.config.mode == "literature_fanout" + sequence: + - task_type: experiment_design + inputs: [hypothesis] + input_instructions: | + Design an experiment that would falsify the upstream + hypothesis. + output_instructions: | + Populate method, procedure[], variables, artifacts_expected[]. + - task_type: evidence_gathering + inputs: [experiment_design] + input_instructions: | + Execute the experiment_design. Capture artifacts, log, + deviations. + output_instructions: | + Populate artifacts[] (path/kind/description), log_path, + deviations[]. 
+ - task_type: analysis + inputs: [hypothesis, evidence_gathering] + input_instructions: | + Render a verdict on the hypothesis given the gathered + evidence. + output_instructions: | + Populate verdict (supported/refuted/inconclusive), + confidence (0-1), reasoning, caveats[]. + edges: + parent_child: epic + blocks: chain + + analysis: + - when: all_lit_hypotheses_have_analyses + create: + task_type: synthesis + inputs: [scope, all_lit_analyses] + edges: + parent_child: epic + blocks_from: all_lit_analyses + config: + run_kind: lit_fanout + input_instructions: | + Reconcile all closed hypothesis analyses into a single answer + to scope.question. + + Voice and structure (style guide for output.md, since this is + the closing report for this template): + - Audience: college-educated reader who knows the domain but + not its jargon. Define terms inline on first use. + - Direct prose. No filler ("essentially", "fundamentally", + "actually", "interestingly"); no hedge adverbs ("fairly", + "quite", "rather"); no throat-clearing windups. + - Open with a three-paragraph TL;DR, each labeled in bold: + **Key findings** (what we did and what we found), + **The hypotheses** (one or two sentences per supported / + refuted hypothesis), **Next steps** (open questions or + follow-ups, in priority tiers if applicable). + - Inline citation convention applies (see SKILL.md). Every + quantitative claim and every hypothesis verdict gets a + markdown hyperlink to its grounding task's `output.md`. + output_instructions: | + Populate answer, supporting_hypotheses (bd-ids of supported), + refuted_hypotheses, open_questions[]. output.md is the + closing report and follows the style guide above. + + synthesis: + - when: source.config.run_kind == "lit_fanout" + ask_user: + prompt: "Synthesis has open questions. Create new hypothesis tasks?" + options: + yes: + description: "Fan out one hypothesis task per open question." 
+ foreach: q in source.output_json.open_questions + create: + task_type: hypothesis + inputs: [scope, source] + edges: + discovered_from: source + config: + mode: literature_fanout + input_instructions: | + Convert open question '${q}' into a falsifiable + hypothesis. + no: + description: "Leave open questions for a future run." + noop: true diff --git a/plugins/asta-preview/skills/research-step/templates/iterative_theorizer.yaml b/plugins/asta-preview/skills/research-step/templates/iterative_theorizer.yaml new file mode 100644 index 0000000..ca046be --- /dev/null +++ b/plugins/asta-preview/skills/research-step/templates/iterative_theorizer.yaml @@ -0,0 +1,331 @@ +name: iterative_theorizer +description: > + Iterative, user-gated theorizer pipeline. After each step (literature + survey, extraction schema, evidence extraction, theory generation, + novelty assessment, closing report) the user picks proceed / redo / + append before plan creates the next task. Reuses the Ai2 Theorizer's + four A2A skills: build-extraction-schema, find-and-extract, + form-theory, evaluate-novelty (plus resume-extraction for append). + +declares: + task_types: + - scope + - definitions + - literature_review + - extraction_schema_design + - theorizer_extraction + - theory_generation + - novelty_assessment + - synthesis + +bootstrap: + - id: scope + task_type: scope + title: "Scope" + inputs: [] + input_instructions: | + Frame the research question the theorizer will work from. + scope.question becomes the theorizer's theory_query. + + - id: definitions + task_type: definitions + title: "Definitions" + inputs: [scope] + blocked_by: [scope] + input_instructions: | + Operationalize the domain terms in scope.question so downstream + steps share vocabulary. 
+ + - id: literature_review + task_type: literature_review + title: "Literature review" + inputs: [scope, definitions] + blocked_by: [scope, definitions] + config: + iteration: 1 + thread_dir: ".asta/literature/threads/lit-1" + input_instructions: | + Seed the theorizer's extraction corpus. citations[] become + candidate papers for find-and-extract. + +replan: + literature_review: + - ask_user: + prompt: "Literature review done. proceed / redo / append?" + options: + proceed: + description: "Move on to building the extraction schema." + create: + task_type: extraction_schema_design + inputs: [scope, definitions, source] + edges: + parent_child: epic + blocks_from: source + config: + iteration: 1 + phase: theorizer + input_instructions: | + Build an extraction schema for the theory_query = + scope.question, grounded in the upstream literature. + redo: + description: "Rerun literature review from scratch." + followup_prompt: "What should change in the redo?" + create: + task_type: literature_review + inputs: [scope, definitions] + edges: + parent_child: epic + discovered_from: source + config: + iteration: source.config.iteration + 1 + thread_dir: ".asta/literature/threads/lit-${iteration}" + input_instructions: | + Redo iteration ${iteration}. ${user_feedback}. + append: + description: "Extend the prior corpus with more papers." + followup_prompt: "What angle / topics to add?" + create: + task_type: literature_review + inputs: [scope, definitions, source] + edges: + parent_child: epic + blocks_from: source + config: + iteration: source.config.iteration + 1 + mode: append + thread_dir: ".asta/literature/threads/lit-${iteration}" + input_instructions: | + Iteration ${iteration} (append). Extend the prior + citations[]. ${user_feedback}. + + extraction_schema_design: + - ask_user: + prompt: "Extraction schema ready. proceed / redo / append?" + options: + proceed: + description: "Run find-and-extract with this schema." 
+ create: + task_type: theorizer_extraction + inputs: [scope, source, latest_literature_review] + edges: + parent_child: epic + blocks_from: source + config: + iteration: 1 + input_instructions: | + Run `asta generate-theories find-and-extract` against + this schema and the latest literature_review corpus. + redo: + description: "Rebuild the schema from scratch." + followup_prompt: "What should change?" + create: + task_type: extraction_schema_design + inputs: [scope, definitions, latest_literature_review] + edges: + parent_child: epic + discovered_from: source + config: + iteration: source.config.iteration + 1 + phase: theorizer + input_instructions: | + Redo iteration ${iteration}. ${user_feedback}. + append: + description: "Extend the schema with more fields." + followup_prompt: "What fields to add?" + create: + task_type: extraction_schema_design + inputs: [scope, definitions, source, latest_literature_review] + edges: + parent_child: epic + blocks_from: source + config: + iteration: source.config.iteration + 1 + mode: append + phase: theorizer + input_instructions: | + Iteration ${iteration} (append). Extend the prior + schema. ${user_feedback}. + + theorizer_extraction: + - ask_user: + prompt: "Extraction complete. proceed / redo / append?" + options: + proceed: + description: "Move to theory_generation." + create: + task_type: theory_generation + inputs: [scope, latest_extraction_schema_design, source] + edges: + parent_child: epic + blocks_from: source + config: + iteration: 1 + phase: theorizer + input_instructions: | + Run `asta generate-theories form-theory` on the + upstream extraction results. + redo: + description: "Rerun extraction from scratch." + followup_prompt: "What should change?" 
+ create: + task_type: theorizer_extraction + inputs: [scope, latest_extraction_schema_design, latest_literature_review] + edges: + parent_child: epic + discovered_from: source + config: + iteration: source.config.iteration + 1 + input_instructions: | + Redo iteration ${iteration}. ${user_feedback}. + append: + description: "Add more papers via resume-extraction." + followup_prompt: "What papers / topics to add?" + create: + task_type: theorizer_extraction + inputs: [scope, latest_extraction_schema_design, source] + edges: + parent_child: epic + blocks_from: source + config: + iteration: source.config.iteration + 1 + mode: append + resume_from: source + input_instructions: | + Iteration ${iteration} (append) via `asta + generate-theories resume-extraction`. ${user_feedback}. + + theory_generation: + - ask_user: + prompt: "Theories generated. proceed / redo / append?" + options: + proceed: + description: "Move to novelty_assessment." + create: + task_type: novelty_assessment + inputs: [source] + edges: + parent_child: epic + blocks_from: source + config: + iteration: 1 + input_instructions: | + Run `asta generate-theories evaluate-novelty` on the + generated theories. + redo: + description: "Regenerate theories from scratch." + followup_prompt: "What should change?" + create: + task_type: theory_generation + inputs: [scope, latest_extraction_schema_design, latest_theorizer_extraction] + edges: + parent_child: epic + discovered_from: source + config: + iteration: source.config.iteration + 1 + phase: theorizer + input_instructions: | + Redo iteration ${iteration}. ${user_feedback}. + append: + description: "Generate additional theories beyond these." + followup_prompt: "What angle for the additions?" 
+ create: + task_type: theory_generation + inputs: [scope, latest_extraction_schema_design, latest_theorizer_extraction, source] + edges: + parent_child: epic + blocks_from: source + config: + iteration: source.config.iteration + 1 + mode: append + phase: theorizer + input_instructions: | + Iteration ${iteration} (append). Generate additional + theories beyond the prior set. ${user_feedback}. + + novelty_assessment: + - ask_user: + prompt: "Novelty evaluated. proceed / redo / append?" + options: + proceed: + description: "Move to the closing report." + create: + task_type: synthesis + inputs: [scope, latest_literature_review, latest_theorizer_extraction, latest_theory_generation, source] + edges: + parent_child: epic + blocks_from: source + config: + iteration: 1 + run_kind: report + input_instructions: | + Write the closing report.md following the SKILL.md + voice (direct prose, college-educated reader, TL;DR + with Key findings / The theories / Next steps). Cite + grounding tasks inline. + redo: + description: "Re-evaluate novelty from scratch." + followup_prompt: "What should change?" + create: + task_type: novelty_assessment + inputs: [latest_theory_generation] + edges: + parent_child: epic + discovered_from: source + config: + iteration: source.config.iteration + 1 + input_instructions: | + Redo iteration ${iteration}. ${user_feedback}. + append: + description: "Evaluate additional theories." + followup_prompt: "Which additional theories?" + create: + task_type: novelty_assessment + inputs: [latest_theory_generation, source] + edges: + parent_child: epic + blocks_from: source + config: + iteration: source.config.iteration + 1 + mode: append + input_instructions: | + Iteration ${iteration} (append). Evaluate the + additional theories. ${user_feedback}. + + synthesis: + - when: source.config.run_kind == "report" + ask_user: + prompt: "Report drafted. proceed / redo / append?" + options: + proceed: + description: "Done — close the run." 
+ noop: true + redo: + description: "Rewrite the report from scratch." + followup_prompt: "What should change?" + create: + task_type: synthesis + inputs: [scope, latest_literature_review, latest_theorizer_extraction, latest_theory_generation, latest_novelty_assessment] + edges: + parent_child: epic + discovered_from: source + config: + iteration: source.config.iteration + 1 + run_kind: report + input_instructions: | + Redo iteration ${iteration}. ${user_feedback}. + append: + description: "Extend the existing report with more sections." + followup_prompt: "What to add?" + create: + task_type: synthesis + inputs: [source, latest_novelty_assessment] + edges: + parent_child: epic + blocks_from: source + config: + iteration: source.config.iteration + 1 + run_kind: report + mode: append + input_instructions: | + Iteration ${iteration} (append). Extend the prior + report. ${user_feedback}. diff --git a/plugins/asta-preview/skills/research-step/workflows/brainstorm.md b/plugins/asta-preview/skills/research-step/workflows/brainstorm.md index 884f48f..25569a0 100644 --- a/plugins/asta-preview/skills/research-step/workflows/brainstorm.md +++ b/plugins/asta-preview/skills/research-step/workflows/brainstorm.md @@ -1,8 +1,8 @@ # Workflow: brainstorm -Default workflow when the user opens the skill without naming a specific action. Conversational; reads beads state and `mission.md` to answer questions, surface what's next, and refine the research direction. +Default workflow when the user opens the skill without naming a specific action. Conversational; reads beads state and `mission.md` to answer questions, surface what's next, and refine the research direction. Also performs **template selection** once a `mission.md` exists but no epic does. -Read-mostly. The only file brainstorm writes directly is `mission.md`, and only after explicit user confirmation. It does invoke **update-summary** at the start (which rewrites `summary.md` only when stale), but it never mutates beads. 
When the user is ready to act, it hands off to `init`, `plan`, `execute`, or `update-summary`. +Read-mostly. The only files brainstorm writes directly are `mission.md` (only after explicit user confirmation) and the `template:` field of its frontmatter (after template selection). It does invoke **update-summary** at the start (which rewrites `summary.md` only when stale), but it never mutates beads. When the user is ready to act, it hands off to `init`, `plan`, `execute`, or `update-summary`. ## Preconditions @@ -15,6 +15,7 @@ None. Brainstorm runs from any state. Compute these signals once at the start so the conversation can branch sensibly: - **`has_mission`** — `mission.md` exists and is non-empty. +- **`has_template`** — `mission.md` exists and its frontmatter has a non-empty `template:` field. - **`has_bd`** — `command -v bd` succeeds and `.beads/` exists. - **`has_epic`** — `has_bd` and `scripts/epic-root.sh` prints `status: found` (the `id:` line gives the epic ID for follow-up queries). @@ -25,12 +26,36 @@ If `has_epic`, hand off to **update-summary** before anything else so `summary.m Pick the branch that matches; do not run more than one. - **No `mission.md`** → help the user draft one. - Engage in a short Socratic exchange. Useful prompts: the research question, why it matters, what success looks like, what's already known, what's explicitly out of scope. When you have enough, propose a draft, get confirmation, and write `mission.md`. Then offer to run **init**. + Engage in a short Socratic exchange. Useful prompts: the research question, why it matters, what success looks like, what's already known, what's explicitly out of scope. When you have enough, propose a draft, get confirmation, and write `mission.md` (without the `template:` field yet). Then proceed to step 2.5 for template selection. -- **`mission.md` exists, no epic** → recap the mission, check whether the user wants to refine it, then offer to run **init** to bootstrap the research session. 
+- **`mission.md` exists, no template chosen (`has_mission && !has_template`)** → run step 2.5 (template selection). + +- **`mission.md` and template both set, no epic** → recap the mission, confirm the template choice is still right (offer to change it), then offer to run **init** to bootstrap the research session. - **Active session (`has_epic`)** → answer the user's question, or if they didn't ask one, give a short status report (closed / in-progress / ready counts plus the single most-relevant ready task) and ask what they want to do next. +### 2.5. Template selection + +Trigger condition: `has_mission && !has_template && !has_epic`. + +Ask via `AskUserQuestion`: +> "Which plan template should drive this session?" +> options: +> - `hypothesis_driven_research` — survey literature → fan out hypotheses from gaps → design / gather / analyze → synthesize a closing report. +> - `grounded_theory_generation` — drive from an AutoDS run: extract themes & gaps → per-lane literature search + reproduction → cross-cutting parametric theory generation → novelty assessment → final report. +> - `iterative_theorizer` — user-gated theorizer pipeline: literature_review → build-extraction-schema → find-and-extract → form-theory → evaluate-novelty → closing report, with a proceed / redo / append checkpoint after each step. + +After the user picks, update `mission.md`'s frontmatter: + +```yaml +--- +template: +--- + +``` + +If the user picks `grounded_theory_generation`, also probe for the AutoDS run pointer: ask whether `mission.md` already references the run path / run ID, and if not, ask for one. Append a `Datasets` / `AutoDS run` section to `mission.md` if needed. + ### 3. Answer questions, preferring `summary.md` `summary.md` is the synthesized view of the session — mission, scope, definitions, related work, hypotheses, results, open questions, and status. It was just regenerated by the `update-summary` hand-off in step 1, so it is current. 
@@ -41,13 +66,14 @@ Pick the branch that matches; do not run more than one. | Need | Query | |---|--------------------------------------------------------------------------------------------------------| -| Single issue's full `metadata.research_step.output` | `bd show --json` | -| Full open-issue metadata (rare; usually the digest covers it) | `bd list` | -| Dependency structure | `bd dep tree --direction up`| -| Long-form notes from an evidence_gathering task | follow `metadata.research_step.output.summary_path` referenced from the digest | -| Exact `verdict` / `confidence` for a hypothesis | `bd show --json` (digest reports the verdict, not the confidence number) | - -Rule of thumb: if you can answer from `summary.md`, do. If the user asks for a specific number, file path, or verbatim output that the digest abstracts, then fetch it from `bd`. +| Single issue's full structured output | `cat .asta/tasks//output.json` | +| Single issue's full metadata envelope | `bd show --json` | +| Full open-issue metadata (rare; usually the digest covers it) | `bd list --json` | +| Dependency structure | `bd dep tree --direction up` | +| Long-form notes from a task | open `.asta/tasks//output.md` | +| Exact verdict / confidence for a closed analysis | `cat .asta/tasks//output.json` (digest reports the verdict, not the confidence number) | + +Rule of thumb: if you can answer from `summary.md`, do. If the user asks for a specific number, file path, or verbatim output that the digest abstracts, then fetch from disk or `bd`. ### 4. Offer to update `mission.md` diff --git a/plugins/asta-preview/skills/research-step/workflows/execute.md b/plugins/asta-preview/skills/research-step/workflows/execute.md index 5fba9ea..839a588 100644 --- a/plugins/asta-preview/skills/research-step/workflows/execute.md +++ b/plugins/asta-preview/skills/research-step/workflows/execute.md @@ -1,6 +1,6 @@ # Workflow: execute -Run one ready task end-to-end. 
Loads its schema, gathers its declared inputs, produces a structured output, validates it, and closes the issue. After closing, hands off to **plan** if the closed task type unlocks new graph structure; otherwise hands off to **update-summary**. +Run one ready task end-to-end. Materialize its `input.md` / `input.json` from the issue's metadata and upstream task outputs, invoke the agent to produce `output.md` / `output.json`, validate the result, persist pointers in beads, and close the issue. After closing, hand off to **plan** if the closed task type unlocks new structure; otherwise hand off to **update-summary**. ## Preconditions @@ -9,32 +9,66 @@ Run one ready task end-to-end. Loads its schema, gathers its declared inputs, pr ## Steps -1. **Pick a task.** If a task ID was supplied, use it. Else `bd ready --json` and pick the oldest issue (tiebreak by `bd-id` ascending). Hypothesis tasks are normally auto-resolved at creation by **plan**, so they should not appear here. If one does, it means the gap text was too thin for plan to fill the output without inventing content — flag this to the user and ask whether to refine the source `literature_review` first. +1. **Pick a task.** If a task ID was supplied, use it. Else `bd ready --json` and pick the oldest issue (tiebreak by `bd-id` ascending). + 2. **Claim it.** `bd update --status=in_progress`. -3. **Load the schema.** Read the task type with `bd show --json | jq -r '.[0].metadata.research_step.task_type'`. Open `assets/schemas.yaml` and find the matching entry under `task_types`. -4. **Gather inputs.** For every issue listed in this issue's `inputs` (`bd show --json | jq '.[0].metadata.research_step.inputs'`), read its output with `bd show --json | jq '.[0].metadata.research_step.output'`. Also load `mission.md` and any files referenced from input outputs via `_path` fields (e.g., `summary_path` from a `literature_review`). **This is the only context to use** — do not pull in unrelated repo state. -5. 
**Do the work.** Produce a JSON object matching the schema. For schema fields ending in `_path`, write the file to disk first and put the relative path in the JSON. -6. **Validate structurally.** Run `scripts/validate-output.sh `. It checks the envelope (`research_step.task_type`, `inputs`, `output_schema_version`, `output`) and every required `output.` for the task_type, plus type spot-checks for the high-leverage cases (e.g., `analysis.verdict` enum, `analysis.confidence` range). Exit 0 ⇒ valid. Any non-zero exit ⇒ fail loudly and **leave the issue `in_progress`** for retry. Do not close. -7. **Persist the output.** Materialize the metadata JSON via `scripts/write-meta.sh` (reads JSON from stdin, prints a temp file path), then `bd update --metadata @`. Preserve the existing `task_type`, `inputs`, and `output_schema_version`. + +3. **Read the issue's metadata.** + ``` + bd show --json | jq '.[0].metadata.research_step' + ``` + Extract `task_type`, `inputs[]`, `input_instructions`, `output_instructions`, `config`. + +4. **Read the prepared inputs.** Plan has already written `.asta/tasks//input.md` (a short task brief with inline citations to upstream sources) and `.asta/tasks//input.json` (the structured upstream output fragments plus this task's config). Their paths are in `metadata.research_step.input_md` and `metadata.research_step.input_json`. The agent reads: + - `input.md` for the human-readable brief. + - `input.json` for the structured upstream data. + - `metadata.research_step.input_instructions` for the full task prompt. + - `metadata.research_step.output_instructions` for what the output must contain. + + If `input.md` or `input.json` is missing (e.g., on a task created by an older plan run), fall back to gathering upstream context inline: read each upstream task's `output.json` and the first few paragraphs of its `output.md`, and proceed. + +5. **Do the work.** The agent produces: + - `.asta/tasks//output.md` — the narrative artifact. 
+ - `.asta/tasks//output.json` — the structured output (matches the task type's schema in `assets/schemas.yaml`). + - Any task-type-specific sidecar files referenced from `output.json` (e.g., `extraction_schema.json`, `theories.json`). + + If the task type has an `execute_ref` field in `assets/schemas.yaml`, that's the asta CLI subcommand to invoke. Use it directly; do not fall back to `send-message` without setting the right `skill_id` (that routes to the agent's default task, which is usually the wrong granularity). + +6. **Validate structurally.** Run: + ``` + scripts/validate-output.sh .asta/tasks//output.json + ``` + Exit 0 ⇒ valid; any non-zero ⇒ fail loudly and **leave the issue `in_progress`** for retry. Do not close. + +7. **Persist output pointers.** Build a metadata JSON adding `output_md` and `output_json` fields and `bd update --metadata @`. Preserve all existing fields. + 8. **Close.** `bd close `. -9. **Hand off to plan or update-summary.** Some closed task types unlock new graph structure; others don't. Decide based on the closed task's `task_type`: - | Closed task_type | Hand off to | - |---|---| - | `literature_review`, `hypothesis`, `analysis`, `synthesis` | **plan** (with this issue as the source). `plan` then chains to **update-summary**. Note: `hypothesis` only reaches this branch in the rare case it was left open at creation; the normal path is plan→auto-resolve. | - | `scope`, `definitions`, `experiment_design`, `evidence_gathering` | **update-summary** directly. | +9. **Hand off to plan or update-summary.** Decide based on the closed task's `task_type`: - Either path ends with `summary.md` rebuilt. + | Closed task_type | Hand off to | + |---|---| + | `literature_review`, `hypothesis`, `analysis`, `synthesis` | **plan** (with this issue as the source). `plan` then chains to **update-summary**. 
| + | `auto_discovery`, `extraction_schema_design`, `theorizer_extraction`, `theory_generation`, `grounded_theory_generation`, `novelty_assessment` | **plan** (with this issue as the source). | + | `scope`, `definitions`, `experiment_design`, `evidence_gathering` | **update-summary** directly. | -## Notes on output files + Either path ends with `summary.md` rebuilt. + +## Special case: report-kind synthesis -Schema fields ending in `_path` are relative paths. Conventions: +If the closed task is `synthesis(config.run_kind=report)`, additionally create a project-root `report.md` symlink (or copy on platforms without symlink support) pointing at this task's `output.md`. This surfaces the closing deliverable at the conventional location for the user. -- `summary_path` (from `literature_review`) → `background_knowledge.txt` by convention, but any path works. -- `log_path` (from `evidence_gathering`) → typically under `logs/`. -- `report_path` (from `synthesis`) → typically `report.md`. +``` +ln -sf .asta/tasks//output.md report.md +``` + +Then proceed to **update-summary** as usual (no further plan replan — `synthesis(run_kind=report)` is the terminal node). + +## Notes on output files -Write the file before setting the output JSON. If the executor crashes between writing the file and closing the issue, the file is harmless orphan data — re-running `execute` on the same issue will overwrite it. +- `.asta/tasks//output.json` is the canonical structured output; `validate-output.sh` runs against it. +- `.asta/tasks//output.md` is the human narrative; it may include markdown hyperlinks to code, figures, log files. These are not auto-indexed in this version of the spec — the human is expected to follow them by hand. +- Sidecar JSON files (`extraction_schema.json`, `theories.json`, `paper_store.json`, `novelty_results.json`) are referenced from `output.json` via `_path` fields. 
The sidecar contents are themselves loosely typed (their shape is governed by the upstream CLI, not by `validate-output.sh`). ## Out of scope for this workflow diff --git a/plugins/asta-preview/skills/research-step/workflows/init.md b/plugins/asta-preview/skills/research-step/workflows/init.md index fd11be3..d461f40 100644 --- a/plugins/asta-preview/skills/research-step/workflows/init.md +++ b/plugins/asta-preview/skills/research-step/workflows/init.md @@ -1,12 +1,12 @@ # Workflow: init -Bootstrap the environment for a research session: install `bd` and `jq`, run `bd init`, wire beads to the project's git remote for cross-machine sync, and verify the staleness check works. This is the only workflow that may install or configure tools; `plan`, `update-summary`, and `execute` assume the environment is ready. +Bootstrap the environment for a research session: install `bd` and `jq`, run `bd init`, wire beads to the project's git remote for cross-machine sync, create the `.asta/` skeleton, and verify the staleness check works. This is the only workflow that may install or configure tools; `plan`, `update-summary`, and `execute` assume the environment is ready. After environment setup, hand off to **plan** to bootstrap the mission epic and initial frontier. ## Preconditions -None. `init` is idempotent — installing already-installed tools is a no-op, and `bd init` is safe to skip if `.beads/` already exists. +None. `init` is idempotent — installing already-installed tools is a no-op, `bd init` is safe to skip if `.beads/` already exists, and the `.asta/` `mkdir` calls use the `-p` form. ## Backend choice @@ -21,30 +21,38 @@ Server mode (`bd init --server`) is out of scope: it requires running a Dolt sql - Otherwise, install per the beads project's documented method for the current platform. Consult https://github.com/gastownhall/beads (cross-platform options at time of writing include `go install`, Homebrew, `winget`, the project's install script, and the Nix flake). 
If you are uncertain which method applies, fetch the install docs at run time rather than guessing. - Verify with `bd --version`. If it still fails, abort and ask the user to install manually. -2. **Initialize beads (embedded Dolt) and wire it to git.** +2. **Ensure `jq` is on PATH.** + - Run `command -v jq`. If present, skip. + - Otherwise install (`brew install jq`, `apt-get install jq`, etc.) and retry. + +3. **Initialize beads (embedded Dolt) and wire it to git.** - If `.beads/` does not exist, run `bd init`. - If the working directory has a git remote `origin`, configure beads' Dolt sync against it so `bd dolt push`/`pull` work without further setup. Probe with `bd dolt remote list`; if nothing is configured, add a remote pointing at `origin` per beads docs. - If no git remote exists, skip the Dolt-remote step and tell the user that cross-machine transfer will need a remote added later. Do not block — the single-machine flow works without it. - Verify with `bd list` (should succeed and return an empty list or existing issues). -3. **Fresh-clone recovery.** If `bd init` produced an empty DB but `.beads/issues.jsonl` is present (the project had upstream beads state and was freshly cloned), **do not silently `bd import`**. +4. **Fresh-clone recovery.** If `bd init` produced an empty DB but `.beads/issues.jsonl` is present (the project had upstream beads state and was freshly cloned), **do not silently `bd import`**. - Run `git ls-remote origin 'refs/dolt/*'`. If any Dolt refs exist, run `bd dolt pull` — this is the canonical recovery path and preserves Dolt history. - If no Dolt refs exist on the remote, surface the situation to the user with three options: (a) `bd import .beads/issues.jsonl` (fast, but discards Dolt history and any state newer than the export), (b) configure a Dolt remote and `bd dolt push` from another machine that has the live DB, then retry, (c) abort. - Pick one path only after explicit user confirmation. Never auto-import. -4. 
**Verify the staleness check works.** +5. **Create the `.asta/` skeleton.** + - `mkdir -p .asta/tasks` (per-task working directories live here). + - `mkdir -p .asta/literature/threads` (multi-turn paper-finder thread state). + - These are no-ops if the dirs exist. + +6. **Verify the staleness check works.** - Run `scripts/summary-check.sh`. It hashes the sorted IDs of currently-open issues and compares against `summary.md`'s frontmatter. Backend-agnostic — beads can use whichever storage it likes. - - Requires `jq` on PATH; if missing, install it (`brew install jq`, `apt-get install jq`, etc.) and retry. - At init time `summary.md` does not yet exist, so the script will print `status: missing` and exit 1 — that's fine; **update-summary** will create the file later. `status: no-tools` (exit 3) means abort and ask the user. -5. **Hand off to plan.** Per the router's chaining rule, run the **plan** workflow next. It will detect that no epic exists yet and bootstrap one from `mission.md`. If `mission.md` is missing, **plan** will route the user back to **brainstorm**. +7. **Hand off to plan.** Per the router's chaining rule, run the **plan** workflow next. It will detect that no epic exists yet and bootstrap one from `mission.md`'s `template:` hint (set by brainstorm) plus the body. If `mission.md` is missing, **plan** will route the user back to **brainstorm**. ## Cross-machine transfer To move a session to another machine: 1. On machine A: `bd dolt push` (sends the Dolt data as a git ref to `origin`). -2. On machine B: clone the repo and run this `init` workflow. Step 3's fresh-clone recovery will see the Dolt refs on `origin` and `bd dolt pull` them automatically. +2. On machine B: clone the repo and run this `init` workflow. Step 4's fresh-clone recovery will see the Dolt refs on `origin` and `bd dolt pull` them automatically. 
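The fresh-clone recovery decision that machine B relies on can be sketched as a small shell helper. This is only an illustration under stated assumptions: `choose_recovery` is a hypothetical name, not one of the skill's `scripts/`, and it assumes the presence of any `refs/dolt/` line in `git ls-remote` output is a sufficient signal that Dolt data exists on the remote.

```shell
#!/bin/sh
# Hypothetical helper (not part of the skill's scripts/): decide which
# fresh-clone recovery path applies, given `git ls-remote` output on stdin.
choose_recovery() {
  # Assumption: any ref under refs/dolt/ means the remote carries Dolt data.
  if grep -q 'refs/dolt/'; then
    echo "pull"      # canonical path: run `bd dolt pull`
  else
    echo "ask-user"  # never auto-import; surface the options to the user
  fi
}

# Usage (assumes a git remote named origin, as the workflow does):
#   git ls-remote origin 'refs/dolt/*' | choose_recovery
```

Keeping the decision separate from the `bd` calls means the "never auto-import" rule lives in one place: the helper only ever answers `pull` or `ask-user`, and the import path stays behind explicit user confirmation.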
Two machines writing at the same time is not supported in embedded mode; coordinate manually (push before handing off, pull before resuming). @@ -54,3 +62,4 @@ Two machines writing at the same time is not supported in embedded mode; coordin - Writing `summary.md`. That belongs to **update-summary** (chained automatically after `plan`). - Re-running setup once a session is initialized. If `bd` or `jq` breaks later, fix it manually rather than re-running `init`. - Server-mode beads (`bd init --server`) and any setup requiring a running Dolt sql-server. +- Template selection (lives in **brainstorm**). diff --git a/plugins/asta-preview/skills/research-step/workflows/plan.md b/plugins/asta-preview/skills/research-step/workflows/plan.md index c5ffb2d..4df6b87 100644 --- a/plugins/asta-preview/skills/research-step/workflows/plan.md +++ b/plugins/asta-preview/skills/research-step/workflows/plan.md @@ -1,91 +1,179 @@ # Workflow: plan -Create or extend the research graph. The single home for "design the next set of typed tasks." Two modes, selected from state: +Create or extend the research graph by reading the epic's chosen YAML template and walking it. **You are the walker.** Read the template directly (via the Read tool), evaluate its constructs as documented below, and issue `bd` commands inline. No separate walker script. -- **bootstrap** — no epic exists yet. Create the mission epic and the initial frontier (scope, definitions, literature_review) from `mission.md`. -- **replan** — an epic exists. Add downstream tasks based on a recently-closed task's output, or on user direction. +Two modes, selected from state: + +- **bootstrap** — no epic exists yet. Create the mission epic and the initial frontier from `template.bootstrap[]`. +- **replan** — an epic exists. Add downstream tasks per `template.replan[][]`, where `` is the type of the most-recently-closed task. Always chains to **update-summary** afterward so `summary.md` reflects the new graph. 
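As a rough sketch of how the bootstrap / replan choice could be derived from state, the helper below maps the `status:` line printed by `scripts/epic-root.sh` to a mode name. `select_mode` is a hypothetical helper introduced for illustration; the exact output layout of `epic-root.sh` beyond the `status:` and `id:` lines is an assumption.

```shell
#!/bin/sh
# Hypothetical helper (not part of the skill's scripts/): read the output of
# scripts/epic-root.sh on stdin and print the plan mode it implies.
select_mode() {
  # Assumption: the script prints a line of the form "status: <value>".
  status=$(sed -n 's/^status: *//p' | head -n 1)
  case "$status" in
    none)  echo "bootstrap" ;;          # no epic yet: create one
    found) echo "replan" ;;             # epic exists: extend the graph
    *)     echo "unknown"; return 1 ;;  # anything else: stop and ask
  esac
}

# Usage:
#   scripts/epic-root.sh | select_mode
```

An unrecognized status returns non-zero rather than guessing a mode, which matches the workflow's preference for failing loudly over proceeding on ambiguous state.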
## Preconditions - `bd` is installed and `.beads/` is initialized. If not, run **init** first. -- For **bootstrap**: `mission.md` exists and is non-empty, and `scripts/epic-root.sh` reports `status: none` (no epic yet). If `mission.md` is missing, abort and route the user to **brainstorm** to draft one. -- For **replan**: `scripts/epic-root.sh` reports `status: found` (an epic exists). If a specific source task was supplied (typically by `execute` chaining into this workflow), it is closed and has a populated `metadata.research_step.output`. +- `mission.md` exists and its frontmatter has a `template:` field naming a valid template. If not, route the user back to **brainstorm**. +- For **bootstrap**: `scripts/epic-root.sh` reports `status: none` (no epic yet). +- For **replan**: `scripts/epic-root.sh` reports `status: found` (an epic exists). If a specific source task was supplied (typical when `execute` chains in), it is `closed` and has populated `.asta/tasks//output.json`. -## Issue metadata convention +## Issue metadata convention (v2) -Every task issue carries: +Every task issue carries (in `bd update --metadata` JSON): ```json { "research_step": { - "task_type": "", + "task_type": "", "inputs": ["bd-xxxx", "bd-yyyy"], - "output_schema_version": 1, - "output": null + "output_schema_version": 2, + "input_instructions": "", + "output_instructions": "", + "config": { "": "..." } } } ``` -The mission epic additionally carries `epic_root: true`. +The mission epic additionally carries `epic_root: true` and `template: `. + +After **execute** has run a task and closed it, the metadata gains `input_md`, `input_json`, `output_md`, `output_json` pointers (set by execute, not by plan). ## Mode selection 1. Run `scripts/epic-root.sh`. `status: none` → **bootstrap**. -2. `status: found` (epic ID on the `id:` line) → **replan**. If the caller named a specific closed task (typical when `execute` chains here), use it as the source. 
Else, ask the user which closed task to plan around or which subgraph to extend, then proceed. +2. `status: found` (epic ID on the `id:` line) → **replan**. + - If the caller named a specific closed task (typical when `execute` chains here), use it as the source. + - Otherwise, ask the user which closed task to plan around or which subgraph to extend, then proceed. ## Bootstrap mode -1. **Verify mission.** Read `mission.md`. If missing or empty, abort and suggest **brainstorm**. -2. **Create the epic.** +1. **Read the template name** from `mission.md` frontmatter: + ``` + template=$(awk '/^---$/{n++; next} n==1 && /^template:/ {print $2}' mission.md) + ``` + Open `templates/