Draft
3 changes: 3 additions & 0 deletions .gitignore
Original file line number Diff line number Diff line change
@@ -18,4 +18,7 @@ skills-lock.json
.idea/
*.iml

# macOS
.DS_Store

.asta
8 changes: 4 additions & 4 deletions plugins/asta-preview/skills/research-step/SKILL.md
@@ -1,12 +1,12 @@
---
name: research-step
description: Plan and execute autonomous research as a graph of typed tasks tracked in beads. Use when working from a mission.md to drive multi-step research with explicit dependencies and structured outputs.
allowed-tools: Bash(bd:*) Bash(date:*) Bash(scripts/*) Read(assets/**) Read(workflows/**) Read(scripts/**) Skill(asta:*) Skill(asta-preview:*) Skill(asta-plugins:*)
allowed-tools: Bash(bd:*) Bash(date:*) Bash(scripts/*) Bash(asta:*) Read(assets/**) Read(workflows/**) Read(scripts/**) Skill(asta:*) Skill(asta-preview:*) Skill(asta-plugins:*)
---

# Research Step

Models a research session as a beads epic. Each unit of work is a typed sub-issue whose `metadata.research_step.output` matches a JSON schema in `assets/schemas.yaml`.
Models a research session as a beads epic. A session runs a **flow** — the composed `data_and_literature_grounded_theory_generation` (which begins with `data_provenance`), its sub-flows `reproduction` and `theorizer`, the standalone `hypothesis_driven_research` flow (literature → falsifiable hypotheses → one prespecified test per hypothesis), the standalone `auto_discovery` flow (source a cohort and run a fresh discovery; run it as its own session in a **separate workspace** — own `mission.md` and `.beads` — typically kicked off after a theory-generation run; a second epic root in the same workspace breaks `scripts/epic-root.sh`), or a custom chain (each flow's purpose is in its `mission` field in `assets/schemas.yaml`). `assets/schemas.yaml` defines the reusable `types` (immutable records — verdicts are `adjudication` records referencing their subject), the `tasks` (pure output contracts mapping each output key to its type), and the `flows` (each step carrying its `mission`, its `input` steps, and its asta `chain`). Each unit of work is a typed sub-issue whose `metadata.research_step.output_json` matches its task's output in the schema; the issue envelope carries `flow` and `task_type`.

This skill is a **router**. Inspect the working directory and the user's request, pick one workflow, then read its `.md` file in `workflows/` and follow it. Do not execute a workflow from memory — always open the file first.

@@ -23,7 +23,7 @@ Installing `bd` and `jq`, running `bd init`, and verifying `scripts/summary-chec
| `mission.md` | Input. The research task. |
| `.beads/` | Source of truth for state. |
| `summary.md` | Derived view of the session, regenerated by **update-summary**. Beads is the source of truth; this file is just a digest for humans and for **brainstorm**. Frontmatter `beads_snapshot` records the state it was rendered from. |
| `background_knowledge.txt` | Optional. Long-form context referenced from issue metadata via `summary_path`. |
| `.asta/<agent>/<slug>/` | Heavy artifacts (raw agent JSON, datasets, reports), referenced from `output_json` by repo-root-relative `_path` fields. |

## Workflows

@@ -51,7 +51,7 @@ If the user did not name a workflow, run **brainstorm**. It inspects the working

- **init** → always run **plan** afterwards (which then chains to **update-summary**).
- **plan** → always run **update-summary** afterwards so the digest reflects the new graph.
- **execute** → if the closed task type is `literature_review`, `hypothesis`, `analysis`, or `synthesis`, chain to **plan** (which chains to **update-summary**); otherwise chain directly to **update-summary**.
- **execute** → chain to **plan** when the closed task type unlocks new structure for its flow (see the hand-off rule in `execute.md`, last step); otherwise chain directly to **update-summary**.
- **update-summary** and **brainstorm** → never chain.
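The chaining rules above can be read as a small lookup. The following is a minimal sketch, not the skill's implementation: `chain_after` and its `unlocks_new_structure` flag are hypothetical names, and the flag stands in for the hand-off rule in `execute.md`, which this diff does not show.

```python
def chain_after(workflow: str, unlocks_new_structure: bool = False) -> list[str]:
    """Return the workflows to run, in order, after `workflow` finishes."""
    if workflow == "init":
        # init always runs plan, which in turn chains to update-summary.
        return ["plan", "update-summary"]
    if workflow == "plan":
        return ["update-summary"]
    if workflow == "execute":
        # Assumption: the caller has already applied the hand-off rule from
        # execute.md and passes the result in as a boolean.
        if unlocks_new_structure:
            return ["plan", "update-summary"]
        return ["update-summary"]
    if workflow in ("update-summary", "brainstorm"):
        # These two never chain.
        return []
    raise ValueError(f"unknown workflow: {workflow}")
```

For example, `chain_after("execute", unlocks_new_structure=True)` yields `["plan", "update-summary"]`, while `chain_after("brainstorm")` yields an empty list.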

## Boundaries
718 changes: 638 additions & 80 deletions plugins/asta-preview/skills/research-step/assets/schemas.yaml

Large diffs are not rendered by default.

53 changes: 53 additions & 0 deletions plugins/asta-preview/skills/research-step/scripts/close-task.sh
@@ -0,0 +1,53 @@
#!/usr/bin/env bash
# close-task.sh <issue-id> <output-json> <output-markdown>
# Publish a task's output and finish it: write output_json + output_markdown into the issue
# metadata, validate output_json against the schema, close the issue, assert it closed, then
# close any ancestor group whose last child just closed.
set -euo pipefail
here="$(cd "$(dirname "${BASH_SOURCE[0]}")" && pwd)"

[[ $# -eq 3 ]] || { echo "usage: close-task.sh <issue-id> <output-json> <output-markdown>" >&2; exit 1; }
id="$1"; oj="$2"; om="$3"
[[ -f "$oj" ]] || { echo "close-task: no output-json $oj" >&2; exit 1; }
[[ -f "$om" ]] || { echo "close-task: no output-markdown $om" >&2; exit 1; }
jq -e . "$oj" >/dev/null 2>&1 || { echo "close-task: $oj is not valid JSON" >&2; exit 1; }

# 1. publish: merge output_json + output_markdown into the existing research_step metadata
cur="$(bd show "$id" --json | jq -c '.[0].metadata')"
merged="$(jq -c --slurpfile oj "$oj" --rawfile om "$om" \
  '.research_step.output_json = $oj[0] | .research_step.output_markdown = $om' <<<"$cur")"
tmp="$(mktemp)"; trap 'rm -f "$tmp"' EXIT
printf '%s' "$merged" > "$tmp"
bd update "$id" --metadata @"$tmp" >/dev/null

# 2. validate structurally (reads the issue back; no style lint)
bash "$here/validate-output.sh" "$id"

# 3. close and 4. assert closure
bd close "$id" >/dev/null
[[ "$(bd show "$id" --json | jq -r '.[0].status')" == "closed" ]] \
|| { echo "close-task: $id did not close" >&2; exit 2; }
echo "closed $id"

# 5. cascade: close each ancestor group whose direct children are all closed.
# The epic root is never closed here — "root open, no open tasks" is the
# session-complete state that epic-root.sh and the workflows rely on.
cur_id="$id"
while [[ "$cur_id" == *.* ]]; do
  parent="${cur_id%.*}"
  parent_json="$(bd show "$parent" --json 2>/dev/null)" || break
  [[ "$(jq -r '.[0].metadata.research_step.epic_root // false' <<<"$parent_json")" == "true" ]] && break
  open_kids="$(bd list --json --limit 0 | jq --arg p "$parent" '
    [ .[]
      | select(.id | startswith($p + "."))
      | select((.id[($p|length)+1:] | contains(".")) | not)
      | select(.status != "closed") ] | length')"
  [[ "$open_kids" -eq 0 ]] || break
  if bd close "$parent" >/dev/null 2>&1; then
    echo "closed group $parent"
  else
    echo "close-task: warning: could not close group $parent (task $id is closed; close the group manually)" >&2
    break
  fi
  cur_id="$parent"
done
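The cascade in step 5 of close-task.sh can be sketched in isolation. This is an illustrative in-memory model only: `cascade_close`, the `status` dict, and the `epic_roots` set are hypothetical stand-ins for what the script reads from `bd show` and `bd list`.

```python
def cascade_close(closed_id: str, status: dict[str, str], epic_roots: set[str]) -> list[str]:
    """Close ancestor groups of `closed_id` whose direct children are all closed.

    Mutates `status` and returns the ids of the groups closed, innermost first.
    """
    closed_groups = []
    cur = closed_id
    while "." in cur:
        # Same effect as ${cur_id%.*}: drop the last dotted segment.
        parent = cur.rsplit(".", 1)[0]
        # Stop when the parent is unknown or is the epic root (never closed here).
        if parent not in status or parent in epic_roots:
            break
        prefix = parent + "."
        open_kids = [
            issue_id
            for issue_id, state in status.items()
            if issue_id.startswith(prefix)
            and "." not in issue_id[len(prefix):]  # direct children only
            and state != "closed"
        ]
        if open_kids:
            break
        status[parent] = "closed"
        closed_groups.append(parent)
        cur = parent
    return closed_groups
```

The loop stops at the first ancestor that still has an open direct child, so a group higher up is only considered once every group beneath it on the path has closed.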
26 changes: 26 additions & 0 deletions plugins/asta-preview/skills/research-step/scripts/create-task.sh
@@ -0,0 +1,26 @@
#!/usr/bin/env bash
# create-task.sh <parent-id> <task_type> <flow> <title> <brief-description> [input-id ...]
# Create a leaf task issue under <parent-id>: hierarchical id, a brief one-line description,
# and initialized research_step metadata. output_json / output_markdown stay null until
# execute publishes them via close-task.sh. Prints the new issue id.
set -euo pipefail
here="$(cd "$(dirname "${BASH_SOURCE[0]}")" && pwd)"

[[ $# -ge 5 ]] || { echo "usage: create-task.sh <parent-id> <task_type> <flow> <title> <brief-desc> [input-id ...]" >&2; exit 1; }
parent="$1"; task_type="$2"; flow="$3"; title="$4"; desc="$5"; shift 5

# Validate the task_type against schemas.yaml. The helper exits 3 for an
# unknown task_type (and prints the known ones) or 5 when the schema cannot
# be read (e.g. PyYAML missing — run init); set -e propagates either.
"$here/task-output-keys.sh" "$task_type" >/dev/null

[[ -n "$desc" ]] || { echo "create-task: a brief description is required" >&2; exit 4; }
[[ "$desc" != *$'\n'* ]] || { echo "create-task: description must be one line" >&2; exit 4; }
[[ "${#desc}" -le 200 ]] || { echo "create-task: description too long (${#desc} chars > 200) — keep it brief" >&2; exit 4; }

if [[ $# -eq 0 ]]; then inputs_json="[]"; else inputs_json="$(printf '%s\n' "$@" | jq -R . | jq -cs .)"; fi
meta="$(jq -nc --arg f "$flow" --arg tt "$task_type" --argjson inp "$inputs_json" \
  '{research_step: {flow: $f, task_type: $tt, inputs: $inp, output_schema_version: 2, output_json: null, output_markdown: null}}')"
tmp="$(mktemp)"; trap 'rm -f "$tmp"' EXIT
printf '%s' "$meta" > "$tmp"
bd create "$title" --parent "$parent" -d "$desc" --metadata @"$tmp" --silent
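As a rough illustration of what create-task.sh checks and emits, the sketch below mirrors the description rules and the initial metadata envelope in Python. The function name `build_metadata` is hypothetical, and the schema lookup done by `task-output-keys.sh` is deliberately left out.

```python
MAX_DESC_CHARS = 200  # same limit the script enforces on the description


def build_metadata(flow: str, task_type: str, desc: str, input_ids=()) -> dict:
    """Validate the description and return the initial research_step envelope."""
    if not desc:
        raise ValueError("a brief description is required")
    if "\n" in desc:
        raise ValueError("description must be one line")
    if len(desc) > MAX_DESC_CHARS:
        raise ValueError(f"description too long ({len(desc)} chars > {MAX_DESC_CHARS})")
    return {
        "research_step": {
            "flow": flow,
            "task_type": task_type,
            "inputs": list(input_ids),
            "output_schema_version": 2,
            # Both stay null until execute publishes them via close-task.sh.
            "output_json": None,
            "output_markdown": None,
        }
    }
```

Serializing the result with `json.dumps` gives a payload of the same shape as the one the script writes to its temporary file before calling `bd create`.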
Original file line number Diff line number Diff line change
@@ -33,7 +33,7 @@ if ! command -v jq >/dev/null 2>&1; then
exit 3
fi

ids=$(bd list --json | jq -r '.[] | select(.metadata.research_step.epic_root == true) | .id')
ids=$(bd list --json --limit 0 | jq -r '.[] | select(.metadata.research_step.epic_root == true) | .id')
count=$(printf '%s' "$ids" | grep -c . || true)

case "$count" in
34 changes: 34 additions & 0 deletions plugins/asta-preview/skills/research-step/scripts/next-task.sh
@@ -0,0 +1,34 @@
#!/usr/bin/env bash
# next-task.sh — the single definition of task ordering. Prints the open task
# issues (status == open, metadata.research_step.task_type set), sorted
# *numerically* by hierarchical id (wf.1.2 before wf.1.10 — a plain lexical
# sort would get this wrong past 9 siblings). Groups (no task_type) are never
# listed; there are no dependency edges, so this order is the ordering signal.
#
# Used by execute (pick the next task) and update-summary (render the queue),
# so the two never disagree about what runs next.
#
# Output (stdout, key: value lines):
# next: <bd-id> | none
# queue: <space-separated bd-ids> (omitted when empty)
# Exit: 0 (even when next: none) · 3 bd/jq missing
set -euo pipefail

command -v bd >/dev/null 2>&1 || { echo "next-task: 'bd' not found on PATH" >&2; exit 3; }
command -v jq >/dev/null 2>&1 || { echo "next-task: 'jq' not found on PATH" >&2; exit 3; }

ids="$(bd list --json --limit 0 | jq -r '
  [ .[]
    | select(.status == "open")
    | select(.metadata.research_step.task_type != null) ]
  | sort_by(.id | split(".") | map(tonumber? // .))
  | .[].id')"

if [[ -z "$ids" ]]; then
  echo "next: none"
  exit 0
fi

echo "next: $(head -n1 <<<"$ids")"
rest="$(tail -n +2 <<<"$ids" | tr '\n' ' ' | sed 's/ $//')"
[[ -n "$rest" ]] && echo "queue: $rest" || true
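The numeric ordering that next-task.sh relies on can be reproduced outside jq. Below is a minimal sketch with hypothetical helper names (`id_sort_key`, `next_and_queue`). It treats only all-digit segments as numbers, which is a simplification of jq's `tonumber`, and it places numeric segments before string segments, as jq's sort order does.

```python
def id_sort_key(issue_id: str) -> list:
    """Sort key for hierarchical ids: numeric segments compare as numbers."""
    key = []
    for part in issue_id.split("."):
        if part.isascii() and part.isdigit():
            key.append((0, int(part), ""))  # numbers sort before strings
        else:
            key.append((1, 0, part))
    return key


def task_type_of(issue: dict):
    """Return metadata.research_step.task_type, or None when any level is missing."""
    research_step = (issue.get("metadata") or {}).get("research_step") or {}
    return research_step.get("task_type")


def next_and_queue(issues: list[dict]):
    """Return (next_id, queue_ids) over open issues that carry a task_type."""
    open_tasks = [
        issue["id"]
        for issue in issues
        if issue.get("status") == "open" and task_type_of(issue) is not None
    ]
    ordered = sorted(open_tasks, key=id_sort_key)
    if not ordered:
        return None, []
    return ordered[0], ordered[1:]
```

With this key, `wf.1.2` sorts ahead of `wf.1.10`, whereas a plain string sort would put `wf.1.10` first.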
Original file line number Diff line number Diff line change
@@ -30,7 +30,7 @@ if ! command -v jq >/dev/null 2>&1; then
exit 3
fi

current=$(bd list --json \
current=$(bd list --json --limit 0 \
| jq -r '.[] | select(.status != "closed") | .id' \
| sort \
| shasum -a 256 \
Original file line number Diff line number Diff line change
@@ -0,0 +1,37 @@
#!/usr/bin/env bash
# task-output-keys.sh <task_type> — print the space-separated output keys for a
# task from assets/schemas.yaml. The single schema reader for scripts:
# create-task.sh uses it to validate a task_type, validate-output.sh to get the
# expected output_json keys.
# Exit: 0 ok · 1 usage · 3 unknown task_type · 5 cannot read schema
# (python3/PyYAML missing or schemas.yaml unreadable — run init)
set -euo pipefail
here="$(cd "$(dirname "${BASH_SOURCE[0]}")" && pwd)"
schemas="$here/../assets/schemas.yaml"

[[ $# -eq 1 ]] || { echo "usage: task-output-keys.sh <task_type>" >&2; exit 1; }

python3 - "$schemas" "$1" <<'PY'
import sys

try:
    import yaml
except ImportError:
    print("task-output-keys: python3 cannot import yaml (PyYAML) - run the init workflow", file=sys.stderr)
    sys.exit(5)

try:
    with open(sys.argv[1]) as f:
        d = yaml.safe_load(f)
except Exception as e:
    print(f"task-output-keys: cannot read {sys.argv[1]}: {e}", file=sys.stderr)
    sys.exit(5)

tasks = d.get("tasks") or {}
t = tasks.get(sys.argv[2])
if t is None:
    print(f"task-output-keys: unknown task_type '{sys.argv[2]}'", file=sys.stderr)
    print(f"task-output-keys: known: {' '.join(sorted(tasks))}", file=sys.stderr)
    sys.exit(3)
print(" ".join(t["output"]))
PY
141 changes: 52 additions & 89 deletions plugins/asta-preview/skills/research-step/scripts/validate-output.sh
@@ -1,102 +1,65 @@
#!/usr/bin/env bash
# validate-output.sh — structural validation of a research_step output JSON.
#
# Usage: validate-output.sh <task_type> <metadata-json-file>
#
# Verifies that the JSON file:
# 1. parses
# 2. carries the canonical metadata envelope
# ({research_step: {task_type, inputs, output_schema_version, output}})
# 3. has every required `output.<key>` for the given <task_type> per
# assets/schemas.yaml (schema_version: 1)
#
# Exit codes:
# 0 — valid
# 2 — JSON parse error
# 3 — unknown task_type
# 4 — missing required field
# 5 — task_type mismatch with envelope
#
# This is structural validation only. Quality validation (sound prediction,
# sane confidence, valid citations) is out of scope per execute.md.
# validate-output.sh <issue-id> — structural check of a task's stored output_json.
# Reads the issue from beads and deep-validates metadata.research_step.output_json
# against the compiled JSON Schema (assets/compiled/<task_type>.schema.json,
# regenerated from schemas.yaml by scripts/compile-schemas.py at build time):
# top-level keys closed, declared nested fields required, extra nested fields
# permitted (payloads nest verbatim). No style or quality linting.
# Exit: 0 ok · 1 usage · 2 bad issue/metadata · 3 unknown task
# · 4 schema violation
# · 5 schema unreadable (PyYAML/jsonschema missing or compiled schema
# absent — run the init workflow, or update the plugin)
set -euo pipefail
here="$(cd "$(dirname "${BASH_SOURCE[0]}")" && pwd)"

if [[ $# -ne 2 ]]; then
echo "usage: validate-output.sh <task_type> <metadata-json-file>" >&2
exit 1
fi
[[ $# -eq 1 ]] || { echo "usage: validate-output.sh <issue-id>" >&2; exit 1; }
id="$1"

task_type="$1"
file="$2"
rs="$(bd show "$id" --json 2>/dev/null | jq -c '.[0].metadata.research_step // empty')"
[[ -n "$rs" ]] || { echo "validate-output: $id has no metadata.research_step" >&2; exit 2; }
task_type="$(jq -r '.task_type // empty' <<<"$rs")"
[[ -n "$task_type" ]] || { echo "validate-output: $id has no task_type" >&2; exit 2; }

if ! jq -e . "$file" > /dev/null 2>&1; then
echo "validate-output: $file is not valid JSON" >&2
exit 2
fi
# Exits 3 (unknown task_type) or 5 (schema unreadable) with its own message.
"$here/task-output-keys.sh" "$task_type" >/dev/null

# Required output fields, mirroring assets/schemas.yaml (schema_version: 1).
case "$task_type" in
scope) required="question boundaries success_criteria" ;;
definitions) required="terms" ;;
literature_review) required="summary_path key_findings gaps citations" ;;
hypothesis) required="statement rationale falsifiable_prediction expected_evidence" ;;
experiment_design) required="method procedure variables artifacts_expected" ;;
evidence_gathering) required="artifacts log_path deviations" ;;
analysis) required="verdict confidence reasoning caveats" ;;
synthesis) required="answer supporting_hypotheses refuted_hypotheses open_questions report_path" ;;
*)
echo "validate-output: unknown task_type '$task_type'" >&2
echo "validate-output: expected one of scope|definitions|literature_review|hypothesis|experiment_design|evidence_gathering|analysis|synthesis" >&2
exit 3
;;
esac
got="$(jq -c '.output_json // empty' <<<"$rs")"
[[ -n "$got" && "$got" != "null" ]] || { echo "validate-output: $id has no output_json" >&2; exit 4; }

# Envelope must carry the matching task_type so we don't validate scope JSON
# against an analysis schema by accident.
envelope_type=$(jq -r '.research_step.task_type // empty' "$file")
if [[ -z "$envelope_type" ]]; then
echo "validate-output: $file missing .research_step.task_type" >&2
schema="$here/../assets/compiled/${task_type}.schema.json"
[[ -r "$schema" ]] || {
echo "validate-output: compiled schema missing for '$task_type' ($schema) — update the plugin (it is regenerated at build time)" >&2
exit 5
fi
if [[ "$envelope_type" != "$task_type" ]]; then
echo "validate-output: envelope task_type='$envelope_type' but expected '$task_type'" >&2
exit 5
fi
}
OUTPUT_JSON="$got" python3 - "$schema" "$task_type" <<'PY'
import json
import os
import sys

# Envelope shape sanity.
for key in inputs output_schema_version output; do
if ! jq -e ".research_step | has(\"$key\")" "$file" >/dev/null; then
echo "validate-output: $file missing .research_step.$key" >&2
exit 5
fi
done
try:
    import jsonschema
except ImportError:
    print("validate-output: python3 cannot import jsonschema - run the init workflow", file=sys.stderr)
    sys.exit(5)

# Required output fields.
for key in $required; do
if ! jq -e ".research_step.output | has(\"$key\")" "$file" >/dev/null; then
echo "validate-output: missing required field 'output.$key' for task_type '$task_type'" >&2
exit 4
fi
done
with open(sys.argv[1]) as f:
    schema = json.load(f)
data = json.loads(os.environ["OUTPUT_JSON"])

# Type spot-checks for the high-leverage cases. Not exhaustive — just the
# fields where a wrong type at this layer would silently break update-summary rendering
# or downstream tasks.
case "$task_type" in
literature_review)
jq -e '.research_step.output.key_findings | type == "array"' "$file" >/dev/null \
|| { echo "validate-output: output.key_findings must be an array" >&2; exit 4; }
jq -e '.research_step.output.gaps | type == "array"' "$file" >/dev/null \
|| { echo "validate-output: output.gaps must be an array" >&2; exit 4; }
jq -e '.research_step.output.citations | type == "array"' "$file" >/dev/null \
|| { echo "validate-output: output.citations must be an array" >&2; exit 4; }
;;
analysis)
jq -e '.research_step.output.verdict | IN("supported", "refuted", "inconclusive")' "$file" >/dev/null \
|| { echo "validate-output: output.verdict must be one of supported|refuted|inconclusive" >&2; exit 4; }
jq -e '.research_step.output.confidence | type == "number" and . >= 0 and . <= 1' "$file" >/dev/null \
|| { echo "validate-output: output.confidence must be a number in [0, 1]" >&2; exit 4; }
;;
esac
validator = jsonschema.Draft202012Validator(schema)
errors = sorted(validator.iter_errors(data), key=lambda e: list(map(str, e.absolute_path)))
if errors:
    for e in errors[:5]:
        path = ".".join(str(p) for p in e.absolute_path)
        where = f"output_json.{path}" if path else "output_json"
        hint = ""
        if e.validator == "additionalProperties" and not path:
            hint = " - byproducts go in artifacts"
        print(f"validate-output: {where}: {e.message}{hint}", file=sys.stderr)
    if len(errors) > 5:
        print(f"validate-output: ... and {len(errors) - 5} more schema violation(s)", file=sys.stderr)
    print(f"validate-output: output_json does not satisfy the '{sys.argv[2]}' schema", file=sys.stderr)
    sys.exit(4)
PY

echo "ok"
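The header comment of the new validate-output.sh describes a contract in which top-level keys are closed, declared nested fields are required, and extra nested fields are permitted. The sketch below illustrates that contract with the standard library only. It is not the script's validator (which uses `jsonschema` against a compiled schema), and both `check_output` and the `spec` layout are hypothetical simplifications. Whether a missing top-level key is itself an error is not stated in the diff, so the sketch does not check for it.

```python
def check_output(data: dict, spec: dict[str, list[str]]) -> list[str]:
    """Check `data` against `spec`, which maps each allowed top-level key
    to the nested fields that key's object must contain."""
    errors = []
    # Closed top level: any key not declared in the spec is a violation.
    for key in sorted(set(data) - set(spec)):
        errors.append(
            f"output_json: unexpected top-level key '{key}' - byproducts go in artifacts"
        )
    # Declared nested fields are required; undeclared nested fields are allowed.
    for key, required_fields in spec.items():
        value = data.get(key)
        if not isinstance(value, dict):
            continue
        for field in required_fields:
            if field not in value:
                errors.append(f"output_json.{key}: missing required field '{field}'")
    return errors
```

Under a hypothetical spec `{"verdict": ["subject", "label"]}`, an extra nested field such as `note` inside `verdict` passes, while an extra top-level key such as `scratch` is reported.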