Problem
The reflections loop has a structural gap: agents can log a decision or prediction at bet start, but the validation and resolution counterparts happen during PR review — a phase owned by sensei (the orchestrator), not an agent.
Currently there is no enforced or even suggested path for sensei to write `validation` or `resolution` reflection records when reviewing a PR. This means:
- `reflections.jsonl` files exist but contain zero `validation` or `resolution` entries
- `BeltCalculator.readRunMetrics()` reads `predictionOutcomePairs` from `type: 'validation'` records — always 0
- The calibration loop is broken: predictions are made, outcomes are observed, but nothing closes the loop
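The always-0 behavior follows directly from the record types: if only `validation` records count as closed pairs, a log containing nothing but predictions and decisions yields zero. A hypothetical sketch of such a pair-counting pass (names mirror this issue, not the actual `BeltCalculator` implementation):

```typescript
// Hypothetical reflection record shape; field names are assumptions.
type Reflection = { type: string; predictionId?: string; accuracy?: number };

// Only 'validation' records that reference a prediction close the loop.
function countPredictionOutcomePairs(records: Reflection[]): number {
  return records.filter(
    (r) => r.type === "validation" && r.predictionId !== undefined
  ).length;
}

// A reflections log with predictions but no validations — the current state.
const log: Reflection[] = [
  { type: "prediction", predictionId: "p1" },
  { type: "decision" },
];
console.log(countPredictionOutcomePairs(log)); // 0
```

However the real metric is computed, no amount of prediction-logging on the agent side can move it without a matching `validation` record.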
Example scenario
- Agent bets that "refactoring X will reduce test time by 20%" — logs a `prediction` observation
- Agent completes work, PR opened
- Sensei reviews PR, CI shows test time reduced by 18%
- Gap: no mechanism or reminder for sensei to log a `validation` reflection linking the outcome back to the prediction
- Belt calculator sees 0 prediction-outcome pairs; calibration accuracy stays at 0
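The record missing at the gap step might look like the following (a hypothetical sketch; the field names and the predicted/observed accuracy convention are assumptions, not the real schema):

```typescript
// Hypothetical validation record closing the loop on the scenario above.
const validation = {
  type: "validation",
  runId: "run-123",     // the completed run (illustrative id)
  predictionId: "p1",   // links back to the agent's prediction record
  predicted: 0.2,       // agent predicted a 20% test-time reduction
  observed: 0.18,       // CI showed an 18% reduction
  accuracy: 0.9,        // observed / predicted
};

// One JSONL line appended to reflections.jsonl would close the loop.
console.log(JSON.stringify(validation));
```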
Desired behavior
- When sensei runs `kata kiai complete <run-id> --success` (or `--failure`), the CLI should prompt (or accept flags) for:
  - Was there an active prediction for this run? Did it validate?
  - Were any frictions resolved? (resolution record)
- Alternatively, `kata kansatsu record validation --run-id <id> --prediction-id <id> --accuracy 0.9` should be surfaced in the `kata kiai complete` flow
- Agent context should remind agents to log predictions at start AND remind sensei to close the loop at completion
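The flag-based path could be handled with a small parsing step in the complete flow. A minimal sketch, assuming a `--validate-prediction <accuracy>` argument form (the argument parsing and record shape here are illustrative, not the real CLI):

```typescript
// Hypothetical handling of a --validate-prediction flag in `kiai complete`.
function completeRun(args: string[]): { type: string; accuracy: number } | null {
  const i = args.indexOf("--validate-prediction");
  if (i === -1) return null; // no prediction to close out for this run
  const accuracy = Number(args[i + 1]);
  return { type: "validation", accuracy }; // record to append to reflections.jsonl
}

console.log(JSON.stringify(completeRun(["run-123", "--success", "--validate-prediction", "0.9"])));
// {"type":"validation","accuracy":0.9}
console.log(completeRun(["run-123", "--success"])); // null
```

An interactive prompt could fall back to the same record-building step when the flag is absent, so both paths emit identical `validation` entries.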
Notes
- This gap causes `predictionOutcomePairs` to always be 0 in belt snapshots
- The fix is partly CLI (`kiai complete`) and partly documentation/agent-context (remind agents what to log)
Acceptance criteria
- `kata kiai complete` flow includes a validation prompt or `--validate-prediction` flag
- A `validation` record is written per completed run that had a prediction