1 change: 1 addition & 0 deletions README.md
@@ -310,6 +310,7 @@ Projects use JSON schema files in the `agentcore/` directory:
- [Evaluations](docs/evals.md) - Evaluators, on-demand evals, and online monitoring
- [Batch Evaluation](docs/batch-evaluation.md) - Run evaluators across sessions at scale
- [Recommendations](docs/recommendations.md) - Optimize prompts and tool descriptions
- [Insights](docs/insights.md) - Failure-pattern analysis and clustering across agent sessions
- [A/B Tests](docs/ab-tests.md) - Split traffic between variants and promote the winner

**Operations**
146 changes: 146 additions & 0 deletions docs/insights.md
@@ -0,0 +1,146 @@
# Insights — `[preview]`

Insights runs failure-pattern analysis across your agent's sessions. The insights service inspects historical traces,
clusters bad outcomes into failure categories, and surfaces root causes with recommendations you can act on. Run it
on demand with `run insights`, or attach a continuous config with `add online-insights`.

> **Preview:** the insights feature is in preview. Commands and output may change.

## Quick Start

```bash
# On-demand failure analysis over the last 7 days of sessions
agentcore run insights \
  -r MyAgent \
  --insights Builtin.Insight.FailureAnalysis

# Block until the job finishes
agentcore run insights -r MyAgent --insights Builtin.Insight.FailureAnalysis --wait
```

If you omit `--insights`, the CLI defaults to `Builtin.Insight.FailureAnalysis`.

## On-Demand Insights

`run insights` starts a job that analyzes the sessions it finds for your runtime in CloudWatch.

### Choosing the session window

By default, insights looks back 7 days. Narrow or widen the window with `--lookback-days`, or pin an explicit range
with `--start-time` / `--end-time`:

```bash
# Custom lookback window (1–90 days)
agentcore run insights -r MyAgent --insights Builtin.Insight.FailureAnalysis --lookback-days 14

# Explicit time range (ISO-8601)
agentcore run insights \
  -r MyAgent \
  --insights Builtin.Insight.FailureAnalysis \
  --start-time 2026-06-01T00:00:00Z \
  --end-time 2026-06-15T00:00:00Z
```
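
To see how the two forms relate, the sketch below turns a lookback in days into an explicit `--start-time` / `--end-time` pair. `lookbackToRange` is a hypothetical helper, not part of the CLI. Whether the service computes its window exactly this way, and whether it requires a whole number of days, are assumptions.

```typescript
// Hypothetical helper (not part of the agentcore CLI): converts a lookback
// window in days into ISO-8601 values usable with --start-time / --end-time.
function lookbackToRange(
  days: number,
  now: Date = new Date(),
): { startTime: string; endTime: string } {
  // The documented range is 1–90; rejecting fractional days is an assumption.
  if (!Number.isInteger(days) || days < 1 || days > 90) {
    throw new RangeError('lookback days must be an integer from 1 to 90');
  }
  const msPerDay = 24 * 60 * 60 * 1000;
  const start = new Date(now.getTime() - days * msPerDay);
  // Drop milliseconds so the output matches the format used in the examples above.
  const toIso = (d: Date): string => d.toISOString().replace(/\.\d{3}Z$/, 'Z');
  return { startTime: toIso(start), endTime: toIso(now) };
}
```

With `now` fixed at `2026-06-15T00:00:00Z`, a 14-day lookback yields the same range as the explicit example above.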

### Limiting to specific sessions

```bash
agentcore run insights -r MyAgent --session-ids <id-1> <id-2>
```

### Using an existing online eval config as the source

```bash
agentcore run insights --online-eval-config-arn <arn>
```

### Chaining into recommendations

Pass evaluators with `-e` so the resulting batch evaluation can later feed `run recommendation --from-insights`:

```bash
agentcore run insights -r MyAgent -e Builtin.Correctness
agentcore run recommendation -r MyAgent -e Builtin.Correctness --type system-prompt --from-insights <insights-id>
```

## Options Reference

| Option | Description |
| ---------------------------- | ---------------------------------------------------------------------- |
| `-r, --runtime <name>` | Runtime name from project config. |
| `--insights <ids...>` | Insight type(s). Defaults to `Builtin.Insight.FailureAnalysis`. |
| `-e, --evaluator <ids...>` | Evaluator(s) to include (needed for chaining into recommendations). |
| `--online-eval-config-arn` | Use an existing OnlineEvaluationConfig as the session source. |
| `-d, --lookback-days <days>` | Lookback window in days, 1–90 (default: 7). |
| `--start-time <iso8601>` | Session filter start time. |
| `--end-time <iso8601>` | Session filter end time. |
| `-s, --session-ids <ids...>` | Limit analysis to specific session IDs. |
| `-n, --name <name>` | Job name (auto-generated if omitted). |
| `--endpoint <name>` | Runtime endpoint name (e.g. `PROMPT_V1`). |
| `--wait` | Block until the job reaches a terminal state. |
| `--region <region>` | AWS region (auto-detected if omitted). |
| `--json` | Output as JSON. |

## Output

Insights jobs are fire-and-forget: `run insights` returns the job `id` and an initial `status`
(`PENDING`/`IN_PROGRESS`) — the failure analysis is **not** available immediately. Pass `--wait` to block until the job
finishes, or check later with `agentcore view insights <id>`.

```bash
agentcore run insights -r MyAgent --insights Builtin.Insight.FailureAnalysis --json
```
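
If you script against the JSON output instead of using `--wait`, a polling loop along these lines may help. This is a minimal sketch: `pollUntilTerminal` and its parameters are hypothetical, the caller supplies `fetchStatus` (for example, a wrapper that runs `agentcore view insights <id> --json` and reads `status`), and treating `COMPLETED` and `FAILED` as the only terminal states is an assumption based on the status list in this document.

```typescript
type JobStatus = 'PENDING' | 'IN_PROGRESS' | 'COMPLETED' | 'FAILED';

// Assumption: COMPLETED and FAILED are the terminal states.
function isTerminal(status: JobStatus): boolean {
  return status === 'COMPLETED' || status === 'FAILED';
}

// Hypothetical polling loop. `fetchStatus` is injected by the caller, so this
// sketch does not depend on any particular way of invoking the CLI.
async function pollUntilTerminal(
  fetchStatus: () => Promise<JobStatus>,
  intervalMs = 5000,
  maxAttempts = 120,
): Promise<JobStatus> {
  for (let attempt = 0; attempt < maxAttempts; attempt++) {
    const status = await fetchStatus();
    if (isTerminal(status)) return status;
    // Wait before the next check.
    await new Promise<void>(resolve => setTimeout(resolve, intervalMs));
  }
  throw new Error(`job did not reach a terminal state after ${maxAttempts} attempts`);
}
```

Injecting `fetchStatus` keeps the loop testable and leaves the choice of transport (CLI subprocess, SDK call) to the caller.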

A completed job record includes:

| Field | Description |
| ----------------------- | ---------------------------------------------------------------------------- |
| `id` | Insights job ID. |
| `name` | Job name. |
| `status` | Job status (`PENDING`, `IN_PROGRESS`, `COMPLETED`, `FAILED`). |
| `insights` | Insight type(s) requested. |
| `evaluators` | Evaluators included (when chaining into recommendations). |
| `failureAnalysisResult` | Structured failure categories, each with root causes and recommendations. |
| `evaluationResults` | Per-evaluator score summaries (when evaluators were included). |

Each failure category carries a name, a description, an optional group, and one or more root causes. A root cause
includes a category, a description, a recommendation, and the related session IDs.
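
The shape described above can be sketched as TypeScript types, along with a small helper that orders root causes by how many distinct sessions they touch. The property names (`rootCauses`, `sessionIds`, and so on) and the `rankRootCauses` helper are assumptions for illustration. Check the actual `--json` output for the real field names.

```typescript
// Assumed shapes; property names are illustrative, not confirmed by the service.
interface RootCause {
  category: string;
  description: string;
  recommendation: string;
  sessionIds: string[];
}

interface FailureCategory {
  name: string;
  description: string;
  group?: string; // optional, per the description above
  rootCauses: RootCause[];
}

interface RankedRootCause {
  failure: string;
  recommendation: string;
  sessions: number;
}

// Hypothetical helper: flattens categories into root causes and sorts them
// by the number of distinct related sessions, most affected first.
function rankRootCauses(categories: FailureCategory[]): RankedRootCause[] {
  return categories
    .flatMap(category =>
      category.rootCauses.map(cause => ({
        failure: category.name,
        recommendation: cause.recommendation,
        sessions: new Set(cause.sessionIds).size,
      })),
    )
    .sort((a, b) => b.sessions - a.sessions);
}
```

Sorting by distinct session count is one possible way to decide which recommendation to act on first. Other orderings, such as grouping by `group`, may suit your workflow better.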

## Viewing History

List past insights jobs or view one in detail:

```bash
# List all insights jobs (or open the TUI when run without --json)
agentcore view insights --json

# Detail for a single job
agentcore view insights <id> --json
```

You can also browse jobs interactively via the TUI:

```bash
agentcore
# Navigate to: View → Insights
```

## Continuous Insights

Attach a config that runs insights continuously alongside your online evals:

```bash
agentcore add online-insights # add a continuous insights config bound to a runtime
agentcore pause online-insights <name>
agentcore resume online-insights <name>
```

Use `--arn <arn>` with `pause`/`resume` to target configs outside the current project.

## Archiving

Delete an insights job on the service and clear local history:

```bash
agentcore archive insights -i <insights-id>
agentcore archive insights -i <insights-id> --region us-west-2 --json
```
39 changes: 39 additions & 0 deletions src/cli/tui/__tests__/run-insights-copy.test.ts
@@ -0,0 +1,39 @@
/**
 * Regression test: the `run-insights` CLI examples in copy.ts must only reference
 * flags that the `run insights` command actually registers. Guards against drift
 * like `--lookback 7` (the real flag is `--lookback-days`).
 */
import { createProgram } from '../../cli';
import { CLI_ONLY_EXAMPLES } from '../copy';
import { describe, expect, it } from 'vitest';

// Collect every short and long flag registered on `run insights`.
function registeredFlags(): Set<string> {
  const program = createProgram();
  const runCmd = program.commands.find(c => c.name() === 'run');
  const insights = runCmd?.commands.find(c => c.name() === 'insights');
  if (!insights) throw new Error('run insights command not found');
  const flags = new Set<string>();
  for (const opt of insights.options) {
    if (opt.short) flags.add(opt.short);
    if (opt.long) flags.add(opt.long);
  }
  return flags;
}

const examples = CLI_ONLY_EXAMPLES['run-insights']?.examples ?? [];

describe('run-insights copy examples', () => {
  it('only reference flags the `run insights` command registers', () => {
    const flags = registeredFlags();
    const tokens = examples.flatMap(example => example.split(/\s+/)).filter(token => token.startsWith('-'));

    const unknown = tokens.filter(token => !flags.has(token));
    expect(unknown).toEqual([]);
  });

  it('does not reference the non-existent --lookback flag', () => {
    for (const example of examples) {
      // Also catches `--lookback` at the end of the string or as `--lookback=7`,
      // while still allowing `--lookback-days`.
      expect(example).not.toMatch(/--lookback(?![\w-])/);
    }
  });
});
6 changes: 3 additions & 3 deletions src/cli/tui/copy.ts
@@ -130,9 +130,9 @@ export const CLI_ONLY_EXAMPLES: Record<string, { description: string; examples:
   'run-insights': {
     description: '[preview] Run failure analysis on agent sessions. This command runs in the terminal.',
     examples: [
-      'agentcore run insights -r MyAgent -i FailureAnalysis',
-      'agentcore run insights -r MyAgent -i FailureAnalysis --lookback 7',
-      'agentcore run insights -r MyAgent -i FailureAnalysis --wait',
+      'agentcore run insights -r MyAgent --insights Builtin.Insight.FailureAnalysis',
+      'agentcore run insights -r MyAgent --insights Builtin.Insight.FailureAnalysis --lookback-days 7',
+      'agentcore run insights -r MyAgent --insights Builtin.Insight.FailureAnalysis --wait',
     ],
   },
   feedback: {