diff --git a/README.md b/README.md
index 32b3ab3dc..1ec84d801 100644
--- a/README.md
+++ b/README.md
@@ -91,10 +91,10 @@ agentcore invoke
 
 ### Resource Management
 
-| Command | Description |
-| -------- | ---------------------------------------------------- |
-| `add` | Add agents, memory, credentials, evaluators, targets |
-| `remove` | Remove resources from project |
+| Command | Description |
+| -------- | ------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- |
+| `add` | Add agents, memory, credentials, gateways and gateway-targets, evaluators, online evals, online insights, knowledge bases, config bundles, datasets, harnesses, policy engines and policies, payment managers and payment connectors, runtime endpoints |
+| `remove` | Remove any of the above resources from the project |
 
 > **Note**: Run `agentcore deploy` after `add` or `remove` to update resources in AWS.
 
@@ -145,6 +145,94 @@ agentcore invoke
 | `promote ab-test` | Apply the winning variant to `agentcore.json` |
 | `archive ab-test` | Delete the test on the service and clear local history |
 
+### Knowledge Bases
+
+| Command | Description |
+| -------------------- | ------------------------------------------------------- |
+| `add knowledge-base` | Add a managed Bedrock Knowledge Base wired to a gateway |
+
+> See [Knowledge Bases](docs/knowledge-bases.md) for ingestion, vectorization, and retrieval setup.
+
+### Insights — `[preview]`
+
+Failure-pattern analysis across agent sessions. Insights configs run continuously alongside online evals and surface
+clusters of bad outcomes.
+
+| Command | Description |
+| ------------------------ | ----------------------------------------------------------- |
+| `add online-insights` | Add a continuous insights config bound to a runtime |
+| `run insights` | Run on-demand failure analysis across recent sessions |
+| `view insights` | List insights jobs or view one in detail |
+| `pause online-insights` | Pause a deployed online insights config |
+| `resume online-insights` | Resume a paused online insights config |
+| `archive insights` | Delete an insights job on the service + clear local history |
+
+### Harness — `[preview]`
+
+> Harness commands are only available in the preview release of the CLI. Install it with:
+>
+> ```bash
+> npm install -g @aws/agentcore@preview
+> ```
+
+A harness bundles a runtime, model, tools, skills, memory, and observability into one declarative config. Use it when
+you want infra without writing agent code.
+
+| Command | Description |
+| ---------------- | --------------------------------------------------------------------------- |
+| `add harness` | Add a harness resource (runtime + model + memory) |
+| `add tool` | Add a tool to a harness (`--harness --type --name `) |
+| `add skill` | Add a skill to a harness (`--harness ` + `--path` / `--s3` / `--git`) |
+| `export harness` | Export a harness config to a deployable Strands Python agent under `app/` |
+
+> After `export harness`, **read `app//EXPORT_NOTES.md`** before running `deploy` — it lists any manual
+> follow-up the exporter could not automate.
+
+### Policies & Guardrails
+
+Policy engines apply Cedar-based pre/post-call policies to agent invocations — including Bedrock content filters
+(`VIOLENCE`, `HATE`, `SEXUAL`, `MISCONDUCT`, `INSULTS`), prompt-attack detection, and sensitive-information redaction.
+
+| Command | Description |
+| ------------------- | -------------------------------------------------------------------- |
+| `add policy-engine` | Add a Cedar policy engine to the project |
+| `add policy` | Add a policy to a policy engine (form-based guardrails or raw Cedar) |
+
+### Payments
+
+Pay-per-call agent transactions via the [x402 protocol](https://www.x402.org/). When a tool call returns
+`402 Payment Required`, the payments system signs and submits payment then retries automatically.
+
+| Command | Description |
+| ----------------------- | ---------------------------------------------------------------------------- |
+| `add payment-manager` | Add a payment manager (orchestrates payment sessions for the agent) |
+| `add payment-connector` | Add a payment connector with provider credentials (CoinbaseCDP, StripePrivy) |
+
+> See [Payments](docs/payments.md) for the full setup including instrument creation and tool allowlists.
+
+### Datasets
+
+Curated session datasets for batch evaluation and recommendation runs.
+
+| Command | Description |
+| ------------------------- | ------------------------------------------------------------------------- |
+| `add dataset` | Add a dataset resource (session list, ground-truth file, or trace filter) |
+| `dataset download` | Download a dataset version locally |
+| `dataset publish-version` | Publish a new dataset version |
+| `dataset remove-version` | Remove a dataset version |
+
+### Web Search Gateway Targets
+
+Add a managed web-search target to a gateway:
+
+```bash
+agentcore add gateway-target --type connector --connector web-search \
+  --gateway --name
+# Optional: --exclude-domains "example.com,foo.org"
+```
+
+See [Gateway](docs/gateway.md) for full target setup including Lambda, MCP, OpenAPI, Smithy, and API Gateway.
+
 ### Utilities
 
 | Command | Description |
@@ -181,31 +269,62 @@ my-project/
 
 Projects use JSON schema files in the `agentcore/` directory:
 
-- `agentcore.json` - Agent specifications, memory, credentials, evaluators, online evals
+- `agentcore.json` - Project resources (agents, memory, credentials, gateways, evaluators, online evals/insights,
+  knowledge bases, harnesses, policy engines and policies, payment managers and connectors, config bundles, datasets,
+  runtime endpoints)
 - `deployed-state.json` - Runtime state in agentcore/.cli/ (auto-managed)
 - `aws-targets.json` - Deployment targets (account, region)
 
 ## Capabilities
 
 - **Runtime** - Managed execution environment for deployed agents
-- **Memory** - Semantic, summarization, and user preference strategies
-- **Credentials** - Secure API key management via Secrets Manager
-- **Evaluations** - LLM-as-a-Judge for on-demand and continuous agent quality monitoring
+- **Memory** - Semantic, summarization, user-preference, and episodic strategies
+- **Credentials** - Secure API key + OAuth credential management via Secrets Manager
+- **Gateways** - MCP gateways with Lambda / MCP server / OpenAPI / Smithy / API Gateway / **web-search** /
+  **knowledge-base** targets
+- **Evaluations** - LLM-as-a-Judge for on-demand, batch, and continuous agent quality monitoring
+- **Recommendations** - Auto-optimize system prompts and tool descriptions from real session traces
+- **A/B Tests** - Traffic-split between config-bundle or target-based variants and promote the winner
+- **Insights** _[preview]_ - Failure-pattern analysis and clustering across agent sessions
+- **Knowledge Bases** - Managed Bedrock Knowledge Bases auto-wired to gateways
+- **Harness** - Declarative agent: bundle runtime + tools + skills + memory + observability without writing agent code
+- **Policies & Guardrails** - Cedar pre/post-call policies including Bedrock content filters, prompt-attack detection,
+  and sensitive-information redaction
+- **Payments** - x402-protocol microtransactions for pay-per-call tools and APIs
+- **Config Bundles** - Versioned runtime configurations as a separately-deployable resource
 
 ## Documentation
 
+**Reference**
+
 - [CLI Commands Reference](docs/commands.md) - Full command reference for scripting and CI/CD
 - [Configuration](docs/configuration.md) - Schema reference for config files
+- [Frameworks](docs/frameworks.md) - Supported frameworks and model providers
+- [PERMISSIONS](docs/PERMISSIONS.md) - IAM permissions required to deploy
+
+**Resources & features**
+
+- [Memory](docs/memory.md) - Memory strategies and sharing
+- [Gateway](docs/gateway.md) - Gateway setup, targets, and authentication
+- [Knowledge Bases](docs/knowledge-bases.md) - Managed Bedrock Knowledge Bases wired to gateways
+- [Payments](docs/payments.md) - x402-protocol microtransactions for paid tools/APIs
+- [Config Bundles](docs/config-bundles.md) - Versioned runtime configurations
+- [Container Builds](docs/container-builds.md) - Container build types and Dockerfile setup
+
+**Evaluation & quality**
+
 - [Evaluations](docs/evals.md) - Evaluators, on-demand evals, and online monitoring
 - [Batch Evaluation](docs/batch-evaluation.md) - Run evaluators across sessions at scale
 - [Recommendations](docs/recommendations.md) - Optimize prompts and tool descriptions
 - [A/B Tests](docs/ab-tests.md) - Split traffic between variants and promote the winner
-- [Config Bundles](docs/config-bundles.md) - Versioned runtime configurations
-- [Frameworks](docs/frameworks.md) - Supported frameworks and model providers
-- [Gateway](docs/gateway.md) - Gateway setup, targets, and authentication
-- [Knowledge Bases](docs/knowledge-bases.md) - Managed Bedrock Knowledge Bases wired to gateways
-- [Memory](docs/memory.md) - Memory strategies and sharing
+
+**Operations**
+
 - [Local Development](docs/local-development.md) - Dev server and debugging
+- [Transaction Search](docs/transaction_search.md) - Trace + log search across agent invocations
+- [Telemetry](docs/telemetry.md) - CLI usage telemetry — what's collected and how to opt out
+- [TUI Harness](docs/tui-harness.md) - Programmatic TUI driver for testing
+- [Testing](docs/TESTING.md) - Unit, integration, and e2e test infrastructure
 - [Feedback](docs/feedback.md) - Submit feedback from your terminal
 
 ## Examples
diff --git a/src/assets/__tests__/__snapshots__/assets.snapshot.test.ts.snap b/src/assets/__tests__/__snapshots__/assets.snapshot.test.ts.snap
index 4b42f93a5..154ece89b 100644
--- a/src/assets/__tests__/__snapshots__/assets.snapshot.test.ts.snap
+++ b/src/assets/__tests__/__snapshots__/assets.snapshot.test.ts.snap
@@ -7148,13 +7148,21 @@ file maps to a JSON config file and includes validation constraints as comments
 
 ### Key Types
 
-- **AgentCoreProjectSpec**: Root config with \`runtimes\`, \`memories\`, \`credentials\`, \`agentCoreGateways\`, \`evaluators\`, \`onlineEvalConfigs\`, \`policyEngines\` arrays
+- **AgentCoreProjectSpec**: Root config with \`runtimes\`, \`memories\`, \`credentials\`, \`agentCoreGateways\`, \`evaluators\`, \`onlineEvalConfigs\`, \`onlineInsightsConfigs\`, \`knowledgeBases\`, \`harnesses\`, \`policyEngines\`, \`policies\`, \`payments\` (managers + connectors), \`configBundles\`, \`datasets\`, \`runtimeEndpoints\` arrays
 - **AgentEnvSpec**: Agent configuration (build type, entrypoint, code location, runtime version, network mode)
 - **Memory**: Memory resource with strategies (SEMANTIC, SUMMARIZATION, USER_PREFERENCE, EPISODIC) and expiry
 - **Credential**: API key or OAuth credential provider
-- **AgentCoreGateway**: MCP gateway with targets (Lambda, MCP server, OpenAPI, Smithy, API Gateway)
+- **AgentCoreGateway**: MCP gateway with targets (Lambda, MCP server, OpenAPI, Smithy, API Gateway, web-search, knowledge-base)
 - **Evaluator**: LLM-as-a-Judge or code-based evaluator
 - **OnlineEvalConfig**: Continuous evaluation pipeline bound to an agent
+- **OnlineInsightsConfig** _[preview]_: Continuous failure-pattern analysis bound to an agent
+- **KnowledgeBase**: Managed Bedrock Knowledge Base auto-wired to a gateway
+- **Harness**: Declarative agent — runtime + tools + skills + memory + observability without writing agent code
+- **PolicyEngine** + **Policy**: Cedar policy engine with form-based guardrails (Bedrock content filters, prompt-attack, sensitive-info) or raw Cedar policies
+- **PaymentManager** + **PaymentConnector**: x402-protocol payment orchestration with provider credentials (CoinbaseCDP, StripePrivy)
+- **ConfigBundle**: Versioned runtime configuration as a separately-deployable resource
+- **Dataset**: Curated session dataset for batch evaluation and recommendation runs
+- **RuntimeEndpoint**: Named endpoint (e.g. \`PROMPT_V1\`) targeting a specific runtime version
 
 ### Common Enum Values
 
@@ -7162,8 +7170,11 @@ file maps to a JSON config file and includes validation constraints as comments
 - **NetworkMode**: \`'PUBLIC'\` | \`'VPC'\`
 - **RuntimeVersion**: \`'PYTHON_3_10'\` | \`'PYTHON_3_11'\` | \`'PYTHON_3_12'\` | \`'PYTHON_3_13'\` | \`'PYTHON_3_14'\` | \`'NODE_18'\` | \`'NODE_20'\` | \`'NODE_22'\`
 - **MemoryStrategyType**: \`'SEMANTIC'\` | \`'SUMMARIZATION'\` | \`'USER_PREFERENCE'\` | \`'EPISODIC'\`
-- **GatewayTargetType**: \`'lambda'\` | \`'mcpServer'\` | \`'openApiSchema'\` | \`'smithyModel'\` | \`'apiGateway'\` | \`'lambdaFunctionArn'\`
+- **GatewayTargetType**: \`'lambda'\` | \`'mcpServer'\` | \`'openApiSchema'\` | \`'smithyModel'\` | \`'apiGateway'\` | \`'lambdaFunctionArn'\` | \`'connector'\` (web-search, bedrock-knowledge-bases, bedrock-agentic-retrieve)
 - **ModelProvider**: \`'Bedrock'\` | \`'Gemini'\` | \`'OpenAI'\` | \`'Anthropic'\`
+- **PaymentProvider**: \`'CoinbaseCDP'\` | \`'StripePrivy'\`
+- **PolicyEnforcementMode**: \`'ACTIVE'\` | \`'PASSIVE'\`
+- **GuardrailContentFilter**: \`'VIOLENCE'\` | \`'HATE'\` | \`'SEXUAL'\` | \`'MISCONDUCT'\` | \`'INSULTS'\`
 
 ### Build Types
 
@@ -7229,22 +7240,73 @@ cat app//EXPORT_NOTES.md # read this before touching anyt
 
 ## CLI Commands
 
+Run \`agentcore --help\` or \`agentcore --help\` for full flags. Commonly used:
+
+**Project lifecycle**
+
 | Command | Description |
 | --- | --- |
 | \`agentcore create\` | Create a new project |
-| \`agentcore add \` | Add agent, memory, credential, gateway, evaluator, policy |
-| \`agentcore remove \` | Remove a resource |
-| \`agentcore export harness\` | Export a harness to a Strands runtime agent |
 | \`agentcore dev\` | Run agent locally with hot-reload |
 | \`agentcore deploy\` | Deploy to AWS |
-| \`agentcore status\` | Show deployment status |
 | \`agentcore invoke\` | Invoke agent (local or deployed) |
-| \`agentcore logs\` | View agent logs |
-| \`agentcore traces\` | View agent traces |
-| \`agentcore eval\` | Run evaluations against an agent |
-| \`agentcore package\` | Package agent artifacts |
+| \`agentcore status\` | Show deployment status |
 | \`agentcore validate\` | Validate configuration |
-| \`agentcore pause\` / \`resume\` | Pause or resume a deployed agent |
+| \`agentcore package\` | Package agent artifacts |
+| \`agentcore import\` | Import resources from a Bedrock AgentCore Starter Toolkit project |
+
+**Resources**
+
+| Command | Description |
+| --- | --- |
+| \`agentcore add \` | Add agent, memory, credential, gateway, gateway-target, evaluator, online-eval, online-insights, knowledge-base, harness, policy-engine, policy, payment-manager, payment-connector, config-bundle, dataset, runtime-endpoint |
+| \`agentcore remove \` | Remove any resource |
+| \`agentcore export harness\` | Export a harness to a Strands runtime agent under \`app//\` |
+
+**Jobs (run, view, archive, lifecycle)**
+
+| Command | Description |
+| --- | --- |
+| \`agentcore run eval\` | Run on-demand evaluation against agent traces |
+| \`agentcore run batch-evaluation\` | Run evaluators across all sessions at scale |
+| \`agentcore run recommendation\` | Optimize prompts or tool descriptions from real traces |
+| \`agentcore run insights\` _[preview]_ | Run failure-pattern analysis across sessions |
+| \`agentcore run ab-test\` | Start an A/B test (config-bundle or target-based) |
+| \`agentcore run ingest\` | Start a fresh ingestion job for every data source on a deployed knowledge base |
+| \`agentcore view \` | List or view jobs (recommendation, batch-evaluation, ab-test, insights) |
+| \`agentcore archive \` | Delete a job on the service + clear local history |
+| \`agentcore stop \` | Stop a running batch-evaluation or ab-test |
+| \`agentcore promote ab-test\` | Apply the winning variant to \`agentcore.json\` |
+| \`agentcore pause \` / \`agentcore resume \` | Pause/resume a deployed online-eval, online-insights, or ab-test |
+
+**Config bundles & datasets**
+
+| Command | Description |
+| --- | --- |
+| \`agentcore config-bundle versions\` (alias \`cb versions\`) | List version history for a bundle |
+| \`agentcore config-bundle diff\` | Diff two versions of a bundle |
+| \`agentcore config-bundle create-branch\` | Create a new branch on an existing bundle |
+| \`agentcore dataset download\` | Download a dataset version locally |
+| \`agentcore dataset publish-version\` | Publish a new dataset version |
+| \`agentcore dataset remove-version\` | Remove a dataset version |
+
+**Observability & history**
+
+| Command | Description |
+| --- | --- |
+| \`agentcore logs\` | Stream/search agent runtime logs |
+| \`agentcore logs evals\` | Stream/search online-eval logs |
+| \`agentcore traces list\` / \`agentcore traces get\` | List recent traces or download one to JSON |
+| \`agentcore evals history\` | View past on-demand eval results |
+
+**Utilities**
+
+| Command | Description |
+| --- | --- |
+| \`agentcore fetch access\` | Fetch access info for deployed gateway or agent |
+| \`agentcore feedback\` | Send feedback (with optional screenshot) to the AgentCore team |
+| \`agentcore update\` | Check for and install CLI updates |
+| \`agentcore telemetry\` | View or change telemetry preferences |
 "
 `;
 
diff --git a/src/assets/agents/AGENTS.md b/src/assets/agents/AGENTS.md
index 431ce6cc3..7fd817773 100644
--- a/src/assets/agents/AGENTS.md
+++ b/src/assets/agents/AGENTS.md
@@ -56,13 +56,21 @@ file maps to a JSON config file and includes validation constraints as comments
 
 ### Key Types
 
-- **AgentCoreProjectSpec**: Root config with `runtimes`, `memories`, `credentials`, `agentCoreGateways`, `evaluators`, `onlineEvalConfigs`, `policyEngines` arrays
+- **AgentCoreProjectSpec**: Root config with `runtimes`, `memories`, `credentials`, `agentCoreGateways`, `evaluators`, `onlineEvalConfigs`, `onlineInsightsConfigs`, `knowledgeBases`, `harnesses`, `policyEngines`, `policies`, `payments` (managers + connectors), `configBundles`, `datasets`, `runtimeEndpoints` arrays
 - **AgentEnvSpec**: Agent configuration (build type, entrypoint, code location, runtime version, network mode)
 - **Memory**: Memory resource with strategies (SEMANTIC, SUMMARIZATION, USER_PREFERENCE, EPISODIC) and expiry
 - **Credential**: API key or OAuth credential provider
-- **AgentCoreGateway**: MCP gateway with targets (Lambda, MCP server, OpenAPI, Smithy, API Gateway)
+- **AgentCoreGateway**: MCP gateway with targets (Lambda, MCP server, OpenAPI, Smithy, API Gateway, web-search, knowledge-base)
 - **Evaluator**: LLM-as-a-Judge or code-based evaluator
 - **OnlineEvalConfig**: Continuous evaluation pipeline bound to an agent
+- **OnlineInsightsConfig** _[preview]_: Continuous failure-pattern analysis bound to an agent
+- **KnowledgeBase**: Managed Bedrock Knowledge Base auto-wired to a gateway
+- **Harness**: Declarative agent — runtime + tools + skills + memory + observability without writing agent code
+- **PolicyEngine** + **Policy**: Cedar policy engine with form-based guardrails (Bedrock content filters, prompt-attack, sensitive-info) or raw Cedar policies
+- **PaymentManager** + **PaymentConnector**: x402-protocol payment orchestration with provider credentials (CoinbaseCDP, StripePrivy)
+- **ConfigBundle**: Versioned runtime configuration as a separately-deployable resource
+- **Dataset**: Curated session dataset for batch evaluation and recommendation runs
+- **RuntimeEndpoint**: Named endpoint (e.g. `PROMPT_V1`) targeting a specific runtime version
 
 ### Common Enum Values
 
@@ -70,8 +78,11 @@ file maps to a JSON config file and includes validation constraints as comments
 - **NetworkMode**: `'PUBLIC'` | `'VPC'`
 - **RuntimeVersion**: `'PYTHON_3_10'` | `'PYTHON_3_11'` | `'PYTHON_3_12'` | `'PYTHON_3_13'` | `'PYTHON_3_14'` | `'NODE_18'` | `'NODE_20'` | `'NODE_22'`
 - **MemoryStrategyType**: `'SEMANTIC'` | `'SUMMARIZATION'` | `'USER_PREFERENCE'` | `'EPISODIC'`
-- **GatewayTargetType**: `'lambda'` | `'mcpServer'` | `'openApiSchema'` | `'smithyModel'` | `'apiGateway'` | `'lambdaFunctionArn'`
+- **GatewayTargetType**: `'lambda'` | `'mcpServer'` | `'openApiSchema'` | `'smithyModel'` | `'apiGateway'` | `'lambdaFunctionArn'` | `'connector'` (web-search, bedrock-knowledge-bases, bedrock-agentic-retrieve)
 - **ModelProvider**: `'Bedrock'` | `'Gemini'` | `'OpenAI'` | `'Anthropic'`
+- **PaymentProvider**: `'CoinbaseCDP'` | `'StripePrivy'`
+- **PolicyEnforcementMode**: `'ACTIVE'` | `'PASSIVE'`
+- **GuardrailContentFilter**: `'VIOLENCE'` | `'HATE'` | `'SEXUAL'` | `'MISCONDUCT'` | `'INSULTS'`
 
 ### Build Types
 
@@ -137,19 +148,70 @@ cat app//EXPORT_NOTES.md # read this before touching anyt
 
 ## CLI Commands
 
+Run `agentcore --help` or `agentcore --help` for full flags. Commonly used:
+
+**Project lifecycle**
+
 | Command | Description |
 | --- | --- |
 | `agentcore create` | Create a new project |
-| `agentcore add ` | Add agent, memory, credential, gateway, evaluator, policy |
-| `agentcore remove ` | Remove a resource |
-| `agentcore export harness` | Export a harness to a Strands runtime agent |
 | `agentcore dev` | Run agent locally with hot-reload |
 | `agentcore deploy` | Deploy to AWS |
-| `agentcore status` | Show deployment status |
 | `agentcore invoke` | Invoke agent (local or deployed) |
-| `agentcore logs` | View agent logs |
-| `agentcore traces` | View agent traces |
-| `agentcore eval` | Run evaluations against an agent |
-| `agentcore package` | Package agent artifacts |
+| `agentcore status` | Show deployment status |
 | `agentcore validate` | Validate configuration |
-| `agentcore pause` / `resume` | Pause or resume a deployed agent |
+| `agentcore package` | Package agent artifacts |
+| `agentcore import` | Import resources from a Bedrock AgentCore Starter Toolkit project |
+
+**Resources**
+
+| Command | Description |
+| --- | --- |
+| `agentcore add ` | Add agent, memory, credential, gateway, gateway-target, evaluator, online-eval, online-insights, knowledge-base, harness, policy-engine, policy, payment-manager, payment-connector, config-bundle, dataset, runtime-endpoint |
+| `agentcore remove ` | Remove any resource |
+| `agentcore export harness` | Export a harness to a Strands runtime agent under `app//` |
+
+**Jobs (run, view, archive, lifecycle)**
+
+| Command | Description |
+| --- | --- |
+| `agentcore run eval` | Run on-demand evaluation against agent traces |
+| `agentcore run batch-evaluation` | Run evaluators across all sessions at scale |
+| `agentcore run recommendation` | Optimize prompts or tool descriptions from real traces |
+| `agentcore run insights` _[preview]_ | Run failure-pattern analysis across sessions |
+| `agentcore run ab-test` | Start an A/B test (config-bundle or target-based) |
+| `agentcore run ingest` | Start a fresh ingestion job for every data source on a deployed knowledge base |
+| `agentcore view ` | List or view jobs (recommendation, batch-evaluation, ab-test, insights) |
+| `agentcore archive ` | Delete a job on the service + clear local history |
+| `agentcore stop ` | Stop a running batch-evaluation or ab-test |
+| `agentcore promote ab-test` | Apply the winning variant to `agentcore.json` |
+| `agentcore pause ` / `agentcore resume ` | Pause/resume a deployed online-eval, online-insights, or ab-test |
+
+**Config bundles & datasets**
+
+| Command | Description |
+| --- | --- |
+| `agentcore config-bundle versions` (alias `cb versions`) | List version history for a bundle |
+| `agentcore config-bundle diff` | Diff two versions of a bundle |
+| `agentcore config-bundle create-branch` | Create a new branch on an existing bundle |
+| `agentcore dataset download` | Download a dataset version locally |
+| `agentcore dataset publish-version` | Publish a new dataset version |
+| `agentcore dataset remove-version` | Remove a dataset version |
+
+**Observability & history**
+
+| Command | Description |
+| --- | --- |
+| `agentcore logs` | Stream/search agent runtime logs |
+| `agentcore logs evals` | Stream/search online-eval logs |
+| `agentcore traces list` / `agentcore traces get` | List recent traces or download one to JSON |
+| `agentcore evals history` | View past on-demand eval results |
+
+**Utilities**
+
+| Command | Description |
+| --- | --- |
+| `agentcore fetch access` | Fetch access info for deployed gateway or agent |
+| `agentcore feedback` | Send feedback (with optional screenshot) to the AgentCore team |
+| `agentcore update` | Check for and install CLI updates |
+| `agentcore telemetry` | View or change telemetry preferences |