Minimal Python smoke-test scaffold for evaluating Vertex AI Agent Engine Runtime platform capabilities.
This repo is not a full benchmark. It has a dry-run friendly local path and an optional cloud path that deploys a tiny custom Python agent to Vertex AI Agent Engine, invokes it once, and collects evidence.
- A local custom agent can run.
- The agent exposes at least two tools:
  - retrieve_document(query: str) -> dict
  - submit_report(report: str, sensitive: bool = False, approval: bool = False) -> dict
- The agent routes prompts to the expected tool.
- submit_report blocks sensitive submissions unless approval=True.
- Structured JSONL audit logs, checkpoint events, and trace events are written.
- Metric variables are recorded for Reliability, Recoverability, Governance, Observability, and Portability.
- Evidence artifacts are generated under evidence/.
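The tool contract and the approval gate listed above can be sketched in a few lines. This is a minimal illustration, not the repo's implementation: the two signatures come from this README, while the return payloads and the keyword-based route helper are assumptions made for the example.

```python
def retrieve_document(query: str) -> dict:
    """Return a stub document for the query (no real retrieval happens)."""
    return {
        "tool": "retrieve_document",
        "query": query,
        "content": f"stub document for: {query}",
    }


def submit_report(report: str, sensitive: bool = False, approval: bool = False) -> dict:
    """Accept a report unless it is sensitive and has not been approved."""
    if sensitive and not approval:
        # Governance rule from the README: sensitive submissions need approval=True.
        return {
            "tool": "submit_report",
            "status": "blocked",
            "reason": "sensitive report requires approval=True",
        }
    return {"tool": "submit_report", "status": "submitted", "length": len(report)}


def route(prompt: str) -> str:
    """Pick a tool name with a naive keyword rule (hypothetical routing logic)."""
    return "submit_report" if "submit" in prompt.lower() else "retrieve_document"


print(route("retrieve document for cloud smoke evidence"))
print(submit_report("quarterly numbers", sensitive=True))
print(submit_report("quarterly numbers", sensitive=True, approval=True))
```

Returning a structured "blocked" result instead of raising keeps the policy decision visible to whatever records the audit trail.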
These parts run locally and do not call Google Cloud:
- Tool routing and tool execution.
- Governance policy check for sensitive report submission.
- Checkpoint simulation via results/checkpoints.jsonl.
- Audit simulation via results/audit_log.jsonl.
- Trace simulation via results/trace_events.jsonl.
- Smoke metrics in results/smoke_metrics.json and results/smoke_metrics.csv.
- Evidence copies in evidence/.
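The JSONL files above hold one JSON object per line, appended as events occur. A minimal sketch of such a writer follows. The helper names append_event and read_events and the field names ts and event are assumptions for illustration, and the demo writes to a temporary directory rather than the real results/ folder.

```python
import json
import tempfile
import time
from pathlib import Path


def append_event(path: Path, event_type: str, **fields) -> dict:
    """Append one JSON object as a single line to a JSONL file."""
    record = {"ts": time.time(), "event": event_type, **fields}
    path.parent.mkdir(parents=True, exist_ok=True)
    with path.open("a", encoding="utf-8") as fh:
        fh.write(json.dumps(record, sort_keys=True) + "\n")
    return record


def read_events(path: Path) -> list:
    """Read every non-empty line of a JSONL file back into dictionaries."""
    with path.open(encoding="utf-8") as fh:
        return [json.loads(line) for line in fh if line.strip()]


# Demo in a temporary directory so the sketch does not touch results/.
demo_dir = Path(tempfile.mkdtemp())
audit_path = demo_dir / "results" / "audit_log.jsonl"
append_event(audit_path, "tool_selected", tool="retrieve_document")
append_event(audit_path, "policy_block", tool="submit_report")
events = read_events(audit_path)
print(len(events), [e["event"] for e in events])
```

Appending line by line means a crash mid-run still leaves every earlier event readable.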
The local runner executes:
- T01: retrieve document and summarize.
- T02: call submit_report with non-sensitive content.
- T03: block submit_report with sensitive content.
- T04: simulate failure after checkpoint and resume.
- T05: simulate duplicate prevention using an idempotency key.
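T04 and T05 rely on two small ideas: skip work that a checkpoint already records, and refuse a submission whose idempotency key has been seen. The sketch below shows both under stated assumptions. The names run_steps, submit_once, and SimulatedCrash, the step labels, and the in-memory checkpoint are hypothetical, not taken from the runner.

```python
class SimulatedCrash(RuntimeError):
    """Raised to imitate a failure that happens after a checkpoint is saved."""


def run_steps(steps, checkpoint, fail_after=None):
    """Run steps in order, skipping any step already recorded in the checkpoint."""
    for index, step in enumerate(steps):
        if step in checkpoint["done"]:
            continue  # completed before the crash, so do not repeat it
        checkpoint["done"].append(step)  # record completion (the checkpoint save)
        if fail_after is not None and index == fail_after:
            raise SimulatedCrash(f"simulated failure after step {step!r}")
    return checkpoint


def submit_once(seen_keys: set, idempotency_key: str) -> str:
    """Accept a submission only the first time its idempotency key appears."""
    if idempotency_key in seen_keys:
        return "duplicate_skipped"
    seen_keys.add(idempotency_key)
    return "submitted"


steps = ["retrieve", "summarize", "submit"]
checkpoint = {"done": []}
try:
    run_steps(steps, checkpoint, fail_after=1)  # crash after the second step
except SimulatedCrash:
    pass
done_before_resume = list(checkpoint["done"])
run_steps(steps, checkpoint)  # resume: only the remaining step runs

seen = set()
first = submit_once(seen, "report-001")
second = submit_once(seen, "report-001")
print(done_before_resume, checkpoint["done"], first, second)
```

A real runner would persist the checkpoint and the seen keys to disk so that a new process can resume from them.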
Google Cloud is required only when you are ready to test actual Vertex AI Agent Runtime / Agent Engine deployment and managed platform behavior.
Cloud-side work may include:
- Creating or selecting a Google Cloud project.
- Enabling required APIs.
- Authenticating with gcloud.
- Packaging an ADK-compatible agent.
- Deploying to Agent Runtime / Agent Engine.
- Collecting managed Cloud Logging, Monitoring, and Trace evidence.
This scaffold does not hardcode credentials. Cloud scripts are explicitly invoked and can create billable resources only when you choose to run them.
Cost warning:
- Enabling APIs is usually not the expensive part.
- Creating a Cloud Storage staging bucket can incur storage and operation charges.
- Deploying an Agent Engine runtime can create billable managed resources.
- Clean up the deployed agent and staging bucket after testing.
- Go to https://cloud.google.com/free.
- Start the free trial and complete the account setup.
- Open the Google Cloud Console.
- Create a new project and note the project ID.
- Open Cloud Shell for a browser-based environment with gcloud preinstalled.
Review Google Cloud pricing and free trial terms before deploying resources.
From Cloud Shell or a workstation with gcloud installed:
export PROJECT_ID="your-project-id"
export REGION="us-central1" # optional; defaults to us-central1 in cloud scripts
./scripts/setup_gcloud.sh
The setup script enables:
- aiplatform.googleapis.com
- logging.googleapis.com
- monitoring.googleapis.com
- cloudtrace.googleapis.com
- artifactregistry.googleapis.com
- cloudbuild.googleapis.com
Use Python 3.11 or newer.
cd vertex-agent-smoke
python --version
./scripts/run_local_smoke.sh
You can also run it as a module:
PYTHONPATH=src python -m vertex_agent_smoke.smoke_runner
Generated outputs:
- results/audit_log.jsonl
- results/checkpoints.jsonl
- results/trace_events.jsonl
- results/smoke_metrics.json
- results/smoke_metrics.csv
- evidence/smoke_evidence.json
- copied evidence files under evidence/
Install the optional cloud dependency in a Python 3.11+ environment:
python -m pip install -U 'google-cloud-aiplatform[agent_engines]'
Authenticate without hardcoding credentials:
gcloud auth login
gcloud config set project "${PROJECT_ID}"
For local development outside Cloud Shell, you may also need Application Default Credentials:
gcloud auth application-default login
Set the required environment variables:
export PROJECT_ID="your-project-id"
export REGION="us-central1"
export AGENT_DISPLAY_NAME="vertex-agent-smoke"
export STAGING_BUCKET="gs://your-globally-unique-agent-staging-bucket"
Create the staging bucket explicitly:
./scripts/create_staging_bucket.sh
The script uses gcloud storage buckets create and skips creation if the bucket is already visible in the configured project.
The cloud deploy path uses the official Vertex AI SDK / Agent Engine SDK style shown in the Agent Engine docs:
- from google.cloud.aiplatform import vertexai
- client = vertexai.Client(project=..., location=...)
- client.agent_engines.create(...)
Run:
./scripts/deploy_agent_runtime.sh
The script verifies:
- PROJECT_ID
- REGION, defaulting to us-central1
- STAGING_BUCKET
- AGENT_DISPLAY_NAME
- active gcloud authentication
- current gcloud project
On success it writes:
- evidence/deployed_agent_resource.txt
- evidence/deployed_agent_deploy_metadata.json
If the installed SDK does not expose the documented Agent Engine API, the script fails with instructions to upgrade instead of trying an undocumented fallback.
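A fail-fast check of this kind can be sketched as follows. It is an assumption about how such a guard might look, not the script's actual code: check_agent_engine_sdk is a hypothetical helper, and it only confirms that the module imports and exposes a callable Client attribute. Confirming client.agent_engines itself would require constructing a client with real project settings, which this sketch deliberately avoids.

```python
import importlib

UPGRADE_HINT = "python -m pip install -U 'google-cloud-aiplatform[agent_engines]'"


def check_agent_engine_sdk(module_name: str = "vertexai") -> tuple:
    """Return (ok, message) describing whether the module exposes a Client entry point."""
    try:
        module = importlib.import_module(module_name)
    except ImportError:
        return False, f"{module_name} is not installed; run: {UPGRADE_HINT}"
    if not callable(getattr(module, "Client", None)):
        # No undocumented fallback: report the gap and tell the user how to upgrade.
        return False, f"{module_name} has no Client entry point; upgrade with: {UPGRADE_HINT}"
    return True, "client-based SDK entry point found"


# The demo uses stand-in module names so it runs without the cloud SDK installed.
print(check_agent_engine_sdk("json"))
print(check_agent_engine_sdk("module_that_does_not_exist_for_demo"))
```

Calling check_agent_engine_sdk() with no argument checks the vertexai module in the current environment.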
Invoke the deployed cloud agent once:
./scripts/invoke_deployed_agent.sh
Or pass a custom prompt:
./scripts/invoke_deployed_agent.sh "retrieve document for cloud smoke evidence"
The response is written to:
- evidence/deployed_agent_response.json
Collect basic Google Cloud evidence:
./scripts/collect_cloud_evidence.sh
Generated files:
- evidence/gcloud_enabled_services.txt
- evidence/gcloud_config.txt
- evidence/cloud_logging_recent.json
- evidence/cloud_trace_monitoring_todos.txt
Cloud Trace and Cloud Monitoring exports are left as TODOs because the stable CLI filters and metric type names can vary by deployed runtime and SDK version. The script records that gap explicitly instead of producing misleading evidence.
Delete the Agent Engine deployment after testing:
./scripts/delete_deployed_agent.sh
The script uses the documented client-based SDK deletion path:
client.agent_engines.delete(name=RESOURCE_NAME, force=True)
The resource name comes from:
cat evidence/deployed_agent_resource.txt
Then delete the staging bucket when you no longer need deployment artifacts:
gcloud storage rm -r "${STAGING_BUCKET}" --project="${PROJECT_ID}"
This permanently deletes bucket contents. Review the bucket before running the cleanup command.
The cloud path is based on these official Google Cloud docs:
- https://docs.cloud.google.com/agent-builder/agent-engine/quickstart
- https://docs.cloud.google.com/agent-builder/agent-engine/deploy
- https://docs.cloud.google.com/agent-builder/agent-engine/set-up
- https://docs.cloud.google.com/agent-builder/agent-engine/develop/adk
- https://docs.cloud.google.com/vertex-ai/generative-ai/docs/agent-engine/develop/custom
scripts/deploy_agent_runtime.sh now performs an optional SDK deployment. It is no longer a placeholder.
The smoke runner records these variables:
T, Ts, C, Cc, R, V, S, Sc, F, Fr, A, Ad, P, Pe, E, El, W, Wa, Q, Qc, D, Dref, O, Ov, M, Ms, Tm, Tref, N, Np, Rsc, Rcsc, Gsc, Osc, Psc, OS
Score formulas:
Rsc = 1/3 * (Ts/T + Cc/C + V/R)
Rcsc = 1/3 * (Sc/S + Fr/F + (1 - Ad/A))
Gsc = 1/3 * (Pe/P + El/E + Wa/W)
Osc = 1/3 * (Qc/Q + (1 - D/Dref) + Ov/O)
Psc = 1/3 * (Ms/M + (1 - Tm/Tref) + (1 - Np/N))
OS = 1/5 * (Rsc + Rcsc + Gsc + Osc + Psc)
Division by zero is handled safely by returning 0.0 for undefined ratios.
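The formulas above translate directly into code. The sketch below is an illustration rather than the runner's implementation: safe_ratio and compute_scores are hypothetical names, and the input values are made up so that the arithmetic is easy to follow. Note one consequence of returning 0.0 for an undefined ratio: a term such as (1 - Ad/A) evaluates to 1.0 when A is zero. Whether the runner treats that case the same way is an assumption here.

```python
def safe_ratio(numerator: float, denominator: float) -> float:
    """Return numerator / denominator, or 0.0 when the denominator is zero."""
    return numerator / denominator if denominator else 0.0


def compute_scores(m: dict) -> dict:
    """Compute the five category scores and the overall score from metric variables."""
    r = safe_ratio
    rsc = (r(m["Ts"], m["T"]) + r(m["Cc"], m["C"]) + r(m["V"], m["R"])) / 3
    rcsc = (r(m["Sc"], m["S"]) + r(m["Fr"], m["F"]) + (1 - r(m["Ad"], m["A"]))) / 3
    gsc = (r(m["Pe"], m["P"]) + r(m["El"], m["E"]) + r(m["Wa"], m["W"])) / 3
    osc = (r(m["Qc"], m["Q"]) + (1 - r(m["D"], m["Dref"])) + r(m["Ov"], m["O"])) / 3
    psc = (r(m["Ms"], m["M"]) + (1 - r(m["Tm"], m["Tref"])) + (1 - r(m["Np"], m["N"]))) / 3
    overall = (rsc + rcsc + gsc + osc + psc) / 5
    return {"Rsc": rsc, "Rcsc": rcsc, "Gsc": gsc, "Osc": osc, "Psc": psc, "OS": overall}


# Illustrative inputs only; these are not measured results.
metrics = {
    "T": 5, "Ts": 5, "C": 5, "Cc": 5, "R": 4, "V": 4,
    "S": 1, "Sc": 1, "F": 1, "Fr": 1, "A": 2, "Ad": 0,
    "P": 2, "Pe": 2, "E": 4, "El": 4, "W": 1, "Wa": 1,
    "Q": 4, "Qc": 4, "D": 0.5, "Dref": 1.0, "O": 2, "Ov": 2,
    "M": 2, "Ms": 2, "Tm": 0.5, "Tref": 1.0, "N": 4, "Np": 0,
}
scores = compute_scores(metrics)
print({name: round(value, 4) for name, value in scores.items()})
```

With these inputs, Rsc, Rcsc, and Gsc each reach 1.0, while Osc and Psc are reduced by the D/Dref and Tm/Tref terms, which act as penalties relative to a reference value.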
After a local run, verify:
- evidence/smoke_evidence.json shows all capability flags as true.
- evidence/audit_log.jsonl contains tool selection, policy block, and duplicate prevention events.
- evidence/checkpoints.jsonl contains checkpoint save and resume events.
- evidence/trace_events.jsonl contains trace start, end, and simulated error events.
- evidence/smoke_metrics.json contains all metric variables and computed scores.
- evidence/smoke_metrics.csv contains the same variables in tabular form.
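A first pass over the local checklist can be automated by confirming that each evidence file exists and is non-empty. The sketch below does only that. The helper missing_or_empty is hypothetical, it does not inspect file contents (the internal layout of each file is not described here), and the demo builds a throwaway directory instead of reading the real evidence/ folder.

```python
import tempfile
from pathlib import Path

# File names taken from the local verification checklist.
LOCAL_EVIDENCE_FILES = [
    "smoke_evidence.json",
    "audit_log.jsonl",
    "checkpoints.jsonl",
    "trace_events.jsonl",
    "smoke_metrics.json",
    "smoke_metrics.csv",
]


def missing_or_empty(evidence_dir: Path, names=LOCAL_EVIDENCE_FILES) -> list:
    """Return the names of expected files that are absent or have zero bytes."""
    problems = []
    for name in names:
        path = evidence_dir / name
        if not path.is_file() or path.stat().st_size == 0:
            problems.append(name)
    return problems


# Demo: four populated files, one empty file, one missing file.
demo_dir = Path(tempfile.mkdtemp())
for name in LOCAL_EVIDENCE_FILES[:4]:
    (demo_dir / name).write_text("{}\n", encoding="utf-8")
(demo_dir / "smoke_metrics.json").write_text("", encoding="utf-8")
problems = missing_or_empty(demo_dir)
print(problems)
```

After a real run, calling missing_or_empty(Path("evidence")) from the repo root should return an empty list.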
After a cloud run, verify:
- evidence/deployed_agent_resource.txt contains the Agent Engine resource name.
- evidence/deployed_agent_response.json contains one successful remote response.
- evidence/gcloud_enabled_services.txt contains enabled Google Cloud services.
- evidence/gcloud_config.txt captures the active gcloud configuration.
- evidence/cloud_logging_recent.json contains recent logs or a clear warning.