Ship evals before you ship features.
Updated Feb 17, 2026 - Nunjucks
Security working agreements for AI coding agents: a hardened AGENTS.md, prompt/tool-injection guardrails, dependency hygiene, and a Scorecard-ready OSS setup.
Agent-CE is a containerized continuous evaluation (CE) platform for web browsing agents. It provides production-ready Docker images and CI/CD pipelines for running and evaluating multiple agent frameworks, including Browser Use, Notte, Anthropic Computer Use, and OpenAI Computer Use.