continuous-evaluation

Here are 3 public repositories matching this topic...

greynewell / evaldriven.org

Ship evals before you ship features.

Updated Feb 17, 2026
Nunjucks

CloudDefenseAI / secure-agents-md

Security working agreements for AI coding agents: hardened AGENTS.md, prompt/tool-injection guardrails, dependency hygiene, Scorecard-ready OSS setup

ai agents governance secure-coding ai-agents ai-security rag continuous-evaluation responsible-ai secure-supply-chain prompt-injection llm-security agentic-ai coding-agent agentic-ai-security agentsmd zero-trust-ai

Updated Feb 17, 2026

anaishowland / agent-CE

Agent-CE is a containerized continuous evaluation (CE) platform for web browsing agents. It provides production-ready Docker images and CI/CD pipelines for running and evaluating multiple agent frameworks, including Browser Use, Notte, Anthropic Computer Use, and OpenAI Computer Use.

computer-vision evaluation ci-cd evaluation-metrics cua evaluation-framework continuous-evaluation web-agent browser-agent computer-use browser-use

Updated Oct 29, 2025
Python
