continuous-evaluation

Here are 5 public repositories matching this topic...

greynewell / evaldriven.org

Ship evals before you ship features.

Updated Feb 25, 2026
Nunjucks

CloudDefenseAI / secure-agents-md

Security working agreements for AI coding agents: hardened AGENTS.md, prompt/tool-injection guardrails, dependency hygiene, Scorecard-ready OSS setup

ai agents governance secure-coding ai-agents ai-security rag continuous-evaluation responsible-ai secure-supply-chain prompt-injection llm-security agentic-ai coding-agent agentic-ai-security agentsmd zero-trust-ai

Updated Mar 2, 2026

anaishowland / agent-CE

Agent-CE is a containerized continuous evaluation (CE) platform for web browsing agents. It provides production-ready Docker images and CI/CD pipelines for running and evaluating multiple agent frameworks, including Browser Use, Notte, Anthropic Computer Use, and OpenAI Computer Use.

computer-vision evaluation ci-cd evaluation-metrics cua evaluation-framework continuous-evaluation web-agent browser-agent computer-use browser-use

Updated Oct 29, 2025
Python

runemdown / ai-agent-security-hardening

Protect macOS AI agents from identity theft with shell scripts that secure configs, keys, tokens, and memory against autonomous proxy attacks.

bash docker devops research multi-agent hardening agents homelab claude ai-security rag continuous-evaluation ops-admin llm-security notebooklm coding-agents model-context-protocol agentsmd zero-trust-ai

Updated Mar 4, 2026
Shell

adrianlol7 / evaldriven.org

Define, measure, and enforce code correctness with Eval-Driven Development, ensuring every probabilistic system ships with automated proof of quality.

testing devops benchmarking machine-learning automation best-practices evaluation manifesto software-engineering methodology quality-assurance ai-safety continuous-evaluation ai-engineering ai-evaluation ai-testing ai-quality llm-evaluation eval-driven-development

Updated Mar 4, 2026
Nunjucks
