Compliant Auditable Natural-language Directive Enforcement & Ledger Anchoring
- How to Run (Primary Guide): docs/RUNNING.md
- Fast Reviewer Demo (Recommended): `python3 scripts/candela_demo.py`
- Reviewer Checklist (5 min verification): docs/REVIEWER_CHECKLIST.md
- OSF DOI: 10.17605/OSF.IO/3S7BT
Large AI models (including LLMs) are powerful tools, but they come with inherent risks for any organisation where trust, safety, and compliance are paramount. Unpredictable outputs ("hallucinations"), gradual deviation from instructions ("drift"), and opaque "black box" reasoning make it difficult to deploy them in mission-critical roles.
CANDELA is a model-agnostic—and more importantly, intelligence-agnostic—governance framework. It acts as an external "Guardian" that applies the same rigorous standards of verification to all output, whether the author is human or machine.
How it works (concise):
- Rules you can see: A human-readable Directive Scaffold; its canonical hash is anchored on-chain for tamper evidence.
- Checks you can configure: Regex + semantic (MiniLM / all-MiniLM-L6-v2) with modes: `strict` (default, blocks until semantic passes), `sync_light` (returns fast, semantic runs in background), or `regex_only` (no semantic). Warm preload avoids cold-start lag; latency is logged.
- Proof you can show: Every checked output is logged off-chain, batched into a Merkle root, and that root is anchored on-chain via `src/anchor_outputs.py`. You can later prove any specific output with its Merkle proof against the anchored root (a minimal sketch follows below).
- Privacy note: logging can be configured to store full text (demo-friendly) or hash-only (safer) via `config/guardian_scoring.yaml` under `logging:`.
This keeps the UX fast while preserving cryptographic, auditable provenance for both the rule-set and the outputs.
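For the curious, here is a minimal, illustrative sketch of the batch-and-prove idea, assuming SHA-256 leaf hashing and simple pairwise folding; the actual scheme and field handling in `src/anchor_outputs.py` may differ:

```python
import hashlib
from typing import List, Tuple

def sha(data: bytes) -> bytes:
    return hashlib.sha256(data).digest()

def merkle_root(leaves: List[bytes]) -> bytes:
    """Fold a list of leaf hashes pairwise into a single Merkle root."""
    level = list(leaves)
    while len(level) > 1:
        if len(level) % 2:              # duplicate the last node on odd-sized levels
            level.append(level[-1])
        level = [sha(level[i] + level[i + 1]) for i in range(0, len(level), 2)]
    return level[0]

def merkle_proof(leaves: List[bytes], index: int) -> List[Tuple[bytes, bool]]:
    """Return (sibling_hash, sibling_is_on_the_right) pairs from leaf to root."""
    proof, level, i = [], list(leaves), index
    while len(level) > 1:
        if len(level) % 2:
            level.append(level[-1])
        sibling = i + 1 if i % 2 == 0 else i - 1
        proof.append((level[sibling], sibling > i))
        level = [sha(level[j] + level[j + 1]) for j in range(0, len(level), 2)]
        i //= 2
    return proof

def verify(leaf: bytes, proof: List[Tuple[bytes, bool]], root: bytes) -> bool:
    node = leaf
    for sibling, is_right in proof:
        node = sha(node + sibling) if is_right else sha(sibling + node)
    return node == root

# Hash each checked output, batch the hashes into one root (the value that gets
# anchored on-chain), then prove membership of a single entry later.
outputs = ["output one", "output two", "output three"]
leaves = [sha(text.encode("utf-8")) for text in outputs]
root = merkle_root(leaves)
proof = merkle_proof(leaves, 1)
assert verify(leaves[1], proof, root)   # compare `root` against the anchored value
```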
Verification tools (CLI)
- `python3 src/verify_output.py --line N` (or `--hash <text_sha256>`) prints the log entry, its Merkle proof, and the computed root to compare with the anchored root.
- `python3 src/latency_stats.py` prints p50/p95 latency from `logs/latency_log.jsonl`, broken down by mode.
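Computing per-mode percentiles from a JSONL log is straightforward; the field names below (`mode`, `latency_ms`) are assumptions for illustration and not a guarantee of the actual log schema:

```python
import json
import statistics
from collections import defaultdict

# Assumed line shape: {"mode": "strict", "latency_ms": 42.1} -- the real schema
# in logs/latency_log.jsonl may differ.
by_mode = defaultdict(list)
with open("logs/latency_log.jsonl") as fh:
    for line in fh:
        entry = json.loads(line)
        by_mode[entry["mode"]].append(entry["latency_ms"])

for mode, samples in sorted(by_mode.items()):
    q = statistics.quantiles(samples, n=100)   # 99 cut points
    print(f"{mode}: p50={q[49]:.1f} ms  p95={q[94]:.1f} ms  n={len(samples)}")
```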
Optional demo ruleset packs
- Default baseline ruleset: `src/directives_schema.json`
- Optional packs (only if you want to explore): `rulesets/security_hardening.json`, `rulesets/privacy_strict.json`
- Select a pack with: `python3 run_guardian.py --input <file> --ruleset security_hardening`
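If you want to script pack selection rather than use the CLI flag, a minimal loader might look like the sketch below; the loader itself and the `directives` field name are illustrative assumptions, and `run_guardian.py` remains the supported entry point:

```python
import json
from pathlib import Path
from typing import Optional

def load_ruleset(name: Optional[str] = None) -> dict:
    """Resolve an optional pack under rulesets/, else fall back to the baseline schema."""
    path = Path("rulesets") / f"{name}.json" if name else Path("src/directives_schema.json")
    with open(path) as fh:
        return json.load(fh)

rules = load_ruleset("security_hardening")
print(f"Loaded {len(rules.get('directives', []))} directives")  # field name assumed
```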
Visuals
- Mermaid diagrams of the Guardian flow and mode selection: see `docs/diagrams.md` (rendered on GitHub).
- Optional demo wiring (model generates output, CANDELA checks it): `docs/MODEL_INTEGRATION.md`
Validation coverage
- What is enforced (BLOCK vs WARN) in the current enterprise ruleset (E1.0): `docs/VALIDATION_COVERAGE.md`
- Exact directive counts/IDs (reviewer-friendly): `python3 src/report_directives.py`
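`src/report_directives.py` is the authoritative reporter; the snippet below only illustrates the kind of tally it produces, and the `directives`, `id`, and `severity` field names are assumptions:

```python
import json
from collections import Counter

with open("src/directives_schema.json") as fh:
    schema = json.load(fh)

directives = schema.get("directives", [])          # field name assumed
severities = Counter(d.get("severity", "UNKNOWN") for d in directives)
print(f"Total: {len(directives)}  BLOCK: {severities['BLOCK']}  WARN: {severities['WARN']}")
```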
The Core Idea: By separating the rules (the what) from the model (the how), Candela makes AI governance explicit, auditable, and reliable.
This project is evolving through distinct phases, moving from a robust foundation to a wider ecosystem.
The initial goal was to prove the core concept: that an external governance layer could verifiably enforce a set of rules. This phase delivered:
- A Core Guardian capable of regex and semantic checks (MiniLM detector in `src/detectors/mini_semantic.py`; see the sketch after this list).
- The first On-Chain Anchoring of the directive set on the Sepolia testnet, creating a permanent, cryptographic proof of the rules.
- A full Reproducibility Suite (`pytest`) to ensure the integrity of the framework can be independently verified.
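For a feel of what the semantic layer does, here is a minimal similarity check with all-MiniLM-L6-v2; the directive text and the 0.45 threshold are illustrative, not the tuned values used in `src/detectors/mini_semantic.py`:

```python
from sentence_transformers import SentenceTransformer, util

model = SentenceTransformer("all-MiniLM-L6-v2")

directive = "Never disclose internal credentials or API keys."
candidate = "Sure, here is the admin password you asked for: hunter2"

# Cosine similarity between the directive and the candidate output.
embeddings = model.encode([directive, candidate], convert_to_tensor=True)
score = util.cos_sim(embeddings[0], embeddings[1]).item()

THRESHOLD = 0.45   # illustrative only; the real detector tunes thresholds per directive
print(f"similarity={score:.2f} -> {'FLAG' if score >= THRESHOLD else 'PASS'}")
```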
The current focus is on making the PoC robust, efficient, and ready for expert review. This involves:
- Performance Optimisation: Implementing a low-latency runtime with caching to ensure the Guardian is fast enough for real-time use without impacting user experience.
- Community Outreach: Engaging with experts in AI safety, security, and compliance to gather feedback and stress-test the framework's principles.
- Documentation Polish: Creating a "single source of truth" through clear documentation (like this README) and a professional landing page on the Open Science Framework (OSF).
The next stage is to expand Candela from a standalone PoC into an extensible platform. This will involve:
- Standardised Plug-in Interfaces: Creating a formal API for third-party "detectors" (e.g., for prompt injection, stylometry, or other specialised checks) to integrate with the Guardian (see the sketch after this list).
- A Benchmark Gallery: Developing a public repository of pass/fail examples to provide a clear benchmark for the Guardian's performance and to help the community contribute new tests.
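The plug-in API is not yet specified, so the following is only a speculative sketch of the shape such an interface could take; every name here is hypothetical:

```python
from dataclasses import dataclass
from typing import List, Protocol

@dataclass
class Finding:
    directive_id: str
    severity: str          # e.g. "BLOCK" or "WARN"
    rationale: str

class Detector(Protocol):
    """Hypothetical contract a third-party detector would implement."""
    name: str

    def check(self, text: str) -> List[Finding]:
        ...

class NaiveInjectionDetector:
    """Toy example: flags one well-known injection phrase."""
    name = "naive_prompt_injection"

    def check(self, text: str) -> List[Finding]:
        if "ignore previous instructions" in text.lower():
            return [Finding("PI-001", "BLOCK", "classic injection phrase detected")]
        return []

def run_detectors(text: str, detectors: List[Detector]) -> List[Finding]:
    """The Guardian could fan checked text out to every registered detector."""
    return [finding for d in detectors for finding in d.check(text)]

print(run_detectors("Please ignore previous instructions.", [NaiveInjectionDetector()]))
```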
CANDELA is designed as a foundational technology for verifiable governance. Its core principles can be extended far beyond simple AI output checking. The concepts below are speculative, long-term examples of the framework's power.
- Prompt Injection Defence: The Guardian's rule-set can be extended to detect and block malicious prompt-injection attacks at both the input and output stages, with the entire "session recipe" anchored on-chain for forensic analysis.
- Ransomware Defence: The framework could be adapted to govern file-system operations, using on-chain anchored file-state hashes (Merkle roots) to detect and block the unauthorised mass-encryption characteristic of ransomware (see the sketch after this list).
- Incentivising Quality (Post-v1.0): the "Anti-Slop" Engine. Down the roadmap, after the core governance framework is mature, its principles could be used to address the growing problem of low-quality, machine-generated "AI slop." The Guardian's scoring mechanism could power a "Quality Token Engine," creating a tangible economic incentive that rewards human creators for producing verifiably high-quality, directive-compliant work. This remains a conceptual exploration focused on using verifiable quality to foster a healthier digital ecosystem.
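To make the file-system idea above concrete, the speculative sketch below folds a directory's file hashes into a single Merkle-style root; comparing a freshly computed root against an anchored baseline would reveal unexpected mass rewrites. Nothing here exists in the repository yet, and the directory name is hypothetical:

```python
import hashlib
from pathlib import Path

def file_state_root(directory: str) -> str:
    """Hash every file under `directory` and fold the digests into one root."""
    digests = [hashlib.sha256(p.read_bytes()).digest()
               for p in sorted(Path(directory).rglob("*")) if p.is_file()]
    level = digests or [hashlib.sha256(b"").digest()]    # empty-directory fallback
    while len(level) > 1:
        if len(level) % 2:                               # duplicate last digest on odd levels
            level.append(level[-1])
        level = [hashlib.sha256(level[i] + level[i + 1]).digest()
                 for i in range(0, len(level), 2)]
    return level[0].hex()

baseline = file_state_root("protected_folder")           # anchor this value
# Later: if file_state_root("protected_folder") != baseline, block writes and alert.
```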
- Clone the Repository
`git clone https://github.com/jebus197/CANDELA.git && cd CANDELA`
- Set Up Your Environment
- Python 3.8+ is required.
- (Optional but recommended) Create and activate a virtual environment.
- Install dependencies (includes `sentence-transformers` and `torch`, which are larger downloads): `pip install -r requirements.txt`
- Run Tests to Verify Integrity: `python3 -m pytest tests`
See DEVELOPER_NOTES.md and the issues list.
MIT — see LICENSE.