edycutjong/docdrift

DocDrift 🔍

AI-powered Code↔Docs Drift Detector and Security-first Auditing Pipeline

Live Demo | Pitch Video | Pitch Deck | Built for Anna AI-Native Hackathon

Python 3.11 | Node.js 22 | AES-GCM-256 | Anna Storage | R2 Upload | CI/CD Pipeline


📸 See it in Action

Interactive Audit Walkthrough

  1. Workspace Config & Setup
  2. Analysis Dashboard
  3. Side-by-Side Drift Viewer
  4. AI Auditor Chat
  5. Accepted Fix Status
  6. Exported R2 Signed Bundle

The Audit Lifecycle: 1. Scans repository exports -> 2. Checks document mentions -> 3. Classifies drift & suggests corrections -> 4. Persists scan history to Anna KV -> 5. Exports signed .patch bundles to Cloudflare R2.


💡 The Problem & Solution

Documentation rots silently. As APIs evolve, README guides and comment blocks drift, leading to onboarding failures and broken integrations.

DocDrift solves this by walking local codebases inside a secure sandbox to parse symbols (functions, classes, endpoints), hashing signatures via SHA-256, and cross-referencing them against Markdown files. Sensitive code snippets are encrypted under AES-GCM-256 prior to LLM drift classification.
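
The hashing and cross-referencing step can be illustrated with a minimal sketch. It assumes Python sources and top-level definitions only; the helper names `signature_hashes` and `documented_symbols` are hypothetical and are not taken from plugin.py, which may parse and hash symbols differently.

```python
import ast
import hashlib
import re


def signature_hashes(source: str) -> dict[str, str]:
    """Map each top-level function/class name to a SHA-256 hash of its signature."""
    hashes = {}
    for node in ast.parse(source).body:
        if isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef)):
            # ast.unparse renders the argument list back to source text.
            signature = f"def {node.name}({ast.unparse(node.args)})"
        elif isinstance(node, ast.ClassDef):
            bases = ", ".join(ast.unparse(b) for b in node.bases)
            signature = f"class {node.name}({bases})"
        else:
            continue
        hashes[node.name] = hashlib.sha256(signature.encode("utf-8")).hexdigest()
    return hashes


def documented_symbols(markdown: str, names) -> set[str]:
    """Return the names that appear as whole words in the Markdown text."""
    return {n for n in names if re.search(rf"\b{re.escape(n)}\b", markdown)}


# Two revisions of the same module: scan() gained a parameter.
old = signature_hashes("def scan(path): ...\ndef export(report): ...")
new = signature_hashes("def scan(path, depth=3): ...\ndef export(report): ...")
readme = "Call `scan(path)` to walk a repository."

# A symbol has drifted if its hash changed and the docs still mention it.
changed = {name for name in old if name in new and old[name] != new[name]}
drifted = documented_symbols(readme, changed)
print(sorted(drifted))  # ['scan']
```

Hashing only the signature, rather than the whole body, keeps the check insensitive to implementation changes that do not affect what the documentation describes.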

Key Features:

  • ⚡ Local Walkers: A lightweight Python Executa processes directory scans in <10 ms.
  • 🔒 IP Protection: Ephemeral local AES keys encrypt snippets in transit and in KV storage.
  • 🤖 Auditor Agent: Interactive agent.session.* chatbot that explains and reviews signature drift.
  • 📦 R2 Export: Generates unified .patch bundles and uploads them to R2 via host/uploadFile reverse-RPC.
  • 💾 Persistent History: Scan history is persisted to Anna Persistent Storage (APS KV) via storage/set, so no external database is needed.
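
As an illustration of the R2 Export feature, here is a minimal sketch of producing a unified patch for a documentation fix with the standard-library `difflib` module. The function name `make_patch` and the sample text are assumptions made for this example; the bundle format DocDrift actually uploads is not shown in this README.

```python
import difflib


def make_patch(path: str, current: str, corrected: str) -> str:
    """Return a unified diff that turns `current` into `corrected`."""
    diff = difflib.unified_diff(
        current.splitlines(keepends=True),
        corrected.splitlines(keepends=True),
        fromfile=f"a/{path}",
        tofile=f"b/{path}",
    )
    return "".join(diff)


patch = make_patch(
    "README.md",
    "Call scan(path) to start.\n",
    "Call scan(path, depth=3) to start.\n",
)
print(patch)
```

The `a/` and `b/` prefixes follow the convention used by `git diff`, so a patch shaped like this can typically be applied with `git apply` or `patch -p1`.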

🏗️ Architecture & Tech Stack

graph TD
    UI[HTML/CSS/JS Iframe] -->|window.open_view| DV[Drift Viewer View]
    UI -->|tools.invoke| EX[Python Executa Process]
    EX -->|SHA-256| HASH[Symbol Hash Verification]
    EX -->|AES-GCM-256| CRYP[Local Snippet Encryptor]
    CRYP -->|sampling/createMessage| LLM[Host LLM Reverse-RPC]
    EX -->|host/uploadFile| R2[Anna R2 Object Storage]
    EX -->|storage/set + storage/get| APS[Anna Persistent Storage KV]
    UI -->|storage.set/get| APS

🔌 Anna Platform Integration

DocDrift exercises the full Anna SDK capability surface:

Reverse-RPC Methods (Plugin → Host)

| Method | Purpose | Implementation |
| --- | --- | --- |
| sampling/createMessage | LLM inference for drift classification | _sample() in plugin.py |
| storage/get | Read persistent scan history from APS KV | _storage_get() in plugin.py |
| storage/set | Write scan history entries to APS KV | _storage_set() in plugin.py |
| storage/delete | Remove scan entries from APS KV | _storage_delete() in plugin.py |
| storage/list | List all past scan keys in APS KV | _storage_list() in plugin.py |
| host/uploadFile (inline) | Upload generated .diff patches to R2 | _host_upload_inline() in plugin.py |
| host/uploadFile (negotiate+confirm) | Stream large reports to R2 | _host_upload_negotiate() and _host_upload_confirm() |
| embeddings/create | Compute dense vectors for code and docs | _embed() in plugin.py |
| image/generate | Generate visual architecture illustrations | _image_generate() in plugin.py |
| files/upload_begin + complete | Durable artifact uploads (2-phase) | _files_upload() in plugin.py |
| files/download_url | Mint presigned links for archived reports | _files_download_url() in plugin.py |
| files/list | List archived report files | _files_list() in plugin.py |
| files/delete | Purge archived files | _files_delete() in plugin.py |
| agent/complete | Stateless L1 completion | _agent_complete() in plugin.py |
| agent/session.create + run + history + cancel + delete | Stateful L2 multi-turn agent sessions | _agent_session_create(), _agent_session_run(), etc. |
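
To make the reverse-RPC idea concrete, the sketch below frames a `storage/set` call as a JSON-RPC 2.0 request and parses a response. Only the method name comes from the table above. The newline-delimited framing, the `key`/`value` parameter names, and the helpers `rpc_request` and `rpc_result` are assumptions for illustration, not the documented Anna wire format.

```python
import itertools
import json

_ids = itertools.count(1)  # monotonically increasing request ids


def rpc_request(method: str, params: dict) -> str:
    """Serialize one JSON-RPC 2.0 request as a single newline-terminated line."""
    message = {"jsonrpc": "2.0", "id": next(_ids), "method": method, "params": params}
    return json.dumps(message) + "\n"


def rpc_result(line: str, expected_id: int):
    """Parse a response line; raise on a JSON-RPC error, else return the result."""
    response = json.loads(line)
    if response.get("id") != expected_id:
        raise ValueError(f"unexpected response id: {response.get('id')}")
    if "error" in response:
        raise RuntimeError(response["error"].get("message", "RPC error"))
    return response.get("result")


# Hypothetical params: the real storage/set payload may use different field names.
request = rpc_request("storage/set", {"key": "scan:latest", "value": {"drifted": 1}})
print(request, end="")
```

In a stdio plugin, a line like this would be written to standard output and the matching response read back from standard input, with the `id` field pairing the two.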

Host Capabilities Declared

| Capability | Usage |
| --- | --- |
| llm.sample | Host-brokered LLM for drift classification & stateless completion |
| llm.embed | Vector embedding compute for semantic search |
| llm.image | DALL-E visual diagram generation |
| llm.agent.auto | Stateful multi-turn L2 agent sessions |
| aps.kv | Persistent scan history (last 50 scans) |
| host.upload | R2 artifact upload for generated patches |
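
The "last 50 scans" bound on aps.kv can be sketched as a simple append-and-trim over a key-value store. Here a plain dict stands in for APS KV, and the key name `scan_history` and the helper `append_scan` are hypothetical; the real plugin would go through storage/get and storage/set instead.

```python
MAX_SCANS = 50  # from the capability table: "last 50 scans"


def append_scan(store: dict, scan: dict, key: str = "scan_history") -> list:
    """Append a scan record and keep only the newest MAX_SCANS entries."""
    history = list(store.get(key, []))
    history.append(scan)
    history = history[-MAX_SCANS:]  # drop the oldest entries beyond the cap
    store[key] = history
    return history


kv = {}  # in-memory stand-in for APS KV
for i in range(60):
    append_scan(kv, {"scan_id": i})

print(len(kv["scan_history"]))  # 50
```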

Manifest Features (Schema 2)

| Feature | Status |
| --- | --- |
| schema: 2 | ✅ |
| host_capabilities | ✅ llm.sample, llm.embed, llm.image, llm.agent.auto, host.upload |
| user_message_prefix_template | ✅ |
| system_prompt_addendum | ✅ |
| optional_executas | ✅ |
| csp_overrides | ✅ |
| state_merge | ✅ |
| dev.fixtures | ✅ |
| dev.seed_storage | ✅ |
| host_api.upload (negotiate + confirm) | ✅ |
| host_api.chat (write_message + append_artifact) | ✅ |
| host_api.storage (get/set/delete/list) | ✅ |
| host_api.window (set_title/open_view/close) | ✅ |
| host_api.llm (complete/embed) | ✅ |
| host_api.image (generate) | ✅ |
| host_api.agent (session) | ✅ |
| Multiple views with min_size/max_size | ✅ 2 views |
| Developer Console | ✅ Interactive SDK playground & live log console |
| tags | ✅ |

Cryptographic Security

| Layer | Algorithm |
| --- | --- |
| Snippet encryption | AES-GCM-256 (ephemeral session keys) |
| Symbol hashing | SHA-256 |
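
A minimal sketch of snippet encryption with an ephemeral AES-GCM-256 key follows. It assumes the third-party `cryptography` package, which this README does not confirm as DocDrift's dependency. The helper names and the nonce-prefixed blob layout are also choices made for this example only.

```python
import os

from cryptography.hazmat.primitives.ciphers.aead import AESGCM  # third-party package


def encrypt_snippet(key: bytes, snippet: str, context: bytes = b"") -> bytes:
    """Encrypt a snippet; the random 12-byte nonce is prepended to the ciphertext."""
    nonce = os.urandom(12)  # must never repeat under the same key
    return nonce + AESGCM(key).encrypt(nonce, snippet.encode("utf-8"), context)


def decrypt_snippet(key: bytes, blob: bytes, context: bytes = b"") -> str:
    """Split off the nonce, then decrypt and authenticate the remainder."""
    nonce, ciphertext = blob[:12], blob[12:]
    return AESGCM(key).decrypt(nonce, ciphertext, context).decode("utf-8")


# Ephemeral session key: generated per run and held in memory only.
session_key = AESGCM.generate_key(bit_length=256)
blob = encrypt_snippet(session_key, "def scan(path): ...", context=b"plugin.py")
print(decrypt_snippet(session_key, blob, context=b"plugin.py"))
```

The `context` bytes are passed as associated data: they are authenticated but not encrypted, so decrypting with a different context (or a tampered blob) raises `cryptography.exceptions.InvalidTag` instead of returning altered text.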

🏆 Sponsor Tracks Targeted

  • Winner Takes All ($300): Deep, real Anna integration. Host LLM sampling/createMessage, APS KV storage (get/set/list/delete), durable APS Files, R2 uploads, embeddings/create semantic search, and image/generate diagrams are all driven through real Executa tools, a multi-view UI (main + drift_viewer), and chat.append_artifact cards, with local AES-GCM-256 cryptography. A sandboxed Developer Console lets you exercise the Host-API surface live (calls return labeled mock responses when run outside the Anna host).

📁 Project Structure

dorahacks-anna-docdrift/
├── app.json                    # App listing metadata
├── manifest.json               # Anna App manifest (schema: 2)
├── LICENSE                     # MIT License
├── DECISIONS.md                # Architectural decisions log
├── SPONSOR_DEFENSE.md          # SDK integration citations
├── package.json                # Project script definitions
├── bundle/
│   ├── index.html              # Frontend SPA structure
│   ├── styles.css              # Modern dark theme styles
│   ├── tokens.css              # Design tokens
│   ├── app.js                  # State engine, SDK bridge & fallback mocks
│   ├── anna-tool-ids.js        # Auto-generated tool bindings
│   ├── apple-touch-icon.png    # Mobile browser bookmark icon
│   └── icon.svg                # Embedded app icon
├── executas/
│   └── docdrift/
│       ├── pyproject.toml      # Executa package configuration
│       ├── executa.json        # Executa config (host_capabilities, distribution)
│       └── plugin.py           # Stdio JSON-RPC handler + APS KV + R2 upload
├── fixtures/
│   └── drift_seed.jsonl        # Dev fixture data for offline testing
├── data/
│   └── fixtures/               # Additional seed data
├── docs/
│   ├── AUDIT_REPORT.md         # Threat model and invariants
│   ├── friction-log.md         # Integration friction log
│   ├── icon.svg                # Document icon
│   ├── readme-hero.svg         # Tactical vector header SVG
│   ├── assets/                 # HTML templates and asset generators
│   └── screenshots/            # Step-by-step UX walkthrough screenshots
├── public/
│   ├── icon.svg                # Standalone app icon SVG
│   ├── og-image.png            # Open Graph banner PNG
│   └── pitch.html              # Standalone marketing pitch deck HTML
├── scripts/
│   ├── bench.py                # Latency and recall benchmarks
│   ├── verify_offline.py       # Air-gapped container test
│   └── record-docdrift.mjs     # Puppeteer demo recording
└── tests/
    └── test_plugin.py          # Complete unit tests (100% offline coverage)

🚀 Getting Started

Prerequisites

  • Python ≥ 3.10
  • Node.js ≥ 20
  • uv (Python packaging tool)

Installation & Run

  1. Clone the repository:
    git clone https://github.com/edycutjong/docdrift.git
  2. Change into the project directory:
    cd docdrift
  3. Install npm dependencies (this installs the required @anna-ai/cli devDependency locally):
    npm install
  4. Run the development harness:
    npm run dev
    # or
    npx anna-app dev

🧪 Testing & CI

DocDrift includes a full verification harness with unit tests, offline air-gap audits, and benchmarks:

# ── Run Unit Tests (105+ assertions) ────────
PYTHONPATH=. python3 tests/test_plugin.py

# ── Run Air-Gapped Offline Verification ──────
PYTHONPATH=. python3 scripts/verify_offline.py

# ── Run Performance Benchmarks ──────────────
PYTHONPATH=. python3 scripts/bench.py

| Layer | Tool | Status |
| --- | --- | --- |
| Code Quality | Pytest + Local Assertions | ✅ |
| Unit Testing | 100+ parameterized assertions | ✅ |
| Air-Gap Scan | Mock socket offline check | ✅ |
| Latency Audit | bench.py latency analysis | ✅ |
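
The "mock socket offline check" idea can be sketched with the standard library alone: patch socket creation so that any network attempt raises, then run the code under test. This is an illustration under assumptions, not the contents of scripts/verify_offline.py; the names `NetworkAttempted`, `no_network`, and `runs_offline` are hypothetical.

```python
import socket
from contextlib import contextmanager
from unittest import mock


class NetworkAttempted(Exception):
    """Raised when code under test tries to open a socket."""


def _blocked(*args, **kwargs):
    raise NetworkAttempted("network access attempted during offline check")


@contextmanager
def no_network():
    """Inside the with-block, creating a socket raises NetworkAttempted."""
    with mock.patch.object(socket, "socket", side_effect=_blocked), \
            mock.patch.object(socket, "create_connection", side_effect=_blocked):
        yield


def runs_offline(func) -> bool:
    """Return True if func() completes without trying to open a socket."""
    try:
        with no_network():
            func()
    except NetworkAttempted:
        return False
    return True


print(runs_offline(lambda: sum(range(10))))  # True
```

A check like this only catches sockets opened through Python's `socket` module in the same process; subprocesses or native extensions would need container-level isolation.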

📄 License

Licensed under MIT. Copyright © 2026 Edy Cu.
