ContextZero

Code cognition engine for AI agents. One MCP call replaces up to 24 tool calls. 5.3x fewer tokens across the five benchmark tasks below. Data that doesn't exist in files.

Works with Claude Code, Cursor, Windsurf, and any MCP-compatible AI tool.


Why ContextZero Exists

AI coding agents read files one at a time. To understand a single function change, an agent may open 16 files, read 10,000+ lines, and consume 40,000+ tokens (Task 5 below), and still miss transitive side effects, contract violations, and duplicate code patterns.

ContextZero eliminates this. It indexes your codebase once, then serves precise, structured context in a single call.


Measured Results: 5 Real Tasks, Head-to-Head

Same codebase (91 files, 4,374 symbols). Same 5 tasks. Traditional file reading vs ContextZero. Real numbers.

Task 1: "Understand this function and everything it depends on"

| | File Reading | ContextZero |
|---|---|---|
| Files opened | 6 | 0 |
| Lines consumed | 2,731 | 0 |
| Tokens consumed | 10,924 | 7,999 |
| Tool calls | 9 | 1 |
| What you get | Raw source code, figure it out yourself | Source + 13 dependencies pre-assembled + effect signature + inclusion rationale for every decision |

Task 2: "What breaks if I change this function?"

| | File Reading | ContextZero |
|---|---|---|
| Files opened | 4 | 0 |
| Lines consumed | 4,823 | 0 |
| Tokens consumed | 19,292 | ~600 |
| Tool calls | 7 | 1 |
| What you get | List of files that mention the function name | 8 contract impacts with severity scores, invariant violation predictions, confidence levels, and recommended validation scope |

32x fewer tokens. And file reading cannot tell you which contract invariants will break — it only finds text matches.

Task 3: "Does this function write to the database?"

| | File Reading | ContextZero |
|---|---|---|
| Files opened | 9 | 0 |
| Lines consumed | 6,195 | 0 |
| Tokens consumed | 24,780 | ~400 |
| Tool calls | 14 | 1 |
| What you get | Manually trace 9 files, hope you catch every call | 13 typed effects (DB reads, DB writes, file I/O, HTTP calls, lock acquisition, event emission) with transitive call chain tracing and 0.95 confidence |

62x fewer tokens. Anyone tracing files by hand can easily miss a transitive DB write buried 3 call levels deep. ContextZero traces it automatically.

Task 4: "Find all code similar to this function"

| | File Reading | ContextZero |
|---|---|---|
| Files opened | 8 | 0 |
| Lines consumed | 4,968 | 0 |
| Tokens consumed | 19,872 | ~2,000 |
| Tool calls | 11 | 2 |
| What you get | Grep results for similar names (noisy, misses different names) | 31 homologs with 7-dimensional similarity scoring (semantic, logic, signature, behavioral, contract, test, and history), plus contradiction flags |

10x fewer tokens. Grep cannot find behaviorally similar code that has different names. ContextZero finds it by analyzing what the code does, not what it's called.

Task 5: "Give me everything I need to safely modify this function"

| | File Reading | ContextZero |
|---|---|---|
| Files opened | 16 | 0 |
| Lines consumed | 10,246 | 0 |
| Tokens consumed | 40,984 | 11,057 |
| Tool calls | 24 | 1 |
| What you get | 16 raw files dumped, 95% irrelevant to your change | Token-budgeted capsule: source + callers + 31 blast radius impacts + severity scores + contract invariants, all in one call |

3.7x fewer tokens. 24x fewer calls. One call replaces an entire investigation.
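
A token-budgeted capsule can be pictured as a packing problem: rank the candidate context pieces, then include them until the budget is spent. The sketch below is a minimal greedy version under stated assumptions. `CapsuleItem`, `packCapsule`, the priority values, and the skip-if-too-large rule are hypothetical and are not taken from ContextZero's source.

```typescript
// Hypothetical sketch: greedy packing of context items into a token budget.
// None of these names come from ContextZero; they only illustrate the idea.
interface CapsuleItem {
  label: string;    // e.g. "source", "callers"
  tokens: number;   // estimated token cost of including this item
  priority: number; // higher = more relevant to the requested change
  reason: string;   // inclusion rationale reported back to the agent
}

interface Capsule {
  included: CapsuleItem[];
  skipped: CapsuleItem[];
  tokensUsed: number;
}

function packCapsule(items: CapsuleItem[], budget: number): Capsule {
  // Highest priority first; cheaper items win ties.
  const ranked = [...items].sort(
    (a, b) => b.priority - a.priority || a.tokens - b.tokens,
  );
  const capsule: Capsule = { included: [], skipped: [], tokensUsed: 0 };
  for (const item of ranked) {
    if (capsule.tokensUsed + item.tokens <= budget) {
      capsule.included.push(item);
      capsule.tokensUsed += item.tokens;
    } else {
      capsule.skipped.push(item); // does not fit; keep trying smaller items
    }
  }
  return capsule;
}

const demoCapsule = packCapsule(
  [
    { label: "source", tokens: 400, priority: 10, reason: "target symbol" },
    { label: "callers", tokens: 300, priority: 8, reason: "direct callers" },
    { label: "full module", tokens: 900, priority: 3, reason: "same file" },
    { label: "invariants", tokens: 200, priority: 7, reason: "contracts" },
  ],
  1000,
);
console.log(demoCapsule.included.map((i) => i.label), demoCapsule.tokensUsed);
```

With a 1,000-token budget the demo keeps `source`, `callers`, and `invariants` (900 tokens) and skips `full module`, which mirrors the contrast in the table above between a targeted capsule and 16 raw files.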

Total Across All 5 Tasks

| | File Reading | ContextZero | Difference |
|---|---|---|---|
| Files opened | 43 | 0 | -43 files |
| Lines consumed | 28,963 | 0 | -29K lines |
| Tokens consumed | 115,852 | 22,056 | 5.3x fewer |
| Tool calls | 65 | 6 | 10.8x fewer |

Data That Doesn't Exist Without ContextZero

File reading gives you source code. That's it. ContextZero computes and serves:

| What you need to know | File reading | ContextZero |
|---|---|---|
| "What breaks if I change this?" | Grep for the name. Hope you find everything. | 31 impacts with severity, confidence, and invariant violation predictions |
| "Does this touch the database?" | Read the function. Read everything it calls. Read everything those callees call. Repeat for 9 files. | 13 typed effects with transitive tracing. 1 call. |
| "Is this function safe to call?" | Read it and all its callees. Manually decide. | Every function classified automatically as pure, read_only, read_write, or side_effecting |
| "Find similar code" | Grep for similar names. Miss everything with different names. | 31 homologs with 7-dimensional scoring and contradiction flags |
| "What contracts must be preserved?" | You guess. | 29 derived invariants with confidence scores |
| "Which functions form a logical group?" | You don't know. | 10 concept families with exemplars and contradiction detection |
| "How risky is this change?" | git log and manual analysis. | Temporal risk: change frequency, bug-fix correlation, churn, co-change partners |
| "How has this symbol evolved?" | Diff old commits. | Symbol lineage: identity tracking through renames and refactors |

None of this exists in files. It is computed by static analysis, behavioral inference, and git history mining.


At Scale

For a team making 100 AI-assisted code changes per day:

| | Without ContextZero | With ContextZero |
|---|---|---|
| Tokens consumed per day | ~2.3M | ~440K |
| Tokens saved per day | | ~1.9M |
| Tool calls per day | ~1,300 | ~120 |
| Blind-spot errors | Unknown transitive side effects, missed contracts, duplicate code | Every dimension analyzed |
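
The daily figures follow from the five-task benchmark totals, assuming each of the 100 daily changes costs the benchmark's per-task average. A short sketch of that arithmetic (variable names are illustrative):

```typescript
// Totals from the five benchmark tasks.
const fileReadingTotals = { tokens: 115852, calls: 65 };
const contextZeroTotals = { tokens: 22056, calls: 6 };
const benchmarkTasks = 5;
const changesPerDay = 100; // assumption stated in this section

// Scale the per-task average up to one day of changes.
const perDay = (total: number): number => (total * changesPerDay) / benchmarkTasks;

const tokensWithout = perDay(fileReadingTotals.tokens); // 2,317,040 (~2.3M)
const tokensWith = perDay(contextZeroTotals.tokens);    //   441,120 (~440K)
const tokensSaved = tokensWithout - tokensWith;         // 1,875,920 (~1.9M)
const callsWithout = perDay(fileReadingTotals.calls);   // 1,300
const callsWith = perDay(contextZeroTotals.calls);      //   120

// Ratios reported as "5.3x fewer tokens" and "10.8x fewer calls".
const tokenRatio = fileReadingTotals.tokens / contextZeroTotals.tokens;
const callRatio = fileReadingTotals.calls / contextZeroTotals.calls;

console.log({ tokensWithout, tokensWith, tokensSaved, callsWithout, callsWith });
console.log(tokenRatio.toFixed(1), callRatio.toFixed(1)); // "5.3" "10.8"
```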

What ContextZero Computes

| Capability | Description |
|---|---|
| Capsule Compilation | Token-budgeted context packages: source + deps + contracts + effects in one call. 99.96% budget utilization. |
| Blast Radius | Structural, behavioral, and contract impact analysis with severity and confidence scoring. |
| Behavioral Profiling | Every function classified: pure (95.2%), read-only (3.2%), read-write (1.3%), side-effecting (0.3%). |
| Effect Signatures | 29 typed effects (reads, writes, locks, HTTP, events) with transitive propagation via Kahn's topological sort. |
| Contract Extraction | Input/output types, error contracts, guard clauses, 29 derived invariants per complex function. |
| Homolog Detection | 7-dimensional evidence scoring with 4 contradiction flag types. Batch candidate loading. |
| Smart Context | One call: source + blast radius + callers + tests + contracts. Replaces 8+ separate lookups. |
| Dispatch Resolution | Class hierarchy, virtual call resolution, C3 linearization, field-sensitive points-to analysis. |
| Concept Families | Automatic grouping with exemplar identification and contradiction detection across family members. |
| Temporal Intelligence | Git-derived co-change analysis, temporal risk scoring, churn metrics. |
| Symbol Lineage | Cross-snapshot identity tracking through renames and refactors. |
| Transactional Editing | 9-state lifecycle with DB-backed rollback. Undo any AI edit in 1 second. |
| Semantic Search | Find code by what it does: TF-IDF + MinHash LSH similarity. No external APIs. |
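
Transitive effect propagation with Kahn's topological sort can be sketched as follows: process callees before callers, so each function's effect set absorbs the effects of everything it calls. This is a minimal illustration that assumes an acyclic call graph. `propagateEffects`, the effect labels, and the sample graph are hypothetical, and a real engine would also have to handle recursion (cycles), which this sketch simply rejects.

```typescript
type Effect = string; // e.g. "db_write", "http_call" (illustrative labels)

// calls:  function -> functions it calls directly
// direct: function -> effects found in its own body
function propagateEffects(
  calls: Map<string, string[]>,
  direct: Map<string, Effect[]>,
): Map<string, Set<Effect>> {
  const nodes = new Set<string>();
  for (const [fn, callees] of calls) {
    nodes.add(fn);
    callees.forEach((c) => nodes.add(c));
  }
  for (const fn of direct.keys()) nodes.add(fn);

  // Reverse edges (callee -> callers) and count unprocessed callees per function.
  const callers = new Map<string, string[]>();
  const pending = new Map<string, number>();
  const effects = new Map<string, Set<Effect>>();
  for (const n of nodes) {
    callers.set(n, []);
    pending.set(n, 0);
    effects.set(n, new Set(direct.get(n) ?? []));
  }
  for (const [fn, callees] of calls) {
    for (const callee of new Set(callees)) {
      callers.get(callee)!.push(fn);
      pending.set(fn, pending.get(fn)! + 1);
    }
  }

  // Kahn's algorithm: start from leaves (functions that call nothing).
  const queue = [...nodes].filter((n) => pending.get(n) === 0);
  let processed = 0;
  while (queue.length > 0) {
    const callee = queue.shift()!;
    processed++;
    for (const caller of callers.get(callee)!) {
      for (const e of effects.get(callee)!) effects.get(caller)!.add(e);
      pending.set(caller, pending.get(caller)! - 1);
      if (pending.get(caller) === 0) queue.push(caller);
    }
  }
  if (processed !== nodes.size) throw new Error("call graph contains a cycle");
  return effects;
}

// A DB write three call levels below `handler` surfaces on `handler` itself.
const propagated = propagateEffects(
  new Map([
    ["handler", ["validate", "save"]],
    ["save", ["buildRow", "persist"]],
    ["persist", ["execSql"]],
  ]),
  new Map([
    ["validate", ["throws"]],
    ["execSql", ["db_write"]],
  ]),
);
console.log([...propagated.get("handler")!].sort()); // [ 'db_write', 'throws' ]
```

Because every callee is finalized before its callers are visited, each edge is processed exactly once, so the pass is linear in the size of the call graph.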

13 Languages

TypeScript, JavaScript, Python, C++, Go, Rust, Java, C#, Ruby, Kotlin, Swift, PHP, Bash.

TypeScript/JavaScript: full AST analysis via TypeScript Compiler API. All others: tree-sitter with language-specific walkers.

Architecture

AI Agent (Claude Code, Cursor, etc.)
    |
    | MCP protocol (stdio)
    |
ContextZero MCP Bridge (46 tools)
    |
    +-- Ingestor (13 language adapters, delta ingestion)
    +-- 13 Analysis Engines
    |     Behavioral | Contract | Deep Contract | Blast Radius
    |     Effect | Dispatch | Concept Families | Temporal
    |     Symbol Lineage | Runtime Evidence | Uncertainty
    |     Structural Graph | Capsule Compiler
    +-- Semantic Engine (TF-IDF, MinHash LSH)
    +-- Homolog Engine (7-dimensional detection)
    +-- Transactional Editor (9-state lifecycle)
    +-- Database Driver (circuit breaker, retry, advisory locks)
    |
PostgreSQL (all data local, nothing leaves your machine)
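
The Semantic Engine in the diagram uses MinHash LSH, which estimates how much two token sets overlap without comparing them element by element. The sketch below shows only the MinHash part (signature plus estimated Jaccard similarity), not the LSH banding step. The seeded FNV-1a hash, the 64-slot signature, and the regex tokenizer are illustrative assumptions, not ContextZero's implementation.

```typescript
// 32-bit FNV-1a hash, salted with a seed so each signature slot gets its own hash function.
function fnv1a(text: string, seed: number): number {
  let h = (0x811c9dc5 ^ seed) >>> 0;
  for (let i = 0; i < text.length; i++) {
    h ^= text.charCodeAt(i);
    h = Math.imul(h, 0x01000193) >>> 0;
  }
  return h;
}

// MinHash signature: for each slot, keep the minimum hash over all tokens.
function minhashSignature(tokens: Set<string>, slots = 64): number[] {
  const signature: number[] = [];
  for (let seed = 0; seed < slots; seed++) {
    let min = 0xffffffff;
    for (const token of tokens) {
      min = Math.min(min, fnv1a(token, seed));
    }
    signature.push(min);
  }
  return signature;
}

// The fraction of matching slots estimates the Jaccard similarity of the two sets.
function estimateSimilarity(a: number[], b: number[]): number {
  let matches = 0;
  for (let i = 0; i < a.length; i++) {
    if (a[i] === b[i]) matches++;
  }
  return matches / a.length;
}

const tokenize = (code: string): Set<string> =>
  new Set(code.toLowerCase().split(/\W+/).filter((t) => t.length > 0));

const sigA = minhashSignature(tokenize("function saveUser(user) { return db.insert(user); }"));
const sigB = minhashSignature(tokenize("function saveUser(user) { return db.insert(user); }"));
const sigC = minhashSignature(tokenize("const total = prices.reduce((sum, p) => sum + p, 0);"));

console.log(estimateSimilarity(sigA, sigB)); // identical token sets give 1
console.log(estimateSimilarity(sigA, sigC)); // disjoint token sets: expected near 0
```

Signatures have a fixed length regardless of input size, which is what makes them cheap to store and compare locally without any external embedding API.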

Setup

Prerequisites

  • Node.js 20+
  • PostgreSQL 14+ with pg_trgm extension

Install

git clone https://github.com/Ranjitbarnala0/ContextZero.git
cd ContextZero
npm install

Database

createdb scg_v2
psql -d scg_v2 -c "CREATE EXTENSION IF NOT EXISTS pg_trgm;"
cp .env.example .env    # Edit with your credentials
npm run db:migrate
npm run build

Connect to Claude Code

claude mcp add contextzero -s user \
  -e DB_HOST=localhost \
  -e DB_PORT=5432 \
  -e DB_NAME=scg_v2 \
  -e DB_USER=your_user \
  -e DB_PASSWORD=your_password \
  -e NODE_ENV=development \
  -e LOG_LEVEL=warn \
  -e SCG_ALLOWED_BASE_PATHS=/your/code/directory \
  -- node /path/to/ContextZero/dist/mcp-bridge/index.js

Docker

docker compose up -d

Verify

Ask Claude Code to run scg_health_check. You should see status: healthy with DB latency and version.
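
Under the hood, an MCP client invokes a tool by sending a JSON-RPC 2.0 `tools/call` request over the stdio transport. The sketch below builds such a request for `scg_health_check`. The empty `arguments` object is an assumption, since the tool's input schema is not documented here, and `buildToolCall` is a hypothetical helper rather than part of ContextZero or any SDK.

```typescript
// Shape of an MCP tool invocation (JSON-RPC 2.0, method "tools/call").
interface ToolCallRequest {
  jsonrpc: "2.0";
  id: number;
  method: "tools/call";
  params: { name: string; arguments: Record<string, unknown> };
}

// Hypothetical helper: wrap a tool name and its arguments in a request envelope.
function buildToolCall(
  id: number,
  name: string,
  args: Record<string, unknown> = {},
): ToolCallRequest {
  return { jsonrpc: "2.0", id, method: "tools/call", params: { name, arguments: args } };
}

// On the stdio transport each message is a single line of JSON terminated by a newline.
const healthRequest = buildToolCall(1, "scg_health_check");
const wireMessage = JSON.stringify(healthRequest) + "\n";
console.log(wireMessage);
```

A real client performs the MCP `initialize` handshake before its first `tools/call`; Claude Code and similar tools handle that automatically once the server is registered.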

46 MCP Tools

Core: scg_health_check, scg_register_repo, scg_list_repos, scg_ingest_repo, scg_incremental_index, scg_codebase_overview, scg_snapshot_stats, scg_cache_stats

Symbol Intelligence: scg_resolve_symbol, scg_get_symbol_details, scg_get_symbol_relations, scg_read_source, scg_search_code, scg_semantic_search

Behavioral & Contract: scg_get_behavioral_profile, scg_get_contract_profile, scg_get_invariants, scg_get_uncertainty, scg_get_effect_signature, scg_diff_effects

Impact Analysis: scg_blast_radius, scg_compile_context_capsule, scg_smart_context, scg_find_homologs, scg_persist_homologs, scg_propagation_proposals

Code Graph: scg_get_dispatch_edges, scg_get_class_hierarchy, scg_get_symbol_lineage, scg_get_co_change_partners, scg_get_temporal_risk, scg_get_runtime_evidence, scg_get_concept_family, scg_list_concept_families

Transactional Editing: scg_create_change_transaction, scg_get_transaction, scg_apply_patch, scg_validate_change, scg_commit_change, scg_rollback_change

Data Management: scg_list_snapshots, scg_batch_embed, scg_ingest_runtime_trace

Native Workspace (no DB required): scg_native_codebase_overview, scg_native_symbol_search, scg_native_search_code

Security

  • Zero SQL injection surface — 100% parameterized queries
  • Path traversal protection — null bytes, URL-encoding, backslash, symlink escape detection
  • Fail-closed auth — timing-safe comparison, 32-char minimum, per-IP brute-force lockout
  • Sandboxed execution — ulimit, process groups, SIGKILL escalation, env sanitization
  • No data leaves your machine — no telemetry, no external APIs, fully local
  • Circuit breaker on DB connections with exponential backoff retry
  • Prometheus metrics at /metrics, structured JSON logging, correlation IDs
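
The circuit-breaker and exponential-backoff behavior mentioned in the list above can be sketched generically. This illustrates the pattern only: the class name, thresholds, and delays are hypothetical and are not read from ContextZero's database driver.

```typescript
// Delay before retry number `attempt` (0-based): base * 2^attempt, capped at maxMs.
function backoffDelayMs(attempt: number, baseMs = 100, maxMs = 5000): number {
  return Math.min(maxMs, baseMs * 2 ** attempt);
}

class CircuitBreaker {
  private failures = 0;
  private openedAt: number | null = null;
  private readonly failureThreshold: number;
  private readonly cooldownMs: number;

  constructor(failureThreshold = 3, cooldownMs = 10000) {
    this.failureThreshold = failureThreshold;
    this.cooldownMs = cooldownMs;
  }

  // While open, calls are rejected immediately until the cooldown has elapsed.
  canAttempt(now: number): boolean {
    return this.openedAt === null || now - this.openedAt >= this.cooldownMs;
  }

  // A successful call closes the breaker and clears the failure count.
  recordSuccess(): void {
    this.failures = 0;
    this.openedAt = null;
  }

  // Enough consecutive failures open (or re-open) the breaker.
  recordFailure(now: number): void {
    this.failures++;
    if (this.failures >= this.failureThreshold) this.openedAt = now;
  }
}

const retryDelays = [0, 1, 2, 3, 4, 5, 6].map((attempt) => backoffDelayMs(attempt));
console.log(retryDelays); // [100, 200, 400, 800, 1600, 3200, 5000]

const breaker = new CircuitBreaker(3, 10000);
breaker.recordFailure(0);
breaker.recordFailure(100);
breaker.recordFailure(300); // third failure opens the breaker
console.log(breaker.canAttempt(300));   // false: still cooling down
console.log(breaker.canAttempt(10300)); // true: one probe call is allowed again
```

Timestamps are passed in explicitly so the logic stays deterministic and easy to test; production code would typically read the clock itself and add random jitter to the delays.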

Testing

npm test              # 1,419 tests, 38 suites, 100% pass
npm run test:ci       # With coverage
npm run typecheck     # TypeScript strict mode

Integration tests with real PostgreSQL. Unit tests for all analysis engines, adapters, handlers, security, and caching.

Comparison with Alternatives

| Tool | Price | Behavioral Analysis | Blast Radius | Effect Tracing | Homologs | Token Budgeting |
|---|---|---|---|---|---|---|
| ContextZero | Free | Yes (4 classes) | Yes (5 dimensions) | Yes (29 types, transitive) | Yes (7-dimensional) | Yes (99.96%) |
| Sourcegraph + Cody | $19-59/user/mo | No | No | No | No | No |
| CodeScene | ~$18/author/mo | File-level only | Change coupling only | No | No | No |
| Greptile | $30/dev/mo | No | No | No | No | No |
| SonarQube | Free community | No | No | No | No | No |
| Semgrep | Free tier | No | No | No | No | No |

ContextZero is complementary. Use it alongside your existing tools — it fills a gap none of them cover.

License

ISC
