
ContextOS ("The Brain")

1. Product Vision

To create an "Active Memory Operating System" that transforms AI interactions from stateless chats into continuous, evolving relationships.

One-Line Pitch: "The Graph-based Memory Layer that prevents Groundhog Day in AI."

2. Problem Statement

Amnesia: AI agents forget context the moment a session closes.

Retrieval Blindness: Standard RAG retrieves "keywords" but misses "relationships" (e.g., knowing that Project A is blocking Project B).

Context Window Limits: You cannot stuff 6 months of history into every prompt.

3. Solution Overview

A background service that processes chat logs into a Knowledge Graph. It extracts entities (People, Projects) and relationships (Decisions, Blockers) and injects relevant context into future sessions before the user prompts.

4. Key Features & Functional Requirements

A. The "Episodic Ingestor" (The Debrief Agent)

Requirement: Process raw chat transcripts into structured data without user intervention.

Trigger: Runs 10 minutes after chat-session inactivity.

Logic: Uses a small, fast LLM (e.g., gpt-4o-mini) to extract:

Facts: "User works at Acme Corp."

Preferences: "User hates verbose code comments."

Work Objects: "User created a 'Draft Audit Report'."

Triples: (User) --[HAS_GOAL]--> (Launch AppFlow)
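The Debrief Agent's output can be modeled as a small schema that the extraction LLM fills in. A minimal sketch using stdlib dataclasses (the real pipeline would more likely use Pydantic models for structured output; all names here are illustrative):

```python
from dataclasses import dataclass, field

@dataclass
class Triple:
    subject: str      # e.g. "User"
    predicate: str    # e.g. "HAS_GOAL"
    obj: str          # e.g. "Launch AppFlow"

@dataclass
class DebriefResult:
    facts: list[str] = field(default_factory=list)         # "User works at Acme Corp."
    preferences: list[str] = field(default_factory=list)   # "User hates verbose code comments."
    work_objects: list[str] = field(default_factory=list)  # "User created a 'Draft Audit Report'."
    triples: list[Triple] = field(default_factory=list)
```

The extraction LLM is prompted to return JSON matching this shape, so each field maps one-to-one onto the categories above.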

B. The "Property Graph" Storage

Requirement: Store both semantic vectors (for fuzzy search) and graph edges (for precise logic).

Stack: LlamaIndex PropertyGraphIndex backed by Neo4j.

Schema:

Nodes: Person, Project, Event, Document.

Edges: AUTHORED, MODIFIED, BLOCKED_BY, DECIDED_ON.

C. The "Context Injection" Middleware

Requirement: Proactively insert memory into the system prompt of AppFlow agents.

Logic:

  1. User opens "ServiceNow Bot."
  2. Middleware queries ContextOS: "What is the user's current IT context?"
  3. Graph Result: (User) --[REPORTED]--> (Ticket #123: Wifi Issue).
  4. Injection: "System Note: The user recently reported Wifi issues (Ticket #123). Ask if this is resolved."
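The last two steps can be sketched as a pure function that turns graph triples into a system-prompt note (the relation names and phrasing are illustrative, not a fixed API):

```python
def format_system_note(triples: list[tuple[str, str, str]]) -> str:
    """Turn (subject, relation, object) triples from the graph into a system note."""
    notes = []
    for subject, relation, obj in triples:
        if relation == "REPORTED":
            notes.append(f"The user recently reported {obj}. Ask if this is resolved.")
        elif relation == "HAS_GOAL":
            notes.append(f"The user's current goal is: {obj}.")
    return "System Note: " + " ".join(notes) if notes else ""
```

The middleware prepends the returned string to the agent's system prompt; an empty string means nothing is injected.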

D. The "Morning Brief" (Diff Engine)

Requirement: Show the user what changed in their "World State."

Function: Compare Graph_State_T0 vs Graph_State_T1.

Output: A natural language summary: "Since yesterday, your 'Audit Project' moved from 'Planning' to 'Blocked'."
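A minimal sketch of the diff step, assuming each snapshot is a mapping from entity name to its current properties (the real engine would diff Neo4j snapshots; new entities would be reported separately):

```python
def diff_states(t0: dict, t1: dict) -> list[dict]:
    """Compare two world-state snapshots and list changed fields."""
    changes = []
    for name, new_props in t1.items():
        old_props = t0.get(name, {})
        for field_name, new_val in new_props.items():
            old_val = old_props.get(field_name)
            # Only report fields that existed before and changed value
            if old_val is not None and old_val != new_val:
                changes.append({"entity": name, "field": field_name,
                                "from": old_val, "to": new_val})
    return changes
```

The synthesis LLM then renders the change list into the natural-language summary shown above.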

5. Technical Stack

Orchestration: LlamaIndex (Python).

Graph DB: Neo4j (AuraDB Free Tier for MVP).

Vector DB: Pinecone or Weaviate (for the node embeddings).

LLM: OpenAI gpt-4o-mini (for extraction) / gpt-4o (for synthesis).

6. Success Metrics (MVP)

Recall Accuracy: The system retrieves the correct "Active Project" 95% of the time.

Extraction Quality: "Debrief Agent" correctly identifies 90% of user decisions from chat logs.

Latency: Context retrieval adds < 500ms to the start of a chat session.


7. Graph Schema (Complete)

Node Types & Properties

Person

{
  id: UUID,
  name: string,
  role?: string,              // "manager", "client", "vendor"
  company?: string,
  email?: string,
  relationship_strength: number,  // 0.0-1.0, decays over time
  last_mentioned: Date,
  extraction_confidence: number
}

Project

{
  id: UUID,
  name: string,
  status: "planning" | "active" | "blocked" | "completed" | "abandoned",
  priority?: "high" | "medium" | "low",
  deadline?: Date,
  blockers: UUID[],           // References to blocking nodes
  last_mentioned: Date,
  extraction_confidence: number
}

Event

{
  id: UUID,
  type: "meeting" | "deadline" | "decision" | "milestone",
  description: string,
  date?: Date,
  participants: UUID[],       // Person references
  outcome?: string,
  extraction_confidence: number
}

Fact

{
  id: UUID,
  subject_id: UUID,           // What this fact is about
  claim: string,              // "User prefers dark mode"
  source_session_id: UUID,
  extracted_at: Date,
  extraction_confidence: number,
  superseded_by?: UUID        // If updated by newer fact
}

Edge Types & Properties

| Edge | From → To | Properties |
|------|-----------|------------|
| AUTHORED | Person → Document | { date, role: "primary"\|"contributor" } |
| BLOCKED_BY | Project → Project/Event | { reason, since: Date } |
| DECIDED_ON | Person → Event | { decision_type, outcome } |
| WORKS_ON | Person → Project | { role, since: Date } |
| RELATED_TO | Any → Any | { relationship_type, strength: 0.0-1.0 } |
| SUPERSEDES | Fact → Fact | { reason: "updated"\|"corrected" } |
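The uniqueness constraints implied by the node schema can be applied once at startup. A sketch, assuming Neo4j 5 Cypher syntax and any session object exposing `.run` (e.g. a `neo4j.Session`):

```python
# Uniqueness constraints for each node type's id (Neo4j 5 Cypher syntax).
SCHEMA_DDL = [
    "CREATE CONSTRAINT person_id   IF NOT EXISTS FOR (n:Person)   REQUIRE n.id IS UNIQUE",
    "CREATE CONSTRAINT project_id  IF NOT EXISTS FOR (n:Project)  REQUIRE n.id IS UNIQUE",
    "CREATE CONSTRAINT event_id    IF NOT EXISTS FOR (n:Event)    REQUIRE n.id IS UNIQUE",
    "CREATE CONSTRAINT document_id IF NOT EXISTS FOR (n:Document) REQUIRE n.id IS UNIQUE",
    "CREATE CONSTRAINT fact_id     IF NOT EXISTS FOR (n:Fact)     REQUIRE n.id IS UNIQUE",
]

def apply_schema(session) -> None:
    """Idempotently apply the DDL; IF NOT EXISTS makes re-runs safe."""
    for statement in SCHEMA_DDL:
        session.run(statement)
```

Because each statement carries `IF NOT EXISTS`, `apply_schema` can run on every server start without error.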

8. Platform-Specific Ingestion

Supported Platforms (MVP)

| Platform | Detection | Parsing Strategy |
|----------|-----------|------------------|
| Claude.ai | URL contains claude.ai/chat | Parse data-testid conversation blocks |
| ChatGPT | URL contains chat.openai.com or chatgpt.com | Parse data-message-author-role attributes |

Inactivity Detection

// Content script injected into chat pages
let lastActivityTime = Date.now();

// Monitor for new messages
const observer = new MutationObserver(() => {
  lastActivityTime = Date.now();
});
observer.observe(document.body, { childList: true, subtree: true });

// Trigger debrief after 10 minutes of inactivity
setInterval(() => {
  const inactiveMinutes = (Date.now() - lastActivityTime) / 60000;
  if (inactiveMinutes >= 10) {
    triggerDebrief();
    lastActivityTime = Date.now(); // Reset so we don't re-trigger every minute
  }
}, 60000); // Check every minute

Extraction Pipeline

  1. Scrape: Pull conversation HTML from active tab
  2. Parse: Extract user/assistant turns with timestamps
  3. Chunk: Split into 4000-token windows
  4. Extract: LLM identifies entities, facts, preferences
  5. Dedupe: Match against existing graph nodes (embedding similarity > 0.9)
  6. Store: Upsert nodes/edges to Neo4j
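Step 5's deduplication can be sketched with plain cosine similarity over node embeddings (in the real pipeline Weaviate would perform this lookup):

```python
import math

def cosine_similarity(a: list[float], b: list[float]) -> float:
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(x * x for x in b))
    return dot / (na * nb) if na and nb else 0.0

def find_duplicate(new_emb: list[float], existing: dict,
                   threshold: float = 0.9):
    """Return the id of the best-matching existing node above the
    similarity threshold, or None if the entity is genuinely new."""
    best_id, best_sim = None, threshold
    for node_id, emb in existing.items():
        sim = cosine_similarity(new_emb, emb)
        if sim >= best_sim:
            best_id, best_sim = node_id, sim
    return best_id
```

If `find_duplicate` returns an id, step 6 merges into that node instead of creating a new one.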

9. Conflict Resolution (Confidence-Based)

Conflict Detection

Two facts conflict when:

  1. Same subject_id (about the same entity)
  2. Claims are semantically contradictory (heuristic: cosine similarity < 0.3 between claim embeddings)

Resolution Algorithm

def resolve_conflict(existing_fact: Fact, new_fact: Fact) -> Fact:
    """
    Keep the fact with higher confidence.
    If tie, prefer newer fact.
    """
    if new_fact.extraction_confidence > existing_fact.extraction_confidence:
        winner = new_fact
        loser = existing_fact
    elif new_fact.extraction_confidence < existing_fact.extraction_confidence:
        winner = existing_fact
        loser = new_fact
    else:
        # Tie: prefer newer
        winner = new_fact
        loser = existing_fact

    # Don't delete loser, mark as superseded
    loser.superseded_by = winner.id
    winner.supersedes = loser.id

    return winner

# Confidence thresholds
HIGH_CONFIDENCE = 0.8    # Explicit user statement
MEDIUM_CONFIDENCE = 0.6  # Inferred from context
LOW_CONFIDENCE = 0.4     # Ambiguous extraction

Confidence Scoring

| Signal | Confidence Boost |
|--------|------------------|
| User explicitly states fact | +0.3 |
| Fact repeated across sessions | +0.2 |
| Fact from recent session (< 7 days) | +0.1 |
| Fact contradicts user correction | -0.4 |
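The boosts above compose into a single clamped score; a sketch (the signal names are illustrative):

```python
def score_confidence(base: float, *, explicit: bool = False,
                     repeated: bool = False, recent: bool = False,
                     contradicts_correction: bool = False) -> float:
    """Apply the signal boosts to a base extraction confidence,
    clamped to the [0.0, 1.0] range used by the node schema."""
    score = base
    if explicit:
        score += 0.3
    if repeated:
        score += 0.2
    if recent:
        score += 0.1
    if contradicts_correction:
        score -= 0.4
    return max(0.0, min(1.0, score))
```

Clamping keeps the result a valid `extraction_confidence` regardless of how many signals fire.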

10. Context Injection Templates

Injection Format

<context_injection>
  <active_projects>
    - {project_name} ({status}): {one_line_summary}
  </active_projects>

  <recent_decisions>
    - {date}: {decision_summary}
  </recent_decisions>

  <user_preferences>
    - {preference_category}: {preference_value}
  </user_preferences>

  <relevant_people>
    - {person_name} ({role}): Last discussed {days_ago} days ago
  </relevant_people>
</context_injection>

Injection Rules

| Agent Type | Injected Context |
|------------|------------------|
| General Assistant | Top 3 active projects, top 5 preferences |
| Code Assistant | Current project only, tech stack preferences |
| Meeting Assistant | Relevant people, recent decisions |
| Task Manager | All active projects with blockers |

Token Budget

  • Maximum injection size: 500 tokens
  • If exceeds: Prioritize by last_mentioned recency
  • Always include: Active blockers, user corrections from last 7 days
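A sketch of the budget rule, assuming a rough 4-characters-per-token estimate and a `pinned` flag marking the always-include items (active blockers, recent corrections):

```python
def fit_to_budget(items: list[dict], max_tokens: int = 500) -> list[dict]:
    """Select items under the token budget: pinned items first,
    then the rest by last_mentioned recency. Each item has 'text',
    'last_mentioned' (sortable), and an optional 'pinned' flag."""
    def estimate(text: str) -> int:
        return max(1, len(text) // 4)  # rough chars-per-token heuristic

    pinned = [i for i in items if i.get("pinned")]
    rest = sorted((i for i in items if not i.get("pinned")),
                  key=lambda i: i["last_mentioned"], reverse=True)
    selected, used = [], 0
    for item in pinned + rest:
        cost = estimate(item["text"])
        if used + cost <= max_tokens:
            selected.append(item)
            used += cost
    return selected
```

A real implementation would use a proper tokenizer (e.g. tiktoken) instead of the character heuristic, but the prioritization logic is the same.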

11. Architecture Decision: Deployment Model

Chosen Architecture: Browser Extension + Local Server

┌─────────────────────────────────────────────────────────────┐
│                    User's Machine                            │
├─────────────────────────────────────────────────────────────┤
│  ┌─────────────────┐    ┌────────────────────────────────┐ │
│  │ Browser Extension│    │ ContextOS Local Server         │ │
│  │ (Content Scripts)│    │ (Python/FastAPI)               │ │
│  │                  │    │                                │ │
│  │ • Scrapes chat   │───>│ • Extraction Pipeline          │ │
│  │ • Detects idle   │    │ • Graph Storage (Neo4j)        │ │
│  │ • Triggers sync  │<───│ • Vector Embeddings (Weaviate) │ │
│  │                  │    │ • Context Query API            │ │
│  └─────────────────┘    └────────────────────────────────┘ │
│                                    │                         │
│                                    ▼                         │
│                         ┌─────────────────┐                 │
│                         │ OpenAI API      │                 │
│                         │ (gpt-4o-mini)   │                 │
│                         └─────────────────┘                 │
└─────────────────────────────────────────────────────────────┘

Rationale:

  • Local server preserves privacy (no cloud dependency for graph storage)
  • Browser extension has direct DOM access for reliable scraping
  • Separation allows each component to be updated independently
  • Future: Option to run server in Docker for portability

Communication Protocol

  • Extension → Server: HTTP POST to localhost:8742
  • Server → Extension: Server-Sent Events for real-time updates

12. Tech Stack Decisions (Resolved)

| Component | Technology | Rationale |
|-----------|------------|-----------|
| Vector DB | Weaviate (local Docker) | Open-source, embedded mode, built-in vectorizer |
| Graph DB | Neo4j (AuraDB Free or local) | Best graph query language (Cypher) |
| Extraction LLM | OpenAI gpt-4o-mini | Fast, cheap, good at structured output |
| Synthesis LLM | OpenAI gpt-4o | Better reasoning for context relevance |
| Backend | Python 3.11 + FastAPI | Best LLM tooling ecosystem |
| Extension | Manifest V3 | Required for Chrome Web Store |

"AppFlow" Clarification

The term "AppFlow" in this document refers to any LLM-powered application that can receive injected context. Examples:

  • Claude Code (via MCP server)
  • Custom Streamlit apps
  • API-based agents

Context injection works via a REST API that any application can query.


13. Project File Structure

contextos/
├── extension/                    # Chrome Extension
│   ├── manifest.json
│   ├── content_scripts/
│   │   ├── scraper.js           # DOM scraping logic
│   │   ├── platforms/
│   │   │   ├── claude.js
│   │   │   └── chatgpt.js
│   │   └── idle_detector.js
│   ├── background/
│   │   └── service-worker.js    # Extension coordination
│   └── popup/
│       ├── popup.html
│       └── popup.js
│
├── server/                       # Python Backend
│   ├── src/
│   │   ├── main.py              # FastAPI entry point
│   │   ├── config.py            # Pydantic Settings
│   │   ├── api/
│   │   │   ├── routes.py        # REST endpoints
│   │   │   └── schemas.py       # Request/response models
│   │   ├── extraction/
│   │   │   ├── pipeline.py      # Main extraction flow
│   │   │   ├── prompts.py       # LLM prompt templates
│   │   │   └── parsers.py       # Chat format parsers
│   │   ├── graph/
│   │   │   ├── neo4j_client.py  # Neo4j connection
│   │   │   ├── schema.py        # Cypher schema DDL
│   │   │   └── queries.py       # Common graph queries
│   │   ├── vector/
│   │   │   ├── weaviate_client.py
│   │   │   └── embeddings.py    # OpenAI embedding calls
│   │   └── injection/
│   │       ├── builder.py       # Context assembly
│   │       └── templates.py     # Injection formats
│   ├── tests/
│   │   ├── test_extraction.py
│   │   ├── test_graph.py
│   │   └── fixtures/
│   ├── requirements.txt
│   └── Dockerfile
│
├── docker-compose.yml           # Neo4j + Weaviate + Server
├── .env.example
└── README.md

14. API Contracts

Base URL

http://localhost:8742/api/v1

POST /ingest

Submit a chat transcript for processing.

Request:

{
  "platform": "claude" | "chatgpt",
  "session_id": "uuid",
  "messages": [
    {
      "role": "user" | "assistant",
      "content": "string",
      "timestamp": "ISO8601"
    }
  ]
}

Response (202 Accepted):

{
  "job_id": "uuid",
  "status": "queued",
  "estimated_seconds": 30
}
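Server-side validation of this payload maps directly onto the codes listed under Error Responses; a sketch (the `invalid_role` code is illustrative, not in the spec's table):

```python
def validate_ingest(payload: dict) -> list[str]:
    """Return the list of error codes for a POST /ingest body (empty = valid)."""
    errors = []
    if payload.get("platform") not in ("claude", "chatgpt"):
        errors.append("invalid_platform")
    messages = payload.get("messages") or []
    if not messages:
        errors.append("empty_transcript")
    if any(m.get("role") not in ("user", "assistant") for m in messages):
        errors.append("invalid_role")  # illustrative extra check
    return errors
```

The route handler returns 400 with the first error code when the list is non-empty, otherwise enqueues the job and responds 202.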

GET /ingest/{job_id}

Check extraction job status.

Response (200):

{
  "job_id": "uuid",
  "status": "completed" | "processing" | "failed",
  "extracted": {
    "facts": 12,
    "entities": 5,
    "relationships": 8
  },
  "errors": []
}

GET /context

Retrieve context for injection into an agent.

Query Parameters:

  • agent_type: "general" | "code" | "meeting" | "task" (required)
  • max_tokens: integer (default 500)
  • project_filter: string (optional, filter to specific project)

Response (200):

{
  "injection": "<context_injection>...</context_injection>",
  "token_count": 423,
  "sources": [
    { "type": "fact", "id": "uuid", "relevance": 0.92 },
    { "type": "project", "id": "uuid", "relevance": 0.88 }
  ]
}
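Any AppFlow agent can query this endpoint before its first turn; a client sketch using only the stdlib (`fetch_context` assumes the local server is running):

```python
import json
import urllib.request
from typing import Optional
from urllib.parse import urlencode

BASE_URL = "http://localhost:8742/api/v1"

def context_url(agent_type: str, max_tokens: int = 500,
                project_filter: Optional[str] = None) -> str:
    """Build the GET /context URL from the documented query parameters."""
    params = {"agent_type": agent_type, "max_tokens": max_tokens}
    if project_filter:
        params["project_filter"] = project_filter
    return f"{BASE_URL}/context?{urlencode(params)}"

def fetch_context(agent_type: str, **kwargs) -> dict:
    """GET /context and return the parsed JSON response body."""
    with urllib.request.urlopen(context_url(agent_type, **kwargs)) as resp:
        return json.loads(resp.read())
```

The caller then prepends `response["injection"]` to its system prompt.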

GET /diff

Get changes since last snapshot for Morning Brief.

Query Parameters:

  • since: ISO8601 timestamp (required)

Response (200):

{
  "changes": [
    {
      "entity_type": "project",
      "entity_name": "Audit Project",
      "field": "status",
      "from": "planning",
      "to": "blocked",
      "changed_at": "ISO8601"
    }
  ],
  "summary": "Your 'Audit Project' moved from 'Planning' to 'Blocked'. 2 new facts extracted about you."
}

Error Responses

| HTTP | Code | Meaning |
|------|------|---------|
| 400 | invalid_platform | Unsupported platform in request |
| 400 | empty_transcript | No messages in transcript |
| 404 | job_not_found | Invalid job_id |
| 500 | extraction_failed | LLM or DB error during processing |
| 503 | neo4j_unavailable | Graph DB connection failed |

15. Environment Configuration

# .env.example

# OpenAI (Required)
OPENAI_API_KEY=sk-...

# Neo4j
NEO4J_URI=bolt://localhost:7687
NEO4J_USER=neo4j
NEO4J_PASSWORD=password

# Weaviate
WEAVIATE_URL=http://localhost:8080

# Server
SERVER_PORT=8742
LOG_LEVEL=INFO

# Extension (for development)
EXTENSION_DEV_MODE=true
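The spec's `config.py` uses Pydantic Settings; a dependency-free sketch of the same loading behavior over these variables:

```python
import os
from dataclasses import dataclass

@dataclass(frozen=True)
class Settings:
    openai_api_key: str
    neo4j_uri: str
    neo4j_user: str
    neo4j_password: str
    weaviate_url: str
    server_port: int
    log_level: str

def load_settings(env=os.environ) -> Settings:
    """Read configuration from the environment, with the .env.example defaults."""
    return Settings(
        openai_api_key=env.get("OPENAI_API_KEY", ""),
        neo4j_uri=env.get("NEO4J_URI", "bolt://localhost:7687"),
        neo4j_user=env.get("NEO4J_USER", "neo4j"),
        neo4j_password=env.get("NEO4J_PASSWORD", "password"),
        weaviate_url=env.get("WEAVIATE_URL", "http://localhost:8080"),
        server_port=int(env.get("SERVER_PORT", "8742")),
        log_level=env.get("LOG_LEVEL", "INFO"),
    )
```

Passing `env` explicitly makes the loader easy to unit-test; in production it defaults to `os.environ`.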

16. Error Handling Strategy

Extraction Errors

| Error | Detection | Recovery |
|-------|-----------|----------|
| OpenAI rate limit | 429 response | Exponential backoff, queue remaining chunks |
| OpenAI timeout | >30s response | Retry once, then mark job as partial |
| Malformed LLM output | JSON parse fails | Retry with stricter prompt, then skip chunk |
| Empty extraction | 0 entities found | Log warning, continue (some chats have no extractable content) |
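The rate-limit row can be sketched as a generic retry wrapper (`RetryableError` stands in for the OpenAI client's rate-limit/timeout exceptions; `sleep` is injected for testability):

```python
import random
import time

class RetryableError(Exception):
    """Stand-in for rate-limit / timeout exceptions from the OpenAI client."""

def with_backoff(call, max_retries: int = 5, base_delay: float = 1.0,
                 sleep=time.sleep):
    """Call `call()`, retrying on RetryableError with exponential backoff."""
    for attempt in range(max_retries):
        try:
            return call()
        except RetryableError:
            if attempt == max_retries - 1:
                raise  # out of retries: surface the error to the job runner
            # 1s, 2s, 4s, ... plus jitter to avoid synchronized retries
            sleep(base_delay * (2 ** attempt) + random.uniform(0, 0.5))
```

Each extraction chunk is wrapped in `with_backoff`, so a transient 429 delays the pipeline instead of failing the job.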

Storage Errors

| Error | Detection | Recovery |
|-------|-----------|----------|
| Neo4j connection failed | Connection timeout | Cache extractions locally, retry on reconnect |
| Weaviate unavailable | Health check fails | Disable deduplication, proceed with extraction |
| Duplicate node | Constraint violation | Merge with existing node |

Extension Errors

| Error | Detection | Recovery |
|-------|-----------|----------|
| Server unreachable | Fetch fails | Show badge "Offline", queue transcripts locally |
| DOM structure changed | Selectors return null | Fall back to text extraction, log warning |
| Permission denied | Content script blocked | Show user notification with fix instructions |

17. Security & Privacy

Data Storage

  • All data stored locally (user's machine only)
  • No cloud sync by default
  • Neo4j and Weaviate run in Docker with no external access

API Security (Local Server)

  • Server binds to 127.0.0.1 only (not exposed to network)
  • No authentication for MVP (local-only access)
  • CORS restricted to extension origin

Sensitive Data Handling

  • Chat content processed but not stored verbatim
  • Only extracted facts/entities retained
  • User can delete all data via /api/v1/reset endpoint

Extension Permissions

| Permission | Justification |
|------------|---------------|
| activeTab | Read current chat page DOM |
| storage | Persist local queue when server unavailable |
| host_permissions: claude.ai/* | Content script injection |
| host_permissions: chat.openai.com/* | Content script injection |

18. Testing Requirements

Unit Tests

| Module | Test Cases |
|--------|------------|
| extraction/parsers.py | Parse Claude format, parse ChatGPT format, handle malformed input |
| extraction/pipeline.py | Extract facts, extract relationships, handle empty input |
| graph/queries.py | Find related nodes, resolve conflicts, calculate recency |
| injection/builder.py | Token limiting, priority sorting, template rendering |

Integration Tests

| Scenario | Validation |
|----------|------------|
| Full pipeline | Ingest sample transcript → verify graph populated correctly |
| Context retrieval | Populate graph → query context → verify relevant facts returned |
| Conflict resolution | Insert conflicting facts → verify winner selected correctly |

Test Fixtures

tests/fixtures/
├── claude_transcript_simple.json
├── chatgpt_transcript_long.json
├── expected_extraction.json
└── sample_graph_state.cypher

19. Success Criteria

MVP Definition of Done

  • Extension successfully scrapes Claude.ai and ChatGPT conversations
  • Extraction pipeline populates Neo4j with facts, projects, people
  • /context endpoint returns relevant context in < 500ms
  • Morning Brief shows accurate diff summary

Quality Gates

| Metric | Target |
|--------|--------|
| Extraction precision | > 90% (facts match human labeling) |
| Context retrieval latency | < 500ms |
| Token efficiency | Injection uses < 80% of budget |
| False positive rate | < 5% (irrelevant context injected) |