This document provides a comprehensive explanation of the ChatForge backend architecture, request flow, and key components.
- Entry Points and Server Initialization
- Request Flow Architecture
- Authentication and Middleware Flow
- Request Handling - Chat API
- Database Access Patterns
- Tool System Architecture
- Provider/Adapter System
- Persistence (Data Recording)
- Streaming and Response Handling
- Conversation Management
- Provider Routes - Configuration Management
- System Prompts
- Key Architectural Patterns
File: backend/src/index.js
The main server file initializes Express with:
- CORS configuration with support for the `x-session-id` header
- Global middleware stack (in order):
- Session resolver - Establishes session identity from headers/cookies
- Request logger - Logs all incoming requests
- Rate limiting - Protects against abuse
- Authentication-protected routes - Per-router auth middleware
- Database setup - `getDb()` triggers migrations and seeders on first call
- Retention worker - Hourly cleanup job for conversation retention policies
- Model cache worker - Background refresh for the model cache
  - Runs on a `MODEL_CACHE_REFRESH_MS` interval (default 1 hour)
  - Refreshes model lists per user in the background
  - Ensures model availability without blocking requests
- Security headers - HSTS, X-Frame-Options, etc. in production
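The ordering of the global stack matters: session resolution must run before logging and rate limiting so later layers can key off the session. A framework-free sketch of that chaining (in the real app these are Express middleware registered via `app.use()`; the stand-in functions here are illustrative, not from the source):

```javascript
// Minimal sketch of the global middleware order described above.
// Each entry receives the request and a next() continuation, like Express.
const stack = [
  (req, next) => { req.sessionId = req.headers['x-session-id'] || 'generated-uuid'; next(); }, // session resolver
  (req, next) => { (req.log = req.log || []).push(`${req.method} ${req.path}`); next(); },      // request logger
  (req, next) => { req.rateLimited = false; next(); },                                          // rate limiting
];

function run(req, i = 0) {
  if (i === stack.length) return req;
  let result;
  stack[i](req, () => { result = run(req, i + 1); });
  return result;
}

const req = run({ method: 'POST', path: '/v1/chat/completions', headers: { 'x-session-id': 'abc' } });
console.log(req.sessionId, req.log[0]); // 'abc' 'POST /v1/chat/completions'
```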
The request processing follows a layered approach:
HTTP Request
↓
Session Resolution (session.js)
↓
Authentication Middleware (auth.js)
↓
Route Handler (e.g., chat.js, conversations.js)
↓
Request Validation & Sanitization
↓
Provider Selection & Context Building
↓
Persistence Initialization (SimplifiedPersistence)
↓
Tool Orchestration or Direct Proxy
↓
Streaming or JSON Response
↓
Database Persistence (checkpoint + final writes)
Location: backend/src/middleware/
- Precedence: `x-session-id` header > `cf_session_id` cookie > generated UUID
- Sets a persistent HttpOnly cookie (365 days)
- Computes an IP hash (first 16 chars of SHA-256) for session tracking
- Does NOT require authentication
- `authenticateToken()` - Required auth; returns 401 if no token
  - Extracts the JWT from `Authorization: Bearer <token>`
  - Verifies the token with the configured secret
  - Validates that the user still exists in the database
  - Populates `req.user` with id, email, displayName, emailVerified
  - Upserts the session with user context
- `optionalAuth()` - Soft auth; sets `req.user = null` if no token
- `getUserContext()` - Delegated wrapper around `authenticateToken`
- Every database query filters by `user_id`
- All database functions enforce `NOT NULL` constraints on `user_id`
- Prevents cross-user data access
Primary Route: /v1/chat/completions (POST)
Location: backend/src/lib/openaiProxy.js
proxyOpenAIRequest(req, res)
├─ buildRequestContext()
│ ├─ Resolve provider (DB or env-based)
│ ├─ Extract conversation ID
│ ├─ Sanitize incoming body
│ ├─ Expand tool names to specs
│ └─ Get default model if missing
├─ validateRequestContext()
│ ├─ Check reasoning_effort, verbosity, etc.
│ └─ Validate against model capabilities
├─ executeRequestHandler()
│ ├─ Initialize SimplifiedPersistence
│ ├─ Load conversation history (if applicable)
│ ├─ Select execution path based on flags:
│ │ ├─ Tools enabled → Tool orchestration
│ │ │ ├─ Streaming → handleToolsStreaming()
│ │ │ └─ JSON → handleToolsJson()
│ │ └─ No tools → Direct proxy
│ │ ├─ Streaming → handleRegularStreaming()
│ │ └─ JSON → Direct JSON response
│ ├─ Stream conversation metadata early
│ ├─ Accumulate tool calls during streaming
│ └─ Persist final message & tool calls
└─ Update system prompt usage tracking

- System prompt is injected as the first message
- Tool names (strings) are expanded to full OpenAI function specs
- Message history is loaded from the database with diff-based sync
- Response ID is tracked for Responses API optimization
- Prompt caching: `addPromptCaching()` inserts cache breakpoints for Anthropic models
- Reasoning format: transformed per provider (e.g., `reasoning_format` parameter handling)
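As a rough illustration of the cache-breakpoint pass: Anthropic's convention is a `cache_control: { type: 'ephemeral' }` marker. The real `addPromptCaching()` may place breakpoints differently; this sketch simply marks the end of the stable message prefix:

```javascript
// Hedged sketch: mark the last message of the stable prefix (everything
// before the newest user turn) so that prefix can be cached upstream.
function addPromptCaching(messages) {
  if (messages.length < 2) return messages;
  const out = messages.map(m => ({ ...m })); // avoid mutating the input
  out[out.length - 2].cache_control = { type: 'ephemeral' };
  return out;
}

const msgs = addPromptCaching([
  { role: 'system', content: 'You are helpful.' },
  { role: 'user', content: 'Hi' },
]);
console.log(msgs[0].cache_control.type); // 'ephemeral' on the system message
```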
Location: backend/src/db/
Every database function enforces user isolation:
```javascript
// Example pattern - conversations.js
export function getConversationById({ id, userId }) {
  if (!userId) throw new Error('userId is required');
  const db = getDb();
  return db.prepare(
    `SELECT ... FROM conversations
     WHERE id=@id AND user_id=@user_id AND deleted_at IS NULL`
  ).get({ id, user_id: userId });
}
```

- Fields: id, session_id, user_id, title, provider_id, model, metadata (JSON)
- Settings: streaming_enabled, tools_enabled, reasoning_effort, verbosity
- Tracks: created_at, updated_at, deleted_at (soft delete)
- Fields: conversation_id, role (user/assistant/tool), content, content_json
- Tool data: tool_calls (JSON array), function_call (legacy)
- Reasoning: reasoning_details (JSON), reasoning_tokens
- Metadata: seq (sequence number for ordering), finish_reason, status
- Links: message_id → message
- Data: function name, arguments (JSON), index, id
- Links: message_id → message, tool_call_id → tool_calls.id
- Data: output, status (success/error)
- Fields: id, name, provider_type, api_key, base_url, metadata (JSON)
- User scoping: user_id (enforced in all queries)
- Flags: enabled, is_default, deleted_at (soft delete)
- Fields: id, email, display_name, email_verified
- Links: message_id -> messages
- Fields: type ('content', 'reasoning'), content, seq
- Purpose: Stores individual streaming chunks for replay/recovery
- Fields: id, name, content, is_builtin, user_id
- User scoping: Built-in prompts shared, custom prompts per user
- Fields: id, user_id, content, created_at, updated_at
- Purpose: Enables AI to store and retrieve notes across conversations
- User scoping: Entries strictly scoped per user
- Fields: id, user_id, key, value
- Purpose: Stores user-specific configuration (tool API keys, preferences)
- Security: API keys encrypted at rest
- Fields: id, user_id, session_id, ip_hash, created_at, last_active_at
- Purpose: Track user sessions for security and analytics
- Reads - Always filtered by `user_id AND deleted_at IS NULL`
- Writes - Require an explicit `user_id` parameter
- Updates - Include a `WHERE user_id=@userId` clause
- Soft deletes - Set a `deleted_at` timestamp instead of removing rows
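Complementing the read example earlier, the write-side rules might look like this as plain SQL builders (the real code issues prepared statements via `getDb()`; these helpers only illustrate the clauses every write must carry, and the function names are made up):

```javascript
// Sketch of the write-side isolation rules: explicit userId parameter,
// user_id in every WHERE clause, and soft delete via deleted_at.
function softDeleteSql() {
  return `UPDATE conversations
          SET deleted_at=@now
          WHERE id=@id AND user_id=@user_id AND deleted_at IS NULL`;
}

function updateTitle({ id, userId, title }) {
  if (!userId) throw new Error('userId is required'); // parameter-level enforcement
  return {
    sql: `UPDATE conversations SET title=@title WHERE id=@id AND user_id=@user_id`,
    params: { id, title, user_id: userId },
  };
}

console.log(updateTitle({ id: 'c1', userId: 'u1', title: 'Hello' }).params.user_id); // 'u1'
```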
Location: backend/src/lib/tools/
```javascript
// tools/index.js
const registeredTools = [
  webSearchTool,
  webSearchExaTool,
  webSearchSearxngTool,
  webFetchTool,
  journalTool
];

const toolMap = new Map(); // name → tool implementation
export const tools = Object.fromEntries(toolMap.entries());
```

Each tool implements:

- `name` - Tool identifier
- `spec` - OpenAI-compatible function definition
- `validate(args)` - Input validation
- `handler(args, context)` - Async execution
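A tool matching that interface might be defined like this (the `echo` tool is a made-up example for illustration, not one of the registered tools):

```javascript
// Hypothetical tool following the registry interface described above.
const echoTool = {
  name: 'echo',
  spec: {
    type: 'function',
    function: {
      name: 'echo',
      description: 'Echo the input text back to the model',
      parameters: {
        type: 'object',
        properties: { text: { type: 'string' } },
        required: ['text'],
      },
    },
  },
  validate(args) {
    if (typeof args.text !== 'string') throw new Error('text must be a string');
  },
  async handler(args, context) {
    // context carries user scoping for user-scoped tools like the journal
    return { output: args.text, userId: context?.userId ?? null };
  },
};

echoTool.validate({ text: 'hi' });
echoTool.handler({ text: 'hi' }, { userId: 'u1' }).then(r => console.log(r.output)); // 'hi'
```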
- `buildConversationMessagesOptimized()` - Attempts Responses API optimization
  - Falls back to full history if there is no `previous_response_id`
  - Merges stored messages with request messages
- `executeToolCall()` - Executes a tool and returns its output
  - Passes user context for user-scoped tools
  - Handles JSON parsing and validation errors gracefully
- `handleToolsStreaming()` - Real-time tool orchestration with SSE
- `handleToolsJson()` - Non-streaming tool orchestration
Location: backend/src/lib/providers/
resolveProviderSettings(config, options)
├─ Check DB-backed provider (if providerId specified)
├─ Fall back to latest enabled provider
└─ Fall back to env-based config
createProvider()
├─ Instantiate provider class (OpenAI/Anthropic/Gemini)
└─ Inject resolved settings

- `BaseProvider` - Abstract base with common logic
  - `sendRequest()` - Send to upstream API
  - `supportsTools()` - Tool capability detection
  - `supportsReasoningControls()` - Advanced reasoning support
  - `getDefaultModel()` - Model resolution
- `OpenAIProvider` - OpenAI API (and OpenAI-compatible endpoints)
  - Supports: tools, reasoning_effort, verbosity
  - Custom logic for model filtering
- `AnthropicProvider` - Claude API
  - Supports: different parameter names and formats
- `GeminiProvider` - Google Gemini API
  - Supports: its own API format
- `BaseAdapter` - Request normalization
- `ChatCompletionsAdapter` - Chat Completions API handling
- `ResponsesApiAdapter` - OpenAI Responses API (state-management optimization)
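The factory dispatch above could be sketched as follows (class names come from the source; the registry shape and constructor signature are assumptions):

```javascript
// Sketch of createProvider(): map provider_type to a class and inject
// the resolved settings. Real classes live in backend/src/lib/providers/.
class BaseProvider { constructor(settings) { this.settings = settings; } }
class OpenAIProvider extends BaseProvider {}
class AnthropicProvider extends BaseProvider {}
class GeminiProvider extends BaseProvider {}

const registry = {
  openai: OpenAIProvider,
  anthropic: AnthropicProvider,
  gemini: GeminiProvider,
};

function createProvider(settings) {
  const Cls = registry[settings.provider_type];
  if (!Cls) throw new Error(`Unknown provider type: ${settings.provider_type}`);
  return new Cls(settings); // inject resolved settings
}

const p = createProvider({ provider_type: 'anthropic', api_key: 'sk-...' });
console.log(p instanceof AnthropicProvider); // true
```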
Location: backend/src/lib/simplifiedPersistence.js
The persistence system uses a hybrid approach combining draft checkpoints during streaming with final writes on completion, enabling recovery if clients disconnect mid-stream.
SimplifiedPersistence
├─ initialize(conversationId, sessionId, userId, req, bodyIn)
├─ _handleConversation() - Create or retrieve conversation
├─ _processMessageHistory() - Sync message diffs
└─ _setupAssistantRecording() - Prepare for response

The checkpoint system enables mid-stream recovery:

- `shouldCheckpoint()` - Determines whether a checkpoint is needed based on:
  - Time threshold: 3000 ms since the last checkpoint
  - Size threshold: 500+ characters accumulated since the last checkpoint
- `performCheckpoint()` - Writes a draft message to the database during streaming
- `createDraftMessage()` - Creates the initial draft with `status='streaming'`
- `appendContent(delta)` - Buffers assistant message chunks
- `appendReasoningText(delta)` - Buffers reasoning tokens
- `addToolCalls(toolCalls)` - Buffers tool calls
- `addToolOutputs(toolOutputs)` - Buffers tool outputs
- `recordAssistantFinal(finishReason, responseId)` - Writes the final assistant message
- `persistToolCallsAndOutputs()` - Writes tool data to the DB
  - Tool outputs are stored as separate messages with `role="tool"`
  - Updates the draft message status from `'streaming'` to `'complete'`
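The checkpoint decision reduces to a time-or-size test using the thresholds named above; a minimal sketch (the real `shouldCheckpoint()` lives on the persistence object and tracks this state internally):

```javascript
// Sketch of the checkpoint policy: flush the draft when either 3000 ms
// have elapsed or 500+ chars have accumulated since the last checkpoint.
const TIME_THRESHOLD_MS = 3000;
const SIZE_THRESHOLD_CHARS = 500;

function shouldCheckpoint({ lastCheckpointAt, charsSinceCheckpoint }, now = Date.now()) {
  return (now - lastCheckpointAt) >= TIME_THRESHOLD_MS
      || charsSinceCheckpoint >= SIZE_THRESHOLD_CHARS;
}

console.log(shouldCheckpoint({ lastCheckpointAt: 0, charsSinceCheckpoint: 10 }, 5000));    // true (time)
console.log(shouldCheckpoint({ lastCheckpointAt: 4000, charsSinceCheckpoint: 600 }, 5000)); // true (size)
console.log(shouldCheckpoint({ lastCheckpointAt: 4000, charsSinceCheckpoint: 10 }, 5000));  // false
```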
- Early metadata emission (conversation ID before chunks)
- Tool call accumulation during streaming
- Periodic checkpoint writes during long streams
- Final database write at stream end
- Recovery of partial content if client disconnects
Location: backend/src/lib/streamingHandler.js
- Set SSE headers (`text/event-stream`)
- Emit conversation metadata early
- Parse SSE chunks from upstream
- Pass them through to the client in real time
- Accumulate them in the persistence buffer
- `finish_reason` - Tracked per chunk
- `reasoning_content` - Captured from the delta
- `tool_calls` - Accumulated with index tracking
- `response_id` - Captured from any chunk
- `reasoning_tokens` - Captured from usage
- On stream end: call `recordAssistantFinal()`
- On stream error: call `markError()`
- On client disconnect: mark the message as errored
```javascript
// Client receives conversation ID immediately
const conversationMeta = getConversationMetadata(persistence);
writeAndFlush(res, `data: ${JSON.stringify(conversationMeta)}\n\n`);
```

- Event types: `'content'` for regular text, `'reasoning'` for thinking content
- Progressive rendering: events stored in the `message_events` table for replay
- Generated images: DALL-E image responses handled with URL extraction
- Endpoint: `POST /v1/chat/completions/stop`
- Registry: active streams tracked by conversation ID
- Client disconnect: handled via a `req.on('close')` listener
- Cleanup: removes the stream from the registry and triggers checkpoint persistence
```javascript
req.on('close', () => {
  // Persist buffered content as a checkpoint
  // Remove from the active stream registry
  // Mark the message with the appropriate status
});
```

Location: backend/src/db/conversations.js
- Generate UUID for ID
- Store with user_id, session_id
- Store settings snapshot (streaming_enabled, tools_enabled, etc.)
- Always scoped by user_id
- Parse JSON metadata
- Extract active_tools from metadata
- Metadata updates (system_prompt, active_tools)
- Settings updates (streaming, tools, quality, reasoning, verbosity)
- Title generation (fire-and-forget background task)
- Create a new conversation with a `parent_conversation_id` reference
- Copy all metadata from the original
- Copy messages up to the specified sequence number
- Enables exploration of alternative conversation paths
- Field: `parent_conversation_id` for forking relationships
- Model comparison: multiple conversations linked for side-by-side evaluation
- Endpoint: `GET /v1/conversations/:id/linked` - Retrieve linked conversations
- Set deleted_at timestamp
- Prevents retrieval in normal queries
- Cursor-based using created_at + id
- Handles 1-100 item limits
- Maintains order consistency
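The cursor scheme keys on (created_at, id) so ties on timestamp stay stable; an in-memory sketch of the same logic the SQL performs (the real query lives in conversations.js; the default limit of 20 is an assumption):

```javascript
// Sketch of cursor-based pagination keyed on (created_at, id),
// with the 1-100 limit clamp mentioned above.
function clampLimit(limit) {
  return Math.min(100, Math.max(1, limit | 0 || 20)); // assumed default of 20
}

function page(rows, cursor, limit) {
  const after = cursor
    ? r => r.created_at < cursor.created_at
        || (r.created_at === cursor.created_at && r.id < cursor.id)
    : () => true;
  const sorted = [...rows].sort((a, b) =>
    b.created_at - a.created_at || (a.id < b.id ? 1 : -1)); // newest first, ties by id desc
  return sorted.filter(after).slice(0, clampLimit(limit));
}

const rows = [
  { id: 'a', created_at: 1 }, { id: 'b', created_at: 2 }, { id: 'c', created_at: 2 },
];
console.log(page(rows, null, 2).map(r => r.id)); // ['c', 'b']
```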
Location: backend/src/routes/providers.js
- GET /v1/providers - List user's providers
- GET /v1/providers/:id - Get specific provider
- POST /v1/providers - Create new provider
- PUT /v1/providers/:id - Update provider
- DELETE /v1/providers/:id - Soft delete provider
- POST /v1/providers/:id/default - Set as default
- GET /v1/providers/:id/models - Fetch available models from provider's API
- Server-side only (API keys not exposed to client)
- Filters OpenRouter models (last 1 year only)
- Applies model filters from provider metadata
- Comprehensive error handling for connectivity issues
Location: backend/src/lib/toolOrchestrationUtils.js
- Inline override (in messages array)
- Request parameter (system_prompt/systemPrompt)
- Active system prompt ID (built-in or custom)
- Legacy stored system_prompt
- Empty string
<system_instructions>
[Today's date]
[Shared modules for enabled tools]
</system_instructions>
<user_instructions>
[Prompt content]
</user_instructions>
Loaded based on enabled tools, wrapped with model filtering
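The composition above might be assembled like so (the wrapper tags match the template shown; the helper name and its parameters are made up for illustration):

```javascript
// Sketch of final prompt assembly following the template structure above.
function composeSystemPrompt({ userPrompt, toolModules, today }) {
  const system = [
    '<system_instructions>',
    today,             // today's date
    ...toolModules,    // shared instruction modules for enabled tools
    '</system_instructions>',
  ].join('\n');
  const user = userPrompt
    ? `<user_instructions>\n${userPrompt}\n</user_instructions>`
    : '';
  return [system, user].filter(Boolean).join('\n');
}

const prompt = composeSystemPrompt({
  userPrompt: 'Be concise.',
  toolModules: ['[web_search usage notes]'],
  today: '2024-01-01',
});
console.log(prompt.startsWith('<system_instructions>')); // true
```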
- Routes handle HTTP concerns
- Database layer enforces user isolation
- Persistence layer buffers and finalizes writes
- Tool system is modular and registry-based
- Providers abstract upstream API differences
- Provider config injected into handlers
- User context passed through request object
- Database connection singleton via `getDb()`
- SimplifiedPersistence composes ConversationManager, ConversationValidator, etc.
- Providers use adapter pattern for API differences
- Tool system uses registry rather than inheritance
- Enforced at query level (WHERE user_id=...)
- Enforced at parameter level (required userId)
- NOT NULL constraints in schema
- Every function validates userId before access
- Draft messages created at stream start with `status='streaming'`
- Periodic checkpoints based on time (3000 ms) or size (500 chars)
- Recovery of partial content on client disconnect
- Final status update to `'complete'` on successful finish
- Model lists cached per user with TTL
- Background refresh worker updates cache proactively
- Avoids blocking requests on cache expiry
- Provider-specific model filtering applied
- `reasoning_format` parameter supported across compatible models
- Provider-specific transformations in the adapter layer
- Reasoning content streamed separately from main content
- `parent_conversation_id` tracks fork relationships
- Messages copied up to the specified sequence number
- Independent conversation history after fork point
- Linked conversations for model comparison mode
- Active streams tracked by conversation ID
- `POST /v1/chat/completions/stop` endpoint for client-initiated abort
- Automatic cleanup on client disconnect
- Checkpoint persistence triggered before cleanup
The ChatForge backend implements a sophisticated OpenAI-compatible proxy with:
- Layered Security - Session, authentication, and per-user data isolation
- Flexible Request Processing - Adapts to tool orchestration vs. direct proxy based on flags
- Hybrid Checkpoint Persistence - Draft checkpoints during streaming with final writes on completion
- Multi-Provider Support - Factory pattern for different AI providers
- Real-time Streaming - SSE-based with early metadata emission and abort capability
- Modular Tools - Registry-based, decoupled tool system (web search, fetch, journal)
- User-Scoped Data - Every operation filtered by authenticated user
- Conversation Settings Snapshots - Complete state captured per conversation for reproducibility
- Conversation Forking - Fork conversations at any point to explore alternative paths
- Model Comparison - Linked conversations for side-by-side model evaluation
- Stream Recovery - Checkpoint-based recovery for client disconnects
- Prompt Caching - Automatic cache breakpoints for Anthropic models
The architecture prioritizes separation of concerns, defensive validation, and user data isolation while maintaining OpenAI API compatibility and production reliability.