Deployment

idapixl edited this page Mar 15, 2026 · 2 revisions

cortex-engine can run locally (MCP stdio), as a hosted service on Cloud Run, or on your own server.

Local (MCP stdio)

The simplest setup. cortex-engine runs as a subprocess of your AI client:

npm install cortex-engine
# Configure via .mcp.json — see [[MCP Integration]]

Pros: no server to manage; works offline (with Ollama for embeddings).
Cons: only accessible from the local machine.
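For reference, a minimal .mcp.json entry might look like the sketch below. The exact fields are documented in [[MCP Integration]]; the server name and the npx invocation here are assumptions:

```json
{
  "mcpServers": {
    "cortex": {
      "command": "npx",
      "args": ["cortex-engine"]
    }
  }
}
```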

Cloud Run (Hosted)

Run cortex as an HTTP service for remote access, multi-agent setups, or shared memory.

Prerequisites

  • Google Cloud project with:
    • Cloud Run enabled
    • Firestore enabled
    • Artifact Registry (for container images)
  • gcloud CLI authenticated

Create a server wrapper

cortex-engine is a library. To serve it over HTTP, wrap it in a simple server. Here's a minimal example using Hono:

mkdir my-cortex-server && cd my-cortex-server
npm init -y
npm install cortex-engine hono @hono/node-server

Then create index.ts:

// index.ts
import { serve } from '@hono/node-server';
import { Hono } from 'hono';
import { CortexEngine } from 'cortex-engine';

const app = new Hono();
const cortex = new CortexEngine({
  projectId: process.env.GCP_PROJECT_ID!,
});

// Auth middleware (scoped to /api/* so /health stays reachable by health checks)
app.use('/api/*', async (c, next) => {
  const token = c.req.header('x-cortex-token');
  if (token !== process.env.CORTEX_API_TOKEN) {
    return c.json({ error: 'Unauthorized' }, 401);
  }
  await next();
});

// Health check
app.get('/health', (c) => c.json({ status: 'ok' }));

// Query endpoint
app.post('/api/query', async (c) => {
  const { query, limit } = await c.req.json();
  const results = await cortex.query(query, { limit });
  return c.json(results);
});

// Observe endpoint
app.post('/api/observe', async (c) => {
  const { content, tags } = await c.req.json();
  const result = await cortex.observe(content, { tags });
  return c.json(result);
});

const port = Number(process.env.PORT) || 8080; // Cloud Run injects PORT
serve({ fetch: app.fetch, port });
console.log(`Cortex server running on :${port}`);

Deploy

Set environment variables:

export GCP_PROJECT_ID="your-project-id"
export CORTEX_API_TOKEN="your-secret-token"  # for auth

Deploy:

gcloud run deploy cortex \
  --source . \
  --region us-central1 \
  --update-env-vars "GCP_PROJECT_ID=$GCP_PROJECT_ID,CORTEX_API_TOKEN=$CORTEX_API_TOKEN"

Warning: Always use --update-env-vars, never --set-env-vars (which replaces all existing vars).

Authentication

All requests require the x-cortex-token header:

curl https://your-service.run.app/api/query \
  -H "x-cortex-token: your-secret-token" \
  -H "Content-Type: application/json" \
  -d '{"query": "what do I know?"}'
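From code, the same request can be issued with fetch. The helper below is a hypothetical convenience wrapper (not part of cortex-engine); it only mirrors the endpoint, header, and body shape shown in the curl example:

```typescript
// Hypothetical client helper for the /api/query endpoint shown above.
interface QueryRequest {
  url: string;
  init: RequestInit;
}

function buildQueryRequest(
  baseUrl: string,
  token: string,
  query: string,
  limit?: number,
): QueryRequest {
  return {
    url: `${baseUrl}/api/query`,
    init: {
      method: 'POST',
      headers: {
        'x-cortex-token': token,
        'Content-Type': 'application/json',
      },
      // Omit limit from the body when the caller doesn't pass one.
      body: JSON.stringify(limit === undefined ? { query } : { query, limit }),
    },
  };
}

// Usage:
// const { url, init } = buildQueryRequest('https://your-service.run.app', token, 'what do I know?');
// const results = await (await fetch(url, init)).json();
```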

Self-Hosted (VPS)

Same server wrapper as above, just run it directly:

# Build (assumes a "build" script in package.json, e.g. tsc compiling index.ts to dist/)
npm install && npm run build

# Set env vars
export GCP_PROJECT_ID="your-project-id"
export CORTEX_API_TOKEN="your-secret-token"

# Run with PM2 or systemd
pm2 start dist/index.js --name cortex
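If you prefer systemd, a unit along these lines works; the paths and values below are placeholders to adjust for your machine:

```ini
# /etc/systemd/system/cortex.service (example paths)
[Unit]
Description=cortex-engine HTTP server
After=network.target

[Service]
WorkingDirectory=/opt/my-cortex-server
ExecStart=/usr/bin/node dist/index.js
Environment=GCP_PROJECT_ID=your-project-id
Environment=CORTEX_API_TOKEN=your-secret-token
Restart=on-failure

[Install]
WantedBy=multi-user.target
```

Enable it with systemctl enable --now cortex.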

Multi-Agent Topologies

cortex supports several connection patterns:

Topology Description
1:1 One agent, one cortex (default)
1:N One agent, multiple cortex instances (multi-self)
N:1 Multiple agents sharing one cortex (shared memory)
N:N Federated agents with separate cortex instances

Configure in your agent.yaml:

cortex:
  self:
    url: https://my-cortex.run.app
    auth: CORTEX_API_TOKEN
    primary: true
  shared:
    url: https://team-cortex.run.app
    auth: TEAM_CORTEX_TOKEN
    primary: false
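In the 1:N and N:N topologies an agent may query several instances and combine the answers. One simple scheme is score-based merging; the sketch below assumes each result carries a relevance score, which is an assumption about cortex-engine's result shape rather than its documented API:

```typescript
// Hypothetical result shape; cortex-engine's actual query results may differ.
interface CortexResult {
  content: string;
  score: number;
  source: string; // which cortex instance answered
}

// Merge result batches from several instances into a single ranked list.
function mergeResults(batches: CortexResult[][], limit: number): CortexResult[] {
  return batches
    .flat()
    .sort((a, b) => b.score - a.score)
    .slice(0, limit);
}
```

A real setup might also deduplicate near-identical results or weight the primary instance higher, but a flat sort keeps the idea visible.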
