AI-powered networking memory assistant. Capture business cards, QR codes, brochures, and photos at events. Search them later in plain English — "the robotics founder from Wednesday". Built for the Pirates of the Coral-bean hackathon.
- Mobile: React Native (Expo SDK 56), offline-first SQLite
- Server: Node.js + Express + TypeScript
- DB: Postgres 16 + pgvector (Docker, port 5433)
- OCR: Google Cloud Vision (text + face detection)
- Blob storage: Google Cloud Storage (uploaded images)
- Embeddings: OpenAI `text-embedding-3-small` (1536-dim)
- Search: pgvector cosine similarity + MMR diversity re-ranking
- PDF text: `pdf-parse` when a QR resolves to a PDF
- Coral: read-only HTTP endpoints → SQL tables via custom source spec
Copy server/.env.example → server/.env and fill in:
| Key | Why |
|---|---|
| `DATABASE_URL` | Postgres + pgvector (Docker container; see run section) |
| `JWT_SECRET` | Signs mobile session tokens |
| `GOOGLE_CLIENT_ID/SECRET` | Google OAuth for mobile sign-in |
| `GOOGLE_APPLICATION_CREDENTIALS` | Path to GCP service account JSON (Vision + GCS) |
| `GCS_BUCKET` | Bucket name for uploaded images |
| `OPENAI_API_KEY` | Embeddings only; see budget note below |
We use text-embedding-3-small at $0.02 / 1M tokens. Concretely:
- Each capture embedded ≈ 300–2000 tokens → ~$0.000006–$0.00004 per capture
- Each search query embedded ≈ 20–50 tokens → essentially free
- $5 budget ≈ 125,000 captures at the 2000-token upper bound (about 833,000 at 300 tokens), or millions of searches
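
The arithmetic behind these estimates can be checked with a few lines. This is only a sketch: the function names are illustrative and not part of the server code, and the price is the $0.02 / 1M tokens figure quoted above, expressed as tokens per dollar so the math stays in whole numbers.

```typescript
// $0.02 per 1M tokens is the same as 50,000,000 tokens per USD.
const TOKENS_PER_USD = 50_000_000;

// Cost in USD of embedding `tokens` tokens (illustrative helper).
function embeddingCostUsd(tokens: number): number {
  return tokens / TOKENS_PER_USD;
}

// How many items of `tokensPerItem` tokens fit into `budgetUsd`.
function itemsWithinBudget(budgetUsd: number, tokensPerItem: number): number {
  return Math.floor((budgetUsd * TOKENS_PER_USD) / tokensPerItem);
}

const largeCaptures = itemsWithinBudget(5, 2000); // 125000
const smallCaptures = itemsWithinBudget(5, 300);  // 833333
const longQueries = itemsWithinBudget(5, 50);     // 5000000
```

The 125,000 figure is therefore the pessimistic end of the range: it assumes every capture hits the 2000-token upper bound.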
Cost controls already baked in:
- Plain photos (`memory_type='photo'`) are never embedded; they go to the photo gallery instead.
- Text input is capped at 8000 characters before embedding (`services/embeddings.ts:18`).
- Each capture produces exactly one embedding (idempotent insert).
You will not run out of budget during this hackathon.
# 1. Start Postgres + pgvector
docker run -d --name remeet-db \
-e POSTGRES_USER=remeet -e POSTGRES_PASSWORD=remeet -e POSTGRES_DB=remeet \
-p 5433:5432 \
-v "$(pwd)/db/init.sql:/docker-entrypoint-initdb.d/init.sql" \
pgvector/pgvector:pg16
# 2. Server
cd server
npm install
npm run dev # http://localhost:4000
# 3. Mobile
cd ../mobile
npx expo start

User captures offline on phone (SQLite)
↓ tap "Sync"
POST /memory/capture/image (one per image, base64 in JSON body)
POST /memory/batch (QR/URL items in one shot)
↓
JWT auth
↓
┌──── parallel ────┐
↓ ↓
GCS upload Google Vision API
(returns URL) (TEXT + FACE detect)
↓
Route by memory_type:
business_card → save to memory_items → embed → save to semantic_memories
photo → save to memory_items → STOP (no embed; gallery only)
url → scrape (cheerio or pdf-parse) → save → embed
qr_text → save → embed
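
The routing step at the bottom of the diagram above could be expressed as a small lookup. This is a hedged sketch rather than the server's actual code: `RoutableType`, `Step`, and `stepsFor` are hypothetical names, and `profile` is left out because the diagram does not say how it is routed.

```typescript
// The four memory_type values that appear in the routing diagram.
type RoutableType = "business_card" | "photo" | "url" | "qr_text";

type Step = "scrape" | "save" | "embed";

// Ordered processing steps per memory_type, mirroring the diagram.
function stepsFor(memoryType: RoutableType): Step[] {
  switch (memoryType) {
    case "business_card":
    case "qr_text":
      return ["save", "embed"];
    case "photo":
      return ["save"]; // gallery only: photos are never embedded
    case "url":
      return ["scrape", "save", "embed"]; // cheerio or pdf-parse runs first
  }
}

// Only types whose steps include "embed" cost embedding tokens.
function isEmbedded(memoryType: RoutableType): boolean {
  return stepsFor(memoryType).includes("embed");
}
```

Keeping the route in one function makes the cost rule easy to audit: anything that does not return `"embed"` never reaches the embeddings API.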
{
"id": "6c443b14-f0a0-47cc-8fe6-1585c369b992",
"user_id": "00000000-0000-0000-0000-000000000001",
"source_type": "camera",
"memory_type": "business_card",
"content_text": "BlueFetch Robotics\nLet Robots Do the Job\nbluefetch.co\nRequest Demo",
"document_url": "https://storage.googleapis.com/remeetbucket/images/.../609372b9.jpeg",
"interaction_type": "personal",
"has_text": true,
"has_faces": false,
"venue": "Bangalore AI Summit",
"captured_at": "2026-05-31T03:42:00.000Z"
}

{
"id": "a1f2...",
"memory_item_id": "6c443b14-f0a0-47cc-8fe6-1585c369b992",
"user_id": "00000000-0000-0000-0000-000000000001",
"semantic_summary": "BlueFetch Robotics. Let Robots Do the Job. At: Bangalore AI Summit",
"embedding": "[0.0123, -0.045, ..., 0.011]"
}

The summary is the text that was embedded. The embedding is the 1536-dim vector. We do NOT expose the embedding column over the /coral/* endpoints, only the summary.
Why two stages? Vector search is the right tool for "find what's similar." SQL is the right tool for "give me the exact details." Combining them means the vector DB only carries small summaries + IDs, and the relational DB carries the heavy metadata. Token-efficient by design.
User: "robotics person from Wednesday"
↓
[Stage 1 — semantic]
embed query (≈30 tokens, near-zero cost)
↓
pgvector: SELECT id, memory_item_id
FROM semantic_memories
WHERE user_id = $1
ORDER BY embedding <=> $query_vector
LIMIT 50
↓
MMR re-rank → top 10 diverse memory_item_ids
↓
[Stage 2 — relational]
JOIN memory_items mi ON mi.id = sm.memory_item_id
→ full row: content_text, document_url, venue, captured_at, ...
↓
[Response to mobile]
Input: GET /memory/search?context=robotics+person
[
{
"id": "a1f2...",
"memory_item_id": "6c443b14-f0a0-47cc-8fe6-1585c369b992",
"semantic_summary": "BlueFetch Robotics. Let Robots Do the Job. At: Bangalore AI Summit",
"score": 0.8421,
"memory_item": {
"memory_type": "business_card",
"content_text": "BlueFetch Robotics\nLet Robots Do the Job\n...",
"document_url": "https://storage.googleapis.com/.../609372b9.jpeg",
"venue": "Bangalore AI Summit",
"captured_at": "2026-05-31T03:42:00.000Z",
"interaction_type": "personal"
}
}
]

Note the score (cosine similarity, 0–1) and the joined memory_item payload. Mobile renders this directly.
Photos are excluded from search results and served separately via GET /memory/photos?date=YYYY-MM-DD.
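
For readers unfamiliar with the MMR re-rank step in the search flow, here is a minimal sketch of greedy Maximal Marginal Relevance. It is not the server's implementation: the `Candidate` shape, the function names, and the default `lambda = 0.5` are assumptions made for illustration.

```typescript
// Hypothetical candidate shape: an ID plus its embedding vector.
interface Candidate {
  memoryItemId: string;
  embedding: number[];
}

// Cosine similarity of two equal-length vectors (0 if either is all zeros).
function cosineSim(a: number[], b: number[]): number {
  let dot = 0;
  let normA = 0;
  let normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  return normA === 0 || normB === 0 ? 0 : dot / Math.sqrt(normA * normB);
}

// Greedy MMR: repeatedly pick the candidate that maximises
//   lambda * sim(query, c) - (1 - lambda) * max sim(c, already selected)
function mmrRerank(query: number[], candidates: Candidate[], k: number, lambda = 0.5): Candidate[] {
  const remaining = [...candidates];
  const selected: Candidate[] = [];
  while (selected.length < k && remaining.length > 0) {
    let bestIndex = 0;
    let bestScore = -Infinity;
    for (let i = 0; i < remaining.length; i++) {
      const relevance = cosineSim(query, remaining[i].embedding);
      const redundancy =
        selected.length === 0
          ? 0
          : Math.max(...selected.map((s) => cosineSim(remaining[i].embedding, s.embedding)));
      const score = lambda * relevance - (1 - lambda) * redundancy;
      if (score > bestScore) {
        bestScore = score;
        bestIndex = i;
      }
    }
    selected.push(remaining.splice(bestIndex, 1)[0]);
  }
  return selected;
}
```

With `lambda = 1` this degenerates to plain similarity ranking. Lower values trade relevance for variety, which helps when several captures of the same card would otherwise fill the top 10.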
Coral is a local CLI that wraps any HTTP API as queryable SQL tables. ReMeet exposes 4 read-only endpoints under /coral/* (no auth — Coral runs on your laptop), and coral/remeet.yaml tells Coral how to map them to SQL.
# 1. Install Coral
brew install withcoral/tap/coral
# 2. Validate the source spec
coral source lint ./coral/remeet.yaml
# 3. Register it (server must be running)
coral source add --file ./coral/remeet.yaml
# When prompted:
# REMEET_API_BASE → http://localhost:4000
# 4. Confirm it shows up
coral source list
coral source info remeet --verbose

| Table | What |
|---|---|
| `remeet.memory_items` | every capture, all metadata |
| `remeet.semantic_memories` | embeddable items + summary (no vector column) |
| `remeet.users` | accounts |
| `remeet.search` | vector + MMR search exposed as a table: `WHERE user_id=... AND q=...` |
# All business cards, most recent first
coral sql "SELECT memory_type, content_text, venue, captured_at
FROM remeet.memory_items
WHERE memory_type = 'business_card'
ORDER BY captured_at DESC
LIMIT 10"
# Semantic search via Coral
coral sql "SELECT memory_type, semantic_summary, score, venue
FROM remeet.search
WHERE user_id = '00000000-0000-0000-0000-000000000001'
AND q = 'robotics conference'"
# Personal conversations at conference-named venues
coral sql "SELECT semantic_summary, venue, captured_at
FROM remeet.memory_items mi
JOIN remeet.semantic_memories sm ON sm.memory_item_id = mi.id
WHERE mi.interaction_type = 'personal'
AND mi.venue ILIKE '%conference%'"

cd server
npx tsc --noEmit # surface any TypeScript errors
npm run dev # see runtime error

Common causes:
- Missing env var → check `.env` against `.env.example`
- `OPENAI_API_KEY` not set → the embeddings client is lazy now, so the server starts, but the first `/memory/capture/*` request will fail with "OPENAI_API_KEY is not set"
- Port 4000 already in use → `lsof -i :4000` and kill it
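
The first two causes above can be illustrated with a short sketch. Everything here is hypothetical (`EmbeddingsClient` is a stand-in for the real OpenAI client, and `makeLazyClient` and `missingKeys` are not names from the codebase). It only shows why a lazy client lets the server boot but fails on first use.

```typescript
type Env = Record<string, string | undefined>;

// Stand-in for the real OpenAI client; only the shape matters here.
interface EmbeddingsClient {
  apiKey: string;
}

// Returns a getter that builds the client on first call, not at startup.
function makeLazyClient(env: Env): () => EmbeddingsClient {
  let client: EmbeddingsClient | undefined;
  return () => {
    if (client) return client; // reuse the instance after the first call
    const apiKey = env["OPENAI_API_KEY"];
    if (!apiKey) throw new Error("OPENAI_API_KEY is not set");
    client = { apiKey };
    return client;
  };
}

// Lists which of the required keys are absent or empty in `env`.
function missingKeys(required: string[], env: Env): string[] {
  return required.filter((key) => !env[key]);
}
```

Running a check like `missingKeys(["DATABASE_URL", "JWT_SECRET", "OPENAI_API_KEY"], process.env)` at startup would surface the problem before the first capture request does.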
The Postgres container isn't running:
docker ps | grep remeet-db # should show Up
docker logs remeet-db --tail 30 # if not running, see why
docker start remeet-db # if it exists but is stopped

Note: we use port 5433 (not 5432) to avoid clashing with any other Postgres on your machine.
You probably have a different Postgres on 5432. Either stop it or confirm DATABASE_URL points to 5433:
grep DATABASE_URL server/.env
# Expected: postgres://remeet:remeet@localhost:5433/remeet

The body limit is already set to 20 MB in server/src/index.ts:11. If you bumped capture size, raise it further:
app.use(express.json({ limit: '50mb' }));

Your GCS bucket uses uniform bucket-level access. We don't call makePublic(); files become public via the bucket-level policy. If images return 403:
- GCP Console → Cloud Storage → bucket → Permissions → add `allUsers` → Storage Object Viewer.
# Are there any rows to search over?
docker exec remeet-db psql -U remeet -d remeet -c \
"SELECT count(*) FROM semantic_memories;"

If 0:
- You captured before adding `OPENAI_API_KEY` → embeddings silently failed. Re-sync captures.
- Or: everything you captured was a plain photo (we don't embed those). Confirm with:

docker exec remeet-db psql -U remeet -d remeet -c \
"SELECT memory_type, count(*) FROM memory_items GROUP BY memory_type;"
curl "http://localhost:4000/coral/memory_items?limit=2" # should return JSON (quoted so zsh does not glob the ?)
curl http://localhost:4000/health # {"status":"ok"}

If those work but coral sql fails, run coral source info remeet --verbose to see the resolved REMEET_API_BASE.
docker exec -it remeet-db psql -U remeet -d remeet
\dt -- list tables
SELECT memory_type, count(*)
FROM memory_items GROUP BY memory_type; -- distribution of captures
SELECT id, semantic_summary
FROM semantic_memories LIMIT 5; -- preview embeddings

Phone isn't on the same WiFi as the laptop running the server. Check both are on the same SSID.
- `users`: Google-signed-in accounts (one demo user pre-seeded)
- `meetings`: optional session grouping (column exists, not populated by mobile)
- `memory_items`: one row per capture, with `memory_type` ∈ {`business_card`, `photo`, `url`, `qr_text`, `profile`}
- `semantic_memories`: one row per embeddable item; pgvector `embedding` + human-readable `semantic_summary`
Foreign key chain: semantic_memories.memory_item_id → memory_items.id. Both tables carry user_id directly for fast filter-before-search.
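
The foreign key chain and the filter-before-search idea can be sketched in TypeScript. The field lists below are inferred from the sample JSON in this README and trimmed for brevity, so treat the exact types and the `joinForUser` helper as assumptions rather than the real schema or server code.

```typescript
type MemoryType = "business_card" | "photo" | "url" | "qr_text" | "profile";

// Subset of memory_items columns, inferred from the sample row.
interface MemoryItem {
  id: string;
  user_id: string;
  memory_type: MemoryType;
  content_text: string;
  venue: string;
  captured_at: string; // ISO 8601 timestamp
}

// semantic_memories without the vector column.
interface SemanticMemory {
  id: string;
  memory_item_id: string; // FK → memory_items.id
  user_id: string;
  semantic_summary: string;
}

interface JoinedHit {
  semantic_summary: string;
  memory_item: MemoryItem;
}

// Filter both sides by user_id first, then follow the FK to the full row.
function joinForUser(userId: string, memories: SemanticMemory[], items: MemoryItem[]): JoinedHit[] {
  const byId = new Map<string, MemoryItem>();
  for (const item of items) {
    if (item.user_id === userId) byId.set(item.id, item);
  }
  const hits: JoinedHit[] = [];
  for (const sm of memories) {
    if (sm.user_id !== userId) continue;
    const item = byId.get(sm.memory_item_id);
    if (item) hits.push({ semantic_summary: sm.semantic_summary, memory_item: item });
  }
  return hits;
}
```

In the database the same thing is a `WHERE user_id = $1` on `semantic_memories` followed by a join to `memory_items`. The in-memory version just makes the two steps explicit.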