Commit 760ae55

docs: update default embedding model to jina-embeddings-v2-small-en
Replace Xenova/bge-m3 (1024 dims, ~560 MB) with Xenova/jina-embeddings-v2-small-en (512 dims, ~33 MB) as the default embedding model. Update all docs, examples, site content, and tests. Also fix bge-small/bge-base pooling from cls to mean per HuggingFace docs.
1 parent 3ad85ab commit 760ae55

23 files changed

Lines changed: 130 additions & 112 deletions

SPEC.md

Lines changed: 1 addition & 1 deletion
@@ -43,7 +43,7 @@ See [docs/api-mcp.md](docs/api-mcp.md) for schemas and [docs/mcp-tools-guide.md]
 ## Key features
 
 - **Hybrid search**: BM25 + vector cosine, fused via RRF, BFS graph expansion — [docs/search.md](docs/search.md)
-- **Embeddings**: local ONNX (Xenova/bge-m3 default) or remote HTTP proxy — [docs/embeddings.md](docs/embeddings.md)
+- **Embeddings**: local ONNX (Xenova/jina-embeddings-v2-small-en default) or remote HTTP proxy — [docs/embeddings.md](docs/embeddings.md)
 - **File mirror**: `.notes/`, `.tasks/`, `.skills/` markdown files with reverse import — [docs/file-mirror.md](docs/file-mirror.md)
 - **Cross-graph links**: phantom proxy nodes connecting any graph to any graph — [docs/graphs-overview.md](docs/graphs-overview.md)
 - **Auth**: password login (JWT cookies) + API keys (Bearer) — [docs/authentication.md](docs/authentication.md)

docs/architecture.md

Lines changed: 1 addition & 1 deletion
@@ -21,7 +21,7 @@ graph TD
 
 subgraph Embed["Embedding Layer"]
 ONNX["ONNX Runtime"]
-Models["bge-m3 / jina-code"]
+Models["jina-small / jina-code"]
 end
 
 Indexer --> Embed

docs/concepts-docs-indexing.md

Lines changed: 1 addition & 1 deletion
@@ -76,7 +76,7 @@ This means you can search for code examples by the symbols they define, or by se
 
 ### Step 4: Embed everything
 
-Each chunk is embedded into a vector using the configured model (default: `Xenova/bge-m3`). The embedding captures the **semantic meaning** of `title + content`, enabling similarity-based search.
+Each chunk is embedded into a vector using the configured model (default: `Xenova/jina-embeddings-v2-small-en`). The embedding captures the **semantic meaning** of `title + content`, enabling similarity-based search.
 
 Root nodes additionally get a `fileEmbedding` — embedded from `file path + h1 title` — used for file-level search ("find docs about authentication").
 

docs/configuration.md

Lines changed: 8 additions & 8 deletions
@@ -2,7 +2,7 @@
 
 ## Zero-config mode
 
-No config file needed. Just run `graphmemory serve` in your project directory — the current directory becomes the project with sensible defaults (BGE-M3 q8 model, all graphs enabled).
+No config file needed. Just run `graphmemory serve` in your project directory — the current directory becomes the project with sensible defaults (jina-small q8 model, all graphs enabled).
 
 ## Config file
 
@@ -60,8 +60,8 @@ server:
 search: 120
 auth: 10
 model:
-name: "Xenova/bge-m3"
-pooling: "cls"
+name: "Xenova/jina-embeddings-v2-small-en"
+pooling: "mean"
 normalize: true
 dtype: "q8"
 queryPrefix: ""
@@ -100,7 +100,7 @@ projects:
 name: "Project Bot"
 email: "bot@example.com"
 model:
-name: "Xenova/bge-m3"
+name: "Xenova/jina-embeddings-v2-small-en"
 embedding:
 maxChars: 24000
 access:
@@ -111,7 +111,7 @@ projects:
 include: "**/*.md"
 exclude: "**/drafts/**"
 model:
-name: "Xenova/bge-m3"
+name: "Xenova/bge-m3" # override: use multilingual model for docs
 pooling: "cls"
 normalize: true
 access:
@@ -140,7 +140,7 @@ workspaces:
 access:
 alice: rw
 model:
-name: "Xenova/bge-m3"
+name: "Xenova/jina-embeddings-v2-small-en"
 embedding:
 maxChars: 24000
 ```
@@ -195,8 +195,8 @@ graphs.code.model → project.codeModel → server.codeModel → code defaults
 
 | Field | Type | Default (general / code) | Description |
 |-------|------|---------|-------------|
-| `name` | string | `Xenova/bge-m3` / `jinaai/jina-embeddings-v2-base-code` | HuggingFace model ID |
-| `pooling` | string | `cls` / `mean` | Pooling strategy: `mean` or `cls` |
+| `name` | string | `Xenova/jina-embeddings-v2-small-en` / `jinaai/jina-embeddings-v2-base-code` | HuggingFace model ID |
+| `pooling` | string | `mean` / `mean` | Pooling strategy: `mean` or `cls` |
 | `normalize` | boolean | `true` | L2-normalize output vectors |
 | `dtype` | string | `q8` | Quantization: `fp32`, `fp16`, `q8`, `q4` |
 | `queryPrefix` | string | `""` | Prefix prepended to search queries |
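
The `pooling` field above switches between two ways of collapsing per-token vectors into one embedding. The following is a minimal Python sketch of the two strategies, not the project's implementation: the function names `cls_pool` and `mean_pool` and the plain-list representation are assumptions made for illustration, and the real ONNX pipeline operates on tensors.

```python
from typing import List, Sequence


def cls_pool(token_vectors: Sequence[Sequence[float]]) -> List[float]:
    """Hypothetical helper: return the first token's vector (the [CLS] position)."""
    if not token_vectors:
        raise ValueError("no token vectors to pool")
    return list(token_vectors[0])


def mean_pool(
    token_vectors: Sequence[Sequence[float]],
    attention_mask: Sequence[int],
) -> List[float]:
    """Hypothetical helper: average the vectors of non-padding tokens only."""
    if len(token_vectors) != len(attention_mask):
        raise ValueError("mask length must match token count")
    # Keep only tokens whose mask entry is non-zero (padding is masked out).
    kept = [vec for vec, keep in zip(token_vectors, attention_mask) if keep]
    if not kept:
        raise ValueError("attention mask excludes every token")
    dim = len(kept[0])
    return [sum(vec[i] for vec in kept) / len(kept) for i in range(dim)]


if __name__ == "__main__":
    tokens = [[1.0, 2.0], [3.0, 4.0], [100.0, 100.0]]  # last token is padding
    mask = [1, 1, 0]
    print("cls :", cls_pool(tokens))
    print("mean:", mean_pool(tokens, mask))
```

Because the two strategies generally produce different vectors for the same input, an index built with one setting would presumably need re-embedding after a switch to the other.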

docs/docker.md

Lines changed: 1 addition & 1 deletion
@@ -76,7 +76,7 @@ docker compose up -d
 
 ### Model cache
 
-The default embedding model (`Xenova/bge-m3`, ~560 MB) downloads on first startup. Use a **named volume** so the model persists across container restarts.
+The default embedding model (`Xenova/jina-embeddings-v2-small-en`, ~33 MB) downloads on first startup. Use a **named volume** so the model persists across container restarts.
 
 ## Config for Docker
 

docs/embeddings.md

Lines changed: 28 additions & 19 deletions
@@ -6,12 +6,12 @@ The embedding system converts text into high-dimensional vectors for semantic se
 
 ## Default models
 
-**Xenova/bge-m3** — the default embedding model (docs, knowledge, tasks, skills, files):
-- 1024 dimensions
-- Multilingual (100+ languages)
+**Xenova/jina-embeddings-v2-small-en** — the default embedding model (docs, knowledge, tasks, skills, files):
+- 512 dimensions
+- English, 33M parameters (4 transformer layers)
 - 8K token context
-- ~560 MB download size
-- Pooling: `cls`
+- ~33 MB download size (q8)
+- Pooling: `mean`
 - Normalization: L2-normalized (cosine similarity = dot product)
 
 **jinaai/jina-embeddings-v2-base-code** — the default code graph model:
@@ -21,7 +21,7 @@ The embedding system converts text into high-dimensional vectors for semantic se
 - Pooling: `mean`
 - Normalization: L2-normalized
 
-The code graph uses a separate model inheritance chain (`codeModel`) so it can use a code-optimized model by default while other graphs use BGE-M3.
+The code graph uses a separate model inheritance chain (`codeModel`) so it can use a code-optimized model by default while other graphs use jina-small.
 
 ## Model registry
 
@@ -77,8 +77,8 @@ graph.model → project.codeModel → server.codeModel → code defaults (cod
 
 | Field | Type | Default | Description |
 |-------|------|---------|-------------|
-| `name` | string | `Xenova/bge-m3` | HuggingFace model ID |
-| `pooling` | string | `cls` | Pooling strategy: `mean` or `cls` |
+| `name` | string | `Xenova/jina-embeddings-v2-small-en` | HuggingFace model ID |
+| `pooling` | string | `mean` | Pooling strategy: `mean` or `cls` |
 | `normalize` | boolean | `true` | L2-normalize output vectors |
 | `dtype` | string | `q8` | Quantization: `fp32`, `fp16`, `q8`, `q4` |
 | `queryPrefix` | string | `""` | Prefix prepended to search queries |
@@ -103,7 +103,16 @@ graph.embedding → project.embedding → server.embedding → defaults
 
 ## Model examples
 
-### BGE-M3 (default, recommended)
+### jina-embeddings-v2-small-en (default)
+
+```yaml
+model:
+name: "Xenova/jina-embeddings-v2-small-en"
+pooling: "mean"
+normalize: true
+```
+
+### BGE-M3 (multilingual, larger)
 
 ```yaml
 model:
@@ -117,7 +126,7 @@ model:
 ```yaml
 model:
 name: "Xenova/bge-base-en-v1.5"
-pooling: "cls"
+pooling: "mean"
 normalize: true
 queryPrefix: "Represent this sentence for searching relevant passages: "
 ```
@@ -127,7 +136,7 @@ model:
 ```yaml
 model:
 name: "Xenova/bge-small-en-v1.5"
-pooling: "cls"
+pooling: "mean"
 normalize: true
 queryPrefix: "Represent this sentence for searching relevant passages: "
 ```
@@ -156,10 +165,10 @@ model:
 
 ```yaml
 model:
-name: "Xenova/bge-m3"
-pooling: "cls"
+name: "Xenova/jina-embeddings-v2-small-en"
+pooling: "mean"
 normalize: true
-dtype: "q8" # fp32, fp16, q8, q4
+dtype: "q4" # fp32, fp16, q8, q4
 ```
 
 ## Remote embedding
@@ -222,7 +231,7 @@ server:
 { "embeddings": [[0.1, 0.2, ...], [0.3, 0.4, ...]] }
 ```
 
-The `model` parameter selects which embedding model to use: `"default"` (general, BGE-M3) or `"code"` (code-optimized, jina-code). Both models are loaded when the embedding API is enabled.
+The `model` parameter selects which embedding model to use: `"default"` (general, jina-small) or `"code"` (code-optimized, jina-code). Both models are loaded when the embedding API is enabled.
 
 ### Embedding API configuration
 
@@ -271,19 +280,19 @@ projects:
 my-app:
 projectDir: "/path/to/my-app"
 model:
-name: "Xenova/bge-m3" # default for most graphs
+name: "Xenova/bge-m3" # multilingual model for most graphs
 pooling: "cls"
 normalize: true
 graphs:
 files:
 model:
-name: "Xenova/bge-small-en-v1.5" # smaller model for file paths
-pooling: "cls"
+name: "Xenova/jina-embeddings-v2-small-en" # lighter model for file paths
+pooling: "mean"
 normalize: true
 code:
 model:
 name: "Xenova/bge-base-en-v1.5" # different model for code
-pooling: "cls"
+pooling: "mean"
 normalize: true
 ```
 
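
The docs/embeddings.md changes keep the note that L2-normalized vectors make cosine similarity equal to the dot product. A short Python sketch of why that holds is given here for illustration only; the helper names `l2_normalize`, `dot`, and `cosine` are assumptions and not part of the project.

```python
import math
from typing import List, Sequence


def l2_normalize(vec: Sequence[float]) -> List[float]:
    """Scale a vector to unit length; a zero vector is returned unchanged."""
    norm = math.sqrt(sum(x * x for x in vec))
    if norm == 0.0:
        return list(vec)
    return [x / norm for x in vec]


def dot(a: Sequence[float], b: Sequence[float]) -> float:
    """Plain dot product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))


def cosine(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine similarity computed from the raw (unnormalized) vectors."""
    return dot(a, b) / (math.sqrt(dot(a, a)) * math.sqrt(dot(b, b)))


if __name__ == "__main__":
    a, b = [3.0, 4.0], [1.0, 0.0]
    # After normalization both norms are 1, so the cosine denominator vanishes
    # and a bare dot product gives the same score.
    print(cosine(a, b), dot(l2_normalize(a), l2_normalize(b)))
```

With `normalize: true`, a search can therefore score candidates with a dot product alone and skip the per-comparison norm computation.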

docs/indexer.md

Lines changed: 2 additions & 2 deletions
@@ -25,8 +25,8 @@ flowchart TD
 During initial indexing, the three queues run **sequentially by phase** rather than concurrently. This ensures only one embedding model is loaded at a time, reducing peak memory:
 
 ```
-Phase 1: docs → scan(docs) + drain(docs) — triggers bge-m3 lazy load
-Phase 2: files → scan(files) + drain(files) — reuses bge-m3 (already loaded)
+Phase 1: docs → scan(docs) + drain(docs) — triggers jina-small lazy load
+Phase 2: files → scan(files) + drain(files) — reuses jina-small (already loaded)
 Phase 3: code → scan(code) + drain(code) — triggers jina-code lazy load
 Finalize: rebuildDirectoryStats, resolvePendingLinks, scanMirrorDirs (K/T/S)
 ```
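
The phase ordering in this hunk can be illustrated with a small Python sketch of lazy model loading. Everything in it is hypothetical: `LazyModelRegistry`, `run_initial_index`, and the stand-in embed functions are invented for illustration, and the sketch only shows when each model is first loaded, not how the real indexer manages memory.

```python
from typing import Callable, Dict, List, Sequence, Tuple

EmbedFn = Callable[[str], List[float]]


class LazyModelRegistry:
    """Hypothetical registry: a model is loaded the first time it is requested."""

    def __init__(self, loaders: Dict[str, Callable[[], EmbedFn]]) -> None:
        self._loaders = loaders
        self._loaded: Dict[str, EmbedFn] = {}
        self.load_log: List[str] = []  # order in which models were loaded

    def get(self, name: str) -> EmbedFn:
        if name not in self._loaded:
            self.load_log.append(name)
            self._loaded[name] = self._loaders[name]()
        return self._loaded[name]


def run_initial_index(
    phases: Sequence[Tuple[str, str, Sequence[str]]],
    registry: LazyModelRegistry,
) -> List[Tuple[str, List[float]]]:
    """Drain each phase fully before starting the next one."""
    results: List[Tuple[str, List[float]]] = []
    for graph, model_name, items in phases:
        for item in items:
            embed = registry.get(model_name)  # loads on first use only
            results.append((graph, embed(item)))
    return results


if __name__ == "__main__":
    registry = LazyModelRegistry({
        # Stand-in "models" that return a one-element vector.
        "jina-small": lambda: (lambda text: [float(len(text))]),
        "jina-code": lambda: (lambda text: [float(len(text)) * 2]),
    })
    phases = [
        ("docs", "jina-small", ["README.md"]),
        ("files", "jina-small", ["src/index.ts"]),
        ("code", "jina-code", ["function main() {}"]),
    ]
    run_initial_index(phases, registry)
    print(registry.load_log)
```

Running docs and files back to back means the general model is loaded once and reused, and the code model is not touched until the third phase begins.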

docs/overview.md

Lines changed: 2 additions & 2 deletions
@@ -10,7 +10,7 @@
 - **Stores knowledge** (facts, notes, decisions) in a dedicated knowledge graph with typed relations, file attachments, and cross-graph links
 - **Tracks tasks** with kanban workflow, priorities, due dates, estimates, assignees, and cross-graph links
 - **Manages skills** (reusable recipes/procedures) with steps, triggers, usage tracking, and cross-graph links
-- **Embeds every node** locally using `Xenova/bge-m3` by default (no external API calls); supports per-graph models with configurable pooling, normalization, dtype, and prefixes
+- **Embeds every node** locally using `Xenova/jina-embeddings-v2-small-en` by default (no external API calls); supports per-graph models with configurable pooling, normalization, dtype, and prefixes
 - **Answers search queries** via hybrid search (BM25 keyword + vector cosine similarity) with BFS graph expansion
 - **Watches for file changes** and re-indexes incrementally in real time
 
@@ -45,7 +45,7 @@
 ## Requirements
 
 - **Node.js** >= 22
-- The default embedding model (`Xenova/bge-m3`, ~560 MB) downloads on first startup
+- The default embedding model (`Xenova/jina-embeddings-v2-small-en`, ~33 MB) downloads on first startup
 
 ## Repository
 

graph-memory.yaml.example

Lines changed: 29 additions & 20 deletions
@@ -77,8 +77,8 @@ server:
 # Default model config (fallback for all graphs except code).
 # Taken as a whole object from the first level that defines it (no field-by-field merge).
 # model:
-# name: "Xenova/bge-m3" # HuggingFace model ID (default: Xenova/bge-m3)
-# pooling: "cls" # Pooling strategy: "mean" or "cls"
+# name: "Xenova/jina-embeddings-v2-small-en" # HuggingFace model ID (default)
+# pooling: "mean" # Pooling strategy: "mean" or "cls"
 # normalize: true # L2-normalize output vectors
 # dtype: "q8" # Quantization: fp32, fp16, q8, q4 (default: q8)
 # queryPrefix: "" # Prefix prepended to search queries
@@ -130,8 +130,8 @@ projects:
 
 # Per-project model config (overrides server.model — taken as whole object)
 # model:
-# name: "Xenova/bge-m3"
-# pooling: "cls"
+# name: "Xenova/jina-embeddings-v2-small-en"
+# pooling: "mean"
 # normalize: true
 
 # Per-project code model (overrides server.codeModel — separate chain for code graph)
@@ -164,7 +164,7 @@ projects:
 # include: "**/*.md" # Glob for markdown files (default: "**/*.md")
 # exclude: "**/changelog/**" # Additional exclude (merged with project + server)
 # model: # Full model config (no merge with parent)
-# name: "Xenova/bge-m3"
+# name: "Xenova/bge-m3" # override: use multilingual model for docs
 # pooling: "cls"
 # normalize: true
 # embedding: # Embedding config (field-by-field merge with parent)
@@ -224,8 +224,8 @@ projects:
 # # exclude: "**/vendor/**" # Additional exclude (merged with server default)
 # # Workspace-level model (overrides server.model for shared graphs)
 # # model:
-# # name: "Xenova/bge-m3"
-# # pooling: "cls"
+# # name: "Xenova/jina-embeddings-v2-small-en"
+# # pooling: "mean"
 # # normalize: true
 # # Workspace-level embedding (overrides server.embedding for shared graphs)
 # # embedding:
@@ -235,7 +235,7 @@ projects:
 # # knowledge:
 # # enabled: true
 # # model:
-# # name: "Xenova/bge-m3"
+# # name: "Xenova/jina-embeddings-v2-small-en"
 # # embedding:
 # # maxChars: 16000
 
@@ -245,9 +245,17 @@ projects:
 # Below are examples of how to configure different embedding models.
 # Copy the relevant `model:` block into server, project, or graphs section.
 #
-# ── Default: BGE-M3 (recommended) ─────────────────────────────────────────
-# Best general-purpose model. 1024 dimensions, multilingual (100+ languages),
-# 8K token context. Works out of the box with default settings.
+# ── Default: jina-embeddings-v2-small-en ──────────────────────────────────
+# Lightweight English model. 512 dimensions, 33M parameters, 8K token context.
+# ~33 MB download (q8). Works out of the box with default settings.
+#
+# model:
+# name: "Xenova/jina-embeddings-v2-small-en"
+# pooling: "mean"
+# normalize: true
+#
+# ── BGE-M3 (multilingual) ────────────────────────────────────────────────
+# Best multilingual model. 1024 dimensions, 100+ languages, 8K context, ~560 MB.
 #
 # model:
 # name: "Xenova/bge-m3"
@@ -260,7 +268,7 @@ projects:
 #
 # model:
 # name: "Xenova/bge-base-en-v1.5"
-# pooling: "cls"
+# pooling: "mean"
 # normalize: true
 # queryPrefix: "Represent this sentence for searching relevant passages: "
 #
@@ -269,7 +277,7 @@ projects:
 #
 # model:
 # name: "Xenova/bge-small-en-v1.5"
-# pooling: "cls"
+# pooling: "mean"
 # normalize: true
 # queryPrefix: "Represent this sentence for searching relevant passages: "
 #
@@ -298,20 +306,21 @@ projects:
 # Options: fp32 (default), fp16, q8, q4
 #
 # model:
-# name: "Xenova/bge-m3"
-# pooling: "cls"
+# name: "Xenova/jina-embeddings-v2-small-en"
+# pooling: "mean"
 # normalize: true
-# dtype: "q8"
+# dtype: "q4"
 #
 # ── Mixed config: different models per graph ──────────────────────────────
 # The code graph defaults to jinaai/jina-embeddings-v2-base-code via `codeModel`.
-# Other graphs default to Xenova/bge-m3 via `model`. You can override per-graph:
+# Other graphs default to Xenova/jina-embeddings-v2-small-en via `model`.
+# You can override per-graph:
 #
 # projects:
 # my-app:
 # projectDir: "/path/to/my-app"
 # model:
-# name: "Xenova/bge-m3" # override: multilingual model for most graphs
+# name: "Xenova/bge-m3" # override: multilingual model for most graphs
 # pooling: "cls"
 # normalize: true
 # codeModel:
@@ -321,6 +330,6 @@ projects:
 # graphs:
 # files:
 # model:
-# name: "Xenova/bge-small-en-v1.5" # smaller model for file paths
-# pooling: "cls"
+# name: "Xenova/jina-embeddings-v2-small-en" # lighter model for file paths
+# pooling: "mean"
 # normalize: true
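
The example file's comments say a `model` block is taken as a whole object from the first level that defines it, with no field-by-field merge. The Python sketch below illustrates that rule under stated assumptions: `resolve_model` and the dictionary shapes are hypothetical, the level order (graph, then project, then server, then defaults) follows the general chain described in the docs, and how the real loader fills fields omitted from a chosen block is not covered by this diff and is left out.

```python
from typing import Any, Dict, Optional

ModelConfig = Dict[str, Any]

# Defaults as documented in this commit for the general (non-code) chain.
DEFAULT_MODEL: ModelConfig = {
    "name": "Xenova/jina-embeddings-v2-small-en",
    "pooling": "mean",
    "normalize": True,
    "dtype": "q8",
    "queryPrefix": "",
}


def resolve_model(
    graph: Optional[Dict[str, Any]],
    project: Optional[Dict[str, Any]],
    server: Optional[Dict[str, Any]],
    defaults: ModelConfig = DEFAULT_MODEL,
) -> ModelConfig:
    """Hypothetical resolver: first level with a `model` key wins, unmerged."""
    for level in (graph, project, server):
        model = (level or {}).get("model")
        if model is not None:
            # Whole-object semantics: nothing is inherited from lower levels.
            return dict(model)
    return dict(defaults)


if __name__ == "__main__":
    graph_cfg = {"model": {"name": "Xenova/bge-m3", "pooling": "cls", "normalize": True}}
    project_cfg = {"model": {"name": "Xenova/jina-embeddings-v2-small-en", "pooling": "mean"}}
    print(resolve_model(graph_cfg, project_cfg, None))
    print(resolve_model(None, None, None))
```

One practical consequence of whole-object resolution is that an override block should spell out `pooling` and `normalize` itself, which is what the example blocks in the file do.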

site/blog/2026-03-23-getting-started-5-minutes.md

Lines changed: 1 addition & 1 deletion
@@ -23,7 +23,7 @@ No config file needed. Graph Memory uses your current directory as the project.
 You'll see output like:
 
 ```
-INFO Registered model (lazy) model="Xenova/bge-m3"
+INFO Registered model (lazy) model="Xenova/jina-embeddings-v2-small-en"
 INFO Starting indexing phase phase="1/3 docs"
 INFO Starting indexing phase phase="2/3 files"
 INFO Starting indexing phase phase="3/3 code"
