2 changes: 2 additions & 0 deletions README.md
@@ -784,6 +784,8 @@ This allows the Agent to get "smarter with use" through interactions with the wo

For more details, please visit our [Full Documentation](./docs/en/).

If a Docker upgrade leaves your container failing to start (for example with `ModuleNotFoundError: No module named 'openviking.console.bootstrap'` or `EmbeddingRebuildRequiredError`), see the [Upgrades and Migrations guide](./docs/en/guides/14-upgrades-and-migrations.md).

### Community & Team

For more details, please see: **[About Us](./docs/en/about/01-about-us.md)**
2 changes: 2 additions & 0 deletions README_CN.md
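@@ -827,6 +827,8 @@ OpenViking has a built-in memory self-iteration loop. At the end of each session, developers

For more details, please visit our [Full Documentation](./docs/zh/).

If the container fails to start after a Docker upgrade (for example with `ModuleNotFoundError: No module named 'openviking.console.bootstrap'` or `EmbeddingRebuildRequiredError`), see the [Upgrades and Migrations guide](./docs/zh/guides/14-upgrades-and-migrations.md).

### Community & Team

For more details, please see: **[About Us](./docs/zh/about/01-about-us.md)**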
10 changes: 10 additions & 0 deletions docs/en/faq/faq.md
@@ -372,6 +372,15 @@ This strategy finds semantically matching fragments while understanding the comp
3. **Use local storage**: Use `local` backend during development to reduce network latency
4. **Async operations**: Fully utilize `AsyncOpenViking` / `AsyncHTTPClient`'s async capabilities

### I can't upgrade my Docker container — what do I do?

Two specific failures cause most upgrade reports:

- `ModuleNotFoundError: No module named 'openviking.console.bootstrap'`: Web Studio is bundled into `openviking-server` starting in v0.3.19, so a `command:` override that still launches the standalone bootstrap module makes the container exit immediately. Remove that invocation.
- `EmbeddingRebuildRequiredError`: the embedding model, provider, or dimension in `ov.conf` no longer matches the existing `vectordb/context` collection. Either roll the embedding config back, or back up and rebuild only `vectordb/context/`, then re-run `ov reindex` for each namespace.

The full step-by-step recovery, including a `docker-compose.yml` before/after example and the exact `ov reindex` invocations, is in [Upgrades and Migrations](../guides/14-upgrades-and-migrations.md).

## Deployment

### What's the difference between embedded mode and service mode?
@@ -400,3 +409,4 @@ Yes, OpenViking main project is open source under the AGPL-3.0 license, and exam
- [Architecture Overview](../concepts/01-architecture.md) - Deep dive into system design
- [Retrieval Mechanism](../concepts/07-retrieval.md) - Detailed retrieval process
- [Configuration Guide](../guides/01-configuration.md) - Complete configuration reference
- [Upgrades and Migrations](../guides/14-upgrades-and-migrations.md) - Recover from upgrade-time startup failures
187 changes: 187 additions & 0 deletions docs/en/guides/14-upgrades-and-migrations.md
@@ -0,0 +1,187 @@
# Upgrades and Migrations

This guide collects the recovery steps for the upgrade-time blockers that
have surfaced most often in real deployments. If your container exits at
boot after pulling a newer image, start here before filing an issue.

## When to read this guide

- You are upgrading an existing OpenViking deployment to a newer
release.
- The server fails to start after the upgrade (the container exits or
the health check never goes green).
- You see `ModuleNotFoundError: No module named 'openviking.console.bootstrap'`
in the container logs.
- You see `EmbeddingRebuildRequiredError` in the server logs.

## Before you upgrade

A few minutes of preparation makes every other step in this guide
recoverable. Do all of these before pulling a new image.

- **Snapshot your data directory.** This is the directory mounted into
the container at `/app/.openviking` (typically `~/.openviking` on the
host). The two paths that matter for retrieval are the AGFS root and
`vectordb/`. A simple `cp -a` or `tar` of the whole directory while
the server is stopped is enough; you do not need a live backup tool.
- **Note your current `ov.conf`.** Embedding model, provider, and
dimension are the fields most likely to drift between versions and
to break startup. Keep a copy of the file you were running with so
you can roll back if the upgrade fails.
- **Stop the server gracefully.** Use `docker stop <container>` (or
`docker compose down`). Avoid `docker kill`, which sends `SIGKILL`
by default: the vector index relies on a clean shutdown to release
locks under `vectordb/<collection>/store/LOCK`, and a hard kill can
leave a stale lock that blocks the next start.
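
The snapshot and config-copy steps above can be scripted. The following is a minimal sketch, not part of OpenViking: the function name `snapshot_openviking`, the archive naming scheme, and the backup location are all illustrative assumptions. Run it only after the server has been stopped.

```bash
#!/usr/bin/env bash
# Sketch (hypothetical helper): archive the data directory and keep a copy
# of ov.conf before pulling a new image. Run only while the server is stopped.

# snapshot_openviking DATA_DIR CONF_FILE BACKUP_DIR
# Prints the path of the archive it created.
snapshot_openviking() {
  local data_dir="$1" conf_file="$2" backup_dir="$3"
  local stamp
  stamp="$(date +%Y%m%d-%H%M%S)"

  [ -d "$data_dir" ] || { echo "data dir not found: $data_dir" >&2; return 1; }
  [ -f "$conf_file" ] || { echo "config not found: $conf_file" >&2; return 1; }
  mkdir -p "$backup_dir" || return 1

  # Archive the whole data directory, including the AGFS root and vectordb/.
  tar -czf "$backup_dir/openviking-data-$stamp.tar.gz" \
    -C "$(dirname "$data_dir")" "$(basename "$data_dir")" || return 1

  # Keep the exact config that was running, so a rollback is possible.
  cp -p "$conf_file" "$backup_dir/ov.conf-$stamp" || return 1

  echo "$backup_dir/openviking-data-$stamp.tar.gz"
}
```

For example, `docker compose down && snapshot_openviking ~/.openviking /path/to/ov.conf ~/openviking-backups`, where `/path/to/ov.conf` stands for wherever your config actually lives. Keep the backup directory outside the data directory so the archive does not try to include itself.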

## Common breaking transitions

The two failures below account for most upgrade reports from v0.3.15
onward. They can occur together: the server exits on the first one,
and the second appears only after you fix it. Read both before
changing anything.

### v0.3.15 → v0.3.19+ : `openviking.console.bootstrap` removed

- **Symptom.** The container exits immediately after start. The log
shows `ModuleNotFoundError: No module named 'openviking.console.bootstrap'`,
often coming from a `python -m openviking.console.bootstrap ...`
line in your `command:` override.
- **Cause.** Web Studio used to ship as a separate process started by
`python -m openviking.console.bootstrap`. Starting in v0.3.19 the
Studio assets are bundled into `openviking-server`, and the
standalone `openviking.console.bootstrap` module no longer exists
(see PR #2320). Any custom `command:` that still launches it will
fail with `ModuleNotFoundError`.
- **Fix.** In your `docker-compose.yml` (or whatever you use to run
the container), drop the `python -m openviking.console.bootstrap`
invocation. The default entrypoint already runs `openviking-server`,
which now serves both the API on port `1933` and the Studio UI.
- **Worked example.**

Before — two processes, one of them now-removed:

```yaml
services:
  openviking:
    image: ghcr.io/volcengine/openviking:latest
    command: |
      openviking-server &
      python -m openviking.console.bootstrap --host 0.0.0.0 --port 8020
```

After — single process, default entrypoint:

```yaml
services:
  openviking:
    image: ghcr.io/volcengine/openviking:latest
    # no `command:` override needed: the image entrypoint runs
    # openviking-server, which now also serves Web Studio.
```

If you still want to keep an explicit `command:`, set it to
`command: openviking-server` and remove the bootstrap line.
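
Stale references to the removed module can hide in more than one file (a compose override, a wrapper script, a systemd unit). A small sketch for locating them before you restart; the function name `find_bootstrap_refs` is hypothetical, and the list of files to scan is up to you:

```bash
# find_bootstrap_refs FILE...
# Hypothetical helper: prints every line (prefixed with file name and line
# number) that still mentions the removed module. Returns 0 if at least one
# reference was found and 1 if none were.
find_bootstrap_refs() {
  grep -nH -F 'openviking.console.bootstrap' "$@"
}
```

For example, `find_bootstrap_refs docker-compose.yml docker-compose.override.yml`. No output and a non-zero exit status mean the files are clean.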

### Any version with `EmbeddingRebuildRequiredError`

- **Symptom.** The server logs `EmbeddingRebuildRequiredError:
Existing collection embedding dimension (...) does not match current
configuration (...)` or
`EmbeddingRebuildRequiredError: Existing collection embedding metadata
does not match current configuration`. Startup aborts before the
HTTP server is ready.
- **Cause.** The vector collection on disk records which embedding
provider, model, and dimension were used to build it. When the
embedding section of `ov.conf` changes (different provider, different
model, or — most importantly — a different vector dimension) the
existing vectors are no longer comparable to new ones. The server
refuses to start rather than mix incompatible vectors.
- **Choose one path.** Both paths preserve your business data; they
differ only in whether you keep the old vectors or rebuild them.

**Path A — keep your data, restore the old embedding config.** Roll
the embedding section of `ov.conf` back to the values the existing
collection was built with (the values you noted in *Before you
upgrade*). The server will start. Schedule the embedding-model
change as a deliberate migration via Path B during a maintenance
window. If the only change between old and new config is provider
or model name and the dimension is identical, you can also set
`embedding.allow_metadata_override = true` in `ov.conf` to keep the
existing vectors and just rewrite the recorded metadata.

**Path B — rebuild embeddings under the new config.** This
re-embeds every resource, memory, and skill. The cost is one full
embed pass over your indexed content, billed against whatever
embedding provider you have configured.

1. **Back up `vectordb/context/`.** Inside your data directory
(host: `~/.openviking`, container: `/app/.openviking`), rename
`data/vectordb/context/` to something like
`data/vectordb/context.bak-<date>/`, or copy it elsewhere. Keep the
backup until the rebuild is verified, so you have a fallback if the
rebuild fails halfway.
2. **Remove only `data/vectordb/context/`.** If you renamed the
directory in step 1, it is already out of the way; if you copied it,
delete the original now. Do not delete other directories under
`data/`. The AGFS tree (resources, memories, skills, sessions) lives
outside `vectordb/` and is the data you are preserving; removing
anything else risks losing the content you are about to re-embed.
3. **Start the server with the new `ov.conf`.** It will create a
fresh `vectordb/context/` collection that matches the new
embedding configuration. The server should now come up and pass
`/health`.
4. **Reindex your namespaces.** Use the CLI to re-embed the content
that previously had vectors:

```bash
ov reindex viking://resources --mode vectors_only --wait true
ov reindex viking://user/memories --mode vectors_only --wait true
ov reindex viking://agent/memories --mode vectors_only --wait true
ov reindex viking://agent/skills --mode vectors_only --wait true
```

Run only the namespaces you actually use. `--mode vectors_only`
re-embeds against the existing semantic summaries (L0/L1) and is
the right choice when only the embedding configuration changed.
If your semantic-summary configuration also changed, use
`--mode semantic_and_vectors` instead — that re-runs L0/L1
summarization as well and costs additional VLM calls.
5. **Verify search works.** Run a query you know the answer to
against a representative URI:

```bash
ov find "<known-string>" --target-uri viking://resources/
```

Once you are satisfied, delete the `context.bak-<date>/` backup.
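
Steps 1 and 2 of Path B can be combined into a single rename, which backs up the collection and moves it out of the way at the same time. A minimal sketch, assuming the server is stopped; the function name `set_aside_collection` and the timestamp format are illustrative, not part of OpenViking:

```bash
# set_aside_collection VECTORDB_DIR [NAME]
# Hypothetical helper: renames VECTORDB_DIR/NAME (default: context) to
# NAME.bak-<timestamp> in the same directory and prints the backup path.
# Nothing else under VECTORDB_DIR is touched.
set_aside_collection() {
  local vectordb_dir="$1" name="${2:-context}"
  local src="$vectordb_dir/$name"
  local dest="$vectordb_dir/$name.bak-$(date +%Y%m%d-%H%M%S)"

  [ -d "$src" ] || { echo "collection not found: $src" >&2; return 1; }
  [ ! -e "$dest" ] || { echo "backup already exists: $dest" >&2; return 1; }

  mv "$src" "$dest" || return 1
  echo "$dest"
}
```

For example, `set_aside_collection ~/.openviking/data/vectordb`. To fall back, stop the server, remove the freshly created `context/`, rename the backup to `context/` again, and restore the old `ov.conf`.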

## Sanity checks after a successful upgrade

Run these against the upgraded container before pointing production
traffic at it.

- `curl http://localhost:1933/health` returns a healthy response.
- `ov tree viking://resources -L 1` lists the resources you expect to
see — confirms the AGFS tree survived the upgrade.
- `ov find "<known-string>"` returns the hits you expect, which
confirms the vector index is populated and queryable.
- The Studio UI loads at the same port you used before (default
`1933` for direct access, or `1934` if you go through Caddy).
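
The server may need a few seconds after the container starts before `/health` responds, so a single `curl` can fail spuriously. A generic retry helper is sketched below; `wait_until` is a hypothetical name, and the sketch assumes that any response below HTTP 400 counts as healthy, since this guide does not specify the response body.

```bash
# wait_until TIMEOUT_SECONDS COMMAND [ARGS...]
# Hypothetical helper: re-runs COMMAND once per second until it succeeds.
# Returns 0 on the first success, or 1 once TIMEOUT_SECONDS have elapsed.
wait_until() {
  local timeout="$1"; shift
  local deadline=$(( $(date +%s) + timeout ))
  until "$@" >/dev/null 2>&1; do
    [ "$(date +%s)" -lt "$deadline" ] || return 1
    sleep 1
  done
}
```

For example, `wait_until 60 curl -fsS http://localhost:1933/health && echo "server is up"`. The `-f` flag makes `curl` exit non-zero on HTTP errors, which is what lets the loop keep retrying.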

## What to do if you are stuck

If none of the above resolves the failure, file an issue with:

- The full server logs from the start of the failing run (everything
from container start through the first stack trace).
- Your `ov.conf`, with API keys and other secrets redacted.
- The exact version you upgraded **from** and **to** (image tag is
fine).
- The output of `ls data/vectordb/` from the data directory you are
pointing at.
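
Redacting `ov.conf` by hand is error-prone. The sketch below masks likely secrets under two loud assumptions that this guide does not confirm: that `ov.conf` is JSON, and that secret fields have lowercase names containing `key`, `secret`, `token`, or `password`. The function name `redact_conf` is hypothetical. Always read the output before posting it.

```bash
# redact_conf CONF_FILE
# Hypothetical helper: prints CONF_FILE with the string value of every
# "name": "value" pair replaced by REDACTED when the name contains
# key, secret, token, or password. Assumes JSON with lowercase field
# names and no escaped quotes inside values. The input file is not modified.
redact_conf() {
  sed -E 's/("[^"]*(key|secret|token|password)[^"]*"[[:space:]]*:[[:space:]]*")[^"]*"/\1REDACTED"/g' "$1"
}
```

For example, `redact_conf /path/to/ov.conf > ov.conf.redacted`, where `/path/to/ov.conf` stands for your actual config path. Fields the pattern does not recognize pass through unchanged, so check the result for anything sensitive.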

Tag the issue with `upgrade` so the maintainers can route it. See
also the related migration note for the User / Peer model in
[migration/01-user-peer-model.md](../migration/01-user-peer-model.md)
if you are crossing the 0.3.x → 0.4.0 boundary.
10 changes: 10 additions & 0 deletions docs/zh/faq/faq.md
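@@ -364,6 +364,15 @@ OpenViking uses a score propagation mechanism:
3. **Use local storage**: Use the `local` backend during development to reduce network latency
4. **Async operations**: Make full use of the async capabilities of `AsyncOpenViking` / `AsyncHTTPClient`

### My Docker container won't upgrade. What do I do?

Two errors account for most upgrade failure reports:

- `ModuleNotFoundError: No module named 'openviking.console.bootstrap'`: Web Studio is bundled into `openviking-server` starting in v0.3.19, so a `command:` override that still launches the standalone bootstrap module makes the container exit immediately. Remove that invocation.
- `EmbeddingRebuildRequiredError`: the embedding model, provider, or dimension in `ov.conf` no longer matches the existing `vectordb/context` collection. Either roll the embedding config back, or back up and rebuild only `vectordb/context/`, then run `ov reindex` for each namespace.

The full step-by-step recovery, including a `docker-compose.yml` before/after comparison and the exact `ov reindex` commands, is in [Upgrades and Migrations](../guides/14-upgrades-and-migrations.md).

## Deployment

### What's the difference between embedded mode and service mode?
@@ -392,3 +401,4 @@ client = ov.AsyncHTTPClient(url="http://localhost:1933", api_key="your-key")
- [Architecture Overview](../concepts/01-architecture.md) - Deep dive into system design
- [Retrieval Mechanism](../concepts/07-retrieval.md) - Detailed retrieval process
- [Configuration Guide](../guides/01-configuration.md) - Complete configuration reference
- [Upgrades and Migrations](../guides/14-upgrades-and-migrations.md) - Recover from upgrade-time startup failures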