Source: Source pull request number: 225 in rohitg00/agentmemory (URL omitted to avoid GitHub cross-reference)
Title: feat(embedding): configurable local model, cache dir, and HF mirror | configurable embedding model
Author: mechanic-Q
State: open
Draft: no
Merged: no
Head: mechanic-Q/agentmemory:feature/configurable-embedding @ a8f9d89
Base: main @ 1c8713f
Labels: (none)
Changed files: 0
Commits: 0
Created: 2026-05-02T07:26:35Z
Updated: 2026-05-17T09:42:42Z
Closed: (not closed)
Merged at: (not merged)
Original PR body:
Summary
Make the local embedding provider configurable through environment variables. This enables multilingual embedding models (bge-m3, multilingual-e5, etc.) and supports users behind firewalls or in regions with restricted HuggingFace access by letting them configure a mirror.
Motivation
The local embedding provider was hardcoded to all-MiniLM-L6-v2 (384-dimensional, English-only). Users with non-English content (Chinese, Japanese, Korean, multilingual codebases) could not use a multilingual embedding model without forking the codebase. Users behind corporate firewalls, or in regions where HuggingFace is slow or unreachable, likewise could not configure a mirror endpoint.
Changes
Three new environment variables (all optional, defaults unchanged):
AGENTMEMORY_LOCAL_EMBEDDING_MODEL — model name (default: Xenova/all-MiniLM-L6-v2). Set to Xenova/bge-m3 for multilingual 1024-dim embeddings supporting 100+ languages including Chinese.
AGENTMEMORY_LOCAL_EMBEDDING_MODEL_DIR — custom cache directory for pre-downloaded models (useful in air-gapped/offline environments).
HF_ENDPOINT — HuggingFace mirror endpoint (e.g. https://hf-mirror.com).
The embedding dimension is auto-detected from a list of known models and falls back to 384 for unknown models.
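The model and dimension lookup described above could look roughly like the following TypeScript sketch. The names `resolveLocalEmbeddingConfig` and `KNOWN_DIMENSIONS` are hypothetical and not taken from the PR, and the table holds only the two models whose sizes the description states. Only the variable name, the default model, the 384 and 1024 dimensions, and the 384 fallback come from the text above.

```typescript
// Hypothetical sketch: names and table contents are illustrative, not the PR's code.
type EnvVars = Record<string, string | undefined>;

interface LocalEmbeddingConfig {
  model: string;
  dimensions: number;
}

const DEFAULT_MODEL = "Xenova/all-MiniLM-L6-v2";
const FALLBACK_DIMENSIONS = 384;

// Assumed "known list": only the models whose dimensions the PR description mentions.
const KNOWN_DIMENSIONS = new Map<string, number>([
  ["Xenova/all-MiniLM-L6-v2", 384],
  ["Xenova/bge-m3", 1024],
]);

function resolveLocalEmbeddingConfig(vars: EnvVars): LocalEmbeddingConfig {
  // An unset or blank variable means "keep the default model".
  const requested = vars.AGENTMEMORY_LOCAL_EMBEDDING_MODEL?.trim();
  const model = requested ? requested : DEFAULT_MODEL;
  // Unknown models fall back to 384 dimensions, as the description states.
  const dimensions = KNOWN_DIMENSIONS.get(model) ?? FALLBACK_DIMENSIONS;
  return { model, dimensions };
}
```

In a Node.js process this would typically be called once at startup with `process.env` as the argument.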
Usage
# Multilingual Chinese support via bge-m3
AGENTMEMORY_LOCAL_EMBEDDING_MODEL=Xenova/bge-m3
# Offline/pre-downloaded models
AGENTMEMORY_LOCAL_EMBEDDING_MODEL_DIR=/opt/models/transformers
# HF mirror for China/firewall users
HF_ENDPOINT=https://hf-mirror.com
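One way the cache directory and mirror settings could be applied at startup is sketched below in TypeScript. It assumes a transformers.js-style settings object with `cacheDir` and `remoteHost` fields (transformers.js exposes fields with these names on its `env` export). Whether the PR wires the variables this way is an assumption, and `applyEmbeddingEnv` and `LoaderSettings` are hypothetical names.

```typescript
// Hypothetical sketch: not the PR's code. LoaderSettings mirrors two fields
// of a transformers.js-style settings object.
type EnvVars = Record<string, string | undefined>;

interface LoaderSettings {
  cacheDir: string | null;
  remoteHost: string;
}

function applyEmbeddingEnv(settings: LoaderSettings, vars: EnvVars): LoaderSettings {
  // Work on a copy so the caller's defaults stay untouched.
  const next: LoaderSettings = { ...settings };

  const cacheDir = vars.AGENTMEMORY_LOCAL_EMBEDDING_MODEL_DIR?.trim();
  if (cacheDir) {
    next.cacheDir = cacheDir;
  }

  const endpoint = vars.HF_ENDPOINT?.trim();
  if (endpoint) {
    // Normalize to exactly one trailing slash so paths can be appended safely.
    next.remoteHost = endpoint.replace(/\/+$/, "") + "/";
  }

  return next;
}
```

With none of the variables set, the function returns the defaults unchanged, which matches the PR's claim that every variable is optional. The returned values could then be copied onto the library's settings object before the embedding pipeline is created.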
Backwards Compatibility
All environment variables are optional. With none set, behavior is unchanged: Xenova/all-MiniLM-L6-v2 is loaded from the default HuggingFace endpoint.
Summary by CodeRabbit
- New Features
  - Embedding provider now supports dynamic model and dimension configuration through environment variables instead of fixed defaults
  - Users can specify a custom local model cache directory for embedding models
  - Added support for Hugging Face endpoint configuration
Local branch:
Fork PR:
Fork decision:
Verification:
Notes: