From abe8c26bd9571f8252139561dafb17aaa162bf57 Mon Sep 17 00:00:00 2001 From: Willi Budzinski Date: Wed, 17 Jun 2026 19:49:37 +0200 Subject: [PATCH 1/2] feat: configure local embedding cache and mirror --- .env.example | 2 + README.md | 6 +- .../plan.md | 176 ++++++++++++++++++ .../todo.md | 84 +++++++++ plugin/skills/agentmemory-config/REFERENCE.md | 3 +- src/providers/transformers.ts | 25 ++- test/embedding-provider.test.ts | 74 +++++++- 7 files changed, 364 insertions(+), 6 deletions(-) create mode 100644 docs/todos/2026-06-17-issue-798-local-embedding-cache-hf/plan.md create mode 100644 docs/todos/2026-06-17-issue-798-local-embedding-cache-hf/todo.md diff --git a/.env.example b/.env.example index c52ebbfba..9ea6c08ff 100644 --- a/.env.example +++ b/.env.example @@ -84,6 +84,8 @@ # EMBEDDING_PROVIDER=local # local | openai | voyage | cohere | gemini | openrouter # LOCAL_EMBEDDING_MODEL=Xenova/paraphrase-multilingual-MiniLM-L12-v2 # Primary local model override # EMBEDDING_MODEL=Xenova/bge-large-zh-v1.5 # Local fallback alias when LOCAL_EMBEDDING_MODEL is unset +# AGENTMEMORY_LOCAL_EMBEDDING_MODEL_DIR=/opt/agentmemory-models # Optional transformers.js local model/cache directory +# HF_ENDPOINT=https://hf-mirror.com # Optional transformers.js remoteHost mirror; local provider remains local_files_only # VOYAGE_API_KEY=pa-... # Optimised for code embeddings diff --git a/README.md b/README.md index ea9c6109e..17998d6a7 100644 --- a/README.md +++ b/README.md @@ -1001,7 +1001,7 @@ npm install @xenova/transformers | Cohere | `embed-english-v3.0` | Free trial | `EMBEDDING_PROVIDER=cohere` + `COHERE_API_KEY`; general purpose | | OpenRouter | Any model | Varies | `EMBEDDING_PROVIDER=openrouter` + `OPENROUTER_API_KEY`; set `OPENROUTER_EMBEDDING_DIMENSIONS` for non-1536 models | -`LOCAL_EMBEDDING_MODEL` should name a Xenova feature-extraction model. 
agentmemory derives dimensions for common 384/512/768/1024-dimensional Xenova models and otherwise falls back to 384 unless `OPENAI_EMBEDDING_DIMENSIONS` is set. The dimension guard rejects mismatched vectors instead of silently corrupting the vector index. Local model loading uses transformers.js offline/local-file mode, so selected models must already be available in the transformers.js model cache. +`LOCAL_EMBEDDING_MODEL` should name a Xenova feature-extraction model. agentmemory derives dimensions for common 384/512/768/1024-dimensional Xenova models and otherwise falls back to 384 unless `OPENAI_EMBEDDING_DIMENSIONS` is set. The dimension guard rejects mismatched vectors instead of silently corrupting the vector index. Local model loading uses transformers.js offline/local-file mode, so selected models must already be available in the transformers.js model cache. Set `AGENTMEMORY_LOCAL_EMBEDDING_MODEL_DIR` to point transformers.js local model lookup and filesystem cache at a prepared directory. Set `HF_ENDPOINT` to configure transformers.js `remoteHost` for mirror/proxy setups; the local provider still passes `local_files_only: true`, so this does not enable remote downloads by itself. --- @@ -1376,7 +1376,7 @@ Reasoning-class models (`o1`-style with `` blocks) can return empty `cont OpenRouter reasoning models can be configured with `OPENROUTER_REASONING_EFFORT=xhigh|high|medium|low|minimal|none`. Set `OPENROUTER_INCLUDE_REASONING=true` to ask supported OpenRouter models to return reasoning output when they expose it. -Local embeddings are available via `@xenova/transformers` — set `EMBEDDING_PROVIDER=local` to use `paraphrase-multilingual-MiniLM-L12-v2` entirely on-device, or set `LOCAL_EMBEDDING_MODEL` to another Xenova feature-extraction model. Common 384/512/768/1024-dimensional local models are recognized automatically; set `OPENAI_EMBEDDING_DIMENSIONS` for custom local models. 
With no `EMBEDDING_PROVIDER`, agentmemory uses BM25+Graph search and does not call a text embedding provider. +Local embeddings are available via `@xenova/transformers` — set `EMBEDDING_PROVIDER=local` to use `paraphrase-multilingual-MiniLM-L12-v2` entirely on-device, or set `LOCAL_EMBEDDING_MODEL` to another Xenova feature-extraction model. Common 384/512/768/1024-dimensional local models are recognized automatically; set `OPENAI_EMBEDDING_DIMENSIONS` for custom local models. Set `AGENTMEMORY_LOCAL_EMBEDDING_MODEL_DIR` for a prepared transformers.js local model/cache directory, and `HF_ENDPOINT` to set transformers.js `remoteHost` for mirror/proxy environments. The current local provider keeps `local_files_only: true`, so configured models must still be available locally. With no `EMBEDDING_PROVIDER`, agentmemory uses BM25+Graph search and does not call a text embedding provider. ### Cost-aware model selection @@ -1564,6 +1564,8 @@ Create `~/.agentmemory/.env`: # EMBEDDING_PROVIDER=local # LOCAL_EMBEDDING_MODEL=Xenova/paraphrase-multilingual-MiniLM-L12-v2 # EMBEDDING_MODEL=Xenova/bge-large-zh-v1.5 # Fallback alias for local embeddings when LOCAL_EMBEDDING_MODEL is unset +# AGENTMEMORY_LOCAL_EMBEDDING_MODEL_DIR=/opt/agentmemory-models # Optional transformers.js local model/cache directory +# HF_ENDPOINT=https://hf-mirror.com # Optional transformers.js remoteHost mirror; local provider remains local_files_only # VOYAGE_API_KEY=... # OPENAI_API_KEY=sk-... 
# OPENAI_BASE_URL=https://api.openai.com # Override for Azure / vLLM / LM Studio / proxies diff --git a/docs/todos/2026-06-17-issue-798-local-embedding-cache-hf/plan.md b/docs/todos/2026-06-17-issue-798-local-embedding-cache-hf/plan.md new file mode 100644 index 000000000..b9f4eca76 --- /dev/null +++ b/docs/todos/2026-06-17-issue-798-local-embedding-cache-hf/plan.md @@ -0,0 +1,176 @@ +# Issue 798 Local Embedding Cache And HF Mirror Implementation Plan + +> **For agentic workers:** REQUIRED SUB-SKILL: Use superpowers:subagent-driven-development (recommended) or superpowers:executing-plans to implement this plan task-by-task. Steps use checkbox (`- [ ]`) syntax for tracking. + +**Goal:** Add the remaining local embedding cache-directory and Hugging Face mirror configuration from Issue 798 while preserving the Issue 917 model/dimension behavior. + +**Architecture:** Keep the change inside the local transformers configuration boundary. Extend `src/providers/transformers.ts` so every transformers.js load applies Node-safe WASM flags plus local model/cache and remote-host settings from environment, then cover it through existing embedding provider tests. + +**Tech Stack:** TypeScript ESM, Vitest, `@xenova/transformers` v2.17.2 env configuration. + +--- + +## Files + +- Modify: `src/providers/transformers.ts` +- Modify: `test/embedding-provider.test.ts` +- Modify: `README.md` +- Modify: `.env.example` +- Modify: `docs/todos/2026-06-17-issue-798-local-embedding-cache-hf/todo.md` +- Modify: `docs/todos/2026-06-17-issue-798-local-embedding-cache-hf/plan.md` + +Spec path: none. Source of truth is Issue 798, the current user delegation, the task record, and local repo behavior. + +GitHub PR prep: mandatory local branch prep after implementation per `github-feature-loop`; no fetch, pull, push, or PR creation is approved. + +Security-sensitive surfaces: user-controlled filesystem path and remote host configuration for a local embedding provider. 
No auth, secret, dependency, REST, MCP, schema, or persistence changes planned. + +## Task 1: Add failing config tests + +**Files:** +- Modify: `test/embedding-provider.test.ts` + +- [ ] **Step 1: Add env cleanup keys** + +Add `AGENTMEMORY_LOCAL_EMBEDDING_MODEL_DIR` and `HF_ENDPOINT` to `ENV_KEYS`. + +- [ ] **Step 2: Add direct transformer configuration tests** + +Add tests under `describe("configureTransformersForNode", ...)` that import the real `src/providers/transformers.ts` through `freshTransformersModule()` and pass a fake module object: + +```ts +const transformers = { + pipeline: vi.fn(), + env: { + localModelPath: "/models/", + cacheDir: "/cache/", + remoteHost: "https://huggingface.co/", + backends: { + onnx: { + wasm: { numThreads: 4 }, + }, + }, + }, +}; +``` + +- [ ] **Step 3: Add cache/local dir tests** + +Add one test that sets `process.env.AGENTMEMORY_LOCAL_EMBEDDING_MODEL_DIR=/opt/agentmemory-models`, calls `configureTransformersForNode(transformers)`, and expects both `transformers.env.localModelPath` and `transformers.env.cacheDir` to equal `/opt/agentmemory-models`. + +Add one `.env`-backed test by writing `AGENTMEMORY_LOCAL_EMBEDDING_MODEL_DIR=/tmp/agentmemory-dotenv-models` into `${sandboxHome}/.agentmemory/.env`, calling `configureTransformersForNode(transformers)`, and expecting both fields to equal `/tmp/agentmemory-dotenv-models`. + +- [ ] **Step 4: Add HF mirror test** + +Add a test that sets `HF_ENDPOINT=https://hf-mirror.com`, calls `configureTransformersForNode(transformers)`, and expects `transformers.env.remoteHost` to equal `https://hf-mirror.com/`. + +- [ ] **Step 5: Verify RED** + +Run: + +```bash +corepack pnpm exec vitest run --exclude test/integration.test.ts test/embedding-provider.test.ts +``` + +Expected before implementation: fails on the new cache/mirror expectations. 
+ +## Task 2: Implement transformers env configuration + +**Files:** +- Modify: `src/providers/transformers.ts` + +- [ ] **Step 1: Extend `TransformersModule` env type** + +Add optional fields under `env`: `localModelPath`, `cacheDir`, and `remoteHost`. + +- [ ] **Step 2: Add env helpers** + +Read `AGENTMEMORY_LOCAL_EMBEDDING_MODEL_DIR` and `HF_ENDPOINT` through `getEnvVar()`. Trim blank values; normalize `HF_ENDPOINT` to a trailing slash. + +- [ ] **Step 3: Configure transformers.js env** + +In `configureTransformersForNode`, after ONNX thread config: + +```ts +const modelDir = getEnvVar("AGENTMEMORY_LOCAL_EMBEDDING_MODEL_DIR")?.trim(); +if (modelDir && transformers.env) { + transformers.env.localModelPath = modelDir; + transformers.env.cacheDir = modelDir; +} + +const hfEndpoint = getEnvVar("HF_ENDPOINT")?.trim(); +if (hfEndpoint && transformers.env) { + transformers.env.remoteHost = hfEndpoint.endsWith("/") + ? hfEndpoint + : `${hfEndpoint}/`; +} +``` + +- [ ] **Step 4: Verify GREEN** + +Run the focused Vitest command from Task 1 and expect all tests in `test/embedding-provider.test.ts` to pass. + +## Task 3: Update docs + +**Files:** +- Modify: `README.md` +- Modify: `.env.example` + +- [ ] **Step 1: README embedding provider docs** + +Extend the local embedding paragraph to mention: +- `AGENTMEMORY_LOCAL_EMBEDDING_MODEL_DIR` points transformers.js local model lookup and filesystem cache at a prepared directory. +- `HF_ENDPOINT` sets transformers.js `env.remoteHost` for mirror/proxy setups, but the current local provider still passes `local_files_only: true`. +- Existing offline/local-file behavior means selected models still need to be present locally when local-only loading is used. 
+ +- [ ] **Step 2: Environment example** + +Add commented examples near local embedding config: + +```dotenv +# AGENTMEMORY_LOCAL_EMBEDDING_MODEL_DIR=/opt/agentmemory-models # Optional transformers.js local model/cache directory +# HF_ENDPOINT=https://hf-mirror.com # Optional transformers.js remoteHost mirror; local provider remains local_files_only +``` + +- [ ] **Step 3: Inspect stale references** + +Run: + +```bash +rg -n "AGENTMEMORY_LOCAL_EMBEDDING_MODEL_DIR|HF_ENDPOINT|local model/cache|transformers.js model cache" README.md .env.example test/embedding-provider.test.ts src/providers/transformers.ts +``` + +Expected: references are consistent and no docs claim unimplemented behavior. + +## Task 4: Verification and local PR prep + +**Files:** +- All task-owned files above. + +- [ ] **Step 1: Run focused verification** + +```bash +corepack pnpm exec vitest run --exclude test/integration.test.ts test/embedding-provider.test.ts +git diff --check +``` + +- [ ] **Step 2: Run security/static checks required by touched surface** + +Run Semgrep on changed code/docs/task files: + +```bash +semgrep scan --config p/default --error --metrics=off src/providers/transformers.ts test/embedding-provider.test.ts README.md .env.example docs/todos/2026-06-17-issue-798-local-embedding-cache-hf/todo.md docs/todos/2026-06-17-issue-798-local-embedding-cache-hf/plan.md +``` + +OSV is not required unless dependency, lockfile, container, vendored, or package-manager surfaces change. + +- [ ] **Step 3: Run review chain and prepare local commit** + +Use passive security review, focused simplification, implementation review, and verification-before-completion before staging. Stage only task-owned files and create a factual commit if all required checks pass. + +## Self-Review + +- Spec coverage: Issue 798 model configurability is already covered by Issue 917; this plan covers the remaining cache directory and HF mirror gaps plus docs/tests. 
+- Placeholder scan: no TBD/TODO placeholders remain. +- Plan review corrections: tests now target the real `configureTransformersForNode`, implementation uses `getEnvVar`, and HF mirror documentation preserves the existing offline `local_files_only` contract. +- Type consistency: tests and implementation use `AGENTMEMORY_LOCAL_EMBEDDING_MODEL_DIR`, `HF_ENDPOINT`, `localModelPath`, `cacheDir`, and `remoteHost` consistently. diff --git a/docs/todos/2026-06-17-issue-798-local-embedding-cache-hf/todo.md b/docs/todos/2026-06-17-issue-798-local-embedding-cache-hf/todo.md new file mode 100644 index 000000000..ac24e7942 --- /dev/null +++ b/docs/todos/2026-06-17-issue-798-local-embedding-cache-hf/todo.md @@ -0,0 +1,84 @@ +# Issue 798 Local Embedding Cache And HF Mirror + +Scope: `wbugitlab1/agentmemory#798`, upstream PR 225 mirror. Worktree `/Users/A1538552/.codex/worktrees/05fb/agentmemory`; origin `https://github.com/wbugitlab1/agentmemory.git`. + +## Sprint Contract + +Goal: Finish the still-actionable local embedding configuration gaps from Issue 798. + +Scope: +- Validate whether Issue 798 is stale or still actionable after the already-closed Issue 917 work. +- Keep the existing fork contract: `LOCAL_EMBEDDING_MODEL` remains primary, `EMBEDDING_MODEL` remains fallback, and local loading remains offline-first. +- Add support for a local model/cache directory and Hugging Face mirror endpoint when configured. +- Update focused tests and user-facing docs for the new knobs. + +Non-goals: +- No upstream/rohitg00 target work. +- No fetch, pull, push, PR creation, publish, migration, dependency changes, or remote branch updates. +- No change to embedding provider selection or vector persistence schema. +- No automatic remote download enablement beyond transformers.js defaults when `HF_ENDPOINT` is set. + +Acceptance criteria: +- `AGENTMEMORY_LOCAL_EMBEDDING_MODEL_DIR` configures transformers.js local model lookup/cache behavior for local embeddings. 
+- `HF_ENDPOINT` configures transformers.js remote host, while local embeddings keep the existing offline/local-file loading behavior unless remote loading is separately changed in a future task. +- Existing local model precedence, dimension, and offline options stay intact. +- README and `.env.example` describe supported local model/cache/mirror behavior without claiming unsupported transformers.js cache layouts. +- Focused embedding-provider tests pass. + +Intended verification: +- RED/GREEN `corepack pnpm exec vitest run --exclude test/integration.test.ts test/embedding-provider.test.ts`. +- `git diff --check`. +- Targeted security/static checks if the final diff touches security-sensitive filesystem/network config behavior. + +Known boundaries: +- This changes local provider configuration only. It does not add dependencies, REST endpoints, MCP tools, auth, schema, storage migrations, or external services. +- `HF_ENDPOINT` is read as a user-supplied URL string and applied to transformers.js `env.remoteHost`; tests must cover normalization. +- `AGENTMEMORY_LOCAL_EMBEDDING_MODEL_DIR` is a local filesystem path. No directories are created by agentmemory. + +Stop conditions: +- Need to change auth/security behavior, provider selection semantics, vector schema, dependency versions, or remote state. +- Tests indicate current transformers.js behavior cannot support the requested cache/mirror knobs without dependency changes. +- Required verification cannot run and no targeted substitute covers the changed surface. + +## Validity Decision + +Partially valid/actionable. Issue 917 already closed the model/dimension/offline-load portions: `src/providers/embedding/local.ts` supports `LOCAL_EMBEDDING_MODEL`, `EMBEDDING_MODEL`, known dimensions, `OPENAI_EMBEDDING_DIMENSIONS`, and `{ local_files_only: true, quantized: false }`. 
Issue 798 still has cache/HF mirror gaps: no code, tests, README, or `.env.example` references currently cover `AGENTMEMORY_LOCAL_EMBEDDING_MODEL_DIR` or `HF_ENDPOINT`. + +## Feature / Verification Matrix + +| Change | Verification method | Status | Evidence | +|---|---|---|---| +| Validity review | Local provider/docs/tests inspection, Issue 798/917 read-only issue metadata, Explorer subagent | complete | Model/dimension pieces found implemented; cache/HF mirror missing locally. Explorer independently reached the same partial-validity decision. | +| Local model/cache dir env | RED/GREEN focused Vitest | complete | RED focused Vitest failed on missing local model/cache directory configuration; GREEN focused Vitest passed 46/46 after `src/providers/transformers.ts` used `getEnvVar`. | +| HF mirror env | RED/GREEN focused Vitest | complete | RED focused Vitest failed on unchanged `remoteHost`; GREEN focused Vitest passed after `HF_ENDPOINT` normalization to trailing slash. | +| Docs | README, `.env.example`, generated plugin reference inspection | complete | README and `.env.example` document `AGENTMEMORY_LOCAL_EMBEDDING_MODEL_DIR`, `HF_ENDPOINT`, and the preserved `local_files_only` caveat. `corepack pnpm run skills:gen` updated `plugin/skills/agentmemory-config/REFERENCE.md`. | +| Final verification | Focused Vitest, full Vitest, diff check, Semgrep | complete | Focused Vitest 46/46 passed; `corepack pnpm test` passed 172 files / 2254 tests after generator update; `git diff --check` passed; Semgrep passed with 0 findings across 7 changed tracked files. | +| Pre-implementation plan review | Read-only plan reviewer subagent | complete | Three valid Medium findings accepted: tests must exercise real `configureTransformersForNode`, env reads must use `getEnvVar`, and HF mirror wording must preserve `local_files_only` behavior. 
| + +## Subagent Ledger + +| Workstream | Allowed scope | Edits allowed | Expected output | Result | Residual risk | +|---|---|---:|---|---|---| +| Issue 798 validity explorer | Read-only local code/docs/tests and public/read-only issue evidence | no | Validity decision, files/commands/evidence, cache/HF gaps | complete: partially valid; cache/HF mirror gaps remain | Main agent verified critical conclusions against repo evidence. | +| Plan reviewer | Read-only task plan and local repo convention review | no | High/Medium findings or ACCEPT | complete: three Medium findings accepted and folded into plan | None after plan correction. | + +## Progress Notes + +- Repo instructions read from `AGENTS.md`. +- Initial required `git status -sb --untracked-files=all`: `## HEAD (no branch)`. +- Remote target confirmed: `origin https://github.com/wbugitlab1/agentmemory.git`. +- Local branch created from detached HEAD for edits: `github-pr/issue-798-local-embedding-cache-hf-71eceb08`. +- `github-feature-loop` read and applied within current user boundaries: no fetch/pull/push/PR creation without separate confirmation. Local branch-prep/commit remains allowed only for task-owned surfaces. +- Current local `origin/main` freshness is unverified; no fetch was run. +- `@xenova/transformers@2.17.2` package source confirms `env.localModelPath`, `env.cacheDir`, and `env.remoteHost`; public docs confirm local model path/cache settings. `HF_ENDPOINT` itself is not a documented transformers.js env variable, so agentmemory will map it to `env.remoteHost`. +- RED: `corepack pnpm exec vitest run --exclude test/integration.test.ts test/embedding-provider.test.ts` failed 3 expected tests: missing local model/cache dir from process env, missing local model/cache dir from `.env`, and missing `HF_ENDPOINT` remote host normalization. 
+- Dependency setup: first focused test attempt was blocked by pnpm ignored-build hardening; `corepack pnpm install --frozen-lockfile --ignore-scripts` completed without manifest/lockfile changes. Generated `allowBuilds` placeholder churn in `pnpm-workspace.yaml` was removed because it was task-caused setup noise, not an approved dependency-policy change. +- GREEN focused: `corepack pnpm exec vitest run --exclude test/integration.test.ts test/embedding-provider.test.ts` passed 1 file / 46 tests. +- Generator drift: first `corepack pnpm test` failed only because `plugin/skills/agentmemory-config/REFERENCE.md` needed regeneration for the new `AGENTMEMORY_LOCAL_EMBEDDING_MODEL_DIR`; `corepack pnpm run skills:gen` regenerated it. +- Full verification: `corepack pnpm test` passed 172 files / 2254 tests. +- Security/static verification: `semgrep scan --config p/default --error --metrics=off src/providers/transformers.ts test/embedding-provider.test.ts README.md .env.example plugin/skills/agentmemory-config/REFERENCE.md docs/todos/2026-06-17-issue-798-local-embedding-cache-hf/todo.md docs/todos/2026-06-17-issue-798-local-embedding-cache-hf/plan.md` passed with 0 findings. OSV not run because no dependency, lockfile, container, vendored, or package-manager surfaces changed. +- Passive security-best-practices: no matching non-web TypeScript/Node reference exists. Applied general secure-default review to env input, filesystem path config, and remote host config; no critical or important issue found. The implementation does not create directories, does not read secrets, and does not change remote-download behavior. +- Focused simplification pass: no simplification applied; the diff is already constrained to the transformers boundary, targeted tests, docs, generated plugin env reference, and task record. +- Final independent review lanes: Test Coverage ACCEPT, Security/Boundary ACCEPT, Maintainability/Integration ACCEPT. No Critical or Important findings. 
+- `codex-security:security-diff-scan` was inspected because the diff touches filesystem/network-adjacent config. It was not run because the skill is a full artifact-producing multi-phase scan; for this narrow config patch, the local security gate used Semgrep plus independent Security/Boundary review instead. No user approval was given to create broad scan artifacts. +- Commit preflight: no `core.hooksPath`, no commit signing config, no `.husky`/`.githooks`; common Git hook dir contains only sample hooks. No staged files before task-owned staging. diff --git a/plugin/skills/agentmemory-config/REFERENCE.md b/plugin/skills/agentmemory-config/REFERENCE.md index 5d622a4fe..d540a2e36 100644 --- a/plugin/skills/agentmemory-config/REFERENCE.md +++ b/plugin/skills/agentmemory-config/REFERENCE.md @@ -3,7 +3,7 @@ Generated by scanning `src/` for `AGENTMEMORY_*` usage. Do not edit the block below by hand; run `corepack pnpm run skills:gen` after adding or removing a variable. Internal markers ending in two underscores are excluded. -Configuration is read from the environment and from `~/.agentmemory/.env` (no `export` prefix). 54 recognized variables: +Configuration is read from the environment and from `~/.agentmemory/.env` (no `export` prefix). 
55 recognized variables: - `AGENTMEMORY_AGENT_SCOPE` - `AGENTMEMORY_ALLOW_AGENT_SDK` @@ -38,6 +38,7 @@ Configuration is read from the environment and from `~/.agentmemory/.env` (no `e - `AGENTMEMORY_IMAGE_STORE_MAX_BYTES` - `AGENTMEMORY_INJECT_CONTEXT` - `AGENTMEMORY_LLM_TIMEOUT_MS` +- `AGENTMEMORY_LOCAL_EMBEDDING_MODEL_DIR` - `AGENTMEMORY_MCP_BLOCK` - `AGENTMEMORY_PREFER_CODEX_SDK` - `AGENTMEMORY_PROBE_TIMEOUT_MS` diff --git a/src/providers/transformers.ts b/src/providers/transformers.ts index 0bd5f219f..2af1e0316 100644 --- a/src/providers/transformers.ts +++ b/src/providers/transformers.ts @@ -1,5 +1,10 @@ +import { getEnvVar } from "../config.js"; + export type TransformersModule = { env?: { + localModelPath?: string | null; + cacheDir?: string | null; + remoteHost?: string; backends?: { onnx?: { wasm?: { @@ -30,7 +35,23 @@ export function configureTransformersForNode( } const wasm = transformers.env?.backends?.onnx?.wasm; - if (!wasm) return; + if (wasm) { + wasm.numThreads = 1; + } + + const env = transformers.env; + if (!env) return; - wasm.numThreads = 1; + const modelDir = getEnvVar("AGENTMEMORY_LOCAL_EMBEDDING_MODEL_DIR")?.trim(); + if (modelDir) { + env.localModelPath = modelDir; + env.cacheDir = modelDir; + } + + const hfEndpoint = getEnvVar("HF_ENDPOINT")?.trim(); + if (hfEndpoint) { + env.remoteHost = hfEndpoint.endsWith("/") + ? 
hfEndpoint + : `${hfEndpoint}/`; + } } diff --git a/test/embedding-provider.test.ts b/test/embedding-provider.test.ts index 95e2d4f6a..0b497ce25 100644 --- a/test/embedding-provider.test.ts +++ b/test/embedding-provider.test.ts @@ -1,5 +1,5 @@ import { describe, it, expect, vi, beforeEach, afterEach } from "vitest"; -import { mkdtempSync, rmSync } from "node:fs"; +import { mkdtempSync, mkdirSync, rmSync, writeFileSync } from "node:fs"; import { tmpdir } from "node:os"; import { join } from "node:path"; import type { EmbeddingProvider } from "../src/types.js"; @@ -12,8 +12,10 @@ const ENV_KEYS = [ "OPENROUTER_API_KEY", "EMBEDDING_PROVIDER", "AGENTMEMORY_EMBEDDING_PROVIDER", + "AGENTMEMORY_LOCAL_EMBEDDING_MODEL_DIR", "LOCAL_EMBEDDING_MODEL", "EMBEDDING_MODEL", + "HF_ENDPOINT", "OPENAI_BASE_URL", "OPENAI_EMBEDDING_BASE_URL", "OPENAI_EMBEDDING_API_KEY", @@ -83,6 +85,28 @@ async function freshTransformersModule() { return await import("../src/providers/transformers.js"); } +function localTransformersModule() { + return { + pipeline: vi.fn(), + env: { + localModelPath: "/models/", + cacheDir: "/cache/", + remoteHost: "https://huggingface.co/", + backends: { + onnx: { + wasm: { numThreads: 4 }, + }, + }, + }, + }; +} + +function writeAgentMemoryEnv(contents: string) { + const dir = join(sandboxHome, ".agentmemory"); + mkdirSync(dir, { recursive: true }); + writeFileSync(join(dir, ".env"), contents); +} + describe("configureTransformersForNode", () => { it("disables threaded ONNX WASM on Node", async () => { const { configureTransformersForNode } = await freshTransformersModule(); @@ -124,6 +148,53 @@ describe("configureTransformersForNode", () => { }, }); }); + + it("configures transformers local model and cache directories from process env", async () => { + process.env["AGENTMEMORY_LOCAL_EMBEDDING_MODEL_DIR"] = + "/opt/agentmemory-models"; + const { configureTransformersForNode } = await freshTransformersModule(); + const transformers = localTransformersModule(); + + 
configureTransformersForNode(transformers); + + expect(transformers.env.localModelPath).toBe("/opt/agentmemory-models"); + expect(transformers.env.cacheDir).toBe("/opt/agentmemory-models"); + expect(transformers.env.backends.onnx.wasm.numThreads).toBe(1); + }); + + it("configures transformers local model and cache directories from the agentmemory env file", async () => { + writeAgentMemoryEnv( + "AGENTMEMORY_LOCAL_EMBEDDING_MODEL_DIR=/tmp/agentmemory-dotenv-models", + ); + vi.doMock("node:os", async () => ({ + ...(await vi.importActual("node:os")), + homedir: () => sandboxHome, + })); + vi.resetModules(); + const { getEnvVar } = await import("../src/config.js"); + expect(getEnvVar("AGENTMEMORY_LOCAL_EMBEDDING_MODEL_DIR")).toBe( + "/tmp/agentmemory-dotenv-models", + ); + const { configureTransformersForNode } = await freshTransformersModule(); + const transformers = localTransformersModule(); + + configureTransformersForNode(transformers); + + expect(transformers.env.localModelPath).toBe( + "/tmp/agentmemory-dotenv-models", + ); + expect(transformers.env.cacheDir).toBe("/tmp/agentmemory-dotenv-models"); + }); + + it("configures transformers remote host from HF_ENDPOINT", async () => { + process.env["HF_ENDPOINT"] = "https://hf-mirror.com"; + const { configureTransformersForNode } = await freshTransformersModule(); + const transformers = localTransformersModule(); + + configureTransformersForNode(transformers); + + expect(transformers.env.remoteHost).toBe("https://hf-mirror.com/"); + }); }); describe("loadTransformers", () => { @@ -168,6 +239,7 @@ afterEach(() => { rmSync(sandboxHome, { recursive: true, force: true }); vi.doUnmock("@xenova/transformers"); vi.doUnmock("../src/providers/transformers.js"); + vi.doUnmock("node:os"); vi.restoreAllMocks(); }); From 5dc2fc9e74424080572bd488d461019999f69e14 Mon Sep 17 00:00:00 2001 From: Willi Budzinski Date: Wed, 17 Jun 2026 19:50:17 +0200 Subject: [PATCH 2/2] docs: record issue 798 verification --- 
 .../2026-06-17-issue-798-local-embedding-cache-hf/todo.md | 4 ++++
 1 file changed, 4 insertions(+)

diff --git a/docs/todos/2026-06-17-issue-798-local-embedding-cache-hf/todo.md b/docs/todos/2026-06-17-issue-798-local-embedding-cache-hf/todo.md
index ac24e7942..ec88a85a1 100644
--- a/docs/todos/2026-06-17-issue-798-local-embedding-cache-hf/todo.md
+++ b/docs/todos/2026-06-17-issue-798-local-embedding-cache-hf/todo.md
@@ -82,3 +82,7 @@ Partially valid/actionable. Issue 917 already closed the model/dimension/offline
 - Final independent review lanes: Test Coverage ACCEPT, Security/Boundary ACCEPT, Maintainability/Integration ACCEPT. No Critical or Important findings.
 - `codex-security:security-diff-scan` was inspected because the diff touches filesystem/network-adjacent config. It was not run because the skill is a full artifact-producing multi-phase scan; for this narrow config patch, the local security gate used Semgrep plus independent Security/Boundary review instead. No user approval was given to create broad scan artifacts.
 - Commit preflight: no `core.hooksPath`, no commit signing config, no `.husky`/`.githooks`; common Git hook dir contains only sample hooks. No staged files before task-owned staging.
+- Staged secret scan: `gitleaks protect --staged --redact` scanned ~22.01 KB and found no leaks.
+- Commit: `abe8c26b` (`feat: configure local embedding cache and mirror`).
+- PR prep base: existing local `refs/remotes/origin/main` at `71eceb085336b6170cfac3ecf22d98d64bf33a35`. No fetch was run, so freshness is unverified. `origin/main` is the merge-base and already an ancestor of HEAD; base merge is a no-op.
+- Push/PR creation: not performed; no current-turn remote-write approval.
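
Review note on the `src/providers/transformers.ts` hunk in PATCH 1/2: the mapping from `AGENTMEMORY_LOCAL_EMBEDDING_MODEL_DIR` and `HF_ENDPOINT` onto the transformers.js `env` fields can be read in isolation as the sketch below. It is a minimal illustration under stated assumptions, not the patched function itself: `TransformersEnvSketch` and `applyLocalEmbeddingEnv` are hypothetical names, and the helper takes a plain record of variables instead of calling the repository's `getEnvVar()`.

```typescript
// Hypothetical stand-in for the subset of the transformers.js `env`
// object that the patch writes to (field names taken from the diff).
type TransformersEnvSketch = {
  localModelPath?: string | null;
  cacheDir?: string | null;
  remoteHost?: string;
};

// Hypothetical helper: mirrors the trimming and trailing-slash
// normalization shown in the diff, but reads from a plain record
// rather than from process env or ~/.agentmemory/.env.
function applyLocalEmbeddingEnv(
  env: TransformersEnvSketch,
  vars: Record<string, string | undefined>,
): TransformersEnvSketch {
  // Blank or whitespace-only values are treated as unset.
  const modelDir = vars["AGENTMEMORY_LOCAL_EMBEDDING_MODEL_DIR"]?.trim();
  if (modelDir) {
    // One directory serves as both local model lookup path and cache.
    env.localModelPath = modelDir;
    env.cacheDir = modelDir;
  }

  const hfEndpoint = vars["HF_ENDPOINT"]?.trim();
  if (hfEndpoint) {
    // Ensure exactly the form "<endpoint>/" when no slash is present.
    env.remoteHost = hfEndpoint.endsWith("/") ? hfEndpoint : `${hfEndpoint}/`;
  }

  return env;
}

// Usage: start from defaults like those in the test fixture.
const sketchEnv = applyLocalEmbeddingEnv(
  {
    localModelPath: "/models/",
    cacheDir: "/cache/",
    remoteHost: "https://huggingface.co/",
  },
  {
    AGENTMEMORY_LOCAL_EMBEDDING_MODEL_DIR: "/opt/agentmemory-models",
    HF_ENDPOINT: "https://hf-mirror.com",
  },
);
console.log(sketchEnv);
```

Passing the variables in as a record keeps the sketch free of filesystem and process state; in the patch itself the same two values come from `getEnvVar()`, and setting `remoteHost` does not by itself enable remote downloads because the local provider keeps `local_files_only: true`.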