Skip to content

feat: make embedding dimensions configurable via EMBEDDINGS_DIMENSIONS#222

Open
syn-zhu wants to merge 2 commits intoagentregistry-dev:mainfrom
syn-zhu:feat/configurable-embedding-dimensions
Open

feat: make embedding dimensions configurable via EMBEDDINGS_DIMENSIONS#222
syn-zhu wants to merge 2 commits intoagentregistry-dev:mainfrom
syn-zhu:feat/configurable-embedding-dimensions

Conversation

@syn-zhu
Copy link
Contributor

@syn-zhu syn-zhu commented Feb 25, 2026

/kind feature

Description

Fixes #221 - the EMBEDDINGS_DIMENSIONS env var exists but is locked to 1536 by a hardcoded validation check, making it impossible to use embedding providers like Voyage AI (1024) or Cohere (1024).

Approach

Instead of per-dimension migration files, this adds startup dimension reconciliation:

  1. Removes the hardcoded != 1536 validation in validate.go
  2. Adds ensureVectorDimensions() that runs after migrations in NewPostgreSQL
  3. It queries pg_attribute.atttypmod to get the current column dimension
  4. If config differs from schema, it atomically (single transaction):
    • Drops HNSW indexes
    • Clears existing embeddings (incompatible with new dimension)
    • ALTERs both servers.semantic_embedding and agents.semantic_embedding to vector(N)
    • Recreates HNSW indexes with vector_cosine_ops

Callers pass the configured dimension (or 0 to skip reconciliation for CLI tools and tests).

Why this approach

  • Zero migration files to maintain per provider
  • Any dimension works, not a whitelist
  • Existing 1536 deployments are unaffected (no-op when dimensions match)
  • Single env var change to switch providers:
    EMBEDDINGS_DIMENSIONS=1024
    EMBEDDINGS_PROVIDER=voyageai
    EMBEDDINGS_MODEL=voyage-3
    

Changes

File Change
config/validate.go Remove hardcoded != 1536 check
database/postgres.go Add embeddingDimensions param + ensureVectorDimensions()
registry_app.go Pass cfg.Embeddings.Dimensions
cli/import.go, cli/export.go Pass 0 (skip reconciliation)
database/testutil.go Pass 0 (tests use schema default)

Test plan

  • Fresh install with default config (1536) - no reconciliation, schema unchanged
  • Fresh install with EMBEDDINGS_DIMENSIONS=1024 - schema reconciled to vector(1024)
  • Existing 1536 deployment, change to 1024 - columns altered, embeddings cleared, indexes recreated
  • Existing 1024 deployment, restart with same config - no-op
  • CLI import/export with EMBEDDINGS_DIMENSIONS=1024 - no reconciliation (passes 0)
  • Embeddings disabled - no reconciliation (dimensions param is 0 when not set)
The EMBEDDINGS_DIMENSIONS env var now actually works. Operators can configure any vector dimension to match their embedding provider (e.g., 1024 for Voyage AI, 1536 for OpenAI). The database schema is automatically reconciled on startup when the dimension changes.

The EMBEDDINGS_DIMENSIONS env var exists but was locked to 1536 by a
hardcoded validation check, making it impossible to use embedding
providers with different dimensions (e.g., Voyage AI at 1024).

Changes:
- Remove the hardcoded 1536 validation in config/validate.go
- Add ensureVectorDimensions() that runs after migrations on startup
- It compares the configured dimension with the actual pgvector column
  type (via pg_attribute.atttypmod) and, if they differ, atomically:
  1. Drops HNSW indexes
  2. Clears existing embeddings (they must be regenerated)
  3. Alters both vector columns to the new dimension
  4. Recreates HNSW indexes
- Pass embeddingDimensions into NewPostgreSQL (0 = skip reconciliation)

This means operators can switch providers by setting a single env var:
  EMBEDDINGS_DIMENSIONS=1024
  EMBEDDINGS_PROVIDER=voyageai
  EMBEDDINGS_MODEL=voyage-3

On next startup the schema is reconciled automatically. Existing
deployments using the default 1536 are unaffected (no-op reconciliation).
@syn-zhu syn-zhu force-pushed the feat/configurable-embedding-dimensions branch from 9b3ad0e to bd6eb98 Compare February 25, 2026 21:45
@peterj peterj requested a review from ilackarms February 26, 2026 00:03
…ensions

Refactor ensureVectorDimensions to eliminate nestif complexity by
extracting per-table logic into focused helper functions. Fix
TestListServersSemanticSearch by adding NewTestDBWithEmbeddings that
creates the semantic_embedding column with the correct dimension for
embedding tests.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Copilot AI review requested due to automatic review settings March 5, 2026 09:58
@syn-zhu syn-zhu requested a review from ilackarms March 5, 2026 10:00
@syn-zhu
Copy link
Contributor Author

syn-zhu commented Mar 5, 2026

@ilackarms had to fix a few tests

Copy link
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Adds support for configurable pgvector embedding dimensions via EMBEDDINGS_DIMENSIONS by removing the prior 1536-only validation and introducing startup-time schema reconciliation to create/alter the semantic_embedding vector columns and HNSW indexes.

Changes:

  • Remove hardcoded != 1536 validation for embeddings dimensions.
  • Update initial migration to omit the semantic_embedding vector columns/indexes and instead create/reconcile them at startup (ensureVectorDimensions).
  • Thread an embeddingDimensions parameter through NewPostgreSQL call sites; add test helper NewTestDBWithEmbeddings.

Reviewed changes

Copilot reviewed 9 out of 9 changed files in this pull request and generated 6 comments.

Show a summary per file
File Description
internal/registry/config/validate.go Removes the 1536-only config validation to allow arbitrary dimensions.
internal/registry/database/migrations/001_initial_schema.sql Stops hardcoding vector(1536) + HNSW indexes in SQL; keeps metadata columns.
internal/registry/database/postgres.go Adds embeddingDimensions param and startup reconciliation that creates/alters vector columns + indexes.
internal/registry/registry_app.go Passes configured embedding dimensions into NewPostgreSQL.
internal/cli/import.go Updates NewPostgreSQL call to pass 0 (skip reconciliation).
internal/cli/export.go Updates NewPostgreSQL call to pass 0 (skip reconciliation).
internal/registry/database/testutil.go Adds NewTestDBWithEmbeddings helper and updates calls for new signature.
internal/registry/api/handlers/v0/servers_test.go Updates semantic-search test to use NewTestDBWithEmbeddings.
internal/registry/api/handlers/v0/servers.go Minor cleanup of tags variable initialization.

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Comment on lines +129 to +147
if columnExists && currentDim == dimensions {
return nil // Already correct, nothing to do.
}

tx, err := conn.Begin(ctx)
if err != nil {
return fmt.Errorf("failed to begin transaction: %w", err)
}
defer func() {
if err := tx.Rollback(ctx); err != nil && !errors.Is(err, pgx.ErrTxClosed) {
log.Printf("Failed to rollback dimension migration: %v", err)
}
}()

for _, table := range []string{"servers", "agents"} {
if err := reconcileTableEmbedding(ctx, tx, table, columnExists, currentDim, dimensions); err != nil {
return err
}
}
Copy link

Copilot AI Mar 5, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Existence/dimension detection is only done against public.servers.semantic_embedding, but the result is applied to both servers and agents. If the two tables ever get out of sync (e.g., one has the column and the other doesn’t, or dimensions differ), this can either no-op incorrectly or attempt an ALTER on a missing column. Consider checking existence + current dimension per table (or at least verifying both tables match before returning early).

Copilot uses AI. Check for mistakes.
if _, err := tx.Exec(ctx, fmt.Sprintf("DROP INDEX IF EXISTS %s", idx)); err != nil {
return fmt.Errorf("failed to drop index %s: %w", idx, err)
}
if _, err := tx.Exec(ctx, fmt.Sprintf("UPDATE %s SET semantic_embedding = NULL", table)); err != nil {
Copy link

Copilot AI Mar 5, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

UPDATE %s SET semantic_embedding = NULL will rewrite every row, even when semantic_embedding is already NULL, causing unnecessary bloat and potentially long startup times on large tables. Restrict this to rows that actually have embeddings (e.g., WHERE semantic_embedding IS NOT NULL) or use an ALTER COLUMN ... TYPE ... USING NULL approach to avoid a full-table update step.

Suggested change
if _, err := tx.Exec(ctx, fmt.Sprintf("UPDATE %s SET semantic_embedding = NULL", table)); err != nil {
if _, err := tx.Exec(ctx, fmt.Sprintf("UPDATE %s SET semantic_embedding = NULL WHERE semantic_embedding IS NOT NULL", table)); err != nil {

Copilot uses AI. Check for mistakes.
}
} else {
baseDB, err := internaldb.NewPostgreSQL(ctx, cfg.DatabaseURL, authz)
baseDB, err := internaldb.NewPostgreSQL(ctx, cfg.DatabaseURL, authz, cfg.Embeddings.Dimensions)
Copy link

Copilot AI Mar 5, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

NewPostgreSQL expects embeddingDimensions to be 0 when embeddings are disabled (per its docstring), but this call always passes cfg.Embeddings.Dimensions (default 1536 even when Embeddings.Enabled is false). That means fresh installs will always create the vector columns/indexes and could trigger reconciliation even when embeddings are disabled. Consider passing 0 unless cfg.Embeddings.Enabled is true (or introducing an explicit “reconcile vectors” knob).

Suggested change
baseDB, err := internaldb.NewPostgreSQL(ctx, cfg.DatabaseURL, authz, cfg.Embeddings.Dimensions)
embeddingDimensions := 0
if cfg.Embeddings.Enabled {
embeddingDimensions = cfg.Embeddings.Dimensions
}
baseDB, err := internaldb.NewPostgreSQL(ctx, cfg.DatabaseURL, authz, embeddingDimensions)

Copilot uses AI. Check for mistakes.
authz := auth.Authorizer{Authz: nil}

db, err := database.NewPostgreSQL(ctx, cfg.DatabaseURL, authz)
db, err := database.NewPostgreSQL(ctx, cfg.DatabaseURL, authz, 0)
Copy link

Copilot AI Mar 5, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This CLI path passes embeddingDimensions=0, which skips creating the semantic_embedding vector columns. If --generate-embeddings is enabled, the importer later calls UpsertServerEmbedding/SetServerEmbedding which updates semantic_embedding and will fail on a fresh schema (the vector column is no longer created in the initial migration). Consider passing cfg.Embeddings.Dimensions when embeddings generation is requested (or otherwise ensuring ensureVectorDimensions runs before storing embeddings).

Suggested change
db, err := database.NewPostgreSQL(ctx, cfg.DatabaseURL, authz, 0)
embeddingDimensions := 0
if importGenerateEmbeddings {
embeddingDimensions = cfg.Embeddings.Dimensions
}
db, err := database.NewPostgreSQL(ctx, cfg.DatabaseURL, authz, embeddingDimensions)

Copilot uses AI. Check for mistakes.
Comment on lines +86 to +94
// After migrations, ensure the semantic_embedding vector columns exist with
// the configured dimension. The initial schema deliberately omits these
// columns so the dimension is never hardcoded in SQL. This function creates
// them on first run and reconciles the dimension if it changes later.
if embeddingDimensions > 0 {
if err := ensureVectorDimensions(ctx, conn.Conn(), embeddingDimensions); err != nil {
return nil, fmt.Errorf("failed to reconcile vector dimensions: %w", err)
}
}
Copy link

Copilot AI Mar 5, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The new startup reconciliation path (ensureVectorDimensions invoked from NewPostgreSQL) introduces non-trivial schema mutation behavior (create vs alter, index drop/recreate, clearing embeddings). There don’t appear to be database tests covering these branches yet; adding integration tests for (1) fresh DB creates columns/indexes, (2) dimension change alters both tables and recreates indexes, and (3) no-op when already correct would reduce regression risk.

Copilot uses AI. Check for mistakes.
Comment on lines +122 to +130
AND a.atttypmod > 0
`).Scan(&currentDim)
if err != nil {
// Column doesn't exist yet (fresh install or pre-embeddings schema).
columnExists = false
}

if columnExists && currentDim == dimensions {
return nil // Already correct, nothing to do.
Copy link

Copilot AI Mar 5, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ensureVectorDimensions treats any error from the pg_attribute query as “column doesn’t exist”. That can mask real failures (e.g., permission issues) and then attempts to ADD COLUMN, which may fail or do the wrong thing. Consider handling pgx.ErrNoRows as the “doesn’t exist” case and returning the error for anything else (and also handle the case where the column exists but has no typmod / atttypmod <= 0).

Suggested change
AND a.atttypmod > 0
`).Scan(&currentDim)
if err != nil {
// Column doesn't exist yet (fresh install or pre-embeddings schema).
columnExists = false
}
if columnExists && currentDim == dimensions {
return nil // Already correct, nothing to do.
`).Scan(&currentDim)
if err != nil {
if errors.Is(err, pgx.ErrNoRows) {
// Column doesn't exist yet (fresh install or pre-embeddings schema).
columnExists = false
} else {
return fmt.Errorf("failed to query semantic_embedding column: %w", err)
}
}
if columnExists {
// Column exists but has no valid typmod / dimension; this is an unexpected state
// that we can't safely reconcile automatically.
if currentDim <= 0 {
return fmt.Errorf("semantic_embedding column exists but has no valid dimension typmod")
}
if currentDim == dimensions {
return nil // Already correct, nothing to do.
}

Copilot uses AI. Check for mistakes.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

EMBEDDINGS_DIMENSIONS env var is locked to 1536, cannot use other embedding providers

3 participants