Open
32 commits
1032e24
chore: add Phase 11 (Local/Embedded Client) to roadmap for #111
oss-amikos Apr 1, 2026
b3e2cd8
docs(04): capture phase context for embedding ecosystem
oss-amikos Apr 1, 2026
3680a0c
docs(state): record phase 04 context session
oss-amikos Apr 1, 2026
2924b6b
docs(04): research phase embedding ecosystem
oss-amikos Apr 1, 2026
3e742d5
docs(04): add research and validation strategy for embedding ecosystem
oss-amikos Apr 1, 2026
0958799
docs(04): create phase plan for embedding ecosystem
oss-amikos Apr 1, 2026
98aeb07
docs(04): fix validation map to match actual plan numbers
oss-amikos Apr 1, 2026
53e4fc2
feat(04-01): add sparse/content embedding interfaces, content types, …
oss-amikos Apr 1, 2026
303a717
feat(04-02): add RerankingFunction interface and Cohere/Jina providers
oss-amikos Apr 1, 2026
4d0aeac
test(04-01): add unit tests for sparse/content embedding interfaces a…
oss-amikos Apr 1, 2026
b94c9ee
test(04-02): add WireMock tests for reranking providers
oss-amikos Apr 1, 2026
e55f6bf
docs(04-01): complete embedding foundation interfaces plan
oss-amikos Apr 1, 2026
366aeb0
feat(04-03): add Gemini, Bedrock, and Voyage dense embedding providers
oss-amikos Apr 1, 2026
05d0c78
docs(04-02): complete reranking interface and providers plan
oss-amikos Apr 1, 2026
1ff4825
test(04-03): add unit tests for Gemini, Bedrock, and Voyage providers
oss-amikos Apr 1, 2026
afa05d0
docs(04-03): complete dense embedding providers plan
oss-amikos Apr 1, 2026
35ae747
Merge branch 'worktree-agent-a5764d7f' into feature/phase-04-embeddin…
oss-amikos Apr 1, 2026
fa5c607
Merge branch 'worktree-agent-a5c33608' into feature/phase-04-embeddin…
oss-amikos Apr 1, 2026
98be4d5
Merge branch 'worktree-agent-a15dd46d' into feature/phase-04-embeddin…
oss-amikos Apr 1, 2026
05e14ec
feat(04-04): implement BM25 sparse embedding pipeline
oss-amikos Apr 1, 2026
747f5b4
test(04-04): add ChromaCloudSplade provider and unit tests for BM25 +…
oss-amikos Apr 1, 2026
180339b
docs(04-04): complete BM25 and ChromaCloudSplade sparse providers plan
oss-amikos Apr 1, 2026
850f070
test(04-05): add failing tests for EmbeddingFunctionRegistry
oss-amikos Apr 1, 2026
e99e1f3
feat(04-05): implement EmbeddingFunctionRegistry with 3 factory maps
oss-amikos Apr 1, 2026
2c65802
docs(04-05): complete EmbeddingFunctionRegistry plan
oss-amikos Apr 1, 2026
eab740f
fix(04-05): restore backward-compatible error messages in EmbeddingFu…
oss-amikos Apr 1, 2026
24e1aa4
test(04): complete UAT - 9 passed, 0 issues
oss-amikos Apr 1, 2026
9da04f1
Fix embedding ecosystem review findings
oss-amikos Apr 1, 2026
cd77946
Fix second-round embedding review issues
oss-amikos Apr 2, 2026
4c0c3f4
Fix Bedrock and Gemini embedding consistency
oss-amikos Apr 2, 2026
16326d2
Harden embedding response parsing edge cases
oss-amikos Apr 2, 2026
5b9509d
fix(ci): relax error message assertion in cloud parity test
oss-amikos Apr 2, 2026
8 changes: 4 additions & 4 deletions .planning/REQUIREMENTS.md
@@ -27,9 +27,9 @@ Requirements for the current milestone. Each maps to roadmap phases.

 ### Embedding Ecosystem

-- [ ] **EMB-05**: User can use sparse embedding functions (BM25, Chroma Cloud Splade) through a `SparseEmbeddingFunction` interface.
+- [x] **EMB-05**: User can use sparse embedding functions (BM25, Chroma Cloud Splade) through a `SparseEmbeddingFunction` interface.
 - [ ] **EMB-06**: User can use multimodal embedding functions (image+text) through a `MultimodalEmbeddingFunction` interface.
-- [ ] **EMB-07**: User can use at least 3 additional dense embedding providers (Gemini, Bedrock, Voyage prioritized) through the existing `EmbeddingFunction` contract.
+- [x] **EMB-07**: User can use at least 3 additional dense embedding providers (Gemini, Bedrock, Voyage prioritized) through the existing `EmbeddingFunction` contract.
 - [ ] **EMB-08**: User can rely on an `EmbeddingFunctionRegistry` to auto-wire embedding functions from server-side collection configuration.
 - [ ] **RERANK-01**: User can rerank query results using a `RerankingFunction` interface with at least one provider (Cohere or Jina).

@@ -66,9 +66,9 @@ Deferred to future milestones.
 | SEARCH-02 | Phase 3 | Complete |
 | SEARCH-03 | Phase 3 | Complete |
 | SEARCH-04 | Phase 3 | Complete |
-| EMB-05 | Phase 4 | Pending |
+| EMB-05 | Phase 4 | Complete |
 | EMB-06 | Phase 4 | Pending |
-| EMB-07 | Phase 4 | Pending |
+| EMB-07 | Phase 4 | Complete |
 | EMB-08 | Phase 4 | Pending |
 | RERANK-01 | Phase 4 | Pending |
 | CLOUD-01 | Phase 5 | Complete |
33 changes: 29 additions & 4 deletions .planning/ROADMAP.md
@@ -22,6 +22,7 @@ Decimal phases appear between their surrounding integers in numeric order.
 - [ ] **Phase 8: API DX Improvements** — Add Consumer lambda overloads for collection creation and Schema convenience factories (#143, #144).
 - [ ] **Phase 9: Logging Bridges** — Implement SLF4J and JUL bridges for ChromaLogger (#141, #142).
 - [ ] **Phase 10: Documentation Update** — Refresh docs site with DX improvements, logging bridges, and any API changes from Phases 8-9.
+- [ ] **Phase 11: Local/Embedded Client** — Add local/embedded client mode with JNI/JNA bindings or managed server lifecycle (#111).

 ## Phase Details

@@ -76,17 +77,24 @@ Plans:
 - [x] 03-03-PLAN.md — Create unit tests, integration tests, and update PublicInterfaceCompatibilityTest

 ### Phase 4: Embedding Ecosystem
-**Goal:** Expand the embedding ecosystem with sparse/multimodal interfaces, reranking functions, additional providers, and an auto-wiring registry.
+**Goal:** Expand the embedding ecosystem with sparse/content interfaces, reranking functions, additional dense providers, and an auto-wiring registry.
 **Depends on:** Nothing (independent of Phases 1-3)
 **Requirements:** [EMB-05, EMB-06, EMB-07, EMB-08, RERANK-01]
 **Issues:** #106, #107, #108, #109
 **Success Criteria** (what must be TRUE):
-1. SparseEmbeddingFunction and MultimodalEmbeddingFunction interfaces exist with at least one provider each.
+1. SparseEmbeddingFunction and ContentEmbeddingFunction interfaces exist with at least one provider each.
 2. RerankingFunction interface exists with at least one provider (Cohere or Jina).
 3. At least 3 new dense embedding providers implemented (prioritize Gemini, Bedrock, Voyage).
 4. EmbeddingFunctionRegistry supports registering and auto-wiring providers from server-side collection config.
 5. All providers have unit tests; integration tests where API keys are available.
-**Plans:** TBD
+**Plans:** 5 plans
+
+Plans:
+- [x] 04-01-PLAN.md — Sparse/Content interfaces, content value types, and bidirectional adapters
+- [x] 04-02-PLAN.md — RerankingFunction interface with Cohere and Jina providers
+- [ ] 04-03-PLAN.md — Dense providers: Gemini, Bedrock, Voyage with Maven deps
+- [ ] 04-04-PLAN.md — BM25 and ChromaCloudSplade sparse providers
+- [ ] 04-05-PLAN.md — EmbeddingFunctionRegistry with auto-wiring and ChromaHttpCollection integration

 ### Phase 5: Cloud Integration Testing
 **Goal:** Build deterministic cloud parity test suites that validate search, schema/index, and array metadata behavior against Chroma Cloud.
@@ -116,7 +124,7 @@ Phase 4 can execute in parallel with Phases 1-3 (independent).
 | 1. Result Ergonomics & WhereDocument | 2/3 | In Progress| |
 | 2. Collection API Extensions | 2/2 | Complete | 2026-03-21 |
 | 3. Search API | 3/3 | Complete | 2026-03-22 |
-| 4. Embedding Ecosystem | 0/TBD | Pending | — |
+| 4. Embedding Ecosystem | 0/5 | Planned | — |
 | 5. Cloud Integration Testing | 2/3 | In Progress| |

 ### Phase 6: Documentation Site
@@ -195,3 +203,20 @@ Plans:

 Plans:
 - [ ] TBD (run /gsd:plan-phase 10 to break down)
+
+### Phase 11: Local/Embedded Client
+
+**Goal:** Add a local/embedded client mode that runs Chroma without requiring a separate server, similar to Go client's `NewLocalClient`.
+**Depends on:** Nothing (independent — can be developed in parallel with other phases)
+**Requirements:** TBD
+**Issues:** #111
+**Success Criteria** (what must be TRUE):
+1. `ChromaClient.local()` builder API exists with `persistDirectory` configuration.
+2. At least one runtime mode works (JNI/JNA embedded or managed server lifecycle).
+3. Persistence to disk supported with configurable path.
+4. Unit and integration tests verify local client CRUD operations match server client behavior.
+5. Graceful lifecycle management (startup, shutdown, cleanup).
+**Plans:** 0 plans
+
+Plans:
+- [ ] TBD (run /gsd:plan-phase 11 to break down)
23 changes: 15 additions & 8 deletions .planning/STATE.md
@@ -2,14 +2,14 @@
 gsd_state_version: 1.0
 milestone: v1.5
 milestone_name: milestone
-status: "Phase 06 shipped — PR #145"
-stopped_at: "Completed 06-04 Task 1; checkpoint:human-verify at Task 2"
-last_updated: "2026-04-01T10:06:39.889Z"
+status: "Phase 04 shipped — PR #146"
+stopped_at: Completed 04-04-PLAN.md (BM25 & ChromaCloudSplade sparse providers)
+last_updated: "2026-04-02T07:03:38.118Z"
 progress:
 total_phases: 14
-completed_phases: 12
-total_plans: 31
-completed_plans: 31
+completed_phases: 13
+total_plans: 36
+completed_plans: 36
 ---

 # Project State
@@ -74,6 +74,8 @@ Plan: Not started
 | Phase 06-documentation-site P03 | 7 | 2 tasks | 11 files |
 | Phase 06-documentation-site P02 | 4 | 2 tasks | 12 files |
 | Phase 06-documentation-site P04 | 5 | 1 tasks | 9 files |
+| Phase 04-embedding-ecosystem P03 | 8min | 2 tasks | 9 files |
+| Phase 04-embedding-ecosystem PP04 | 6min | 2 tasks | 12 files |

 ## Accumulated Context

@@ -155,6 +157,11 @@ Recent decisions affecting current work:
 - [Phase 06-documentation-site]: All guide pages use --8<-- named section snippet inclusions (no inline copy-pasted code blocks) per D-09
 - [Phase 06-documentation-site]: Examples stubs use 'coming soon' admonition with link to relevant guide page — Phase 7 fills content without touching nav config
 - [Phase 06-documentation-site]: mkdocs.yml Examples nav uses section syntax with java-examples/index.md as section index per navigation.indexes feature
+- [Phase 04-embedding-ecosystem]: Jackson version aligned to 2.17.2 via dependencyManagement to resolve nd4j/GenAI SDK conflict
+- [Phase 04-embedding-ecosystem]: Voyage WireMock tests use WithParam.baseAPI() constructor injection instead of static field reflection
+- [Phase 04-embedding-ecosystem]: Gemini/Bedrock use lazy double-checked locking for SDK client init to avoid load at construction time
+- [Phase 04-embedding-ecosystem]: englishStemmer class name is lowercase in snowball-stemmer 1.3.0.581.1
+- [Phase 04-embedding-ecosystem]: BM25StopWords contains 179 NLTK English stop words (not 174); ChromaCloudSplade uses Bearer token auth

 ### Roadmap Evolution

@@ -172,6 +179,6 @@ None.

 ## Session Continuity

-Last session: 2026-03-24T15:42:20.817Z
-Stopped at: Completed 06-04 Task 1; checkpoint:human-verify at Task 2
+Last session: 2026-04-01T12:59:42.478Z
+Stopped at: Completed 04-04-PLAN.md (BM25 & ChromaCloudSplade sparse providers)
 Resume file: None
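
Reviewer note: the STATE.md decision log in this diff records that the Gemini and Bedrock providers use lazy double-checked locking so the SDK client is not loaded at construction time. The provider sources are not part of this view, so the following is only a minimal sketch of that general pattern, not the PR's code. `LazyClient`, its `factory` field, and `LazyClientDemo` are hypothetical names introduced for illustration; a plain `Supplier` stands in for whatever builds the real SDK client.

```java
import java.util.concurrent.atomic.AtomicInteger;
import java.util.function.Supplier;

/** Hypothetical holder that builds its value on first use, at most once. */
final class LazyClient<T> {
    private final Supplier<T> factory;
    // volatile ensures other threads only ever see a fully constructed instance.
    private volatile T instance;

    LazyClient(Supplier<T> factory) {
        this.factory = factory; // nothing is created here
    }

    T get() {
        T local = instance;           // first check, without locking
        if (local == null) {
            synchronized (this) {
                local = instance;     // second check, under the lock
                if (local == null) {
                    local = factory.get();
                    instance = local;
                }
            }
        }
        return local;
    }
}

final class LazyClientDemo {
    public static void main(String[] args) {
        AtomicInteger created = new AtomicInteger();
        // Assumption: the string stands in for an expensive SDK client object.
        LazyClient<String> lazy =
                new LazyClient<>(() -> "client-" + created.incrementAndGet());

        System.out.println("created before first use: " + created.get());
        lazy.get();
        lazy.get();
        System.out.println("created after two calls: " + created.get());
    }
}
```

The unlocked first read keeps the common path cheap once the client exists, and the second read under the lock stops two threads that both saw `null` from each running the factory. This sketch assumes the factory never returns `null`; if it did, the factory would run again on the next call.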