feat(about_app): catalog entries for embedding provider selection (#2583 follow-up) #2656

Open: justinhsu1477 wants to merge 1 commit into tinyhumansai:main from justinhsu1477:feat/about-app-embedding-providers
@justinhsu1477 justinhsu1477 commented May 26, 2026

Closes the explicit follow-up listed in #2583's PR body:

Follow-up: about_app capability catalog entries for embeddings provider selection

#2583 added Settings > AI > Embeddings — users can now pick between managed cloud (Voyage), OpenAI, Cohere, local Ollama, or a custom OpenAI-compatible endpoint, with per-provider API key + model + dimension knobs. The in-app feature catalog (`about_app::catalog`) is how those affordances become discoverable to:

  • Users via Settings search
  • Agents via `openhuman.about_app_capabilities` RPC
  • The Privacy surface via per-capability `privacy` metadata

Without an entry, the new panel is invisible to that whole layer.

What's added

Two capability records under the new `embeddings` domain, slotted into the Intelligence category so they sit alongside `memory_tree_retrieval` / `mcp_server` rather than getting orphaned in a fresh domain bucket.

| ID | Privacy | What it covers |
|---|---|---|
| `intelligence.embedding_provider_config` | `LOCAL_CREDENTIALS` (`leaves_device = false`) | The Settings panel itself: provider selection, API key entry (encrypted local keyring under `embeddings:<slug>`), model / dimension knobs. |
| `intelligence.embedding_provider_test` | `DERIVED_TO_BACKEND` (`leaves_device = true`) | The "Test Connection" action: fires a small probe at the configured provider before committing it to ingestion. |
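
For readers without `catalog.rs` open, here is a minimal sketch of what the two records could look like. Only the ids, the `embeddings` domain, the two privacy labels, and the `leaves_device` values come from this PR. The struct layout, the `Credentials` data kind, the destination strings, the `"intelligence"` category value, the exact `how_to` text, and the `find` helper are assumptions made for illustration.

```rust
// Hypothetical, simplified shapes. The real definitions live in
// src/openhuman/about_app/catalog.rs and may differ.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum PrivacyDataKind {
    Credentials, // assumed variant name
    Derived,
}

#[derive(Debug, Clone, Copy)]
pub struct CapabilityPrivacy {
    pub leaves_device: bool,
    pub data_kind: PrivacyDataKind,
    pub destinations: &'static [&'static str],
}

#[derive(Debug, Clone, Copy)]
pub struct Capability {
    pub id: &'static str,
    pub domain: &'static str,
    pub category: &'static str,
    pub how_to: &'static str,
    pub privacy: Option<CapabilityPrivacy>,
}

// API keys stay in the local keyring, so nothing leaves the device.
pub const LOCAL_CREDENTIALS: Option<CapabilityPrivacy> = Some(CapabilityPrivacy {
    leaves_device: false,
    data_kind: PrivacyDataKind::Credentials,
    destinations: &[],
});

// The probe payload is sent off-device (destination string is assumed).
pub const DERIVED_TO_BACKEND: Option<CapabilityPrivacy> = Some(CapabilityPrivacy {
    leaves_device: true,
    data_kind: PrivacyDataKind::Derived,
    destinations: &["OpenHuman backend"],
});

pub const EMBEDDING_CAPABILITIES: &[Capability] = &[
    Capability {
        id: "intelligence.embedding_provider_config",
        domain: "embeddings",
        category: "intelligence",
        how_to: "Settings > AI > Embeddings",
        privacy: LOCAL_CREDENTIALS,
    },
    Capability {
        id: "intelligence.embedding_provider_test",
        domain: "embeddings",
        category: "intelligence",
        how_to: "Settings > AI > Embeddings > Test Connection",
        privacy: DERIVED_TO_BACKEND,
    },
];

/// Looks up a capability record by its id.
pub fn find(id: &str) -> Option<&'static Capability> {
    EMBEDDING_CAPABILITIES.iter().find(|c| c.id == id)
}
```

Keeping the privacy labels as shared constants means each record only names a label, and the egress details are defined once.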

The privacy split matters: the in-app Privacy surface aggregates per-capability annotations to answer "what leaves my computer when I touch this screen?". Collapsing both into one annotation under-reports the probe-time network call.
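
To make the under-reporting concrete, here is a rough sketch of how a per-screen aggregation might work. The `PrivacyNote` and `ScreenPrivacy` types, the `aggregate` function, and the destination string are hypothetical; the real Privacy surface may combine annotations differently.

```rust
use std::collections::BTreeSet;

// Hypothetical per-capability annotation, reduced to the two fields
// that matter for the "what leaves my computer?" question.
#[derive(Debug, Clone, Copy)]
pub struct PrivacyNote {
    pub leaves_device: bool,
    pub destinations: &'static [&'static str],
}

// Hypothetical per-screen summary built from all capability annotations.
#[derive(Debug, PartialEq, Eq)]
pub struct ScreenPrivacy {
    pub leaves_device: bool,
    pub destinations: BTreeSet<&'static str>,
}

/// A screen leaks if any of its capabilities leaks; destinations are unioned.
pub fn aggregate(notes: &[PrivacyNote]) -> ScreenPrivacy {
    let mut leaves_device = false;
    let mut destinations = BTreeSet::new();
    for note in notes {
        leaves_device |= note.leaves_device;
        destinations.extend(note.destinations.iter().copied());
    }
    ScreenPrivacy { leaves_device, destinations }
}

// Config panel: credentials stay in the local keyring.
pub const CONFIG: PrivacyNote = PrivacyNote {
    leaves_device: false,
    destinations: &[],
};

// "Test Connection": a probe is sent off-device (destination is assumed).
pub const PROBE: PrivacyNote = PrivacyNote {
    leaves_device: true,
    destinations: &["OpenHuman backend"],
};
```

Under these assumptions, `aggregate(&[CONFIG, PROBE])` reports that data leaves the device, while a catalog that registered only `CONFIG` for the panel would report nothing leaving at all.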

Tests

3 tests added or extended in `src/openhuman/about_app/catalog_tests.rs`; all pass (23/23):

  • `catalog_includes_additional_user_facing_surfaces` — extended with both new ids.
  • `embedding_provider_capabilities_share_domain_and_category` — pins both to `domain = "embeddings"`, asserts same category, and pins the `how_to` breadcrumbs at "Settings > … > Embeddings" so a UI rename surfaces here.
  • `embedding_provider_capabilities_split_privacy_correctly` — asserts config has `leaves_device == false` and test has `leaves_device == true`. Defends against an accidental consolidation that would flatten the two privacy signals.
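
The two new regression tests could look roughly like the sketch below. It uses a flattened, hypothetical `Capability` struct and a stand-in `CAPABILITIES` slice rather than the real catalog types, and the way the breadcrumb is pinned (`starts_with` / `contains`) is an assumption, as are the category value and `how_to` strings.

```rust
// Hypothetical, flattened record. The real catalog nests privacy metadata.
pub struct Capability {
    pub id: &'static str,
    pub domain: &'static str,
    pub category: &'static str,
    pub how_to: &'static str,
    pub leaves_device: bool,
}

// Stand-in for the real catalog array.
pub const CAPABILITIES: &[Capability] = &[
    Capability {
        id: "intelligence.embedding_provider_config",
        domain: "embeddings",
        category: "intelligence",
        how_to: "Settings > AI > Embeddings",
        leaves_device: false,
    },
    Capability {
        id: "intelligence.embedding_provider_test",
        domain: "embeddings",
        category: "intelligence",
        how_to: "Settings > AI > Embeddings > Test Connection",
        leaves_device: true,
    },
];

/// Fetches a record by id, panicking if it was never registered.
pub fn capability(id: &str) -> &'static Capability {
    CAPABILITIES
        .iter()
        .find(|c| c.id == id)
        .expect("capability should be registered")
}

/// True when the breadcrumb still leads to the Embeddings panel.
pub fn points_at_embeddings_panel(how_to: &str) -> bool {
    how_to.starts_with("Settings") && how_to.contains("Embeddings")
}

#[cfg(test)]
mod catalog_tests {
    use super::*;

    #[test]
    fn embedding_provider_capabilities_share_domain_and_category() {
        let cfg = capability("intelligence.embedding_provider_config");
        let probe = capability("intelligence.embedding_provider_test");
        assert_eq!(cfg.domain, "embeddings");
        assert_eq!(probe.domain, "embeddings");
        assert_eq!(cfg.category, probe.category);
        assert!(points_at_embeddings_panel(cfg.how_to));
        assert!(points_at_embeddings_panel(probe.how_to));
    }

    #[test]
    fn embedding_provider_capabilities_split_privacy_correctly() {
        // Config stays local; the probe is the only flow that leaves the device.
        assert!(!capability("intelligence.embedding_provider_config").leaves_device);
        assert!(capability("intelligence.embedding_provider_test").leaves_device);
    }
}
```

Asserting the two `leaves_device` values separately is what catches a later merge of the records into a single annotation.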

Test plan

Pure metadata addition — no production-code change beyond two new const records and a regression test trio.

Refs #2583.

Summary by CodeRabbit

  • New Features

    • Added embedding provider configuration with local credential storage capability
    • Added embedding provider testing capability for pre-ingestion validation
  • Tests

    • Added verification tests for embedding provider capabilities and privacy protections

…nyhumansai#2583 follow-up)

Closes the explicit follow-up listed in the tinyhumansai#2583 PR body:
> Follow-up: about_app capability catalog entries for embeddings provider
> selection

tinyhumansai#2583 added Settings > AI > Embeddings — users can now pick between
managed cloud (Voyage), OpenAI, Cohere, local Ollama, or a custom
OpenAI-compatible endpoint, with per-provider API key, model, and
dimension knobs. The in-app feature catalog (about_app::catalog) is
how those affordances become discoverable to the user (via Settings
search), to other agents (via `openhuman.about_app_capabilities`), and
to the in-app Privacy surface (via per-capability `privacy` metadata).
Without an entry, the new panel is invisible to that whole layer.

Two new capability records under the `embeddings` domain, slotted in
the Intelligence category so they sit alongside `memory_tree_retrieval`
/ `mcp_server` rather than getting orphaned in a fresh domain bucket:

  * `intelligence.embedding_provider_config` — the Settings panel
    itself. Privacy: `LOCAL_CREDENTIALS` (API keys are written to the
    local keyring under `embeddings:<slug>` and never leave the
    device).
  * `intelligence.embedding_provider_test` — the "Test Connection"
    action. Privacy: `DERIVED_TO_BACKEND` (a small probe payload is
    sent to whichever provider is selected; default = OpenHuman
    backend / TinyHumans Neocortex via Voyage).

The split (config = local credentials, test = derived-to-backend)
matters because the Privacy surface in-app aggregates per-capability
annotations to answer "what leaves my computer when I touch this
screen?". Collapsing both into a single annotation under-reports one
of the two flows.

Tests (3 new in `about_app::catalog::tests`, 23/23 pass):

  * `catalog_includes_additional_user_facing_surfaces` extended with
    both new ids.
  * `embedding_provider_capabilities_share_domain_and_category` pins
    both to `domain = "embeddings"`, asserts same category, and pins
    the `how_to` breadcrumbs at "Settings > … > Embeddings" so a UI
    move surfaces here.
  * `embedding_provider_capabilities_split_privacy_correctly` asserts
    config has `leaves_device == false` and test has
    `leaves_device == true` — defends against an accidental
    consolidation that would flatten the two privacy signals.

No production-code changes — pure metadata addition. `cargo check
--lib` + `cargo fmt --check` + `cargo test --tests --no-run` all
clean.
@justinhsu1477 justinhsu1477 requested a review from a team May 26, 2026 02:26
coderabbitai Bot commented May 26, 2026

📝 Walkthrough

This PR adds two embedding provider capabilities (`intelligence.embedding_provider_config` and `intelligence.embedding_provider_test`) to the capability catalog with metadata, privacy labels, and test coverage verifying their registration and privacy behavior.

Changes

Embedding Provider Capabilities

| Layer / File(s) | Summary |
|---|---|
| Embedding provider capabilities with test verification (`src/openhuman/about_app/catalog.rs`, `src/openhuman/about_app/catalog_tests.rs`) | Two new capabilities added to the `CAPABILITIES` array: config (local credentials, no device egress) and test (derived data, backend egress). Tests verify both capabilities are registered, share the same domain and category, and have correct privacy annotations. |

Estimated code review effort

🎯 2 (Simple) | ⏱️ ~8 minutes

Suggested labels

feature, rust-core, working

Poem

🐰 Two embedding minds now dwell in the catalog fair,
One keeps secrets local with careful care,
The other tests the waters, sends probes to the sky—
Privacy split between them, a harmonious tie! ✨

🚥 Pre-merge checks | ✅ 5
✅ Passed checks (5 passed)
| Check name | Status | Explanation |
|---|---|---|
| Description Check | ✅ Passed | Check skipped - CodeRabbit’s high-level summary is enabled. |
| Title check | ✅ Passed | The title accurately summarizes the main change: adding catalog entries for embedding provider selection to the about_app module, with clear reference to the parent issue (#2583 follow-up). |
| Docstring Coverage | ✅ Passed | Docstring coverage is 100.00% which is sufficient. The required threshold is 80.00%. |
| Linked Issues check | ✅ Passed | Check skipped because no linked issues were found for this pull request. |
| Out of Scope Changes check | ✅ Passed | Check skipped because no linked issues were found for this pull request. |

@coderabbitai coderabbitai Bot added the feature (Net-new user-facing capability or product behavior), rust-core (Core Rust runtime in src/: CLI, core_server, shared infrastructure), and working (A PR that is being worked on by the team) labels May 26, 2026
@coderabbitai coderabbitai Bot left a comment

Actionable comments posted: 1

🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@src/openhuman/about_app/catalog.rs`:
- Around line 332-339: The privacy destinations currently set to
DERIVED_TO_BACKEND under-represent that the probe text can be sent to external
model providers (OpenAI, Cohere, custom endpoints); change the privacy
declaration for this capability (the privacy field next to status and the
CapabilityStatus::Beta line) to include external/model destinations in addition
to DERIVED_TO_BACKEND—e.g., add the existing DERIVED_TO_MODEL or
DERIVED_TO_THIRD_PARTY flag (or define and add a new DERIVED_TO_EXTERNAL
destination in the privacy enum if missing) so the catalog accurately records
that the probe may leave the backend to third‑party model endpoints.
ℹ️ Review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Run ID: b8b2f868-d12b-482e-b4bc-0b3806c1b041

📥 Commits

Reviewing files that changed from the base of the PR and between e05cab9 and 7eb7da4.

📒 Files selected for processing (2)
  • src/openhuman/about_app/catalog.rs
  • src/openhuman/about_app/catalog_tests.rs

Comment on lines +332 to +339
// Test payload is a short fixed string ('OpenHuman connectivity \
// probe'-style) sent to whichever provider is selected — Voyage via \
// the OpenHuman backend, OpenAI, Cohere, or a custom endpoint. \
// `DERIVED_TO_BACKEND` is the right label for the default (managed \
// cloud) path; the destination list reflects that this is *derived* \
// signal (the probe text), not raw user content.
status: CapabilityStatus::Beta,
privacy: DERIVED_TO_BACKEND,
⚠️ Potential issue | 🟠 Major | ⚡ Quick win

Privacy destinations are too narrow for multi-provider test flow.

The comment at lines 332-335 says the probe may go to OpenAI, Cohere, or a custom endpoint, but line 339 uses `DERIVED_TO_BACKEND`, whose destinations cover only the backend/Neocortex. The privacy catalog therefore under-reports the possible off-device destinations.

Suggested fix
+const DERIVED_TO_CONFIGURED_EMBEDDING_PROVIDER: Option<CapabilityPrivacy> = Some(CapabilityPrivacy {
+    leaves_device: true,
+    data_kind: PrivacyDataKind::Derived,
+    destinations: &[
+        "OpenHuman backend (managed cloud)",
+        "Configured embedding provider endpoint (e.g., OpenAI/Cohere/custom)",
+    ],
+});
...
-        privacy: DERIVED_TO_BACKEND,
+        privacy: DERIVED_TO_CONFIGURED_EMBEDDING_PROVIDER,
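
As a self-contained sketch of the suggestion, the new constant would sit at module scope next to the existing label, since Rust does not allow a `const` item inside a struct expression. The type definitions, the `DERIVED_TO_BACKEND` destination string, and the `newly_reported` helper below are assumptions for illustration; only the new constant's fields are taken from the diff above.

```rust
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum PrivacyDataKind {
    Derived,
}

// Assumed shape, inferred from the fields used in the suggested diff.
#[derive(Debug, Clone, Copy)]
pub struct CapabilityPrivacy {
    pub leaves_device: bool,
    pub data_kind: PrivacyDataKind,
    pub destinations: &'static [&'static str],
}

// Existing label; its destination list here is an assumption.
pub const DERIVED_TO_BACKEND: Option<CapabilityPrivacy> = Some(CapabilityPrivacy {
    leaves_device: true,
    data_kind: PrivacyDataKind::Derived,
    destinations: &["OpenHuman backend (managed cloud)"],
});

// Suggested label: also names the user-configured provider endpoint.
pub const DERIVED_TO_CONFIGURED_EMBEDDING_PROVIDER: Option<CapabilityPrivacy> =
    Some(CapabilityPrivacy {
        leaves_device: true,
        data_kind: PrivacyDataKind::Derived,
        destinations: &[
            "OpenHuman backend (managed cloud)",
            "Configured embedding provider endpoint (e.g., OpenAI/Cohere/custom)",
        ],
    });

/// Returns the destination list of a label, or an empty list if unset.
fn destinations(privacy: Option<CapabilityPrivacy>) -> &'static [&'static str] {
    match privacy {
        Some(p) => p.destinations,
        None => &[],
    }
}

/// Hypothetical helper: destinations present in `after` but not in `before`.
pub fn newly_reported(
    before: Option<CapabilityPrivacy>,
    after: Option<CapabilityPrivacy>,
) -> Vec<&'static str> {
    let old = destinations(before);
    destinations(after)
        .iter()
        .copied()
        .filter(|d| !old.contains(d))
        .collect()
}
```

Comparing the two labels with `newly_reported` shows the single destination the catalog would start reporting if the capability switched to the new constant.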