Skip to content

Add generated tool provenance and admission checks#2549

Open
vaddisrinivas wants to merge 6 commits into
tinyhumansai:mainfrom
vaddisrinivas:codex/oh-2542-generated-tool-admission
Open

Add generated tool provenance and admission checks#2549
vaddisrinivas wants to merge 6 commits into
tinyhumansai:mainfrom
vaddisrinivas:codex/oh-2542-generated-tool-admission

Conversation

@vaddisrinivas
Copy link
Copy Markdown
Contributor

@vaddisrinivas vaddisrinivas commented May 23, 2026

Summary

  • Extend generated tool definitions with provider, capability, digest, risk, and policy-surface metadata.
  • Add an admission helper that can enforce trusted providers, disabled providers/capabilities, source digests, safe names, and duplicate-name checks.
  • Keep legacy generated-tool behavior unchanged when provenance enforcement is disabled.

Problem

  • Generated tools can be wrapped generically, but callers do not yet have a central way to reject untrusted, duplicate, or missing-provenance definitions before registration.
  • Build/admission-time enforcement needs structured metadata that is still provider-agnostic.

Solution

  • Add GeneratedToolAdmissionConfig, GeneratedToolAdmissionReport, and admit_generated_tool_definitions.
  • Normalize generated-tool provenance fields before admission and wrapper construction.
  • Mark generated tools with external side effects when their declared risk requires it.

Submission Checklist

  • Tests added or updated (happy path + at least one failure / edge case) per Testing Strategy: trusted admission, disabled legacy path, untrusted provider, duplicate names, missing risk, unsafe names, and external-effect risk.
  • Diff coverage ≥ 80% — local full coverage not run; focused Rust tests cover the changed module and CI coverage gate will enforce final diff coverage.
  • Coverage matrix updated — N/A: generic generated-tool infrastructure, no existing feature row added/removed/renamed.
  • All affected feature IDs from the matrix are listed in the PR description under ## Related: N/A, no coverage-matrix feature ID applies.
  • No new external network dependencies introduced (mock backend used per Testing Strategy): no network dependency added.
  • Manual smoke checklist updated if this touches release-cut surfaces (docs/RELEASE-MANUAL-SMOKE.md): N/A, no release-cut UI/runtime smoke surface changed.
  • Linked issue closed via Closes #NNN in the ## Related section.

Impact

  • Runtime/platform impact: generated-tool callers can opt into stricter admission without changing default registration behavior.
  • Security impact: enables provenance and risk checks before generated tools become executable Tool objects.
  • Compatibility: existing GeneratedToolDefinition::new callers continue to work with metadata fields defaulted to None.

Related


AI Authored PR Metadata (required for Codex/Linear PRs)

Linear Issue

Commit & Branch

  • Branch: codex/oh-2542-generated-tool-admission
  • Commit SHA: 46631365d44929e2eef03935020cdcc5690abca6

Validation Run

  • pnpm --filter openhuman-app format:check: N/A, Rust-only backend tool-infrastructure change.
  • pnpm typecheck: N/A, Rust-only backend tool-infrastructure change.
  • Focused tests: cargo test --manifest-path Cargo.toml generated_tool --lib; cargo test --manifest-path Cargo.toml admission_ --lib
  • Rust fmt/check (if changed): cargo fmt --manifest-path Cargo.toml --all --check; git diff --check
  • Tauri fmt/check (if changed): N/A, no Tauri code changed.
  • PR checklist: node scripts/check-pr-checklist.mjs /tmp/openhuman-pr-body-2542.md

Validation Blocked

  • command: N/A
  • error: N/A
  • impact: N/A

Behavior Changes

  • Intended behavior change: callers can opt into generated-tool admission enforcement.
  • User-visible effect: None by default; provenance enforcement is disabled unless a caller enables it.

Parity Contract

  • Legacy behavior preserved: yes, admission with default config admits existing generated tools.
  • Guard/fallback/dispatch parity checks: adapter mismatch and existing generated-tool validation continue to run before wrapping.

Duplicate / Superseded PR Handling

  • Duplicate PR(s): None found for vaddisrinivas:codex/oh-2542-generated-tool-admission.
  • Canonical PR: this PR.
  • Resolution (closed/superseded/updated): N/A.

Summary by CodeRabbit

  • New Features

    • Metadata for generated tools (provenance, capability, policy, risk) and risk classification that identifies external-effectful tools.
    • Admission pipeline to normalize, validate, and admit/reject tool definitions with configurable enforcement, deduplication, provider/capability allow/deny checks, and explicit rejection reasons.
  • Tests

    • Unit tests covering admission behavior, enforcement toggles, rejection scenarios, and external-effect detection.

Review Change Stack

@vaddisrinivas vaddisrinivas requested a review from a team May 23, 2026 15:46
@coderabbitai
Copy link
Copy Markdown
Contributor

coderabbitai Bot commented May 23, 2026

📝 Walkthrough

Walkthrough

Adds optional provenance/capability/risk/policy fields to generated tool definitions, a GeneratedToolRisk enum, an admission pipeline that normalizes and validates definitions against configurable policies, Tool::external_effect() mapping from risk, and unit tests covering admission outcomes and risk behavior.

Changes

Generated Tool Provenance and Admission

Layer / File(s) Summary
Data model: provenance & risk metadata
src/openhuman/tools/generated.rs
GeneratedToolDefinition gains optional provider_id, capability_id, source_digest, policy_surface, and risk: Option<GeneratedToolRisk>. GeneratedToolRisk enum and public admission structs (GeneratedToolAdmissionConfig, GeneratedToolAdmissionRejection, GeneratedToolAdmissionReport) are added.
Admission entry: admit_generated_tool_definitions
src/openhuman/tools/generated.rs
Adds admit_generated_tool_definitions(...) which normalizes incoming definitions (trims required and optional fields) and returns a GeneratedToolAdmissionReport with admitted and rejected lists.
Validation and helpers: validate_admission & utilities
src/openhuman/tools/generated.rs
validate_admission(...) enforces schema checks, safe-name rules, deduplication vs existing names, provider allow/deny lists, capability deny list, and requires risk+source_digest when enforcement is enabled; includes provider-id normalization and name-safety helpers.
Tool trait: external_effect implementation
src/openhuman/tools/generated.rs
impl Tool for GeneratedTool now implements external_effect() by treating certain GeneratedToolRisk variants as external-effectful.
Tests and sample data
src/openhuman/tools/generated.rs
sample_definition() updated with new metadata; adds admission_config() helper and unit tests for trusted acceptance, legacy acceptance when enforcement disabled, invalid/untrusted provider rejection, duplicate-name rejection, enforcement-required-field rejections, unsafe-name rejection, and external_effect() checks for ExternalWrite and Execute.

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~45 minutes

Possibly related PRs

  • tinyhumansai/openhuman#2333: Related changes to src/openhuman/tools/generated.rs around generated-tool normalization and adapter-id validation; overlap in generated tool handling.

Suggested reviewers

  • graycyrus

Poem

🐰 I trimmed the names and checked each trait,
I marked the risks that wander out the gate.
Providers stamped and digests neat,
Admissions sort the safe from what’s not sweet.
Hooray — the tool garden hops, a little safer on its feet!

🚥 Pre-merge checks | ✅ 4 | ❌ 1

❌ Failed checks (1 warning)

Check name Status Explanation Resolution
Docstring Coverage ⚠️ Warning Docstring coverage is 54.84% which is insufficient. The required threshold is 80.00%. Write docstrings for the functions missing them to satisfy the coverage threshold.
✅ Passed checks (4 passed)
Check name Status Explanation
Description Check ✅ Passed Check skipped - CodeRabbit’s high-level summary is enabled.
Title check ✅ Passed The PR title accurately and concisely describes the main change: adding provenance metadata and admission validation for generated tools.
Linked Issues check ✅ Passed The PR implements all coding requirements from #2542: provenance metadata fields, admission validation logic, diagnostics reporting, and test coverage for allowed/rejected cases.
Out of Scope Changes check ✅ Passed All changes are directly related to #2542 objectives. The external_effect() trait implementation for risk classification is within scope as it supports admission enforcement.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.


Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

Comment @coderabbitai help to get the list of available commands and usage tips.

@coderabbitai coderabbitai Bot added the working A PR that is being worked on by the team. label May 23, 2026
Copy link
Copy Markdown
Contributor

@coderabbitai coderabbitai Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

🧹 Nitpick comments (1)
src/openhuman/tools/generated.rs (1)

206-212: ⚡ Quick win

Add structured debug logs for admission decisions.

This branch decides whether a generated tool enters the registry, but accepted/rejected outcomes are silent today. A single structured debug! on each arm would make policy failures much easier to diagnose in production.

🪵 Suggested logging shape
         match validate_admission(&definition, config, &mut seen) {
-            Ok(()) => admitted.push(definition),
-            Err(reason) => rejected.push(GeneratedToolAdmissionRejection { tool_name, reason }),
+            Ok(()) => {
+                log::debug!(
+                    "[generated_tools] admission accepted tool_name={} provider_id={:?} capability_id={:?}",
+                    definition.name,
+                    definition.provider_id,
+                    definition.capability_id
+                );
+                admitted.push(definition);
+            }
+            Err(reason) => {
+                log::debug!(
+                    "[generated_tools] admission rejected tool_name={} provider_id={:?} capability_id={:?} reason={}",
+                    tool_name,
+                    definition.provider_id,
+                    definition.capability_id,
+                    reason
+                );
+                rejected.push(GeneratedToolAdmissionRejection { tool_name, reason });
+            }
         }

As per coding guidelines, "Use log / tracing at debug or trace level on RPC entry and exit, error paths, state transitions, and any branch that is hard to infer from tests alone."

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@src/openhuman/tools/generated.rs` around lines 206 - 212, Add structured
debug logging for each admission decision in the loop that iterates over
definitions: after computing tool_name and calling
validate_admission(&definition, config, &mut seen), emit a debug-level
structured log in both arms — for Ok(()) log that the tool was admitted with
tool_name and any relevant fields (e.g., definition.id/name/version or
normalized state via normalize_definition), and for Err(reason) log that the
tool was rejected with tool_name and the reason and also create the
GeneratedToolAdmissionRejection as before; place these logs adjacent to the
match arms that push into admitted and rejected so the state transitions
(admitted/rejected vectors) are observable.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@src/openhuman/tools/generated.rs`:
- Around line 66-69: The is_external_effect impl for GeneratedToolRisk currently
only matches ExternalWrite and Dangerous, so the Execute variant is incorrectly
considered non-external; update the matches! in
GeneratedToolRisk::is_external_effect to include Self::Execute (and mirror the
same change in the other identical implementation around the 177-181 area) so
Execute returns true, and add a small regression test next to
generated_tool_marks_external_risk_as_external_effect asserting that Execute is
treated as external.

---

Nitpick comments:
In `@src/openhuman/tools/generated.rs`:
- Around line 206-212: Add structured debug logging for each admission decision
in the loop that iterates over definitions: after computing tool_name and
calling validate_admission(&definition, config, &mut seen), emit a debug-level
structured log in both arms — for Ok(()) log that the tool was admitted with
tool_name and any relevant fields (e.g., definition.id/name/version or
normalized state via normalize_definition), and for Err(reason) log that the
tool was rejected with tool_name and the reason and also create the
GeneratedToolAdmissionRejection as before; place these logs adjacent to the
match arms that push into admitted and rejected so the state transitions
(admitted/rejected vectors) are observable.
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Run ID: b6b4243d-8336-441a-ab29-f7de4542517d

📥 Commits

Reviewing files that changed from the base of the PR and between f8c9698 and 6226ed4.

📒 Files selected for processing (1)
  • src/openhuman/tools/generated.rs

Comment thread src/openhuman/tools/generated.rs
coderabbitai[bot]
coderabbitai Bot previously approved these changes May 24, 2026
@YOMXXX
Copy link
Copy Markdown
Contributor

YOMXXX commented May 24, 2026

Review note on provider/provenance coupling: this PR trims generated-tool provider_id, but admission compares it directly against raw trusted_providers / disabled_providers sets. That is fine as a local helper, but it means provider-id normalization and duplicate-provider semantics can drift from #2541/#2548/#2551.\n\nI would avoid merging this independently of the provider registry decision. Once #2541 has a canonical registry, admission should consume normalized provider records from that registry rather than maintaining a parallel raw BTreeSet<String> trust model. That keeps enforcement behavior stable for case/whitespace/separator edge cases and reduces reviewer surface in #2542/#2543.

Copy link
Copy Markdown
Contributor

@coderabbitai coderabbitai Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🧹 Nitpick comments (2)
src/openhuman/tools/generated.rs (2)

330-341: 💤 Low value

Silent filtering of invalid provider IDs in config sets may cause confusion.

normalize_provider_set uses filter_map which silently drops entries that fail normalization. If trusted_providers contains only invalid entries (e.g., ["bad/provider"]), the normalized set becomes empty and all providers will be rejected as "not trusted" — without any indication that the config itself was malformed.

Consider logging a warning during admission when config entries are dropped, or validate the config at construction time. Given the planned provider registry in #2541, this is minor, but a debug log would help diagnose misconfigurations.

 fn normalize_provider_set(values: &BTreeSet<String>) -> BTreeSet<String> {
     values
         .iter()
-        .filter_map(|value| normalize_provider_id(value))
+        .filter_map(|value| {
+            let normalized = normalize_provider_id(value);
+            if normalized.is_none() {
+                log::debug!(
+                    "[generated_tools] invalid provider_id in config set dropped value={value}"
+                );
+            }
+            normalized
+        })
         .collect()
 }
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@src/openhuman/tools/generated.rs` around lines 330 - 341,
normalize_provider_set silently drops entries that fail normalization which can
make config errors (e.g., trusted_providers = ["bad/provider"]) invisible and
cause unexpected "not trusted" rejections; update the admission logic around
normalize_provider_set (used for config.trusted_providers and
config.disabled_providers when checking provider_id in generated.rs) to detect
and report dropped/invalid entries: compute the set of invalid inputs (or
compare input length vs normalized length) and emit a warning/debug log naming
the affected config field (trusted_providers/disabled_providers) and the invalid
entries, or alternatively perform this validation at config construction time so
malformed provider IDs are surfaced earlier.

424-665: 💤 Low value

Consider extracting tests to a sibling generated_test.rs file.

The file is 666 lines, exceeding the ~500 line guideline. The test module (~240 lines) can be moved to a sibling file per the coding guidelines for Rust test organization.

// At end of generated.rs, replace the inline test module with:
#[cfg(test)]
#[path = "generated_test.rs"]
mod tests;

This keeps the implementation focused and reduces cognitive load when reviewing production code.

As per coding guidelines: "When extracting Rust tests out of an implementation file, prefer a sibling *_test.rs file wired in with #[cfg(test)] #[path = "..._test.rs"] mod tests;" and "File size should not exceed approximately 500 lines."

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@src/openhuman/tools/generated.rs` around lines 424 - 665, The inline
#[cfg(test)] mod tests { ... } in generated.rs (~240 lines) should be moved into
a sibling file named generated_test.rs containing the entire test module body
(keeping imports like use super::* and the EchoAdapter, sample_definition,
admission_config, and all tests unchanged); then replace the removed inline
module in generated.rs with a single module declaration wired to the sibling
test file: #[cfg(test)] #[path = "generated_test.rs"] mod tests; and confirm all
symbols referenced (GeneratedTool, GeneratedToolDefinition,
admit_generated_tool_definitions, etc.) remain accessible (adjust pub/use
visibility if the compiler complains).
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Nitpick comments:
In `@src/openhuman/tools/generated.rs`:
- Around line 330-341: normalize_provider_set silently drops entries that fail
normalization which can make config errors (e.g., trusted_providers =
["bad/provider"]) invisible and cause unexpected "not trusted" rejections;
update the admission logic around normalize_provider_set (used for
config.trusted_providers and config.disabled_providers when checking provider_id
in generated.rs) to detect and report dropped/invalid entries: compute the set
of invalid inputs (or compare input length vs normalized length) and emit a
warning/debug log naming the affected config field
(trusted_providers/disabled_providers) and the invalid entries, or alternatively
perform this validation at config construction time so malformed provider IDs
are surfaced earlier.
- Around line 424-665: The inline #[cfg(test)] mod tests { ... } in generated.rs
(~240 lines) should be moved into a sibling file named generated_test.rs
containing the entire test module body (keeping imports like use super::* and
the EchoAdapter, sample_definition, admission_config, and all tests unchanged);
then replace the removed inline module in generated.rs with a single module
declaration wired to the sibling test file: #[cfg(test)] #[path =
"generated_test.rs"] mod tests; and confirm all symbols referenced
(GeneratedTool, GeneratedToolDefinition, admit_generated_tool_definitions, etc.)
remain accessible (adjust pub/use visibility if the compiler complains).

ℹ️ Review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Run ID: 44ec2ef0-670f-4d01-ae27-bbeba94f4f13

📥 Commits

Reviewing files that changed from the base of the PR and between 21a7bf3 and 4663136.

📒 Files selected for processing (1)
  • src/openhuman/tools/generated.rs

coderabbitai[bot]
coderabbitai Bot previously approved these changes May 24, 2026
@senamakel
Copy link
Copy Markdown
Member

Okay getting more eyes on this @graycyrus can you review this?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

working A PR that is being worked on by the team.

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Add generated tool provenance and admission checks

4 participants