Add generated tool provenance and admission checks#2549
Conversation
📝 WalkthroughWalkthroughAdds optional provenance/capability/risk/policy fields to generated tool definitions, a GeneratedToolRisk enum, an admission pipeline that normalizes and validates definitions against configurable policies, Tool::external_effect() mapping from risk, and unit tests covering admission outcomes and risk behavior. ChangesGenerated Tool Provenance and Admission
Estimated code review effort🎯 4 (Complex) | ⏱️ ~45 minutes Possibly related PRs
Suggested reviewers
Poem
🚥 Pre-merge checks | ✅ 4 | ❌ 1❌ Failed checks (1 warning)
✅ Passed checks (4 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
There was a problem hiding this comment.
Actionable comments posted: 1
🧹 Nitpick comments (1)
src/openhuman/tools/generated.rs (1)
206-212: ⚡ Quick winAdd structured debug logs for admission decisions.
This branch decides whether a generated tool enters the registry, but accepted/rejected outcomes are silent today. A single structured
debug!on each arm would make policy failures much easier to diagnose in production.🪵 Suggested logging shape
match validate_admission(&definition, config, &mut seen) { - Ok(()) => admitted.push(definition), - Err(reason) => rejected.push(GeneratedToolAdmissionRejection { tool_name, reason }), + Ok(()) => { + log::debug!( + "[generated_tools] admission accepted tool_name={} provider_id={:?} capability_id={:?}", + definition.name, + definition.provider_id, + definition.capability_id + ); + admitted.push(definition); + } + Err(reason) => { + log::debug!( + "[generated_tools] admission rejected tool_name={} provider_id={:?} capability_id={:?} reason={}", + tool_name, + definition.provider_id, + definition.capability_id, + reason + ); + rejected.push(GeneratedToolAdmissionRejection { tool_name, reason }); + } }As per coding guidelines, "Use log / tracing at debug or trace level on RPC entry and exit, error paths, state transitions, and any branch that is hard to infer from tests alone."
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@src/openhuman/tools/generated.rs` around lines 206 - 212, Add structured debug logging for each admission decision in the loop that iterates over definitions: after computing tool_name and calling validate_admission(&definition, config, &mut seen), emit a debug-level structured log in both arms — for Ok(()) log that the tool was admitted with tool_name and any relevant fields (e.g., definition.id/name/version or normalized state via normalize_definition), and for Err(reason) log that the tool was rejected with tool_name and the reason and also create the GeneratedToolAdmissionRejection as before; place these logs adjacent to the match arms that push into admitted and rejected so the state transitions (admitted/rejected vectors) are observable.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Inline comments:
In `@src/openhuman/tools/generated.rs`:
- Around line 66-69: The is_external_effect impl for GeneratedToolRisk currently
only matches ExternalWrite and Dangerous, so the Execute variant is incorrectly
considered non-external; update the matches! in
GeneratedToolRisk::is_external_effect to include Self::Execute (and mirror the
same change in the other identical implementation around the 177-181 area) so
Execute returns true, and add a small regression test next to
generated_tool_marks_external_risk_as_external_effect asserting that Execute is
treated as external.
---
Nitpick comments:
In `@src/openhuman/tools/generated.rs`:
- Around line 206-212: Add structured debug logging for each admission decision
in the loop that iterates over definitions: after computing tool_name and
calling validate_admission(&definition, config, &mut seen), emit a debug-level
structured log in both arms — for Ok(()) log that the tool was admitted with
tool_name and any relevant fields (e.g., definition.id/name/version or
normalized state via normalize_definition), and for Err(reason) log that the
tool was rejected with tool_name and the reason and also create the
GeneratedToolAdmissionRejection as before; place these logs adjacent to the
match arms that push into admitted and rejected so the state transitions
(admitted/rejected vectors) are observable.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Organization UI
Review profile: CHILL
Plan: Pro
Run ID: b6b4243d-8336-441a-ab29-f7de4542517d
📒 Files selected for processing (1)
src/openhuman/tools/generated.rs
|
Review note on provider/provenance coupling: this PR trims generated-tool |
There was a problem hiding this comment.
🧹 Nitpick comments (2)
src/openhuman/tools/generated.rs (2)
330-341: 💤 Low valueSilent filtering of invalid provider IDs in config sets may cause confusion.
normalize_provider_setusesfilter_mapwhich silently drops entries that fail normalization. Iftrusted_providerscontains only invalid entries (e.g.,["bad/provider"]), the normalized set becomes empty and all providers will be rejected as "not trusted" — without any indication that the config itself was malformed.Consider logging a warning during admission when config entries are dropped, or validate the config at construction time. Given the planned provider registry in
#2541, this is minor, but a debug log would help diagnose misconfigurations.fn normalize_provider_set(values: &BTreeSet<String>) -> BTreeSet<String> { values .iter() - .filter_map(|value| normalize_provider_id(value)) + .filter_map(|value| { + let normalized = normalize_provider_id(value); + if normalized.is_none() { + log::debug!( + "[generated_tools] invalid provider_id in config set dropped value={value}" + ); + } + normalized + }) .collect() }🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@src/openhuman/tools/generated.rs` around lines 330 - 341, normalize_provider_set silently drops entries that fail normalization which can make config errors (e.g., trusted_providers = ["bad/provider"]) invisible and cause unexpected "not trusted" rejections; update the admission logic around normalize_provider_set (used for config.trusted_providers and config.disabled_providers when checking provider_id in generated.rs) to detect and report dropped/invalid entries: compute the set of invalid inputs (or compare input length vs normalized length) and emit a warning/debug log naming the affected config field (trusted_providers/disabled_providers) and the invalid entries, or alternatively perform this validation at config construction time so malformed provider IDs are surfaced earlier.
424-665: 💤 Low valueConsider extracting tests to a sibling
generated_test.rsfile.The file is 666 lines, exceeding the ~500 line guideline. The test module (~240 lines) can be moved to a sibling file per the coding guidelines for Rust test organization.
// At end of generated.rs, replace the inline test module with: #[cfg(test)] #[path = "generated_test.rs"] mod tests;This keeps the implementation focused and reduces cognitive load when reviewing production code.
As per coding guidelines: "When extracting Rust tests out of an implementation file, prefer a sibling
*_test.rsfile wired in with#[cfg(test)] #[path = "..._test.rs"] mod tests;" and "File size should not exceed approximately 500 lines."🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@src/openhuman/tools/generated.rs` around lines 424 - 665, The inline #[cfg(test)] mod tests { ... } in generated.rs (~240 lines) should be moved into a sibling file named generated_test.rs containing the entire test module body (keeping imports like use super::* and the EchoAdapter, sample_definition, admission_config, and all tests unchanged); then replace the removed inline module in generated.rs with a single module declaration wired to the sibling test file: #[cfg(test)] #[path = "generated_test.rs"] mod tests; and confirm all symbols referenced (GeneratedTool, GeneratedToolDefinition, admit_generated_tool_definitions, etc.) remain accessible (adjust pub/use visibility if the compiler complains).
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Nitpick comments:
In `@src/openhuman/tools/generated.rs`:
- Around line 330-341: normalize_provider_set silently drops entries that fail
normalization which can make config errors (e.g., trusted_providers =
["bad/provider"]) invisible and cause unexpected "not trusted" rejections;
update the admission logic around normalize_provider_set (used for
config.trusted_providers and config.disabled_providers when checking provider_id
in generated.rs) to detect and report dropped/invalid entries: compute the set
of invalid inputs (or compare input length vs normalized length) and emit a
warning/debug log naming the affected config field
(trusted_providers/disabled_providers) and the invalid entries, or alternatively
perform this validation at config construction time so malformed provider IDs
are surfaced earlier.
- Around line 424-665: The inline #[cfg(test)] mod tests { ... } in generated.rs
(~240 lines) should be moved into a sibling file named generated_test.rs
containing the entire test module body (keeping imports like use super::* and
the EchoAdapter, sample_definition, admission_config, and all tests unchanged);
then replace the removed inline module in generated.rs with a single module
declaration wired to the sibling test file: #[cfg(test)] #[path =
"generated_test.rs"] mod tests; and confirm all symbols referenced
(GeneratedTool, GeneratedToolDefinition, admit_generated_tool_definitions, etc.)
remain accessible (adjust pub/use visibility if the compiler complains).
ℹ️ Review info
⚙️ Run configuration
Configuration used: Organization UI
Review profile: CHILL
Plan: Pro
Run ID: 44ec2ef0-670f-4d01-ae27-bbeba94f4f13
📒 Files selected for processing (1)
src/openhuman/tools/generated.rs
|
Okay getting more eyes on this @graycyrus can you review this? |
Summary
Problem
Solution
GeneratedToolAdmissionConfig,GeneratedToolAdmissionReport, andadmit_generated_tool_definitions.Submission Checklist
## Related: N/A, no coverage-matrix feature ID applies.docs/RELEASE-MANUAL-SMOKE.md): N/A, no release-cut UI/runtime smoke surface changed.Closes #NNNin the## Relatedsection.Impact
Toolobjects.GeneratedToolDefinition::newcallers continue to work with metadata fields defaulted toNone.Related
AI Authored PR Metadata (required for Codex/Linear PRs)
Linear Issue
Commit & Branch
codex/oh-2542-generated-tool-admission46631365d44929e2eef03935020cdcc5690abca6Validation Run
pnpm --filter openhuman-app format:check: N/A, Rust-only backend tool-infrastructure change.pnpm typecheck: N/A, Rust-only backend tool-infrastructure change.cargo test --manifest-path Cargo.toml generated_tool --lib;cargo test --manifest-path Cargo.toml admission_ --libcargo fmt --manifest-path Cargo.toml --all --check;git diff --checknode scripts/check-pr-checklist.mjs /tmp/openhuman-pr-body-2542.mdValidation Blocked
command:N/Aerror:N/Aimpact:N/ABehavior Changes
Parity Contract
Duplicate / Superseded PR Handling
vaddisrinivas:codex/oh-2542-generated-tool-admission.Summary by CodeRabbit
New Features
Tests