test(T14-T15): governance extraction and audit engine tests by ogkranthi · Pull Request #18 · ogkranthi/agentshift

ogkranthi · 2026-03-29T02:19:18Z

Summary

Closes T14 and T15 (Week 7 governance framework tests).

T14 — Governance extraction tests (`tests/test_governance_extraction.py`, 68 tests)

GuardrailRule classification: all 7 categories (safety/privacy/compliance/ethical/operational/scope/general), all 4 severity levels, defaults, invalid values rejected
ToolPermission field population: access levels (full/read-only/disabled), deny_patterns, allow_patterns, rate_limit, max_value, notes, enabled flag
PlatformAnnotation (L3): all 4 kinds (content_filter/pii_detection/denied_topics/grounding_check), all 4 platform targets, config dict nesting, storage on Governance/AgentIR
Edge cases: empty governance, missing required fields, extra fields forbidden (extra='forbid'), per-instance default factories, all three layers combined
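For readers without the model source at hand, the following is a minimal stdlib sketch of the behaviours the T14 tests exercise: invalid values rejected, unknown fields rejected, and per-instance default factories. It is not the project's IR model. The summary's `extra='forbid'` suggests Pydantic-style models, whereas this sketch uses plain dataclasses, and the field names `text` and `tool` and both default values are assumptions made for illustration.

```python
from dataclasses import dataclass, field

# Category and access-level names are taken from the test summary above.
CATEGORIES = frozenset(
    {"safety", "privacy", "compliance", "ethical", "operational", "scope", "general"}
)
ACCESS_LEVELS = frozenset({"full", "read-only", "disabled"})


@dataclass
class GuardrailRule:
    text: str  # hypothetical field name
    category: str = "general"  # assumed default; the real default is not stated

    def __post_init__(self) -> None:
        # Reject anything outside the seven known categories.
        if self.category not in CATEGORIES:
            raise ValueError(f"unknown category: {self.category!r}")


@dataclass
class ToolPermission:
    tool: str  # hypothetical field name
    access: str = "full"  # assumed default
    # default_factory gives every instance its own list, so mutating one
    # instance's patterns never leaks into another instance.
    deny_patterns: list[str] = field(default_factory=list)
    allow_patterns: list[str] = field(default_factory=list)

    def __post_init__(self) -> None:
        if self.access not in ACCESS_LEVELS:
            raise ValueError(f"unknown access level: {self.access!r}")
```

A dataclass constructor raises `TypeError` on an unexpected keyword argument, which plays roughly the role that `extra='forbid'` plays in the real models.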

T15 — Audit engine tests (`tests/test_governance_audit.py`, 73 tests)

GPR-L1: always 1.0 (L1 always preserved), zero-guardrail default = 1.0, platform-invariant
GPR-L2: native-only rate; elevated ≠ preserved; platform matrix (copilot→0, bedrock→preserves disabled_tool, rate_limit always elevated)
GPR-L3: zero default (contrast with L1/L2); bedrock full support=1.0; vertex partial (cf+pii only); claude-code none
GPR-Overall: weighted by artifact count (l1+l2+l3), platform comparison, formula verified
CFS: all four checks, formula ratio = sum/4, range [0,1]
Elevation tracking: disabled tools, deny patterns, allow patterns, rate limits, L3 annotations → L1 instructions; artifact key presence; elevated_instruction non-empty; claude-code deny passthrough (no re-elevation)
CSV export: exact 16-column header match, parent dir creation, 4-decimal float formatting, empty list → header-only
JSON export: top-level keys, l1/l2/l3 subkey structure, elevated_artifacts list + keys, empty list → []
audit_batch: agents×targets Cartesian product, empty agents, empty targets
Edge cases: zero tools, all-denied copilot, perfect bedrock, agent_id defaults to ir.name
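The scoring rules listed above can be sketched as follows. This is an illustration under stated assumptions, not the `governance_audit` implementation: `LayerCounts`, `gpr`, `gpr_overall`, `cfs`, and `batch_pairs` are hypothetical names, the per-layer default for an empty layer is passed in explicitly, and the value returned for an agent with no artifacts at all is a guess.

```python
from dataclasses import dataclass
from itertools import product


@dataclass
class LayerCounts:
    total: int      # artifacts declared in this layer
    preserved: int  # natively preserved on the target; elevated ones do not count


def gpr(layer: LayerCounts, default: float) -> float:
    """Per-layer preservation rate, falling back to `default` for an empty layer."""
    if layer.total == 0:
        return default
    return layer.preserved / layer.total


def gpr_overall(l1: LayerCounts, l2: LayerCounts, l3: LayerCounts) -> float:
    """Overall rate; equivalent to weighting each layer's rate by its artifact count."""
    total = l1.total + l2.total + l3.total
    if total == 0:
        return 1.0  # assumption: nothing declared means nothing lost
    return (l1.preserved + l2.preserved + l3.preserved) / total


# Check names as listed in the commit message for T15.
CFS_CHECKS = ("identity", "tools", "memory", "schema")


def cfs(checks: dict[str, bool]) -> float:
    """Ratio of passed checks: sum / 4, always within [0, 1]."""
    return sum(bool(checks[name]) for name in CFS_CHECKS) / len(CFS_CHECKS)


def batch_pairs(agents: list[str], targets: list[str]) -> list[tuple[str, str]]:
    """agents x targets Cartesian product; empty if either input is empty."""
    return list(product(agents, targets))


l1 = LayerCounts(total=4, preserved=4)  # L1 is always preserved
l2 = LayerCounts(total=4, preserved=1)  # three of four tool rules were elevated
l3 = LayerCounts(total=2, preserved=0)  # target supports no L3 annotations
print(f"{gpr_overall(l1, l2, l3):.4f}")  # prints 0.5000
```

The numbers in the usage lines are invented to show the arithmetic: (4 + 1 + 0) / (4 + 4 + 2) = 0.5, formatted to four decimals as in the CSV export.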

Results

782 passed in 44.17s

All 141 new tests pass. All 641 pre-existing tests continue to pass.

… experiments) + clean up duplicate files

- Add Governance, Guardrail, ToolPermission, PlatformAnnotation to IR model
- Add governance extraction to OpenClaw parser (SOUL.md + tool permissions + L3 annotations)
- Add elevation engine (elevate_governance) for L2/L3 → L1 promotion
- Add governance_audit module with GPR/CFS scoring, CSV/JSON export, Rich tables
- Add `agentshift audit` and `agentshift audit-batch` CLI commands
- Integrate elevation into claude_code + copilot emitters
- Add experiments/ directory with 12 domain agents for research paper
- Remove duplicate sections 2.py and persona-sections-schema 2.md
- Mark T13 as merged in BACKLOG.md

T14 — test_governance_extraction.py (68 tests):

- GuardrailRule category/severity classification (all 7 categories, 4 severities)
- ToolPermission field population: access, deny/allow patterns, rate_limit, max_value
- PlatformAnnotation L3 parsing: all 4 kinds, all 4 platform targets, config dict
- Edge cases: empty governance, missing required fields, extra fields rejected, defaults verified, combined layers

T15 — test_governance_audit.py (73 tests):

- GPR-L1 formula: always 1.0 (L1 always preserved), zero-guardrail default
- GPR-L2 formula: native-only rate, elevated != preserved, platform matrix verified
- GPR-L3 formula: zero default, bedrock full support, vertex partial, claude-code none
- GPR-Overall: weighted by artifact count, cross-platform comparison
- CFS: identity/tools/memory/schema checks, formula ratio
- Elevation tracking: disabled tools, deny patterns, allow patterns, rate limits, L3 annotations; artifact keys, non-empty instructions, claude-code deny passthrough
- CSV export: header exact match, parent dir creation, 4-decimal formatting, empty list
- JSON export: structure, l1/l2/l3 subkeys, elevated_artifacts keys, empty list
- audit_batch: agents×targets, empty cases
- Edge cases: zero tools, all-denied copilot, perfect bedrock scores, agent_id defaults

cloudflare-workers-and-pages · 2026-03-29T02:19:20Z

Deploying agentshift with Cloudflare Pages

Latest commit:	`d47abfc`
Status:	✅ Deploy successful!
Preview URL:	https://00a4fbc5.agentshift.pages.dev
Branch Preview URL:	https://agent-tester-t14-t15.agentshift.pages.dev

…om bedrock/vertex, v0.3.0

D22 — AWS Bedrock parser (src/agentshift/parsers/bedrock.py)

- Reads bedrock-agent.json, cloudformation.yaml, instruction.txt, openapi.json, guardrail-config.json (any combination; precedence: bedrock-agent.json > CFN > txt)
- Reconstructs persona.system_prompt with section extraction
- Extracts tools from OpenAPI action-group schemas (with CFN fallback)
- Extracts knowledge sources from AWS::Bedrock::KnowledgeBase CFN resources
- Strips AgentShift truncation notice from instruction.txt
- Heuristic L1 guardrail extraction from instruction + guardrail-config.json topic policies
- Registered under 'bedrock' parser key

D23 — Vertex AI parser (src/agentshift/parsers/vertex.py)

- Reads agent.json (required) + optional tools.json and README.md
- Reconstructs system_prompt from goal + instructions with separator
- Recovers structured sections from 'SectionName:\ncontent' linearized patterns
- Detects tool kind: function / openapi / data store (routed to ir.knowledge)
- Reconstructs ToolAuth from Vertex authentication blocks (apiKey/oauth/serviceAccount)
- Heuristic L1 guardrail scan of instruction strings
- Registered under 'vertex' parser key

Shared utilities (src/agentshift/parsers/utils.py)

- slugify, title_case_to_slug, is_todo_placeholder
- infer_guardrail_category, infer_guardrail_severity
- extract_guardrails_from_text (shared by both parsers)

D24 — CLI updates (src/agentshift/cli.py)

- Added 'bedrock' and 'vertex' to _PARSERS registry
- convert/diff/audit now support --from bedrock and --from vertex
- Enhanced _parse_with_errors with bedrock/vertex-specific error hints

D25 — Version bump to 0.3.0

- pyproject.toml version: 0.3.0
- __init__.py __version__: 0.3.0
- CHANGELOG.md: v0.3.0 section added
- README.md: governance layer docs + cloud parser examples

Tests: 782 passed (all existing tests continue to pass)

ogkranthi · 2026-03-30T02:05:47Z

Tests cherry-picked to main directly (commit 27b27ca). T14-T15 merged.

ogkranthi added 3 commits March 28, 2026 22:05

chore: add Week 7 backlog (v0.3 governance framework + cloud parsers)

1933b45

ogkranthi added 2 commits March 28, 2026 22:19

chore: update BACKLOG.md — D22-D25, T14-T15 status to pr-created

d47abfc

ogkranthi closed this Mar 30, 2026

ogkranthi wants to merge 5 commits into main from agent/tester/T14-T15
