Skip to content

Test engineer plugin#150

Draft
nthompson-bitwarden wants to merge 11 commits into
mainfrom
test-engineer-plugin
Draft

Test engineer plugin#150
nthompson-bitwarden wants to merge 11 commits into
mainfrom
test-engineer-plugin

Conversation

@nthompson-bitwarden

Copy link
Copy Markdown

🎟️ Tracking

https://bitwarden.atlassian.net/browse/QA-1983

📔 Objective

This branch introduces a new bitwarden-test-engineer plugin to the marketplace -a test-engineering toolkit whose first role is the test-strategist agent.

Given a change (feature, bugfix, refactor, or migration), the agent recommends what to test, at which layer, and why, shaped to each repo's actual test practice. It does test planning only - it does not author, run, or maintain tests.

  • Registration - new entry in .claude-plugin/marketplace.json, plugin.json, CHANGELOG.md, README, and cspell additions.
  • One agent (test-strategist) - classifies inputs (Jira ticket via Atlassian MCP, GitHub PR via gh, test-case CSV, or plain description), fans out subagents to gather evidence, then runs two skills in sequence.
  • Two skills:
    • assessing-test-coverage - backward-looking inventory of what's already tested, buckets tests by layer, cites each as a GitHub permalink, flags gaps, emits an HTML coverage report.
    • analyzing-test-stack - maps each behavior to the cheapest sufficient test layer (unit/integration/E2E) per repo's shape, names concrete tooling, surfaces shape-wrong tests (ice-cream-cone, over-testing), emits an HTML report.
  • Supporting references/scripts - report templates, CSS/style tokens, monorepo-layout and test-layer guidance, severity-risk model, and a build-report.sh script.

Key design principle: each behavior is tested at the cheapest layer that buys the needed confidence, with layer weighting decided per repo (unit-heavy pyramid for server/clients, integration/snapshot trophy for ios, all-E2E for the dedicated private test repo). Atlassian integration is optional with graceful degradation.

@github-actions

github-actions Bot commented Jun 22, 2026

Copy link
Copy Markdown

🧪 Plugin Validation Report — bitwarden-test-engineer

Result: ✅ All checks passed. No critical or major issues. A few minor, non-blocking suggestions noted below.

PR #150 introduces the bitwarden-test-engineer plugin (1 agent, 2 skills, shared report-rendering references + script). Three validations were run: plugin structure, skill review, and security.


1. Plugin Structure (plugin-validator) — ✅ PASS

Check Result
plugin.json manifest (name, version, author, fields) ✅ Valid JSON, bitwarden-test-engineer / 1.0.0
Directory structure & auto-discovery agents/, skills/<name>/SKILL.md, references/, scripts/ all standard
Agent frontmatter (test-strategist.md) ✅ Valid name, 4 <example> blocks, valid color, system prompt present
Skill frontmatter (both skills) name + description present; dir names match name
README.md present ✅ Comprehensive
CHANGELOG.md format ✅ Keep a Changelog, ## [1.0.0] - 2026-06-15
Version consistency 1.0.0 across plugin.json, marketplace.json:99, agent frontmatter, changelog
Referenced files exist ✅ All ${CLAUDE_PLUGIN_ROOT} refs resolve
Hardcoded credentials ✅ None

Minor / informational (no action required):

  • agents/test-strategist.md:42model: inherit is unusual but intentional and documented in the system prompt. Confirm it's the desired default.
  • No plugin-level LICENSE (consistent with repo convention of relying on the root license). Only relevant if plugins are ever distributed standalone.

2. Skill Review (skill-reviewer) — ✅ PASS (both skills)

Skill Body words Description Frontmatter Refs
analyzing-test-stack ~768 ~690 chars ✅ Valid ✅ All resolve
assessing-test-coverage ~728 ~600 chars ✅ Valid ✅ All resolve

Both skills use textbook progressive disclosure — lean SKILL.md bodies with detail offloaded to references/ (including a shared ../../references/ layer with no duplication). All file references and in-document section anchors resolve. The two descriptions are explicitly disambiguated from each other (forward-looking recommendation vs. backward-looking inventory) with reciprocal "for that, use X" pointers — exactly right for a sibling pair that could otherwise cross-trigger.

Minor suggestions (should-fix, not blocking):

  • analyzing-test-stack/SKILL.md:3 — Description ~690 chars exceeds the ~500-char guideline; the opening clause is dense. Recommend trimming the input enumeration (covered in the body's Inputs section). (Highest-value suggestion.)
  • assessing-test-coverage/SKILL.md:3 — Description ~600 chars, modestly over guideline; optional trim of the opening clause.
  • Both bodies (~728–768 words) sit below the 1,000-word target. Acceptable here given the strong, lossless progressive disclosure — noted only for awareness.
  • analyzing-test-stack/SKILL.md:33-35 — "Principles" section has a single bullet; consider promoting 1–2 invariants ("cheapest sufficient layer wins", "rank gaps by severity") from the workflow.
  • Neither skill has an examples/ directory. For artifact-producing skills, a worked sample (e.g. a filled coverage-inventory record or a rendered report fragment) would aid reliability. Low priority.

3. Security Review (reviewing-claude-config) — ✅ CLEAN

All 17 changed files scanned. No critical, major, or minor security issues.

  • Secrets/credentials: None. Case-insensitive scan (api_key, password, token, secret, Bearer, ghp_, sk-, AKIA, PRIVATE KEY) returned only benign vocabulary ("style tokens", "token cost").
  • scripts/build-report.sh: Secure. set -euo pipefail (line 43); --slug validated against ^[a-zA-Z0-9._-]+$ (lines 81–84) before building paths, blocking traversal/metacharacter injection; --date validated against a strict date pattern (lines 85–88); CSS is spliced via awk -v css=.../getline as data, not code (lines 105–114); sed/awk substitution args are hardcoded literals, not user input. No rm -rf, chmod 777, eval, or piped curl|sh. Input files are existence-checked and read-only.
  • Agent tool scoping (test-strategist.md): Least-privilege. Bash limited to specific read-only gh/git subcommands plus the plugin's own script — no blanket Bash(*). MCP tools limited to read-only Atlassian getters/searchers; no mutation tools. Write is used only for report fragments.
  • No settings.local.json, settings.json, or .mcp.json among the changed files.

Summary

Severity Count
🔴 Critical 0
🟠 Major 0
🟡 Minor (should-fix) 2 (skill description lengths)
ℹ️ Informational 5

Recommendation: ✅ Approve. The plugin is structurally sound, both skills validate, version/changelog discipline satisfies repo requirements, and the security posture is strong (validated inputs, least-privilege tooling, no secrets). The only actionable items are cosmetic skill-description trims.

Note: The repo's validate-plugin-structure.sh / validate-marketplace.sh scripts could not be executed in this environment (interactive approval unavailable), but the plugin-validator agent performed equivalent checks — manifest correctness, structure, version consistency across all three files, and changelog format — all of which passed.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant