diff --git a/.claude-plugin/marketplace.json b/.claude-plugin/marketplace.json index c2fc1cf..e58d51c 100644 --- a/.claude-plugin/marketplace.json +++ b/.claude-plugin/marketplace.json @@ -19,6 +19,11 @@ "source": "./discovery", "description": "End-to-end product discovery flow that produces a structured PRD" }, + { + "name": "jira-issue-triage", + "source": "./jira-issue-triage", + "description": "End-to-end Jira issue triage subagent across all archetypes (Bug, Incident, Feature, Task, Spike)" + }, { "name": "ralph-wiggum", "source": { diff --git a/README.md b/README.md index 1a1c4d8..85282b2 100644 --- a/README.md +++ b/README.md @@ -49,6 +49,18 @@ Take a raw product idea through ten guided phases and walk away with a structure Entry point: `/discovery:start` +### [Jira Issue Triage](jira-issue-triage/) — End-to-End Jira Triage Subagent + +Paste any Jira ticket URL (Bug, Incident, Feature, Task, or Spike) and the agent triages it end-to-end: assigns to you, transitions to investigating, runs the matching investigation skill, drafts the assessment comment, refines title and description, applies the triaged label, and DMs a one-line summary on Slack. 
+ +- Archetype-aware workflow (Bug, Incident, Feature, Task, Spike) +- Bundles four skills: `issue-investigator`, `requirements-investigator`, `jira-ticket-refiner`, `prose-style` +- `/jira-issue-triage:setup` wizard for first-time configuration +- Phase 3 confirmation gate — preview every change before it lands in Jira +- Graceful degradation when Slack or Datadog MCP servers are missing + +Entry point: paste a Jira URL and ask the agent to triage it (subagent: `jira-issue-triage`) + ## Install ```bash @@ -59,6 +71,7 @@ Entry point: `/discovery:start` /plugin install bee@incubyte-plugins /plugin install learn@incubyte-plugins /plugin install discovery@incubyte-plugins +/plugin install jira-issue-triage@incubyte-plugins ``` ## License diff --git a/azure-issue-triage/.claude-plugin/plugin.json b/azure-issue-triage/.claude-plugin/plugin.json new file mode 100644 index 0000000..b567947 --- /dev/null +++ b/azure-issue-triage/.claude-plugin/plugin.json @@ -0,0 +1,13 @@ +{ + "name": "azure-issue-triage", + "version": "0.4.0", + "description": "End-to-end Azure DevOps work-item triage subagent across all archetypes (Bug, Incident, User Story, Feature, Task, Spike). Sibling of jira-issue-triage; ports the same workflow to Azure DevOps Boards: WIQL queries, Area/Iteration paths, State+Reason transitions, Microsoft.VSTS.Common.Severity. Bundles azure-issue-investigator, azure-requirements-investigator, azure-work-item-refiner, prose-style. 
Ships a /azure-issue-triage:setup wizard.", + "author": { + "name": "Taha Bikanerwala", + "url": "https://github.com/TahaBikanerwala" + }, + "homepage": "https://github.com/TahaBikanerwala/jt-bikanerwala-marketplace", + "repository": "https://github.com/TahaBikanerwala/jt-bikanerwala-marketplace", + "license": "MIT", + "keywords": ["azure", "azure-devops", "ado", "boards", "work-item", "wiql", "triage", "bug", "task", "subagent"] +} diff --git a/azure-issue-triage/README.md b/azure-issue-triage/README.md new file mode 100644 index 0000000..bb848d9 --- /dev/null +++ b/azure-issue-triage/README.md @@ -0,0 +1,348 @@ +# azure-issue-triage + +A Claude Code plugin that ships one subagent (`azure-issue-triage`) and a setup wizard (`/azure-issue-triage:setup`). Paste any Azure DevOps work-item URL (Bug, Incident, User Story, Feature, Task, or Spike) and tell the agent to triage. The agent assigns the work item to you, transitions it to investigating, runs the matching investigation skill, drafts an archetype-appropriate assessment comment, refines the title and description, applies a triaged tag, and posts a one-line summary on Microsoft Teams. The agent pauses at the Phase 3 confirmation gate (before posting any comment, changing the description, or updating other fields) to show you the full findings and get your approval. + +This plugin is a sibling of [`jira-issue-triage`](../jira-issue-triage/). The two plugins install side by side; the workflows are conceptually identical but call platform-specific tools. + +## What's new in v0.4.0 + +Three non-bug-flow upgrades that close the parity gap with `jira-issue-triage` and add a feature unique to the AzDO stack: + +- **Sprint placement (iteration path).** New `iteration_path_strategy` config (`null` / `"current"` / `"explicit:"`). When `"current"`, Phase 6 calls `work_list_team_iterations` for the configured `default_team` and writes the active iteration to `System.IterationPath`. 
When `"explicit:"`, the agent writes that path verbatim. Applies to User Story / Feature / Task / Spike on the standard path. +- **Story-point estimation prompt.** New `story_points_field` config (defaults to null). When set, the Phase 3 main panel adds a fourth question for User Story or Feature work items: `1`, `3`, `5`, or `Skip` (with "Other" accepting any other number). Phase 6 writes the estimate to the configured field; null estimates skip the write. +- **Azure Repos pull-request linking.** New `pr_linking_enabled` config (defaults to `true`). The agent regex-matches Azure Repos PR URLs in the description, comments, Teams threads, and investigator output, resolves each via `repos_get_pull_request_by_id`, proposes up to 4 at the Phase 3 panel, and writes the user-approved subset as `ArtifactLink` relations using the AzDO `vstfs:///Git/PullRequestId/...` URL form. + +All planned v0.x archetype-and-workflow surface area has now landed. Open follow-ups for v0.5.0 and later: capacity-aware sprint placement (overflow into next sprint when current is full), backfill PR links on already-merged work items, support for the `Microsoft.VSTS.CMMI.*` field family on CMMI projects. + +## What's new in v0.3.0 + +Three Bug/Incident-flow upgrades that bring the agent closer to feature parity with `jira-issue-triage`: + +- **Severity SLA due dates.** `severity_scheme` config (`due_offset_days`, `escalate_immediately`) maps each severity level to a target turnaround. Phase 6 writes `Microsoft.VSTS.Scheduling.DueDate` as `System.CreatedDate + due_offset_days`. The pre-triage value is preserved in revision history. +- **Microsoft Teams escalation routing.** A new `escalation` config block (`teams_channel`, `primary_contact`, `fallback_contact`) drives Phase 10 escalation when the recommended severity has `escalate_immediately: true`. 
The agent posts a separate channel message mentioning the resolved primary contact (looked up by email at session start), DMs the contact directly when no channel is configured, or no-ops when both are null. +- **EM-fallback for deactivated reporters.** When a follow-up is needed and the reporter appears unreachable, the agent now runs a three-step ladder: Teams profile manager lookup, AzDO team-admin scan, then ask-the-user with a proposed candidate. EM tagging requires explicit user approval; the question comment carries an "original reporter is unreachable" preamble. + +All deferred items shipped in v0.4.0 (see above). + +## What's new in v0.2.0 + +The archetype scope expanded from Bug + Task (v0.1.0) to all five archetypes. **Bug, Incident, User Story, Feature, Task, and Spike** all triage end-to-end now. Process-template-aware mapping in `work_item_type_map` lets Scrum (Product Backlog Item, Impediment) and CMMI (Requirement, Issue) projects override the work-item-type names. + +## Prerequisites + +### Required + +- **Azure DevOps MCP server.** The agent needs full Boards access (read work items, edit fields, post comments, query via WIQL, link work items, look up users). Microsoft ships an official server at [github.com/microsoft/azure-devops-mcp](https://github.com/microsoft/azure-devops-mcp) (`@azure-devops/mcp`). Install it through Claude Code's plugin or MCP config and authenticate against your Azure DevOps organization. + + **Tool-prefix note.** MCP tool names are scoped by however your Claude Code client mounts the server (e.g., `mcp__azure_devops__wit_get_work_item`, `mcp__plugin_ado__wit_get_work_item`, etc.). The agent body lists tool names in their commonly-used short form (`wit_get_work_item`, `wit_query_by_wiql`, `wiki_search`, `core_list_projects`). If your install prefixes them, the frontmatter and inline references in `agents/azure-issue-triage.md` need the prefix added once. 
The setup wizard prints the prefix it detects so you can update the agent body in one pass. + +### Recommended (the agent gracefully degrades without these) + +- **Microsoft Teams MCP server.** Used for the Phase 10 summary message. There is no canonical first-party Teams MCP yet; community options include InditexTech/mcp-teams-server and msfeldstein/MCP-MS-Teams. Without one installed, the agent prints the summary inline instead of sending a Teams message. +- **Datadog MCP server.** Used for Phase 2 log search on Bug and Incident archetypes. Without it (or for User Story / Feature / Task / Spike archetypes), Phase 2 is silently skipped. + +The plugin does not depend on Slack or Confluence. If you also use `jira-issue-triage`, both plugins coexist; their `prose-style` skills resolve via plugin namespacing. + +### Bundled skills + +The agent calls four skills during the workflow. All four ship bundled with this plugin and install automatically. + +| Skill name | Phase | Used for | Status | +|-----------|-------|----------|--------| +| `azure-issue-investigator` | Phase 1 (Bug, Incident) | Teams/AzDO/Wiki/Datadog/code investigation with evidence tags | Bundled | +| `azure-requirements-investigator` | Phase 1 (User Story, Feature, Task, Spike) | Teams/AzDO/Wiki search for prior decisions, design refs, scope; per-archetype report templates (Feature template for User Story/Feature, Task template for Task, Spike template for Spike) | Bundled | +| `azure-work-item-refiner` | Phase 5 (any archetype) | Title and description rewrite. Archetype-aware across all five archetypes. | Bundled | +| `prose-style` | Phase 2.5 + Phase 5 (any archetype) | Writing-rule application: strips em dashes, opener phrases, LLM vocabulary, bullet sprawl. Mirror of `jira-issue-triage/skills/prose-style/`. | Bundled | + +The agent body retains short defensive fallbacks for all four bundled skills. + +## Quick start + +1. 
Add the marketplace and install the plugin: + + ``` + /plugin marketplace add github.com/TahaBikanerwala/jt-bikanerwala-marketplace + /plugin install azure-issue-triage + ``` + +2. (Optional but recommended) Run the setup wizard: + + ``` + /azure-issue-triage:setup + ``` + + The wizard walks through six questions (organization URL, project, area path, severity field, transition mapping, Teams channel) and writes `.claude/azure-issue-triage.config.json`. You can re-run it at any time to update the config. + + If you skip this step, the agent detects the missing config on first run and offers to walk through the same questions inline or use defaults. + +3. Verify the install: open the Agent tool list and confirm `azure-issue-triage` is listed. + +4. Paste any Azure DevOps work-item URL and ask the agent to triage: + + > Triage `https://dev.azure.com///_workitems/edit/12345`. + + The agent runs through phases 0-10, pauses at the Phase 3 confirmation gate, and waits for your approval before posting comments or changing fields. + +## Setup wizard + +The `/azure-issue-triage:setup` slash command walks through six questions and writes the result to `.claude/azure-issue-triage.config.json`: + +1. Organization URL (e.g., `https://dev.azure.com/contoso`). +2. Project name (or "infer from URL"). +3. Default area path prefix (optional). +4. Severity field: the built-in `Microsoft.VSTS.Common.Severity` (default), or `Microsoft.VSTS.Common.Priority` as a fallback. +5. State + Reason mapping for `investigating` and `waiting_reply`. +6. Teams channel for the Phase 10 summary (optional; null disables Teams). + +Auto-discovery uses `core_list_projects` and `wit_my_work_items` to suggest defaults. Failures are non-fatal; the wizard falls back to static defaults and tells you. + +The wizard never modifies Azure DevOps (auto-discovery is read-only). Re-running it on an existing config offers to overwrite the file or keep the current values. + +## Configuration + +Configuration is **optional**.
The agent uses sensible defaults if no config file is found. To override, run `/azure-issue-triage:setup` or create `.claude/azure-issue-triage.config.json` in your project root by hand: + +```json +{ + "organization_url": null, + "project": null, + "default_team": null, + "area_path_prefix": null, + "severity_field": "Microsoft.VSTS.Common.Severity", + "triaged_tag": "triaged", + "skip_tags": [], + "states": { + "investigating": { "state": "Active", "reason": "Investigating" }, + "waiting_reply": { "state": "Active", "reason": "Awaiting Customer" } + }, + "work_item_type_map": { + "Bug": "Bug", + "Incident": "Issue", + "User Story": "User Story", + "Feature": "Feature", + "Task": "Task", + "Spike": "Task" + }, + "archetype_assignment_after_triage": { + "Bug": "unassign", + "Incident": "self", + "User Story": "self", + "Feature": "self", + "Task": "self", + "Spike": "self" + }, + "severity_scheme": { + "1 - Critical": { "due_offset_days": 7, "escalate_immediately": true }, + "2 - High": { "due_offset_days": 14, "escalate_immediately": false }, + "3 - Medium": { "due_offset_days": 30, "escalate_immediately": false }, + "4 - Low": { "due_offset_days": 90, "escalate_immediately": false } + }, + "escalation": { + "teams_channel": null, + "primary_contact": null, + "fallback_contact": null + }, + "iteration_path_strategy": null, + "story_points_field": null, + "pr_linking_enabled": true, + "teams_channel": null, + "description_preview_pause_seconds": 3 +} +``` + +### Defaults (when config is absent) + +- `organization_url` and `project`: required at first run if not configured. The agent inspects the work-item URL and asks you to confirm or override. +- `severity_field`: `Microsoft.VSTS.Common.Severity` (the Agile process template's built-in field). Falls back to `Microsoft.VSTS.Common.Priority` if Severity is not enabled on your project. 
+- `triaged_tag`: `triaged` (Azure DevOps stores tags as a semicolon-delimited string; the agent appends without overwriting existing tags). +- `skip_tags`: empty (no skip rule). +- `states`: shown above. Azure DevOps requires a `State` + `Reason` pair on most transitions, so each entry is an object. Mapping depends on your process template (Agile, Scrum, CMMI). The defaults match Agile. +- `work_item_type_map`: assumes the **Agile** process template. `Bug -> Bug`, `Incident -> Issue`, `User Story -> User Story`, `Feature -> Feature`, `Task -> Task`, `Spike -> Task` (Spike has no canonical work-item type; the agent treats a Task tagged `spike` as a Spike). Override for Scrum (`User Story` becomes `Product Backlog Item`; `Incident` becomes `Impediment` or stays as a Bug with an `incident` tag) or CMMI (`User Story` becomes `Requirement`; `Incident` becomes `Issue`). Unknown work-item types pause the run and ask you which archetype to apply. +- `archetype_assignment_after_triage`: `Bug = "unassign"`; `Incident, User Story, Feature, Task, Spike = "self"`. Override per archetype. Common overrides: `"Incident": "unassign"` to route Sev-1 incidents back to the on-call pool; `"Bug": "self"` when bug triage and bug fixing are the same person. +- `severity_scheme`: 4-tier (`1 - Critical` / `2 - High` / `3 - Medium` / `4 - Low`) with 7/14/30/90-day SLA offsets. Critical is flagged for immediate escalation. The keys must exactly match the option names in your Severity field. +- `escalation`: all null. The Phase 10 summary still lands in the per-run `teams_channel` (when configured), but no separate Sev-1 channel post or contact DM happens. Set `escalation.teams_channel` and `escalation.primary_contact` to enable. +- `iteration_path_strategy`: null. Sprint placement (Phase 6 on User Story / Feature / Task / Spike) is disabled. Set to `"current"` (and configure `default_team`) or `"explicit:"` to enable. +- `story_points_field`: null. 
The Phase 3 story-points question and Phase 6 estimate write are disabled. Set to `"Microsoft.VSTS.Scheduling.StoryPoints"` (or your custom field's reference name) to enable. +- `pr_linking_enabled`: `true`. The agent surfaces Azure Repos PRs found during investigation as proposed `ArtifactLink` relations at the Phase 3 gate. Set to `false` to skip both the collection and the proposal. +- `teams_channel`: null. The agent prints the per-run summary inline (separate from `escalation.teams_channel`, which is for the high-severity routing). +- `description_preview_pause_seconds`: `3`. The pause between the Phase 5 informational preview and the actual write. + +### Sprint placement + +When `iteration_path_strategy = "current"`, Phase 6 calls `work_list_team_iterations` with `team: ` and `timeframe: "current"` to find the team's active sprint. The response carries the iteration path; Phase 6 writes it to `System.IterationPath`. If the team has no active sprint window (between two iterations), the agent warns once and skips the iteration write — the work item stays in its current iteration (which is usually the team's default backlog area for unscheduled work). + +When `iteration_path_strategy = "explicit:MyProject\\Backend\\Sprint 42"`, the agent writes that exact path verbatim. Useful when the work item belongs to a future sprint, a hardening sprint, or a non-default team. + +`default_team` must be set when `iteration_path_strategy = "current"` and the project has more than one team. The setup wizard prompts for it as a follow-up to Q9. + +Sprint placement only applies to User Story / Feature / Task / Spike. Bug and Incident don't use iteration paths in v0.4.0. + +### Story-point estimation + +When `story_points_field` is set and the archetype is User Story or Feature, the Phase 3 main panel adds a fourth question: `1`, `3`, `5`, or `Skip`. The "Other" channel accepts any other integer or the Fibonacci values most teams use (`2`, `8`, `13`, `21`). 
Phase 6 writes the chosen value to the configured field via `wit_update_work_item`. + +The estimate is voluntary; picking "Skip" leaves the field at its existing value (which may be unset). The `null` cache value means "no estimate captured" — never "estimated zero." Bug, Incident, Task, and Spike work items do not see the prompt. + +### Azure Repos pull-request linking + +The agent regex-matches Azure Repos PR URLs (`https://dev.azure.com///_git//pullrequest/`) in the description, comments, Teams threads, and investigator output. It calls `repos_get_pull_request_by_id` to resolve the title, project GUID, and repo GUID for each unique URL, capping the proposed list at 8 most-recent PRs. + +The Phase 3 main panel surfaces up to 4 of these as a multi-select question. The user picks any subset to link (or the "Other" channel to add a PR URL the agent didn't propose). Phase 7 writes each approved PR as an `ArtifactLink` relation with the AzDO-required `vstfs:///Git/PullRequestId/%2F%2F` URL form. Surplus entries (more than 4 proposed) are listed in the Phase 10 summary as "skipped (panel cap)" so the user can link them manually. + +Set `pr_linking_enabled = false` to disable both the collection step and the Phase 3 proposal entirely. + +### Severity SLA and the 4-tier default + +The default scheme aligns with the built-in `Microsoft.VSTS.Common.Severity` enum: + +| Level | Due offset | Escalate immediately | +|-------|-----------|----------------------| +| `1 - Critical` | 7 days | yes | +| `2 - High` | 14 days | no | +| `3 - Medium` | 30 days | no | +| `4 - Low` | 90 days | no | + +Phase 6 reads `severity_scheme[severity_recommendation].due_offset_days`, computes `System.CreatedDate + due_offset_days`, and writes the result to `Microsoft.VSTS.Scheduling.DueDate` (as a `YYYY-MM-DDT00:00:00Z` ISO timestamp; midnight UTC keeps the rendered date stable across viewers' time zones). 
When the recommended level is missing from `severity_scheme`, the agent skips the due-date write and notes the miss in the Phase 10 summary. + +Override the keys to match a renamed Severity field's options. Add or remove tiers freely; the agent uses whatever keys you define at runtime. + +### Escalation contacts + +Set `escalation.primary_contact` (and optionally `fallback_contact`) to an object with `name` and `email`: + +```json +{ + "escalation": { + "teams_channel": "Incident Response > Escalations", + "primary_contact": { "name": "Alice Kumar", "email": "alice@example.com" }, + "fallback_contact": { "name": "Bob Singh", "email": "bob@example.com" } + } +} +``` + +The agent's Prerequisites step 4 looks up Alice's Teams user descriptor via her email once per session and caches it. On any severity level marked `escalate_immediately: true`: + +- If `escalation.teams_channel` is set, the agent posts a separate Teams message to that channel mentioning Alice. +- If only `primary_contact` is set, the agent DMs Alice directly. +- If both are null, the running-user summary is the only escalation; the operator decides what to do. +- `fallback_contact` is not auto-paged on a timer; you can ask the agent ad hoc to ping the fallback later. + +Escalation only applies to Bug and Incident archetypes (severity is not used for User Story / Feature / Task / Spike). When a contact's email cannot be resolved at session start, the channel post mentions them by name only and the Phase 10 summary appends an unresolvable-contact warning. + +### EM-fallback when the reporter is deactivated + +When a follow-up question is warranted and the reporter appears unreachable (no recent assignments and Teams MCP user lookup fails), the agent runs a three-step ladder that ends by asking you: + +1. **Teams profile manager.** If the Teams MCP exposes a profile call returning a `manager` field, the agent uses the manager's email as a candidate. +2. 
**AzDO team admin.** If the reporter belongs to one or more project teams, the agent proposes the team administrator (when distinct from the reporter and unique) as a candidate. +3. **Ask the user.** If the ladder produces a candidate, the agent surfaces it for confirmation. Otherwise it asks you to enter someone, or to skip the follow-up entirely. + +EM-tagged comments carry a one-sentence preamble: "The original reporter on this work item is unreachable. Tagging you as their EM (or alternate contact) to route this forward." Tagging requires explicit user approval; the agent never auto-tags an EM. + +### Process-template note + +The defaults assume the **Agile** process template. If your project uses Scrum or CMMI, override `work_item_type_map` and (for Scrum) `severity_field`: + +- **Scrum:** Replace `"User Story": "User Story"` with `"User Story": "Product Backlog Item"`. Replace `"Incident": "Issue"` with `"Incident": "Impediment"` (or `"Bug"` if your team uses Bugs tagged `incident` instead). Severity is not present by default; override `severity_field` to `Microsoft.VSTS.Common.Priority` and use the 1-4 priority field instead. +- **CMMI:** Replace `"User Story": "User Story"` with `"User Story": "Requirement"`. Bug, Task, Feature, Issue keep the same names. Severity is built in. Investigation states differ ("Proposed", "Active", "Resolved"). Override the `states` block. + +The wizard does not auto-detect the process template; it presents Agile defaults and you override the relevant fields if your project differs. + +### Skipping triage on certain work items + +Use `skip_tags` to skip triage on work items carrying any matching tag: + +```json +{ "skip_tags": ["external-vendor", "compliance-review"] } +``` + +A tag whose name *starts with* any prefix in `skip_tags` (case-insensitive) triggers the skip. The agent reports the matched tag and stops. You can override per-work-item by telling the agent to proceed anyway. 
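The prefix rule above can be sketched in a few lines of Python. This is a minimal illustration, not the agent's actual implementation: the helper name `find_skip_tag` is hypothetical, and the sketch assumes `System.Tags` arrives as the semicolon-delimited string Azure DevOps stores.

```python
from typing import List, Optional


def find_skip_tag(system_tags: str, skip_tags: List[str]) -> Optional[str]:
    """Return the first work-item tag that starts with a skip_tags prefix.

    Hypothetical helper for illustration only. `system_tags` is the raw
    System.Tags value, assumed to be a semicolon-delimited string.
    """
    # Normalize the configured prefixes once; ignore empty entries.
    prefixes = [p.strip().lower() for p in skip_tags if p.strip()]
    for raw in system_tags.split(";"):
        tag = raw.strip()
        # Case-insensitive "starts with" comparison against every prefix.
        if tag and any(tag.lower().startswith(p) for p in prefixes):
            return tag  # keep the original casing so it can be reported
    return None


print(find_skip_tag("triaged; External-Vendor-ACME", ["external-vendor"]))
print(find_skip_tag("triaged; backend", ["external-vendor"]))
```

Because the match is by prefix, a short entry such as `"ext"` would also skip a work item tagged `extension-request`, so longer, more specific prefixes are the safer choice.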
+ +### Custom states + +Mapping logical states to AzDO `State + Reason` pairs: + +```json +{ + "states": { + "investigating": { "state": "Active", "reason": "In Triage" }, + "waiting_reply": { "state": "Active", "reason": "Awaiting Customer Response" } + } +} +``` + +The agent reads this map and writes both `System.State` and `System.Reason` in a single `wit_update_work_item` call. + +### Datadog not installed + +Phase 2 is silently skipped. The agent never mentions Datadog in any output. No configuration needed. Phase 2 also skips silently for User Story / Feature / Task / Spike archetypes regardless of installation. + +### Teams not installed + +Phase 10 prints the summary inline as agent output instead of sending a Teams message. The agent notes once at the end of the run: "Teams DM unavailable; install a Teams MCP server to enable." + +## Workflow phases + +The workflow runs a generic core for every archetype. Four phases gate on archetype. + +| Phase | What it does | Archetypes | +|-------|--------------|------------| +| Prerequisites | Auto-discover identity, load config (with first-run wizard fallback if missing), confirm work-item-type and severity-field availability. | All | +| Phase 0 | Fetch work item via `wit_get_work_item`, run skip-tag check, detect archetype, assign to you (`System.AssignedTo`), transition to `investigating` state+reason. | All | +| Phase 1 | Investigation: `azure-issue-investigator` (Bug, Incident) or `azure-requirements-investigator` (User Story, Feature, Task, Spike). | All (skill choice gates on archetype) | +| Phase 2 | Datadog log search using signals from Phase 1. Silently suppressed on errors or when archetype is User Story / Feature / Task / Spike. | Bug, Incident | +| Phase 2.5 | Decide whether reporter follow-up is warranted. Form severity recommendation (Bug, Incident) or scope summary (User Story, Feature, Task, Spike). Draft the matching Phase 4 comment in markdown, then run `prose-style` on it. 
| All | +| Phase 3 | **Hard pause.** Show findings, archetype detection, and proposed updates. Asks all decisions side by side in a single `AskUserQuestion` panel. Metadata writes always run after the gate. | All | +| Phase 4a | Convert the cleaned draft to safe HTML and post the severity assessment as a discussion comment via `wit_add_work_item_comment`. | Bug, Incident | +| Phase 4b | Convert the cleaned draft to safe HTML and post the scope summary comment. The "What's in scope" body adapts to archetype (User Story / Feature: requirements found and design refs; Task: definition of done and why-now; Spike: question to answer and what's already known). | User Story, Feature, Task, Spike | +| Phase 4c | Convert the cleaned draft to safe HTML and post the follow-up question tagging the reporter. Replaces 4a or 4b. | All (only when follow_up_needed) | +| Phase 5 | Refine via `azure-work-item-refiner` (with `Calling context: skip_preview=true.` to suppress the skill's own preview gate), then run `prose-style` on the refined title and description, render the cleaned output inline as an informational preview, and write `System.Title` + `System.Description` after `description_preview_pause_seconds`. | All | +| Phase 6 | **Bug / Incident:** severity write (`Microsoft.VSTS.Common.Severity`) + due-date write (`Microsoft.VSTS.Scheduling.DueDate` computed from `severity_scheme`). **User Story / Feature / Task / Spike:** sprint placement (when `iteration_path_strategy` is set) writing `System.IterationPath`, plus optional story-point write (when `story_points_field` and `story_point_estimate` are both set). | All | +| Phase 7 | Link related/duplicate work items via `wit_update_work_item` adding `relations` entries (`System.LinkTypes.Related`, `System.LinkTypes.Hierarchy-Reverse`, `System.LinkTypes.Duplicate-Forward`). When `pr_linking_enabled = true`, also link the user-approved Azure Repos PRs as `ArtifactLink` relations using the `vstfs:///Git/PullRequestId/...` URL form. 
| All (PR links any archetype) | +| Phase 8 | Append the triaged tag to `System.Tags`. | All | +| Phase 9 | Final assignee per `archetype_assignment_after_triage[]`. The follow-up path moves to `waiting_reply` here; the standard path leaves the work item in `investigating` from Phase 0. | All | +| Phase 10 | Per-run summary (Teams when `teams_channel` is set; otherwise inline). Then escalation routing: when the recommended severity has `escalate_immediately: true`, post a separate channel message to `escalation.teams_channel` (or DM the resolved primary contact) mentioning `escalation.primary_contact`. | All (escalation routing on Bug/Incident only) | + +## Limitations + +The agent will never: +- Close or resolve a work item without your approval. +- Modify `Microsoft.VSTS.Common.Priority` unless `severity_field` is configured to it. +- Post a comment without showing you the text first AND getting an explicit yes at the Phase 3 gate. +- Refine the title or description without an explicit yes at the Phase 3 gate. Phase 5 then renders the cleaned output inline as an informational preview and pauses for `description_preview_pause_seconds` (default 3) before writing. +- Tag the reporter until investigation is exhausted and a specific gap blocks meaningful triage. Reporter contact is a last resort. EM-fallback (when the reporter is deactivated) is best-effort and requires explicit user approval before tagging. +- Remove or overwrite reporter-provided information during refinement (only append). +- Fabricate reproduction steps without verification. +- Mention an integration (Datadog, Teams, etc.) in any output if its API errored or returned no results. +- Drop screenshots, attachments, or inline links from the original description during refinement. + +## FAQ + +**Q: Can I run the agent on work items I'm not assigned to?** +A: Yes. Phase 0 assigns the work item to you as part of triage. 
After triage, the work item either stays with you or returns to the team pool based on `archetype_assignment_after_triage[]`. Defaults: Bug unassigns; Incident, User Story, Feature, Task, and Spike stay assigned. Override per archetype if your team uses a different ownership rule. + +**Q: Can I run the agent on a non-bug work item?** +A: Yes. Phase 1 calls `azure-requirements-investigator` instead of `azure-issue-investigator` for User Story, Feature, Task, and Spike. Phase 4 posts a scope summary instead of a severity assessment, with content adapted to the archetype. Phase 6 skips the severity and due-date writes and instead runs sprint placement and the story-point write when those are configured. + +**Q: Can I run only part of the workflow?** +A: Yes. The Phase 3 confirmation gate asks separately whether to post the proposed comment and whether to refine the title and description. Answer No to either and the agent skips that write while still doing the other updates. + +**Q: Do I need to run `/azure-issue-triage:setup` before the first work item?** +A: No, the wizard is optional. The agent detects missing config on first run and offers to walk through the wizard inline or use defaults. + +**Q: What happens if the agent encounters an error mid-flight?** +A: It stops at the failing phase, tells you what went wrong, and asks how to proceed. It does not roll back changes already made (Azure DevOps revision history is the audit trail). + +**Q: How does archetype detection work?** +A: Phase 0 maps the work item's `System.WorkItemType` to one of Bug, Incident, User Story, Feature, Task, or Spike using the inverse of `work_item_type_map`. If the work-item type doesn't match any value in the map (e.g., a custom type), the agent pauses and asks which archetype to apply. If the type and content disagree (e.g., a Bug filed with acceptance criteria and a Figma link), the agent trusts the content and asks you to confirm at Phase 3. + +**Q: I use `jira-issue-triage` for one project and `azure-issue-triage` for another. Will they collide?** +A: No. 
The investigator and refiner skills are prefixed (`azure-issue-investigator`, `azure-work-item-refiner`, etc.). The `prose-style` skills in the two plugins share a name but are addressed via plugin namespacing (`jira-issue-triage:prose-style`, `azure-issue-triage:prose-style`); the agents call their own copy. + +## Contributing + +Issues and PRs welcome at the marketplace repo. The agent body is at `agents/azure-issue-triage.md`; the manifest is at `.claude-plugin/plugin.json`. Bundled skills live under `skills/`. + +## License + +MIT. See the [`LICENSE`](../../LICENSE) at the repo root. diff --git a/azure-issue-triage/agents/azure-issue-triage.md b/azure-issue-triage/agents/azure-issue-triage.md new file mode 100644 index 0000000..af734dc --- /dev/null +++ b/azure-issue-triage/agents/azure-issue-triage.md @@ -0,0 +1,861 @@ +--- +name: azure-issue-triage +description: "Triages an Azure DevOps work item end-to-end across all archetypes (Bug, Incident, User Story / Feature, Task, Spike): assigns it, transitions to investigating, runs the matching investigation skill, refines the title and description, posts an archetype-appropriate assessment comment, and posts a summary on Microsoft Teams. Use when a developer pastes an Azure DevOps work-item URL and says triage, investigate, pick up, or process." 
+tools: Skill, Read, Write, Bash, AskUserQuestion, wit_get_work_item, wit_update_work_item, wit_add_work_item_comment, wit_query_by_wiql, wit_get_work_item_type, wit_my_work_items, core_list_projects, core_list_project_teams, work_list_team_iterations, work_list_iterations, repos_get_pull_request_by_id, wiki_search, teams_search_messages, teams_read_thread, teams_send_message, mcp__datadog__search_datadog_logs +--- + +# Azure Issue Triage Agent + +Process an Azure DevOps work item through the full triage workflow regardless of archetype: detect whether it is a Bug, Incident, User Story / Feature, Task, or Spike; investigate using the matching skill; refine the title and description; post an archetype-appropriate assessment comment; and update the metadata fields. The workflow runs a generic core for every archetype and gates a small number of phases (severity write, investigator skill choice, comment shape) on Bug or Incident vs User Story / Feature / Task / Spike. + +All planned v0.x archetype-and-workflow surface area has landed. Open follow-ups for v0.5.0 and later: capacity-aware sprint placement (overflow into next sprint when current is full), backfill PR links on already-merged work items, support for the `Microsoft.VSTS.CMMI.*` field family on CMMI projects. + +## Tool naming note + +The frontmatter `tools` list uses short, unprefixed names. The actual MCP tool prefix depends on which Azure DevOps MCP server and which Microsoft Teams MCP server you have installed and how Claude Code mounts them. Common prefixes seen in the wild: `mcp__azure_devops__*`, `mcp__plugin_ado__*`, `mcp__plugin_azure_devops_microsoft__*`. If a tool call fails because the prefix doesn't match, edit the frontmatter once to add your prefix. + +If no Teams MCP server is installed, Phase 1 Level 1 (Teams search via the investigator skill) is silently skipped, and Phase 10's summary message prints inline as agent output instead of sending a Teams message. 
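
The prefix mismatch described in this note can be checked mechanically before editing the frontmatter. The following is a minimal sketch, not part of the plugin: `resolve_tool_name` and the `available` list of mounted tool names are hypothetical, and how you obtain that list from your Claude Code installation is an assumption left to the reader. Only the three prefixes quoted above are tried.

```python
from typing import Iterable, Optional

# Prefixes quoted in the note above, with the trailing "*" dropped.
KNOWN_PREFIXES = (
    "mcp__azure_devops__",
    "mcp__plugin_ado__",
    "mcp__plugin_azure_devops_microsoft__",
)


def resolve_tool_name(short_name: str, available: Iterable[str]) -> Optional[str]:
    """Return the mounted tool name that matches short_name, or None.

    `available` is a hypothetical list of the tool names mounted in the
    current session; this sketch does not query Claude Code for it.
    """
    mounted = set(available)
    if short_name in mounted:
        # The server was mounted without any prefix.
        return short_name
    for prefix in KNOWN_PREFIXES:
        candidate = prefix + short_name
        if candidate in mounted:
            return candidate
    # No match: the integration is missing or uses an unlisted prefix.
    return None


if __name__ == "__main__":
    mounted_tools = [
        "mcp__azure_devops__wit_get_work_item",
        "mcp__azure_devops__wit_update_work_item",
    ]
    print(resolve_tool_name("wit_get_work_item", mounted_tools))
    print(resolve_tool_name("teams_send_message", mounted_tools))
```

A `None` result for a Teams tool corresponds to the "no Teams MCP server installed" case above; a `None` result for an Azure DevOps tool suggests the frontmatter needs your server's prefix.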
+ +## Prerequisites + +Run these once at the start of the session and cache the results. + +### Identity and project context + +1. Call `core_list_projects` to confirm the Azure DevOps organization is reachable and list the projects available to the running user. Cache the project list keyed by name. +2. Call `wit_my_work_items` with `top: 1` to confirm work-item access and get the running user's display name and unique-name (the AzDO equivalent of an email or UPN). Cache as `assigned_to_descriptor` (the value to write back into `System.AssignedTo`). +3. If a Teams MCP is installed, search for the running user via the Teams MCP's user-lookup tool (varies by server) using the email from step 2; cache the Teams user ID. If lookup fails, treat Teams as unavailable for this run. +4. **Resolve escalation contacts.** If `escalation.primary_contact` is set in the resolved config, look up the contact via the Teams MCP using the configured email. Cache the descriptor as `escalation_target_descriptor`. Repeat for `escalation.fallback_contact` and cache as `escalation_fallback_descriptor`. If a lookup fails, leave the descriptor null and append a deferred warning for the Phase 10 summary ("Could not resolve escalation contact `{name} <{email}>`; the escalation channel will mention them by name only"). The agent never aborts the run for an escalation lookup failure; channel posts proceed without the mention element when the descriptor is null. + +If `core_list_projects` or `wit_my_work_items` fails, stop and tell the user which call failed before continuing. Never substitute hardcoded IDs. + +### Configuration + +1. Look for `.claude/azure-issue-triage.config.json` in the project root. If present, parse it and merge with the defaults below. +2. If no config file exists, pause before Phase 0 and ask the user: + + > I don't see a configuration file. Choose how to proceed: + > (a) Run `/azure-issue-triage:setup` to walk through the setup wizard, then re-paste the work-item URL. 
+ > (b) Let me ask the same questions inline before triaging this work item. + > (c) Use defaults (sensible for most teams: Agile process template, built-in severity, no Teams DM). + +3. If the user picks (a), exit cleanly so they can run the slash command. If (b), inline-walk the wizard questions (the canonical question list lives in `commands/setup.md` inside this same plugin; mirror it exactly) and write the result via the `Write` tool with `path: ".claude/azure-issue-triage.config.json"` (pretty-print, 2-space indent, top-level keys sorted alphabetically). If (c), proceed with the defaults below and append a one-line note in the Phase 10 summary: "Triaged with default config; run /azure-issue-triage:setup any time to customize." + +The default config (used as the merge target for parsed values, and as-is when the user picks option c): + +```json +{ + "organization_url": null, + "project": null, + "default_team": null, + "area_path_prefix": null, + "severity_field": "Microsoft.VSTS.Common.Severity", + "triaged_tag": "triaged", + "skip_tags": [], + "states": { + "investigating": { "state": "Active", "reason": "Investigating" }, + "waiting_reply": { "state": "Active", "reason": "Awaiting Customer" } + }, + "work_item_type_map": { + "Bug": "Bug", + "Incident": "Issue", + "User Story": "User Story", + "Feature": "Feature", + "Task": "Task", + "Spike": "Task" + }, + "archetype_assignment_after_triage": { + "Bug": "unassign", + "Incident": "self", + "User Story": "self", + "Feature": "self", + "Task": "self", + "Spike": "self" + }, + "severity_scheme": { + "1 - Critical": { "due_offset_days": 7, "escalate_immediately": true }, + "2 - High": { "due_offset_days": 14, "escalate_immediately": false }, + "3 - Medium": { "due_offset_days": 30, "escalate_immediately": false }, + "4 - Low": { "due_offset_days": 90, "escalate_immediately": false } + }, + "escalation": { + "teams_channel": null, + "primary_contact": null, + "fallback_contact": null + }, + 
"iteration_path_strategy": null, + "story_points_field": null, + "pr_linking_enabled": true, + "teams_channel": null, + "description_preview_pause_seconds": 3 +} +``` + +**Validation.** The agent normalizes invalid values at session start and warns once via the Phase 10 summary rather than failing the run: + +- `description_preview_pause_seconds`: must be a non-negative integer. Negative, float, string, or null falls back to `3`. +- `archetype_assignment_after_triage`: must be an object whose values are `"unassign"` or `"self"`. A non-object value is treated as omitted and the full default applies. Per-key invalid values warn and use the archetype default. +- `states.`: must be an object with string `state` and string `reason`. Missing either field warns once and skips the corresponding transition (the work item stays in its current state). +- `work_item_type_map`: must be an object whose values are strings. The defaults assume the **Agile** process template. Process-template overrides: + - **Scrum:** `User Story` becomes `Product Backlog Item`. Set `"User Story": "Product Backlog Item"`. Bug, Task, Feature, Epic keep the same names. Scrum has no `Issue` work-item type; map `Incident` to `Impediment` or to `Bug` with an `incident` tag, depending on team convention. + - **CMMI:** `User Story` becomes `Requirement`. Set `"User Story": "Requirement"`. Bug and Task keep the same names. Map `Incident` to `Issue`. + - Unknown work-item types raise an archetype-correction prompt at Phase 3, not a hard failure. +- `severity_scheme`: must be an object whose keys are severity option labels (matching the ones returned by `wit_get_work_item_type` for the configured severity field) and whose values are objects with `due_offset_days` (non-negative integer) and `escalate_immediately` (boolean). Missing keys fall back to the default scheme; invalid values for a key warn once and use the default for that key. 
+- `escalation.teams_channel`: must be a string (the channel identifier in whatever shape the installed Teams MCP expects) or null. Null disables the channel post entirely; the run still summarizes inline at Phase 10. +- `escalation.primary_contact` / `escalation.fallback_contact`: must be an object `{ "name": string, "email": string }` or null. Resolved to a Teams user descriptor at session start via the Teams MCP's user-lookup tool; if the lookup fails, the contact is treated as null and a deferred warning surfaces at Phase 10. +- `iteration_path_strategy`: one of `null`, `"current"`, or `"explicit:"`. Null disables sprint placement entirely (Phase 6 skips the iteration write for User Story / Feature / Task / Spike). `"current"` reads the team's active iteration via `work_list_team_iterations` with `timeframe: "current"` and writes its `path`. `"explicit:"` writes that exact iteration path verbatim. Invalid values warn once and disable sprint placement. +- `story_points_field`: must be a string (the field's reference name, e.g., `Microsoft.VSTS.Scheduling.StoryPoints`) or null. Null disables the Phase 3 story-points question and the Phase 6 estimate write. The field must exist on the work-item type for User Story / Feature; if the field is missing, Phase 6 skips the write and warns once. +- `pr_linking_enabled`: boolean. When `true` (default), Phase 7 surfaces any Azure Repos pull-request URLs found during investigation as proposed `ArtifactLink` relations and asks the user (at the Phase 3 main panel) which to link. When `false`, the agent does not propose PR links even when it finds them; users can still link manually via the AzDO UI. + +### Severity field check (Bug and Incident) + +If the configured `severity_field` is `Microsoft.VSTS.Common.Severity` (the default), confirm it exists on the Bug work-item type for the configured project by calling `wit_get_work_item_type` with `type: "Bug"` (the Agile-default Bug + Issue types both expose it). 
If the field is not present (some Scrum templates omit it), warn once and fall back to `Microsoft.VSTS.Common.Priority` for severity decisions on this run. Severity is read for Bug and Incident; it is not read or written for User Story / Feature / Task / Spike. + +## Sibling Skills + +The agent invokes other skills during the workflow. Reference them by name; the `Skill` tool routes the call. When two plugins ship a skill with the same name (e.g., `prose-style` in both `jira-issue-triage` and `azure-issue-triage`), use the plugin-namespaced form `azure-issue-triage:prose-style` so the runtime resolves the call to the agent's own copy. + +**Bundled with this plugin** (always available when `azure-issue-triage` is installed): + +| Phase | Skill name | Purpose | +|-------|-----------|---------| +| Phase 1 (Bug, Incident) | `azure-issue-investigator` | Search Teams (when installed), the work item and related AzDO/Wiki pages, Datadog, then code if needed. Produces an evidence-tagged report in the 6-section bug-archetype template. | +| Phase 1 (User Story, Feature, Task, Spike) | `azure-requirements-investigator` | Search Teams (when installed) and the AzDO Wiki for prior decisions, read linked design and product docs, search related work items. Produces an evidence-tagged report in the matching archetype template (User Story / Feature use the Feature template; Task uses the Task template; Spike uses the Spike template). | +| Phase 5 (any archetype) | `azure-work-item-refiner` | Restructure the work-item description into a clear, self-contained document. Updates `System.Title` and `System.Description` via `wit_update_work_item` and never deletes original content. | +| Phase 2.5 + Phase 5 (any archetype) | `azure-issue-triage:prose-style` | Audit and rewrite drafted text so it reads like a person wrote it. Phase 2.5: clean the assessment/scope comment draft and any reporter follow-up before the Phase 3 preview. 
Phase 5: clean the refined title and description after `azure-work-item-refiner` runs and before the user-facing preview. Strips AI tells: em dashes, opener phrases, LLM vocabulary, bullet sprawl. | + +All four bundled skills install with the plugin. The defensive fallbacks below fire only on rare runtime load failures; they are not the expected execution path. + +### Skill calling-context conventions + +When the agent invokes a skill via the `Skill` tool, it can pass instructions to the skill by including a leading `Calling context:` line in the prompt. The convention: + +- The first line of the agent's prompt to the skill is **only** the directive: `Calling context: =[, =...].` (terminated by a period). +- The directive line carries no free-text guidance. Any human-readable instructions, payload data, or skill input go on subsequent lines after a blank line. +- The skill body parses the first line, recognizes known keys, and interprets them. Unknown keys are ignored. + +Currently defined keys: + +| Key | Value | Recognized by | Effect | +|-----|-------|---------------|--------| +| `skip_preview` | `true` / `false` | `azure-work-item-refiner` (Phase 5) | When `true`, skill skips its Step 7 preview-and-write; returns the refined title and description as plain text for the agent to write. | + +## Working State + +The agent tracks a small set of named caches across phases. Treat these as concrete values; do not reconstruct the contract from prose at each phase boundary. 
+ +| Cache key | Set in | Read in | Type | Default if not yet set | +|-----------|--------|---------|------|------------------------| +| `work_item_payload` | Phase 0 step 2 | All phases that need work-item data | object | n/a (must be set before Phase 1) | +| `archetype` | Phase 0 step 4 | All phases that branch on archetype | enum (Bug / Incident / User Story / Feature / Task / Spike) | n/a | +| `severity_recommendation` | Phase 2.5 step 2 (Bug or Incident) | Phase 3 display, Phase 4a body, Phase 6 write | string (severity option name) or `null` | `null` | +| `scope_summary_draft` | Phase 2.5 step 2 (User Story / Feature / Task / Spike) | Phase 3 display, Phase 4b body | string or `null` | `null` | +| `comment_draft` | Phase 2.5 step 4 | Phase 3 display, Phase 4a/4b/4c body | string (markdown) | `null` | +| `follow_up_needed` | Phase 2.5 step 3; flipped at Phase 3 on tag decline | Phase 4a/4b/4c branch, Phase 6 skip rule, Phase 9 transition | boolean | `false` | +| `followup_target_descriptor` | Phase 2.5 step 3 | Phase 4c | string or `null` | `null` | +| `approved_post_comment` | Phase 3 main panel question 1 | Phase 4a/4b/4c entry guards | boolean | `false` | +| `approved_refine_description` | Phase 3 main panel question 2 | Phase 5 entry guard | boolean | `false` | +| `approved_followup_tag` | Phase 3 main panel question 4 (conditional, mutex with story-points) | Phase 3 post-panel downgrade rule | boolean | `false` | +| `comment_change_request` | "Other" channel of Phase 3 question 1 | Phase 3 revision loop | string or empty | empty | +| `refine_change_request` | "Other" channel of Phase 3 question 2 | Phase 5 invocation guidance | string or empty | empty | +| `assignment_outcome` | Phase 9 step 1 | Phase 10 summary placeholder | enum (`unassigned` / `kept assigned to you`) | `null` | +| `due_date_iso` | Phase 6 step 3 (Bug or Incident on the standard path) | Phase 10 summary placeholder | string (`YYYY-MM-DD`) or `null` | `null` | +| 
`escalation_target_descriptor` | Prerequisites step 4 (resolved once per session from `escalation.primary_contact.email`) | Phase 10 escalation routing | string or `null` | `null` | +| `escalation_fallback_descriptor` | Prerequisites step 4 (resolved once per session from `escalation.fallback_contact.email`) | Ad-hoc escalation only | string or `null` | `null` | +| `escalation_fired` | Phase 10 escalation routing | Phase 10 summary placeholder | boolean | `false` | +| `active_iteration_path` | Phase 6 step (resolved once per run from `work_list_team_iterations` when `iteration_path_strategy = "current"`) | Phase 6 sprint write, Phase 10 summary placeholder | string or `null` | `null` | +| `story_point_estimate` | Phase 3 main panel question 3 (conditional, mutex with tag-approval) | Phase 6 story-point write | number or `null` | `null` | +| `proposed_pr_links` | Phase 1 / Phase 2 (collected during investigation) | Phase 3 main panel display, Phase 7 link write | array of `{ url: string, title: string }` | `[]` | +| `approved_pr_links` | Phase 3 main panel (the "Other" channel of the PR-link confirmation when proposed_pr_links is non-empty) | Phase 7 link write | array of `{ url: string, title: string }` | `[]` | + +## Connections + +| System | MCP server | Used for | +|--------|-----------|----------| +| Azure DevOps Boards | The official Microsoft Azure DevOps MCP (`@azure-devops/mcp`) or compatible | Work-item fetch, edit, transition, comment, links, WIQL queries, wiki search | +| Microsoft Teams | A Teams MCP server (community-maintained; no canonical first-party choice yet) | Search messages, look up users, send the Phase 10 summary | +| Datadog | `datadog` MCP server | Log search for observability data | + +If a server is not installed or its API returns errors throughout this run, treat that integration as unavailable for this work item and proceed without it. Never mention an unavailable integration in any output. 
+ +## Severity Criteria + +**Applies to:** Bug, Incident. Skipped for User Story / Feature / Task / Spike (severity is not used; estimation and sprint placement live in Phase 6 in later releases). + +Use these dimensions to recommend a severity. The default scheme uses the built-in `Microsoft.VSTS.Common.Severity` field's enum: `1 - Critical`, `2 - High`, `3 - Medium`, `4 - Low`. (Some templates rename these; cache the actual option labels from `wit_get_work_item_type` during Prerequisites.) + +| Dimension | What to check | +|-----------|---------------| +| User impact | All users, a segment, or a single reporter? | +| Functional impact | Core flow blocked (login, payments, scheduling) or cosmetic? | +| Workaround | Exists? Obvious to users? | +| Data integrity | Could cause data loss, corruption, or incorrect records? | +| Compliance | Affects billing, eligibility, or regulatory requirements? | + +The severity recommendation drives two Phase 6 writes: the severity field itself and an SLA-based due date (`Microsoft.VSTS.Scheduling.DueDate`). The due date is computed as `System.CreatedDate + severity_scheme[recommendation].due_offset_days`. Any level whose `escalate_immediately` flag is `true` triggers Phase 10's escalation routing. + +## Do Not Rules + +- Never close or resolve a work item unilaterally. Recommend and ask for approval. +- Never remove or overwrite reporter-provided information. Only append. +- Never drop screenshots, videos, images, recordings, file attachments, or inline links from the original description. All original media must survive into the refined version. +- Never fabricate reproduction steps you haven't verified. +- Never modify `Microsoft.VSTS.Common.Priority` unless `severity_field` is configured to it (i.e., the project has no Severity field and you fell back to Priority). +- Never comment on a work item without showing the comment text to the user and getting approval first. 
+- Never emit raw markdown into `System.Description` or comment bodies. Both are HTML; the agent converts the markdown draft to safe HTML using the rules in `skills/azure-work-item-refiner/references/azure-html-formatting.md` before writing. +- Never tag the reporter for clarification until investigation is exhausted and a specific gap blocks meaningful triage. Reporter contact is a last resort. +- Never tag anyone other than the reporter or, when the reporter is deactivated, the EM-fallback candidate the user explicitly approves. Never tag directors, VPs, support leads, or random team members as a shortcut. +- Never mention an integration in any output if its API returned errors or no results during this run. + +## Reporter Follow-up Policy (Last Resort) + +Reporter contact is the last thing you do before giving up on a work item, not a shortcut to skip investigation. Exhaust Phase 1 (Teams when installed, AzDO, Wiki, code) first; for Bug or Incident archetypes also exhaust Phase 2 (Datadog). Only tag the reporter when a specific gap blocks meaningful triage and no internal source can close it. + +### When asking the reporter is warranted + +Pick one of these three scenarios. If none apply, do not ask. On non-bug archetypes, "fix verification" reframes as "still relevant?" (the work item may have been overtaken by other work). + +| Scenario | Trigger | What you're asking | +|----------|---------|--------------------| +| Missing data | A field needed for triage is absent and cannot be recovered from logs, Teams, or prior work items (e.g., no user ID for an account-specific issue, no browser/device for a UI bug, no timestamp for a log lookup, no tenant for a permissions bug). | The specific missing fact. | +| Clarification | Work item contains contradictions, ambiguous symptoms, or behavior doesn't match what you found in code/logs. | A targeted yes/no or this-or-that question. 
| +| Fix verification (Bug or Incident) | Evidence suggests the bug is already resolved (a related PR shipped after the work item was filed, no occurrences in logs in the last N days). | Whether the issue is still reproducible. | +| Relevance check (User Story / Feature / Task / Spike) | The work item appears overtaken by other work (no activity since filed, related work shipped, scope met by another work item). | Whether the work item is still on the team's roadmap. | + +### When NOT to ask + +- You have enough evidence to hand the work item to the owning team. +- The gap can be answered with more searching you haven't tried. +- The question is about internal system behavior (the reporter won't know). +- The work item was filed within the last 24-48 hours and investigation is in flight elsewhere. + +### Identifying who to tag + +The agent tags the reporter (`System.CreatedBy`) by default. When the reporter's account is deactivated or otherwise unreachable, attempt EM-fallback resolution before asking the user. AzDO does not expose an org-chart natively, so this is a best-effort ladder; each step is allowed to fail silently. + +1. **Detect deactivation.** Call `wit_query_by_wiql` with `SELECT [System.Id] FROM WorkItems WHERE [System.AssignedTo] = '' AND [System.ChangedDate] > @today - 90` and inspect whether the reporter has been assigned anything in the last 90 days. Zero recent assignments combined with the reporter's `descriptor` failing to resolve via the Teams MCP (when installed) is the heuristic for "deactivated." (AzDO does not return an explicit active/inactive flag on `System.CreatedBy` from `wit_get_work_item`.) + +2. **Try Teams profile lookup.** If a Teams MCP is installed and exposes a `teams_get_user_profile` (or equivalent) call that returns a `manager` field, call it with the reporter's email. If the response carries a `manager.email`, propose that person as the EM. + +3. 
**Try AzDO team lookup.** Call `core_list_project_teams` and walk teams that the reporter is a member of (via `core_list_team_members` per team). If a team has exactly one administrator distinct from the reporter, propose that administrator as the EM. (Team admin is not always the reporter's actual EM, so this is a fallback rather than a primary signal.) + +4. **Pause and ask.** If step 2 or step 3 produced a candidate, pause and ask the user: + > The reporter on `WI #{ID}` (`{reporter name}`) appears unreachable. Best guess at their EM: `{candidate name} <{candidate email}>` (from {Teams profile / AzDO team admin}). Tag them instead? + > + > Options: `Yes, tag {candidate}`, `No, ask me to enter someone different`, `Skip the follow-up entirely`. + + If neither step 2 nor step 3 produced a candidate: + > The reporter on `WI #{ID}` (`{reporter name}`) appears unreachable. I couldn't identify a fallback contact via Teams or team admin lookup. Who should I tag? Reply with a unique-name (UPN), email, or `skip` to drop the follow-up. + +5. **EM-tagged comment preamble.** When the user accepts a fallback candidate (Yes from step 4) or supplies someone other than the reporter, prepend one sentence to the question comment template: + + > The original reporter on this work item is unreachable. Tagging you as their EM (or alternate contact) to route this forward. + + Follow with the matching question comment template below. + +The EM-fallback is read-only: it queries the org for candidates and asks the user to confirm. The agent never tags an EM without explicit user approval. + +### Question comment templates + +Use the matching template. Keep each question specific. One tightly scoped question beats a list. Apply the writing rules at the bottom. + +**Missing data:** + +> @{Reporter display name} +> +> {one specific question, e.g., "What user email or ID was affected?" or "Which browser and version were you using when this happened?"} +> +> We need this to triage the work item. 
Reply here when you have it and we'll pick this back up. Transitioning to {waiting_reply state} in the meantime. + +**Clarification:** + +> @{Reporter display name} +> +> {specific clarifying question. Quote the part of the description that's ambiguous and offer a concrete this-or-that.} +> +> The work item points in two different directions and we want to chase the right one. Transitioning to {waiting_reply state}. + +**Fix verification (Bug or Incident):** + +> @{Reporter display name} +> +> This may already be resolved. {One-sentence evidence: e.g., "PR !1234 shipped on YYYY-MM-DD and touches the same flow" or "We're not seeing any occurrences in logs since YYYY-MM-DD."} +> +> Is the issue still happening for you? If not, we'll close this out. Transitioning to {waiting_reply state}. + +**Relevance check (User Story / Feature / Task / Spike):** + +> @{Reporter display name} +> +> This may have been overtaken by other work. {One-sentence evidence: e.g., "WI #5678 shipped on YYYY-MM-DD and covers the same scope" or "No activity here since YYYY-MM-DD; the area was reorganized."} +> +> Is this still on your team's roadmap? If not, we'll close it. Transitioning to {waiting_reply state}. + +Rules for all four templates: +- Lead with the request or the evidence. No opener phrases. +- Phase 2.5 runs the `prose-style` skill on the filled-in template before the Phase 3 preview. +- Never chain multiple questions. +- State explicitly that you're moving the work item to `waiting_reply`. +- Tag only the reporter. + +**Mention syntax in HTML.** AzDO renders user mentions as `@Display Name`. The exact HTML shape depends on the work-item host's mention parser; in practice, posting a plain-text `@Display Name` in the comment HTML produces a visible mention without a notification, and the full mention element produces both. 
The agent posts the full mention element when it has the unique-name; otherwise it posts plain `@Display Name` and notes in the Phase 10 summary that the mention may not have produced a notification. + +## Workflow + +For each work item the user pastes, execute these phases in order. The agent pauses at the following points and nowhere else. + +**Stops (halt the run until the user explicitly continues or overrides):** +- **Phase 0 skip-tag check:** when the work item carries a tag whose name starts with any prefix in `skip_tags` (case-insensitive), report the matched tag and halt. The agent does not assign, transition, or write anything until the user explicitly says "proceed anyway". +- **Phase 0 unmapped work-item type:** when the work-item type does not match any value in `work_item_type_map` (e.g., a custom work-item type the user's process template added), halt and ask the user which archetype to treat it as, or to skip the run. + +**Pauses (the agent is waiting on a user answer to continue):** + +1. **Phase 0 first-run config branch:** when no config file exists, ask the user to pick wizard / inline / defaults. +2. **Phase 2.5 deactivated-reporter branch:** when a follow-up is needed and the reporter appears unreachable, run the EM-fallback ladder; pause to ask the user to confirm a proposed candidate (or to enter a different person, or skip the follow-up). +3. **Phase 3 archetype-correction pre-gate:** when work-item type and content disagree, ask the user to confirm or correct the detected archetype. +4. **Phase 3 main panel:** the explicit confirmation gate (one `AskUserQuestion` with up to 3 questions side by side). +5. **Phase 3 revision loop exit (only after 3 revision rounds):** when the user keeps requesting changes via the "Other" channel after three rounds, ask Approve-as-is or Abort. +6. **Phase 5 optional second checkpoint:** only when the user explicitly opted in via the "Other" channel on Phase 3 question 2. 
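
The Phase 0 skip-tag stop listed above comes down to a case-insensitive prefix match over the semicolon-delimited `System.Tags` string. A minimal sketch of that check follows; `find_skip_tag` is a hypothetical helper name, and the tag values in the usage lines are invented for illustration.

```python
from typing import Iterable, Optional


def find_skip_tag(system_tags: Optional[str], skip_prefixes: Iterable[str]) -> Optional[str]:
    """Return the first tag that starts with a configured skip prefix.

    `system_tags` is the raw System.Tags value (semicolon-delimited).
    Returns None when no tag matches, which means triage may continue.
    """
    if not system_tags:
        return None
    # Normalize prefixes once; blank entries are ignored.
    prefixes = [p.strip().lower() for p in skip_prefixes if p and p.strip()]
    for raw in system_tags.split(";"):
        tag = raw.strip()
        if tag and any(tag.lower().startswith(p) for p in prefixes):
            # Return the tag as written so it can be quoted back to the user.
            return tag
    return None


if __name__ == "__main__":
    # Hypothetical tag values, for illustration only.
    print(find_skip_tag("triaged; No-Triage-Legal", ["no-triage"]))
    print(find_skip_tag("backend; triaged", ["no-triage"]))
```

With the default `skip_tags: []` the function always returns `None`, so the stop never fires until a team configures at least one prefix.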
+ +The workflow gates four phases on archetype: Phase 1 (skill choice: Bug/Incident → `azure-issue-investigator`; User Story/Feature/Task/Spike → `azure-requirements-investigator`), Phase 2 (Datadog runs for Bug/Incident; silently skipped on User Story/Feature/Task/Spike), Phase 4 (severity assessment for Bug/Incident vs scope summary for User Story/Feature/Task/Spike, with Phase 4c overriding both on the follow-up path), Phase 6 (severity + due-date write for Bug/Incident vs sprint placement + optional story-point estimate for User Story/Feature/Task/Spike). + +--- + +### Phase 0: Fetch, Detect Archetype, and Assign + +1. Extract the work-item ID from the pasted URL (e.g., `12345` from `https://dev.azure.com///_workitems/edit/12345`). If `organization_url` or `project` is null in config, infer them from the URL prefix. +2. Fetch the work item via `wit_get_work_item` with `expand: "all"` and the field set: + + ``` + System.Title, System.Description, System.State, System.Reason, + System.WorkItemType, Microsoft.VSTS.Common.Priority, + Microsoft.VSTS.Common.Severity, System.Tags, System.AreaPath, + System.IterationPath, System.AssignedTo, System.CreatedBy, + System.CreatedDate, System.ChangedDate, System.Parent + ``` + + The response includes the `relations` array (links) and the work-item revisions/comments either inline (with `expand: "all"`) or via a separate fetch depending on the MCP tool's shape; if comments are not in the expanded response, fetch them via `wit_get_work_item_comments` (or the MCP's equivalent) and merge into the cached payload. Cache as `work_item_payload`. + +3. **Skip-tag check.** Parse `System.Tags` (semicolon-delimited string). Scan for any tag whose name starts with any prefix in `skip_tags` (case-insensitive). If matched, stop. Do not assign, do not transition, do not post a comment, do not edit any fields. Report this exact form and wait: + + > `WI #{ID}` already carries a skip tag (`{matched-tag}`). Skipping triage. 
Let me know if you want to override and proceed anyway. + + Continue past this step only on explicit user override. + +4. **Detect archetype.** Map `System.WorkItemType` to one of `Bug`, `Incident`, `User Story`, `Feature`, `Task`, `Spike` using the inverse of `work_item_type_map`. The default Agile mapping resolves: `Bug` → Bug, `Issue` → Incident, `User Story` → User Story, `Feature` → Feature, `Task` → Task. Spike has no canonical work-item type in any built-in process template; treat a Task carrying a `spike` tag as a Spike, and treat a custom `Spike` work-item type the same way. If `System.WorkItemType` does not match any value in `work_item_type_map`, pause and ask the user: "This is a `{type}` work item. Which archetype should I treat it as: Bug, Incident, User Story, Feature, Task, Spike, or skip the run?" Use their answer (and remember it for the rest of the session, but don't write it to the config file). When the work-item type matches a mapped value but the description content disagrees (e.g., type `Bug` but content is acceptance criteria and a Figma link), trust the content and surface the conflict at the Phase 3 archetype-correction pre-gate. Cache the archetype string for downstream phase gating. + +5. Assign the work item to the running user via `wit_update_work_item` with the JSON Patch: + + ```json + [ + { "op": "add", "path": "/fields/System.AssignedTo", "value": "" } + ] + ``` + + Use the cached descriptor from Prerequisites; never paste a different triager's identity. + +6. Transition to the `investigating` state by writing `System.State` and `System.Reason` from `states.investigating` in the resolved config: + + ```json + [ + { "op": "add", "path": "/fields/System.State", "value": "" }, + { "op": "add", "path": "/fields/System.Reason", "value": "" } + ] + ``` + + Combine the assignment patch (step 5) and this transition into a single `wit_update_work_item` call when the MCP tool accepts a multi-op patch document. 
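
Steps 4 through 6 can be sketched in a few lines. This is an illustration under stated assumptions, not the agent's implementation: `detect_archetype` and `build_phase0_patch` are hypothetical helper names, the `spike` tag is matched case-insensitively (an assumption), and the sketch only builds the patch document; it does not call `wit_update_work_item`.

```python
from typing import Dict, List, Optional

# Default Agile mapping from the config section (archetype -> work-item type).
DEFAULT_TYPE_MAP: Dict[str, str] = {
    "Bug": "Bug",
    "Incident": "Issue",
    "User Story": "User Story",
    "Feature": "Feature",
    "Task": "Task",
    "Spike": "Task",
}


def detect_archetype(
    work_item_type: str,
    system_tags: Optional[str],
    type_map: Dict[str, str] = DEFAULT_TYPE_MAP,
) -> Optional[str]:
    """Invert type_map; None means the type is unmapped and the agent must ask."""
    if work_item_type == "Spike":
        # A custom Spike work-item type is treated as a Spike.
        return "Spike"
    tags = {t.strip().lower() for t in (system_tags or "").split(";") if t.strip()}
    candidates = [arch for arch, wit in type_map.items() if wit == work_item_type]
    if not candidates:
        return None
    if candidates == ["Spike"] or ("Spike" in candidates and "spike" in tags):
        return "Spike"
    # Spike shares its type with another archetype; without the tag, prefer the other.
    return [arch for arch in candidates if arch != "Spike"][0]


def build_phase0_patch(assigned_to: str, investigating: Dict[str, object]) -> List[dict]:
    """Build one JSON Patch document covering assignment and transition."""
    patch = [{"op": "add", "path": "/fields/System.AssignedTo", "value": assigned_to}]
    state, reason = investigating.get("state"), investigating.get("reason")
    # Per the validation rules, a missing state or reason skips the transition.
    if isinstance(state, str) and isinstance(reason, str):
        patch.append({"op": "add", "path": "/fields/System.State", "value": state})
        patch.append({"op": "add", "path": "/fields/System.Reason", "value": reason})
    return patch


if __name__ == "__main__":
    print(detect_archetype("Task", "backend; spike"))
    print(build_phase0_patch("dev@example.com", {"state": "Active", "reason": "Investigating"}))
```

Returning `None` for an unmapped type keeps the decision with the user, matching the pause described in step 4. The content-versus-type disagreement check is deliberately left out, since it depends on reading the description.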
+ +--- + +### Phase 1: Investigate + +Branch by the archetype detected in Phase 0: + +- **Bug or Incident:** Invoke the `azure-issue-investigator` skill via the `Skill` tool. The skill runs the Teams (when installed) → AzDO + Wiki → Datadog → code ladder with evidence tags. Pass the cached work-item payload so the skill does not refetch. +- **User Story, Feature, Task, or Spike:** Invoke the `azure-requirements-investigator` skill via the `Skill` tool. The skill runs a Teams → AzDO + Wiki → code ladder (no Datadog level by default) and writes a per-archetype report (User Story and Feature share the Feature template; Task uses the Task template; Spike uses the Spike template). Pass the cached payload and the archetype string. + +Both skills follow the same calling convention (non-interactive, evidence-tagged output, read-only). + +**Pull-request link collection.** After the investigator skill returns, the agent post-processes the report and the cached payload to find Azure Repos PR URLs (`https://dev.azure.com///_git//pullrequest/` and the shorter `//_git//pullrequest/` form). For each unique URL, call `repos_get_pull_request_by_id` to resolve the PR title, project GUID, and repo GUID; record `{ url, title, project_id, repo_id, pr_id }` into `proposed_pr_links`. Cap the array at 8 unique PRs (most recent first). The Phase 3 main panel surfaces up to 4 of these for user approval; surplus entries are listed in the Phase 10 summary as "skipped (panel cap)". When `pr_linking_enabled = false`, this collection step is skipped. + +**Fallback for `azure-issue-investigator` (Bug or Incident path, when the skill is not installed):** + +1. If a Teams MCP is installed, search Teams with 2-3 queries via `teams_search_messages`: the work-item ID (e.g., `12345` or `AB#12345`), the most distinctive symptom or error message, the customer/area name. For relevant hits, follow up with `teams_read_thread`. +2. 
Search the AzDO Wiki via `wiki_search` for the feature area, system name, runbooks, known-issues pages. Search related work items via `wit_query_by_wiql` for prior items in the same area. +3. Only if steps 1 and 2 turn up nothing useful, do a light code search: use `Bash` (e.g., `grep -r 'pattern' path/`) to find error strings or endpoint names; `Read` source files near the relevant code. + +**Fallback for `azure-requirements-investigator` (User Story / Feature / Task / Spike path, when the skill is not installed):** + +1. Re-read the work item carefully (description, comments, linked work items via `relations`). +2. If a Teams MCP is installed, search Teams with 2-3 queries via `teams_search_messages`: the work-item ID, the user-story / feature / task / spike name, the area or system name. Follow relevant threads with `teams_read_thread`. +3. Search the AzDO Wiki via `wiki_search` for product briefs, design docs, ADRs, RFCs, and prior decisions in the same area. +4. Summarize findings in plain prose using the matching template (Feature: Lead / Background / Requirements Found / Design Refs / Open Questions / Where To Look; Task: Lead / Why Now / Definition of Done Found / Risks / Where To Look; Spike: Lead / Question to Answer / What's Already Known / What's Unknown / Where To Look). + +**Common to both fallbacks:** Tag every finding with `[VERIFIED]`, `[OBSERVED]`, `[INFERRED]`, or `[UNKNOWN]`. Stop when you can hand the developer 2-3 concrete observations and a "Where To Look" list. + +Warn the user once at the start of this phase if you used a fallback. + +--- + +### Phase 2: Search Datadog + +**Applies to:** Bug, Incident. +**Skipped on:** User Story, Feature, Task, Spike (silently; non-bug work items rarely have runtime telemetry to query). 
+ +Using signals from Phase 1 (error messages, service names, entity IDs, status codes), build 1-3 targeted log queries via `search_datadog_logs`: + +- `query`: e.g., `service:my-service status:error @http.status_code:500 @user_id:abc123` +- `from`: 7 days before the work item's `System.CreatedDate`, or the timeframe mentioned in the work item +- `to`: work-item `System.CreatedDate` or now +- `limit`: 10-25 + +Build a Logs URL for the engineer: +`https://app.datadoghq.com/logs?query=&from_ts=&to_ts=` + +**Suppression rule.** If Datadog returns any error (auth, 403/404, timeout, rate limit, empty results, or any non-success), treat Datadog as unavailable for this work item. Do not mention Datadog anywhere in subsequent output. This rule overrides every later instruction that references Datadog. + +--- + +### Phase 2.5: Gap Analysis + +Decide whether a reporter follow-up is warranted before presenting findings. This is the only place the follow-up decision is made. Universal across archetypes. + +1. Apply the criteria in **Reporter Follow-up Policy** above. On non-bug archetypes, "fix verification" reframes as "still relevant?" (the work item may have been overtaken by other work). +2. **For Bug or Incident: form a severity recommendation** using the Severity Criteria table at the top of this file. Match the work item's evidence to the dimensions and pick the closest level from the cached severity options (default: `1 - Critical` / `2 - High` / `3 - Medium` / `4 - Low`). Cache the recommendation as `severity_recommendation`. **For User Story / Feature / Task / Spike: skip this severity step**; instead form a one-line scope summary that captures what the work item covers and what is unclear, ready for Phase 4b. Cache it as `scope_summary_draft`. +3. **Decide the follow-up path now, before drafting the comment.** + - If none of the four follow-up scenarios applies: set `follow_up_needed = false` and continue to step 4. 
+ - If one applies: set `follow_up_needed = true` and record the scenario (missing data, clarification, fix verification on Bug/Incident, or relevance check on User Story/Feature/Task/Spike). Identify the reporter from `System.CreatedBy`. If the reporter's account appears unreachable, run the EM-fallback ladder (per **Identifying who to tag** above) before continuing — propose a candidate or ask the user, depending on what the ladder finds. +4. **Draft only the Phase 4 comment that will actually be posted** (still in markdown shape, not yet HTML). The branch is set by `follow_up_needed`: + - `follow_up_needed = false`, Bug or Incident: draft the assessment body using the Phase 4a structure (Assessment, Severity Recommendation, Evidence, Criteria matched). Phase 4a will post this. + - `follow_up_needed = false`, User Story / Feature / Task / Spike: draft the scope summary body using the Phase 4b structure (Scope Summary, What's in scope, Evidence, Open questions). Phase 4b will post this. The "What's in scope" body adapts to archetype (Feature: requirements found and design refs; Task: definition of done and why-now; Spike: question to answer and what's already known). + - `follow_up_needed = true` (any archetype): draft the question comment using the matching template from **Question comment templates** above. Phase 4c will post this. + + Cache the resulting markdown draft as `comment_draft`. +5. **Run the `prose-style` skill on the drafted comment text from step 4.** Pass the markdown draft as input via the `Skill` tool with `name: "azure-issue-triage:prose-style"` (the namespaced form ensures the call resolves to this plugin's copy even when `jira-issue-triage` is also installed). Replace the cached draft with the returned cleaned version. 
+ - **Defensive fallback when `prose-style` does not load:** apply these rules inline to the draft: no em dashes, no spaced hyphens as separators, no LLM vocabulary (delve, leverage, robust, seamlessly, comprehensive, nuanced, elevate, foster, paradigm, ecosystem, holistic, innovative, synergy, empower, facilitate), lead with the answer, no opener phrases, no trailing summaries on short sections, prose over bullet lists. Warn the user once at the start of Phase 3. + +--- + +### Phase 3: Confirmation Gate + +Present findings to the user. Show: + +- The detected archetype (Bug / Incident / User Story / Feature / Task / Spike) and the rule that drove the detection. +- Investigation report summary (key findings, hypotheses, evidence tags). +- Datadog findings, only if Phase 2 ran AND returned usable data. +- **Bug or Incident, `follow_up_needed = false`:** Proposed severity recommendation and computed due date (`severity_recommendation` + `due_date_iso`). The prose-style-cleaned markdown draft of the assessment comment, shown inline as plain markdown. This is the proposed Phase 4a content. +- **User Story / Feature / Task / Spike, `follow_up_needed = false`:** The prose-style-cleaned markdown draft of the scope summary comment, shown inline as plain markdown. This is the proposed Phase 4b content. When `iteration_path_strategy` is set, also display the proposed iteration path (resolved via `work_list_team_iterations` for `"current"` or read verbatim for `"explicit:..."`). +- **PR-link proposals (any archetype, when `proposed_pr_links` is non-empty):** Display the up-to-4 PRs the agent will offer for linking, with title and URL. Surplus entries (when `proposed_pr_links.length > 4`) are also shown so the user can choose to link them manually after the run. +- If `follow_up_needed = true`: the follow-up plan as a distinct block (scenario; who will be tagged and why; the prose-style-cleaned markdown draft; the transition that will happen; what will still run vs. skipped). 
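The split between the PRs offered in the panel and the surplus entries, as described in the PR-link proposals bullet above, might look like the sketch below. It assumes `proposed_pr_links` is already ordered most recent first and that each entry carries a `url` key; the function name `split_pr_proposals` is hypothetical, and the caps of 8 and 4 come from the text.

```python
def split_pr_proposals(proposed_pr_links, collection_cap=8, panel_cap=4):
    """Deduplicate by URL, cap the collection, then split panel vs. surplus.

    Assumes the input is ordered most recent first, so slicing keeps
    the newest entries.
    """
    seen = set()
    unique = []
    for link in proposed_pr_links:
        if link["url"] not in seen:
            seen.add(link["url"])
            unique.append(link)

    capped = unique[:collection_cap]
    panel = capped[:panel_cap]      # offered for approval in Phase 3
    surplus = capped[panel_cap:]    # reported as "skipped (panel cap)"
    return panel, surplus
```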
+ +Ask the user via `AskUserQuestion`. The decisions are independent (each gates a different write), so put them in one panel as a multi-question call. + +**Pre-gate (separate call, only when applicable).** Run BEFORE the main panel: + +- **When the archetype detection is non-obvious** (work-item type and content disagree): ask **"Detected archetype is {X}; is that right?"** as a standalone `AskUserQuestion` call with the detected archetype and the next-most-likely alternative as options. If the user picks a different archetype, redo Phase 2.5 against the correction and re-enter Phase 3. Cap the correction loop at one round. + +**Main panel (one `AskUserQuestion` call with up to 4 questions in `questions[]`).** Always include questions 1 and 2; include question 3 (one of story-points or tag-approval, mutually exclusive) and question 4 (PR-links) only when their preconditions hold. The schema's 4-question cap is the runtime ceiling. + +1. **Post the proposed Phase 4 comment?** Options: `Yes, post it`, `No, skip the comment`. Cache as `approved_post_comment`; cache any free-text feedback as `comment_change_request`. +2. **Refine the title and description?** Options: `Yes, refine and write`, `No, leave as-is`. Cache as `approved_refine_description`; cache any free-text as `refine_change_request`. +3. **(Conditional, one of the following — they are mutually exclusive at runtime because story-points fires only on `follow_up_needed = false` and tag-approval fires only on `follow_up_needed = true`):** + - **Story-points estimate** when `story_points_field` is configured AND archetype is `User Story` or `Feature` AND `follow_up_needed = false`: **"Story-point estimate?"** Options (cap at 4 to fit `AskUserQuestion`'s per-question option limit): `1`, `3`, `5`, `Skip`. The "Other" channel accepts any other number (e.g., `2`, `8`, `13`, `21`). 
Cache as `story_point_estimate`: numeric value when the user picked or typed a number; `null` when the user picked `Skip` or returned an empty/non-numeric "Other" answer. **`null` semantics: "no estimate captured". Phase 6 silently skips the story-point write. It does not mean "estimated zero".** + - **Tag-approval** when `follow_up_needed = true`: **"Approve tagging {reporter name} with this question?"** Options: `Yes, tag {name}`, `No, switch to standard path`. Cache as `approved_followup_tag`. +4. **(Conditional)** When `proposed_pr_links.length > 0` AND `pr_linking_enabled = true`: **"Link these Azure Repos pull requests to the work item?"** A multi-select question (`multiSelect: true`) listing up to 4 proposed PRs by title with each PR URL in the description. Cache the user's selected subset as `approved_pr_links`. The "Other" channel lets the user paste an additional PR URL the agent didn't propose; parse it into the same `{ url, title }` shape and append. When more than 4 PRs were detected, the surplus is dropped from the panel (cap-imposed); they are listed in the Phase 10 summary as "skipped (panel cap)" so the user can link them manually if needed. + +**Revision loop (when the user's free-text "Other" channels request changes).** If `comment_change_request` is non-empty, re-draft the comment per Phase 2.5 step 4 with the user's free-text added as guidance, then re-run prose-style. If `refine_change_request` is non-empty, attach it to the Phase 5 invocation as guidance for the refiner. After each revision pass, re-present the main panel with the updated draft. Cap the loop at 3 revision rounds. After the third round, present a final two-option `AskUserQuestion`: `Approve as-is` or `Abort this triage run`. Abort skips Phases 4-9, leaves the work item assigned and in the `investigating` state, and ends with a Phase 10 summary noting the abort. 
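The caching rule for `story_point_estimate` in question 3 (a number when one was picked or typed, `null` otherwise) can be illustrated with a short sketch. The function name `parse_story_point_estimate` is hypothetical, and treating `NaN` or infinite input as non-numeric is an assumption the text does not spell out.

```python
import math
from typing import Optional, Union


def parse_story_point_estimate(answer: Optional[str]) -> Optional[Union[int, float]]:
    """Map the user's answer to a number, or None for "no estimate captured".

    None is returned for Skip, empty input, and non-numeric "Other" text.
    A typed 0 stays 0: None never means "estimated zero".
    """
    if answer is None:
        return None
    text = str(answer).strip()
    if not text or text.lower() == "skip":
        return None
    try:
        value = float(text)
    except ValueError:
        return None
    if not math.isfinite(value):
        # Assumption: "nan" / "inf" count as non-numeric answers.
        return None
    return int(value) if value.is_integer() else value
```

Keeping `None` distinct from `0` is what lets Phase 6 skip the write silently instead of overwriting an existing estimate.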
+ +**After the main panel returns:** + +- If the tag-approval question was answered No: drop the cached follow-up scenario, flip `follow_up_needed = false`, re-draft the standard-path comment, run `prose-style`, and re-enter Phase 3. +- Otherwise the gate is closed and the run continues. + +Phase 4 honors `approved_post_comment`: when `false`, Phases 4a, 4b, and 4c do not apply, so no triage comment is posted. + +Phase 5 honors `approved_refine_description`: when `false`, skip the `azure-work-item-refiner` invocation, the `prose-style` styling pass, the preview, and the `wit_update_work_item` write entirely. + +Phase 6 honors `story_point_estimate`: when null, skip the story-point write (the rest of Phase 6's writes, sprint placement on User Story / Feature / Task / Spike and severity + due date on Bug / Incident, still run). + +Phase 7 honors `approved_pr_links`: when empty, no PR-link relations are added (work-item-to-work-item links from investigation still run as before). + +The other phases (8, 9, 10) always run regardless of these flags; metadata writes and the final transition + Teams summary are not gated on the comment, description, story-point, or PR-link decisions. + +--- + +### Phase 4a: Severity Assessment Comment + +**Applies to:** Bug, Incident, with `approved_post_comment = true` and `follow_up_needed = false`. + +After the user approved the comment text at Phase 3, post the comment via `wit_add_work_item_comment`. Convert the prose-style-cleaned markdown draft to HTML using the rules in `skills/azure-work-item-refiner/references/azure-html-formatting.md`. 
Logical structure (rendered intent): + +> **Assessment:** +> +> {2-3 sentences summarizing what is broken, who is affected, how severe.} +> +> **Severity Recommendation:** {severity option name, e.g., `2 - High`} +> +> **Evidence from this work item:** +> +> - "{direct quote or paraphrase from the description, comments, or linked work items}" +> - "{another piece of evidence}" +> +> **Criteria matched:** +> +> - {which severity criteria from the table above this matches and why} + +HTML construction: each `**heading:**` line becomes `
heading:`. Each bullet becomes `• ...
`. Inline work-item references (e.g., `WI #1234`) become `WI #1234`. + +Rules: +- Ground every claim in evidence from the work item, comments, or linked work items. +- Lead with what is happening, not background. +- Severity Recommendation must match an existing option of `Microsoft.VSTS.Common.Severity`. +- Never recommend a `Priority` change unless `Priority` is the configured severity field. + +--- + +### Phase 4b: Scope Summary Comment + +**Applies to:** User Story, Feature, Task, Spike, with `approved_post_comment = true` and `follow_up_needed = false`. + +After the user approved the comment text at Phase 3, post via `wit_add_work_item_comment` (HTML body). Logical structure (rendered intent): + +> **Scope Summary:** +> +> {2-3 sentences naming what this work item covers, the affected area, and the most important framing.} +> +> **What's in scope:** +> +> - **For User Story / Feature:** Requirements found, design refs, the user need being met. +> - **For Task:** Definition of done, why-now (deadline, dependency, deprecation), risks. +> - **For Spike:** Question to answer, what's already known, the time-box if known. +> +> **Evidence from this work item:** +> +> - "{direct quote or paraphrase from the description, comments, or linked work items}" +> +> **Open questions:** +> +> - {one named open question with whom it's blocked on, if anyone} + +HTML construction follows the same node patterns as Phase 4a. + +Rules: +- Ground every claim in evidence. +- Lead with what is in scope. +- Keep "Open questions" to genuine unknowns. + +--- + +### Phase 4c: Post Follow-up Question (Alternative Path) + +**Applies to:** any archetype with `follow_up_needed = true` and `approved_post_comment = true`. + +1. Confirm you have the approved (and prose-style-cleaned) draft from Phase 3 and the target reporter (or user-supplied tag target). +2. Post the follow-up via `wit_add_work_item_comment`. 
The comment body is HTML; if the target's unique-name is known, lead with the mention element `@Display Name`. Otherwise lead with plain `@Display Name` and note the mention may not produce a notification. +3. **Reassign the work item to the tagged person right now**, in the same turn. Call `wit_update_work_item` with the patch: + + ```json + [ + { "op": "add", "path": "/fields/System.AssignedTo", "value": "" } + ] + ``` + + Phase 9 will not touch the assignee on the follow-up path. +4. Do not post an assessment or scope summary comment. The follow-up comment is the only triage comment on the work item for this round. +5. Remember the scenario for the Phase 10 summary. + +After this phase, continue to Phase 5. + +--- + +### Phase 5: Refine the Work Item + +**Skipped when `approved_refine_description = false` from the Phase 3 gate.** + +This phase runs two skills in sequence. First, invoke `azure-work-item-refiner` via the `Skill` tool to produce the refined title and description. Then invoke `prose-style` (namespaced as `azure-issue-triage:prose-style`), passing the refiner output, to clean writing-style anti-patterns. Only after both skills run does the user-facing preview appear. + +**Fallback (when `azure-work-item-refiner` is not installed):** + +1. Use the archetype detected in Phase 0. +2. Inventory all original information + investigation findings. Include Datadog data only if Phase 2 ran and returned usable results. +3. Restructure into archetype-appropriate sections: + - **Bug or Incident:** Summary, Impact, Affected Scope, Reproduction Steps / Expected / Actual, Investigation Notes, Working Hypotheses or Root Cause. + - **User Story / Feature:** Summary, Context and Background, Requirements and Acceptance Criteria, Open Blockers. + - **Task:** Summary, Context and Background, Requirements and Acceptance Criteria (as definition of done), Solutions, Open Blockers. + - **Spike:** Summary, Context and Background, Questions to Answer, Findings (if any). +4. 
Rewrite the title using `{Area}: {specific problem or goal}` for any archetype, or `P{n}: {Area} {short problem statement}` for incidents, or `Spike: {Area} {question to answer}` for spikes. + +**Fallback (when `prose-style` is not installed):** apply the inline rule list from Phase 2.5 step 5 to the refined output before previewing. + +Steps: + +1. Build the refined title and description (`azure-work-item-refiner` invocation, or the fallback). The agent communicates `skip_preview` via the leading-line convention. The exact prompt: + + ``` + Calling context: skip_preview=true. + + The orchestrator owns the user gate; do not run Step 7 preview or write via wit_update_work_item. + Return the refined title and description as your final output. + + + ``` + + The skill returns the refined title + description as plain text for the agent to consume. +2. Invoke the `prose-style` skill (namespaced) with the refined title and description from step 1. Replace the title and description with the cleaned versions. +3. Convert the cleaned markdown description to HTML using `skills/azure-work-item-refiner/references/azure-html-formatting.md`. Render the cleaned title + description (markdown form) to the user as inline preview. Frame the output with one line above: + + ``` + Writing the following to WI #{ID} in {N} seconds (interrupt to abort): + ``` + + The pause length `{N}` reads from `description_preview_pause_seconds`. When the user explicitly opted in to a second checkpoint via the "Other" channel on Phase 3 question 2, call `AskUserQuestion` with options `Approve and write`, `Request changes` instead of pausing. + +4. Update via a single `wit_update_work_item` call with the JSON Patch: + + ```json + [ + { "op": "add", "path": "/fields/System.Title", "value": "" }, + { "op": "add", "path": "/fields/System.Description", "value": "" } + ] + ``` + +**Preserve all original media, attachments, and links.** Reproduce them with the same HTML markup. 
Never drop attachments, embedded images, inline links, or referenced files. + +Warn the user once at the start of this phase if either fallback was used. + +--- + +### Phase 6: Severity, Due Date, Sprint Placement, Story Points + +**Skipped entirely on:** `follow_up_needed = true` (everything Phase 6 writes waits until the reporter's reply comes in and the work item is re-triaged). + +The phase splits by archetype. Bug and Incident go through severity + due-date. User Story, Feature, Task, Spike go through sprint placement + story points (when configured). All writes assemble into one `wit_update_work_item` JSON Patch call when possible. + +**Bug or Incident path:** + +Read the current severity from the configured `severity_field` (default `Microsoft.VSTS.Common.Severity`) on the work-item payload. Compare against `severity_recommendation` cached in Phase 2.5. + +1. **If recommendation matches current severity:** the severity field is already correct; do not write it. Continue to step 3. +2. **If recommendation differs (or the field is empty):** the severity field needs an update; include it in the patch built in step 4. +3. **Compute the due date.** Read `severity_scheme[severity_recommendation].due_offset_days`. If the key is missing from `severity_scheme`, log a deferred warning (Phase 10) and skip the due-date write. Otherwise compute the date as `System.CreatedDate + due_offset_days` (date-only, no time component) and format as `YYYY-MM-DD`. Cache as `due_date_iso`. +4. 
**Write the patch.** Build a single `wit_update_work_item` JSON Patch document combining the severity change (when needed) and the due-date write: + + ```json + [ + { "op": "add", "path": "/fields/Microsoft.VSTS.Common.Severity", "value": "" }, + { "op": "add", "path": "/fields/Microsoft.VSTS.Scheduling.DueDate", "value": "T00:00:00Z" } + ] + ``` + + The DueDate field type is `DateTime` and AzDO accepts an ISO-8601 timestamp; using midnight UTC keeps the rendered date stable across viewers' time zones. Drop the severity op when step 1 already determined the field is current; drop the due-date op when step 3 logged a missing-key warning. + +5. **Do not write `Priority`** unless `severity_field` is configured to it (i.e., the project has no Severity field and you fell back to Priority). + +If the work item already has a due date set by the reporter, the Phase 6 write overwrites it. Triage owns the SLA-aligned date; reporter-supplied dates do not survive triage. The pre-triage value is preserved in the work item's revision history; nothing is destroyed. + +**User Story / Feature / Task / Spike path:** + +Severity and due date do not apply. Run the optional sprint and estimation writes: + +1. **Sprint placement.** Read `iteration_path_strategy` from the resolved config: + - **`null`:** skip sprint placement entirely. Continue to step 2. + - **`"current"`:** call `work_list_team_iterations` with `team: ` and `timeframe: "current"`. The response returns the active iteration object containing `path` (e.g., `MyProject\\Sprint 42`). Cache as `active_iteration_path`. If the call returns no current iteration (the team has no active sprint window), warn once and skip the iteration write. If `default_team` is null and the project has multiple teams, warn and skip — the agent cannot guess which team's sprint to use. + - **`"explicit:"`:** parse the path after the colon and use it as `active_iteration_path` directly. No API call needed. +2. 
**Story-point write.** When `story_points_field` is configured AND `story_point_estimate` (cached at Phase 3) is a non-null numeric value, include the estimate in the patch. When the field is configured but the estimate is null (the user picked Skip or returned a non-numeric "Other"), do nothing for this step — the field stays at its existing value. When `story_points_field` is configured but the field reference doesn't exist on the work-item type, warn once and skip. +3. **Write the patch.** Build a single `wit_update_work_item` JSON Patch document combining whatever ops applied: + + ```json + [ + { "op": "add", "path": "/fields/System.IterationPath", "value": "" }, + { "op": "add", "path": "/fields/Microsoft.VSTS.Scheduling.StoryPoints", "value": } + ] + ``` + + Drop either op when its precondition didn't fire. The story-point value is a JSON number, not a string. The iteration-path value is a backslash-separated string (e.g., `MyProject\\Backend\\Sprint 42`); JSON encodes the backslashes as `\\` on the wire but they decode back to single backslashes before the API write. + +If neither sprint placement nor story-point estimation applies, Phase 6 silently no-ops on this path. + +--- + +### Phase 7: Link Related Work Items and Pull Requests + +Two link types land here: work-item-to-work-item links (always) and work-item-to-pull-request links (when `pr_linking_enabled = true` and the user approved any from the Phase 3 panel). + +**Work-item links.** During investigation (Phases 1-2), collect every related work-item ID found in Teams threads, WIQL searches, the `relations` array, and comments. 
After the work item is refined, add links via `wit_update_work_item` with a JSON Patch operation that targets `/relations/-`: + +```json +[ + { + "op": "add", + "path": "/relations/-", + "value": { + "rel": "System.LinkTypes.Duplicate-Forward", + "url": "https://dev.azure.com//_apis/wit/workItems/", + "attributes": { "comment": "Linked during triage" } + } + } +] +``` + +| Strategy | `rel` value | +|----------|-------------| +| Duplicate (current is a duplicate of an existing canonical item) | `System.LinkTypes.Duplicate-Forward` | +| Related | `System.LinkTypes.Related` | +| Parent (current is child of an epic or feature) | `System.LinkTypes.Hierarchy-Reverse` | + +Skip any links that already exist (check the `relations` array from the Phase 0 fetch). + +**Pull-request links.** When `pr_linking_enabled = true` and `approved_pr_links` (cached at Phase 3) is non-empty, add an `ArtifactLink` relation per approved PR. Azure Repos PRs are referenced by a `vstfs://` artifact URL, not a regular HTTPS URL. Convert each approved entry's `url` field (which the agent collected as the human-facing PR URL during investigation) to the artifact form before writing. + +A PR URL like `https://dev.azure.com///_git//pullrequest/12345` maps to the artifact URL `vstfs:///Git/PullRequestId/%2F%2F12345` where `` and `` are the project and repository GUIDs (URL-encoded as `%2F` between segments). When the GUIDs are unknown (the agent has only the human URL), use the `repos_get_pull_request_by_id` tool (or equivalent) on the discovered PR ID to resolve the project and repo IDs, then build the artifact URL. 
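The conversion from a human-facing PR URL to the artifact form can be sketched as follows. This is an illustration only: the helper names `extract_pr_id` and `build_pr_artifact_url` are hypothetical, and the project and repository GUIDs are assumed to have been resolved already (for example via `repos_get_pull_request_by_id`).

```python
import re
from typing import Optional

# Matches the numeric PR ID in ".../_git/<repo>/pullrequest/<id>" URLs.
_PR_URL_RE = re.compile(r"/_git/[^/]+/pullrequest/(\d+)")


def extract_pr_id(pr_url: str) -> Optional[int]:
    """Pull the PR ID out of an Azure Repos PR URL, or None if malformed."""
    match = _PR_URL_RE.search(pr_url)
    return int(match.group(1)) if match else None


def build_pr_artifact_url(project_id: str, repo_id: str, pr_id: int) -> str:
    """Join project GUID, repo GUID, and PR ID with URL-encoded slashes."""
    return f"vstfs:///Git/PullRequestId/{project_id}%2F{repo_id}%2F{pr_id}"
```

A `None` from `extract_pr_id` corresponds to the malformed-URL case, where the PR is skipped with a deferred warning rather than written.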
+ +Add via `wit_update_work_item`: + +```json +[ + { + "op": "add", + "path": "/relations/-", + "value": { + "rel": "ArtifactLink", + "url": "vstfs:///Git/PullRequestId/%2F%2F", + "attributes": { + "name": "Pull Request", + "comment": "Linked during triage" + } + } + } +] +``` + +When the project or repo ID can't be resolved (the lookup tool errors or the URL is malformed), skip that PR with a deferred warning ("Could not resolve PR ``; skipped link write"). The work-item-link writes still happen. + +The collection step lives in Phase 1 / Phase 2: while reading description, comments, Teams threads, and investigator output, the agent regex-matches Azure Repos PR URLs (`/_git//pullrequest/` or `//_git//pullrequest/` shapes) and records each unique match into `proposed_pr_links` with the title from `repos_get_pull_request_by_id`. Cap the collection at 8 (a reasonable upper bound for one work item); the Phase 3 panel sees the top 4 by recency, the rest go to "skipped (panel cap)" in the Phase 10 summary. + +--- + +### Phase 8: Tags + +Append the configured `triaged_tag` (default `triaged`) to existing tags. Read `System.Tags` (semicolon-delimited string), append `; ` if the tag is not already present, and write back via `wit_update_work_item`: + +```json +[ { "op": "add", "path": "/fields/System.Tags", "value": "existing-tag-1; existing-tag-2; triaged" } ] +``` + +Preserve existing tags exactly. Never overwrite or reorder. + +--- + +### Phase 9: Final Update + +Apply the remaining field updates and the final state. The agent does not change the work-item state in Phase 9 beyond what Phase 0 set (except on the follow-up path, which moves to `waiting_reply` here); the work item stays in `states.investigating` until the next workflow step. Phase 9's only standard-path write is the assignee per `archetype_assignment_after_triage`. + +1. **Assignee:** read the rule from `archetype_assignment_after_triage[]`. 
Defaults: `Bug = "unassign"`; `Incident, User Story, Feature, Task, Spike = "self"`. Apply the rule: + - **Standard path, rule = `"unassign"`:** set `System.AssignedTo` to an empty string (the AzDO equivalent of "unassigned") via `wit_update_work_item`. Cache `assignment_outcome = unassigned`. + - **Standard path, rule = `"self"`:** do not touch the assignee. Cache `assignment_outcome = kept assigned to you`. + - **Follow-up path:** Phase 4c already reassigned to the tagged person; do not touch the assignee here. + + Common overrides: + - **On-call team for incidents:** `"Incident": "unassign"`. Sev-1 incidents auto-route back to the team pool so on-call picks them up. + - **Triager owns bug fixes:** `"Bug": "self"`. Use this when bug triage and bug fixing are the same person. + +2. **State (follow-up path only):** when `follow_up_needed = true`, transition to `states.waiting_reply` (default: `state="Active", reason="Awaiting Customer"`): + + ```json + [ + { "op": "add", "path": "/fields/System.State", "value": "Active" }, + { "op": "add", "path": "/fields/System.Reason", "value": "Awaiting Customer" } + ] + ``` + + On the standard path, the work item stays in `states.investigating` from Phase 0. + +Confirm to the user what was updated. + +--- + +### Phase 10: Notification + Summary + +If a Teams MCP is installed AND `teams_channel` is configured, send the summary via `teams_send_message` to the configured channel, mentioning the running user. Format: + +> [`WI #{ID}`]({work-item URL}): {outcome} + +If Teams is not available (MCP not installed, channel not configured, or call returns an error), print the same summary inline as agent output and append: "Teams DM unavailable; install a Teams MCP server and set `teams_channel` in your config to enable." 
+ +Pick the outcome that matches what you did: + +| Situation | Message | +|-----------|---------| +| Bug or Incident triaged, comment posted | `Triaged {Bug or Incident}, set severity {SevX}, due {due date}, posted assessment comment, {assignment outcome}` | +| Bug or Incident triaged, comment skipped at Phase 3 | `Triaged {Bug or Incident}, set severity {SevX}, due {due date}, no comment posted (skipped at confirmation gate), {assignment outcome}` | +| User Story / Feature / Task / Spike triaged, comment posted | `Triaged {archetype}, posted scope summary, {assignment outcome}` | +| User Story / Feature / Task / Spike triaged, comment skipped at Phase 3 | `Triaged {archetype}, no comment posted (skipped at confirmation gate), {assignment outcome}` | +| Sprint placement applied (User Story / Feature / Task / Spike) | (append) `Placed in sprint {active_iteration_path}` | +| Sprint placement skipped (no current iteration found, or default_team unset with multiple project teams) | (append) `Could not place in current sprint: {reason}.` | +| Story-point estimate written | (append) `Estimated {story_point_estimate} points` | +| PR links added | (append) `Linked {N} pull request{s}: {pr URLs joined by comma}` | +| PR links proposed but skipped at Phase 3 | (append) `Did not link {N} PR{s} (declined at confirmation gate)` | +| PR links surplus (more than 4 candidates) | (append) `Skipped {N} additional PR link{s} ({pr URLs}); link manually if needed.` | +| PR link write failed (couldn't resolve project or repo ID) | (append) `Could not resolve PR `{url}`; skipped link write.` | +| Asked reporter for missing data | `Asked reporter for missing info, moved to {waiting_reply state+reason}` | +| Asked reporter for clarification | `Asked reporter to clarify, moved to {waiting_reply state+reason}` | +| Asked reporter to verify fix (Bug or Incident) | `Asked reporter to confirm if still reproducing, moved to {waiting_reply state+reason}` | +| Asked reporter for relevance check 
(User Story / Feature / Task / Spike) | `Asked reporter if still relevant, moved to {waiting_reply state+reason}` | +| EM tagged because reporter is deactivated | (append) `Reporter is deactivated; tagged EM {name} instead.` | +| Description skipped at Phase 3 | (append) `Title and description left as-is (skipped at confirmation gate)` | +| Aborted at Phase 3 (3-revision cap reached) | `Aborted triage at confirmation gate after 3 revision rounds. Last user comment: "{quoted comment}". Work item stays assigned to you in {investigating state}.` | +| Severity changed | `Changed severity from {SevX} to {SevY}` | +| Escalation fired (Sev-1 with `escalate_immediately: true`) | (append) `Escalated in {escalation.teams_channel}, mentioned {primary contact name}.` | +| Escalation contact unresolvable | (append) `Could not resolve escalation contact {name} <{email}>; channel post mentioned them by name only.` | +| Due-date scheme miss | (append) `No due date set: severity {SevX} not in severity_scheme.` | +| Default-config first run | (append) `Triaged with default config; run /azure-issue-triage:setup any time to customize.` | +| Config validation warning (deferred from Phase 0) | (append) `Ignored invalid config: {field} = {value}. Used default.` | + +Combine multiple outcomes on one line when they apply (e.g., `Changed severity from 3 - Medium to 2 - High. Triaged Bug, set severity 2 - High, due 2026-05-20, posted assessment comment, kept assigned to you`). + +`{assignment outcome}` resolves to the Phase 9 cache (`unassigned` or `kept assigned to you`). On the follow-up path the assignment outcome is implicit in the "Asked reporter" rows. `{due date}` resolves to `due_date_iso` (omit the clause entirely when null, e.g., on User Story / Feature / Task / Spike runs). + +### Escalation routing + +After the per-run summary lands (or as a separate channel post when no `teams_channel` is configured), apply escalation routing if all three conditions hold: + +1. The archetype is Bug or Incident. 
+2. `severity_scheme[severity_recommendation].escalate_immediately` is `true`. +3. `follow_up_needed` is `false` (escalation only fires on the standard path; follow-up runs wait for the reporter's reply before any escalation decision). + +When all three hold: + +- **If `escalation.teams_channel` is set:** post a separate Teams message to that channel with the format below, leading with the resolved `escalation_target_descriptor` (when non-null) so the contact gets a notification. When the descriptor is null, post the channel message with the contact's name and email in plain text and append a deferred warning ("Could not resolve escalation contact `{name} <{email}>`"). +- **If `escalation.teams_channel` is null but `escalation.primary_contact` is set:** DM the primary contact directly via the resolved `escalation_target_descriptor`. When the descriptor is null, skip the DM and append the unresolvable-contact warning. +- **If both are null:** the running-user summary is the only escalation and the operator decides what to do next. + +Channel-post format: + +> `[WI #{ID}]({work-item URL})` triaged `{SevX}`. Owner: `{primary contact name}`. Due `{due_date_iso}`. Assignment: `{assignment outcome}`. + +Cache `escalation_fired = true` so the per-run summary appends the "Escalated in ..." line. Fallback contact is not auto-paged on a timer; the operator can ask the agent to ping `escalation.fallback_contact` ad hoc later. 
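The three-condition check and the channel / DM / summary-only branching above can be read as a small decision function. The sketch below is illustrative only: the function name, parameter names, and the returned labels are assumptions made for this example, and it leaves out descriptor resolution and the actual Teams calls.

```python
from typing import Optional


def escalation_route(
    archetype: str,
    severity: str,
    severity_scheme: dict,
    follow_up_needed: bool,
    teams_channel: Optional[str],
    primary_contact: Optional[dict],
) -> str:
    """Pick the escalation action for one triage run.

    Returns one of the hypothetical labels "none", "channel_post",
    "direct_message", or "summary_only".
    """
    level = severity_scheme.get(severity, {})
    should_escalate = (
        archetype in ("Bug", "Incident")                   # condition 1
        and level.get("escalate_immediately") is True      # condition 2
        and not follow_up_needed                           # condition 3
    )
    if not should_escalate:
        return "none"
    if teams_channel:
        return "channel_post"      # escalation.teams_channel is set
    if primary_contact:
        return "direct_message"    # channel is null, contact is set
    return "summary_only"          # both null: operator decides next step
```

A severity missing from `severity_scheme` falls through to `"none"`, which keeps a scheme miss from triggering an escalation by accident.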
+ +--- + +## Duplicate Detection (Phase 1 helper) + +Before completing investigation, search for potential duplicates with WIQL: + +| Strategy | WIQL pattern | +|----------|--------------| +| Keywords | `SELECT [System.Id], [System.Title] FROM WorkItems WHERE [System.TeamProject] = '' AND [System.Title] CONTAINS 'keyword1' AND [System.Title] CONTAINS 'keyword2' ORDER BY [System.CreatedDate] DESC` | +| Area path | `SELECT [System.Id], [System.Title] FROM WorkItems WHERE [System.AreaPath] UNDER 'MyProject\\Backend' AND [System.State] <> 'Closed' ORDER BY [System.CreatedDate] DESC` | +| Error string | `SELECT [System.Id], [System.Title] FROM WorkItems WHERE [System.TeamProject] = '' AND ([System.Title] CONTAINS 'TypeError' OR [System.Description] CONTAINS 'Cannot read properties') ORDER BY [System.CreatedDate] DESC` | + +Link confirmed duplicates via Phase 7 with `rel: "System.LinkTypes.Duplicate-Forward"` (current is the duplicate; canonical is the link target). Use `System.LinkTypes.Related` for uncertain matches. + +--- + +## Writing Rules (always active) + +These apply to all text written to the work item, all Teams messages, and all comments. + +- Never use em dashes or spaced hyphens as separators. Restructure. +- No LLM vocabulary: delve, leverage, robust, seamlessly, comprehensive, nuanced, elevate, foster, paradigm, ecosystem, holistic, innovative, synergy, empower, facilitate. +- Lead with the answer. No opener phrases. +- No trailing summaries on short sections. +- Prose over bullet lists when the content flows naturally as sentences. +- Never restate AzDO-native metadata (state, priority, work-item type, assignee, area path, iteration path, tags) in the description body. diff --git a/azure-issue-triage/commands/setup.md b/azure-issue-triage/commands/setup.md new file mode 100644 index 0000000..8eb0863 --- /dev/null +++ b/azure-issue-triage/commands/setup.md @@ -0,0 +1,266 @@ +--- +description: First-time setup wizard for azure-issue-triage. 
Walks through configuration questions and writes .claude/azure-issue-triage.config.json. +argument-hint: (no args) +allowed-tools: Read, Write, AskUserQuestion, core_list_projects, core_list_project_teams, wit_my_work_items, work_list_team_iterations +--- + +# azure-issue-triage Setup Wizard + +Walk the user through ten configuration questions and write the result to `.claude/azure-issue-triage.config.json`. Re-runnable: pointing the wizard at an existing config offers to overwrite or keep current. + +## Steps + +### 1. Check for existing config + +Read `.claude/azure-issue-triage.config.json` if it exists. + +- **Config exists:** Read it, show the current contents to the user as pretty-printed JSON, then ask via `AskUserQuestion`: "Overwrite the existing config?" Options: `Yes, walk through the wizard again`, `No, keep current and exit`. On "No", exit cleanly. +- **No config exists:** Continue to step 2. + +### 2. Auto-discover defaults + +Best-effort auto-discovery to suggest defaults. Failures are non-fatal; fall back to the static defaults listed in each question and tell the user the auto-discovery failed. + +1. Call `core_list_projects` to list the AzDO projects accessible to the running user. If the call returns one project, suggest it as the default `project` and `organization_url`. If multiple, list them and let the user pick at Q2. +2. Call `wit_my_work_items` with `top: 1` to confirm work-item access. The response includes the running user's display name and unique-name, useful for assignment writes; not surfaced in the wizard, but the agent uses these at runtime. +3. Pre-populate the severity field default as `Microsoft.VSTS.Common.Severity` (the Agile process template's built-in field). + +### 3. Walk through the ten wizard questions + +Ask one at a time. Use `AskUserQuestion` for multiple-choice answers. Use plain free-text prompts for URLs, project names, and area paths. Confirm each answer before moving to the next question. 
+ +#### Q1: Organization URL + +Free-text prompt: + +> Azure DevOps organization URL? Format: `https://dev.azure.com/` (no trailing slash). + +Default: pre-fill from auto-discovery if a single org was found. Validate the answer matches the URL pattern. + +**Serialization rule.** Save as a JSON string. Example: `"organization_url": "https://dev.azure.com/contoso"`. + +#### Q2: Project name + +Use `AskUserQuestion` if step 2's auto-discovery found multiple projects: + +> Which Azure DevOps project? + +Options: each accessible project name from auto-discovery, plus `Custom (type the name)`. + +If only one project was found, use it as the default and confirm via free-text prompt: + +> Default project: `{project-name}`. Press Enter to accept, or type a different project name. + +**Serialization rule.** Save as a JSON string. Example: `"project": "Contoso.Platform"`. + +#### Q3: Area path prefix (optional) + +Free-text prompt: + +> Default area path prefix for triaged work items? (Optional; press Enter to skip.) Format: `MyProject\\Backend` (use double backslashes inside JSON). + +Default: empty (the agent leaves area path untouched at triage time). + +**Serialization rule.** Empty answer maps to JSON `null`. A typed value writes the string verbatim (e.g., `"area_path_prefix": "Contoso.Platform\\\\Backend"`). + +#### Q4: Severity field + +Use `AskUserQuestion`: + +> Which field holds the severity for Bug and Incident work items? + +Options: +- `Microsoft.VSTS.Common.Severity` (recommended for Agile and CMMI process templates) +- `Microsoft.VSTS.Common.Priority` (use if your project disabled Severity, common on Scrum) +- `Custom (type the field reference name)` + +On "Custom", ask for the field reference name (e.g., `Custom.MySeverityField`) as free text. Validate the answer is non-empty. + +#### Q5: State + Reason mapping + +Two free-text prompts: + +1. > State name for "investigating"? Default: `Active`. +2. > Reason for "investigating"? Default: `Investigating`. 
+ +Then two more: + +3. > State name for "waiting reply"? Default: `Active`. +4. > Reason for "waiting reply"? Default: `Awaiting Customer`. + +**Process-template note.** The defaults match Agile. For CMMI use `Proposed -> Active` for investigating; for Scrum `Approved -> Committed`. + +**Serialization rule.** Save under `states`: +```json +"states": { + "investigating": { "state": "Active", "reason": "Investigating" }, + "waiting_reply": { "state": "Active", "reason": "Awaiting Customer" } +} +``` + +#### Q6: Teams channel (optional) + +Free-text prompt: + +> Microsoft Teams channel for the Phase 10 per-run summary? Format: team name + channel name (e.g., `Engineering > Triage`). Press Enter to skip; the agent will print the summary inline instead of sending a Teams message. + +Default: empty. + +**Serialization rule.** Empty answer maps to JSON `null`. A typed value writes the string verbatim. The agent does not validate the channel against the live Teams instance; the actual channel ID resolution happens at runtime. + +#### Q7: Severity scheme + +Use `AskUserQuestion`: + +> Which severity scheme do you want to use? + +Options: +- `4-tier (1 - Critical, 2 - High, 3 - Medium, 4 - Low) with 7/14/30/90 day SLAs` (recommended default; matches the built-in `Microsoft.VSTS.Common.Severity` enum) +- `Custom (specify each level)` + +On "Custom", walk through each level: ask for the level name (string, must match an option of the configured `severity_field`), the `due_offset_days` integer, and via `AskUserQuestion` whether `escalate_immediately` is `Yes` or `No`. Loop until the user types `done` for the level name. The agent uses the keys you define at runtime, so make sure they exactly match the option names in your Severity custom field. 
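To show why the level names must match exactly, here is a rough sketch of how a scheme lookup could turn a severity into a due date. The function name `due_date_iso` borrows the placeholder used in the Phase 10 summary; the `start` parameter is an assumption, since this document does not say which date the offset counts from.

```python
from datetime import date, timedelta
from typing import Optional


def due_date_iso(severity: str, severity_scheme: dict, start: date) -> Optional[str]:
    """Return an ISO due date for the severity, or None on a scheme miss.

    `start` is the date the offset is counted from (assumed to be supplied
    by the caller).
    """
    level = severity_scheme.get(severity)
    if level is None:
        # A key that does not match any configured level yields no due date.
        return None
    return (start + timedelta(days=level["due_offset_days"])).isoformat()
```

A key that differs only by spacing or case (for example `2-High` instead of `2 - High`) returns `None` here, which corresponds to the "No due date set" line in the per-run summary.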
+ +**Serialization rule.** Save under `severity_scheme`: + +```json +"severity_scheme": { + "1 - Critical": { "due_offset_days": 7, "escalate_immediately": true }, + "2 - High": { "due_offset_days": 14, "escalate_immediately": false }, + "3 - Medium": { "due_offset_days": 30, "escalate_immediately": false }, + "4 - Low": { "due_offset_days": 90, "escalate_immediately": false } +} +``` + +#### Q8: Escalation contacts (optional) + +Three free-text sub-prompts. Each accepts an empty answer (Enter for none). + +1. > Microsoft Teams channel for high-severity escalation pings? (e.g., `Incident Response > Escalations`) Press Enter for none. +2. > Primary escalation contact? Format: `Alice Kumar `. Press Enter for none. +3. > Fallback escalation contact? Same format. Press Enter for none. + +Parse the contact strings into `{ "name": "Alice Kumar", "email": "alice@example.com" }`. If the format does not match, warn and re-prompt. + +**Serialization rule.** Empty answers map to JSON `null`, not empty strings or empty objects. Save under `escalation`: + +```json +"escalation": { + "teams_channel": null, + "primary_contact": null, + "fallback_contact": null +} +``` + +The agent's Prerequisites step 4 resolves the contacts to Teams user descriptors at session start; if a contact's email cannot be resolved, the channel post (when configured) mentions them by name only and a deferred warning surfaces in the Phase 10 summary. + +#### Q9: Sprint placement (optional) + +Use `AskUserQuestion`: + +> Should the agent place User Story / Feature / Task / Spike work items into a sprint at triage time? + +Options: +- `No, skip sprint placement` (recommended default) +- `Yes, current sprint of the default team` — uses `work_list_team_iterations` with `timeframe: "current"` at runtime to find the active iteration. Requires `default_team` to be set; the wizard prompts for it as a follow-up if it's still null. 
+- `Yes, a fixed iteration path I'll specify` — prompts for an explicit path like `MyProject\\Backend\\Sprint 42`. + +**Serialization rule.** "No" writes `"iteration_path_strategy": null`. "Current" writes `"iteration_path_strategy": "current"`. "Fixed" writes `"iteration_path_strategy": "explicit:"`. When "Current" is chosen and `default_team` is null, ask: + +> What's your team name in Azure DevOps? (e.g., `Platform Engineering`). The agent uses this to find the active iteration. + +Save the answer to `default_team`. + +#### Q10: Story-point estimation (optional) + +Use `AskUserQuestion`: + +> Should the agent prompt for a story-point estimate at the Phase 3 confirmation gate (User Story / Feature work items)? + +Options: +- `No, skip estimation` (recommended default) +- `Yes, write to Microsoft.VSTS.Scheduling.StoryPoints` — the built-in field on Agile and Scrum process templates. +- `Custom field reference name (type the name)` — for projects using a custom story-point field. + +**Serialization rule.** "No" writes `"story_points_field": null`. "Yes" writes `"story_points_field": "Microsoft.VSTS.Scheduling.StoryPoints"`. "Custom" writes the typed reference name verbatim. Validate non-empty. + +#### Q11: Save? + +Show the assembled config as pretty-printed JSON with sorted top-level keys. Use `AskUserQuestion`: + +> Save this config to `.claude/azure-issue-triage.config.json`? + +Options: +- `Yes, write the file` +- `No, discard and exit` +- `Edit a specific question (which one?)` + +On `Edit`, ask which question number (1-10) to revisit, re-prompt that question, and loop back to Q11. + +### 4. Write the config file + +Use the `Write` tool with `path: ".claude/azure-issue-triage.config.json"`. Pretty-print with two-space indent and sort top-level keys alphabetically for stable diffs. 
The full schema (with all top-level keys, in alphabetical order): + +```json +{ + "archetype_assignment_after_triage": { + "Bug": "unassign", + "Incident": "self", + "User Story": "self", + "Feature": "self", + "Task": "self", + "Spike": "self" + }, + "area_path_prefix": null, + "default_team": null, + "description_preview_pause_seconds": 3, + "escalation": { + "teams_channel": null, + "primary_contact": null, + "fallback_contact": null + }, + "iteration_path_strategy": null, + "organization_url": "https://dev.azure.com/", + "pr_linking_enabled": true, + "project": "", + "severity_field": "Microsoft.VSTS.Common.Severity", + "severity_scheme": { + "1 - Critical": { "due_offset_days": 7, "escalate_immediately": true }, + "2 - High": { "due_offset_days": 14, "escalate_immediately": false }, + "3 - Medium": { "due_offset_days": 30, "escalate_immediately": false }, + "4 - Low": { "due_offset_days": 90, "escalate_immediately": false } + }, + "skip_tags": [], + "states": { + "investigating": { "state": "Active", "reason": "Investigating" }, + "waiting_reply": { "state": "Active", "reason": "Awaiting Customer" } + }, + "story_points_field": null, + "teams_channel": null, + "triaged_tag": "triaged", + "work_item_type_map": { + "Bug": "Bug", + "Incident": "Issue", + "User Story": "User Story", + "Feature": "Feature", + "Task": "Task", + "Spike": "Task" + } +} +``` + +The wizard does NOT ask about these advanced fields: `archetype_assignment_after_triage`, `description_preview_pause_seconds`, `pr_linking_enabled`, `skip_tags`, `triaged_tag`, `work_item_type_map`. They are written with their default values shown above so that the saved JSON is a complete, browsable config. Users edit the file directly to override. (`default_team` is asked only when Q9 picks "current sprint"; it stays null otherwise.) + +### 5. Confirmation message + +Print these lines: + +> Wrote `.claude/azure-issue-triage.config.json`. You can re-run `/azure-issue-triage:setup` any time to update. 
+> +> Advanced config keys not asked here (defaults used; edit the file directly to override): `archetype_assignment_after_triage`, `description_preview_pause_seconds`, `pr_linking_enabled`, `skip_tags`, `triaged_tag`, `work_item_type_map`. See the plugin README's "Configuration" section for what each one does. + +## Notes + +- This wizard never modifies Azure DevOps. Read-only auto-discovery only. +- If `core_list_projects` or `wit_my_work_items` fails, proceed with the static defaults and tell the user the auto-discovery failed. +- The wizard does not validate the entered Teams channel name against the live Teams instance. The agent's Phase 10 attempts the send at runtime; if the channel doesn't resolve, the agent falls back to inline output and notes the failure in the summary. +- The wizard does not auto-detect the process template (Agile / Scrum / CMMI). Q5's defaults match Agile; users on Scrum or CMMI override Q5 manually based on their workflow. diff --git a/azure-issue-triage/skills/azure-issue-investigator/SKILL.md b/azure-issue-triage/skills/azure-issue-investigator/SKILL.md new file mode 100644 index 0000000..cbc2c7d --- /dev/null +++ b/azure-issue-triage/skills/azure-issue-investigator/SKILL.md @@ -0,0 +1,205 @@ +--- +name: azure-issue-investigator +description: "Investigates an Azure DevOps Bug or Incident work item by searching Microsoft Teams, the work item and related AzDO/Wiki pages, Datadog, and the codebase, then writes an evidence-tagged report in the bug-archetype template. Use when a Bug or Incident work item needs an investigation report before triage decisions are made. For User Story, Feature, Task, or Spike work items, see `azure-requirements-investigator`." 
+metadata: + author: Taha Bikanerwala +tools: Read, Bash, Grep, wit_get_work_item, wit_query_by_wiql, wiki_search, teams_search_messages, teams_read_thread, mcp__datadog__search_datadog_logs +--- + +# Azure Issue Investigator + +Produce a structured report that orients an engineer for an Azure DevOps Bug or Incident work item. The report names what is broken, ranks 2-3 hypotheses, lists concrete next-step queries, and tags every claim with its evidence level. + +**Scope:** Bug and Incident archetypes. For User Story, Feature, Task, or Spike work items, the `azure-issue-triage` agent calls `azure-requirements-investigator` instead. + +This skill investigates. It does not solve, post, or modify anything. + +## Calling Convention + +This skill runs without user interaction. The constraints below let it work cleanly inside the `azure-issue-triage` agent (which has its own confirmation gate) and standalone. + +- **Non-interactive.** Never ask the user a question. Inputs are inferred from the work item and search results. +- **Predictable structure.** Same six section headers every run, in the same order, with one allowed reorder for production incidents (see Adaptation Rules). +- **Same evidence tags.** Always `[VERIFIED]`, `[OBSERVED]`, `[INFERRED]`, `[UNKNOWN]`. +- **Output is the last thing.** Skill ends after the report renders. No follow-up prompts. +- **Read-only.** No `wit_update_work_item`, no `wit_add_work_item_comment`, no `teams_send_message`. Posting is the caller's job. + +## Tool naming note + +The frontmatter `tools` list uses short, unprefixed names (`wit_get_work_item`, `wit_query_by_wiql`, `wiki_search`, `teams_search_messages`). The actual tool prefix depends on which Azure DevOps MCP server and which Teams MCP server you have installed and how Claude Code mounts them (e.g., `mcp__azure_devops__wit_get_work_item`, `mcp__plugin_ado__wit_get_work_item`). 
If the skill's tool calls fail because the prefix doesn't match, edit the frontmatter to add your prefix once. The skill body refers to tools by their short name throughout. + +If no Teams MCP server is installed, Level 1 (Teams search) is silently skipped and the investigation starts at Level 2. + +## Search Ladder + +Investigation runs four levels top to bottom. Each level has a gate: if it produces enough evidence to write a useful report, skip the remaining levels. + +### Setup + +Before running the levels, fetch the work item once and cache it for the rest of the skill. + +1. Identify the work-item ID from the invocation context (e.g., `12345` from a pasted URL or a parameter passed by the caller). If the calling context (such as the `azure-issue-triage` agent) has already fetched the work item and exposed the payload, reuse that payload; don't fetch again. +2. If no payload is available, call `wit_get_work_item` with the ID and `expand: "all"`. Request these fields at minimum: `System.Title`, `System.Description`, `System.State`, `System.Reason`, `Microsoft.VSTS.Common.Priority`, `Microsoft.VSTS.Common.Severity`, `System.Tags`, `System.AreaPath`, `System.IterationPath`, `System.AssignedTo`, `System.CreatedBy`, `System.CreatedDate`, `System.ChangedDate`, `System.WorkItemType`, plus the `relations` array. +3. Cache the response. Reference it as "the work-item payload" throughout the skill — `System.Title`, `System.Description`, `System.CreatedDate`, `relations`, `System.CreatedBy`, etc. + +If `wit_get_work_item` fails (auth error, work item not found, network), stop and tell the caller which call failed. Do not proceed without work-item data. + +### Level 1: Teams + +Skip this level entirely if no Teams MCP server is installed (the tool calls will return tool-not-found and the level produces nothing). + +Run 2-3 queries via `teams_search_messages`: + +1. The work-item ID (e.g., `12345` or `AB#12345` if your team uses the AzDO link prefix). +2. 
The most distinctive symptom or error message. +3. The customer or area name combined with a key term. + +For each relevant hit, follow the thread in full with `teams_read_thread`. + +What you are looking for: +- An engineer who already identified the root cause. +- A workaround that was shared. +- A specific service, config setting, or deploy named as the culprit. +- Links to relevant pull requests, commits, or related work items. + +**Gate:** if a Teams thread contains a confirmed root cause or workaround, write the report citing that thread and skip Levels 2-4. + +### Level 2: Work Item + AzDO + Wiki + +Read the work-item payload (cached in Setup) carefully. Signals are easy to miss on a fast scan: error messages, timestamps, customer names, browser/device, the question the reporter is actually asking. + +Then search: + +- **Related work items** via `wit_query_by_wiql`. Common patterns: + - `SELECT [System.Id], [System.Title], [System.State] FROM WorkItems WHERE [System.TeamProject] = '' AND [System.Description] CONTAINS '' ORDER BY [System.CreatedDate] DESC` + - `SELECT [System.Id], [System.Title], [System.State] FROM WorkItems WHERE [System.TeamProject] = '' AND [System.Title] CONTAINS '' AND [System.State] <> 'Closed' ORDER BY [System.CreatedDate] DESC` + - `SELECT [System.Id], [System.Title], [System.State] FROM WorkItems WHERE [System.TeamProject] = '' AND [System.AreaPath] UNDER '' ORDER BY [System.CreatedDate] DESC` +- **Linked work items.** Walk every entry in the `relations` array from the work-item payload. Read the linked work-item title, state, and the most relevant scope statement. +- **AzDO Wiki** via `wiki_search`. Look for runbooks, architecture pages, known-issues pages, onboarding docs. Use the feature area, system name, or entity type as the search term. + +For each related work item, record: ID, title, state, assignee, the most relevant finding from description or comments. +For each Wiki page, record: URL and a 1-line summary. 
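The related-work-item patterns above share one shape, so they could be assembled from a project name and a search term. The sketch below is a hedged illustration: `wiql_literal` and `related_items_query` are hypothetical helper names, and doubling single quotes inside a WIQL string literal is an assumption to verify against the WIQL syntax reference before relying on it.

```python
def wiql_literal(value: str) -> str:
    """Quote a value as a WIQL string literal.

    Assumes embedded single quotes are escaped by doubling them.
    """
    return "'" + value.replace("'", "''") + "'"


def related_items_query(
    project: str,
    term: str,
    field: str = "System.Description",
    open_only: bool = False,
) -> str:
    """Build a CONTAINS query shaped like the patterns listed above."""
    clauses = [
        f"[System.TeamProject] = {wiql_literal(project)}",
        f"[{field}] CONTAINS {wiql_literal(term)}",
    ]
    if open_only:
        clauses.append("[System.State] <> 'Closed'")
    return (
        "SELECT [System.Id], [System.Title], [System.State] FROM WorkItems "
        f"WHERE {' AND '.join(clauses)} "
        "ORDER BY [System.CreatedDate] DESC"
    )
```

The resulting string is what would be passed to `wit_query_by_wiql`; quoting the term matters because error messages copied from a work item often contain apostrophes.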
+ +**Gate:** if a runbook describes the exact scenario or a prior work item has the resolution, write the report pointing at that source. Skip Levels 3-4. + +### Level 3: Datadog + +Build queries from signals collected in Levels 1-2: error strings, service names, entity IDs, HTTP status codes. + +Call `search_datadog_logs` with: +- `query`: e.g., `service:my-service status:error @http.status_code:500 @user_id:abc123` +- `from`: 7 days before the work item's `System.CreatedDate`, or the timeframe mentioned in the work item +- `to`: work-item `System.CreatedDate` or now +- `limit`: 10-25 + +Build a Logs URL the engineer can click: +`https://app.datadoghq.com/logs?query=&from_ts=&to_ts=` + +**Suppression rule:** if Datadog returns any error (auth, 403/404, timeout, rate limit, empty results, or any non-success), treat Datadog as unavailable for this work item. Do not mention Datadog anywhere in the report. This rule overrides every other instruction that references Datadog data. + +**Gate:** if Datadog returned usable results that identify a service, an error pattern, or a timeline gap, write the report incorporating those findings. Skip Level 4 unless an external source points specifically to a code-level cause. + +### Level 4: Code + +Enter only when Levels 1-3 turned up nothing useful, OR external sources point to a code-level cause that needs tracing. + +1. **Error strings.** Use `Bash` (e.g., `grep -r 'pattern' path/`) or `Grep` to find error messages in the codebase. Identify which service owns the error. +2. **Endpoints or event handler names.** Search for route definitions or event handler names to confirm which service handles the affected flow. +3. **Observable signals.** Use `Read` to open source files near the relevant code; find logging and monitoring calls. For each call found, note the log message string and any structured tags so the "Where To Look" section can name them. +4. 
**Recent changes.** Run `git log --since="2 weeks ago" -- ` via `Bash` to find commits that correlate with the reported timeline. + +Stop when you can name: which service is involved, what signals are observable, and 2-3 concrete observability queries. Do not trace full call chains unless the chain itself is the finding. + +## Evidence Model + +Every claim in the report carries one of four tags. + +| Tag | Meaning | +|-----|---------| +| `[VERIFIED]` | Directly confirmed. Read in code, or a source explicitly states this. | +| `[OBSERVED]` | A pattern matches the reported behavior, but reaching the conclusion required a logical step. | +| `[INFERRED]` | Logical deduction from available information. Not directly observed. | +| `[UNKNOWN]` | Cannot determine from available sources. Requires runtime data. | + +If the finished report has more `[INFERRED]` than `[VERIFIED]` findings, the search was insufficient. Go back and search more before writing. + +Every `[UNKNOWN]` becomes a "Where To Look" item: name the runtime check that would resolve it. + +## Stop Condition + +Investigation is **done** when all three are true: + +1. There are 2-3 ranked hypotheses, most-likely first. **Exception:** if the Level 1 or Level 2 gate fired with a confirmed root cause, a single hypothesis is sufficient. +2. At least one source has been consulted at every search level the investigation reached. (If Level 1 closed via its gate, Levels 2-4 do not need sources. If Level 1 was skipped because no Teams MCP is installed, that is not a source gap; investigation just starts at Level 2.) +3. There are concrete next-step queries or files in "Where To Look". + +If any one is missing, keep investigating. + +## Report Template + +Every report has all six sections. If a section has nothing meaningful to say, write a 1-line note ("Not applicable for this work item") rather than skip the section. + +### 1. Lead + +1-2 sentences. Name what is broken and your single best hypothesis. 
Inline evidence tag. Do not restate the work-item title. + +Example: + +> Sessions for tenant `MapleTower` started failing at the join step yesterday after deploy `2026-04-29T18:00Z`; the new SSO middleware is the most likely cause `[OBSERVED]`. + +### 2. Scope & State + +Who is affected (one user, a segment, or all). Whether investigation is complete or needs runtime verification. Stale-work-item flag if the work item has been quiet for more than 2 weeks while the bug may already be fixed. + +### 3. Domain Context + +2-4 sentences. Define vendor names, internal acronyms, or product terminology a new team member would not know. Skip with "Not applicable" if the affected area is obvious from the title. + +### 4. What Happened + +2-4 sentences. Plain language. Include the exact error message and when the issue started if known. + +### 5. What We Found + +Narrative prose with evidence tags inline. Cover: + +- Which service or component owns the behavior. +- 2-3 hypotheses ranked by likelihood, each with its evidence trail. +- Recent changes (deploys, PRs, config) that correlate with the timeline. +- Related prior work items and what they say. + +No tables in this section. No code snippets unless the snippet itself is the finding (then keep it short). + +### 6. Where To Look + +2-5 tool-by-tool items. Each item: + +- Names the tool (code search, Teams search, admin URL, Sentry, Datadog, etc.). The list reflects tools the engineer should use after reading the report, not tools this skill itself queried. +- Gives the exact ready-to-paste query, URL, or file path. +- Says in one phrase what a hit or miss tells you. +- Datadog items appear here only if Datadog returned usable results during Level 3. The Level 3 suppression rule overrides this whenever Datadog was unavailable. + +Example: + +> - **Code search:** `grep -r 'SSO_TOKEN_EXPIRED' services/auth/` to find the error string in source. 
A hit identifies the service that owns the failure mode; a miss means the error originates outside the auth service. + +## Adaptation Rules + +These rules adjust section order or content emphasis. All six sections still appear every run. + +- **Found at Level 1 (Teams):** Section 5 leads with the Teams source and links the thread. Sections 3, 4 may be 1 line each. +- **Found at Level 2 (runbook or prior work item):** Section 5 leads with the source. Same brevity allowed elsewhere. +- **Required Levels 3-4 (code/logs):** Section 5 includes code references inline as `path/to/file.ext:line`. No long code snippets unless the snippet is the finding. +- **Production incident (live impact):** Reorder. Put Section 6 ("Where To Look") immediately after Section 1 ("Lead"). Sections 2-5 follow. Engineers reading this need next actions before context. +- **Vague work item (almost no signal):** Section 5 describes what was searched and what is unknown. Section 6 ends with a single `Where To Look` item naming the specific information the reporter could provide, phrased as a concrete question for the owning team to use if they choose to contact the reporter. The skill itself never contacts the reporter. + +## Writing Rules + +These apply to all text in the report. + +- No em dashes or spaced hyphens as separators. Em dashes inside parenthetical asides are fine. +- No LLM vocabulary: delve, leverage, robust, seamlessly, comprehensive, nuanced, elevate, foster, paradigm, ecosystem, holistic, innovative, synergy, empower, facilitate. +- Lead with the answer. No opener phrases. +- No trailing summaries on short sections. +- Prose over bullet lists when the content flows naturally as sentences. +- Never present unverified analysis as a confirmed root cause. 
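A caller that wants to check a finished report against the vocabulary rule above could use something like the sketch below. It is a rough illustration under stated assumptions: `banned_words` is a hypothetical helper, it matches whole words only (so inflected forms such as "leveraging" are not caught), and it does not attempt the separator rule.

```python
import re
from typing import List

# Word list copied from the vocabulary rule above.
BANNED = (
    "delve", "leverage", "robust", "seamlessly", "comprehensive", "nuanced",
    "elevate", "foster", "paradigm", "ecosystem", "holistic", "innovative",
    "synergy", "empower", "facilitate",
)


def banned_words(text: str) -> List[str]:
    """Return the banned words that appear in text, in list order."""
    found = []
    for word in BANNED:
        # Whole-word, case-insensitive match.
        if re.search(rf"\b{word}\b", text, flags=re.IGNORECASE):
            found.append(word)
    return found
```

An empty result means the report passes this one rule; it says nothing about the other writing rules, which need a human or a separate check.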
diff --git a/azure-issue-triage/skills/azure-requirements-investigator/SKILL.md b/azure-issue-triage/skills/azure-requirements-investigator/SKILL.md new file mode 100644 index 0000000..259099b --- /dev/null +++ b/azure-issue-triage/skills/azure-requirements-investigator/SKILL.md @@ -0,0 +1,152 @@ +--- +name: azure-requirements-investigator +description: "Investigates a non-bug Azure DevOps work item (User Story, Feature, Task, Spike) by reading the work item and linked design or product docs, searching Microsoft Teams and AzDO Wiki for prior decisions, and producing an evidence-tagged orientation report. Use when a developer is about to pick up a non-bug work item and wants context before starting work." +metadata: + author: Taha Bikanerwala +tools: Read, Bash, Grep, wit_get_work_item, wit_query_by_wiql, wiki_search, teams_search_messages, teams_read_thread +--- + +# Azure Requirements Investigator + +Produce a structured report that orients a developer about to pick up a User Story, Feature, Task, or Spike work item. The report names what is being built (or asked), surfaces prior decisions found in Microsoft Teams and AzDO Wiki, lists the open questions that block start-of-work, and tags every claim with its evidence level. + +**Scope:** User Story, Feature, Task, Spike. The agent's Phase 0 routes any of these archetypes into this skill. Bug and Incident go to `azure-issue-investigator` instead. + +This skill investigates. It does not solve, post, or modify anything. + +## Calling Convention + +This skill runs without user interaction. The constraints below let it work cleanly inside the `azure-issue-triage` agent (which has its own confirmation gate) and standalone. + +- **Non-interactive.** Never ask the user a question. Inputs are inferred from the work item and search results. +- **Predictable structure.** Per-archetype templates with fixed section orders. The archetype is passed by the caller; when running standalone, infer it from the work-item-type field. 
+- **Same evidence tags.** Always `[VERIFIED]`, `[OBSERVED]`, `[INFERRED]`, `[UNKNOWN]`. +- **Output is the last thing.** Skill ends after the report renders. No follow-up prompts. +- **Read-only.** No `wit_update_work_item`, no `wit_add_work_item_comment`, no `teams_send_message`. Posting is the caller's job. + +## Tool naming note + +The frontmatter `tools` list uses short, unprefixed names (`wit_get_work_item`, `wit_query_by_wiql`, `wiki_search`, `teams_search_messages`). The actual tool prefix depends on which Azure DevOps MCP server and which Teams MCP server you have installed and how Claude Code mounts them. If the skill's tool calls fail because the prefix doesn't match, edit the frontmatter to add your prefix once. The skill body refers to tools by their short name throughout. If no Teams MCP is installed, Level 1 is silently skipped. + +## Search Ladder + +Investigation runs three levels top to bottom. Each level has a gate: if it produces enough evidence to write a useful report, skip the remaining levels. Datadog is not in the ladder by default; non-bug work items rarely have runtime telemetry to query, and adding it pulls in a tool the skill does not need for this archetype. + +### Setup + +Before running the levels, fetch the work item once and cache it for the rest of the skill. + +1. Identify the work-item ID from the invocation context (e.g., `12345` from a pasted URL or a parameter passed by the caller). If the calling context (such as the `azure-issue-triage` agent) has already fetched the work item and exposed the payload, reuse that payload; don't fetch again. +2. If no payload is available, call `wit_get_work_item` with the ID and `expand: "all"`. Request these fields at minimum: `System.Title`, `System.Description`, `System.State`, `System.WorkItemType`, `System.Tags`, `System.AreaPath`, `System.IterationPath`, `System.AssignedTo`, `System.CreatedBy`, `System.CreatedDate`, `System.ChangedDate`, `System.Parent`, plus the `relations` array. 
Add the optional fields the caller cares about (`Microsoft.VSTS.Common.AcceptanceCriteria`, `Microsoft.VSTS.Scheduling.StoryPoints`) when known. +3. Determine the archetype: use the value passed by the caller. If running standalone, map `System.WorkItemType` using the table below. If the work-item type is `Bug`, or `Issue` / `Impediment` (which the agent routes as Incident), stop and tell the caller this skill does not handle those archetypes; route to `azure-issue-investigator` instead. + + | Work-item type (Agile / Scrum / CMMI) | Archetype | Report template | + |---------------------------------------|-----------|-----------------| + | User Story (Agile) / Product Backlog Item (Scrum) / Requirement (CMMI) | User Story | Feature template | + | Feature / Epic | Feature (epic-level — note in Lead) | Feature template | + | Task | Task | Task template | + | Task tagged `spike`, or a custom `Spike` work-item type | Spike | Spike template | + +4. Cache the response as "the work-item payload" throughout the skill. + +If `wit_get_work_item` fails (auth error, work item not found, network), stop and tell the caller which call failed. Do not proceed without work-item data. + +### Level 1: Teams + +Skip entirely if no Teams MCP server is installed. + +Run 2-3 queries via `teams_search_messages`: + +1. The work-item ID (e.g., `12345` or `AB#12345`). +2. The feature, task, or spike name, or the most distinctive phrase from the title. +3. The area or system name combined with a key term from the description. + +For each relevant hit, follow the thread in full with `teams_read_thread`. + +What you are looking for: +- A prior decision that defines the scope (often labeled "decided", "approved", "agreed"). +- A linked design doc, product brief, or RFC. +- A named owner or Decider for the area. +- A prior thread that names the same problem with a different framing (the team may have a different vocabulary for the same scope). 
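The two or three Level 1 queries described above could be assembled as in the sketch below. This is an assumption-laden illustration: `build_teams_queries` and its parameters are hypothetical names, the choice of title phrase and key term is left to the caller, and the function only builds strings (it does not call `teams_search_messages`).

```python
def build_teams_queries(
    work_item_id: int,
    title_phrase: str,
    area: str,
    key_term: str,
    use_ab_prefix: bool = False,
) -> list[str]:
    """Build the Level 1 search strings in the order the skill lists them."""
    # Query 1: the work-item ID, optionally in the AB#12345 form.
    id_query = f"AB#{work_item_id}" if use_ab_prefix else str(work_item_id)
    candidates = [
        id_query,
        title_phrase.strip(),            # Query 2: distinctive title phrase
        f"{area} {key_term}".strip(),    # Query 3: area plus a key term
    ]
    # Drop empty strings and duplicates while preserving order.
    queries: list[str] = []
    for query in candidates:
        if query and query not in queries:
            queries.append(query)
    return queries
```

Each returned string would be passed as one search call; when the title yields no distinctive phrase, the list shrinks to two queries, which still fits the "2-3 queries" guidance.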
+ +**Gate:** if a Teams thread contains a confirmed scope statement, an approved design link, or a clear "out of scope" decision, write the report citing that thread and skip Levels 2-3. + +### Level 2: Work Item + AzDO + Wiki + +Read the work-item payload (cached in Setup) carefully. Signals are easy to miss on a fast scan: linked design docs, mentions of related work items, vocabulary like "blocks" or "depends on" that points at sequencing constraints, the question the reporter is actually asking. + +Then search: + +- **Linked work items.** Walk every entry in the `relations` array and the `System.Parent` field. Read the related work-item titles and the most recent 1-2 comments. Note: state, assignee, the most relevant scope statement. +- **Related work items** via `wit_query_by_wiql`. Common patterns: + - `SELECT [System.Id], [System.Title], [System.State] FROM WorkItems WHERE [System.TeamProject] = '<project>' AND [System.Description] CONTAINS '<key phrase>' ORDER BY [System.CreatedDate] DESC` + - `SELECT [System.Id], [System.Title], [System.State] FROM WorkItems WHERE [System.TeamProject] = '<project>' AND [System.AreaPath] UNDER '<area path>' AND [System.WorkItemType] IN ('User Story', 'Product Backlog Item', 'Task', 'Feature') ORDER BY [System.CreatedDate] DESC` + - When the work item has a parent epic or feature: `SELECT [System.Id], [System.Title] FROM WorkItemLinks WHERE Source.[System.Id] = <parent id> AND [System.Links.LinkType] = 'System.LinkTypes.Hierarchy-Forward' MODE (Recursive)`. +- **AzDO Wiki** via `wiki_search`. Search for product briefs, design docs, ADRs, RFCs, runbooks for the area. Use the feature area, system name, or product theme as the search term. + +For each related work item, record: ID, title, state, assignee, the most relevant scope or AC finding. +For each Wiki page, record: URL and a 1-line summary of what it contains. + +**Gate:** if a product brief or design doc explicitly states the scope and acceptance criteria for this work, write the report citing that source. 
Skip Level 3 unless the work item also references existing code that needs orientation. + +### Level 3: Code + +Enter only when Levels 1-2 turned up nothing useful, OR the work item explicitly references existing code that needs context to size or scope the work. + +1. **Affected service or module.** When the work item names a service, module, or file path, use `Bash` (e.g., `git ls-files | grep -i 'pattern'`) or `Grep` to locate it in the repository. Note the relative path. +2. **Existing patterns.** When the work item references "the same way we do X for Y", use `Grep` to find the X pattern and `Read` the existing implementation. Record the path and a 1-line summary of the pattern. +3. **Recent changes.** Run `git log --since="2 months ago" -- <path>` via `Bash` to find commits in the affected area. The recent change list informs the "Where To Look" section. + +Stop when you can name: which area of the code is involved, what existing patterns to follow, and 1-3 specific files or directories the developer should open first. Do not trace full call chains; this is orientation, not implementation. + +## Evidence Model + +Every claim in the report carries one of four tags. + +| Tag | Meaning | +|-----|---------| +| `[VERIFIED]` | Directly confirmed. Read in code or in an authoritative source (design doc, ADR, work-item description) that explicitly states this. | +| `[OBSERVED]` | A pattern matches the reported scope, but reaching the conclusion required a logical step. | +| `[INFERRED]` | Logical deduction from available information. Not directly observed. | +| `[UNKNOWN]` | Cannot determine from available sources. Requires asking a person or reading docs that were not found. | + +If the finished report has more `[INFERRED]` than `[VERIFIED]` findings, the search was insufficient. Go back and search more before writing. 
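The insufficiency rule above (more `[INFERRED]` than `[VERIFIED]` findings) can be checked by counting tags in the drafted report. The sketch below is a minimal illustration, assuming tags appear literally as bracketed uppercase words; the function names are hypothetical and not part of the skill.

```python
import re
from collections import Counter

EVIDENCE_TAGS = ("VERIFIED", "OBSERVED", "INFERRED", "UNKNOWN")
TAG_PATTERN = re.compile(r"\[(VERIFIED|OBSERVED|INFERRED|UNKNOWN)\]")


def count_evidence_tags(report: str) -> Counter:
    """Count each evidence tag; tags that never appear are reported as 0."""
    counts = Counter({tag: 0 for tag in EVIDENCE_TAGS})
    counts.update(TAG_PATTERN.findall(report))
    return counts


def search_was_insufficient(report: str) -> bool:
    """True when inferred findings outnumber verified ones."""
    counts = count_evidence_tags(report)
    return counts["INFERRED"] > counts["VERIFIED"]
```

A tie does not trip the check, matching the wording "more `[INFERRED]` than `[VERIFIED]`".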
+ +Every `[UNKNOWN]` becomes either an "Open Questions" item (when the unknown blocks start of work and a person can answer it) or a "Where To Look" item (when the unknown can be resolved by reading or searching). + +## Stop Condition + +Investigation is **done** when all three are true: + +1. The Lead names what the work item asks for in one or two sentences, with the strongest evidence inline. +2. At least one source has been consulted at every search level the investigation reached. (If Level 1 closed via its gate, Levels 2-3 do not need sources. If Level 1 was skipped because no Teams MCP is installed, that is not a source gap.) +3. There are concrete next-step queries, file paths, or doc links in "Where To Look". + +If any one is missing, keep investigating. + +## Report Template + +The template differs by archetype. Read `references/report-template.md` when you reach the report-writing step. Pick the matching template based on the archetype determined in Setup. + +Section orders and definitions live in `references/report-template.md`. The three templates share five concepts: a one-line Lead, a context section (Background or Why Now or Question to Answer), a scope-or-knowledge section, an unknowns section, and a Where To Look section. + +## Adaptation Rules + +These rules adjust section emphasis without changing the section list. + +- **Found at Level 1 (Teams):** the context section leads with the Teams source and links the thread. Other context sections may be 1 line. +- **Found at Level 2 (design doc or product brief):** the context section leads with the doc URL and quotes the most relevant sentence. Same brevity allowed elsewhere. +- **Required Level 3 (code):** the Where To Look section includes code references as `path/to/file.ext` (no line numbers; this is orientation, not pinpointing). When the existing pattern is the finding, name the file in the context section. 
+- **Vague work item (almost no signal):** the Open Questions / Open Blockers section grows to name what the developer would need from the reporter or product owner before starting. Do not pad with prescriptive suggestions; only name genuine unknowns. + +## Writing Rules + +These apply to all text in the report. + +- No em dashes or spaced hyphens as separators. Em dashes inside parenthetical asides are fine. +- No LLM vocabulary: delve, leverage, robust, seamlessly, comprehensive, nuanced, elevate, foster, paradigm, ecosystem, holistic, innovative, synergy, empower, facilitate. +- Lead with the answer. No opener phrases. +- No trailing summaries on short sections. +- Prose over bullet lists when the content flows naturally as sentences. +- Never present unverified scope as confirmed acceptance criteria. +- Quote design docs and product briefs directly (single sentence) when paraphrasing risks losing meaning. diff --git a/azure-issue-triage/skills/azure-requirements-investigator/references/report-template.md b/azure-issue-triage/skills/azure-requirements-investigator/references/report-template.md new file mode 100644 index 0000000..fb5f6ea --- /dev/null +++ b/azure-issue-triage/skills/azure-requirements-investigator/references/report-template.md @@ -0,0 +1,148 @@ +# Report Template + +Read this when you reach the report-writing step in `azure-requirements-investigator`. Skip on earlier steps. + +The report template differs by archetype. Pick the matching template based on the archetype passed by the caller (the agent in Phase 1) or inferred from the work-item-type when running standalone (User Story / Product Backlog Item / Requirement -> Feature; Task -> Task; Spike custom type or Task tagged `spike` -> Spike). + +Each template has a fixed section list and order. Do not omit a section heading. If a section has nothing meaningful to say, write a one-line note under the heading ("Not applicable for this work item" or "Nothing prior found"). 
The absence is itself signal, and keeping the heading lets a reader scan for the section even when it is empty. + +## Feature Template + +Six sections. The agent routes User Story, Feature, and Epic work items into this template. A standalone invocation against any of those archetypes uses the same shape. + +### 1. Lead + +1-2 sentences. Name what is being built and the single best summary of scope. Inline evidence tag. Do not restate the work-item title. + +Example: + +> Add a per-tenant rate limit for the bulk-export endpoint, default 100 requests per hour, configurable per tenant `[VERIFIED]` from the product brief at WI #1234. + +### 2. Background + +2-4 sentences. Why this work item exists. Prior decisions, related history, links to product briefs, design docs, runbooks, related work items. Pull this from the AzDO Wiki and the linked work items discovered in Level 2. + +When the background is established in a Wiki brief or ADR, lead with the URL and quote the single most relevant sentence. Do not paraphrase across multiple decisions; name them separately. + +### 3. Requirements Found + +Concrete acceptance criteria, definition of done, success metrics, target user stories. Pull from the work item itself (including the `Microsoft.VSTS.Common.AcceptanceCriteria` field if populated), linked work items, and any Wiki spec the work item points to. + +Format as a bulleted list when there are 3 or more items. Each bullet starts with the requirement, followed by the source citation in brackets. Tag explicit gaps as `[UNKNOWN]`. + +Example: + +> - Bulk export rate limit defaults to 100 req/hour per tenant `[VERIFIED]` (WI #1234 description, paragraph 2). +> - Override via `bulk_export_rate_limit` config field, integer in requests/hour, no upper bound `[OBSERVED]` (WI #1234 comment from @alice 2026-04-15). +> - Rate limit error response shape unspecified `[UNKNOWN]`. + +### 4. Design Refs + +Links to Figma boards, design docs, mockup reviews, ADRs. 
One bullet per link with a one-phrase summary of what's at the link. If nothing is linked, write "None found in work item or comments." A Feature work item with no design refs is itself a triage flag, worth naming. + +### 5. Open Questions + +Genuine unknowns that need an answer before development starts. Each question is specific (not "what's the scope?"). Tag the most likely answerer if discoverable from the work-item history or the Teams search. + +Example: + +> - What error response shape should the rate limit return? Likely answerer: @bob (named in WI #1234 thread). `[UNKNOWN]` +> - Does the rate limit apply to admin-impersonated requests? `[UNKNOWN]` + +### 6. Where To Look + +2-5 tool-by-tool items. Each item: + +- Names the tool (code search, AzDO Wiki search, Teams search, work-item ID, design tool). +- Gives the exact ready-to-paste query, URL, or file path. +- Says in one phrase what a hit or miss tells you. + +Examples: + +> - **Code search:** `rg 'rate_limit' services/bulk_export/` to find existing rate-limit infrastructure. A hit identifies the pattern to follow; a miss means this is greenfield in the bulk-export service. +> - **Teams search:** "bulk export" in the platform channel to find the design discussion that led to WI #1234. A hit gives the rationale behind the 100 req/hour default. +> - **AzDO Wiki:** search `Bulk Export ADR` (full text) to find the architecture decision record. The ADR is the canonical scope source. + +## Task Template + +Five sections. + +### 1. Lead + +1-2 sentences. Name what needs to happen and the why-now. Inline evidence tag. + +Example: + +> Bump `pg` driver from 8.x to 9.x across the `payments` and `billing` services to unblock Postgres 16 upgrade `[VERIFIED]` from WI #2345. + +### 2. Why Now + +2-3 sentences. The trigger for the task: dependency upgrade unblocks something downstream, deprecation deadline, related migration, runbook execution, security advisory. 
+ +Pull from the work item and recent Teams discussions. When a deadline or dependency is the trigger, name it specifically with date or work-item reference. + +### 3. Definition of Done Found + +Concrete completion criteria. Pull from the work item and any linked checklist or runbook. Tag explicit gaps as `[UNKNOWN]`. + +Format as a bulleted list. Each bullet is a single completion check, not a process step. + +### 4. Risks + +Anything in the affected code, config, or runbook that makes this task non-trivial. One bullet per risk. Examples: + +- Shared config touched by other teams. +- Dependency with a known breaking change between minor versions. +- Infrastructure resource with downstream consumers (database, queue, shared library). +- The task touches a runbook step that has no documented rollback. + +If no risks are found, write "No specific risks surfaced from search; standard care expected." + +### 5. Where To Look + +Same format as Feature template Section 6. + +## Spike Template + +Five sections. The agent routes any Task tagged `spike` (or a custom Spike work-item type) into this template. + +### 1. Lead + +1-2 sentences. Name the question being investigated and the time-box if known. Inline evidence tag. + +Example: + +> Spike: evaluate whether Cognito can support more than 100 IdPs per user pool. Time-boxed at 3 days `[VERIFIED]` from WI #3456. + +### 2. Question to Answer + +The specific decision or unknown the spike is meant to resolve. Quote it directly from the work item if the work item states it well; rewrite if vague. Multiple sub-questions are allowed if they are all in scope. + +Example: + +> Primary question: Does Cognito support more than 100 IdPs per user pool today? `[VERIFIED]` from WI #3456. +> Sub-questions: What is the practical performance impact of >100 IdPs? What are the migration paths if the limit is hard? + +### 3. 
What's Already Known + +Findings from any prior spike, design doc, related work item, or Teams thread that bear on the question. Tag each with evidence level. If nothing is known, write "Nothing prior found." (which is itself a useful signal: the spike is starting from scratch). + +### 4. What's Unknown + +The gaps that the spike needs to fill. One bullet per gap. Each phrased as a concrete check (`Does Cognito support more than 100 IdPs?`) rather than a vague topic (`scalability`). + +When the team has access to a vendor representative or a person with prior context, name them on the relevant unknown. + +### 5. Where To Look + +Same format as Feature template Section 6. For Spikes, "Where To Look" frequently includes external research (vendor docs, RFCs, comparison articles, vendor support tickets) in addition to internal sources. + +Example: + +> - **Vendor docs:** AWS Cognito IdP service quotas page (https://docs.aws.amazon.com/cognito/latest/developerguide/limits.html) to confirm the documented limit. +> - **Vendor support:** check past AWS Support cases for IdP limit-raise requests; AWS often raises soft limits. +> - **Related code:** `rg 'IdentityProvider' services/auth/` to see how many IdPs are currently configured per pool in the existing codebase. + +## Section Order + +Sections appear in the order shown for each archetype. Never omit a section heading. When a section legitimately has nothing to say, keep the heading and write the brief placeholder line described at the top of this file. If the absence is itself important context (e.g., a Feature work item with no design refs), call it out in the Lead in addition to the placeholder line in the empty section. 
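The fixed section lists above can be verified mechanically before a report is returned. The sketch below is one possible check, assuming the report renders each section as a markdown heading such as `### 1. Lead`; that heading shape and the name `check_sections` are assumptions for illustration, not something this template mandates.

```python
import re

# Section titles per archetype, in the order given by the templates above.
SECTION_ORDER = {
    "Feature": ["Lead", "Background", "Requirements Found", "Design Refs",
                "Open Questions", "Where To Look"],
    "Task": ["Lead", "Why Now", "Definition of Done Found", "Risks",
             "Where To Look"],
    "Spike": ["Lead", "Question to Answer", "What's Already Known",
              "What's Unknown", "Where To Look"],
}

# Matches "### 1. Lead" or "## Lead"; captures the title without the number.
HEADING = re.compile(r"^#{1,6}[ \t]+(?:\d+\.[ \t]+)?(.+?)[ \t]*$", re.MULTILINE)


def check_sections(report: str, archetype: str) -> tuple[list[str], bool]:
    """Return (missing section titles, whether found sections are in order)."""
    expected = SECTION_ORDER[archetype]
    found = [title for title in HEADING.findall(report) if title in expected]
    missing = [title for title in expected if title not in found]
    in_order = found == [title for title in expected if title in found]
    return missing, in_order
```

An empty `missing` list together with `in_order` being true indicates the report keeps every heading in template order, including headings whose body is only a placeholder line.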
diff --git a/azure-issue-triage/skills/azure-work-item-refiner/SKILL.md b/azure-issue-triage/skills/azure-work-item-refiner/SKILL.md new file mode 100644 index 0000000..24ba9fe --- /dev/null +++ b/azure-issue-triage/skills/azure-work-item-refiner/SKILL.md @@ -0,0 +1,175 @@ +--- +name: azure-work-item-refiner +description: "Restructures a poorly written Azure DevOps work item into a clear, self-contained document that a stranger can read cold and act on. Works on any work-item type (Bug, Incident, User Story / Feature, Task, Spike). Updates the description (System.Description) and title (System.Title) via the Azure DevOps MCP and never deletes original content. Use when the user asks to refine, rewrite, restructure, clean up, or improve an Azure DevOps work item." +metadata: + author: Taha Bikanerwala +tools: AskUserQuestion, Read, wit_get_work_item, wit_update_work_item, wit_add_work_item_comment, core_list_projects +--- + +# Azure Work Item Refiner + +Take an Azure DevOps work item that is hard to read and turn it into a document a stranger can open cold, in a year, and act on. Reorganize the content. Never delete it. Every fact in the original survives the rewrite, just placed somewhere it can be found. + +This skill modifies Azure DevOps in its default mode: it updates the work item's title and description (writing to the `System.Title` and `System.Description` fields, both submitted as a JSON Patch document via `wit_update_work_item`), and posts an optional next-steps comment when the user asks for one. **One exception:** when invoked with `Calling context: skip_preview=true.` (the `azure-issue-triage` agent does this in Phase 5), the skill operates in **read-only-return mode** and performs no Azure DevOps writes at all; it returns the refined title and description as plain text for the caller to write. See the "Calling Convention" bullets below for the full read-only-return contract. + +## Calling Convention + +The skill works two ways. 
When a user pastes a work-item ID or URL and asks to refine it, run end to end. When the `azure-issue-triage` agent calls this skill in Phase 5, treat the agent's already-fetched payload and investigation findings as the source data and skip the fetch step. + +- **One confirmation gate.** Always preview the rewrite before calling `wit_update_work_item`. The user must approve. **Exception:** when the prompt passed to the skill begins with `Calling context: skip_preview=true.` (the `azure-issue-triage` agent inserts this leading line in Phase 5 because the agent already captured user approval at Phase 3 and renders its own informational preview before writing), the skill operates in **read-only-return mode**: + - Do not call `AskUserQuestion` (no preview gate). + - Do not call `wit_update_work_item` (no description or title write). + - Do not call `wit_add_work_item_comment` (no Step 8 next-steps comment, even if the input would normally trigger it). + - Do not call any other Azure-DevOps-mutating MCP tool. + - The skill's only side effects are reading the cached payload and producing the refined title and description as plain-text output for the caller to consume. + - The agent (Phase 5 step 4) owns the `wit_update_work_item` write; if the user wants a Step 8 next-steps comment, they request it through a separate flow after the agent run completes. +- **Read-then-write.** Refuse to write before reading the description, comments, and (when relevant) revision history. +- **Strict superset.** The refined work item contains every fact from the original. Restructure, rewrite, and re-tag, never truncate. +- **No solution prescription.** The skill structures information. It does not invent fixes, recommend roadmap, or editorialize on causes that are not in the work item. + +## Prerequisites + +- The user provides a work-item ID (e.g., `12345`) or a work-item URL (e.g., `https://dev.azure.com/<org>/<project>/_workitems/edit/12345`). +- The Azure DevOps MCP server is configured. 
`core_list_projects` returns at least one project so the organization context is resolvable. +- The user has edit permission on the work item. + +If any prerequisite fails, stop and tell the user which call failed before continuing. + +## Workflow + +The workflow is eight ordered steps. Steps 1, 3, 5, and 6 require reading the corresponding reference file when you reach that step. Do not pre-load the references at the start of the run. + +1. **Fetch** the work item: fields, comments, revision history when needed, and linked work items. +2. **Classify** the work-item archetype (Bug / Feature / Task / Incident / Spike). The archetype controls which sections appear in the refined description. +3. **Inventory** every distinct fact across the description, comments, revision history, and linked work items. Tag each fact with its information category. +4. **Rewrite** the inventory into prose, applying the rewrite principles. +5. **Apply the template** in `assets/template.md`, including only the sections the archetype calls for. +6. **Rewrite the title** so a reader on a board can decide whether to open the work item. +7. **Preview** the proposed title and description as inline markdown. Wait for the user to approve. +8. **Post a next-steps comment** only when the user asks for one. + +--- + +### Step 1: Fetch the Work Item + +Read [references/gathering-guide.md](references/gathering-guide.md) when you reach this step. It documents which fields to request, the HTML content-loss check, and the completeness gate that blocks Step 2 until comments and revision history are read. + +If the calling context (the `azure-issue-triage` agent in Phase 5) has already fetched the work item and exposes the payload, reuse it. Do not refetch. + +### Step 2: Classify the Archetype + +Use the table below. The archetype is the single most important variable in the rewrite because it picks which template sections appear in the final description. 
+ +| Archetype | Typical Azure DevOps work-item types | What the content looks like | +|-----------|---------------------------------------|------------------------------| +| **Bug** | Bug | A user-visible symptom, an error, broken behavior, or unexpected output. May include reproduction steps. | +| **Feature** | User Story (Agile), Product Backlog Item (Scrum), Requirement (CMMI), Feature, Epic | A user need or business goal, acceptance criteria, design specs, links to product briefs. | +| **Task** | Task | Operational work: cleanup, configuration, migration, dependency upgrade, runbook execution. | +| **Incident** | Issue (Agile), Impediment (Scrum), Issue (CMMI). Some teams also use a Bug with an `incident` tag. | Production impact, blast radius, timeline, mitigation steps, post-mortem context. | +| **Spike** | Task or User Story tagged `spike`, or a custom Spike work-item type. | Open questions to resolve, exploration scope, proof-of-concept boundaries, preliminary findings. | + +**When the work-item type and the content disagree, trust the content.** A work item typed `Bug` whose body is acceptance criteria and a Figma link is a Feature. A work item typed `Task` describing user-visible breakage is a Bug. The content drives the template, not the work-item type field. + +Hold the archetype as working context. Do not surface it in the preview unless the user asks. + +### Step 3: Inventory the Information + +Catalog every distinct fact found in the description, every comment, the revision history (if read), and any linked work items that add scope. Read [references/classification-guide.md](references/classification-guide.md) when you reach this step. It defines the seven information categories and the verified-vs-unverified flag every item carries. + +### Step 4: Rewrite + +Apply the rewrite principles from `references/classification-guide.md` (already loaded in Step 3). 
The principles cover symptoms-over-solutions, evidence-over-claims, prose-over-tables, and the rules for surfacing decisions buried in comment threads. + +### Step 5: Apply the Template + +Structure the rewritten content using `assets/template.md`. Read it now, alongside [references/azure-html-formatting.md](references/azure-html-formatting.md) for the markdown-to-HTML safety rules. + +Three rules are non-negotiable: + +- **Pick sections by archetype.** The template ships with an archetype-to-sections map. Include only the sections that map says belong to the current archetype. +- **Skip empty sections.** A section header with no content under it is noise. Omit it. +- **Fold raw metadata into the body.** Source-system blocks (intake forms, support escalation tables, Zendesk dumps, customer-feedback exports) get their facts extracted and placed in the appropriate template section. Do not preserve the raw block verbatim. The only time an Original Metadata section is appropriate is when a fact is genuinely unclassifiable and has no other home. + +### Step 6: Rewrite the Title + +Read [references/title-guide.md](references/title-guide.md) when you reach this step. It documents the title pattern, the archetype-by-archetype examples, and the character budget. + +### Step 7: Preview and Confirm + +Preview before posting. The preview is the user's only chance to catch a mistake before the work item is overwritten. + +**Render the preview as inline markdown.** The refined description contains its own fenced code blocks (errors, queries, JSON). Wrapping the whole preview in an outer code fence breaks every inner block. Use this exact layout: + +1. A horizontal rule. +2. The new title on its own line, formatted as bold `Title:` followed by the rewritten title text inside an inline-code span. "Title" here means the user-facing label and the value that will land in the `System.Title` field. The literal markdown to emit is shown below. +3. A blank line. +4. 
The full refined description rendered as plain markdown, no outer fence. The agent body converts this markdown to HTML before writing; see `references/azure-html-formatting.md` for the safe-HTML subset and conversion rules. +5. A horizontal rule. +6. A short prompt asking the user to approve or request changes. + +The title line in step 2 looks like this when emitted as markdown: + +```markdown +**Title:** `{the rewritten title}` +``` + +Do not pad the preview with workflow notes (`Archetype:`, `HTML warning:`, `Previous state:`). The preview is what will appear on the work item. If the rewrite carries a real risk of losing HTML-only content (rich attachments, embedded videos, complex tables, in-line mentions, color-tinted callouts), say so in one sentence outside the preview. Do not warn for cosmetic-only losses. + +After the user reviews: + +- **Approved.** Call `wit_update_work_item` once with a JSON Patch document setting `System.Title` and `System.Description`. The patch's `System.Description` value is HTML; convert the markdown rewrite to HTML using the rules in `references/azure-html-formatting.md`. Never send raw markdown into `System.Description`: the AzDO API stores it verbatim and renders the literal markdown characters on the work item. +- **Changes requested.** Revise and re-preview. Do not call `wit_update_work_item` until the user explicitly approves. + +### Step 8: Post a Next-Steps Comment + +Skip this step unless the user asks for it. + +When asked, post via `wit_add_work_item_comment`. Comments in Azure DevOps are stored as HTML; convert your draft from markdown to safe HTML using the rules in `references/azure-html-formatting.md` before submitting. + +Build the comment with these elements: + +- An `<h3>` heading whose text is `Next Steps (YYYY-MM-DD)`. Substitute today's date in `YYYY-MM-DD` form. Use parentheses for the date so the heading does not need a separator. +- An `<ol>` list containing one `<li>` per action. +- Every action names an owner or team (as plain text, not an `@mention`, unless the user explicitly approved tagging that account). +- Use concrete verbs. Write `Verify token rotation schedule with Platform team` rather than `Look into auth`. + +The skeleton below is the structure to build. Substitute the real date for `YYYY-MM-DD`. + +```html +<h3>Next Steps (YYYY-MM-DD)</h3> +<ol> +  <li>Verify token rotation schedule with Platform team.</li> +  <li>Page on-call if customer impact persists past 14:00 UTC.</li> +</ol> 
    +``` + +Comment body actions belong here, not in the description. The description records what is known and unknown. The comment records what happens next. + +--- + +## Anti-Patterns + +These are hard rules. Each one prevents a failure mode that has been observed on real refinements. + +1. **Never present unverified analysis as confirmed root cause.** Frame it as a working hypothesis in the Working Hypotheses section, in plain prose. No disclaimer block, no warning panel. +2. **Never inject solutions that are not in the original work item.** If the original contains evidence pointing toward a fix, surface the evidence. Drop the solution. +3. **Never add roadmap or tech-debt suggestions.** A work-item description is a record. It is not a forum for "we should also...". +4. **Never replace the description with less information.** The refined version is a strict superset of the original. The amount of unique information cannot decrease. +5. **Never lose investigation artifacts.** Customer IDs, log links, query results, error strings, attachment URLs, screenshot embeds, video links: every one survives. +6. **Never emit raw markdown into `System.Description`.** The API stores HTML; it does not run a markdown-to-HTML conversion. Convert before writing. +7. **Never inject `