diff --git a/pstack/.cursor-plugin/plugin.json b/pstack/.cursor-plugin/plugin.json index f1e61ce..7f525c0 100644 --- a/pstack/.cursor-plugin/plugin.json +++ b/pstack/.cursor-plugin/plugin.json @@ -1,7 +1,7 @@ { "name": "pstack", "displayName": "pstack", - "version": "0.9.2", + "version": "0.10.0", "description": "if you want to go fast, go deep first. pstack helps you write less, but higher quality code. rigorous agent workflows you can parallelize with confidence.", "author": { "name": "Lauren Tan" diff --git a/pstack/README.md b/pstack/README.md index 19ebb58..2889492 100644 --- a/pstack/README.md +++ b/pstack/README.md @@ -26,6 +26,12 @@ type `/automate-me`. it mines your recent transcripts, drafts a `-mod models are configurable too. type `/setup-pstack`. it detects the models you have access to and writes a small always-applied rule mapping each role (code, judgment, the review panels) to a model. every skill reads it and falls back to sensible defaults when the rule is absent, so you override only what you want. +## automations + +pstack also ships a dormant [benny automation pack](./automations/benny/). benny triages slack issue reports, then reproduces and fixes confirmed bugs with real ui evidence. its files are not registered as slash skills. + +to set it up, point cursor at [`FOR_AGENTS.md`](./automations/benny/FOR_AGENTS.md). setup copies the pack into the target repository at `.cursor/automations/benny/`, enables pstack there for shared skills, and keeps user configuration outside the copied pack. + ## usage use `/poteto-mode` at the start of a task. it reads your request, picks from a set of playbooks, and runs the other skills as the steps need them. 
diff --git a/pstack/automations/benny/FOR_AGENTS.md b/pstack/automations/benny/FOR_AGENTS.md new file mode 100644 index 0000000..e732156 --- /dev/null +++ b/pstack/automations/benny/FOR_AGENTS.md @@ -0,0 +1,89 @@ +# benny automation intent + +## what i want to automate + +i want two cursor automations that work together in one slack issue channel. + +### automation 1: triage issue reports + +- trigger: when someone posts a new top-level report in my configured source slack channel, i want this automation to start on that report and keep its original thread coordinates. +- behavior: i want it to read the thread and attachments, classify the report as a bug or performance issue, feature request, question or feedback, or reroute, and trace the likely owning layer before routing. +- tracker: i want it to search my configured tracker for duplicates, update a confident duplicate, and create a ticket only for a clear net-new bug. +- tools: i want slack thread read and reply access, my configured tracker integration, and my optional routing map. +- outcome: i want exactly one reply in the source thread with a short verdict and `[benny:bug]`, `[benny:performance]`, or `[benny:other]`. a bug or performance marker may include the tracker url. +- boundary: i never want this automation to post a root message in the source channel. + +### automation 2: reproduce and fix confirmed bugs + +- trigger: i want this automation to start from the same new top-level report, or another supported trigger chosen during setup, then wait for the trusted triage marker in the original thread. +- gates: i want it to stop when someone clearly owns the fix. if an existing pull request or merged commit may fix the report, i want verification instead of a competing change. +- behavior: i want it to use my configured control adapter and feature map, reproduce the exact symptom twice through the real ui, and capture screenshots, video, and a read-only state cross-check. 
+- fix: i want it to verify existing pull requests without authoring over them. after a confirmed repro, it may attempt one bounded root-cause fix, use tdd when the test is cheap, smoke the blast radius, and open a draft pull request only when before-and-after proof passes. +- tools: i want slack thread read and reply access, repository and history access, draft pull request creation, my configured tracker, and my control adapter. +- outcome: i want evidence and a verified result in the source or optional operations threads, plus an optional draft pull request. updates should be concise. +- boundary: i never want this automation to post a root message in the source channel. + +### shared rules + +- i want the source channel and root thread coordinates to stay immutable for the whole run. +- i treat utility and debug bots as evidence, not delegation or fix ownership. +- i allow subagents to help, but they cannot post to slack or receive slack credentials. +- i want this entire pack committed at `.cursor/automations/benny/` in the target repository. its `SKILL.md` files are direct automation instructions, not registered plugin skills. +- i want pstack enabled through the target repository's committed `.cursor/settings.json` only for shared dependencies such as `how`, `why`, `tdd`, `unslop`, and the required principle skills. +- i want each live automation prompt to read its committed operational file directly. i do not want plugin cache paths, copied excerpts, or slash-skill discovery. +- i keep user-owned configuration, feature maps, routing maps, and secrets outside `.cursor/automations/benny/` so pack refreshes cannot overwrite them. +- i want both automations to fail closed when channel coordinates, tracker access, the control adapter, or the feature map are missing or uncertain. +- i want draft pull requests only. do not merge or deploy. 
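the fail-closed rule above can be sketched as a small preflight check. this is a minimal illustration, not the pack's implementation. `slack.triage_identity_user_id`, `control.skill_name`, and `control.feature_map_path` are key names the pack uses; `slack.source_channel_id` and `tracker.kind` are hypothetical names chosen for the example, as are the `FailClosed`, `lookup`, `missing_keys`, and `require_complete` helpers.

```python
from typing import Any, List, Mapping

# Dotted keys the run cannot proceed without.
# `slack.source_channel_id` and `tracker.kind` are hypothetical names;
# the other three appear in the pack's own documentation.
REQUIRED_KEYS = (
    "slack.source_channel_id",
    "slack.triage_identity_user_id",
    "tracker.kind",
    "control.skill_name",
    "control.feature_map_path",
)


class FailClosed(Exception):
    """Raised when required configuration is missing or blank."""


def lookup(config: Mapping[str, Any], dotted: str) -> Any:
    """Walk a nested mapping by dotted key; return None when any part is absent."""
    node: Any = config
    for part in dotted.split("."):
        if not isinstance(node, Mapping) or part not in node:
            return None
        node = node[part]
    return node


def missing_keys(config: Mapping[str, Any]) -> List[str]:
    """List required keys that are absent, None, or whitespace-only strings."""
    missing = []
    for key in REQUIRED_KEYS:
        value = lookup(config, key)
        if value is None or (isinstance(value, str) and not value.strip()):
            missing.append(key)
    return missing


def require_complete(config: Mapping[str, Any]) -> None:
    """Stop the run before any work starts when configuration is incomplete."""
    missing = missing_keys(config)
    if missing:
        raise FailClosed("missing or blank: " + ", ".join(missing))
```

a caller would run `require_complete(config)` before freezing coordinates or delegating, so an incomplete setup ends the run instead of degrading into a guess.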
+
+### my configuration
+
+- source slack channel: ``
+- optional operations channel: ``
+- repository and default branch: ``, ``
+- tracker: ``
+- routing map: ``
+- triage identity: ``
+- control skill: ``
+- feature map: ``
+- models: ``
+- status emoji strings: ``
+- budgets: ``
+- optional bot token capability: ``
+
+start from [`configuration.example.yaml`](./templates/configuration.example.yaml) and [`feature-map.example.md`](./skills/reproduce-and-fix-issues/references/feature-map.example.md). copy and fill them outside this pack, for example under `.cursor/benny/`. keep secret values in a secret manager or environment.
+
+## for the agent
+
+the human enters setup by pointing cursor at this file. do not look for or invoke a discovered benny slash skill.
+
+1. ask which repository will run the automations.
+2. treat the directory containing this `FOR_AGENTS.md` as the source pack.
+3. merge the entire source pack into `/.cursor/automations/benny/`.
+4. preserve every destination-only file. never delete unrelated files or overwrite user-owned configuration, feature maps, or routing maps.
+5. when an existing destination file at a source-managed path differs, review the diff and merge without discarding local edits. if ownership is ambiguous, stop and ask before replacing it.
+6. verify that the copied `FOR_AGENTS.md` and `skills/setup-benny/SKILL.md` exist in the target repository.
+7. read and follow `.cursor/automations/benny/skills/setup-benny/SKILL.md` directly from the target repository.
+
+i want you to merge this entry into the target repository's `.cursor/settings.json`:
+
+```json
+{
+  "plugins": {
+    "pstack": { "enabled": true }
+  }
+}
+```
+
+preserve every unrelated setting and plugin. preserve comments and valid jsonc syntax when the existing file uses jsonc.
+
+i want verification from a fresh agent rooted in the target repository.
confirm that pstack's `how`, `why`, `tdd`, `unslop`, and the principle skills used by benny resolve in project scope. do not count skills loaded from the current session or a user-scoped install. + +if project-scoped plugins are unavailable or any shared dependency does not resolve, stop and explain what failed. do not add `.cursor/automations/benny/skills/` to a plugin manifest or expect its files to appear in the slash-skill list. + +tell me that `.cursor/settings.json`, `.cursor/automations/benny/`, and any referenced secret-free configuration must be committed before either automation is enabled. do not create or update an automation until i explicitly ask. + +for first-time creation, use built-in `/automate` once for triage and once for repro and fix. complete the draft review, approval, readiness check, and Automations editor handoff for the first automation before starting the second. + +paraphrase this intent and the finished configuration into each draft. the triage prompt must read and follow `.cursor/automations/benny/skills/triage-issue-reports/SKILL.md`. the repro prompt must read and follow `.cursor/automations/benny/skills/reproduce-and-fix-issues/SKILL.md`. use these repo-relative paths only after `/automate` confirms they are committed in the repository where the automation will run. + +for existing automations, do not use `/automate` to inspect or update them. validate the configuration, then use the concise field checklist in the copied setup file so i can edit each automation directly in its editor. do not create duplicates. diff --git a/pstack/automations/benny/README.md b/pstack/automations/benny/README.md new file mode 100644 index 0000000..76804c8 --- /dev/null +++ b/pstack/automations/benny/README.md @@ -0,0 +1,23 @@ +# benny + +benny gives you two cursor automations for slack issue reports. one triages each report. the other reproduces confirmed bugs and may prepare a small draft fix. 
+
+the files in this directory are dormant setup and automation sources. they do not appear as slash skills.
+
+## set it up
+
+1. point cursor at [`FOR_AGENTS.md`](./FOR_AGENTS.md) and name the target repository.
+2. let setup merge this whole directory into the target at `.cursor/automations/benny/`. it must preserve destination-only files and review conflicts instead of overwriting local edits.
+3. let setup enable pstack in the target repository's `.cursor/settings.json` for shared dependencies:
+
+```json
+{
+  "plugins": {
+    "pstack": { "enabled": true }
+  }
+}
+```
+
+4. keep user-owned configuration outside the copied pack, for example in `.cursor/benny/`. adapt [`configuration.example.yaml`](./templates/configuration.example.yaml) and [`feature-map.example.md`](./skills/reproduce-and-fix-issues/references/feature-map.example.md).
+5. commit `.cursor/settings.json`, `.cursor/automations/benny/`, and any secret-free configuration before enabling either automation.
+6. review each new automation draft or update existing automations in their editors. then send a harmless test report and verify every source-channel post stays in the original thread.
diff --git a/pstack/automations/benny/skills/reproduce-and-fix-issues/SKILL.md b/pstack/automations/benny/skills/reproduce-and-fix-issues/SKILL.md
new file mode 100644
index 0000000..effa5f0
--- /dev/null
+++ b/pstack/automations/benny/skills/reproduce-and-fix-issues/SKILL.md
@@ -0,0 +1,310 @@
+---
+name: reproduce-and-fix-issues
+description: Reproduce triaged Slack bugs through a configured app-control adapter, verify existing fixes, and open a bounded draft pull request only after before-and-after proof. Use only from the configured Benny repro automation.
+disable-model-invocation: true
+---
+
+# Reproduce and fix issues
+
+Wait for a trusted triage marker in the source thread. Reproduce the exact symptom through the target app's real UI. Verify an existing fix when one exists.
Attempt a bounded fix only after a confirmed repro. + +Load the external Benny configuration supplied by the automation. If the config, required actions, control adapter, or completed feature map is missing, fail closed. + +## Hard safety rules + +- Freeze the source channel and root thread coordinates before doing any work. +- Never post a root message in the source channel. +- Preflight the source parent before every source-thread post. +- The coordinator is the only Slack poster. +- Delegated analysis workers are read-only and return findings or media notes. +- A fix-phase code worker may edit only when its environment provably excludes Slack credentials and every Slack write action. Otherwise the coordinator edits. +- Every child prompt must explicitly forbid `SendSlackMessage`, `PostToSlack`, `chat.postMessage`, and all other Slack writes. +- Never give a child a Slack token, posting instructions, source coordinates for posting, or permission to report externally. +- If a child needs Slack write access to run, do not launch it. +- Utility bots are evidence sources. They do not own the fix unless a person explicitly delegated the fix to them. +- The exact discriminating symptom must appear twice through real UI interaction. +- State inspection may confirm an observation. It must not inject or force the symptom. +- No confirmed repro means no authored fix. +- Existing pull requests or commits switch the run to verify mode. Do not author over them. +- Use `github.com` pull request links. +- Keep captures, recordings, logs, and tokens out of source control. +- Use pstack's `principle-guard-the-context-window` for delegated analysis. +- Apply pstack's `principle-sequence-verifiable-units`, `principle-fix-root-causes`, and `principle-prove-it-works` through repro, fix, and verification. + +## 1. Freeze source coordinates + +Before making a work list or delegating: + +1. Require the trigger channel to equal the configured source channel. +2. 
Set `SOURCE_THREAD_TS` to `trigger.thread_ts` when present. Otherwise use `trigger.ts`.
+3. Require a nonempty `SOURCE_THREAD_TS`.
+4. Store `SOURCE_CHANNEL_ID` and `SOURCE_THREAD_TS` as immutable values.
+5. Read the source thread and verify its root has those exact coordinates.
+6. Fetch the source permalink.
+
+Never replace these values with a reply timestamp, operations timestamp, or status-message timestamp.
+
+Before every source-channel post:
+
+1. Read the thread by the immutable coordinates.
+2. Confirm the parent exists, is not deleted, and still belongs to the source channel.
+3. Send only with `channel=SOURCE_CHANNEL_ID` and `thread_ts=SOURCE_THREAD_TS`.
+4. Read the thread again and verify the new message is a reply.
+
+If any check fails, post nothing. Never retry at the root or in a fallback channel.
+
+## 2. Wait for the triage contract
+
+Watch the source thread for the configured verdict budget. Stay silent while waiting.
+
+Accept a verdict only when:
+
+- Its author matches `slack.triage_identity_user_id`.
+- It is a reply under `SOURCE_THREAD_TS`.
+- It contains exactly one configured marker.
+
+Public marker forms:
+
+```text
+[benny:bug]
+[benny:bug] tracker=https://tracker.example/issue/123
+[benny:performance]
+[benny:performance] tracker=https://tracker.example/issue/123
+[benny:other]
+```
+
+Proceed only for `bug` or `performance`. Capture the optional tracker URL. Stop silently for `other`, a missing verdict, an untrusted author, conflicting markers, or a timeout.
+
+This marker replaces private bot identities and free-form verdict matching.
+
+## 3. Apply ownership and fix-artifact gates
+
+Re-read the thread immediately before starting work.
+
+### Someone is explicitly fixing it
+
+Stop when a person clearly claims the fix, gives a concrete implementation plan, or asks another agent to implement, patch, fix, or open a pull request.
+
+Do not treat these as fix ownership:
+
+- A bot summarizes evidence.
+- A tool looks up logs or tickets. +- Someone asks a bot to diagnose, explain, inspect, or reproduce. +- A bot posts a cause hypothesis without agreeing to implement it. + +Judge the requested action, not the presence of a bot. + +### A fix artifact already exists + +If an open pull request or merged commit plausibly fixes this report, switch to `references/verify-existing-fix.md`. + +An artifact may come from the thread, tracker issue, repository history, or pull request search. A claim without a commit or pull request is not a fix artifact. + +If a person owns the work but has not produced an artifact, stop. Do not race them. + +## 4. Open an optional operations thread + +If `slack.operations_channel_id` is configured, the coordinator may create one root status message there. This is the only allowed root post in the repro workflow. + +Store its coordinates as `OPERATIONS_CHANNEL_ID` and `OPERATIONS_THREAD_TS`. Never confuse them with the source coordinates. + +Use the configured plain Unicode status strings. Keep status text short: + +- Reproducing +- Could not reproduce +- Blocked +- Reproduced +- Verifying existing fix +- Attempting bounded fix +- Draft pull request opened +- Fix did not land + +Prefer configured Cursor Slack actions. Use `BENNY_SLACK_BOT_TOKEN` only when the user configured it for a narrow missing capability such as editing this one status message. Never expose the token to a worker. + +If no operations channel is configured, keep detailed status in the automation run output. Do not substitute a source-channel root message. + +## 5. Load and check the control adapter + +Read `references/control-adapter.md` and the completed map at `control.feature_map_path`, then invoke the skill named by `control.skill_name`. + +Find the feature-map section that matches the reported user path. Read it before driving the app. If no section covers the feature, mark the run blocked instead of inventing a path or selector. 
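The lookup rule above (read the matching feature-map section, and block when none covers the feature) can be sketched as follows. This is a minimal illustration that assumes the completed map keeps one `###` heading per feature, as the example map does. The helper names `split_sections`, `find_section`, and `status_for` are hypothetical, and matching by exact case-insensitive heading is an assumption, not a rule stated by the pack.

```python
import re
from typing import Dict, List, Optional

FEATURE_HEADING = re.compile(r"^###\s+(.+?)\s*$")   # exactly three hashes
HIGHER_HEADING = re.compile(r"^#{1,2}\s")           # "# " or "## " closes a feature


def split_sections(markdown: str) -> Dict[str, str]:
    """Map each lowercased `###` heading to the body text beneath it."""
    bodies: Dict[str, List[str]] = {}
    current: Optional[str] = None
    for line in markdown.splitlines():
        match = FEATURE_HEADING.match(line)
        if match:
            current = match.group(1).strip("`").strip().lower()
            bodies[current] = []
        elif HIGHER_HEADING.match(line):
            current = None  # left the feature; do not absorb unrelated text
        elif current is not None:
            bodies[current].append(line)  # includes `####` subsections
    return {name: "\n".join(lines).strip() for name, lines in bodies.items()}


def find_section(markdown: str, feature: str) -> Optional[str]:
    """Return the section body, or None when absent or empty."""
    body = split_sections(markdown).get(feature.strip().lower())
    return body or None


def status_for(markdown: str, feature: str) -> str:
    """Pick the status string: never invent a path for an unmapped feature."""
    return "Reproducing" if find_section(markdown, feature) else "Blocked"
```

An empty section is treated the same as a missing one, which keeps the check on the fail-closed side.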
+
+Require all seven capabilities:
+
+1. Bring up the configured target app and test environment.
+2. Navigate the mapped feature and exercise its documented states.
+3. Drive the real UI with clicks, typing, keys, scrolling, drag, resize, or navigation.
+4. Inspect state without mutating it.
+5. Capture screenshots.
+6. Start and stop a screen recording.
+7. Clean up processes, sessions, profiles, and temporary data.
+
+If the adapter is absent or any required capability is missing, mark the operations status as blocked and stop. Do not pretend a screenshot, unit test, state mutation, or source reading is a UI repro.
+
+## 6. Study the report
+
+Read the full source thread and tracker issue when present.
+
+Collect:
+
+- Exact action path
+- Expected behavior
+- Observed behavior
+- Discriminating state where they diverge
+- Frequency
+- Version, environment, and platform
+- Attachments and error signatures
+- Candidate code area
+
+Inspect screenshots and video. Use read-only parallel workers for code history, test ideas, blast-radius mapping, and media review when useful. Each worker gets a narrow question and the Slack-write prohibition.
+
+Use pstack's `how` skill to trace the action through the repository. Use `why` for regression history and defensive code. Form competing cause hypotheses and identify evidence that would separate them.
+
+## 7. Reproduce
+
+Bring up the target app through the control adapter.
+
+Confirm the correct app, workspace, account, data set, and feature state before acting. Use stable app markers. Do not rely on window order or a familiar title alone.
+
+Drive the reported path through real UI actions.
+
+Before calling it reproduced:
+
+1. Name the correct final state.
+2. Name the broken final state.
+3. Reach the point where they diverge.
+4. Observe the broken state.
+5. Reset enough state to make the second attempt independent.
+6. Repeat the same path and observe the same broken state again.
+7.
Cross-check a real state value when possible. + +An expected dialog, loading state, or setup step is not the bug. Capture the final state that distinguishes correct from broken behavior. + +Use the configured repro budget. If the symptom does not reproduce within it, report a clean `Could not reproduce` outcome. If the environment cannot provide a required capability, report `Blocked` and state what was missing. + +## 8. Capture and review evidence + +For a successful repro: + +- Record the full path through the symptom. +- Capture a screenshot of the broken final state. +- Save a short note with the exact steps and observed state. +- Keep artifacts in the configured temporary artifact directory. + +Have a read-only media reviewer answer one question: does the evidence visibly show the discriminating broken state? + +If the answer is no or uncertain, the repro is not confirmed. Capture better evidence or use `Could not reproduce`. + +Post detailed evidence only in the operations thread when configured. Keep the source update concise. + +## 9. Report the repro outcome + +Update the operations status first. + +For `Could not reproduce` or `Blocked`, post nothing in the source thread. The operations thread or run output carries the result. + +For a confirmed repro, run the source preflight and post at most one unprompted source reply: + +- Say the issue reproduced. +- Link the operations evidence thread when one exists. +- Include at most three short findings. +- Link the tracker issue when one exists. +- Do not ping an owner by default. + +Attach evidence only when the configured Slack action keeps it inside the same source thread and the organization's retention policy allows it. + +Wait for the configured rejection window. If a person shows that the setup or interpretation was wrong, correct the repro once. Do not start the fix phase until the window closes without a valid rejection. + +## 10. 
Verify an existing fix + +When a fix artifact exists, follow `references/verify-existing-fix.md`. + +Verification must show the symptom on the baseline and its absence on the patched build. Both paths use the real UI twice. + +Do not edit the existing fix, add a competing patch, or open a replacement pull request. + +## 11. Qualify a bounded fix + +Attempt a fix only when all of these hold: + +- The outcome is a plain confirmed repro. +- Media review confirmed the broken final state. +- No existing fix artifact appeared. +- No person claimed the fix during the rejection window. +- Runtime evidence identifies the root cause. +- The likely change fits the configured fix budget and repository scope. +- The control adapter can run both baseline and patched builds. + +If any condition fails, keep the repro report and stop without a pull request. + +When the gate passes, update operations status to `Attempting bounded fix`. + +## 12. Root-cause and implement + +The coordinator owns every Slack post, the final diff review, commits, and the pull request. + +Read-only workers may: + +- Trace code and history +- Propose tests +- Map blast radius +- Review a diff +- Review media + +They do not edit, run external writes, post status, or own the fix. + +A tightly scoped code edit may be delegated during this phase only when tool isolation removes Slack credentials and every Slack write action from that worker. Its prompt must still carry the explicit Slack-write ban. The coordinator reviews the edit and runs or verifies the required tests. If tool isolation is uncertain, keep the edit in the coordinator. + +Confirm the mechanism with runtime evidence. Eliminate competing hypotheses before editing. + +Fix the root cause with the smallest justified change. + +- Invoke pstack's `tdd` skill when there is a cheap local test target, and write the failing test before the fix. +- State why TDD was skipped when the path is expensive, unclear, or integration-heavy. 
+- Keep unrelated cleanup out. +- Stop if the change grows beyond the configured effort or risk budget. + +## 13. Prove the fix + +Keep the original baseline evidence. + +On the patched build: + +1. Run the same real UI path. +2. Repeat it twice. +3. Show that the broken state is gone. +4. Show the expected state in its place. +5. Capture an after recording and screenshot. +6. Cross-check the same real state value used for the baseline. + +A compile, unit test, code review, or plausible diff is not after evidence. + +Run focused tests, then smoke the blast radius around the changed behavior. Cover nearby states, inputs, permissions, platforms, and failure paths that the change could affect. Stop without a pull request if a regression remains. + +## 14. Open a draft pull request + +Only after before-and-after proof: + +- Review the final diff for unrelated changes and secrets. +- Run the repository's required checks. +- Create small ordered commits when the repository workflow allows it. +- Open a draft pull request. Never merge or deploy from this workflow. +- Link the configured tracker issue using the tracker's supported pull request syntax. +- Use the configured public URL form, normally `https://github.com/{owner}/{repo}/pull/{number}`. +- Include the repro steps, root cause, test result, before and after evidence, and blast-radius checks. +- Run the pull request text and all Slack updates through pstack's `unslop` skill. + +If pull request creation fails, do not claim success. Keep the commit or branch state in the run output and mark operations status `Fix did not land`. + +On success, mark operations status `Draft pull request opened` and post one concise reply in the operations thread with the linked pull request. Do not create a second source-channel root or unprompted source reply. + +## 15. Follow-ups and cleanup + +Watch the configured operations thread for one follow-up window. + +- Answer a direct question from evidence already gathered. 
+- Apply one concrete correction and rerun the repro once when it invalidates the setup. +- Stay out of human coordination and side chatter. +- Stop when asked. + +Always call the control adapter's cleanup capability. Keep artifacts only as long as the configured retention policy allows. diff --git a/pstack/automations/benny/skills/reproduce-and-fix-issues/references/control-adapter.md b/pstack/automations/benny/skills/reproduce-and-fix-issues/references/control-adapter.md new file mode 100644 index 0000000..3399491 --- /dev/null +++ b/pstack/automations/benny/skills/reproduce-and-fix-issues/references/control-adapter.md @@ -0,0 +1,169 @@ +# Control-adapter contract + +Benny does not know how to start or drive every app. The user must configure one control skill or adapter that implements this contract for the target app. + +Set its skill name in `control.skill_name`. + +Set the completed user-facing feature map path in `control.feature_map_path`. Copy and fill [`feature-map.example.md`](./feature-map.example.md) outside `.cursor/automations/benny/` instead of editing the copied example. + +If the skill, feature map, or a required capability is absent, ambiguous, or incomplete, repro and fix work must fail closed. + +## Required capabilities + +### Bring up + +Start the requested app revision in the requested test environment. + +Input: + +- Repository and revision +- Build or start mode +- Workspace, account, fixture, and feature-state requirements +- Artifact directory +- Completed feature-map path + +Return: + +- Session identifier +- How the adapter confirmed the correct app and environment +- Stable app markers +- Running process or target details needed by later calls +- Any missing capability + +The adapter must distinguish the target app from a similar window, shell, or production instance. 
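The bring-up inputs and return values listed above could be modeled as plain data types. This is a minimal sketch under stated assumptions: the contract names the information that must flow in and out, not a concrete schema, so every class and field name here (`BringUpRequest`, `BringUpResult`, `is_ready`, and their fields) is hypothetical.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass(frozen=True)
class BringUpRequest:
    """What the coordinator hands to the adapter (hypothetical field names)."""
    repository: str
    revision: str
    start_mode: str                # build or start mode
    requirements: List[str]        # workspace, account, fixture, feature state
    artifact_dir: str
    feature_map_path: str


@dataclass(frozen=True)
class BringUpResult:
    """What the adapter reports back (hypothetical field names)."""
    session_id: str
    confirmation: str              # how the correct app and environment were confirmed
    app_markers: List[str]         # stable markers that identify the target app
    target_details: Dict[str, str] = field(default_factory=dict)
    missing_capabilities: List[str] = field(default_factory=list)


def is_ready(result: BringUpResult) -> bool:
    """Fail closed: require a session, at least one marker, and no gaps."""
    return (
        bool(result.session_id)
        and bool(result.app_markers)
        and not result.missing_capabilities
    )
```

Requiring at least one stable marker before `is_ready` returns true mirrors the rule that the adapter must tell the target app apart from a similar window, shell, or production instance.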
+ +### Drive UI + +Perform real user actions: + +- Click +- Type +- Press keys +- Scroll +- Drag +- Resize +- Navigate through app controls + +Prefer roles, labels, and stable selectors. Use coordinates only after a fresh screenshot. + +Return each action and the observed state change. + +Do not set internal state, call hidden app methods, write directly to storage, or inject DOM changes to create the symptom. + +### Drive mapped features and states + +Read the relevant feature-map section before driving the app. + +The adapter must expose ways to: + +- Navigate every mapped feature through the user-visible path. +- Invoke the adapter action names listed for that feature. +- Interact with default, hover, focus-visible, active, disabled, loading, empty, error, selected, open, expanded, and feature-specific states when they apply. +- Arrange a state through safe fixture data, permissions, flags, service responses, or supported test controls. +- Reset the feature for a second independent repro attempt. +- Capture the screenshot, video, and read-only cross-check named by the feature map. + +Use roles, accessible names, ARIA relationships, stable component markers, and purpose-named data attributes. Never use generated CSS or StyleX classes, dynamic hashes, child indexes, or brittle DOM position. + +Arranging a precondition is not permission to inject the reported symptom. The repro itself must still come from real user interaction. + +### Inspect state + +Read state to confirm what the UI shows. + +Examples: + +- Accessibility tree +- DOM or view hierarchy +- Process state +- Local logs +- Network request status +- App-exposed debug state + +Inspection is read-only. If a query changes state, it belongs in `drive UI` and must represent a real user action. + +### Screenshot + +Capture the current app state to a requested path. 
+ +Return: + +- File path +- Capture time +- App marker or window title +- Short description of what should be visible + +The screenshot must show enough app chrome to prove that the correct app is under test. + +### Recording + +Start and stop a screen recording around the full repro path. + +Return: + +- File path +- Start and stop times +- Captured window or region +- Whether audio or sensitive overlays were omitted + +The recording must show the discriminating final state, not only setup or a loading screen. + +### Cleanup + +Stop processes and sessions created by the adapter. + +Remove disposable: + +- Browser or app profiles +- Temporary workspaces +- Test accounts or fixtures when the adapter created them +- Debug ports and tunnels +- Captures past their retention window + +Return what was stopped, removed, retained, or left for a person. + +Cleanup must not delete user work. + +## Adapter behavior + +The adapter must: + +- Report capabilities before the repro starts. +- Report which feature-map sections it can drive and which are blocked. +- Use the same environment inputs for baseline and patched builds. +- Surface startup failures as failures. +- Bound retries. +- Keep secrets out of logs and artifacts. +- Keep captures outside the repository. +- Support a fresh or reset state between the two repro attempts. +- Avoid production changes unless the user explicitly configured a safe test action. + +## Environment translation + +Before declaring an environment block, restate the defect without platform-specific nouns and ask whether the same behavior can be tested safely in the available environment. + +Examples: + +- A named browser may mean any external browser. +- A named key may mean the configured shortcut. +- A named remote host may mean a delayed or disconnected remote target. + +Use a translated attempt only when it tests the same underlying behavior. Label it as translated evidence. 
Do not call it an exact repro when the missing environment is part of the defect. + +Hardware prompts, operating-system permission dialogs, device-only APIs, and unavailable account states may be real blocks. + +## Setup check + +Before enabling the repro automation, run one harmless adapter check: + +1. Bring up the app. +2. Confirm the stable app marker. +3. Load one completed feature-map section. +4. Navigate to that feature through its user path. +5. Exercise one disposable state through mapped adapter actions. +6. Inspect the resulting state. +7. Capture a screenshot. +8. Record a short clip. +9. Clean up. + +Enable repro work only when all nine steps succeed and no source-channel Slack post is involved. diff --git a/pstack/automations/benny/skills/reproduce-and-fix-issues/references/feature-map.example.md b/pstack/automations/benny/skills/reproduce-and-fix-issues/references/feature-map.example.md new file mode 100644 index 0000000..620dded --- /dev/null +++ b/pstack/automations/benny/skills/reproduce-and-fix-issues/references/feature-map.example.md @@ -0,0 +1,205 @@ +# Feature-map example + +Map every user-facing feature Benny may reproduce. Read the relevant section before driving the app. Keep this map at the user point of view. Discover internals and current code paths at runtime instead of freezing them here. + +Copy this file outside `.cursor/automations/benny/`, for example to `.cursor/benny/feature-map.md`, and set `control.feature_map_path` to the copy. Pack refreshes must not overwrite it. + +## Per-feature template + +### `` + +`` + +#### How a user gets there + +- Click path: ` -> -> ` +- Keyboard shortcut: `` + +#### How the control adapter drives it + +- `` with `` should ``. +- Reset: ``. + +#### Stable selectors + +- `` +- `` +- `` + +Never use generated CSS or StyleX classes, dynamic hashes, child indexes, or brittle DOM position. 
+ +#### States to exercise + +- Default, hover, focus-visible, active, disabled +- Loading, empty, error +- Selected, open, expanded +- `` + +Mark states that do not apply. + +#### Preconditions and setup + +- Auth: `` +- Data: `` +- Permissions: `` +- Flags: `` +- Services: `` + +#### Evidence and cross-check + +- Screenshot: `` +- Video: `` +- Cross-check: `` + +#### Gotchas + +- `` +- `` + +## Fictional example + +These features belong to a fictional task app. They are examples, not required Benny features. + +### Sign in + +Lets a user enter the task app. + +#### How a user gets there + +- Open the app and choose `Sign in`. No shortcut. + +#### How the control adapter drives it + +- `open_app`, `click Sign in`, `fill credentials`, and `click Continue` should open the item list. +- Reset by signing out and clearing the disposable session. + +#### Stable selectors + +- Button `Sign in`, textboxes `Email` and `Password`, `data-component="sign-in-form"` + +#### States to exercise + +- Default, focus-visible, submitting, disabled, loading, error + +#### Preconditions and setup + +- Disposable account and available authentication service + +#### Evidence and cross-check + +- Record landing page through item list. Check read-only session state. + +#### Gotchas + +- A marketing page is the wrong surface. A missing auth service is a block. + +### Item list and detail + +Lets a user browse items and open one. + +#### How a user gets there + +- Open the `Items` tab, then choose a row. + +#### How the control adapter drives it + +- `select_tab Items` and `click ` should open its detail. +- Reset by closing the detail and clearing selection. 
+ +#### Stable selectors + +- Tab and list named `Items`, fixture-named row, `data-component="item-detail"` + +#### States to exercise + +- Loading, empty, error, selected, open, expanded + +#### Preconditions and setup + +- Named fixture items, read permission, available item service + +#### Evidence and cross-check + +- Show selection and matching detail title. Check selected-item ID. + +#### Gotchas + +- Search results may look similar but use a different path. + +### Item editor + +Lets a user create or edit an item. + +#### How a user gets there + +- Choose `Edit` from detail or `New item` from the list. + +#### How the control adapter drives it + +- `click Edit`, `fill `, and `click Save` should update detail. +- Reset by restoring the fixture. + +#### Stable selectors + +- Buttons `Edit`, `New item`, `Save`, form `Item editor`, label-linked fields + +#### States to exercise + +- Default, focus-visible, dirty, validating, disabled, saving, error, success + +#### Preconditions and setup + +- Editable fixture, write permission, available save service + +#### Evidence and cross-check + +- Show field change through updated detail. Check the stored item value read-only. + +#### Gotchas + +- Do not inject form state. A read-only detail field is not the editor. + +### Settings + +Lets a user change personal preferences. + +#### How a user gets there + +- Open the profile menu, then choose `Settings`. + +#### How the control adapter drives it + +- `open_menu Profile`, `click Settings`, and `toggle ` should update the control. +- Reset by restoring the starting preference. 
+ +#### Stable selectors + +- Button `Profile`, menu item `Settings`, region `Settings`, purpose-named preference attribute + +#### States to exercise + +- Closed, open, selected, focus-visible, disabled, loading, error + +#### Preconditions and setup + +- Signed-in test account, known preferences, available preference service + +#### Evidence and cross-check + +- Show the menu path and final control state. Check the preference value read-only. + +#### Gotchas + +- Operating-system settings are a different surface. + +## Completeness checklist + +- Every reproducible user-facing feature has a section. +- Every section names a user path, adapter actions, and reset. +- Selectors use roles, names, ARIA, stable component markers, or purpose-named attributes. +- No selector uses generated classes or DOM position. +- Relevant interaction, loading, empty, error, selected, and expanded states are covered. +- Auth, fixtures, permissions, flags, and services are explicit. +- Screenshot, video, and underlying cross-check requirements are explicit. +- Wrong surfaces, dead ends, and safe environment translations are listed. +- Implementation details remain runtime discoveries. diff --git a/pstack/automations/benny/skills/reproduce-and-fix-issues/references/verify-existing-fix.md b/pstack/automations/benny/skills/reproduce-and-fix-issues/references/verify-existing-fix.md new file mode 100644 index 0000000..e3cda3c --- /dev/null +++ b/pstack/automations/benny/skills/reproduce-and-fix-issues/references/verify-existing-fix.md @@ -0,0 +1,93 @@ +# Verify an existing fix + +Use this mode when an open pull request or merged commit plausibly fixes the report. + +The existing artifact owns the fix. Verify it. Do not edit it, author a competing patch, or open another pull request. 
+ +## Qualify the artifact + +Require one concrete artifact: + +- An open pull request with code changes that address the symptom +- A merged pull request +- A merged commit with matching code and intent + +A thread claim, tracker status, branch name, or cause hypothesis without a pull request or commit is not enough. + +When several artifacts exist, choose the one linked from the source thread or tracker. Otherwise choose the closest match to the affected code and state why. + +## Protect the working tree + +Use an isolated worktree or another clean checkout when the repository supports it. Do not overwrite user changes. + +Record: + +- Baseline revision +- Patched revision +- Pull request or commit URL +- Build and environment inputs shared by both runs + +Use regular `github.com` pull request links. + +## Measure the baseline + +For an open pull request, use its base branch as the baseline. + +For a merged fix, use the revision immediately before the fix when that revision builds and represents the old behavior. + +Through the configured control adapter: + +1. Bring up the baseline app. +2. Confirm the correct app and environment. +3. Run the reported path through real UI actions. +4. Observe the discriminating symptom. +5. Reset and repeat it. +6. Capture baseline recording, screenshot, and state check. + +If the symptom does not appear twice on the baseline, there is no baseline. Do not claim that the fix works. + +## Measure the patched build + +Build and run the pull request or fix commit with the same environment and data. + +1. Run the same UI path. +2. Repeat it twice. +3. Confirm that the broken state is gone. +4. Confirm the expected state appears. +5. Capture after recording, screenshot, and the same state check. + +Do not stop at compilation or tests. The after result must come from a running patched app. + +## Outcomes + +### Confirmed + +The baseline reproduces twice and the patched build resolves it twice. + +- Mark operations status as verified. 
+- Link the artifact. +- Post one concise source-thread reply after the source preflight. +- Include the before and after result. +- Open no pull request. + +### Insufficient fix + +The symptom appears on both baseline and patched builds. + +- Mark operations status as reproduced but not fixed. +- Link the artifact and say it did not resolve the symptom. +- Post the normal confirmed-repro source update if the run has not already used it. +- Open no competing pull request. + +### Inconclusive + +The baseline does not reproduce, the patched app cannot run, or the evidence does not show the discriminating state. + +- Do not claim success. +- State which half could not be measured. +- Keep the result in the operations thread or run output. +- Post nothing in the source thread unless a direct question requires an answer. + +## Cleanup + +Stop both builds, remove temporary profiles and captures according to retention policy, and return the repository to its prior state without discarding user work. diff --git a/pstack/automations/benny/skills/setup-benny/SKILL.md b/pstack/automations/benny/skills/setup-benny/SKILL.md new file mode 100644 index 0000000..b05c26f --- /dev/null +++ b/pstack/automations/benny/skills/setup-benny/SKILL.md @@ -0,0 +1,266 @@ +--- +name: setup-benny +description: Configure Benny and prepare its triage and repro automations. Use when installing Benny or changing its Slack, tracker, repository, routing, control, model, or budget settings. +disable-model-invocation: true +--- + +# Set up Benny + +Benny ships as a dormant automation pack inside pstack. The plugin manifest exposes only pstack's normal skill root; this file and the two operational files are not slash skills. + +The human enters setup by pointing Cursor at the pack's `FOR_AGENTS.md`. The bootstrap flow copies the whole pack into the target repository, then reads this file directly at `.cursor/automations/benny/skills/setup-benny/SKILL.md`. 
+ +Benny needs external configuration and two live Cursor automations. + +Do not create or update an automation until the user explicitly asks. Never put a secret value in plugin files, prompts, or committed configuration. + +## 1. Copy the pack and enable shared pstack skills + +Do this before asking for Benny configuration and before invoking the built-in `/automate` skill. + +Ask which repository will run the automations. The source pack is the directory containing `FOR_AGENTS.md`. The destination is `/.cursor/automations/benny/`. + +Merge the entire source pack into the destination: + +1. Create the destination when it is absent. +2. Copy every source file to the same relative path. +3. Preserve destination-only files. Never delete unrelated files during install or refresh. +4. Keep user-owned configuration, feature maps, and routing maps outside the destination. Never overwrite them. +5. When an existing source-managed file differs, inspect the diff and merge without discarding local edits. If ownership is ambiguous, stop and ask before replacing it. +6. Verify that the destination contains `FOR_AGENTS.md`, this setup file, both operational files, their references, and the templates. + +If this file is already being read from the target destination, treat the copy as complete and run the same verification before continuing. + +Add pstack to the target repository's `.cursor/settings.json`. If the file or `.cursor` directory does not exist, create it. + +Merge this entry into the existing JSON or JSONC: + +```json +{ + "plugins": { + "pstack": { "enabled": true } + } +} +``` + +Preserve every unrelated top-level setting and every other plugin entry. If `plugins.pstack` already exists, change only its `enabled` value. Preserve comments and valid JSONC syntax when the file uses JSONC. Validate the file after editing it. + +Reload the target project or start a fresh agent rooted there. 
Verify that these shared pstack skills resolve from project scope: + +- `how` +- `why` +- `tdd` +- `unslop` +- `principle-separate-before-serializing-shared-state` +- `principle-minimize-reader-load` +- `principle-guard-the-context-window` +- `principle-sequence-verifiable-units` +- `principle-fix-root-causes` +- `principle-prove-it-works` + +Do not count a skill loaded from the current session or a user-scoped plugin. The check must show that a fresh agent in the target repository receives pstack through project settings. + +If project-scoped plugin installation is unavailable or any shared dependency does not resolve, stop and explain the failure. + +The Benny files are read directly from `.cursor/automations/benny/`. Do not add that directory to a plugin manifest or expect its `SKILL.md` files to appear in the slash-skill list. + +Tell the user that `.cursor/settings.json`, `.cursor/automations/benny/`, and any referenced secret-free configuration must be committed before either automation is enabled. Do not commit them unless the user asks. + +Once this check passes, live automation prompts may read the committed operational files by their stable repository-relative paths. They must not embed a plugin cache path or copy the file contents. + +## 2. Adapt the configuration + +Open these copied examples: + +- `../../templates/configuration.example.yaml` +- `../reproduce-and-fix-issues/references/feature-map.example.md` + +Create user-owned copies outside `.cursor/automations/benny/`. These are configuration files, not pack files. Example locations: + +- Project config, such as `.cursor/benny/configuration.yaml` +- Project feature map, such as `.cursor/benny/feature-map.md` +- Project routing map, such as `.cursor/benny/routing.md` +- User config, such as `~/.config/benny/configuration.yaml` +- User feature map, such as `~/.config/benny/feature-map.md` + +Fill one feature-map section for every user-facing feature the automation may reproduce. 
Keep it at the user point of view. Do not freeze implementation details or current code paths in the map. + +Do not edit the copied examples. Pack refreshes may update source-managed files after conflict review, but they must never touch the user-owned copies. + +Prefer committed, secret-free files in the target repository when a fresh automation checkout must read them. Otherwise paraphrase the required values into the live prompt. Reference a repository file only after the built-in `/automate` skill confirms that the file is committed in the repository where the automation runs. + +Use stable repository-relative paths for committed pack and configuration files. Never reference the plugin source directory or a plugin cache path from a live automation. + +## 3. Fill the required choices + +Ask for or confirm: + +- Source Slack channel ID +- Optional operations or status channel ID +- Repository URL and default branch +- Triage identity or Slack user ID +- Issue tracker type, team, project, labels, and intake status +- Tracker adapter skill or MCP actions +- Optional routing map path +- Required control skill name +- Required user-facing feature-map path +- Status emoji strings +- Pull request URL format +- Polling and effort budgets +- Model slug for triage, repro, code work, and media review + +Use only model slugs shown as available in the user's Cursor model picker or supported model list. Do not guess a slug and do not carry over a private default. + +The source channel, triage identity, repository, tracker adapter, control skill, and feature map must be explicit. Fail setup if any required value stays ambiguous. + +Use pstack's `unslop` skill on the final automation names, descriptions, and prompt shims before saving them. + +## 4. 
Check integration capabilities + +The triage automation needs: + +- Read access to the configured source Slack channel and its threads +- Thread-reply access in that channel +- Attachment metadata and file download access when reports include media +- Search, read, create, and update access through the configured issue-tracker adapter + +The repro automation needs: + +- Read access to the source thread +- Thread-reply access in the source channel +- Optional post and edit access in the configured operations channel +- Repository read and history access +- A pull request action that can open a draft pull request +- The configured control-adapter skill + +Prefer configured Cursor Slack actions for reads and posts. The optional `BENNY_SLACK_BOT_TOKEN` may fill a narrow gap such as editing one operations status message or downloading an attachment. Store the value in a secret manager or environment, not in YAML. + +Do not use undocumented integration endpoints. + +## 5. Prepare the routing map + +If the user wants reroutes or owner pings: + +1. Copy `../triage-issue-reports/references/routing.example.md` outside `.cursor/automations/benny/`. +2. Replace every placeholder with public or organization-local values. +3. Keep owner pings off by default. +4. Allow a ping only for a configured feature owner or a confirmed likely regression author. + +If no routing map is configured, triage may classify a report but must not guess a destination or owner. + +## 6. Verify the control adapter + +Read `../reproduce-and-fix-issues/references/control-adapter.md` and the user's completed feature map. + +Confirm that the named skill can: + +- Bring up the target app +- Navigate every mapped feature through the real UI +- Exercise mapped states through declared adapter actions +- Inspect state without forcing the result +- Capture screenshots +- Start and stop a recording +- Clean up its processes and temporary data + +If any capability is missing, leave the repro automation disabled. 
It must fail closed rather than claim a reproduction it did not perform. + +## 7. Prepare the live automations + +Ask whether this is first-time creation or configuration of existing automations. + +Read `../../FOR_AGENTS.md` from the copied pack as the primary user-intent source for either path. Use it to understand the two triggers, tools, instructions, outcomes, and shared rules. + +### First-time creation + +Create one automation at a time. + +For each automation: + +1. Read the matching copied prompt template as secondary internal source material. +2. Turn `FOR_AGENTS.md`, the finished Benny configuration, and the template intent into a complete natural-language request. +3. Tell the live prompt to read and follow its exact committed operational file under `.cursor/automations/benny/`. +4. Use the stable repository-relative path, not a plugin source or cache path. Do not copy the operational file contents into the live prompt. +5. Read and follow the built-in `automate` skill. +6. Let `automate` discover Slack channels, the repository, and connected integrations. +7. Let `automate` confirm that the copied pack and any referenced configuration files are committed in the same repository where the automation will run. +8. Let `automate` show its draft table, obtain approval, ask readiness, and open the Automations editor. +9. Finish the editor handoff for this automation before starting the next one. + +Give `automate` this complete triage intent, filled from configuration: + +- Name `benny-triage`. +- Read and follow `.cursor/automations/benny/skills/triage-issue-reports/SKILL.md` for every run. +- Trigger on each new top-level report in the configured source Slack channel. +- Read the triggering thread and reply only inside it. +- Use the configured issue-tracker integration. +- Classify, inspect evidence, trace cause, dedupe, and create only clear new bugs. 
+- End one thread-only verdict with the configured `[benny:bug]`, `[benny:performance]`, or `[benny:other]` marker and optional tracker URL. +- Never post a source-channel root message. + +After the triage editor handoff is complete, give `automate` this complete repro and fix intent: + +- Name `benny-reproduce`. +- Read and follow `.cursor/automations/benny/skills/reproduce-and-fix-issues/SKILL.md` for every run. +- Trigger on the same new top-level reports in the configured source Slack channel. +- Use the configured repository and default branch. +- Read the source thread and reply only inside it. +- Include pull request creation and the configured tracker, control-adapter, and feature-map requirements. Paraphrase mapped user paths and states unless `automate` confirms an eligible committed file in the same repository. +- Wait for a trusted triage marker before acting. +- Reproduce the exact symptom twice through the mapped real UI and capture evidence. +- Verify an existing fix without authoring over it. +- Attempt an optional bounded fix only after confirmed repro, then open a draft pull request when proof and checks pass. +- Never post a source-channel root message. + +Do not duplicate `automate`'s Slack, repository, integration, completeness, authentication, draft-review, approval, readiness, or editor-handoff work. + +### Existing automations + +The built-in `automate` skill is creation-only. Do not use it to search for, inspect, or update existing automations. + +Finish configuration, routing, control-adapter, and feature-map validation. Then give the user this concise editor checklist. 
+ +For the existing triage automation, update: + +- Name and description +- Direct instruction to read `.cursor/automations/benny/skills/triage-issue-reports/SKILL.md` +- New top-level Slack report trigger and source channel +- Slack thread read and reply capabilities +- Issue-tracker integration +- Paraphrased triage instructions, thread-only rule, and Benny verdict markers + +For the existing repro automation, update: + +- Name and description +- Direct instruction to read `.cursor/automations/benny/skills/reproduce-and-fix-issues/SKILL.md` +- Matching Slack trigger and source channel +- Repository and default branch +- Slack thread read and reply capabilities +- Pull request action +- Tracker, control-adapter, and feature-map requirements +- Paraphrased marker wait, evidence, verification, and bounded-fix instructions + +Ask the user to update each existing automation directly in its Automations editor. Do not create replacements or duplicates. + +### Creation boundary + +Never call a direct automation backend service or backend automation tool. Never use a browser URL that carries draft fields. Never build or open a Cursor protocol deep link. For new automations, the only finish path is the built-in `automate` skill's reviewed Automations editor handoff. + +Do not enable either automation until the thread-safety test passes after the editor save. + +## 8. Test thread safety + +Use a test channel or a harmless test report. + +Before testing, confirm that the target repository's `.cursor/settings.json`, `.cursor/automations/benny/`, and every referenced secret-free configuration file are committed on the branch used by the automation checkout. Confirm that both live prompts point at their exact committed operational files. If any check fails, stop. Tell the user that the automation cannot be enabled yet. + +Verify: + +1. Triage stores the root `thread_ts` and posts exactly one verdict as a reply. +2. The verdict contains one configured marker. +3. 
Repro accepts the marker only from the configured triage identity. +4. Repro keeps the same immutable source coordinates. +5. No source-channel root message appears. +6. A delegated worker cannot use any Slack write action. +7. Missing coordinates, a deleted parent, or a failed preflight produces no post and no tracker issue. + +Enable normal traffic only after all seven checks pass. diff --git a/pstack/automations/benny/skills/triage-issue-reports/SKILL.md b/pstack/automations/benny/skills/triage-issue-reports/SKILL.md new file mode 100644 index 0000000..d691672 --- /dev/null +++ b/pstack/automations/benny/skills/triage-issue-reports/SKILL.md @@ -0,0 +1,240 @@ +--- +name: triage-issue-reports +description: Triage Slack issue reports with one thread-only verdict, evidence review, cause-aware routing, tracker dedupe, and fail-closed ticket creation. Use only from the configured Benny triage automation. +disable-model-invocation: true +--- + +# Triage issue reports + +Classify one Slack report and post one useful verdict in its source thread. Create a tracker issue only for a clear, new bug. Do not reproduce or fix it here. + +Load the external Benny configuration supplied by the automation. If the config is missing, malformed, or incomplete, stop without posting or writing to the tracker. + +## Hard safety rules + +- The source channel and root thread coordinates are immutable. +- Never post a root message in the source channel. +- Never post to another channel, broadcast a reply, send a DM, or start a replacement thread. +- Preflight the source parent before any tracker write and immediately before the verdict post. +- If the parent is missing, deleted, inaccessible, or uncertain, stop with no writes. +- Post one substantive verdict. Do not narrate progress. +- The coordinator is the only Slack poster. +- Delegated workers return findings only. They must be read-only and receive no Slack credentials or write actions. 
+- Every child prompt must forbid `SendSlackMessage`, `PostToSlack`, `chat.postMessage`, and every other Slack write. +- If worker isolation cannot enforce those limits, do the work in the coordinator. +- Never create an issue that cannot link back to the source thread. +- Prefer no ticket over a guessed or duplicate ticket. +- Apply pstack's `principle-separate-before-serializing-shared-state` to source coordinates. +- Apply pstack's `principle-minimize-reader-load` and `unslop` skills to the final verdict. + +## 1. Freeze source coordinates + +Before making a work list or delegating: + +1. Read `source_channel_id` from the trigger. +2. Require it to equal the configured source channel. +3. Set `SOURCE_THREAD_TS` to `trigger.thread_ts` when present. Otherwise use `trigger.ts`. +4. Require a nonempty `SOURCE_THREAD_TS`. +5. Store `SOURCE_CHANNEL_ID` and `SOURCE_THREAD_TS` as immutable values. +6. Read the thread and verify that its root has exactly those coordinates. +7. Fetch a stable source permalink. + +Every later source read and post must use those stored values. Never replace them with a reply timestamp or an operations-thread timestamp. + +## 2. Read the whole report + +Read the root and current replies before deciding. + +Capture: + +- Reporter wording +- Product version, app build, environment, and platform when present +- Expected behavior +- Observed behavior +- Frequency and trigger +- Error text or stack signature +- Existing issue, commit, or pull request links +- Any explicit statement that someone is already fixing it + +Inspect every relevant attachment. + +- Read screenshots at full useful resolution. +- Review video for the state transition that separates correct and broken behavior. +- Read logs, traces, and crash text for concrete signatures. +- If media needs specialist review, use a read-only media worker and ask a narrow question. The worker returns findings only. +- If an attachment cannot be read, say so in the verdict. 
Do not invent what it shows. + +Use evidence already in the thread before asking the reporter for more. + +## 3. Trace cause before routing + +Do a bounded source and history pass before choosing an owner or destination. Use pstack's `how` skill to trace the path from the reported action to the observed result. Use `why` when the report looks like a regression or touches defensive code. + +1. Identify the likely code path from the reported action to the observed result. +2. Check whether the visible symptom belongs to that code path or a dependency below it. +3. Check recent changes when the report looks like a regression. +4. Check whether a merged commit or open pull request already addresses the same symptom. +5. Separate confirmed facts from hypotheses. + +This pass does not need a complete root cause. It must be strong enough to avoid routing a visible symptom to the wrong owner. + +If the repository cannot be read, do not guess a code owner. Continue with a conservative classification and say that cause tracing was unavailable. + +## 4. Classify + +Choose one category. + +### Bug + +Something violates intended behavior. Examples include wrong output, broken state, an error, a crash, a hang, a silent no-op, or a regression. + +### Performance + +The report describes measurable slowness, excess memory, battery drain, jank, or another resource problem. Treat it as a bug, but preserve measurements and profiles. + +### Feature request + +The current behavior appears intentional and the reporter wants a different behavior or affordance. + +### Question or feedback + +The report asks how something works, expresses a preference without a concrete defect, or gives general feedback. + +### Reroute + +Cause tracing shows that another configured destination owns the issue. + +When the bug versus feature line is unclear, do not file. The one verdict may ask one focused question and use the `other` marker. + +## 5. 
Apply configured routing + +Read the optional routing map from `routing.map_path`. + +- Match on confirmed product area, code path, or error signature. +- A visible symptom alone is not enough when cause tracing points elsewhere. +- If no route matches, say the owner is unclear. Do not guess. +- Do not cross-post. Tell the reporter where to take the issue in the source thread. + +Owner pings are off by default. A ping is allowed only when all of these hold: + +1. The routing map explicitly names the owner. +2. The config allows that ping type. +3. The item is a feature request that needs owner input, or recent history identifies a likely regression author with strong evidence. +4. The owner is not a broad on-call group. + +No other case gets a ping. + +## 6. Use the issue-tracker adapter + +The tracker is an adapter, not a required vendor. A Linear adapter is one valid example. A GitHub Issues adapter or another tracker may implement the same contract. + +The configured adapter must provide: + +- Search issues by text, state, label, source URL, and date range +- Read one issue and its links +- Create an issue with title, body, status, labels, and source URL +- Update an existing issue without replacing unrelated fields +- Add a source link and recurrence note +- Cancel, close, or delete an issue created by this run if the Slack handoff fails + +If a required operation is unavailable, fail closed for that write. + +Resolve configured team, project, status, and labels at runtime. Do not invent IDs, create labels, assign owners, or set priority unless the config explicitly requires it. + +## 7. Dedupe + +Always check whether this source permalink is already linked to a tracker issue or a prior triage reply. If so, do not post or create a duplicate. 
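The permalink check above can be sketched against a small adapter interface. Every name here is hypothetical: `TrackerAdapter`, `search_by_source_url`, `Issue`, and `already_triaged` are illustrations, and a real adapter (Linear, GitHub Issues, or another tracker) will expose its own actions. The sketch covers only the "search by source URL" part of the contract in section 6.

```python
from dataclasses import dataclass, field
from typing import List, Protocol


@dataclass
class Issue:
    issue_id: str
    title: str
    source_urls: List[str] = field(default_factory=list)


class TrackerAdapter(Protocol):
    """Hypothetical shape of the configured adapter; method names are illustrative."""

    def search_by_source_url(self, source_url: str) -> List[Issue]: ...


def already_triaged(
    adapter: TrackerAdapter,
    source_permalink: str,
    prior_triage_reply_exists: bool,
) -> bool:
    """True when this run must neither post nor create, per the first dedupe check."""
    if prior_triage_reply_exists:
        return True
    return any(
        source_permalink in issue.source_urls
        for issue in adapter.search_by_source_url(source_permalink)
    )


class InMemoryAdapter:
    """Toy adapter used only to exercise the sketch."""

    def __init__(self, issues: List[Issue]) -> None:
        self._issues = issues

    def search_by_source_url(self, source_url: str) -> List[Issue]:
        return [i for i in self._issues if source_url in i.source_urls]
```

Keeping the check behind an interface mirrors the rule that the tracker is an adapter, not a required vendor.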
+ +For bugs and performance reports, search the tracker using: + +- Exact error or crash signature +- Product area +- Trigger +- Symptom +- Version or date window +- Suspected regression commit +- Source permalink + +Choose one outcome: + +- Confident duplicate: same signature, or the same area, trigger, and symptom, or a confirmed shared cause. +- Possibly related: a shared cause is plausible but not proven. +- Weak resemblance: similarity is superficial. +- No match. + +For a confident duplicate, update the existing issue with the source permalink and one short recurrence note. Do not reopen, relabel, or reassign it unless the config says to. + +For a possible match, link it in the verdict as uncertain and create nothing. + +A long-closed issue is a regression lead, not automatically a live duplicate. + +## 8. Decide whether to create + +Create only when all of these are true: + +1. The classification is bug or performance. +2. The behavior is clearly broken. +3. The issue is still live or not known to be fixed. +4. Dedupe found no confident or plausible live match. +5. The source parent and permalink passed preflight. +6. The tracker target fields resolved. +7. The adapter can compensate if the verdict post fails. + +Never create for a feature request, question, feedback item, reroute, possible duplicate, confident duplicate, or already-fixed issue. + +The new issue must be self-contained: + +- Plain title that names the area and symptom +- Reporter quote +- Expected and observed behavior +- Version and environment, or `unknown` +- Trigger and frequency +- Source thread permalink +- Short cause-tracing findings with hypotheses labeled as hypotheses +- Inline screenshot or representative video frame when supported +- Links to remaining artifacts +- Configured intake status and labels + +Do not put a guessed root cause in the title. + +## 9. Post one verdict + +Run a fresh source-parent preflight. 
Then post exactly one reply with `channel=SOURCE_CHANNEL_ID` and `thread_ts=SOURCE_THREAD_TS`. + +Never call a source-channel posting action without a nonempty `thread_ts`. + +Keep the reply short: + +- Lead with the outcome. +- Link the existing or new tracker issue when there is one. +- Mention a reroute or one missing fact when needed. +- Include at most one allowed owner ping. +- End with exactly one marker line. + +Marker contract: + +```text +[benny:bug] +[benny:bug] tracker=https://tracker.example/issue/123 +[benny:performance] +[benny:performance] tracker=https://tracker.example/issue/123 +[benny:other] +``` + +Use only the configured marker strings. The repro automation trusts the marker only when it comes from the configured triage identity in this source thread. + +After posting, read the same source thread and verify the verdict appears under `SOURCE_THREAD_TS`. If it does not, never retry at the root. + +If this run created a tracker issue and the verdict did not land, use the adapter's compensation action. Verify that the issue is canceled, closed, or deleted. If compensation cannot be verified, report the failure only in the automation run output. + +## 10. Watch one follow-up window + +Watch the source thread for the configured follow-up window, then stop. + +- Answer only a direct question to the triage identity. +- Apply a concrete correction to the tracker issue when safe. +- Do not emit a second marker in the same run. +- Stay out of human coordination and side chatter. +- Stop early if someone asks the automation to stop. + +Do not extend the window more than once. A new report should start a new run. 
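The marker contract in section 9 can be sketched as a small parser. This is a minimal illustration, not part of the pack. It assumes the default marker strings from the contract, and the names `Marker` and `parse_marker` are hypothetical. Configured marker strings may differ. Checking that the reply sits in the frozen source thread is left to the caller.

```python
import re
from typing import NamedTuple, Optional


class Marker(NamedTuple):
    kind: str                   # "bug", "performance", or "other"
    tracker_url: Optional[str]  # only bug and performance markers may carry one


# Assumes the default marker strings shown in the marker contract.
_MARKER_RE = re.compile(
    r"^\[benny:(?P<kind>bug|performance|other)\](?: tracker=(?P<url>https?://\S+))?$"
)


def parse_marker(reply_text: str, author_id: str, triage_identity_id: str) -> Optional[Marker]:
    """Return the marker on the final line of a reply, or None if untrusted or malformed."""
    # Trust the marker only when the configured triage identity posted it.
    if author_id != triage_identity_id:
        return None
    lines = [line.strip() for line in reply_text.strip().splitlines()]
    if not lines:
        return None
    match = _MARKER_RE.match(lines[-1])
    if match is None:
        return None
    kind, url = match.group("kind"), match.group("url")
    # The contract lists no tracker URL form for the "other" marker.
    if kind == "other" and url is not None:
        return None
    # A verdict ends with exactly one marker line.
    if sum(1 for line in lines if line.startswith("[benny:")) != 1:
        return None
    return Marker(kind, url)
```

Returning `None` for anything unexpected keeps the consumer fail-closed: a reply that does not parse is treated as no marker at all.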
diff --git a/pstack/automations/benny/skills/triage-issue-reports/references/routing.example.md b/pstack/automations/benny/skills/triage-issue-reports/references/routing.example.md new file mode 100644 index 0000000..d71aea7 --- /dev/null +++ b/pstack/automations/benny/skills/triage-issue-reports/references/routing.example.md @@ -0,0 +1,61 @@ +# Routing map example + +Copy this file outside `.cursor/automations/benny/`, for example to `.cursor/benny/routing.md`, and replace every placeholder. Point `routing.map_path` at the copy. Pack refreshes must not overwrite it. + +The triage skill treats this as data. A route needs evidence from the report or cause trace. A keyword match alone is not enough. + +```yaml +routes: + - name: "billing-example" + match: + product_areas: + - "billing-area-placeholder" + code_paths: + - "billing-code-path-placeholder" + error_signatures: + - "billing-error-placeholder" + destination: + slack_channel: "billing-channel-placeholder" + tracker_team: "billing-team-placeholder" + owners: + - "billing-owner-placeholder" + allow_feature_owner_ping: false + + - name: "desktop-example" + match: + product_areas: + - "desktop-area-placeholder" + code_paths: + - "desktop-code-path-placeholder" + error_signatures: + - "desktop-error-placeholder" + destination: + slack_channel: "desktop-channel-placeholder" + tracker_team: "desktop-team-placeholder" + owners: + - "desktop-owner-placeholder" + allow_feature_owner_ping: false + +fallback: + destination: "" + owners: [] + allow_feature_owner_ping: false + +ping_policy: + default: "off" + allow: + - "configured-feature-owner" + - "confirmed-regression-author" + deny: + - "broad-on-call-group" + - "unverified-owner" +``` + +## Rules + +- Leave `fallback.destination` empty unless one team accepts all unmatched reports. +- Use stable product areas, code paths, and error signatures. +- Do not include private data in a public copy. 
+- Do not paste raw user or channel IDs into an example that will be published. +- Keep feature-owner pings off until the target team agrees to them. +- A reroute tells the reporter where to go. The automation never cross-posts. diff --git a/pstack/automations/benny/templates/configuration.example.yaml b/pstack/automations/benny/templates/configuration.example.yaml new file mode 100644 index 0000000..8616f5f --- /dev/null +++ b/pstack/automations/benny/templates/configuration.example.yaml @@ -0,0 +1,84 @@ +schema_version: 1 + +automations: + triage_name: "benny-triage" + reproduce_name: "benny-reproduce" + +slack: + source_channel_id: "SOURCE_CHANNEL_ID" + operations_channel_id: "" + triage_identity_user_id: "TRIAGE_IDENTITY_USER_ID" + read_action: "configured-slack-read-action" + thread_post_action: "configured-slack-thread-post-action" + file_download_action: "configured-slack-file-download-action" + operations_edit_action: "configured-slack-edit-action" + prefer_cursor_actions: true + optional_bot_token_env: "BENNY_SLACK_BOT_TOKEN" + allow_source_root_posts: false + allow_worker_slack_writes: false + +repository: + url: "https://github.com/example-org/example-repo" + default_branch: "main" + pull_request_action: "configured-draft-pull-request-action" + pull_request_url_format: "https://github.com/{owner}/{repo}/pull/{number}" + draft_only: true + +tracker: + type: "linear" + adapter_skill_name: "issue-tracker-adapter-placeholder" + team: "team-placeholder" + project: "project-placeholder" + labels: + bug: "bug-label-placeholder" + performance: "performance-label-placeholder" + intake: "intake-label-placeholder" + needs_repro: "needs-repro-label-placeholder" + status: "intake-status-placeholder" + source_link_title: "Slack report" + require_compensation_action: true + +routing: + map_path: ".cursor/benny/routing.md" + owner_pings_default: false + allow_feature_owner_ping: false + allow_confirmed_regression_author_ping: false + +control: + skill_name: 
"control-target-app" + feature_map_path: ".cursor/benny/feature-map.md" + environment: "safe-test-environment-placeholder" + artifact_directory: "/tmp/benny-artifacts" + artifact_retention_hours: 24 + +verdict_markers: + bug: "[benny:bug]" + performance: "[benny:performance]" + other: "[benny:other]" + tracker_attribute: "tracker" + +status_emoji: + seen: "👀" + reproducing: "🔎" + reproduced: "✅" + could_not_reproduce: "⚪" + blocked: "⛔" + fixing: "🛠️" + fix_failed: "❌" + pull_request_opened: "🔗" + +budgets: + poll_seconds: 45 + verdict_wait_minutes: 45 + triage_follow_up_minutes: 10 + triage_total_minutes: 30 + repro_minutes: 60 + rejection_window_minutes: 10 + fix_minutes: 90 + operations_follow_up_minutes: 45 + +models: + triage: "choose-an-available-public-model-slug" + reproduce: "choose-an-available-public-model-slug" + code: "choose-an-available-public-model-slug" + media_review: "choose-an-available-public-model-slug" diff --git a/pstack/automations/benny/templates/reproduce-automation-prompt.md b/pstack/automations/benny/templates/reproduce-automation-prompt.md new file mode 100644 index 0000000..5c66c92 --- /dev/null +++ b/pstack/automations/benny/templates/reproduce-automation-prompt.md @@ -0,0 +1,33 @@ +# Reproduce automation prompt + +> Source material for the copied setup workflow. Paraphrase this intent into a built-in `automate` draft after `automate` confirms that the copied pack is committed in the repository where the automation will run. + +Read and follow `.cursor/automations/benny/skills/reproduce-and-fix-issues/SKILL.md` for this run. + +Configuration source. Include this repository-relative path only when it is committed in the same target repository. Otherwise paraphrase the configured values. 
Never use a plugin source or cache path: + +```text +{{BENNY_CONFIG_PATH}} +``` + +Trigger: + +```json +{ + "source_channel_id": "{{SLACK_CHANNEL_ID}}", + "message_ts": "{{SLACK_MESSAGE_TS}}", + "thread_ts": "{{SLACK_THREAD_TS_OR_EMPTY}}" +} +``` + +The creation intent should describe this as a new top-level report in the configured source Slack channel. It should include the configured repository, default branch, issue tracker, control adapter, feature map, and draft pull request capability. + +Treat the source channel and root thread timestamp as immutable. If either is missing or does not match configuration, stop without posting. + +Wait for a configured triage marker from the configured triage identity in this exact thread. Proceed only for `[benny:bug]` or `[benny:performance]`. + +Require the configured control-adapter skill before attempting a repro. Reproduce the exact discriminating symptom twice through the real UI. Verify existing pull requests or commits without authoring over them. Attempt a bounded fix only after a confirmed repro and the operational file's fix gate. + +The coordinator is the only Slack poster. Every child prompt must forbid `SendSlackMessage`, `PostToSlack`, `chat.postMessage`, and all other Slack writes. Children return findings only. + +Never post a root message in the source channel. diff --git a/pstack/automations/benny/templates/triage-automation-prompt.md b/pstack/automations/benny/templates/triage-automation-prompt.md new file mode 100644 index 0000000..939e5e1 --- /dev/null +++ b/pstack/automations/benny/templates/triage-automation-prompt.md @@ -0,0 +1,39 @@ +# Triage automation prompt + +> Source material for the copied setup workflow. Paraphrase this intent into a built-in `automate` draft after `automate` confirms that the copied pack is committed in the repository where the automation will run. + +Read and follow `.cursor/automations/benny/skills/triage-issue-reports/SKILL.md` for this run. + +Configuration source. 
Include this repository-relative path only when it is committed in the same target repository. Otherwise paraphrase the configured values. Never use a plugin source or cache path: + +```text +{{BENNY_CONFIG_PATH}} +``` + +Trigger: + +```json +{ + "source_channel_id": "{{SLACK_CHANNEL_ID}}", + "message_ts": "{{SLACK_MESSAGE_TS}}", + "thread_ts": "{{SLACK_THREAD_TS_OR_EMPTY}}" +} +``` + +The creation intent should describe this as a new top-level report in the configured source Slack channel. + +Treat the source channel and root thread timestamp as immutable. If either is missing or does not match configuration, stop without posting or writing to the issue tracker. + +The committed operational file owns classification, attachment review, cause tracing, routing, dedupe, tracker writes, and the final verdict. Post no progress messages. Never post a root message in the source channel. + +The coordinator is the only Slack poster. Any delegated worker must be read-only, return findings only, and receive an explicit ban on every Slack write action. + +End the single verdict with exactly one configured marker: + +```text +[benny:bug] +[benny:performance] +[benny:other] +``` + +A bug or performance marker may add `tracker=`.
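The trigger preflight described above can be sketched as follows. This is a minimal illustration, not the pack's implementation: `preflight` and `SourceThread` are hypothetical names, and rejecting a trigger whose `thread_ts` differs from `message_ts` is an assumption drawn from the "new top-level report" wording.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)  # frozen mirrors "treat the coordinates as immutable"
class SourceThread:
    channel_id: str
    thread_ts: str


def preflight(trigger: dict, configured_channel_id: str) -> Optional[SourceThread]:
    """Return fixed thread coordinates, or None when the run must stop silently."""
    channel = (trigger.get("source_channel_id") or "").strip()
    message_ts = (trigger.get("message_ts") or "").strip()
    thread_ts = (trigger.get("thread_ts") or "").strip()

    # Missing or mismatched channel: stop without posting or tracker writes.
    if not channel or channel != configured_channel_id:
        return None
    if not message_ts:
        return None
    # Assumption: a thread_ts that differs from the message's own ts marks a
    # reply, which is not a new top-level report.
    if thread_ts and thread_ts != message_ts:
        return None
    # The root message's ts doubles as the thread_ts for every later reply.
    return SourceThread(channel_id=channel, thread_ts=message_ts)
```

Returning `None` instead of raising keeps the "stop without posting" path explicit at the call site. Every later Slack write would take its `channel` and `thread_ts` from the returned object, never from fresh input.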