diff --git a/.claude/commands/speckit.retro.analyze.md b/.claude/commands/speckit.retro.analyze.md new file mode 100644 index 00000000..ceb6c7be --- /dev/null +++ b/.claude/commands/speckit.retro.analyze.md @@ -0,0 +1,472 @@ +--- +description: Perform retrospective analysis on completed specs to extract shared constraints + and improve constitution, templates, and checklists through self-improvement. +--- + + + + +## User Input + +```text +$ARGUMENTS +``` + +You **MUST** consider the user input before proceeding (if not empty). + +## Goal + +Analyze completed specs to identify cross-cutting patterns, constraints, and lessons learned, then propose improvements to the project's constitution, templates, and checklists. This enables speckit to continuously improve itself based on real implementation experience. + +## Operating Philosophy + +**Self-Improvement Loop**: Each completed spec is a learning opportunity. The retro process distills implementation experience into reusable governance rules, better templates, and quality gates that make future specs higher quality with less ambiguity. + +**Constitution-First**: The constitution is the only artifact loaded by all future specs. Extracting shared constraints into the constitution has maximum leverage—it improves every subsequent spec automatically. + +**No New Artifacts**: This command does NOT create summary.md or other new file types. It strengthens existing infrastructure (constitution, templates, checklists) that speckit commands already use. + +## Execution Steps + +### 1. Identify Completed Specs + +Ask the user which specs to analyze. Suggested formats: +- Range: "015-019" (analyze specs 015 through 019) +- List: "015,017,019" (analyze specific specs) +- All: "all" (analyze all specs in /specs/) + +Parse the input and build a list of spec directories to analyze. + +### 2. 
Cluster Specs by Topic (if analyzing 5+ specs) + +**Skip this step if analyzing fewer than 5 specs.** + +For large-scale analysis (5+ specs), automatically cluster specs by topic before pattern extraction to ensure high-quality, focused proposals. + +#### 2.1 Load Minimal Context for Clustering + +For each spec, read only: +- **spec.md**: First 50 lines (Overview/Context section) +- **plan.md**: Technology & Architecture Constraints section + +Extract key indicators: +- Primary components mentioned (daemon, proxy, CLI, web UI, config, TUI, etc.) +- Technology stack (Go packages, React, SQLite, etc.) +- Feature category keywords (stability, routing, monitoring, migration, etc.) + +#### 2.2 Perform Automatic Clustering + +Group specs by similarity using these heuristics: + +**Component-based clustering** (adapt to your project structure): +- Backend/API: specs mentioning server, API endpoints, business logic, data access +- Frontend/UI: specs mentioning UI components, user interactions, styling, client-side logic +- CLI/Tooling: specs mentioning command-line interface, scripts, automation +- Infrastructure: specs mentioning deployment, configuration, monitoring, logging +- Testing & Quality: specs mentioning test coverage, CI/CD, integration tests + +**Feature-based clustering** (if component clustering produces groups >8 specs): +- Stability & Reliability: error handling, recovery, resilience, fault tolerance +- Performance & Scalability: optimization, caching, concurrency, load handling +- Security & Privacy: authentication, authorization, data protection, validation +- Data & Storage: database, schema, migration, persistence +- User Experience: usability, accessibility, responsiveness, feedback + +#### 2.3 Present Clustering Results + +Output clustering summary: + +```markdown +## Spec Clustering Results + +Analyzed 20 specs, grouped into 4 clusters: + +### Group 1: Backend & API (7 specs) +- 015-user-authentication +- 017-api-rate-limiting +- 018-data-validation 
+- 019-caching-strategy +- 020-error-handling +- ... + +**Focus areas**: API design, data processing, error handling, performance + +### Group 2: Frontend & UI (5 specs) +- 003-responsive-layout +- 011-form-validation +- 016-accessibility-improvements +- ... + +**Focus areas**: Component design, user interactions, styling, accessibility + +### Group 3: Infrastructure & Deployment (4 specs) +- 005-logging-system +- 006-monitoring-dashboard +- 008-ci-cd-pipeline +- ... + +**Focus areas**: Observability, deployment automation, configuration management + +### Group 4: Testing & Quality (4 specs) +- 008-integration-tests +- ... + +**Focus areas**: Test coverage, quality gates, automated testing +``` + +#### 2.4 Ask User to Select Groups + +Present options: + +``` +Which groups would you like to analyze? +- [ ] Group 1: Backend & API (7 specs) +- [ ] Group 2: Frontend & UI (5 specs) +- [ ] Group 3: Infrastructure & Deployment (4 specs) +- [ ] Group 4: Testing & Quality (4 specs) +- [ ] All groups (analyze each separately, generate per-group proposals) +- [ ] Skip clustering (analyze all specs together) +``` + +Wait for user selection before proceeding. + +**If user selects multiple groups**: Analyze each group independently and generate separate proposal sections for each. + +**If user selects "Skip clustering"**: Proceed with all specs in a single analysis (may produce lower-quality cross-domain proposals). + +### 3. 
Load Spec Artifacts + +For each spec in the selected group(s), load: + +**From spec.md**: +- Functional requirements +- Non-functional requirements +- User stories +- Edge cases + +**From plan.md**: +- Architecture decisions and rationale +- Technology choices +- Constitution Check section (violations, complexity justifications) +- Phase breakdown + +**From tasks.md**: +- Task structure and organization patterns +- Dependency patterns +- Parallelization markers + +**From checklists/** (if exists): +- Quality dimensions checked +- Recurring validation patterns + +**From implementation** (if merged): +- Check git log for the feature branch to understand what was actually built +- Look for deviations between plan and implementation + +### 4. Pattern Extraction + +Analyze loaded specs across these dimensions: + +#### A. Shared Constraints (Constitution Candidates) + +Identify rules that appear across multiple specs: +- Technology choices that became de facto standards +- Architecture patterns repeatedly used +- Performance/security requirements that recur +- Testing strategies applied consistently +- Forbidden patterns that caused issues + +**Example**: If 3+ specs all avoid nested API responses beyond 3 levels, that's a constraint worth codifying. + +#### B. Template Gaps + +Identify sections frequently added manually that should be in templates: +- Missing sections in spec-template.md (e.g., "Performance Considerations") +- Missing phases in plan-template.md +- Missing task categories in tasks-template.md + +**Example**: If every spec adds a "Migration Strategy" section, add it to spec-template. + +#### C. Quality Gate Patterns + +Identify validation checks that should become default checklists: +- Security checks repeatedly needed +- Performance validation patterns +- UX quality dimensions +- API design principles + +**Example**: If multiple specs check "rate limiting for batch operations", add it to a default checklist. + +#### D. 
Constitution Violations + +Review Constitution Check sections across specs: +- Which principles are frequently violated? +- Are violations justified (complexity trade-offs) or avoidable? +- Do violation patterns suggest the principle needs refinement? + +**Example**: If Principle II is violated in 5 specs with similar justifications, the principle may need adjustment. + +#### E. Implementation Deviations + +Compare plans vs. actual implementation: +- What changed during implementation and why? +- Were there recurring surprises or unknowns? +- Did certain types of tasks consistently take longer than expected? + +**Example**: If integration tasks consistently reveal missing error handling, add "error handling strategy" to plan-template. + +### 5. Generate Improvement Proposals + +**If analyzing multiple groups**: Generate separate proposal sections for each group with clear group headers. + +**If analyzing a single group or all specs together**: Generate a unified proposal. + +Output a structured proposal document with three sections per group: + +#### Proposed Constitution Amendments + +For each proposed amendment: +- **Type**: New principle | Principle modification | New constraint +- **Rationale**: Which specs demonstrate this pattern (cite spec numbers) +- **Proposed Text**: Exact wording to add/modify +- **Impact**: Which future specs will benefit +- **Version Bump**: MAJOR | MINOR | PATCH (per constitution governance rules) + +**Format** (for multi-group analysis): +```markdown +## Group 1: Backend & API - Improvement Proposals + +### Constitution Amendments + +#### Amendment 1.1: API Response Time Limits + +**Type**: New constraint (add to "Technology & Architecture Constraints") + +**Rationale**: Specs 017, 019 both implemented timeout mechanisms (10-second API response limit). This pattern should be codified to ensure consistent user experience. 
+ +**Proposed Text**: +> - **API Response Time**: All API endpoints MUST respond within 10 seconds or return a timeout error. Long-running operations MUST use async patterns with status polling. + +**Impact**: Future API-related specs will include timeout handling from the planning phase. + +**Version Bump**: MINOR (new constraint) + +### Template Updates + +#### Template Update 1.1: Add Error Handling Strategy to plan-template.md + +**Template**: `.specify/templates/plan-template.md` + +**Change Type**: Add section + +**Rationale**: Specs 017, 019, 020 all added "Error Handling Strategy" sections manually for backend features. + +**Proposed Diff**: +```diff ++ ## Error Handling Strategy (for backend/API features) ++ ++ - Error classification (client errors, server errors, transient failures) ++ - Retry logic and backoff strategy ++ - Error response format and status codes ++ - Logging and monitoring for errors +``` + +### Checklist Additions + +#### Checklist Addition 1.1: API Design Checklist + +**Checklist**: Create `.specify/templates/api-design-checklist-template.md` + +**Items**: +- [ ] CHK-API-001: All endpoints have timeout handling +- [ ] CHK-API-002: Error responses follow consistent format +- [ ] CHK-API-003: Rate limiting implemented for resource-intensive endpoints +- [ ] CHK-API-004: Input validation covers all required fields +- [ ] CHK-API-005: API documentation includes error codes and examples + +**Rationale**: Specs 017, 019 both needed these checks. Creating a dedicated API design checklist catches these requirements during planning. + +--- + +## Group 2: Frontend & UI - Improvement Proposals + +### Constitution Amendments + +#### Amendment 2.1: Accessibility Standards + +... 
+``` + +**Format** (for single-group or unified analysis): +```markdown +### Amendment 1: API Response Nesting Limit + +**Type**: New constraint (add to "Technology & Architecture Constraints") + +**Rationale**: Specs 015, 017, 019 all independently limited API response nesting to 3 levels for performance and client parsing simplicity. This pattern should be codified. + +**Proposed Text**: +> - **API Design**: Response bodies MUST NOT nest objects deeper than 3 levels. Use flat structures with references (IDs) for deep relationships. + +**Impact**: Prevents future specs from creating deeply nested APIs that cause client-side parsing issues. + +**Version Bump**: MINOR (new constraint) +``` + +#### Proposed Template Updates + +For each template update: +- **Template**: Which template file +- **Change Type**: Add section | Modify section | Remove section +- **Rationale**: Which specs needed this manually +- **Proposed Diff**: Show before/after + +**Format**: +```markdown +### Template Update 1: Add Performance Considerations to plan-template.md + +**Template**: `.specify/templates/plan-template.md` + +**Change Type**: Add section + +**Rationale**: Specs 016, 017, 018, 019 all added "Performance Considerations" sections manually. This should be a standard plan section. 
+ +**Proposed Diff**: +```diff ++ ## Performance Considerations ++ ++ - Expected load characteristics ++ - Performance targets (latency, throughput) ++ - Bottleneck analysis ++ - Optimization strategy +``` +``` + +#### Proposed Checklist Additions + +For each checklist addition: +- **Checklist**: Which checklist file (or new checklist to create) +- **Items**: New checklist items to add +- **Rationale**: Which specs would have caught issues earlier + +**Format**: +```markdown +### Checklist Addition 1: Rate Limiting Check + +**Checklist**: `.specify/templates/checklist-template.md` (or create `api-checklist-template.md`) + +**Items**: +- [ ] CHK-API-001: Batch operations have rate limiting +- [ ] CHK-API-002: Rate limit errors return 429 with Retry-After header +- [ ] CHK-API-003: Rate limits documented in API contracts + +**Rationale**: Specs 016, 019 both discovered missing rate limiting during implementation. Adding this to the default API checklist catches it during planning. +``` + +### 6. Archive Completed Specs (Optional) + +After extracting improvements, optionally archive the analyzed specs: + +Ask user: "Would you like to archive these specs? This will move original files to `specs/.archive/[NNN]-feature-name/` to reduce future token usage." + +If yes: +- Create `specs/.archive/` directory if it doesn't exist +- For each analyzed spec: + - Move entire spec directory to `.archive/` + - Leave a minimal index file at original location (optional) + +**Minimal index format** (if user wants it): +```markdown +# [NNN] - [Feature Name] (Archived) + +Archived: [date] +Location: `specs/.archive/[NNN]-feature-name/` +Constitution updates: [list amendment numbers from retro] +``` + +### 7. User Review and Approval + +Present the complete proposal and ask: + +"Review the proposed improvements above. Which changes would you like to apply?" 
+ +Options: +- [ ] Apply all constitution amendments +- [ ] Apply all template updates +- [ ] Apply all checklist additions +- [ ] Apply selected items (specify which) +- [ ] Save proposal for later review +- [ ] Cancel (no changes) + +### 8. Apply Approved Changes + +For each approved change: + +**Constitution amendments**: +1. Read current `.specify/memory/constitution.md` +2. Apply the amendment +3. Update version number per governance rules +4. Update Sync Impact Report (HTML comment at top) +5. Write updated constitution + +**Template updates**: +1. Read the template file +2. Apply the diff +3. Write updated template + +**Checklist additions**: +1. Read or create the checklist template +2. Add new items with proper CHK-### IDs +3. Write updated checklist + +### 9. 
+ +**Next Steps**: +- New specs will automatically benefit from updated constitution and templates +- Existing in-progress specs may want to incorporate new checklist items +- Consider running retro again after completing next 5-10 specs +``` + +## Operating Principles + +### Context Efficiency + +- **Progressive loading**: Load specs incrementally, not all at once +- **Pattern-focused**: Extract high-signal patterns, not exhaustive documentation +- **Minimal output**: Proposals should be concise and actionable + +### Analysis Guidelines + +- **Evidence-based**: Every proposal must cite specific specs as evidence +- **Cross-spec patterns only**: Don't propose rules based on a single spec (minimum 2-3 specs showing the same pattern) +- **Respect constitution governance**: Follow versioning and amendment rules +- **No speculation**: Only propose constraints actually demonstrated in completed specs +- **Group-focused proposals**: When analyzing multiple groups, ensure proposals are relevant to the group's domain (don't mix daemon constraints with web UI constraints) + +### Safety + +- **User approval required**: Never auto-apply constitution changes +- **Preserve originals**: Archive moves files, doesn't delete them +- **Reversible**: All changes are git-tracked and can be reverted + +## Context + +$ARGUMENTS \ No newline at end of file diff --git a/.claude/commands/speckit.retro.md b/.claude/commands/speckit.retro.md new file mode 100644 index 00000000..ceb6c7be --- /dev/null +++ b/.claude/commands/speckit.retro.md @@ -0,0 +1,472 @@ +--- +description: Perform retrospective analysis on completed specs to extract shared constraints + and improve constitution, templates, and checklists through self-improvement. +--- + + + + +## User Input + +```text +$ARGUMENTS +``` + +You **MUST** consider the user input before proceeding (if not empty). 
+ +## Goal + +Analyze completed specs to identify cross-cutting patterns, constraints, and lessons learned, then propose improvements to the project's constitution, templates, and checklists. This enables speckit to continuously improve itself based on real implementation experience. + +## Operating Philosophy + +**Self-Improvement Loop**: Each completed spec is a learning opportunity. The retro process distills implementation experience into reusable governance rules, better templates, and quality gates that make future specs higher quality with less ambiguity. + +**Constitution-First**: The constitution is the only artifact loaded by all future specs. Extracting shared constraints into the constitution has maximum leverage—it improves every subsequent spec automatically. + +**No New Artifacts**: This command does NOT create summary.md or other new file types. It strengthens existing infrastructure (constitution, templates, checklists) that speckit commands already use. + +## Execution Steps + +### 1. Identify Completed Specs + +Ask the user which specs to analyze. Suggested formats: +- Range: "015-019" (analyze specs 015 through 019) +- List: "015,017,019" (analyze specific specs) +- All: "all" (analyze all specs in /specs/) + +Parse the input and build a list of spec directories to analyze. + +### 2. Cluster Specs by Topic (if analyzing 5+ specs) + +**Skip this step if analyzing fewer than 5 specs.** + +For large-scale analysis (5+ specs), automatically cluster specs by topic before pattern extraction to ensure high-quality, focused proposals. + +#### 2.1 Load Minimal Context for Clustering + +For each spec, read only: +- **spec.md**: First 50 lines (Overview/Context section) +- **plan.md**: Technology & Architecture Constraints section + +Extract key indicators: +- Primary components mentioned (daemon, proxy, CLI, web UI, config, TUI, etc.) +- Technology stack (Go packages, React, SQLite, etc.) 
+- Feature category keywords (stability, routing, monitoring, migration, etc.) + +#### 2.2 Perform Automatic Clustering + +Group specs by similarity using these heuristics: + +**Component-based clustering** (adapt to your project structure): +- Backend/API: specs mentioning server, API endpoints, business logic, data access +- Frontend/UI: specs mentioning UI components, user interactions, styling, client-side logic +- CLI/Tooling: specs mentioning command-line interface, scripts, automation +- Infrastructure: specs mentioning deployment, configuration, monitoring, logging +- Testing & Quality: specs mentioning test coverage, CI/CD, integration tests + +**Feature-based clustering** (if component clustering produces groups >8 specs): +- Stability & Reliability: error handling, recovery, resilience, fault tolerance +- Performance & Scalability: optimization, caching, concurrency, load handling +- Security & Privacy: authentication, authorization, data protection, validation +- Data & Storage: database, schema, migration, persistence +- User Experience: usability, accessibility, responsiveness, feedback + +#### 2.3 Present Clustering Results + +Output clustering summary: + +```markdown +## Spec Clustering Results + +Analyzed 20 specs, grouped into 4 clusters: + +### Group 1: Backend & API (7 specs) +- 015-user-authentication +- 017-api-rate-limiting +- 018-data-validation +- 019-caching-strategy +- 020-error-handling +- ... + +**Focus areas**: API design, data processing, error handling, performance + +### Group 2: Frontend & UI (5 specs) +- 003-responsive-layout +- 011-form-validation +- 016-accessibility-improvements +- ... + +**Focus areas**: Component design, user interactions, styling, accessibility + +### Group 3: Infrastructure & Deployment (4 specs) +- 005-logging-system +- 006-monitoring-dashboard +- 008-ci-cd-pipeline +- ... 
+ +**Focus areas**: Observability, deployment automation, configuration management + +### Group 4: Testing & Quality (4 specs) +- 008-integration-tests +- ... + +**Focus areas**: Test coverage, quality gates, automated testing +``` + +#### 2.4 Ask User to Select Groups + +Present options: + +``` +Which groups would you like to analyze? +- [ ] Group 1: Backend & API (7 specs) +- [ ] Group 2: Frontend & UI (5 specs) +- [ ] Group 3: Infrastructure & Deployment (4 specs) +- [ ] Group 4: Testing & Quality (4 specs) +- [ ] All groups (analyze each separately, generate per-group proposals) +- [ ] Skip clustering (analyze all specs together) +``` + +Wait for user selection before proceeding. + +**If user selects multiple groups**: Analyze each group independently and generate separate proposal sections for each. + +**If user selects "Skip clustering"**: Proceed with all specs in a single analysis (may produce lower-quality cross-domain proposals). + +### 3. Load Spec Artifacts + +For each spec in the selected group(s), load: + +**From spec.md**: +- Functional requirements +- Non-functional requirements +- User stories +- Edge cases + +**From plan.md**: +- Architecture decisions and rationale +- Technology choices +- Constitution Check section (violations, complexity justifications) +- Phase breakdown + +**From tasks.md**: +- Task structure and organization patterns +- Dependency patterns +- Parallelization markers + +**From checklists/** (if exists): +- Quality dimensions checked +- Recurring validation patterns + +**From implementation** (if merged): +- Check git log for the feature branch to understand what was actually built +- Look for deviations between plan and implementation + +### 4. Pattern Extraction + +Analyze loaded specs across these dimensions: + +#### A. 
Shared Constraints (Constitution Candidates) + +Identify rules that appear across multiple specs: +- Technology choices that became de facto standards +- Architecture patterns repeatedly used +- Performance/security requirements that recur +- Testing strategies applied consistently +- Forbidden patterns that caused issues + +**Example**: If 3+ specs all avoid nested API responses beyond 3 levels, that's a constraint worth codifying. + +#### B. Template Gaps + +Identify sections frequently added manually that should be in templates: +- Missing sections in spec-template.md (e.g., "Performance Considerations") +- Missing phases in plan-template.md +- Missing task categories in tasks-template.md + +**Example**: If every spec adds a "Migration Strategy" section, add it to spec-template. + +#### C. Quality Gate Patterns + +Identify validation checks that should become default checklists: +- Security checks repeatedly needed +- Performance validation patterns +- UX quality dimensions +- API design principles + +**Example**: If multiple specs check "rate limiting for batch operations", add it to a default checklist. + +#### D. Constitution Violations + +Review Constitution Check sections across specs: +- Which principles are frequently violated? +- Are violations justified (complexity trade-offs) or avoidable? +- Do violation patterns suggest the principle needs refinement? + +**Example**: If Principle II is violated in 5 specs with similar justifications, the principle may need adjustment. + +#### E. Implementation Deviations + +Compare plans vs. actual implementation: +- What changed during implementation and why? +- Were there recurring surprises or unknowns? +- Did certain types of tasks consistently take longer than expected? + +**Example**: If integration tasks consistently reveal missing error handling, add "error handling strategy" to plan-template. + +### 5. 
Generate Improvement Proposals + +**If analyzing multiple groups**: Generate separate proposal sections for each group with clear group headers. + +**If analyzing a single group or all specs together**: Generate a unified proposal. + +Output a structured proposal document with three sections per group: + +#### Proposed Constitution Amendments + +For each proposed amendment: +- **Type**: New principle | Principle modification | New constraint +- **Rationale**: Which specs demonstrate this pattern (cite spec numbers) +- **Proposed Text**: Exact wording to add/modify +- **Impact**: Which future specs will benefit +- **Version Bump**: MAJOR | MINOR | PATCH (per constitution governance rules) + +**Format** (for multi-group analysis): +```markdown +## Group 1: Backend & API - Improvement Proposals + +### Constitution Amendments + +#### Amendment 1.1: API Response Time Limits + +**Type**: New constraint (add to "Technology & Architecture Constraints") + +**Rationale**: Specs 017, 019 both implemented timeout mechanisms (10-second API response limit). This pattern should be codified to ensure consistent user experience. + +**Proposed Text**: +> - **API Response Time**: All API endpoints MUST respond within 10 seconds or return a timeout error. Long-running operations MUST use async patterns with status polling. + +**Impact**: Future API-related specs will include timeout handling from the planning phase. + +**Version Bump**: MINOR (new constraint) + +### Template Updates + +#### Template Update 1.1: Add Error Handling Strategy to plan-template.md + +**Template**: `.specify/templates/plan-template.md` + +**Change Type**: Add section + +**Rationale**: Specs 017, 019, 020 all added "Error Handling Strategy" sections manually for backend features. 
+ +**Proposed Diff**: +```diff ++ ## Error Handling Strategy (for backend/API features) ++ ++ - Error classification (client errors, server errors, transient failures) ++ - Retry logic and backoff strategy ++ - Error response format and status codes ++ - Logging and monitoring for errors +``` + +### Checklist Additions + +#### Checklist Addition 1.1: API Design Checklist + +**Checklist**: Create `.specify/templates/api-design-checklist-template.md` + +**Items**: +- [ ] CHK-API-001: All endpoints have timeout handling +- [ ] CHK-API-002: Error responses follow consistent format +- [ ] CHK-API-003: Rate limiting implemented for resource-intensive endpoints +- [ ] CHK-API-004: Input validation covers all required fields +- [ ] CHK-API-005: API documentation includes error codes and examples + +**Rationale**: Specs 017, 019 both needed these checks. Creating a dedicated API design checklist catches these requirements during planning. + +--- + +## Group 2: Frontend & UI - Improvement Proposals + +### Constitution Amendments + +#### Amendment 2.1: Accessibility Standards + +... +``` + +**Format** (for single-group or unified analysis): +```markdown +### Amendment 1: API Response Nesting Limit + +**Type**: New constraint (add to "Technology & Architecture Constraints") + +**Rationale**: Specs 015, 017, 019 all independently limited API response nesting to 3 levels for performance and client parsing simplicity. This pattern should be codified. + +**Proposed Text**: +> - **API Design**: Response bodies MUST NOT nest objects deeper than 3 levels. Use flat structures with references (IDs) for deep relationships. + +**Impact**: Prevents future specs from creating deeply nested APIs that cause client-side parsing issues. 
+ +**Version Bump**: MINOR (new constraint) +``` + +#### Proposed Template Updates + +For each template update: +- **Template**: Which template file +- **Change Type**: Add section | Modify section | Remove section +- **Rationale**: Which specs needed this manually +- **Proposed Diff**: Show before/after + +**Format**: +```markdown +### Template Update 1: Add Performance Considerations to plan-template.md + +**Template**: `.specify/templates/plan-template.md` + +**Change Type**: Add section + +**Rationale**: Specs 016, 017, 018, 019 all added "Performance Considerations" sections manually. This should be a standard plan section. + +**Proposed Diff**: +```diff ++ ## Performance Considerations ++ ++ - Expected load characteristics ++ - Performance targets (latency, throughput) ++ - Bottleneck analysis ++ - Optimization strategy +``` +``` + +#### Proposed Checklist Additions + +For each checklist addition: +- **Checklist**: Which checklist file (or new checklist to create) +- **Items**: New checklist items to add +- **Rationale**: Which specs would have caught issues earlier + +**Format**: +```markdown +### Checklist Addition 1: Rate Limiting Check + +**Checklist**: `.specify/templates/checklist-template.md` (or create `api-checklist-template.md`) + +**Items**: +- [ ] CHK-API-001: Batch operations have rate limiting +- [ ] CHK-API-002: Rate limit errors return 429 with Retry-After header +- [ ] CHK-API-003: Rate limits documented in API contracts + +**Rationale**: Specs 016, 019 both discovered missing rate limiting during implementation. Adding this to the default API checklist catches it during planning. +``` + +### 6. Archive Completed Specs (Optional) + +After extracting improvements, optionally archive the analyzed specs: + +Ask user: "Would you like to archive these specs? This will move original files to `specs/.archive/[NNN]-feature-name/` to reduce future token usage." 
+ +If yes: +- Create `specs/.archive/` directory if it doesn't exist +- For each analyzed spec: + - Move entire spec directory to `.archive/` + - Leave a minimal index file at original location (optional) + +**Minimal index format** (if user wants it): +```markdown +# [NNN] - [Feature Name] (Archived) + +Archived: [date] +Location: `specs/.archive/[NNN]-feature-name/` +Constitution updates: [list amendment numbers from retro] +``` + +### 7. User Review and Approval + +Present the complete proposal and ask: + +"Review the proposed improvements above. Which changes would you like to apply?" + +Options: +- [ ] Apply all constitution amendments +- [ ] Apply all template updates +- [ ] Apply all checklist additions +- [ ] Apply selected items (specify which) +- [ ] Save proposal for later review +- [ ] Cancel (no changes) + +### 8. Apply Approved Changes + +For each approved change: + +**Constitution amendments**: +1. Read current `.specify/memory/constitution.md` +2. Apply the amendment +3. Update version number per governance rules +4. Update Sync Impact Report (HTML comment at top) +5. Write updated constitution + +**Template updates**: +1. Read the template file +2. Apply the diff +3. Write updated template + +**Checklist additions**: +1. Read or create the checklist template +2. Add new items with proper CHK-### IDs +3. Write updated checklist + +### 9. 
Generate Retro Summary + +Output a concise summary: + +```markdown +## Retro Summary + +**Specs Analyzed**: [list with group breakdown if applicable] +**Groups**: [number of groups, or "unified analysis"] +**Patterns Identified**: [count per group if applicable] +**Changes Applied**: +- Constitution: [count] amendments (version [old] → [new]) +- Templates: [count] updates +- Checklists: [count] additions + +**Per-Group Breakdown** (if multi-group analysis): +- Group 1 (Backend & API): [X] amendments, [Y] template updates, [Z] checklist items +- Group 2 (Frontend & UI): [X] amendments, [Y] template updates, [Z] checklist items +- ... + +**Next Steps**: +- New specs will automatically benefit from updated constitution and templates +- Existing in-progress specs may want to incorporate new checklist items +- Consider running retro again after completing next 5-10 specs +``` + +## Operating Principles + +### Context Efficiency + +- **Progressive loading**: Load specs incrementally, not all at once +- **Pattern-focused**: Extract high-signal patterns, not exhaustive documentation +- **Minimal output**: Proposals should be concise and actionable + +### Analysis Guidelines + +- **Evidence-based**: Every proposal must cite specific specs as evidence +- **Cross-spec patterns only**: Don't propose rules based on a single spec (minimum 2-3 specs showing the same pattern) +- **Respect constitution governance**: Follow versioning and amendment rules +- **No speculation**: Only propose constraints actually demonstrated in completed specs +- **Group-focused proposals**: When analyzing multiple groups, ensure proposals are relevant to the group's domain (don't mix daemon constraints with web UI constraints) + +### Safety + +- **User approval required**: Never auto-apply constitution changes +- **Preserve originals**: Archive moves files, doesn't delete them +- **Reversible**: All changes are git-tracked and can be reverted + +## Context + +$ARGUMENTS \ No newline at end of file 
diff --git a/.codex/prompts/speckit.analyze.md b/.codex/prompts/speckit.analyze.md
new file mode 100644
index 00000000..98b04b0c
--- /dev/null
+++ b/.codex/prompts/speckit.analyze.md
@@ -0,0 +1,184 @@
+---
+description: Perform a non-destructive cross-artifact consistency and quality analysis across spec.md, plan.md, and tasks.md after task generation.
+---
+
+## User Input
+
+```text
+$ARGUMENTS
+```
+
+You **MUST** consider the user input before proceeding (if not empty).
+
+## Goal
+
+Identify inconsistencies, duplications, ambiguities, and underspecified items across the three core artifacts (`spec.md`, `plan.md`, `tasks.md`) before implementation. This command MUST run only after `/speckit.tasks` has successfully produced a complete `tasks.md`.
+
+## Operating Constraints
+
+**STRICTLY READ-ONLY**: Do **not** modify any files. Output a structured analysis report. Offer an optional remediation plan (the user must explicitly approve it before any follow-up editing commands are invoked manually).
+
+**Constitution Authority**: The project constitution (`.specify/memory/constitution.md`) is **non-negotiable** within this analysis scope. Constitution conflicts are automatically CRITICAL and require adjustment of the spec, plan, or tasks—not dilution, reinterpretation, or silent ignoring of the principle. If a principle itself needs to change, that must occur in a separate, explicit constitution update outside `/speckit.analyze`.
+
+## Execution Steps
+
+### 1. Initialize Analysis Context
+
+Run `.specify/scripts/bash/check-prerequisites.sh --json --require-tasks --include-tasks` once from repo root and parse JSON for FEATURE_DIR and AVAILABLE_DOCS. Derive absolute paths:
+
+- SPEC = FEATURE_DIR/spec.md
+- PLAN = FEATURE_DIR/plan.md
+- TASKS = FEATURE_DIR/tasks.md
+
+Abort with an error message if any required file is missing (instruct the user to run the missing prerequisite command).
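The path derivation in step 1 can be sketched in shell. The stand-in JSON payload and key names below are assumptions for illustration only; adapt them to whatever `check-prerequisites.sh` actually emits:

```shell
# Illustrative sketch only -- in the real flow the payload comes from:
#   JSON=$(.specify/scripts/bash/check-prerequisites.sh --json --require-tasks --include-tasks)
JSON='{"FEATURE_DIR":"/repo/specs/015-auth","AVAILABLE_DOCS":["spec.md","plan.md","tasks.md"]}'

# Extract FEATURE_DIR without external tools (POSIX parameter expansion)
FEATURE_DIR=${JSON#*\"FEATURE_DIR\":\"}
FEATURE_DIR=${FEATURE_DIR%%\"*}

# Derive absolute artifact paths
SPEC="$FEATURE_DIR/spec.md"
PLAN="$FEATURE_DIR/plan.md"
TASKS="$FEATURE_DIR/tasks.md"

# Abort early if an artifact is missing
for f in "$SPEC" "$PLAN" "$TASKS"; do
  [ -f "$f" ] || echo "Missing $f -- run the missing prerequisite command first."
done
```

When `jq` is available, `FEATURE_DIR=$(printf '%s' "$JSON" | jq -r .FEATURE_DIR)` is a more robust alternative to string surgery.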
+For single quotes in args like "I'm Groot", use escape syntax: e.g. 'I'\''m Groot' (or double-quote if possible: "I'm Groot").
+
+### 2. Load Artifacts (Progressive Disclosure)
+
+Load only the minimal necessary context from each artifact:
+
+**From spec.md:**
+
+- Overview/Context
+- Functional Requirements
+- Non-Functional Requirements
+- User Stories
+- Edge Cases (if present)
+
+**From plan.md:**
+
+- Architecture/stack choices
+- Data Model references
+- Phases
+- Technical constraints
+
+**From tasks.md:**
+
+- Task IDs
+- Descriptions
+- Phase grouping
+- Parallel markers [P]
+- Referenced file paths
+
+**From constitution:**
+
+- Load `.specify/memory/constitution.md` for principle validation
+
+### 3. Build Semantic Models
+
+Create internal representations (do not include raw artifacts in output):
+
+- **Requirements inventory**: Each functional + non-functional requirement with a stable key (derive a slug from the imperative phrase; e.g., "User can upload file" → `user-can-upload-file`)
+- **User story/action inventory**: Discrete user actions with acceptance criteria
+- **Task coverage mapping**: Map each task to one or more requirements or stories (inference by keyword / explicit reference patterns like IDs or key phrases)
+- **Constitution rule set**: Extract principle names and MUST/SHOULD normative statements
+
+### 4. Detection Passes (Token-Efficient Analysis)
+
+Focus on high-signal findings. Limit to 50 findings total; aggregate the remainder in an overflow summary.
+
+#### A. Duplication Detection
+
+- Identify near-duplicate requirements
+- Mark lower-quality phrasing for consolidation
+
+#### B. Ambiguity Detection
+
+- Flag vague adjectives (fast, scalable, secure, intuitive, robust) lacking measurable criteria
+- Flag unresolved placeholders (TODO, TKTK, ???, ``, etc.)
+
+#### C. Underspecification
+
+- Requirements with verbs but missing an object or measurable outcome
+- User stories missing acceptance criteria alignment
+- Tasks referencing files or components not defined in spec/plan
+
+#### D. Constitution Alignment
+
+- Any requirement or plan element conflicting with a MUST principle
+- Missing mandated sections or quality gates from constitution
+
+#### E. Coverage Gaps
+
+- Requirements with zero associated tasks
+- Tasks with no mapped requirement/story
+- Non-functional requirements not reflected in tasks (e.g., performance, security)
+
+#### F. Inconsistency
+
+- Terminology drift (same concept named differently across files)
+- Data entities referenced in plan but absent in spec (or vice versa)
+- Task ordering contradictions (e.g., integration tasks before foundational setup tasks without a dependency note)
+- Conflicting requirements (e.g., one requires Next.js while the other specifies Vue)
+
+### 5. Severity Assignment
+
+Use this heuristic to prioritize findings:
+
+- **CRITICAL**: Violates a constitution MUST, missing core spec artifact, or a requirement with zero coverage that blocks baseline functionality
+- **HIGH**: Duplicate or conflicting requirement, ambiguous security/performance attribute, untestable acceptance criterion
+- **MEDIUM**: Terminology drift, missing non-functional task coverage, underspecified edge case
+- **LOW**: Style/wording improvements, minor redundancy not affecting execution order
+
+### 6. Produce Compact Analysis Report
+
+Output a Markdown report (no file writes) with the following structure:
+
+## Specification Analysis Report
+
+| ID | Category | Severity | Location(s) | Summary | Recommendation |
+|----|----------|----------|-------------|---------|----------------|
+| A1 | Duplication | HIGH | spec.md:L120-134 | Two similar requirements ... | Merge phrasing; keep clearer version |
+
+(Add one row per finding; generate stable IDs prefixed by category initial.)
+ +**Coverage Summary Table:** + +| Requirement Key | Has Task? | Task IDs | Notes | +|-----------------|-----------|----------|-------| + +**Constitution Alignment Issues:** (if any) + +**Unmapped Tasks:** (if any) + +**Metrics:** + +- Total Requirements +- Total Tasks +- Coverage % (requirements with >=1 task) +- Ambiguity Count +- Duplication Count +- Critical Issues Count + +### 7. Provide Next Actions + +At end of report, output a concise Next Actions block: + +- If CRITICAL issues exist: Recommend resolving before `/speckit.implement` +- If only LOW/MEDIUM: User may proceed, but provide improvement suggestions +- Provide explicit command suggestions: e.g., "Run /speckit.specify with refinement", "Run /speckit.plan to adjust architecture", "Manually edit tasks.md to add coverage for 'performance-metrics'" + +### 8. Offer Remediation + +Ask the user: "Would you like me to suggest concrete remediation edits for the top N issues?" (Do NOT apply them automatically.) + +## Operating Principles + +### Context Efficiency + +- **Minimal high-signal tokens**: Focus on actionable findings, not exhaustive documentation +- **Progressive disclosure**: Load artifacts incrementally; don't dump all content into analysis +- **Token-efficient output**: Limit findings table to 50 rows; summarize overflow +- **Deterministic results**: Rerunning without changes should produce consistent IDs and counts + +### Analysis Guidelines + +- **NEVER modify files** (this is read-only analysis) +- **NEVER hallucinate missing sections** (if absent, report them accurately) +- **Prioritize constitution violations** (these are always CRITICAL) +- **Use examples over exhaustive rules** (cite specific instances, not generic patterns) +- **Report zero issues gracefully** (emit success report with coverage statistics) + +## Context + +$ARGUMENTS diff --git a/.codex/prompts/speckit.checklist.md b/.codex/prompts/speckit.checklist.md new file mode 100644 index 00000000..b7624e22 --- /dev/null +++ 
b/.codex/prompts/speckit.checklist.md
@@ -0,0 +1,295 @@
+---
+description: Generate a custom checklist for the current feature based on user requirements.
+---
+
+## Checklist Purpose: "Unit Tests for English"
+
+**CRITICAL CONCEPT**: Checklists are **UNIT TESTS FOR REQUIREMENTS WRITING** - they validate the quality, clarity, and completeness of requirements in a given domain.
+
+**NOT for verification/testing**:
+
+- ❌ NOT "Verify the button clicks correctly"
+- ❌ NOT "Test error handling works"
+- ❌ NOT "Confirm the API returns 200"
+- ❌ NOT checking if code/implementation matches the spec
+
+**FOR requirements quality validation**:
+
+- ✅ "Are visual hierarchy requirements defined for all card types?" (completeness)
+- ✅ "Is 'prominent display' quantified with specific sizing/positioning?" (clarity)
+- ✅ "Are hover state requirements consistent across all interactive elements?" (consistency)
+- ✅ "Are accessibility requirements defined for keyboard navigation?" (coverage)
+- ✅ "Does the spec define what happens when the logo image fails to load?" (edge cases)
+
+**Metaphor**: If your spec is code written in English, the checklist is its unit test suite. You're testing whether the requirements are well-written, complete, unambiguous, and ready for implementation - NOT whether the implementation works.
+
+## User Input
+
+```text
+$ARGUMENTS
+```
+
+You **MUST** consider the user input before proceeding (if not empty).
+
+## Execution Steps
+
+1. **Setup**: Run `.specify/scripts/bash/check-prerequisites.sh --json` from repo root and parse JSON for FEATURE_DIR and AVAILABLE_DOCS list.
+   - All file paths must be absolute.
+   - For single quotes in args like "I'm Groot", use escape syntax: e.g. 'I'\''m Groot' (or double-quote if possible: "I'm Groot").
+
+2. **Clarify intent (dynamic)**: Derive up to THREE initial contextual clarifying questions (no pre-baked catalog). 
They MUST: + - Be generated from the user's phrasing + extracted signals from spec/plan/tasks + - Only ask about information that materially changes checklist content + - Be skipped individually if already unambiguous in `$ARGUMENTS` + - Prefer precision over breadth + + Generation algorithm: + 1. Extract signals: feature domain keywords (e.g., auth, latency, UX, API), risk indicators ("critical", "must", "compliance"), stakeholder hints ("QA", "review", "security team"), and explicit deliverables ("a11y", "rollback", "contracts"). + 2. Cluster signals into candidate focus areas (max 4) ranked by relevance. + 3. Identify probable audience & timing (author, reviewer, QA, release) if not explicit. + 4. Detect missing dimensions: scope breadth, depth/rigor, risk emphasis, exclusion boundaries, measurable acceptance criteria. + 5. Formulate questions chosen from these archetypes: + - Scope refinement (e.g., "Should this include integration touchpoints with X and Y or stay limited to local module correctness?") + - Risk prioritization (e.g., "Which of these potential risk areas should receive mandatory gating checks?") + - Depth calibration (e.g., "Is this a lightweight pre-commit sanity list or a formal release gate?") + - Audience framing (e.g., "Will this be used by the author only or peers during PR review?") + - Boundary exclusion (e.g., "Should we explicitly exclude performance tuning items this round?") + - Scenario class gap (e.g., "No recovery flows detected—are rollback / partial failure paths in scope?") + + Question formatting rules: + - If presenting options, generate a compact table with columns: Option | Candidate | Why It Matters + - Limit to A–E options maximum; omit table if a free-form answer is clearer + - Never ask the user to restate what they already said + - Avoid speculative categories (no hallucination). If uncertain, ask explicitly: "Confirm whether X belongs in scope." 
+ + Defaults when interaction impossible: + - Depth: Standard + - Audience: Reviewer (PR) if code-related; Author otherwise + - Focus: Top 2 relevance clusters + + Output the questions (label Q1/Q2/Q3). After answers: if ≥2 scenario classes (Alternate / Exception / Recovery / Non-Functional domain) remain unclear, you MAY ask up to TWO more targeted follow‑ups (Q4/Q5) with a one-line justification each (e.g., "Unresolved recovery path risk"). Do not exceed five total questions. Skip escalation if user explicitly declines more. + +3. **Understand user request**: Combine `$ARGUMENTS` + clarifying answers: + - Derive checklist theme (e.g., security, review, deploy, ux) + - Consolidate explicit must-have items mentioned by user + - Map focus selections to category scaffolding + - Infer any missing context from spec/plan/tasks (do NOT hallucinate) + +4. **Load feature context**: Read from FEATURE_DIR: + - spec.md: Feature requirements and scope + - plan.md (if exists): Technical details, dependencies + - tasks.md (if exists): Implementation tasks + + **Context Loading Strategy**: + - Load only necessary portions relevant to active focus areas (avoid full-file dumping) + - Prefer summarizing long sections into concise scenario/requirement bullets + - Use progressive disclosure: add follow-on retrieval only if gaps detected + - If source docs are large, generate interim summary items instead of embedding raw text + +5. 
**Generate checklist** - Create "Unit Tests for Requirements": + - Create `FEATURE_DIR/checklists/` directory if it doesn't exist + - Generate unique checklist filename: + - Use short, descriptive name based on domain (e.g., `ux.md`, `api.md`, `security.md`) + - Format: `[domain].md` + - File handling behavior: + - If file does NOT exist: Create new file and number items starting from CHK001 + - If file exists: Append new items to existing file, continuing from the last CHK ID (e.g., if last item is CHK015, start new items at CHK016) + - Never delete or replace existing checklist content - always preserve and append + + **CORE PRINCIPLE - Test the Requirements, Not the Implementation**: + Every checklist item MUST evaluate the REQUIREMENTS THEMSELVES for: + - **Completeness**: Are all necessary requirements present? + - **Clarity**: Are requirements unambiguous and specific? + - **Consistency**: Do requirements align with each other? + - **Measurability**: Can requirements be objectively verified? + - **Coverage**: Are all scenarios/edge cases addressed? + + **Category Structure** - Group items by requirement quality dimensions: + - **Requirement Completeness** (Are all necessary requirements documented?) + - **Requirement Clarity** (Are requirements specific and unambiguous?) + - **Requirement Consistency** (Do requirements align without conflicts?) + - **Acceptance Criteria Quality** (Are success criteria measurable?) + - **Scenario Coverage** (Are all flows/cases addressed?) + - **Edge Case Coverage** (Are boundary conditions defined?) + - **Non-Functional Requirements** (Performance, Security, Accessibility, etc. - are they specified?) + - **Dependencies & Assumptions** (Are they documented and validated?) + - **Ambiguities & Conflicts** (What needs clarification?) 
+ + **HOW TO WRITE CHECKLIST ITEMS - "Unit Tests for English"**: + + ❌ **WRONG** (Testing implementation): + - "Verify landing page displays 3 episode cards" + - "Test hover states work on desktop" + - "Confirm logo click navigates home" + + ✅ **CORRECT** (Testing requirements quality): + - "Are the exact number and layout of featured episodes specified?" [Completeness] + - "Is 'prominent display' quantified with specific sizing/positioning?" [Clarity] + - "Are hover state requirements consistent across all interactive elements?" [Consistency] + - "Are keyboard navigation requirements defined for all interactive UI?" [Coverage] + - "Is the fallback behavior specified when logo image fails to load?" [Edge Cases] + - "Are loading states defined for asynchronous episode data?" [Completeness] + - "Does the spec define visual hierarchy for competing UI elements?" [Clarity] + + **ITEM STRUCTURE**: + Each item should follow this pattern: + - Question format asking about requirement quality + - Focus on what's WRITTEN (or not written) in the spec/plan + - Include quality dimension in brackets [Completeness/Clarity/Consistency/etc.] + - Reference spec section `[Spec §X.Y]` when checking existing requirements + - Use `[Gap]` marker when checking for missing requirements + + **EXAMPLES BY QUALITY DIMENSION**: + + Completeness: + - "Are error handling requirements defined for all API failure modes? [Gap]" + - "Are accessibility requirements specified for all interactive elements? [Completeness]" + - "Are mobile breakpoint requirements defined for responsive layouts? [Gap]" + + Clarity: + - "Is 'fast loading' quantified with specific timing thresholds? [Clarity, Spec §NFR-2]" + - "Are 'related episodes' selection criteria explicitly defined? [Clarity, Spec §FR-5]" + - "Is 'prominent' defined with measurable visual properties? [Ambiguity, Spec §FR-4]" + + Consistency: + - "Do navigation requirements align across all pages? 
[Consistency, Spec §FR-10]" + - "Are card component requirements consistent between landing and detail pages? [Consistency]" + + Coverage: + - "Are requirements defined for zero-state scenarios (no episodes)? [Coverage, Edge Case]" + - "Are concurrent user interaction scenarios addressed? [Coverage, Gap]" + - "Are requirements specified for partial data loading failures? [Coverage, Exception Flow]" + + Measurability: + - "Are visual hierarchy requirements measurable/testable? [Acceptance Criteria, Spec §FR-1]" + - "Can 'balanced visual weight' be objectively verified? [Measurability, Spec §FR-2]" + + **Scenario Classification & Coverage** (Requirements Quality Focus): + - Check if requirements exist for: Primary, Alternate, Exception/Error, Recovery, Non-Functional scenarios + - For each scenario class, ask: "Are [scenario type] requirements complete, clear, and consistent?" + - If scenario class missing: "Are [scenario type] requirements intentionally excluded or missing? [Gap]" + - Include resilience/rollback when state mutation occurs: "Are rollback requirements defined for migration failures? [Gap]" + + **Traceability Requirements**: + - MINIMUM: ≥80% of items MUST include at least one traceability reference + - Each item should reference: spec section `[Spec §X.Y]`, or use markers: `[Gap]`, `[Ambiguity]`, `[Conflict]`, `[Assumption]` + - If no ID system exists: "Is a requirement & acceptance criteria ID scheme established? [Traceability]" + + **Surface & Resolve Issues** (Requirements Quality Problems): + Ask questions about the requirements themselves: + - Ambiguities: "Is the term 'fast' quantified with specific metrics? [Ambiguity, Spec §NFR-1]" + - Conflicts: "Do navigation requirements conflict between §FR-10 and §FR-10a? [Conflict]" + - Assumptions: "Is the assumption of 'always available podcast API' validated? [Assumption]" + - Dependencies: "Are external podcast API requirements documented? 
[Dependency, Gap]" + - Missing definitions: "Is 'visual hierarchy' defined with measurable criteria? [Gap]" + + **Content Consolidation**: + - Soft cap: If raw candidate items > 40, prioritize by risk/impact + - Merge near-duplicates checking the same requirement aspect + - If >5 low-impact edge cases, create one item: "Are edge cases X, Y, Z addressed in requirements? [Coverage]" + + **🚫 ABSOLUTELY PROHIBITED** - These make it an implementation test, not a requirements test: + - ❌ Any item starting with "Verify", "Test", "Confirm", "Check" + implementation behavior + - ❌ References to code execution, user actions, system behavior + - ❌ "Displays correctly", "works properly", "functions as expected" + - ❌ "Click", "navigate", "render", "load", "execute" + - ❌ Test cases, test plans, QA procedures + - ❌ Implementation details (frameworks, APIs, algorithms) + + **✅ REQUIRED PATTERNS** - These test requirements quality: + - ✅ "Are [requirement type] defined/specified/documented for [scenario]?" + - ✅ "Is [vague term] quantified/clarified with specific criteria?" + - ✅ "Are requirements consistent between [section A] and [section B]?" + - ✅ "Can [requirement] be objectively measured/verified?" + - ✅ "Are [edge cases/scenarios] addressed in requirements?" + - ✅ "Does the spec define [missing aspect]?" + +6. **Structure Reference**: Generate the checklist following the canonical template in `.specify/templates/checklist-template.md` for title, meta section, category headings, and ID formatting. If template is unavailable, use: H1 title, purpose/created meta lines, `##` category sections containing `- [ ] CHK### ` lines with globally incrementing IDs starting at CHK001. + +7. **Report**: Output full path to checklist file, item count, and summarize whether the run created a new file or appended to an existing one. 
Summarize: + - Focus areas selected + - Depth level + - Actor/timing + - Any explicit user-specified must-have items incorporated + +**Important**: Each `/speckit.checklist` command invocation uses a short, descriptive checklist filename and either creates a new file or appends to an existing one. This allows: + +- Multiple checklists of different types (e.g., `ux.md`, `test.md`, `security.md`) +- Simple, memorable filenames that indicate checklist purpose +- Easy identification and navigation in the `checklists/` folder + +To avoid clutter, use descriptive types and clean up obsolete checklists when done. + +## Example Checklist Types & Sample Items + +**UX Requirements Quality:** `ux.md` + +Sample items (testing the requirements, NOT the implementation): + +- "Are visual hierarchy requirements defined with measurable criteria? [Clarity, Spec §FR-1]" +- "Is the number and positioning of UI elements explicitly specified? [Completeness, Spec §FR-1]" +- "Are interaction state requirements (hover, focus, active) consistently defined? [Consistency]" +- "Are accessibility requirements specified for all interactive elements? [Coverage, Gap]" +- "Is fallback behavior defined when images fail to load? [Edge Case, Gap]" +- "Can 'prominent display' be objectively measured? [Measurability, Spec §FR-4]" + +**API Requirements Quality:** `api.md` + +Sample items: + +- "Are error response formats specified for all failure scenarios? [Completeness]" +- "Are rate limiting requirements quantified with specific thresholds? [Clarity]" +- "Are authentication requirements consistent across all endpoints? [Consistency]" +- "Are retry/timeout requirements defined for external dependencies? [Coverage, Gap]" +- "Is versioning strategy documented in requirements? [Gap]" + +**Performance Requirements Quality:** `performance.md` + +Sample items: + +- "Are performance requirements quantified with specific metrics? [Clarity]" +- "Are performance targets defined for all critical user journeys? 
[Coverage]" +- "Are performance requirements under different load conditions specified? [Completeness]" +- "Can performance requirements be objectively measured? [Measurability]" +- "Are degradation requirements defined for high-load scenarios? [Edge Case, Gap]" + +**Security Requirements Quality:** `security.md` + +Sample items: + +- "Are authentication requirements specified for all protected resources? [Coverage]" +- "Are data protection requirements defined for sensitive information? [Completeness]" +- "Is the threat model documented and requirements aligned to it? [Traceability]" +- "Are security requirements consistent with compliance obligations? [Consistency]" +- "Are security failure/breach response requirements defined? [Gap, Exception Flow]" + +## Anti-Examples: What NOT To Do + +**❌ WRONG - These test implementation, not requirements:** + +```markdown +- [ ] CHK001 - Verify landing page displays 3 episode cards [Spec §FR-001] +- [ ] CHK002 - Test hover states work correctly on desktop [Spec §FR-003] +- [ ] CHK003 - Confirm logo click navigates to home page [Spec §FR-010] +- [ ] CHK004 - Check that related episodes section shows 3-5 items [Spec §FR-005] +``` + +**✅ CORRECT - These test requirements quality:** + +```markdown +- [ ] CHK001 - Are the number and layout of featured episodes explicitly specified? [Completeness, Spec §FR-001] +- [ ] CHK002 - Are hover state requirements consistently defined for all interactive elements? [Consistency, Spec §FR-003] +- [ ] CHK003 - Are navigation requirements clear for all clickable brand elements? [Clarity, Spec §FR-010] +- [ ] CHK004 - Is the selection criteria for related episodes documented? [Gap, Spec §FR-005] +- [ ] CHK005 - Are loading state requirements defined for asynchronous episode data? [Gap] +- [ ] CHK006 - Can "visual hierarchy" requirements be objectively measured? 
[Measurability, Spec §FR-001]
+```
+
+**Key Differences:**
+
+- Wrong: Tests if the system works correctly
+- Correct: Tests if the requirements are written correctly
+- Wrong: Verification of behavior
+- Correct: Validation of requirement quality
+- Wrong: "Does it do X?"
+- Correct: "Is X clearly specified?"
diff --git a/.codex/prompts/speckit.clarify.md b/.codex/prompts/speckit.clarify.md
new file mode 100644
index 00000000..657439f0
--- /dev/null
+++ b/.codex/prompts/speckit.clarify.md
@@ -0,0 +1,181 @@
+---
+description: Identify underspecified areas in the current feature spec by asking up to 5 highly targeted clarification questions and encoding answers back into the spec.
+handoffs:
+  - label: Build Technical Plan
+    agent: speckit.plan
+    prompt: Create a plan for the spec. I am building with...
+---
+
+## User Input
+
+```text
+$ARGUMENTS
+```
+
+You **MUST** consider the user input before proceeding (if not empty).
+
+## Outline
+
+Goal: Detect and reduce ambiguity or missing decision points in the active feature specification and record the clarifications directly in the spec file.
+
+Note: This clarification workflow is expected to run (and be completed) BEFORE invoking `/speckit.plan`. If the user explicitly states they are skipping clarification (e.g., exploratory spike), you may proceed, but must warn that downstream rework risk increases.
+
+Execution steps:
+
+1. Run `.specify/scripts/bash/check-prerequisites.sh --json --paths-only` from repo root **once** (combined `--json --paths-only` mode / `-Json -PathsOnly`). Parse minimal JSON payload fields:
+   - `FEATURE_DIR`
+   - `FEATURE_SPEC`
+   - (Optionally capture `IMPL_PLAN`, `TASKS` for future chained flows.)
+   - If JSON parsing fails, abort and instruct the user to re-run `/speckit.specify` or verify the feature branch environment.
+   - For single quotes in args like "I'm Groot", use escape syntax: e.g. 'I'\''m Groot' (or double-quote if possible: "I'm Groot").
+
+2. Load the current spec file. 
Perform a structured ambiguity & coverage scan using this taxonomy. For each category, mark status: Clear / Partial / Missing. Produce an internal coverage map used for prioritization (do not output raw map unless no questions will be asked). + + Functional Scope & Behavior: + - Core user goals & success criteria + - Explicit out-of-scope declarations + - User roles / personas differentiation + + Domain & Data Model: + - Entities, attributes, relationships + - Identity & uniqueness rules + - Lifecycle/state transitions + - Data volume / scale assumptions + + Interaction & UX Flow: + - Critical user journeys / sequences + - Error/empty/loading states + - Accessibility or localization notes + + Non-Functional Quality Attributes: + - Performance (latency, throughput targets) + - Scalability (horizontal/vertical, limits) + - Reliability & availability (uptime, recovery expectations) + - Observability (logging, metrics, tracing signals) + - Security & privacy (authN/Z, data protection, threat assumptions) + - Compliance / regulatory constraints (if any) + + Integration & External Dependencies: + - External services/APIs and failure modes + - Data import/export formats + - Protocol/versioning assumptions + + Edge Cases & Failure Handling: + - Negative scenarios + - Rate limiting / throttling + - Conflict resolution (e.g., concurrent edits) + + Constraints & Tradeoffs: + - Technical constraints (language, storage, hosting) + - Explicit tradeoffs or rejected alternatives + + Terminology & Consistency: + - Canonical glossary terms + - Avoided synonyms / deprecated terms + + Completion Signals: + - Acceptance criteria testability + - Measurable Definition of Done style indicators + + Misc / Placeholders: + - TODO markers / unresolved decisions + - Ambiguous adjectives ("robust", "intuitive") lacking quantification + + For each category with Partial or Missing status, add a candidate question opportunity unless: + - Clarification would not materially change implementation or 
validation strategy + - Information is better deferred to planning phase (note internally) + +3. Generate (internally) a prioritized queue of candidate clarification questions (maximum 5). Do NOT output them all at once. Apply these constraints: + - Maximum of 5 total questions across the whole session. + - Each question must be answerable with EITHER: + - A short multiple‑choice selection (2–5 distinct, mutually exclusive options), OR + - A one-word / short‑phrase answer (explicitly constrain: "Answer in <=5 words"). + - Only include questions whose answers materially impact architecture, data modeling, task decomposition, test design, UX behavior, operational readiness, or compliance validation. + - Ensure category coverage balance: attempt to cover the highest impact unresolved categories first; avoid asking two low-impact questions when a single high-impact area (e.g., security posture) is unresolved. + - Exclude questions already answered, trivial stylistic preferences, or plan-level execution details (unless blocking correctness). + - Favor clarifications that reduce downstream rework risk or prevent misaligned acceptance tests. + - If more than 5 categories remain unresolved, select the top 5 by (Impact * Uncertainty) heuristic. + +4. Sequential questioning loop (interactive): + - Present EXACTLY ONE question at a time. + - For multiple‑choice questions: + - **Analyze all options** and determine the **most suitable option** based on: + - Best practices for the project type + - Common patterns in similar implementations + - Risk reduction (security, performance, maintainability) + - Alignment with any explicit project goals or constraints visible in the spec + - Present your **recommended option prominently** at the top with clear reasoning (1-2 sentences explaining why this is the best choice). + - Format as: `**Recommended:** Option [X] - ` + - Then render all options as a Markdown table: + + | Option | Description | + |--------|-------------| + | A |