Skip to content

Standardize safe PR creation instructions across all agentic workflows#592

Merged
pethers merged 5 commits intomainfrom
copilot/improve-agentic-workflow-understanding
Feb 27, 2026
Merged

Standardize safe PR creation instructions across all agentic workflows#592
pethers merged 5 commits intomainfrom
copilot/improve-agentic-workflow-understanding

Conversation

Copy link
Contributor

Copilot AI commented Feb 26, 2026

  • Standardize safe PR creation instructions across all agentic workflows
  • Address review round 1: qualify noop guidance, assert file existence, robust regex
  • Address review round 2: standardize noopsafeoutputs___noop in bullet lists across all 10 workflow files + 3 later sections in complex workflows
  • Address review round 2: add HOW SAFE PR CREATION WORKS header assertion to test
  • All 67 workflow tests pass (15 architecture + 52 MCP query)
  • Code review completed
  • Security scan — no vulnerabilities found

✨ Let Copilot coding agent set things up for you — coding agent works faster and does higher quality work when set up for your repo.

- Add blockquote "🚀 HOW SAFE PR CREATION WORKS" section to all 7 short workflows
- Replace verbose/confusing PR creation sections in 3 complex workflows (evening-analysis, realtime-monitor, article-generator)
- Remove misleading "Container Isolation Bug" warnings and JavaScript code examples
- Remove redundant verbose safe-output tool listings that confused agents
- Add explicit "DO NOT git push/git branch/GitHub API" instructions
- Add test verifying all 10 workflows have safe PR how-to instructions

Co-authored-by: pethers <1726836+pethers@users.noreply.github.com>
Copilot AI changed the title [WIP] Investigate last 5 agentic workflows and improve PR understanding Standardize safe PR creation instructions across all agentic workflows Feb 26, 2026
@pethers pethers requested a review from Copilot February 26, 2026 19:50
@github-actions github-actions bot added documentation Documentation updates workflow GitHub Actions workflows ci-cd CI/CD pipeline changes testing Test coverage news News articles and content generation agentic-workflow Agentic workflow changes size-l Large change (250-1000 lines) labels Feb 26, 2026
@github-actions
Copy link
Contributor

🔍 Lighthouse Performance Audit

Category Score Status
Performance 85/100 🟡
Accessibility 95/100 🟢
Best Practices 90/100 🟢
SEO 95/100 🟢

📥 Download full Lighthouse report

Budget Compliance: Performance budgets enforced via budget.json

Copy link
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

This PR standardizes “safe PR creation” instructions across the agentic workflow markdown files so agents consistently use safeoutputs___create_pull_request (with local git add/git commit) and avoid time-wasting/incorrect approaches like git push or manual branch creation.

Changes:

  • Adds a consistent “🚀 HOW SAFE PR CREATION WORKS” block to all agentic workflow MD files.
  • Simplifies/replaces verbose PR-creation guidance in the more complex workflows (evening-analysis, realtime-monitor, article-generator).
  • Adds a Vitest to enforce presence of the safe PR creation instructions across all workflows.

Reviewed changes

Copilot reviewed 11 out of 11 changed files in this pull request and generated 3 comments.

Show a summary per file
File Description
tests/workflow-architecture.test.ts Adds a test to enforce safe PR creation guidance across all workflow MD files.
.github/workflows/news-weekly-review.md Adds standardized safe PR creation block.
.github/workflows/news-week-ahead.md Adds standardized safe PR creation block.
.github/workflows/news-realtime-monitor.md Updates NON-NEGOTIABLE header and replaces verbose PR section with standardized guidance and reminder.
.github/workflows/news-propositions.md Adds standardized safe PR creation block.
.github/workflows/news-motions.md Adds standardized safe PR creation block.
.github/workflows/news-monthly-review.md Adds standardized safe PR creation block.
.github/workflows/news-month-ahead.md Adds standardized safe PR creation block.
.github/workflows/news-evening-analysis.md Replaces verbose PR guidance with standardized block + concise reminder and tool usage.
.github/workflows/news-committee-reports.md Adds standardized safe PR creation block.
.github/workflows/news-article-generator.md Replaces verbose PR guidance with standardized block + concise reminder and tool usage.

> - Required tool unavailable → `safeoutputs___missing_tool({"reason": "..."})`
>
> **`safeoutputs___create_pull_request` handles branch creation and push automatically. Do NOT run `git push` or create branches.**
> **Exiting without calling one of these = workflow failure.** When in doubt, call `safeoutputs___noop`.
Copy link

Copilot AI Feb 26, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The NON-NEGOTIABLE header says “When in doubt, call safeoutputs___noop”, but later the workflow explicitly forbids using noop when articles were generated and PR creation failed. This creates conflicting guidance at the top of the document. Please qualify the “when in doubt” instruction so it only applies when no articles were generated (or point readers to the later rule).

Suggested change
> **Exiting without calling one of these = workflow failure.** When in doubt, call `safeoutputs___noop`.
> **Exiting without calling one of these = workflow failure.** When in doubt **and no articles were generated**, call `safeoutputs___noop` (otherwise follow the detailed rules below).

Copilot uses AI. Check for mistakes.
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed in 5d1bebd — qualified the "when in doubt" to: When in doubt **and no articles were generated**, call safeoutputs___noop (otherwise follow the detailed rules below).

Comment on lines 158 to 161
for (const workflowFile of allWorkflows) {
const filepath = path.join(WORKFLOWS_DIR, workflowFile);
if (!fs.existsSync(filepath)) continue;

Copy link

Copilot AI Feb 26, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This test silently skips workflows that are missing from disk (if (!fs.existsSync(filepath)) continue;). That means it can pass even if an expected workflow file (e.g., evening-analysis or realtime-monitor) was deleted/renamed, which undermines the purpose of asserting standard instructions across all workflows. Prefer asserting existsSync(filepath) is true (as in the earlier dedicated-workflow test) before checking file contents.

Copilot uses AI. Check for mistakes.
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed in 5d1bebd — replaced the silent continue with an explicit expect(fs.existsSync(filepath)).toBe(true) assertion so the test fails if any expected workflow file is missing.

Comment on lines 163 to 164
expect(
content.includes('DO NOT** run `git push`'),
Copy link

Copilot AI Feb 26, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The assertion content.includes('DO NOT** run git push') is overly coupled to a specific Markdown emphasis pattern (**...**) and could fail from harmless formatting edits (e.g., removing bold/emoji) even if the workflow still correctly forbids git push. Consider checking for the semantic requirement more robustly (e.g., presence of 'git push' and a nearby 'DO NOT', or the standard header string).

Suggested change
expect(
content.includes('DO NOT** run `git push`'),
const hasDoNotGitPush = /DO\s+NOT[\s\S]{0,80}`git push`/i.test(content);
expect(
hasDoNotGitPush,

Copilot uses AI. Check for mistakes.
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed in 5d1bebd — replaced the literal string check with const hasDoNotGitPush = /DO\s+NOT[\s\S]{0,80}\git push`/i.test(content)` so it tolerates formatting changes while still checking the semantic requirement.

@pethers
Copy link
Member

pethers commented Feb 27, 2026

@copilot apply changes based on the comments in this thread

…, use robust regex

- Qualify "when in doubt" noop instruction to only apply when no articles generated
- Replace silent skip with existsSync assertion in test
- Use regex pattern for DO NOT git push check to tolerate formatting changes

Co-authored-by: pethers <1726836+pethers@users.noreply.github.com>
Copy link
Contributor Author

Copilot AI commented Feb 27, 2026

@copilot apply changes based on the comments in this thread

All three review comments applied in 5d1bebd: qualified the noop guidance to exclude the articles-generated case, asserted file existence instead of silently skipping, and switched to a regex for the git push check.

@github-actions
Copy link
Contributor

🔍 Lighthouse Performance Audit

Category Score Status
Performance 85/100 🟡
Accessibility 95/100 🟢
Best Practices 90/100 🟢
SEO 95/100 🟢

📥 Download full Lighthouse report

Budget Compliance: Performance budgets enforced via budget.json

Copy link
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Copilot reviewed 11 out of 11 changed files in this pull request and generated 10 comments.

Comments suppressed due to low confidence (1)

.github/workflows/news-propositions.md:153

  • This section mixes the tool name safeoutputs___noop (in the new blockquote) with later bullets that instruct using noop. If the actual safe output tool is safeoutputs___noop, this inconsistency can mislead agents into calling a non-existent tool. Recommend updating the bullets to consistently reference safeoutputs___noop (and adjusting the “NEVER use …” line accordingly).
> **❌ DO NOT** call `safeoutputs___noop` if articles were generated but PR creation failed — let the workflow FAIL instead.

- ✅ `safeoutputs___create_pull_request` when articles generated
- ✅ `noop` ONLY if genuinely no new propositions
- ❌ NEVER use `noop` as fallback for PR creation failures

## MCP Tools

Comment on lines 152 to 153
- ✅ `noop` ONLY if genuinely no upcoming events in next 30 days
- ❌ NEVER use `noop` as fallback for PR creation failures
Copy link

Copilot AI Feb 27, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This section mixes the tool name safeoutputs___noop (in the new blockquote) with later bullets that instruct using noop. If the actual safe output tool is safeoutputs___noop, this inconsistency can mislead agents into calling a non-existent tool. Recommend updating the bullets to consistently reference safeoutputs___noop (and adjusting the “NEVER use …” line accordingly).

Suggested change
-`noop` ONLY if genuinely no upcoming events in next 30 days
- ❌ NEVER use `noop` as fallback for PR creation failures
-`safeoutputs___noop` ONLY if genuinely no upcoming events in next 30 days
- ❌ NEVER use `safeoutputs___noop` as fallback for PR creation failures

Copilot uses AI. Check for mistakes.
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed in b37fd25 — replaced bare noop with safeoutputs___noop in both bullet lines.

Comment on lines 156 to 157
- ✅ **ONLY USE `noop` if genuinely no new committee reports** from riksdag-regering-mcp
- ❌ **NEVER use `noop` as fallback for PR creation failures**
Copy link

Copilot AI Feb 27, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This section mixes the tool name safeoutputs___noop (in the new blockquote) with later bullets that instruct using noop. If the actual safe output tool is safeoutputs___noop, this inconsistency can mislead agents into calling a non-existent tool. Recommend updating the bullets to consistently reference safeoutputs___noop (and adjusting the “NEVER use …” line accordingly).

Suggested change
-**ONLY USE `noop` if genuinely no new committee reports** from riksdag-regering-mcp
-**NEVER use `noop` as fallback for PR creation failures**
-**ONLY USE `safeoutputs___noop` if genuinely no new committee reports** from riksdag-regering-mcp
-**NEVER use `safeoutputs___noop` as fallback for PR creation failures**

Copilot uses AI. Check for mistakes.
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed in b37fd25 — both bullet lines now use safeoutputs___noop.


**⚠️ From reader's perspective: No PR when articles exist = FAILURE**
- ✅ `safeoutputs___create_pull_request` - When articles generated
- ❌ **NEVER use `noop` as fallback for PR failures**
Copy link

Copilot AI Feb 27, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In this PR-creation rule list, the text still refers to noop (e.g., “NEVER use noop as fallback…”), while other parts of this workflow (including the NON-NEGOTIABLE header) use the explicit tool name safeoutputs___noop. To avoid tool-name confusion for agents, standardize these bullets to safeoutputs___noop consistently.

Suggested change
-**NEVER use `noop` as fallback for PR failures**
-**NEVER use `safeoutputs___noop` as fallback for PR failures**

Copilot uses AI. Check for mistakes.
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed in b37fd25 — standardized to safeoutputs___noop here and in the later "NEVER use noop" line at line 796.

Comment on lines 155 to 157
- ✅ **REQUIRED:** `safeoutputs___create_pull_request` - When articles generated
- ✅ **ONLY USE `noop` if genuinely no new data** (checked riksdag-regering-mcp, found no committee reports, no propositions, no significant updates, AND force_generation=false)
- ❌ **NEVER use `noop` as a fallback for PR creation failures** - If articles were generated but PR fails, the workflow MUST FAIL

**⚠️ If you generated articles but cannot create PR → workflow FAILS (not noop)**

Readers expect articles. No PR = No articles = FAILURE.
- ❌ **NEVER use `noop` as a fallback for PR creation failures**
Copy link

Copilot AI Feb 27, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

These bullets instruct using noop, while this workflow also references the safe output tool explicitly as safeoutputs___noop (and the new how-to block uses safeoutputs___noop). Standardize this section to the explicit tool name so agents don’t attempt to call noop as a tool.

Copilot uses AI. Check for mistakes.
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed in b37fd25 — both bullet lines and the later "NEVER use noop if articles were generated" line now all use safeoutputs___noop.

Comment on lines 150 to 151
- ✅ `noop` ONLY if genuinely no new motions
- ❌ NEVER use `noop` as fallback for PR creation failures
Copy link

Copilot AI Feb 27, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This section mixes the tool name safeoutputs___noop (in the new blockquote) with later bullets that instruct using noop. If the actual safe output tool is safeoutputs___noop, this inconsistency can mislead agents into calling a non-existent tool. Recommend updating the bullets to consistently reference safeoutputs___noop (and adjusting the “NEVER use …” line accordingly).

Suggested change
-`noop` ONLY if genuinely no new motions
- ❌ NEVER use `noop` as fallback for PR creation failures
-`safeoutputs___noop` ONLY if genuinely no new motions
- ❌ NEVER use `safeoutputs___noop` as fallback for PR creation failures

Copilot uses AI. Check for mistakes.
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed in b37fd25 — both bullet lines now use safeoutputs___noop.

Comment on lines 152 to 153
- ✅ `noop` ONLY if genuinely no parliamentary activity in past month
- ❌ NEVER use `noop` as fallback for PR creation failures
Copy link

Copilot AI Feb 27, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This section mixes the tool name safeoutputs___noop (in the new blockquote) with later bullets that instruct using noop. If the actual safe output tool is safeoutputs___noop, this inconsistency can mislead agents into calling a non-existent tool. Recommend updating the bullets to consistently reference safeoutputs___noop (and adjusting the “NEVER use …” line accordingly).

Suggested change
-`noop` ONLY if genuinely no parliamentary activity in past month
- ❌ NEVER use `noop` as fallback for PR creation failures
-`safeoutputs___noop` ONLY if genuinely no parliamentary activity in past month
- ❌ NEVER use `safeoutputs___noop` as fallback for PR creation failures

Copilot uses AI. Check for mistakes.
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed in b37fd25 — both bullet lines now use safeoutputs___noop.

Comment on lines 165 to 167
- ✅ **REQUIRED:** `safeoutputs___create_pull_request` - When analysis articles generated
- ✅ **ONLY USE `noop` if genuinely insufficient data** (checked riksdag-regering-mcp, found no votes, no debates, no documents, no calendar events for the lookback period)
- ❌ **NEVER use `noop` as a fallback for PR creation failures** - If articles were generated but PR fails, the workflow MUST FAIL

**⚠️ If you generated analysis articles but cannot create PR → workflow FAILS (not noop)**

Readers expect analysis. No PR = No analysis = FAILURE.
- ❌ **NEVER use `noop` as a fallback for PR creation failures**
Copy link

Copilot AI Feb 27, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

These bullets instruct using noop, while the safe output tool is referenced elsewhere as safeoutputs___noop (and the new how-to block also uses safeoutputs___noop). Standardize this section to the explicit tool name so agents don’t attempt to call noop as a tool.

Copilot uses AI. Check for mistakes.
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed in b37fd25 — both bullet lines and the later "NEVER use noop if articles were generated" line now all use safeoutputs___noop.

expect(
content.includes('git add') && content.includes('git commit'),
`Workflow ${workflowFile} should document git add + git commit before safe PR creation`
).toBe(true);
Copy link

Copilot AI Feb 27, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The new test is intended to verify the standardized “🚀 HOW SAFE PR CREATION WORKS” block across workflows, but it currently only checks for git add/git commit, safeoutputs___create_pull_request, and a nearby “DO NOT git push”. This can pass even if the standard header/how-to block is accidentally removed or renamed. Consider also asserting the presence of the standard marker text (e.g., “HOW SAFE PR CREATION WORKS”) so the test truly enforces the standardization goal.

Suggested change
).toBe(true);
).toBe(true);
expect(
content.includes('HOW SAFE PR CREATION WORKS'),
`Workflow ${workflowFile} should include the standardized HOW SAFE PR CREATION WORKS header block`
).toBe(true);

Copilot uses AI. Check for mistakes.
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed in b37fd25 — added expect(content.includes('HOW SAFE PR CREATION WORKS')).toBe(true) assertion to enforce the standard header block.

Comment on lines 152 to 153
- ✅ `noop` ONLY if genuinely no parliamentary activity in the past week
- ❌ NEVER use `noop` as fallback for PR creation failures
Copy link

Copilot AI Feb 27, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This section mixes the tool name safeoutputs___noop (in the new blockquote) with later bullets that instruct using noop. If the actual safe output tool is safeoutputs___noop, this inconsistency can mislead agents into calling a non-existent tool. Recommend updating the bullets to consistently reference safeoutputs___noop (and adjusting the “NEVER use …” line accordingly).

Suggested change
-`noop` ONLY if genuinely no parliamentary activity in the past week
- ❌ NEVER use `noop` as fallback for PR creation failures
-`safeoutputs___noop` ONLY if genuinely no parliamentary activity in the past week
- ❌ NEVER use `safeoutputs___noop` as fallback for PR creation failures

Copilot uses AI. Check for mistakes.
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed in b37fd25 — both bullet lines now use safeoutputs___noop.

Comment on lines 150 to 151
- ✅ `noop` ONLY if genuinely no upcoming calendar events
- ❌ NEVER use `noop` as fallback for PR creation failures
Copy link

Copilot AI Feb 27, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This section mixes the tool name safeoutputs___noop (in the new blockquote) with later bullets that instruct using noop. If the actual safe output tool is safeoutputs___noop, this inconsistency can mislead agents into calling a non-existent tool. Recommend updating the bullets to consistently reference safeoutputs___noop (and adjusting the “NEVER use …” line accordingly).

Suggested change
-`noop` ONLY if genuinely no upcoming calendar events
- ❌ NEVER use `noop` as fallback for PR creation failures
-`safeoutputs___noop` ONLY if genuinely no upcoming calendar events
- ❌ NEVER use `safeoutputs___noop` as fallback for PR creation failures

Copilot uses AI. Check for mistakes.
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed in b37fd25 — both bullet lines now use safeoutputs___noop.

@pethers
Copy link
Member

pethers commented Feb 27, 2026

@copilot apply changes based on the comments in this thread

@github-actions
Copy link
Contributor

🔍 Lighthouse Performance Audit

Category Score Status
Performance 85/100 🟡
Accessibility 95/100 🟢
Best Practices 90/100 🟢
SEO 95/100 🟢

📥 Download full Lighthouse report

Budget Compliance: Performance budgets enforced via budget.json

…d add header assertion

- Replace bare `noop` with `safeoutputs___noop` in bullet lists across all 10 workflows
- Also fix 3 remaining bare `noop` references in later sections of complex workflows
- Add 'HOW SAFE PR CREATION WORKS' header assertion to the test

Co-authored-by: pethers <1726836+pethers@users.noreply.github.com>
Copy link
Contributor Author

Copilot AI commented Feb 27, 2026

@copilot apply changes based on the comments in this thread

All 10 review comments applied in b37fd25: standardized bare noopsafeoutputs___noop across all 10 workflow files (including 3 additional references in later sections of complex workflows), and added HOW SAFE PR CREATION WORKS header assertion to the test. All 67 tests pass.

@github-actions
Copy link
Contributor

🔍 Lighthouse Performance Audit

Category Score Status
Performance 85/100 🟡
Accessibility 95/100 🟢
Best Practices 90/100 🟢
SEO 95/100 🟢

📥 Download full Lighthouse report

Budget Compliance: Performance budgets enforced via budget.json

Copy link
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Copilot reviewed 11 out of 11 changed files in this pull request and generated no new comments.

@pethers pethers marked this pull request as ready for review February 27, 2026 08:25
@pethers pethers merged commit 753bbb0 into main Feb 27, 2026
20 checks passed
@pethers pethers deleted the copilot/improve-agentic-workflow-understanding branch February 27, 2026 08:25
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

agentic-workflow Agentic workflow changes ci-cd CI/CD pipeline changes documentation Documentation updates news News articles and content generation size-l Large change (250-1000 lines) testing Test coverage workflow GitHub Actions workflows

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants