
fix(desktop): reduce group chat task false positives #7538

Open
waffensam wants to merge 2 commits into BasedHardware:main from waffensam:codex/desktop-task-group-channel-filter

@waffensam
Contributor

Bot fit:

  • For desktop: yes
  • Bug fix: yes

Summary:

  • Tightens task extraction prompt for public/group channels.
  • Requires visible direct user involvement before extracting a task from community channels.
  • Adds a regression test asserting that the default prompt includes the no_task_found guard.

Tests:

  • git diff --check origin/main..HEAD
  • xcrun swift test --package-path Desktop --filter TaskAssistantPromptTests

Notes:

  • The focused test passes; build/test output still includes existing upstream Swift 6/deprecation warnings and a libwebp link-target warning.

@greptile-apps
Contributor

greptile-apps Bot commented May 29, 2026

Greptile Summary

This PR tightens the LLM task-extraction prompt to reduce false positives in public and group channels (Discord, Slack, Teams, etc.). It adds an explicit CRITICAL FOR PUBLIC/GROUP CHANNELS guard that requires visible direct-user involvement before a task is extracted, and backs it with a new XCTest regression suite.

  • TaskAssistantSettings.swift: Inserts an 8-line instruction block immediately after the Pattern 2 section; it instructs the model to call no_task_found whenever the user is merely observing a community channel and none of the three direct-involvement signals (explicit @mention, active prior reply, or DM context) is detectable.
  • TaskAssistantPromptTests.swift: Adds TaskAssistantPromptTests with a single @MainActor test that asserts four key substrings of the new block are present in defaultAnalysisPrompt; three of the four assertions are unique to the new section, giving meaningful regression coverage.

Confidence Score: 4/5

Safe to merge — the change is additive prompt text with no Swift logic modifications; the main risk is minor prompt wording ambiguity in the new section.

The new section introduces one condition ("It is a direct message (DM) thread") that is logically redundant within its own heading, which could create subtle model confusion. One of the four test assertions also checks a string that already existed in the prompt before this PR, so that assertion does not guard the new text.

The new CRITICAL FOR PUBLIC/GROUP CHANNELS block in TaskAssistantSettings.swift (condition 2) and the redundant assertion in TaskAssistantPromptTests.swift (line 11) deserve a second look before merging.

Important Files Changed

| Filename | Overview |
|---|---|
| desktop/Desktop/Sources/ProactiveAssistants/Assistants/TaskExtraction/TaskAssistantSettings.swift | Adds an 8-line CRITICAL FOR PUBLIC/GROUP CHANNELS paragraph to the default LLM prompt, restricting task extraction to cases where the user is demonstrably involved; no logic changes to Swift code itself. |
| desktop/Desktop/Tests/TaskAssistantPromptTests.swift | New XCTest file with a single test that asserts four substrings are present in the default prompt; one of the four assertions is redundant because the checked string pre-existed in the prompt before this PR. |

Flowchart

%%{init: {'theme': 'neutral'}}%%
flowchart TD
    A[Screenshot captured] --> B{Conversation visible?}
    B -- No --> Z[no_task_found]
    B -- Yes --> C{Latest exchange pattern?}
    C -- Pattern 1: User agreed --> D[extract_task]
    C -- Pattern 2: Unaddressed request --> E{Public / Group channel?}
    C -- No actionable pattern --> Z
    E -- No, it is a DM --> D
    E -- Yes --> F{Direct involvement evidence?}
    F -- "@mentions user by name/handle" --> D
    F -- "User already replied in thread" --> D
    F -- "Cannot determine" --> Z
    F -- "Merely observing" --> Z
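The flowchart above can also be read as a small decision function. The following is a minimal sketch only: the PR changes prompt text and adds no Swift logic, so every type and function name here (`ExchangePattern`, `ChannelKind`, `Involvement`, `Decision`, `decide`) is hypothetical and exists purely to make the branching explicit.

```swift
// Hypothetical types for illustration; none of these names exist in the PR,
// which only edits the prompt string.
enum ExchangePattern {
    case userAgreed          // Pattern 1
    case unaddressedRequest  // Pattern 2
    case none                // no actionable pattern
}

enum ChannelKind {
    case directMessage
    case publicOrGroup
}

enum Involvement {
    case mentioned       // message @mentions the user by name or handle
    case activeInThread  // user already replied in the same thread
    case observing       // user is merely reading the channel
    case unknown         // cannot tell whether the request targets the user
}

enum Decision: Equatable {
    case extractTask
    case noTaskFound
}

// Mirrors the flowchart: the DM check happens before any
// group-channel involvement check.
func decide(conversationVisible: Bool,
            pattern: ExchangePattern,
            channel: ChannelKind,
            involvement: Involvement) -> Decision {
    guard conversationVisible else { return .noTaskFound }
    switch pattern {
    case .none:
        return .noTaskFound
    case .userAgreed:
        return .extractTask
    case .unaddressedRequest:
        if channel == .directMessage { return .extractTask }
        switch involvement {
        case .mentioned, .activeInThread:
            return .extractTask
        case .observing, .unknown:
            return .noTaskFound
        }
    }
}
```

In this reading, an unaddressed request in a public channel yields `extractTask` only for `.mentioned` or `.activeInThread`; the ambiguous `.unknown` case falls through to `noTaskFound`, matching the "Cannot determine" edge.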

Reviews (1): Last reviewed commit: "fix(desktop): reduce group-channel task ..."


XCTAssertTrue(prompt.contains("CRITICAL FOR PUBLIC/GROUP CHANNELS"))
XCTAssertTrue(prompt.contains("visible evidence shows the user is directly involved"))
XCTAssertTrue(prompt.contains("call no_task_found"))

P2 Redundant assertion doesn't cover the new section

"call no_task_found" already appears in the pre-existing MANDATORY WORKFLOW step 2 ("→ call no_task_found immediately"), so this assertion passes even if the entire new CRITICAL FOR PUBLIC/GROUP CHANNELS block is deleted from the prompt. The other three assertions (lines 9, 10, 12) are unique to the new section and do provide real regression coverage; this one adds nothing. Consider replacing it with a substring that is unique to the new rule, such as "merely observing a public channel", or simply remove it since "community at large" is already asserted on line 12.

Note: If this suggestion doesn't match your team's coding style, reply to this and let me know. I'll remember it for next time!
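To illustrate the point about the redundant assertion, here is a minimal sketch of how one might check whether a substring actually guards the new block. The helper `guardsNewBlock` is hypothetical and not part of the PR, and the two strings are abbreviated stand-ins built from lines quoted in this review, not the full prompt.

```swift
import Foundation

// Hypothetical helper, not part of the PR: a needle guards the new block only
// if it appears in that block and nowhere in the prompt that predates it.
func guardsNewBlock(_ needle: String, basePrompt: String, newBlock: String) -> Bool {
    return newBlock.contains(needle) && !basePrompt.contains(needle)
}

// Abbreviated stand-ins: the workflow line quoted in the comment above,
// and two lines of the new block.
let basePrompt = "→ call no_task_found immediately"
let newBlock = """
CRITICAL FOR PUBLIC/GROUP CHANNELS:
If the user is merely observing a public channel, or if you cannot tell whether the request is directed at them, call no_task_found.
"""

// "call no_task_found" also occurs in the pre-existing text, so it guards nothing new.
let redundantNeedle = guardsNewBlock("call no_task_found",
                                     basePrompt: basePrompt, newBlock: newBlock)

// This phrase occurs only in the new block, so deleting the block would fail the check.
let uniqueNeedle = guardsNewBlock("merely observing a public channel",
                                  basePrompt: basePrompt, newBlock: newBlock)
```

With these stand-ins, `redundantNeedle` is `false` and `uniqueNeedle` is `true`, which is the distinction the comment draws between the assertion on line 11 and its neighbours.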

Comment on lines +173 to +179
CRITICAL FOR PUBLIC/GROUP CHANNELS:
In Discord, Slack, Teams, community chats, and other public/group channels, extract ONLY when the visible evidence shows the user is directly involved:
- The message explicitly @mentions the user by name or handle
- It is a direct message (DM) thread, not a public or community channel
- The user has already replied in the same thread and is an active participant
If the user is merely observing a public channel, or if you cannot tell whether the request is directed at them, call no_task_found.
Do NOT extract tasks from broad bug reports, feature requests, or questions posted to the community at large.

P2 Condition 2 is logically redundant within the section scope

The section header is CRITICAL FOR PUBLIC/GROUP CHANNELS, so a DM thread is already outside the scope this section addresses. Listing "It is a direct message (DM) thread, not a public or community channel" as one of the extraction-allowed conditions may confuse the model: it reads as if you need to first be in a public/group channel AND simultaneously be in a DM thread. In practice this condition belongs in the parent decision tree (before the section is even reached) rather than as a peer of the @mention and active-participant conditions. Removing or relocating it would reduce ambiguity, especially since an LLM reading the bullet list may treat the three items as an OR set whose second member contradicts the section heading.
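One way the relocation could look is sketched below. This is an assumption about a possible rewording, not the text the PR ships: the scope sentence about DM threads and the names `involvementSignals`, `signalBullets`, and `groupChannelRule` are invented for illustration, while the remaining wording is copied from the block under review.

```swift
import Foundation

// Only signals that can occur inside a public/group channel remain as bullets.
let involvementSignals = [
    "The message explicitly @mentions the user by name or handle",
    "The user has already replied in the same thread and is an active participant",
]

let signalBullets = involvementSignals.map { "- " + $0 }.joined(separator: "\n")

// Hypothetical rewording: the DM case is stated once as scope (second line)
// instead of appearing as a peer of the two involvement signals.
let groupChannelRule = """
CRITICAL FOR PUBLIC/GROUP CHANNELS:
This rule does not apply to direct message (DM) threads.
In Discord, Slack, Teams, community chats, and other public/group channels, extract ONLY when the visible evidence shows the user is directly involved:
\(signalBullets)
If the user is merely observing a public channel, or if you cannot tell whether the request is directed at them, call no_task_found.
Do NOT extract tasks from broad bug reports, feature requests, or questions posted to the community at large.
"""
```

Keeping the bullets in an array makes the OR set explicit and leaves the substrings asserted by the existing test ("CRITICAL FOR PUBLIC/GROUP CHANNELS", "visible evidence shows the user is directly involved", "community at large") unchanged.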

Collaborator

@kodjima33 left a comment


thanks for tightening the group-chat task prompt + the regression test
