…tering
Slackbot was responding repeatedly to the same user message in two cases:
- Slack redelivered the same Events API event (Ack timeout, ws reconnect)
- The bot's own reply, including a stray mention, retriggered processing through thread history reseeding
Defenses added:
- eventDedup: TTL-bounded, size-bounded, concurrent-safe seen-key tracker
- Skip events authored by any bot (BotID set, empty user, self user, or matching our own bot_id) at both RTM (Bot) and Socket Mode (SocketBot) transports, plus a Processor-level safety net
- Dedup keys: Events API event_id with channel:ts fallback for AppMention and DM message events; ClientMsgID with channel:ts fallback for RTM
- Log retry deliveries (retry_attempt, retry_reason) for diagnosability
- ThreadContextBuilder now skips bot-authored history entirely instead of truncating; tests updated to assert the new semantics
- SocketBot only forces thread TS when threading is enabled, matching the documented enable_thread option
Pull Request
Summary
Slackbot was replying repeatedly to the same user message. This PR closes two distinct loop sources at the transport layer and a third at the thread-context layer.
Type of Change
Changes Made
- `internal/slackbot/dedup.go`: TTL-bounded, size-bounded, concurrency-safe `eventDedup` keyed on Slack `event_id` (with `channel:ts` fallback) and on RTM `client_msg_id` (with `channel:ts` fallback).
- `SocketBot` and `Bot` now reject events authored by any bot (`BotID != ""`, missing `user`, our own `botUserID`, or our own `bot_id`) before they reach the processor. `Processor.ProcessMessage` keeps the same guard as a safety net.
- Retry deliveries are logged with `retry_attempt` and `retry_reason` for observability.
- `ThreadContextBuilder` skips bot-authored history entries entirely instead of truncating them, so the bot's own past replies cannot reseed a new query. Tests renamed from `TruncatesBotMessages` to `SkipsBotMessages` and updated.
- `SocketBot` now only forces `MsgOptionTS(threadTS)` when threading is enabled, matching the documented `enable_thread` option.
Motivation and Context
Two reproducible loops were observed in production:
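- Slack redelivered the same `event_id` (Ack timeout, websocket reconnect, overlapping subscriptions). Each delivery produced a new reply.
- The bot's own reply, including a stray mention, retriggered processing through thread history reseeding.
The fix is defense-in-depth: dedup at the wire, bot-origin filtering at the dispatcher and processor, and exclusion of bot history at the context builder.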
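To make the first two layers concrete, here is a minimal sketch of the bot-origin check and the dedup key choice. It is not the code from this PR: `inboundEvent`, `dedupKey`, and `isBotAuthored` are hypothetical names, and the single transport-neutral struct is an assumption made to keep the example self-contained (the PR applies these checks separately in `Bot` and `SocketBot`).

```go
package main

import "fmt"

// inboundEvent is a hypothetical, transport-neutral view of a Slack message.
// It is an illustration only, not a type from this PR or from a Slack library.
type inboundEvent struct {
	EventID     string // Events API event_id (empty on RTM)
	ClientMsgID string // RTM client_msg_id (may be empty)
	Channel, TS string
	User, BotID string
}

// dedupKey prefers the transport's own identifier and falls back to
// channel:ts. An empty result means the event cannot be deduplicated.
func dedupKey(ev inboundEvent) string {
	switch {
	case ev.EventID != "":
		return ev.EventID
	case ev.ClientMsgID != "":
		return ev.ClientMsgID
	case ev.Channel != "" && ev.TS != "":
		return ev.Channel + ":" + ev.TS
	}
	return ""
}

// isBotAuthored mirrors the four conditions listed in the PR description.
// The last clause is already implied by the first; it is kept so the
// function reads one-to-one against that list.
func isBotAuthored(ev inboundEvent, selfUserID, selfBotID string) bool {
	return ev.BotID != "" ||
		ev.User == "" ||
		ev.User == selfUserID ||
		(selfBotID != "" && ev.BotID == selfBotID)
}

func main() {
	const selfUserID, selfBotID = "UBOT", "BBOT" // hypothetical IDs
	events := []inboundEvent{
		{EventID: "Ev1", Channel: "C1", TS: "1.0", User: "U1"},
		{EventID: "Ev2", Channel: "C1", TS: "2.0", User: "UBOT", BotID: "BBOT"},
	}
	for _, ev := range events {
		fmt.Printf("key=%q botAuthored=%v\n",
			dedupKey(ev), isBotAuthored(ev, selfUserID, selfBotID))
	}
}
```

A dispatcher would typically run `isBotAuthored` first and only then consult the dedup tracker, so bot-authored events never consume a slot in the seen-key map.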
Fixes #(issue)
How Has This Been Tested?
- `go test ./internal/slackbot/... -timeout 60s`
New tests:
- `TestEventDedup_FirstSeenReturnsFalse` / `TestEventDedup_SecondSeenReturnsTrue`
- `TestEventDedup_EmptyKeyIsNotDeduped`
- `TestEventDedup_NilReceiverIsSafe`
- `TestEventDedup_ExpiresAfterTTL`
- `TestEventDedup_RespectsMaxSize`
- `TestEventDedup_ConcurrentSafe`
Updated tests:
- `TestThreadContextBuilder_BuildFormatsHistory` now asserts bot lines are absent.
- `TestThreadContextBuilder_SkipsBotMessages` (renamed from `TruncatesBotMessages`) covers `BotID`, `SubType=bot_message`, and `Username=RAGent` paths.
Test Configuration
Impact Analysis
Components Affected
- `cmd/`
- `internal/vectorizer/`
- `internal/opensearch/`
- `internal/s3vector/`
- `internal/slackbot/`
- `internal/embedding/`
- `internal/config/`
AWS Resources Impact
Breaking Changes
Behavioral note: bot-authored messages no longer appear in the formatted thread history that feeds the search query. This is intentional — including them was the loop amplifier — and the user-visible thread on Slack is unchanged.
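The skip-instead-of-truncate behavior can be sketched as follows. This is an illustration under assumptions, not the actual `ThreadContextBuilder`: `historyMessage`, `isBotMessage`, and `formatHistory` are hypothetical names, and the `user: text` line format is invented for the example. The three detection paths (`BotID`, `SubType=bot_message`, `Username`) are the ones named in the updated tests.

```go
package main

import (
	"fmt"
	"strings"
)

// historyMessage is a hypothetical subset of a Slack thread message.
type historyMessage struct {
	User, Username, BotID, SubType, Text string
}

// isBotMessage reports whether a history entry was authored by a bot,
// using the three paths covered by the updated tests.
func isBotMessage(m historyMessage, botUsername string) bool {
	return m.BotID != "" ||
		m.SubType == "bot_message" ||
		(botUsername != "" && m.Username == botUsername)
}

// formatHistory builds the text fed to the search query. Bot-authored
// entries are dropped entirely rather than shortened.
func formatHistory(msgs []historyMessage, botUsername string) string {
	var b strings.Builder
	for _, m := range msgs {
		if isBotMessage(m, botUsername) {
			continue
		}
		fmt.Fprintf(&b, "%s: %s\n", m.User, m.Text)
	}
	return b.String()
}

func main() {
	thread := []historyMessage{
		{User: "U1", Text: "how do I rotate the API key?"},
		{User: "UBOT", BotID: "BBOT", Text: "Here is what I found ..."},
		{User: "U1", Text: "thanks"},
	}
	fmt.Print(formatHistory(thread, "RAGent"))
}
```

Dropping the entries outright means a stray mention inside one of the bot's earlier replies can never end up in the next query, which truncation alone did not guarantee.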
Dependencies
Documentation
Checklist
- `go fmt ./...` and `go vet ./...`
- `go mod tidy` to clean up dependencies
Performance Considerations
The dedup map is bounded at 4096 entries per process with a 5-minute TTL and opportunistic GC, so memory is O(maxSize). Lookups are O(1) under a single mutex.
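As a rough illustration of those bounds, a seen-key tracker along these lines would satisfy them. This is a sketch, not the contents of `dedup.go`: the `Seen` method name, the `now` clock hook, and the eviction policy (drop expired entries, then the entry closest to expiry) are all assumptions. In this sketch the eviction scan is O(n), but it only runs when the map is full.

```go
package main

import (
	"fmt"
	"sync"
	"time"
)

// eventDedup remembers recently seen keys for a bounded time and count.
type eventDedup struct {
	mu      sync.Mutex
	ttl     time.Duration
	maxSize int
	seen    map[string]time.Time // key -> expiry time
	now     func() time.Time     // injectable clock (assumption, eases testing)
}

func newEventDedup(ttl time.Duration, maxSize int) *eventDedup {
	return &eventDedup{
		ttl:     ttl,
		maxSize: maxSize,
		seen:    make(map[string]time.Time),
		now:     time.Now,
	}
}

// Seen reports whether key was already recorded and has not expired.
// Otherwise it records the key and returns false. A nil receiver or an
// empty key is never treated as a duplicate.
func (d *eventDedup) Seen(key string) bool {
	if d == nil || key == "" {
		return false
	}
	d.mu.Lock()
	defer d.mu.Unlock()

	now := d.now()
	if exp, ok := d.seen[key]; ok && now.Before(exp) {
		return true
	}
	if len(d.seen) >= d.maxSize {
		d.evictLocked(now) // opportunistic GC, only when full
	}
	d.seen[key] = now.Add(d.ttl)
	return false
}

// evictLocked drops expired entries; if the map is still full it also
// drops the live entry closest to expiry. Caller must hold d.mu.
func (d *eventDedup) evictLocked(now time.Time) {
	var oldestKey string
	var oldest time.Time
	for k, exp := range d.seen {
		if !now.Before(exp) {
			delete(d.seen, k)
			continue
		}
		if oldestKey == "" || exp.Before(oldest) {
			oldestKey, oldest = k, exp
		}
	}
	if len(d.seen) >= d.maxSize && oldestKey != "" {
		delete(d.seen, oldestKey)
	}
}

func main() {
	d := newEventDedup(5*time.Minute, 4096)
	fmt.Println(d.Seen("Ev123")) // first delivery
	fmt.Println(d.Seen("Ev123")) // redelivery of the same event
}
```

Storing the expiry time rather than the insertion time keeps the hot path to one map lookup and one comparison under the mutex.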
Additional Notes
Dedup keys deliberately overlap (Events API uses `event_id`; RTM uses `client_msg_id`; both fall back to `channel:ts`) so any one transport quirk is caught by at least one layer.
Screenshots/Logs
n/a
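Pull Request (Japanese version)
Summary
Fixes a bug where Slackbot replied multiple times to the same user message. It suppresses duplicate events at the transport layer and excludes bot-originated events and history, eliminating both loop sources.
Type of Change
Changes Made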
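- Adds `internal/slackbot/dedup.go`: a concurrency-safe `eventDedup` bounded by TTL and size, keyed on `event_id` (fallback `channel:ts`) and on RTM `client_msg_id` (fallback `channel:ts`).
- `SocketBot` / `Bot` discard bot-originated events (`BotID` set, empty `user`, our own `botUserID`, our own `bot_id`) before they reach the Processor. `Processor.ProcessMessage` keeps an equivalent guard as a safety net.
- Retry deliveries (`retry_attempt` / `retry_reason`) are logged for observability.
- `ThreadContextBuilder` excludes bot messages from the history entirely instead of truncating them. Tests renamed from `TruncatesBotMessages` to `SkipsBotMessages` and updated.
- `SocketBot` now forces `MsgOptionTS(threadTS)` only when threading is enabled, consistent with the `enable_thread` setting.
Motivation and Context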
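Two loop sources were reproduced in production:
- Each time Slack redelivered the same `event_id` (Ack timeout, WebSocket reconnect, overlapping subscriptions), a new reply was generated.
Defense in depth cuts the loops reliably: dedup at the wire layer, exclusion of bot-originated events at the dispatcher and the Processor, and exclusion of bot history at the context-building layer.
Fixes #(issue)
How Has This Been Tested?
- `go test ./internal/slackbot/... -timeout 60s`
Test Configuration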
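Impact Analysis
Components Affected
- `cmd/`
- `internal/vectorizer/`
- `internal/opensearch/`
- `internal/s3vector/`
- `internal/slackbot/`
- `internal/embedding/`
- `internal/config/`
AWS Resources Impact
Breaking Changes
Behavioral note: bot messages are now excluded from the thread history fed to the search query. The thread content visible on Slack is unchanged.
Dependencies
Documentation
Checklist
- `go fmt ./...` and `go vet ./...`
- Ran `go mod tidy` to clean up dependencies
Performance Considerations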
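The dedup map holds at most 4096 entries with a 5-minute TTL and opportunistic GC, so memory is O(maxSize). Lookups are O(1) under a single mutex.
Additional Notes
Dedup keys are deliberately layered (Events API uses `event_id`, RTM uses `client_msg_id`, and both fall back to `channel:ts`). If a transport quirk drops one key, another layer still detects the duplicate.
Screenshots/Logs
n/a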