feat(worker): implement per-task database sessions #654
Summary
Adds a feature-flagged option to use per-task database sessions instead of shared worker-level sessions. This prevents session contamination when a task crashes or times out mid-transaction.
Problem
When tasks share DB sessions at the worker level, a crash or timeout can leave the session in an invalid state ('prepared' or pending rollback). Subsequent tasks fail with SQLAlchemy transaction errors.
Impact: ~10,000 errors/month across Upload, UploadProcessor, UploadFinisher, and Notify tasks.
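The failure mode can be illustrated without a database. The toy below stands in for a shared, worker-level session registry: `FakeSession`, `SessionRegistry`, and `SessionNeedsRollback` are hypothetical names for illustration, and only `get_db_session` with its `.remove()` method comes from this PR's description.

```python
class SessionNeedsRollback(Exception):
    """Hypothetical stand-in for the transaction error raised on a broken session."""


class FakeSession:
    """Toy session that refuses work once a transaction has failed."""

    def __init__(self):
        self.in_failed_transaction = False

    def execute(self, statement):
        if self.in_failed_transaction:
            raise SessionNeedsRollback("session must be rolled back before reuse")
        return f"ok: {statement}"


class SessionRegistry:
    """Toy worker-level registry: every caller gets the same session until remove()."""

    def __init__(self):
        self._session = None

    def __call__(self):
        if self._session is None:
            self._session = FakeSession()
        return self._session

    def remove(self):
        # Discard the current session so the next call builds a fresh one.
        self._session = None


get_db_session = SessionRegistry()


def crashing_task():
    session = get_db_session()
    session.execute("INSERT ...")
    # Simulate a timeout striking mid-transaction, leaving the session unusable.
    session.in_failed_transaction = True
    raise TimeoutError("task exceeded its time limit")


def healthy_task():
    # An unrelated task that happens to run next on the same worker.
    return get_db_session().execute("SELECT 1")
```

In this toy, running `healthy_task()` after `crashing_task()` raises `SessionNeedsRollback` even though the second task did nothing wrong; calling `get_db_session.remove()` between the two lets it succeed.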
Sentry Issues
Solution
When `setup.tasks.per_task_db_sessions.enabled` is `true`:
- At task start: call `get_db_session.remove()` to clear any stale session
- At task end: call `get_db_session.remove()` to prevent contamination
Configuration
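As a rough illustration, the flag under `setup.tasks.per_task_db_sessions` could be evaluated along these lines. This is a sketch, not the PR's code: the dict shape, the explicit `config` parameter, the default of disabled, the example queue and task names, and the rule that a task matches when either its name or its queue is listed are all assumptions.

```python
from typing import Any, Mapping, Optional

# Hypothetical config value, shaped after the keys named in this PR
# (enabled, queues, tasks). The queue and task names are invented.
EXAMPLE_CONFIG = {
    "enabled": True,
    "queues": ["uploads"],
    "tasks": ["app.tasks.notify.Notify"],
}


def use_per_task_db_sessions(
    task_name: str,
    queue_name: Optional[str] = None,
    config: Optional[Mapping[str, Any]] = None,
) -> bool:
    """Decide whether a task should get its own DB session.

    The real helper reads the config itself; here it is passed in
    explicitly so the sketch stays self-contained.
    """
    config = config or {}

    # Master switch: anything other than enabled means off for everyone.
    if not config.get("enabled", False):
        return False

    queues = config.get("queues") or []
    tasks = config.get("tasks") or []

    # No filters at all: the feature applies to every task.
    if not queues and not tasks:
        return True

    # Filters present: assume a match on either the task name or its queue.
    return task_name in tasks or (queue_name is not None and queue_name in queues)
```

Keeping the master switch as the first check is what makes `enabled: false` an instant rollback regardless of any filters still present in the config.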
Rollout behavior:
- `queues` and `tasks` are both empty → enabled for ALL tasks
- `queues` or `tasks` are specified → only enabled for matching tasks
- `enabled: false` → disabled for everyone (instant rollback)
Changes
- Adds the `use_per_task_db_sessions(task_name, queue_name)` config helper function
- Updates `BaseCodecovTask.run()` to manage session lifecycle when enabled
- Updates `wrap_up_dbsession()` to accept `per_task_sessions` and `session_id` params
Rollout Plan
Risk Assessment
High risk: this touches core task infrastructure.
Mitigations:
Linear Issue
CCMRG-2009
Note
Introduces an opt-in per-task SQLAlchemy session lifecycle to prevent cross-task session contamination and enable gradual rollout via config.
- `use_per_task_db_sessions(task_name, queue_name)` reads `setup.tasks.per_task_db_sessions` (`enabled` + optional `queues`/`tasks` filters)
- `BaseCodecovTask.run()` optionally clears any stale session at start, generates a short `session_id` for logs, and passes flags to cleanup
- `wrap_up_dbsession(db_session, per_task_sessions, session_id)` is enhanced to always remove the session when enabled and to improve timeout/invalid-state recovery and logging
Written by Cursor Bugbot for commit 7074895.
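
The lifecycle described in this note (clear at start, tag logs with a short `session_id`, always remove at the end) might look roughly like the sketch below. It is not the PR's implementation: `ToySession`, `ToyRegistry`, and `run_task` are hypothetical stand-ins, the 8-character id length is an assumption, and the commit/rollback handling is simplified.

```python
import logging
import uuid

log = logging.getLogger(__name__)


class ToySession:
    """Minimal stand-in for a DB session."""

    def commit(self):
        pass

    def rollback(self):
        pass


class ToyRegistry:
    """Stand-in for a worker-level session registry with a remove() method."""

    def __init__(self):
        self._session = None
        self.remove_calls = 0

    def __call__(self):
        if self._session is None:
            self._session = ToySession()
        return self._session

    def remove(self):
        self.remove_calls += 1
        self._session = None


def run_task(task_fn, registry, per_task_sessions):
    """Run task_fn with a session, optionally isolating it per task."""
    # Short id so log lines from one task's session can be correlated.
    session_id = uuid.uuid4().hex[:8] if per_task_sessions else None

    if per_task_sessions:
        # Drop whatever a previous task may have left behind.
        registry.remove()

    db_session = registry()
    try:
        result = task_fn(db_session)
        db_session.commit()
        return result
    except Exception:
        log.warning("task failed; rolling back", extra={"session_id": session_id})
        db_session.rollback()
        raise
    finally:
        if per_task_sessions:
            # Always discard the session so the next task starts clean.
            registry.remove()
```

With `per_task_sessions=False` the function never touches `remove()`, which matches the idea that disabling the flag restores the shared-session behavior.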