Investigate stale dependency cleanup strategy #7

@TheRockPusher

Description

Context

The TaskDependencyRepository currently creates dependencies between tasks but does not handle cleanup of stale dependencies when blocking tasks are completed or cancelled.

Problem

When a blocker task is marked as completed or cancelled, the dependency relationship remains in the database. This could lead to:

  1. Stale data accumulation - Dependencies that are no longer meaningful persist indefinitely
  2. Query performance degradation - Queries like SELECT_ACTIVE_BLOCKERS must filter out completed/cancelled blockers at runtime
  3. Database bloat - Historical dependencies consume storage without providing value
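To make the runtime-filtering cost concrete, here is a minimal sketch of what a SELECT_ACTIVE_BLOCKERS-style query might look like. The table names, column names, status strings, and the use of SQLite are all assumptions for illustration; the real definitions live in schema.py and may differ.

```python
import sqlite3

# Hypothetical schema: names and statuses are assumptions, not copied from schema.py.
SCHEMA = """
CREATE TABLE tasks (
    task_id TEXT PRIMARY KEY,
    status  TEXT NOT NULL
);
CREATE TABLE task_dependencies (
    task_id    TEXT NOT NULL REFERENCES tasks(task_id),
    blocker_id TEXT NOT NULL REFERENCES tasks(task_id),
    PRIMARY KEY (task_id, blocker_id)
);
"""

# Assumed shape of the query: because stale rows are never removed, the join
# and the status filter have to run on every call.
SELECT_ACTIVE_BLOCKERS_SKETCH = """
SELECT d.blocker_id
FROM task_dependencies AS d
JOIN tasks AS t ON t.task_id = d.blocker_id
WHERE d.task_id = ?
  AND t.status NOT IN ('completed', 'cancelled')
ORDER BY d.blocker_id
"""


def active_blockers(conn, task_id):
    """Return the IDs of blockers that are still open for the given task."""
    rows = conn.execute(SELECT_ACTIVE_BLOCKERS_SKETCH, (task_id,)).fetchall()
    return [row[0] for row in rows]


def build_demo_db():
    """Create an in-memory database where task 'a' has three blockers."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(SCHEMA)
    conn.executemany(
        "INSERT INTO tasks VALUES (?, ?)",
        [("a", "pending"), ("b", "completed"), ("c", "in_progress"), ("d", "cancelled")],
    )
    conn.executemany(
        "INSERT INTO task_dependencies VALUES (?, ?)",
        [("a", "b"), ("a", "c"), ("a", "d")],
    )
    conn.commit()
    return conn
```

In this demo, task "a" keeps three dependency rows even though only one blocker is still open, which is the accumulation described above.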

Possible Solutions

We need to investigate and decide between three approaches:

Option A: Automatic Cleanup on Status Change

Pros:

  • Dependencies always reflect current state
  • Simpler queries (no need to filter by status)
  • No additional maintenance scripts needed

Cons:

  • Loses historical dependency data
  • More complex transaction logic in repository
  • Performance impact on every status update
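A minimal sketch of Option A, assuming a SQLite backend and hypothetical table and column names. The function `update_task_status` is illustrative only and is not the existing update_task() method; the point is that the status change and the dependency delete share one transaction.

```python
import sqlite3

# Assumed terminal statuses; the real status values may differ.
TERMINAL_STATUSES = ("completed", "cancelled")


def make_db():
    """Build an in-memory demo database (hypothetical schema)."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE tasks (task_id TEXT PRIMARY KEY, status TEXT NOT NULL);
        CREATE TABLE task_dependencies (
            task_id    TEXT NOT NULL,
            blocker_id TEXT NOT NULL,
            PRIMARY KEY (task_id, blocker_id)
        );
        INSERT INTO tasks VALUES ('a', 'pending'), ('b', 'pending');
        INSERT INTO task_dependencies VALUES ('a', 'b');
        """
    )
    return conn


def update_task_status(conn, task_id, new_status):
    """Update a task's status and, if it is now terminal, drop the
    dependency rows in which it acts as a blocker."""
    # "with conn" commits on success and rolls back if an exception is raised,
    # so the status change and the cleanup succeed or fail together.
    with conn:
        conn.execute(
            "UPDATE tasks SET status = ? WHERE task_id = ?",
            (new_status, task_id),
        )
        if new_status in TERMINAL_STATUSES:
            conn.execute(
                "DELETE FROM task_dependencies WHERE blocker_id = ?",
                (task_id,),
            )
```

The extra DELETE runs only on terminal transitions, but it is still on the write path of every such update, and the deleted rows cannot be recovered afterwards.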

Option B: Periodic Cleanup Script

Pros:

  • Preserves short-term historical data
  • No performance impact on normal operations
  • Could run during maintenance windows

Cons:

  • Requires cron/scheduled task setup
  • Stale data exists between cleanups
  • Additional operational complexity
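A rough sketch of what an Option B cleanup job could look like. It assumes SQLite, a hypothetical `finished_at` column (epoch seconds) recording when a task reached a terminal status, and a retention window chosen by the operator; none of these exist in the current schema as far as this issue states.

```python
import sqlite3
from datetime import datetime, timedelta, timezone

# Hypothetical schema: finished_at is an assumed column, stored as epoch seconds.
SCHEMA = """
CREATE TABLE tasks (
    task_id     TEXT PRIMARY KEY,
    status      TEXT NOT NULL,
    finished_at INTEGER
);
CREATE TABLE task_dependencies (
    task_id    TEXT NOT NULL,
    blocker_id TEXT NOT NULL,
    PRIMARY KEY (task_id, blocker_id)
);
"""


def purge_stale_dependencies(conn, retention, now=None):
    """Delete dependencies whose blocker finished longer ago than `retention`.

    Returns the number of dependency rows removed.
    """
    now = now or datetime.now(timezone.utc)
    cutoff = int((now - retention).timestamp())
    with conn:  # one transaction for the whole purge
        cur = conn.execute(
            """
            DELETE FROM task_dependencies
            WHERE blocker_id IN (
                SELECT task_id FROM tasks
                WHERE status IN ('completed', 'cancelled')
                  AND finished_at IS NOT NULL
                  AND finished_at < ?
            )
            """,
            (cutoff,),
        )
    return cur.rowcount
```

A scheduler (cron or similar) would call something like `purge_stale_dependencies(conn, timedelta(days=30))`; the 30-day window is only a placeholder until the retention policy is decided.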

Option C: Hybrid Approach

  • Keep dependencies for analysis/reporting
  • Add is_active flag to track current relevance
  • Archive old dependencies instead of deleting
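One way Option C might be sketched, again assuming SQLite and hypothetical names throughout (`is_active` column as proposed above, plus an assumed `task_dependencies_archive` table and partial index). Finishing a blocker flips the flag instead of deleting, and a separate step moves inactive rows to the archive.

```python
import sqlite3

# Hypothetical schema: the archive table and the partial index are assumptions.
SCHEMA = """
CREATE TABLE tasks (task_id TEXT PRIMARY KEY, status TEXT NOT NULL);
CREATE TABLE task_dependencies (
    task_id    TEXT NOT NULL,
    blocker_id TEXT NOT NULL,
    is_active  INTEGER NOT NULL DEFAULT 1,
    PRIMARY KEY (task_id, blocker_id)
);
CREATE TABLE task_dependencies_archive (
    task_id    TEXT NOT NULL,
    blocker_id TEXT NOT NULL
);
-- Partial index so lookups of active blockers skip inactive rows.
CREATE INDEX idx_active_deps ON task_dependencies (task_id) WHERE is_active = 1;
"""


def finish_task(conn, task_id, status):
    """Mark a task as finished and deactivate dependencies it was blocking."""
    with conn:
        conn.execute(
            "UPDATE tasks SET status = ? WHERE task_id = ?", (status, task_id)
        )
        conn.execute(
            "UPDATE task_dependencies SET is_active = 0 WHERE blocker_id = ?",
            (task_id,),
        )


def active_blockers(conn, task_id):
    """Active blockers can be read without joining against tasks."""
    rows = conn.execute(
        "SELECT blocker_id FROM task_dependencies "
        "WHERE task_id = ? AND is_active = 1 ORDER BY blocker_id",
        (task_id,),
    )
    return [row[0] for row in rows]


def archive_inactive(conn):
    """Move inactive dependencies to the archive table; return rows moved."""
    with conn:
        conn.execute(
            "INSERT INTO task_dependencies_archive (task_id, blocker_id) "
            "SELECT task_id, blocker_id FROM task_dependencies WHERE is_active = 0"
        )
        cur = conn.execute("DELETE FROM task_dependencies WHERE is_active = 0")
    return cur.rowcount
```

This keeps the history available for reporting while the hot table stays small, at the cost of an extra column, an extra table, and a flag that must be kept in sync with task status.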

Investigation Tasks

  • Measure typical dependency lifecycle duration
  • Estimate database growth with different retention policies
  • Benchmark query performance with/without status filtering
  • Research industry best practices for task dependency management
  • Consider if historical dependency data has analytical value
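For the benchmarking task, a small harness along these lines could serve as a starting point. It assumes SQLite and a simplified hypothetical schema, and compares a join-plus-filter query on an uncleaned table against a plain lookup on a table where stale rows have already been removed. Numbers from an in-memory toy database will not transfer directly to production data.

```python
import sqlite3
import timeit

# Query shapes are assumptions standing in for the real SELECT_ACTIVE_BLOCKERS.
FILTERED = """
SELECT COUNT(*) FROM task_dependencies AS d
JOIN tasks AS t ON t.task_id = d.blocker_id
WHERE d.task_id = 0 AND t.status NOT IN ('completed', 'cancelled')
"""
UNFILTERED = "SELECT COUNT(*) FROM task_dependencies WHERE task_id = 0"


def build_db(n_tasks, stale_fraction, cleaned):
    """Task 0 is blocked by tasks 1..n_tasks; a fraction of them is completed.

    If `cleaned` is True, dependencies on completed blockers are left out,
    simulating a database where cleanup has already run.
    """
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE tasks (task_id INTEGER PRIMARY KEY, status TEXT NOT NULL);
        CREATE TABLE task_dependencies (
            task_id    INTEGER NOT NULL,
            blocker_id INTEGER NOT NULL,
            PRIMARY KEY (task_id, blocker_id)
        );
        """
    )
    n_stale = int(n_tasks * stale_fraction)
    tasks = [
        (i, "completed" if i <= n_stale else "pending")
        for i in range(1, n_tasks + 1)
    ]
    conn.execute("INSERT INTO tasks VALUES (0, 'pending')")
    conn.executemany("INSERT INTO tasks VALUES (?, ?)", tasks)
    deps = [
        (0, tid) for tid, status in tasks
        if not (cleaned and status == "completed")
    ]
    conn.executemany("INSERT INTO task_dependencies VALUES (?, ?)", deps)
    conn.commit()
    return conn


def count(conn, sql):
    """Run a COUNT(*) query and return the single integer result."""
    return conn.execute(sql).fetchone()[0]


def time_query(conn, sql, repeats=200):
    """Total wall-clock seconds for `repeats` executions of `sql`."""
    return timeit.timeit(lambda: conn.execute(sql).fetchone(), number=repeats)
```

A run could compare `time_query(build_db(10_000, 0.9, cleaned=False), FILTERED)` with `time_query(build_db(10_000, 0.9, cleaned=True), UNFILTERED)`, varying the stale fraction to match the measured dependency lifecycle.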

Acceptance Criteria

A documented decision covering:

  • Which cleanup strategy to implement (A, B, C, or other)
  • Rationale explaining trade-offs considered
  • Implementation plan if cleanup is needed

Related Code

  • src/taskweaver/database/dependency_repository.py:26 - add_dependency()
  • src/taskweaver/database/schema.py:89 - SELECT_ACTIVE_BLOCKERS query
  • src/taskweaver/database/repository.py:131 - update_task() method

Labels

enhancement, database, needs-investigation
