Skip Segmentation and Tracking if Output Files Already Exist #4

@nezba00

Description

Priority: Medium (Avoid Redundant Computation)

Currently, the pipeline recomputes segmentation and tracking even if valid outputs already exist on disk. This leads to wasted compute time during development, testing, and repeated runs.

Proposed Solution

Add a pre-check in pipeline.py (or the relevant module) that skips segmentation and/or tracking when their respective output files already exist.

Implementation Plan:

  1. Segmentation Pre-Check:
    Before running Segmentation.process(...), check if the following three files exist:

    • MASK_DIR/<filename>.npz (for gt_filtered)
    • DF_DIR/<filename>.csv (for summary_df)
    • DETAILS_DIR/<filename>.pkl (for details)

    If all are present, skip segmentation and load the data from disk.

  2. Tracking Pre-Check:
    Before running Tracking.process(...), check for:

    • TRACK_DF_DIR/<filename>.csv (for merged_df)
    • TRACKED_MASK_DIR/<filename>.npz (for tracked_masks)

    If both are present, skip tracking and load from disk.

  3. Logging and Control:

    • Clearly log when segmentation or tracking is skipped due to existing outputs.
    • Optional: Add a --force CLI flag or config option to force re-computation, overriding the check when needed.
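
The three steps above could be sketched roughly as follows. This is only an illustration, not the project's actual code: the directory paths, the `SEGMENTATION_OUTPUTS` / `TRACKING_OUTPUTS` tables, and the helper names `missing_outputs` and `should_skip` are hypothetical stand-ins for whatever MASK_DIR, DF_DIR, etc. resolve to in the real config.

```python
import logging
from pathlib import Path

logger = logging.getLogger("pipeline")

# Hypothetical layout: output name -> (directory, file suffix).
# The real directories (MASK_DIR, DF_DIR, ...) would come from the project config.
SEGMENTATION_OUTPUTS = {
    "gt_filtered": (Path("masks"), ".npz"),
    "summary_df": (Path("dataframes"), ".csv"),
    "details": (Path("details"), ".pkl"),
}
TRACKING_OUTPUTS = {
    "merged_df": (Path("track_dataframes"), ".csv"),
    "tracked_masks": (Path("tracked_masks"), ".npz"),
}


def missing_outputs(stem, outputs):
    """Return the expected output files for `stem` that are not on disk."""
    paths = (directory / f"{stem}{suffix}" for directory, suffix in outputs.values())
    return [path for path in paths if not path.is_file()]


def should_skip(stage, stem, outputs, force=False):
    """Decide whether `stage` can be skipped for `stem`, and log the decision."""
    if force:
        logger.info("%s: recomputing %s (forced)", stage, stem)
        return False
    missing = missing_outputs(stem, outputs)
    if missing:
        logger.info("%s: running for %s, missing %s",
                    stage, stem, ", ".join(str(p) for p in missing))
        return False
    logger.info("%s: skipped for %s, all outputs already exist", stage, stem)
    return True
```

When `should_skip(...)` returns True, the caller would load the cached results instead of computing them, presumably with `numpy.load` for the .npz files, `pandas.read_csv` for the .csv files, and `pickle.load` for the .pkl file. Requiring every file of a stage to be present avoids loading a half-written result set after an interrupted run.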

File Location:
pipeline.py — modify main loop logic before calling segmentation/tracking modules.
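
For the main loop, one possible shape is shown below. It is a sketch under assumptions: the positional `inputs` argument, the `run_or_load` helper, the `masks/` path, and the placeholder lambdas are all hypothetical, and the real loop would call `Segmentation.process(...)` / `Tracking.process(...)` and the project's own loaders in their place. Only the `--force` flag name comes from this issue.

```python
import argparse
from pathlib import Path


def build_parser():
    parser = argparse.ArgumentParser(description="Segmentation and tracking pipeline")
    parser.add_argument("inputs", nargs="+", type=Path, help="input files to process")
    parser.add_argument("--force", action="store_true",
                        help="recompute even when output files already exist")
    return parser


def run_or_load(output_paths, compute, load, force=False):
    """Return (result, skipped): load cached results if every output exists."""
    if not force and all(path.is_file() for path in output_paths):
        return load(), True
    return compute(), False


def main(argv=None):
    args = build_parser().parse_args(argv)
    for input_path in args.inputs:
        stem = input_path.stem
        # Placeholder path and callables; swap in the real directories,
        # Segmentation.process(...), and the matching loaders.
        seg_outputs = [Path("masks") / f"{stem}.npz"]
        result, skipped = run_or_load(
            seg_outputs,
            compute=lambda: f"segmented {stem}",
            load=lambda: f"loaded {stem}",
            force=args.force,
        )
        print(f"{stem}: {'skipped' if skipped else 'computed'} segmentation ({result})")
```

Passing `compute` and `load` as callables keeps the skip decision in one place, so the tracking stage can reuse `run_or_load` with its own two output paths.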

Metadata

Assignees

No one assigned

    Labels

    Optimization: Low Priority, only for optimization of already working code
    enhancement: New feature or request
