Priority: Medium (Avoid Redundant Computation)
Currently, the pipeline recomputes segmentation and tracking even if valid outputs already exist on disk. This leads to wasted compute time during development, testing, and repeated runs.
Proposed Solution
Add a pre-check mechanism in pipeline.py (or the relevant module) that skips segmentation and/or tracking when their respective output files already exist.
Implementation Plan:
- Segmentation Pre-Check:
  Before running Segmentation.process(...), check whether the following three files exist:
  - MASK_DIR/<filename>.npz (for gt_filtered)
  - DF_DIR/<filename>.csv (for summary_df)
  - DETAILS_DIR/<filename>.pkl (for details)
  If all are present, skip segmentation and load the data from disk.
- Tracking Pre-Check:
  Before running Tracking.process(...), check for:
  - TRACK_DF_DIR/<filename>.csv (for merged_df)
  - TRACKED_MASK_DIR/<filename>.npz (for tracked_masks)
  If both are present, skip tracking and load from disk.
- Logging and Control:
  - Clearly log when segmentation or tracking is skipped due to existing outputs.
  - Optional: Add a --force CLI flag or config option to force re-computation, overriding the check when needed.
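The pre-checks and the --force override described in the plan could look roughly like the sketch below. It is only an illustration: the directory values are placeholders (the real MASK_DIR, DF_DIR, etc. are defined by the project), and the helper names segmentation_outputs, tracking_outputs, can_skip, and parse_args are hypothetical, not existing pipeline functions.

```python
import argparse
import logging
from pathlib import Path
from typing import Iterable, List

logger = logging.getLogger("pipeline")

# Placeholder locations (assumption); the project defines the real values.
MASK_DIR = Path("output/masks")
DF_DIR = Path("output/dfs")
DETAILS_DIR = Path("output/details")
TRACK_DF_DIR = Path("output/track_dfs")
TRACKED_MASK_DIR = Path("output/tracked_masks")


def segmentation_outputs(filename: str) -> List[Path]:
    """Files that segmentation is expected to leave on disk for one input."""
    return [
        MASK_DIR / f"{filename}.npz",     # gt_filtered
        DF_DIR / f"{filename}.csv",       # summary_df
        DETAILS_DIR / f"{filename}.pkl",  # details
    ]


def tracking_outputs(filename: str) -> List[Path]:
    """Files that tracking is expected to leave on disk for one input."""
    return [
        TRACK_DF_DIR / f"{filename}.csv",      # merged_df
        TRACKED_MASK_DIR / f"{filename}.npz",  # tracked_masks
    ]


def can_skip(stage: str, outputs: Iterable[Path], force: bool = False) -> bool:
    """Return True only if every expected output exists and force is off."""
    if force:
        logger.info("%s: --force set, recomputing", stage)
        return False
    missing = [p for p in outputs if not p.is_file()]
    if missing:
        logger.info("%s: %d output(s) missing, computing", stage, len(missing))
        return False
    logger.info("%s: skipped, all outputs already exist", stage)
    return True


def parse_args(argv=None) -> argparse.Namespace:
    """Parse the optional --force flag (hypothetical CLI for pipeline.py)."""
    parser = argparse.ArgumentParser(description="Run the pipeline")
    parser.add_argument(
        "--force",
        action="store_true",
        help="recompute segmentation and tracking even if outputs exist",
    )
    return parser.parse_args(argv)
```

Note that this check only tests for file existence; it does not verify that the files are complete or match the current input, so --force remains the way to recover from stale or partial outputs.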
File Location:
pipeline.py — modify main loop logic before calling segmentation/tracking modules.
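One possible shape for the main loop change is a small compute-or-load wrapper placed in front of each stage. This is a sketch under assumptions: run_or_load is a hypothetical helper, and the compute and load callables stand in for Segmentation.process(...) / Tracking.process(...) and for whatever loaders the project uses to read the .npz, .csv, and .pkl files back.

```python
import logging
from typing import Callable, TypeVar

logger = logging.getLogger("pipeline")
T = TypeVar("T")


def run_or_load(
    stage: str,
    outputs_exist: bool,
    compute: Callable[[], T],
    load: Callable[[], T],
    force: bool = False,
) -> T:
    """Load cached results when allowed, otherwise run the stage.

    outputs_exist is the result of the file pre-check for this stage;
    force mirrors the proposed --force flag and overrides the check.
    """
    if outputs_exist and not force:
        logger.info("%s: outputs found on disk, loading instead of recomputing", stage)
        return load()
    logger.info("%s: computing%s", stage, " (forced)" if force else "")
    return compute()
```

Inside the per-file loop this would be called once for segmentation and once for tracking, with the stage's own pre-check result passed as outputs_exist, so the rest of the loop receives the same objects whether they were computed or loaded.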