Implement core correlation change alert pipeline and validation #79
senuradp left a comment
Great work on this PR. The overview is clear and the implementation provides a strong running version of the correlation change alert pipeline.
I reviewed the structure and the pipeline now connects the main stages end to end:
preprocessing → rolling windows → correlation computation → change comparison → alert generation.
A strong point of this PR is that each function has been tested separately before being connected through the wrapper. The separate tests for preprocessing, windowing, correlation computation, change comparison, alert generation, and the full pipeline make the implementation easier to validate and debug. The dataset-level validation using simple.csv, complex.csv, and 2881821.csv also provides useful evidence that the pipeline works across different data structures.
Before final merge, please make a few minor alignment fixes:
- Keep the original wrapper signature for compatibility: method="pearson", strong_corr_threshold=0.7, weak_corr_threshold=0.4, and delta_threshold=0.3.
- In compute_window_correlations(), use the method parameter instead of hardcoding Pearson: window_df.corr(method=method)
- Standardise naming across the pipeline. The current use of change_results, delta_r, and stream_pair is clear, but other modules and tests should follow the same naming convention.
- Update the generate_alerts() docstring so it matches the current function parameters and output.
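As a rough illustration of the second point, here is a minimal sketch of passing method through to pandas. The function body, the numeric-column selection, and the sample data are assumptions made for this example, not the code in this PR.

```python
import pandas as pd


def compute_window_correlations(window_df: pd.DataFrame, method: str = "pearson") -> pd.DataFrame:
    """Return the pairwise correlation matrix for one window.

    `method` is forwarded to DataFrame.corr instead of being hardcoded.
    """
    # Assumption: only numeric columns take part in the correlation.
    numeric = window_df.select_dtypes(include="number")
    return numeric.corr(method=method)


# Hypothetical window: s2 is a monotonic but non-linear function of s1.
window_df = pd.DataFrame({
    "s1": [1.0, 2.0, 3.0, 4.0, 5.0],
    "s2": [1.0, 4.0, 9.0, 16.0, 25.0],
})

pearson_r = compute_window_correlations(window_df, method="pearson").loc["s1", "s2"]
spearman_r = compute_window_correlations(window_df, method="spearman").loc["s1", "s2"]
```

With this data the Spearman value is 1 because the relationship is perfectly monotonic, while the Pearson value is slightly lower, so the two calls show that the parameter actually reaches pandas.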
Overall, this is a strong integration contribution and provides a good base for the final pipeline.
Thanks for the review and feedback. I’ve updated the wrapper signature for compatibility, parameterised the correlation method, aligned the threshold handling, and updated the docstrings and naming consistency across the pipeline. The latest commit has now been pushed to the same branch.
Thanks for updating the full pipeline implementation. I reviewed the updated wrapper flow, and the overall sequence is correct:
This is useful as a full-pipeline reference and confirms the direction of the integrated system. For final integration, I will keep the main wrapper structure aligned with the module-by-module implementation already integrated from the team contributions.
One thing to note is that your implementation uses naming such as
To avoid breaking the integrated pipeline, these naming differences need to be standardised before merging directly. The severity threshold logic also needs to remain aligned with our final agreed ranges:
Overall, this is a strong full-pipeline contribution and will be used as a reference for final integration and testing.
Overview
This pull request adds the initial implementation of the correlation change alert pipeline for the correlation_alert module. The work focuses on the core data pipeline for handling time-series sensor data, computing rolling correlations, comparing correlation changes between consecutive windows, and generating threshold-based alerts.
What was implemented
Core pipeline in correlation_alert/main.py
Implemented the following functions as part of the end-to-end correlation change alert workflow:
- preprocess_timeseries(...)
- create_rolling_windows(...): uses window_size and step_size
- compute_window_correlations(...)
- compare_correlation_changes(...): Δr = |r_current - r_previous|
- generate_alerts(...): raises an alert when delta_r is above the configured threshold and assigns a severity (LOW, MEDIUM, HIGH) based on the change magnitude
- detect_correlation_change_alert(...)
Testing completed
Function-level testing
Each stage of the pipeline was tested separately before integrating everything into the final wrapper:
- test_preprocess.py
- test_windows.py
- test_correlations.py
- test_compare_changes.py (delta_r)
- test_alerts.py
- test_full_pipeline.py
Dataset-level validation
After function-level testing, the implementation was validated using three datasets:
- simple.csv
- complex.csv
- 2881821.csv (field1 to field8)
Additional improvements made
correlation_alert/venura_testing
Validation outputs
Created dataset runner scripts to save outputs into correlation_alert/venura_testing/outputs/ for:
This was done for:
- run_simple_dataset.py
- run_complex_dataset.py
- run_real_dataset.py
Current status
The core correlation change alert pipeline is now working end to end and has been validated on both test and dataset-based runs. This provides the foundation for further integration, Jupyter-based validation, and future connection with the alerting and wider project workflow.
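The window, comparison, and alert stages described in this PR can be sketched in plain Python as follows. This is a minimal sketch under stated assumptions, not the code in correlation_alert/main.py: the helper names (pearson, rolling_windows, severity, detect_changes) and the sample streams are hypothetical, and the severity cut-offs are placeholders rather than the team's agreed ranges. Only delta_threshold=0.3 and the Δr = |r_current - r_previous| formula come from the PR.

```python
from math import sqrt


def pearson(xs, ys):
    """Plain Pearson correlation of two equal-length sequences.

    Raises ZeroDivisionError if either sequence is constant.
    """
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = sqrt(sum((x - mx) ** 2 for x in xs))
    sy = sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy)


def rolling_windows(n_rows, window_size, step_size):
    """Yield (start, end) index pairs for every full window."""
    for start in range(0, n_rows - window_size + 1, step_size):
        yield start, start + window_size


def severity(delta_r, medium_at=0.5, high_at=0.7):
    """Map a change magnitude to a level (placeholder cut-offs, assumed)."""
    if delta_r >= high_at:
        return "HIGH"
    if delta_r >= medium_at:
        return "MEDIUM"
    return "LOW"


def detect_changes(a, b, window_size, step_size, delta_threshold=0.3):
    """Compare consecutive windows and alert when delta_r exceeds the threshold."""
    rs = [pearson(a[s:e], b[s:e])
          for s, e in rolling_windows(len(a), window_size, step_size)]
    alerts = []
    for i in range(1, len(rs)):
        # delta_r = |r_current - r_previous|
        delta_r = abs(rs[i] - rs[i - 1])
        if delta_r > delta_threshold:
            alerts.append({"window": i, "delta_r": delta_r,
                           "severity": severity(delta_r)})
    return alerts


# Hypothetical streams: b tracks a for four samples, then moves against it.
a = [1, 2, 3, 4, 1, 2, 3, 4]
b = [1, 2, 3, 4, 4, 3, 2, 1]
alerts = detect_changes(a, b, window_size=4, step_size=4)
```

Here the two non-overlapping windows have correlations of about 1 and -1, so a single alert is produced for the second window. Keeping the cut-offs as function parameters makes it easy to swap in the agreed ranges once they are fixed.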