CCA - 003: Rolling window segmentation #80

Open
saumyadholia25-source wants to merge 3 commits into
DataBytes-Organisation:feature/correlation-alert/venura/saumya
from saumyadholia25-source:feature/correlation-alert/venura/saumya

@saumyadholia25-source

Implemented rolling window logic and correlation computation for multi-sensor time-series data

Signed-off-by: saumya dholia <saumyadholia25@gmail.com>
Collaborator

@senuradp left a comment

Good work on exploring rolling window segmentation and validating it across different datasets. The approach and logic are correct and relevant for the correlation alert pipeline.

However, for integration into the project, we need this to follow the modular structure defined in the wrapper.

Please refactor your implementation into a single function:
create_rolling_windows(df, window_size, step_size)

This function should:

  • take the preprocessed dataframe as input
  • return a list of windowed dataframes
  • avoid implementing other pipeline stages (correlation, alerts, etc.)
  • avoid notebook-based execution in the final version

The notebooks are useful for experimentation, but the final PR should contain a clean Python implementation that fits into the wrapper pipeline.
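
For reference, a minimal sketch of what such a function could look like. It assumes that window_size and step_size are counted in rows and that a trailing partial window is dropped; neither choice is specified above, so adjust as needed for the wrapper.

```python
import pandas as pd


def create_rolling_windows(df, window_size, step_size):
    """Split df into row-based windows (illustrative sketch, not the PR's code)."""
    if window_size <= 0 or step_size <= 0:
        raise ValueError("window_size and step_size must be positive integers")

    windows = []
    # Assumption: sizes count rows, and an incomplete final window is skipped.
    for start in range(0, len(df) - window_size + 1, step_size):
        end = start + window_size
        # .copy() keeps later edits to a window from touching the source frame.
        windows.append(df.iloc[start:end].copy())
    return windows
```

With 10 rows, window_size=4 and step_size=2, this yields windows starting at rows 0, 2, 4 and 6. A dataframe shorter than window_size yields an empty list.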

Signed-off-by: saumya dholia <saumyadholia25@gmail.com>
@saumyadholia25-source
Author

Uploaded the updated file: rolling_windows.py.ipynb

Signed-off-by: saumya dholia <saumyadholia25@gmail.com>
@senuradp
Collaborator

Thanks for updating the PR based on the earlier feedback. I reviewed the updated rolling window implementation, and this version is much better aligned with the intended modular pipeline structure.

The create_rolling_windows(df, window_size, step_size) function now focuses only on rolling window segmentation, accepts the expected inputs, and returns a list of windowed DataFrames. The code is clean, readable, and no longer overlaps with other pipeline stages.

There is one minor issue to fix before final integration: the function currently resets the index inside each window using reset_index(drop=True). Since the preprocessed dataframe uses the timestamp as the index, we should preserve the original index so the correlation stage can correctly capture the start_time and end_time for each window.

I updated:
window = df.iloc[start:end].copy().reset_index(drop=True)

to:
window = df.iloc[start:end].copy()
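
To illustrate why this matters, here is a small sketch on made-up data. The sensor column names and timestamps are hypothetical; only start_time and end_time come from the correlation stage described above.

```python
import pandas as pd

# Hypothetical preprocessed frame: timestamp index, two sensor columns.
index = pd.date_range("2024-01-01 00:00", periods=6, freq="min", name="timestamp")
df = pd.DataFrame({"sensor_a": range(6), "sensor_b": range(6, 12)}, index=index)

start, end = 2, 5
kept = df.iloc[start:end].copy()
reset = df.iloc[start:end].copy().reset_index(drop=True)

# With the index preserved, the window boundaries can be read straight off it.
start_time, end_time = kept.index[0], kept.index[-1]

# After reset_index(drop=True), the index is just positions 0..n-1,
# so the timestamps are no longer available from the window itself.
print(start_time, end_time)
print(list(reset.index))
```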

Overall, this is a solid rolling window implementation and is ready to proceed after this minor fix.

2 participants