CCA - 003: Rolling window segmentation by saumyadholia25-source · Pull Request #80 · DataBytes-Organisation/Intelligent-IoT-Data-Management

saumyadholia25-source · 2026-04-23T12:45:02Z

Implemented rolling window logic and correlation computation for multi-sensor time-series data.

Signed-off-by: saumya dholia <saumyadholia25@gmail.com>

senuradp

Good work on exploring rolling window segmentation and validating it across different datasets. The approach and logic are correct and relevant for the correlation alert pipeline.

However, for integration into the project, we need this to follow the modular structure defined in the wrapper.

Please refactor your implementation into a single function:
create_rolling_windows(df, window_size, step_size)

This function should:

- take the preprocessed dataframe as input
- return a list of windowed dataframes
- avoid implementing other pipeline stages (correlation, alerts, etc.)
- avoid notebook-based execution in the final version

The notebooks are useful for experimentation, but the final PR should contain a clean Python implementation that fits into the wrapper pipeline.
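As a rough illustration of the requested shape, a minimal sketch of such a function might look like the following. Only the signature comes from the review. The choice to drop a trailing partial window, the input validation, and the sample data (`sensor_a`, `sensor_b`) are assumptions made for the example and are not taken from the PR.

```python
import pandas as pd


def create_rolling_windows(df, window_size, step_size):
    """Split a preprocessed DataFrame into fixed-size row windows.

    Sketch only: dropping the trailing partial window and raising on
    non-positive sizes are assumptions, not requirements from the review.
    """
    if window_size <= 0 or step_size <= 0:
        raise ValueError("window_size and step_size must be positive integers")

    windows = []
    # The last valid start is the one whose window still fits in the frame.
    for start in range(0, len(df) - window_size + 1, step_size):
        windows.append(df.iloc[start:start + window_size].copy())
    return windows


# Hypothetical sample data: two sensors sampled once a minute.
sample = pd.DataFrame(
    {"sensor_a": range(10), "sensor_b": range(10, 20)},
    index=pd.date_range("2026-01-01", periods=10, freq="min"),
)
windows = create_rolling_windows(sample, window_size=4, step_size=2)
```

With 10 rows, a window of 4 and a step of 2, the windows start at rows 0, 2, 4 and 6. The function performs segmentation only, so correlation and alerting stay in their own pipeline stages.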

Signed-off-by: saumya dholia <saumyadholia25@gmail.com>

saumyadholia25-source · 2026-04-30T12:56:01Z

rolling_windows.py.ipynb

Uploaded the updated file.

Signed-off-by: saumya dholia <saumyadholia25@gmail.com>

senuradp · 2026-05-13T11:39:36Z

Thanks for updating the PR based on the earlier feedback. I reviewed the updated rolling window implementation, and this version is much better aligned with the intended modular pipeline structure.

The create_rolling_windows(df, window_size, step_size) function now focuses only on rolling window segmentation, accepts the expected inputs, and returns a list of windowed DataFrames. The code is clean, readable, and no longer overlaps with other pipeline stages.

There is one minor issue to fix before final integration: the function currently resets the index inside each window using reset_index(drop=True). Since the preprocessed dataframe uses the timestamp as the index, we should preserve the original index so the correlation stage can correctly capture the start_time and end_time for each window.

I updated:
window = df.iloc[start:end].copy().reset_index(drop=True)

to:
window = df.iloc[start:end].copy()

Overall, this is a solid rolling window implementation and is ready to proceed after this minor fix.
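The effect of removing reset_index(drop=True) can be seen in a small sketch. The frame, the `temperature` column, and the slice bounds below are hypothetical; only the two `window = ...` expressions and the `start_time` / `end_time` names come from the review comment.

```python
import pandas as pd

# Hypothetical preprocessed frame: the timestamp is the index, as the review describes.
df = pd.DataFrame(
    {"temperature": [20.1, 20.4, 20.9, 21.3, 21.0, 20.7]},
    index=pd.date_range("2026-01-01 00:00", periods=6, freq="min"),
)
start, end = 2, 5

# Earlier version: the timestamps are replaced by a fresh 0, 1, 2 integer index.
reset_window = df.iloc[start:end].copy().reset_index(drop=True)

# Updated version: the timestamp index survives the slice.
window = df.iloc[start:end].copy()

# A downstream stage can then read the window bounds straight from the index.
start_time = window.index[0]
end_time = window.index[-1]
```

In the reset version the bounds would have to be passed alongside each window, whereas the preserved index carries them with the data.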

CCA - 003: Rolling window segmentation

f742221

Implemented rolling window logic and correlation computation for multi-sensor time-series data.

Signed-off-by: saumya dholia <saumyadholia25@gmail.com>

senuradp requested changes Apr 25, 2026

Add files via upload

927d05d

Signed-off-by: saumya dholia <saumyadholia25@gmail.com>

Add files via upload

5b9ec57

Signed-off-by: saumya dholia <saumyadholia25@gmail.com>

saumyadholia25-source wants to merge 3 commits into DataBytes-Organisation:feature/correlation-alert/venura/saumya from saumyadholia25-source:feature/correlation-alert/venura/saumya
