feat: add one-shot resample function for simplified resampling by phillip-wenig-frequenz · Pull Request #78 · frequenz-floss/frequenz-resampling-rs

phillip-wenig-frequenz · 2026-02-26T11:55:29Z

Summary

Add a new resample() convenience function available in both Rust and Python APIs, allowing users to resample data in a single call without needing to instantiate and manage a Resampler instance.

Rust API

use frequenz_resampling::{resample, ResamplingFunction, SimpleSample};
let result = resample(&data, TimeDelta::seconds(5), ResamplingFunction::Average, true);

Python API

from datetime import timedelta
from frequenz.resampling import resample, ResamplingFunction
result = resample(data, timedelta(seconds=5), ResamplingFunction.Average)

Changes

Added resample() function in Rust (src/lib.rs) with SimpleSample type
Added resample() Python binding (src/python.rs)
Made epoch_align() public to support the new functionality
Updated README with "One-Shot Resampling" and "Stateful Resampling" sections
Updated type stubs with Sequence input type for flexibility
Added comprehensive test coverage in both Rust and Python
Updated RELEASE_NOTES.md

depends on #75

Copilot

Pull request overview

This PR adds a one-shot resample() convenience API for both Rust and Python to resample a batch of timestamp/value pairs without instantiating a Resampler, and clarifies/fixes interval-boundary semantics so first_timestamp only affects output timestamp labeling.

Changes:

Added one-shot resample() functions to Rust (src/lib.rs) and Python bindings (src/python.rs), plus exports/stubs.
Standardized resampling interval semantics to [start, end) regardless of first_timestamp, updating docs and tests accordingly.
Updated README and release notes to document the new API and the semantics fix.

Reviewed changes

Copilot reviewed 9 out of 9 changed files in this pull request and generated 2 comments.


File	Description
`src/lib.rs`	Adds Rust `resample()` convenience function and `SimpleSample`, re-exports `epoch_align`.
`src/python.rs`	Adds Python `resample()` binding and exports it from the extension module.
`src/resampler.rs`	Fixes boundary/grouping behavior to always use `[start, end)`; makes `epoch_align()` public.
`src/tests.rs`	Updates existing Rust tests and adds coverage for boundary/labeling semantics and one-shot resampling.
`tests/test_resampler.py`	Updates Python tests for new semantics and adds one-shot `resample()` tests.
`frequenz/resampling/_rust_backend.pyi`	Exposes `resample()` in stubs and widens input type to `Sequence`.
`frequenz/resampling/__init__.py`	Re-exports `resample()` from the Python package.
`README.md`	Documents one-shot vs stateful resampling for Rust and Python.
`RELEASE_NOTES.md`	Notes the `first_timestamp` semantic fix and the new one-shot API.


phillip-wenig-frequenz · 2026-02-26T12:11:51Z

Thanks for the review feedback.

Regarding the division-by-zero concern for sub-millisecond/zero intervals: this is a pre-existing issue in the codebase, not introduced by this PR. The epoch_align() function was already called internally by Resampler::new(), so users could already trigger the same panic through the existing public API.

The scope of this PR is adding the resample() convenience function with behavior consistent with the existing Resampler API. Input validation for intervals could be addressed in a follow-up PR if desired.
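The follow-up validation mentioned above could look roughly like the sketch below. It is not the crate's code: `epoch_align_checked` is a hypothetical name, and both the whole-millisecond arithmetic (inferred from the "sub-millisecond" wording) and the round-down direction are assumptions made for illustration.

```python
from datetime import datetime, timedelta, timezone

EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)


def epoch_align_checked(timestamp: datetime, interval: timedelta) -> datetime:
    """Round `timestamp` down to a multiple of `interval` since the epoch.

    Hypothetical helper: rejects intervals shorter than 1 ms instead of
    dividing by zero. Rounding down is an assumption of this sketch.
    """
    # Whole milliseconds in the interval; 0 for sub-millisecond intervals.
    interval_ms = interval // timedelta(milliseconds=1)
    if interval_ms <= 0:
        raise ValueError("interval must be at least 1 millisecond")
    elapsed_ms = (timestamp - EPOCH) // timedelta(milliseconds=1)
    # Floor division keeps the result on an interval boundary.
    aligned_ms = (elapsed_ms // interval_ms) * interval_ms
    return EPOCH + timedelta(milliseconds=aligned_ms)
```

With this guard, a zero or 500-microsecond interval raises `ValueError` up front rather than failing inside the alignment arithmetic.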

Copilot

Pull request overview

Copilot reviewed 9 out of 9 changed files in this pull request and generated 1 comment.


cwasicki · 2026-02-27T20:58:48Z

+    interval: timedelta,
+    method: ResamplingFunction,
+    *,
+    first_timestamp: bool = True,


You might consider using the same terminology as in pandas, e.g. this would be label which can be "right" or "left" (see also my other comment).

cwasicki · 2026-02-27T21:11:38Z

            resampling_function: The resampling function.
            max_age_in_intervals: The maximum age of a sample in intervals.
            start: The start time of the resampling.
-            first_timestamp: Whether the resampled timestamp should be the first


Not sure about this change. There are two separate features here, the "closedness" definition of the intervals, and the label. I wonder if this parameter was intended to be for the interval definition, namely to distinguish:

left-closed: [t, t+1) (what we typically use in reporting API)

right-closed: (t, t+1] (what we typically use in the microgrid data pipeline in the SDK).

Resampled data cannot be converted from one interval definition to another.

For the other feature, the label, the data can easily be fixed afterwards by shifting the timestamps by the sampling period.

So to me it makes sense to have the original functionality and not change its meaning. Or if so, we better have two parameters.

See also closed and label parameters here: https://pandas.pydata.org/docs/reference/api/pandas.DataFrame.resample.html.
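To make the distinction concrete, here is a minimal pure-Python sketch of the two independent choices. It assumes integer-second timestamps and epoch-aligned intervals; the function name `bucket` is hypothetical and this is not the crate's implementation.

```python
from collections import defaultdict
from typing import Literal


def bucket(
    samples: list[tuple[int, float]],
    interval: int,
    closed: Literal["left", "right"],
    label: Literal["left", "right"],
) -> dict[int, list[float]]:
    """Group (timestamp, value) pairs into epoch-aligned intervals."""
    buckets: dict[int, list[float]] = defaultdict(list)
    for ts, value in samples:
        if closed == "left":
            # [start, end): a boundary sample opens a new interval.
            start = (ts // interval) * interval
        else:
            # (start, end]: a boundary sample closes the previous interval.
            start = -(-ts // interval) * interval - interval
        # The label only decides which edge names the interval.
        key = start if label == "left" else start + interval
        buckets[key].append(value)
    return dict(buckets)
```

For samples at 0 s, 3 s and 5 s with a 5 s interval, switching `label` only shifts every key by 5 while the groups stay the same. Switching `closed` moves the samples at 0 s and 5 s into different groups, which cannot be undone by relabeling.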

malteschaaf

Only two small comments, but also I can't say much about the rust code.

malteschaaf · 2026-04-08T10:31:35Z

        max_age_in_intervals: int,
        start: datetime,
-        first_timestamp: bool = True,
+        closed: Literal["left", "right"],


One minor thing: I introduced a similar change for the SDK resampler and was advised to use an enum instead of a literal here. Did you give this a thought? I know pandas also uses the Literal, but it might be beneficial here to be a bit more explicit.

Yes, I liked the literal stuff more, but I get your point.

malteschaaf · 2026-04-08T10:37:23Z

+        closed: Which interval edge is closed for sample membership. Use
+            `"left"` for `[start, end)` intervals or `"right"` for
+            `(start, end]` intervals.
+        label: Which interval edge to use for output timestamps. Use `"left"`
+            for the interval start or `"right"` for the interval end.


Small wording suggestion to make this a bit clearer and more consistent with interval terminology to avoid mapping left/right to start/end:

Suggested change

-        closed: Which interval edge is closed for sample membership. Use
-            `"left"` for `[start, end)` intervals or `"right"` for
-            `(start, end]` intervals.
-        label: Which interval edge to use for output timestamps. Use `"left"`
-            for the interval start or `"right"` for the interval end.
+        closed: Which interval edge is closed for sample membership. Use `"left"` for
+            left-closed, right-open intervals `[start, end)` and `"right"` for
+            right-closed, left-open intervals `(start, end]`.
+        label: Which interval edge to use for output timestamps. Use `"left"` for the left bin edge and `"right"` for the right bin edge.

malteschaaf

LGTM

cwasicki · 2026-04-09T15:37:35Z

+
+    assert result == [
+        (start + 5 * step, 20.0),
+        (start + 10 * step, None),


Shouldn't this be

(start, 10.0), (start + 5 * step, 20),

Do you mean

(start, 10.0), (start + 5 * step, 20),

because start*step doesn't work, I think.

Indeed, fixed. Thank you!

With closed=Closed.Right and label=Label.Right, the buckets are (0s, 5s] and (5s, 10s], labeled at 5s and 10s. That means:

the sample at start (0s) is excluded from the first bucket

the sample at start + 5 * step (5s) is included in the first bucket

It depends on whether your reference is the index or the data. If it's the data, the first bucket would be introduced as (-5s, 0s] to cover your data point.

E.g. for pandas:

pd.DataFrame([10,20], index=pd.to_datetime(["1970-01-01T00:00:00Z", "1970-01-01T00:00:05Z"])).resample("5s", closed="right", label="right").sum()

                            0
1970-01-01 00:00:00+00:00  10
1970-01-01 00:00:05+00:00  20

You can also see the new bucket when changing the label to left:

pd.DataFrame([10,20], index=pd.to_datetime(["1970-01-01T00:00:00Z", "1970-01-01T00:00:05Z"])).resample("5s", closed="right", label="left").sum()

                            0
1969-12-31 23:59:55+00:00  10
1970-01-01 00:00:00+00:00  20

I can change it to behave exactly like pandas if you like. What will be more helpful for our use cases?

Personally I prefer the pandas behavior since it focuses on keeping the data and not the index.
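The two conventions discussed in this thread can be contrasted with a small sketch. It assumes integer-second timestamps; `right_closed_buckets` and its `keep_data` flag are hypothetical names used only for illustration, not part of the crate or of pandas.

```python
def right_closed_buckets(
    timestamps: list[int], interval: int, start: int, keep_data: bool
) -> dict[tuple[int, int], list[int]]:
    """Return {(lo, hi): [timestamps]} for right-closed (lo, hi] intervals."""
    buckets: dict[tuple[int, int], list[int]] = {}
    for ts in timestamps:
        # Smallest interval end at or after ts, measured from `start`.
        hi = -(-(ts - start) // interval) * interval + start
        lo = hi - interval
        if lo < start and not keep_data:
            # Index-anchored: the first bucket is (start, start + interval],
            # so a sample exactly at `start` has no bucket and is dropped.
            continue
        # Data-anchored (pandas-like): an extra bucket (start - interval, start]
        # is created so the sample at `start` is kept.
        buckets.setdefault((lo, hi), []).append(ts)
    return buckets
```

For samples at 0 s and 5 s with `start=0`, `keep_data=True` yields the buckets `(-5, 0]` and `(0, 5]`, matching the pandas output quoted above, while `keep_data=False` yields only `(0, 5]`.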

cwasicki · 2026-04-09T15:38:53Z

+        index=pd.DatetimeIndex([timestamp for timestamp, _ in data]),
+        dtype="float64",
+    )
+    pandas_result = pandas_series.resample("5s", closed="left", label="right").mean()


This can iterate over all combinations of closed/label options to validate consistency between both.

cwasicki · 2026-04-09T15:44:50Z

+
+    # First value in each interval is None
+    data: list[tuple[dt.datetime, float | None]] = [
+        (start + i * step, None if i in (0, 5) else float(i + 1)) for i in range(10)


What if the interval only contains None?

Complete bucket of only None:

Average, Sum, Min, Max, First, Last, Coalesce -> None

Count -> 0

Partially None bucket:

Average, Sum, Min, Max ignore the Nones

First and Last do not ignore them; if the first/last sample is None, result can be None

Coalesce returns the first non-None

Count counts only non-None
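The rules listed above can be written down as a reference sketch in plain Python. The function names are hypothetical and this mirrors only the behavior described in this reply, not the Rust implementation.

```python
from typing import Optional

Bucket = list[Optional[float]]


def _present(bucket: Bucket) -> list[float]:
    """Values that actually arrived (None means nothing arrived)."""
    return [v for v in bucket if v is not None]


def average(bucket: Bucket) -> Optional[float]:
    vals = _present(bucket)
    return sum(vals) / len(vals) if vals else None


def total(bucket: Bucket) -> Optional[float]:
    vals = _present(bucket)
    return sum(vals) if vals else None


def minimum(bucket: Bucket) -> Optional[float]:
    vals = _present(bucket)
    return min(vals) if vals else None


def maximum(bucket: Bucket) -> Optional[float]:
    vals = _present(bucket)
    return max(vals) if vals else None


def first(bucket: Bucket) -> Optional[float]:
    # Does not skip None: the edge sample is returned as-is.
    return bucket[0] if bucket else None


def last(bucket: Bucket) -> Optional[float]:
    return bucket[-1] if bucket else None


def coalesce(bucket: Bucket) -> Optional[float]:
    # First value that is not None, or None if there is none.
    return next((v for v in bucket if v is not None), None)


def count(bucket: Bucket) -> int:
    return len(_present(bucket))
```

For the bucket `[None, 2.0, 4.0]` this gives an average of 3.0, a first value of `None`, a coalesced value of 2.0 and a count of 2; for `[None, None]` every function returns `None` except `count`, which returns 0.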

cwasicki · 2026-04-09T15:45:15Z

+
+
+def test_resample_function_with_none_values() -> None:
+    """Test the resample function with None values."""


Are NaNs handled the same as Nones?

Complete bucket of only NaN:

all aggregations except Count return NaN

Count -> counts them as present

Partially NaN bucket:

Average, Sum -> NaN

Min, Max -> effectively ignore NaN when real numbers are present

First, Last -> return NaN if the chosen edge sample is NaN

Coalesce -> returns NaN if it is the first non-None

Count -> counts NaN as present
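A compact sketch of the NaN rules above, again in plain Python with hypothetical function names. The point it illustrates is that NaN is treated as a value that arrived, unlike `None`; it is not a copy of the Rust code.

```python
import math
from typing import Optional

Bucket = list[Optional[float]]


def present(bucket: Bucket) -> list[float]:
    """Drop None only; NaN stays because it is a real sample."""
    return [v for v in bucket if v is not None]


def average(bucket: Bucket) -> Optional[float]:
    vals = present(bucket)
    # Any NaN among the values propagates through the sum.
    return sum(vals) / len(vals) if vals else None


def minimum(bucket: Bucket) -> Optional[float]:
    vals = present(bucket)
    if not vals:
        return None
    # Prefer real numbers; fall back to NaN only if nothing else exists.
    real = [v for v in vals if not math.isnan(v)]
    return min(real) if real else math.nan


def coalesce(bucket: Bucket) -> Optional[float]:
    return next((v for v in bucket if v is not None), None)


def count(bucket: Bucket) -> int:
    return len(present(bucket))
```

For `[None, nan, 2.0, 4.0]`, `average` and `coalesce` return NaN, `minimum` returns 2.0 and `count` returns 3.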

cwasicki · 2026-04-09T15:45:37Z

+    # Interval [5, 10): values 7,8,9,10 → avg = 8.5
+    assert result[0] == (start, 3.5)
+    assert result[1] == (start + 5 * step, 8.5)
+


Maybe add a test for inf too.

Complete bucket of only inf:

all aggregations except Count return inf

Count -> counts them as present

Partially inf bucket:

Average, Sum -> inf

Min -> finite minimum wins if any smaller finite value exists

Max -> inf

First, Last -> return inf if the chosen edge sample is inf

Coalesce -> returns inf if it is the first non-None

Count -> counts inf as present
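The inf rules follow directly from IEEE 754 arithmetic, which Python floats share. A short sketch with a hypothetical `summarize` helper, assuming a non-empty bucket with no `None` entries:

```python
import math


def summarize(bucket: list[float]) -> dict[str, float]:
    """Aggregate a non-empty bucket of floats (no None handling here)."""
    return {
        "average": sum(bucket) / len(bucket),
        "sum": sum(bucket),
        "min": min(bucket),
        "max": max(bucket),
        "first": bucket[0],
        "last": bucket[-1],
        "count": len(bucket),
    }


result = summarize([1.0, math.inf, 3.0])
```

Here `result["sum"]`, `result["average"]` and `result["max"]` are inf, while `result["min"]` is 1.0. One case worth a dedicated test: a bucket containing both `inf` and `-inf` sums to NaN, not inf.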

phillip-wenig-frequenz · 2026-04-10T12:32:24Z

@cwasicki pandas treats both None and NaN as missing in numeric series, while our implementation only treats None as missing.

Should we adapt it to work exactly like pandas or do we accept this difference? I think our way is better, since None and NaN are two different concepts. None means, nothing arrived. NaN means, NaN arrived.

phillip-wenig-frequenz · 2026-04-10T12:32:42Z

I added some more tests.

cwasicki · 2026-04-10T13:03:42Z

Should we adapt it to work exactly like pandas or do we accept this difference? I think our way is better, since None and NaN are two different concepts. None means, nothing arrived. NaN means, NaN arrived.

I haven't thought about this enough but your explanation makes sense to me. But I think this should be clearly explained in the docs, maybe even stating the different behavior in pandas.

cwasicki

Thank you!

…_timestamp

The `first_timestamp` parameter was incorrectly affecting both output timestamp labeling and interval grouping semantics. When `first_timestamp=false`, intervals used `(start, end]` instead of `[start, end)`, causing samples at exact interval boundaries to be excluded.

This fix makes interval grouping consistently use `[start, end)` regardless of `first_timestamp` value. The parameter now only controls the output timestamp labeling as documented: `true` labels with the interval start, `false` with the interval end.

Changes include:
- Simplified boundary checking functions to remove first_timestamp logic
- Updated documentation to clarify the parameter's sole purpose
- Fixed test data to start at t=0 to match README example
- Added explicit test for first_timestamp=false behavior
- Updated RELEASE_NOTES with bug fix details

Signed-off-by: Phillip Wenig <phillip.wenig@frequenz.com>

Add tests to strengthen coverage of the first_timestamp fix:

Rust tests (src/tests.rs):
- test_first_timestamp_same_values: Verifies both settings produce identical aggregated values, differing only in output timestamps
- test_sample_at_interval_boundary: Confirms sample at t=5 goes to interval [5, 10), validating [start, end) semantics
- test_data_starting_mid_interval: Tests correct behavior when data doesn't start at interval boundary

Python tests (tests/test_resampler.py):
- Updated all tests to start data at t=0 to match README example
- Fixed expected values to match correct [start, end) interval semantics

Signed-off-by: Phillip Wenig <phillip.wenig@frequenz.com>

Add a new `resample()` convenience function available in both Rust and Python APIs, allowing users to resample data in a single call without needing to instantiate and manage a `Resampler` instance.

The Rust implementation includes:
- A new `SimpleSample` type for simple timestamp/value pairs
- The public `resample()` function that internally uses `Resampler`
- Made `epoch_align()` public to support the new functionality

The Python implementation provides the same functionality through PyO3 bindings with matching semantics.

Also updated:
- README with "One-Shot Resampling" and "Stateful Resampling" sections
- Type stubs with function signature using Sequence for flexibility
- Comprehensive test coverage in both Rust and Python
- RELEASE_NOTES.md

Signed-off-by: Phillip Wenig <phillip.wenig@frequenz.com>

… label parameters

Replace the single boolean `first_timestamp` parameter with two explicit enums in both `Resampler::new()` and `resample()` functions:

- `Closed`: Controls which edge of the interval is closed for sample membership. Use `Closed::Left` for `[start, end)` intervals or `Closed::Right` for `(start, end]` intervals.
- `Label`: Controls which edge of the interval is used for output timestamps. Use `Label::Left` for the interval start or `Label::Right` for the interval end.

This clarifies the API by making the distinction between interval membership semantics (closed) and output timestamp labeling (label) explicit, rather than bundling them into a single boolean.

BREAKING CHANGE: The `first_timestamp` parameter is removed from both Rust and Python APIs. Replace calls using `first_timestamp=true` with `Closed::Left, Label::Left` (Rust) or `closed="left", label="left"` (Python). Replace `first_timestamp=false` with `Closed::Left, Label::Right` (Rust) or `closed="left", label="right"` (Python).

Signed-off-by: Phillip Wenig <phillip.wenig@frequenz.com>
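The migration rule from the breaking-change note can be captured in a tiny helper. `migrate_first_timestamp` and `IntervalOptions` are hypothetical names for illustration; the string values are the ones given in the commit message, though the review thread also refers to `Closed.Right` and `Label.Right`, so the final Python API may expect enum members instead.

```python
from typing import Literal, TypedDict


class IntervalOptions(TypedDict):
    closed: Literal["left", "right"]
    label: Literal["left", "right"]


def migrate_first_timestamp(first_timestamp: bool) -> IntervalOptions:
    """Map the removed boolean to the equivalent closed/label pair.

    Both old settings used [start, end) membership after the semantics
    fix, so `closed` is always "left"; only the label differs.
    """
    return {
        "closed": "left",
        "label": "left" if first_timestamp else "right",
    }
```

Right-closed intervals have no `first_timestamp` equivalent under the fixed semantics, so the helper never returns `closed="right"`.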

Signed-off-by: Phillip Wenig <phillip.wenig@frequenz.com>

phillip-wenig-frequenz requested review from Copilot, cwasicki, malteschaaf and shsms February 26, 2026 11:55

Copilot started reviewing on behalf of phillip-wenig-frequenz February 26, 2026 11:55 View session

phillip-wenig-frequenz force-pushed the resample-simple-function branch from a394304 to 95c51e4 Compare February 26, 2026 11:56

Copilot AI reviewed Feb 26, 2026


Comment thread src/resampler.rs

Comment thread src/python.rs Outdated

phillip-wenig-frequenz marked this pull request as draft February 26, 2026 12:16

phillip-wenig-frequenz force-pushed the resample-simple-function branch from 95c51e4 to a2a4745 Compare February 26, 2026 12:20

phillip-wenig-frequenz marked this pull request as ready for review February 26, 2026 12:20

Copilot AI review requested due to automatic review settings February 26, 2026 12:20

Copilot started reviewing on behalf of phillip-wenig-frequenz February 26, 2026 12:21 View session

Copilot AI reviewed Feb 26, 2026


Comment thread src/lib.rs Outdated

cwasicki reviewed Feb 27, 2026


phillip-wenig-frequenz requested a review from cwasicki April 8, 2026 10:07

phillip-wenig-frequenz force-pushed the resample-simple-function branch 3 times, most recently from 966ba31 to f9d418a Compare April 8, 2026 10:22

malteschaaf reviewed Apr 8, 2026


phillip-wenig-frequenz force-pushed the resample-simple-function branch 2 times, most recently from eb5c08c to 89fffb9 Compare April 8, 2026 11:58

phillip-wenig-frequenz requested a review from malteschaaf April 8, 2026 11:58

malteschaaf approved these changes Apr 9, 2026


cwasicki reviewed Apr 9, 2026


phillip-wenig-frequenz force-pushed the resample-simple-function branch from 89fffb9 to b1ac2c1 Compare April 10, 2026 12:33

phillip-wenig-frequenz force-pushed the resample-simple-function branch from b1ac2c1 to aa4820c Compare April 10, 2026 13:47

cwasicki previously approved these changes Apr 10, 2026


phillip-wenig-frequenz added this pull request to the merge queue Apr 10, 2026

github-merge-queue Bot removed this pull request from the merge queue due to no response for status checks Apr 10, 2026

phillip-wenig-frequenz requested a review from stefan-brus-frequenz April 10, 2026 15:17

phillip-wenig-frequenz added this pull request to the merge queue Apr 21, 2026

github-merge-queue Bot removed this pull request from the merge queue due to no response for status checks Apr 21, 2026

phillip-wenig-frequenz added 5 commits April 21, 2026 14:55

docs: update release notes for parameter refactoring

fbae0fc

Signed-off-by: Phillip Wenig <phillip.wenig@frequenz.com>

phillip-wenig-frequenz force-pushed the resample-simple-function branch from aa4820c to fbae0fc Compare April 21, 2026 12:55

phillip-wenig-frequenz enabled auto-merge April 21, 2026 12:55

phillip-wenig-frequenz added this pull request to the merge queue Apr 21, 2026

github-merge-queue Bot removed this pull request from the merge queue due to no response for status checks Apr 21, 2026

phillip-wenig-frequenz dismissed cwasicki’s stale review via 32e4b19 April 21, 2026 14:59

fix: merge queue

ba9ed6d

Signed-off-by: Phillip Wenig <phillip.wenig@frequenz.com>

phillip-wenig-frequenz force-pushed the resample-simple-function branch from 32e4b19 to ba9ed6d Compare April 21, 2026 15:00

phillip-wenig-frequenz requested a review from cwasicki April 21, 2026 15:00

cwasicki approved these changes Apr 21, 2026


phillip-wenig-frequenz added this pull request to the merge queue Apr 21, 2026

Merged via the queue into frequenz-floss:v0.x.x with commit b0f0a01 Apr 21, 2026
7 checks passed

phillip-wenig-frequenz deleted the resample-simple-function branch April 21, 2026 15:18


