Contributor
Reviewer's Guide

Adjusts `audformat.utils.to_segmented_index()` to preserve nanosecond timedelta precision under pandas 3.0’s new default of second precision, and adds a regression test to guard against FutureWarnings and precision loss when filling NaT segment ends from file durations.
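The idea of casting before filling can be illustrated with a minimal sketch. This is not the code from the pull request: `fill_open_ends` is a hypothetical helper, and the segment ends and file durations are made-up values chosen so that one duration has a nanosecond component.

```python
import pandas as pd


def fill_open_ends(ends, durations):
    """Replace NaT in ``ends`` with the matching entry of ``durations``.

    Hypothetical helper: both arguments are assumed to be
    timedelta-like and of equal length.
    """
    # Cast to nanoseconds before filling, so the result does not depend
    # on the unit the inputs happened to be created with.
    ends = pd.Series(ends).astype("timedelta64[ns]")
    durations = pd.Series(durations).astype("timedelta64[ns]")
    # fillna aligns on the (default integer) index of both series.
    return pd.TimedeltaIndex(ends.fillna(durations))


# Made-up segment ends; the second segment is open-ended (NaT).
ends = pd.TimedeltaIndex([pd.Timedelta(250, unit="ms"), pd.NaT])
# Made-up file durations with a nanosecond component.
durations = pd.to_timedelta([1_500_000_001, 1_500_000_001], unit="ns")

filled = fill_open_ends(ends, durations)
print(filled.dtype)
```

Casting both sides up front keeps the fill step independent of whichever default unit the installed pandas version infers.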
Contributor
Hey - I've left some high level feedback:
- Before calling `ends = ends.astype("timedelta64[ns]")`, consider guarding with a check that `ends` has a timedelta-like dtype (e.g., using `is_timedelta64_dtype`) to avoid surprising failures if the index level type changes in the future.
- In `test_to_segmented_index_timedelta_precision`, you can simplify and make the duration comparison more robust by using `pandas.testing.assert_index_equal` (or `assert_series_equal`) for `result_ends` vs `expected_ends` instead of the manual loop.
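Both suggestions can be sketched together as follows. The helper name `to_ns` and the sample values are assumptions for illustration, and returning non-timedelta input unchanged is only one possible reading of "guarding"; the actual function and test in the pull request may look different.

```python
import numpy as np
import pandas as pd
from pandas.api.types import is_timedelta64_dtype


def to_ns(ends):
    """Cast ``ends`` to nanosecond precision if it is timedelta-like.

    Hypothetical helper: other dtypes are returned unchanged
    instead of raising inside ``astype``.
    """
    if not is_timedelta64_dtype(ends):
        return ends
    return ends.astype("timedelta64[ns]")


# Made-up ends stored with microsecond resolution.
raw_ends = pd.TimedeltaIndex(
    np.array([1_500_000, 2_000_000], dtype="timedelta64[us]")
)
result_ends = to_ns(raw_ends)

# The same durations expressed in nanoseconds.
expected_ends = pd.TimedeltaIndex(
    np.array([1_500_000_000, 2_000_000_000], dtype="timedelta64[ns]")
)

# One call compares values and dtype, replacing a manual loop.
pd.testing.assert_index_equal(result_ends, expected_ends)
```

Because `assert_index_equal` also compares the dtype, a silent change of the timedelta unit would fail the test even when the values still match.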
Member
Author
We now have too many changes here and have introduced many new issues that were not present before, so it may be better to make the updates step by step across several pull requests.
Closes #487
This adds tests and fixes for the new `pandas==3.0.0` default of `timedelta[s]` instead of `timedelta[ns]`.

This required the following fixes:

- `audformat.utils.to_segmented_index()`: `timedelta64[ns]` before iloc assignment
- `audformat.utils.union()`: `timedelta64[ns]` in all code paths
- `audformat.utils.intersect()`: `timedelta64[ns]`
- `audformat.utils.set_index_dtypes()`: `.astype(dtype)` after `pd.to_timedelta()` for empty levels
- `audformat.segmented_index()`: `set_index_dtypes` to ensure `timedelta64[ns]`
- `audformat.testing.add_table()`: `pd.to_timedelta()` call
- `audformat.utils.hash()`: `object` dtype for string columns to get same hash under Python 3.14

Summary by Sourcery
Ensure segmented index duration handling preserves sub-second precision with pandas 3.0 and later.
Bug Fixes:
Tests: