
Add MiniSEED read support #666

Merged

d-chambers merged 8 commits into dev from mseed-support on May 5, 2026

Conversation


@d-chambers commented May 1, 2026

Summary

  • Add an optional pymseed-backed MiniSEED v2/v3 reader
  • Map MiniSEED source IDs to channel-time patches, with NSLC metadata stored as per-channel coords
  • Add partial-read support and include a small real DAS MiniSEED fixture in the common IO coverage

Summary by CodeRabbit

  • New Features

    • Added MiniSEED support (v2 & v3): automatic format detection, metadata scanning, and reading with time/channel/source-patch filtering.
  • Tests

    • Added comprehensive MiniSEED test suite covering detection, reading, scanning, filtering, edge cases, and missing-dependency behavior.
  • Chores

    • Added optional dependency for MiniSEED handling (pymseed[numpy]).
    • Updated data registry with a sample MiniSEED entry.


coderabbitai Bot commented May 1, 2026

Note

Reviews paused

It looks like this branch is under active development. To avoid overwhelming you with review comments due to an influx of new commits, CodeRabbit has automatically paused this review. You can configure this behavior by changing the reviews.auto_review.auto_pause_after_reviewed_commits setting.

Use the following commands to manage reviews:

  • @coderabbitai resume to resume automatic reviews.
  • @coderabbitai review to trigger a single review.

📝 Walkthrough

Adds a MiniSEED IO backend (MSeedV2/MSeedV3) with format detection, scanning, and reading; implements MiniSEED decoding/grouping utilities; registers entry points and optional pymseed dependency; adds a data registry fixture, a small file-header helper, and extensive unit/integration tests.

Changes

MiniSEED IO feature

Layer / File(s) — Summary

  • Registry / fixtures — dascore/data_registry.txt: Adds one Etna MiniSEED sample entry mapping etna_9n_3chan_10s.mseed to its hash and remote URL.
  • Filesystem helper — dascore/utils/io.py: Adds _read_file_header to safely read an initial byte prefix from a path, returning empty bytes on OSError.
  • Decoding & grouping logic (core data shape & algorithms) — dascore/io/mseed/utils.py: Implements MiniSEED internals: identity/timing dataclasses, record→segment decoding and trimming, dtype/encoding mapping, contiguity coalescing for decoded segments and header-only summaries, grouping by compatibility keys, stable channel-index mapping, deterministic _source_patch_id formatting, patch and ScanPayload construction, and _detect_format for v2/v3.
  • Fiber IO implementation (wiring & public API) — dascore/io/mseed/core.py: Adds MSeedV2(FiberIO) with name="MSEED", preferred_extensions, version="2", and methods get_format, scan, and read that delegate to _detect_format, _scan_patches, and _get_patches with lazy optional_import("pymseed"); adds MSeedV3 overriding version="3".
  • Package surface — dascore/io/mseed/__init__.py: Adds a package initializer enabling postponed annotations, a docstring, and re-exports of MSeedV2 and MSeedV3 from dascore.io.mseed.core.
  • Packaging / entry points & extras — pyproject.toml: Adds optional dependency pymseed[numpy] and registers entry points MSEED__V2 = "dascore.io.mseed.core:MSeedV2" and MSEED__V3 = "dascore.io.mseed.core:MSeedV3" under project.entry-points."dascore.fiber_io".
  • Tests — tests/test_io/test_common_io.py, tests/test_io/test_mseed/test_mseed.py: Adds MSeedV2 to common IO read fixtures; introduces a comprehensive MiniSEED test module covering format detection, read/scan semantics, grouping/coalescing rules, many edge cases, dependency-missing behavior, and a real Etna MiniSEED integration test.
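The "grouping by compatibility keys" step can be illustrated with a toy example. The tuples below are hypothetical; the real keys in utils.py also include format version, location code, and channel name:

```python
from collections import defaultdict

# Hypothetical (network, station, channel, sample_rate, start_ns) summaries.
segments = [
    ("9N", "E01", "GPZ", 100.0, 0),
    ("9N", "E02", "GPZ", 100.0, 0),
    ("9N", "E03", "GPZ", 100.0, 0),
    ("9N", "E01", "GPZ", 50.0, 0),  # different rate -> separate patch
]

groups: dict[tuple, list] = defaultdict(list)
for seg in segments:
    # Channels sharing sample rate and start time can stack into one
    # 2-D channel-by-time patch.
    key = (seg[3], seg[4])
    groups[key].append(seg)
```

Here the 100 Hz channels form one three-channel group and the 50 Hz segment forms its own single-channel group.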
🚥 Pre-merge checks | ✅ 4 | ❌ 1

❌ Failed checks (1 inconclusive)

  • Description check — ❓ Inconclusive: The PR description provides a clear summary of the changes and objectives but does not follow the repository's template structure with labeled sections and checklist items. Resolution: use the repository's PR template to include structured sections (Description, Checklist) and explicitly note issue references and testing documentation.

✅ Passed checks (4 passed)

  • Title check — ✅ Passed: The title accurately summarizes the primary change: adding MiniSEED read support. It is concise, clear, and directly reflects the main feature being introduced.
  • Docstring Coverage — ✅ Passed: Docstring coverage is 87.39%, which meets the required threshold of 80.00%.
  • Linked Issues check — ✅ Passed: Skipped because no linked issues were found for this pull request.
  • Out of Scope Changes check — ✅ Passed: Skipped because no linked issues were found for this pull request.




@chatgpt-codex-connector (Bot) left a comment


💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 7e903d857b


Comment thread dascore/io/mseed/utils.py Outdated
Comment on lines +229 to +231
sample_count = min(x.sample_count for x in segments)
source_ids = tuple(x.source_id for x in segments)
data = np.stack([x.data[:sample_count] for x in segments])

P1 — Preserve all samples when channel lengths differ

In _patch_from_segments, channel data are truncated to the shortest trace via sample_count = min(...) and x.data[:sample_count], so any group where traces share start/rate but have different lengths will silently drop valid trailing samples from longer channels. Because _get_group_key does not include end time or sample count, this case is reachable for real MiniSEED files with uneven channel coverage, and the truncation affects both read results and scan-derived metadata.
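The truncation described above can be reproduced with a minimal NumPy sketch (illustrative arrays, not the PR's code):

```python
import numpy as np

# Two channels that share start time and rate but differ in length.
segments = [np.arange(100), np.arange(98)]

# Truncating to the shortest trace, as the reviewed code does:
n = min(len(seg) for seg in segments)
data = np.stack([seg[:n] for seg in segments])
# The longer channel silently loses its last two samples.
```

Since `np.stack` requires equal-length rows, preserving all samples would instead mean padding (e.g. with NaN) or splitting the group, rather than trimming to the minimum.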


Comment thread dascore/io/mseed/utils.py Outdated
"""Return a stable source patch ID for a MiniSEED group."""
version, network, location, seed_channel, start_ns, sample_rate = group_key
source_key = ".".join((network, location, seed_channel))
return f"v{version}:{source_key}:{start_ns}:{sample_rate:g}"

P2 — Generate source_patch_id without float rounding collisions

_source_patch_id formats sample_rate with :g, which keeps only ~6 significant digits, so distinct groups can collapse to the same ID (for example, close non-integer rates). That makes source_patch_id non-unique and can cause read(..., source_patch_id=...) to match multiple logical patches or the wrong one. The ID should encode sample rate at full precision (or via an unambiguous derived value like step nanoseconds).
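The rounding collision is easy to demonstrate: two rates that differ past the sixth significant digit format to the same string under `:g`, while full-precision `repr` keeps them distinct (illustrative values):

```python
rate_a, rate_b = 100.000001, 100.000002

# ":g" keeps ~6 significant digits, so both collapse to "100"
# and the derived source_patch_id values would collide.
collided = f"{rate_a:g}" == f"{rate_b:g}" == "100"

# Full-precision repr (or an unambiguous integer step) stays unique.
distinct = repr(rate_a) != repr(rate_b)
```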



@coderabbitai (Bot) left a comment


🧹 Nitpick comments (1)
dascore/io/mseed/utils.py (1)

225-254: 💤 Low value

Consider documenting the truncation behavior for mismatched segment lengths.

Line 229 uses min(x.sample_count for x in segments) which silently truncates longer segments when merging traces with different sample counts. While this may be intentional for handling slight timing variations, it could silently drop data. Consider adding a brief comment or logging when truncation occurs.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@dascore/io/mseed/utils.py` around lines 225 - 254, _patches_from_segments
currently computes sample_count = min(x.sample_count for x in segments) which
silently truncates longer segments; update _patch_from_segments to detect when
any segment.sample_count > sample_count and either add a brief inline comment
explaining intentional truncation or emit a warning log that includes the
group_key, the computed sample_count and the original per-segment sample counts
(use the sample_count variable and the segments list to build the message) so
callers are informed when data is being truncated before stacking into data =
np.stack(...).

ℹ️ Review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Run ID: f2106f71-0b81-4a3d-96d6-1f3b0e81274e

📥 Commits

Reviewing files that changed from the base of the PR and between 906e90b and 7e903d8.

📒 Files selected for processing (7)
  • dascore/data_registry.txt
  • dascore/io/mseed/__init__.py
  • dascore/io/mseed/core.py
  • dascore/io/mseed/utils.py
  • pyproject.toml
  • tests/test_io/test_common_io.py
  • tests/test_io/test_mseed/test_mseed.py


codecov Bot commented May 1, 2026

Codecov Report

✅ All modified and coverable lines are covered by tests.
✅ Project coverage is 100.00%. Comparing base (906e90b) to head (a6f03ee).

Additional details and impacted files
@@            Coverage Diff             @@
##               dev      #666    +/-   ##
==========================================
  Coverage   100.00%   100.00%            
==========================================
  Files          140       143     +3     
  Lines        12890     13273   +383     
==========================================
+ Hits         12890     13273   +383     
Flag Coverage Δ
unittests 100.00% <100.00%> (ø)

Flags with carried forward coverage won't be shown.

☔ View full report in Codecov by Sentry.

@coderabbitai Bot added the "IO — Work for reading/writing different formats" label on May 1, 2026

@coderabbitai (Bot) left a comment


🧹 Nitpick comments (3)
tests/test_io/test_common_io.py (1)

93-94: ⚡ Quick win

Add the v3 reader to the shared matrix too.

This common coverage currently exercises only MSeedV2; if MSeedV3 is a separate public entry point, regressions in that plugin won't be caught by the generic IO tests.

Suggested change
-from dascore.io.mseed.core import MSeedV2
+from dascore.io.mseed.core import MSeedV2, MSeedV3
...
     NetCDFCFV18(): ("xdas_netcdf.nc",),
     MSeedV2(): ("etna_9n_3chan_10s.mseed",),
+    MSeedV3(): ("etna_9n_3chan_10s.mseed",),
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@tests/test_io/test_common_io.py` around lines 93 - 94, The shared test matrix
currently only includes the MSeedV2 entry so MSeedV3 isn't exercised; update the
matrix in tests/test_io/test_common_io.py by adding a parallel MSeedV3 entry
(e.g., MSeedV3(): ("etna_9n_3chan_10s.mseed",) alongside MSeedV2) so the generic
IO tests also run the v3 reader; ensure the key uses the public MSeedV3 symbol
exactly as declared to pick up that plugin.
dascore/io/mseed/utils.py (1)

93-99: ⚡ Quick win

Narrow this fallback to the parser failures you expect.

Catching Exception here will hide real bugs in pymseed.sourceid2nslc and silently turn unexpected failures into generic NSLC metadata.
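A narrowed fallback might look like the sketch below. The parser is hypothetical (pymseed's exact exception types aren't shown in this thread), but the pattern — catch only the expected parse error, let everything else propagate — is the point:

```python
def source_id_to_nslc(source_id: str) -> tuple[str, str, str, str]:
    """Parse an FDSN source ID like 'FDSN:NET_STA_LOC_B_S_SS' into NSLC parts."""
    try:
        _, rest = source_id.split(":", 1)
        net, sta, loc, band, source, subsource = rest.split("_")
        return net, sta, loc, band + source + subsource
    except ValueError:
        # Malformed IDs only; unexpected bugs still propagate.
        return "", "", "", ""
```

Swapping `except Exception` for `except ValueError` (or pymseed's own parse error, if it exposes one) keeps the empty-NSLC fallback for bad IDs without masking real defects.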

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@dascore/io/mseed/utils.py` around lines 93 - 99, The helper
_source_id_to_nslc should only swallow the specific parsing errors from
pymseed.sourceid2nslc instead of catching Exception; update the except to catch
the parser error types you expect (e.g., ValueError, TypeError or the specific
pymseed parse exception class if it exposes one, such as pymseed.SourceIDError)
and return the empty-fallback only for those, while letting all other exceptions
propagate (re-raise) so real bugs in pymseed.sourceid2nslc are not hidden.
tests/test_io/test_mseed/test_mseed.py (1)

467-473: ⚡ Quick win

Avoid a direct registry fetch in the test body.

This makes the integration test depend on downloader/network state. Reuse the shared fixture or wrap the fetch in skip_timeout() so the test stays stable on slow/offline runs.

Suggested change
 from dascore.io.mseed import utils as mseed_utils
 from dascore.io.mseed.core import MSeedV2
+from tests.test_io._common_io_test_utils import skip_timeout
...
-        path = fetch("etna_9n_3chan_10s.mseed")
+        with skip_timeout():
+            path = fetch("etna_9n_3chan_10s.mseed")
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@tests/test_io/test_mseed/test_mseed.py` around lines 467 - 473, The test
test_read_das_station_channels currently calls fetch("etna_9n_3chan_10s.mseed")
directly which makes it depend on network/downloader timing; instead either use
the shared downloader fixture (if one exists in the test suite) to obtain the
path or wrap the fetch call with skip_timeout(fetch(...)) so it will be skipped
on slow/offline runs. Locate the fetch invocation in
test_read_das_station_channels and replace it with the fixture variable (e.g.,
use the provided fetch_path or downloader fixture) or change the line to path =
skip_timeout(lambda: fetch("etna_9n_3chan_10s.mseed")) (or equivalent helper
used elsewhere in tests) so the test no longer directly depends on network
timing.

ℹ️ Review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Run ID: ad91f41d-e62f-4d59-b0ee-986f6a6a86b3

📥 Commits

Reviewing files that changed from the base of the PR and between 7e903d8 and ae45a49.

📒 Files selected for processing (5)
  • dascore/io/mseed/core.py
  • dascore/io/mseed/utils.py
  • dascore/utils/io.py
  • tests/test_io/test_common_io.py
  • tests/test_io/test_mseed/test_mseed.py
✅ Files skipped from review due to trivial changes (2)
  • dascore/utils/io.py
  • dascore/io/mseed/core.py


@coderabbitai (Bot) left a comment


🧹 Nitpick comments (1)
dascore/io/mseed/utils.py (1)

532-538: 💤 Low value

Narrow the exception handler to avoid masking unexpected errors.

The bare except Exception could mask bugs (e.g., logic errors in the detection code). Since unpack raises struct.error on malformed data, catching that specifically is clearer.

♻️ Proposed fix
+from struct import error as StructError, unpack
-from struct import unpack
...
     try:
         year, day = unpack(">HH", header[20:24])
         hour, minute, second = header[24:27]
         _unused, frac_seconds, sample_count = unpack(">BHH", header[27:32])
         data_offset, blockette_offset = unpack(">HH", header[44:48])
-    except Exception:
+    except (StructError, IndexError):
         return False
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@dascore/io/mseed/utils.py` around lines 532 - 538, The try/except around the
unpack calls is too broad; replace the bare "except Exception" with a specific
catch for struct.error (the exception raised by unpack) so only malformed/parse
errors are handled—e.g., "except struct.error:" (or "from struct import error as
StructError" then "except StructError:") and ensure the struct.error symbol is
imported/available; keep the existing unpack/header logic (year, day =
unpack(...), hour, minute, second = header..., _unused, frac_seconds,
sample_count = unpack(...), data_offset, blockette_offset = unpack(...)) so only
struct unpack failures are swallowed.

ℹ️ Review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Run ID: 51f27d00-c4e1-472d-9b49-726e5e9f5176

📥 Commits

Reviewing files that changed from the base of the PR and between ae45a49 and fdb6e0f.

📒 Files selected for processing (4)
  • dascore/io/mseed/core.py
  • dascore/io/mseed/utils.py
  • tests/test_io/test_common_io.py
  • tests/test_io/test_mseed/test_mseed.py
🚧 Files skipped from review as they are similar to previous changes (2)
  • dascore/io/mseed/core.py
  • tests/test_io/test_mseed/test_mseed.py


@coderabbitai (Bot) left a comment


Actionable comments posted: 2

🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@dascore/io/mseed/utils.py`:
- Around line 354-360: can_merge currently allows merging segments with
different original encodings if their decoded dtype matches, causing
scan()/read() to disagree about patch boundaries; update can_merge (used by
_coalesce_source_segments) to require identical original encoding metadata in
addition to the existing checks (sample_rate, format_version, next_start_ns,
data.dtype). Concretely, add a equality check for the attribute that records the
source encoding (e.g., pending.encoding == seg.encoding or the actual encoding
field used on _TraceSegment in your codebase) so _coalesce_source_segments
preserves encoding boundaries the same way _coalesce_source_summaries does,
preventing source_patch_id mismatches between scan and read.
- Around line 147-149: The current next_start_ns method compounds per-sample
rounding by multiplying sample_count by sample_step_ns directly; change it to
compute the nanosecond offset with exact rational/rounded arithmetic to avoid
drift (e.g., use fractions.Fraction or Decimal to compute offset =
int(round(Fraction(self.sample_step_ns) * self.sample_count))) and then return
self.start_ns + offset; update next_start_ns to use this precise offset
calculation referencing the existing attributes start_ns, sample_count and
sample_step_ns.
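The drift this comment describes is concrete: multiplying a pre-rounded per-sample step compounds the rounding error, while rounding once after exact rational arithmetic does not. Illustrative numbers, not the PR's actual values:

```python
from fractions import Fraction

sample_rate = 3        # Hz; 1e9 / 3 ns is not an integer step
sample_count = 3000

step_ns = round(1e9 / sample_rate)      # 333_333_333, already rounded
naive_offset = step_ns * sample_count   # 999_999_999_000 ns

# Exact: keep the step as a rational and round once, at the end.
exact_offset = round(Fraction(10**9, sample_rate) * sample_count)

drift = exact_offset - naive_offset     # 1000 ns of accumulated error
```

Here the compounded rounding is already a full microsecond after 3000 samples, which is why the suggestion computes the offset exactly before converting back to integer nanoseconds.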

ℹ️ Review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Run ID: 4442308a-494f-4db2-9296-0de19517334a

📥 Commits

Reviewing files that changed from the base of the PR and between fdb6e0f and 5bc7805.

📒 Files selected for processing (3)
  • dascore/io/mseed/core.py
  • dascore/io/mseed/utils.py
  • tests/test_io/test_mseed/test_mseed.py
✅ Files skipped from review due to trivial changes (1)
  • tests/test_io/test_mseed/test_mseed.py
🚧 Files skipped from review as they are similar to previous changes (1)
  • dascore/io/mseed/core.py


@coderabbitai (Bot) left a comment


Actionable comments posted: 2

🧹 Nitpick comments (1)
tests/test_io/test_common_io.py (1)

207-214: 💤 Low value

_get_coord_trim_values produces a degenerate single-point trim for small coordinates.

For any coordinate with fewer than 10 unique values (e.g. the 3-channel MiniSEED fixture's channel axis), len // 10 == 0, so start_ind == stop_ind == 1 and both endpoints are the same value. The double-ended test then exercises only exact single-point selection rather than a genuine range, which is valid but less thorough.

♻️ Proposed fix
 def _get_coord_trim_values(coord):
     """Return existing coordinate values to use for read selections."""
     values = coord.values
     if len(values) == 1:
         return values[0], values[0]
-    start_ind = min(max(1, len(values) // 10), len(values) - 1)
-    stop_ind = min(max(start_ind, 2 * len(values) // 10), len(values) - 1)
+    quarter = max(1, len(values) // 4)
+    start_ind = min(quarter, len(values) - 2)
+    stop_ind = min(start_ind + max(1, len(values) // 4), len(values) - 1)
     return values[start_ind], values[stop_ind]
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@tests/test_io/test_common_io.py` around lines 207 - 214, The helper
_get_coord_trim_values currently can return the same value for start and stop
when len(values) < 10 because start_ind and stop_ind both compute to 1; change
the logic so the returned endpoints always form a proper range when possible:
after computing start_ind and stop_ind (or instead of the current math), ensure
stop_ind > start_ind by adjusting them within bounds (e.g., if stop_ind ==
start_ind, decrement start_ind if >0 else increment stop_ind if < len(values)-1)
and then return values[start_ind], values[stop_ind]; reference
_get_coord_trim_values, start_ind, stop_ind, and values when making the change.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@dascore/io/mseed/utils.py`:
- Around line 639-645: The bare except should be narrowed to struct.error:
replace "except Exception:" with "except struct.error:" around the unpack calls
(the try block containing unpack(">HH", header[20:24]) and unpack(">BHH",
header[27:32]) etc.) and ensure the module is imported so struct.error is
available (add "import struct" at the top if not present).


ℹ️ Review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Run ID: f4a5e34c-a572-4421-bfd4-8650975bb98c

📥 Commits

Reviewing files that changed from the base of the PR and between 5bc7805 and 25d2154.

📒 Files selected for processing (3)
  • dascore/io/mseed/utils.py
  • tests/test_io/test_common_io.py
  • tests/test_io/test_mseed/test_mseed.py
🚧 Files skipped from review as they are similar to previous changes (1)
  • tests/test_io/test_mseed/test_mseed.py

Comment thread dascore/io/mseed/utils.py
Comment on lines +89 to +91
def __post_init__(self) -> None:
"""Validate that trace metadata matches the decoded samples."""
assert len(self.data) == self.sample_count

⚠️ Potential issue | 🟡 Minor | ⚡ Quick win

assert in __post_init__ is silently stripped under -O.

Use an explicit ValueError so the invariant is enforced regardless of optimization flags.

🛡️ Proposed fix
     def __post_init__(self) -> None:
         """Validate that trace metadata matches the decoded samples."""
-        assert len(self.data) == self.sample_count
+        if len(self.data) != self.sample_count:
+            msg = (
+                f"Decoded sample count ({len(self.data)}) does not match "
+                f"header sample_count ({self.sample_count})."
+            )
+            raise ValueError(msg)
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@dascore/io/mseed/utils.py` around lines 89 - 91, The __post_init__ method
currently uses an assert (assert len(self.data) == self.sample_count) which is
removed under -O; replace it with an explicit runtime check that raises a
ValueError when len(self.data) != self.sample_count so the invariant is always
enforced. Update the __post_init__ in the class (method name __post_init__) to
perform: if len(self.data) != self.sample_count: raise ValueError(f"sample_count
mismatch: expected {self.sample_count}, got {len(self.data)}") so the error
references the sample_count and data lengths.

d-chambers merged commit d683eda into dev on May 5, 2026
25 checks passed
d-chambers deleted the mseed-support branch on May 5, 2026 at 15:53