feat: schema validation, ipsatize unification, and CircEResults container#132

Open
MitchellAcoustics wants to merge 9 commits into dev from feature/api-improvements
Conversation

@MitchellAcoustics
Owner

Summary

  • Fix SATPSchema silent drop: set drop_invalid_rows = False so invalid rows raise a SchemaErrors exception rather than being silently discarded; added an errors: Literal["raise", "warn"] parameter to fit_circe, following pandas convention
  • Fix centering bug + add ipsatize(): fit_circe now uses grand-mean centering (one scalar per participant, matching the published R analysis) instead of the previous column-wise centering; ipsatize(method=...) added to surveys.processing with "grand_mean" (default), "column_wise", and "row_wise" methods; person_center() becomes a thin wrapper with a deprecation note
  • Add CircEResults container: fit_circe now returns a typed CircEResults dataclass (.table, .for_model(), .__len__(), ._repr_html_()) instead of a bare pd.DataFrame; exported from soundscapy.satp and lazily from soundscapy
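
For reviewers who want to see how the three centering methods differ, here is a minimal sketch. It is not the code in surveys.processing: the function name ipsatize_sketch, its arguments, and the toy column names are assumptions made for illustration, and it assumes "per participant" means grouping rows by a participant column. The grand-mean branch ignores NaN values, in the spirit of the np.nanmean() fix described in the commits.

```python
import numpy as np
import pandas as pd


def ipsatize_sketch(df, group_col, value_cols, method="grand_mean"):
    """Illustrative centering only; NOT the soundscapy implementation."""
    values = df[value_cols].astype(float)
    keys = df[group_col]

    if method == "grand_mean":
        # One scalar per participant: mean of all their non-NaN ratings.
        total = values.sum(axis=1).groupby(keys).transform("sum")
        count = values.notna().sum(axis=1).groupby(keys).transform("sum")
        center = (total / count).to_numpy()[:, None]
    elif method == "column_wise":
        # One scalar per participant per column.
        center = values.groupby(keys).transform("mean").to_numpy()
    elif method == "row_wise":
        # One scalar per row (mean across the value columns).
        center = values.mean(axis=1).to_numpy()[:, None]
    else:
        raise ValueError(f"unknown method: {method!r}")

    out = df.copy()
    out[value_cols] = values.to_numpy() - center
    return out


# Toy data: two participants, two rating columns, one missing value.
toy = pd.DataFrame(
    {
        "participant": ["A", "A", "B", "B"],
        "PAQ1": [1.0, 5.0, 2.0, 4.0],
        "PAQ2": [3.0, 7.0, np.nan, 6.0],
    }
)
cols = ["PAQ1", "PAQ2"]
grand = ipsatize_sketch(toy, "participant", cols, method="grand_mean")
by_column = ipsatize_sketch(toy, "participant", cols, method="column_wise")
by_row = ipsatize_sketch(toy, "participant", cols, method="row_wise")
```

With two columns, column-wise centering subtracts two scalars per participant where grand-mean centering subtracts one; with the eight PAQ columns that becomes the 8-versus-1 difference this PR corrects.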

Test plan

  • uv run pytest test/satp/ -m optional_deps — CircE integration tests (requires R + CircE)
  • uv run pytest test/surveys/test_survey_processing.py::TestIpsatize — new ipsatize unit tests (no R required)
  • uv run pytest --xdoctest src/soundscapy/satp/circe.py — doctests
  • Quarto tutorial renders cleanly: quarto render docs/tutorials/SATP_CircE_Analysis.qmd — verified locally, all 16 languages complete, df values match canonical exactly

🤖 Generated with Claude Code

Andrew Mitchell and others added 9 commits March 2, 2026 22:51
…iner

Implements the top-priority recommendations from the SATP circumplex
validation analysis review:

- fix: SATPSchema now raises SchemaErrors by default instead of silently
  dropping invalid rows (drop_invalid_rows=False); fit_circe gains an
  errors='raise'|'warn' parameter for opt-in lenient behaviour

- fix: fit_circe uses grand-mean centering (1 scalar/participant) matching
  the published SATP R analysis; column-wise centering (8 scalars/participant)
  was the previous incorrect default, causing SRMR values ~0.005 higher than
  canonical R values

- feat: add soundscapy.surveys.ipsatize(method='grand_mean'|'column_wise'|
  'row_wise') — unified ipsatization entry point living in core surveys/
  module (no R dependency); person_center() becomes a thin wrapper with a
  deprecation note

- feat: fit_circe returns CircEResults instead of a bare DataFrame; provides
  .table (DataFrame), .for_model(CircModelE), ._repr_html_(), and len();
  CircE dataclass instances are preserved with all typed attributes

- test: new TestIpsatize in test/surveys/ (core, no R); new CircEResults
  tests; existing test_circe.py updated for new return type

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
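
To illustrate the container shape described in this commit, here is a rough sketch of a results dataclass exposing .table, .for_model(), len(), and ._repr_html_(). Everything in it is a stand-in: ModelKind substitutes for CircModelE, FitSketch for the CircE dataclass, and ResultsSketch for CircEResults; the member names, fields, the return type of for_model(), and the SRMR numbers are made up for illustration and are not taken from the PR.

```python
from dataclasses import dataclass, field
from enum import Enum

import pandas as pd


class ModelKind(Enum):
    """Hypothetical stand-in for CircModelE; member names are invented."""

    UNCONSTRAINED = "unconstrained"
    CIRCUMPLEX = "circumplex"


@dataclass(frozen=True)
class FitSketch:
    """Hypothetical stand-in for one typed CircE fit result."""

    language: str
    model: ModelKind
    srmr: float


@dataclass
class ResultsSketch:
    """Holds typed fits and derives a DataFrame view on demand."""

    fits: list[FitSketch] = field(default_factory=list)

    @property
    def table(self) -> pd.DataFrame:
        # Flatten the typed fits into one row per fit.
        rows = [
            {"language": f.language, "model": f.model.value, "srmr": f.srmr}
            for f in self.fits
        ]
        return pd.DataFrame(rows, columns=["language", "model", "srmr"])

    def for_model(self, model: ModelKind) -> "ResultsSketch":
        # Assumption: filtering returns another container, not a DataFrame.
        return ResultsSketch([f for f in self.fits if f.model is model])

    def __len__(self) -> int:
        return len(self.fits)

    def _repr_html_(self) -> str:
        # Notebook front ends call this hook to render rich output.
        return self.table.to_html(index=False)


# Illustrative values only; these are not results from the SATP analysis.
results = ResultsSketch(
    [
        FitSketch("eng", ModelKind.UNCONSTRAINED, 0.03),
        FitSketch("eng", ModelKind.CIRCUMPLEX, 0.05),
    ]
)
circumplex_only = results.for_model(ModelKind.CIRCUMPLEX)
empty = ResultsSketch()
```

Keeping the typed fit objects and deriving the table from them is one way to preserve typed attributes while still offering a DataFrame view; an empty container still supports len() and .table.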
- fix(ipsatize): use np.nanmean() for grand-mean centering so participants
  with partial NaN data get a valid grand mean from available values, rather
  than the whole participant being silently wiped when any NaN is present

- fix(fit_circe): use .loc[~index.isin()] instead of .drop(index=) in the
  errors='warn' path to correctly handle duplicate DataFrame indices; report
  accurate remaining row count; wrap second validate() call to prevent
  unhandled SchemaErrors leaking from the warn path

- docs: clarify models=[] returns CircEResults not a DataFrame; disambiguate
  'the default' in person_center deprecation note

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
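
The errors='raise'|'warn' behaviour described above can be sketched in plain pandas. This is not the fit_circe code path: filter_invalid_rows is a hypothetical helper, the NaN check stands in for the real SATPSchema validation, and ValueError stands in for SchemaErrors. It uses the same boolean-mask filter, .loc[~index.isin(...)], and derives the reported counts from the actual frame lengths.

```python
import warnings

import pandas as pd


def filter_invalid_rows(df, value_cols, errors="raise"):
    """Hypothetical stand-in for the validation step; NOT fit_circe itself."""
    if errors not in ("raise", "warn"):
        raise ValueError("errors must be 'raise' or 'warn'")

    # Stand-in rule: a row is invalid if any rating is NaN.
    bad_mask = df[value_cols].isna().any(axis=1)
    bad_labels = df.index[bad_mask]
    if len(bad_labels) == 0:
        return df

    if errors == "raise":
        # The real code raises SchemaErrors; ValueError is a placeholder.
        raise ValueError(f"{int(bad_mask.sum())} row(s) failed validation")

    # Label-based filter: removes every row whose index label was flagged.
    kept = df.loc[~df.index.isin(bad_labels)]
    warnings.warn(
        f"Dropped {len(df) - len(kept)} invalid row(s); {len(kept)} remain",
        UserWarning,
        stacklevel=2,
    )
    return kept


# Index label 1 appears twice; only one of those rows holds a NaN.
frame = pd.DataFrame({"PAQ1": [1.0, None, 3.0, 4.0]}, index=[0, 1, 1, 2])

with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    kept = filter_invalid_rows(frame, ["PAQ1"], errors="warn")
```

Because the filter works on labels, both rows sharing label 1 are removed here even though only one failed the check, which is why the sketch computes the message from len(df) and len(kept) rather than from the number of flagged rows.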
The test introduces NaN values to verify listwise deletion, but with
drop_invalid_rows=False (the new default), NaN PAQ values fail schema
validation before the assertion is reached.  Passing errors="warn" keeps
the test focused on its intent (listwise deletion) rather than schema
validation.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
v0.24.2 had a bug handling dotted keys in inline tables that surfaced
under Python 3.14 in CI, causing toml-sort-fix to modify pyproject.toml
on every run.  v0.24.3 fixes this.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Causes spurious CI failures under Python 3.14 — modifies pyproject.toml
on every run despite the file already being correctly sorted.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>