feat: schema validation, ipsatize unification, and CircEResults container #132
Open
MitchellAcoustics wants to merge 9 commits into dev
…iner

Implements the top-priority recommendations from the SATP circumplex validation analysis review:

- fix: SATPSchema now raises SchemaErrors by default instead of silently dropping invalid rows (drop_invalid_rows=False); fit_circe gains an errors='raise'|'warn' parameter for opt-in lenient behaviour
- fix: fit_circe uses grand-mean centering (1 scalar/participant), matching the published SATP R analysis; column-wise centering (8 scalars/participant) was the previous, incorrect default and caused SRMR values ~0.005 higher than the canonical R values
- feat: add soundscapy.surveys.ipsatize(method='grand_mean'|'column_wise'|'row_wise'), a unified ipsatization entry point living in the core surveys/ module (no R dependency); person_center() becomes a thin wrapper with a deprecation note
- feat: fit_circe returns CircEResults instead of a bare DataFrame; it provides .table (DataFrame), .for_model(CircModelE), ._repr_html_(), and len(); CircE dataclass instances are preserved with all typed attributes
- test: new TestIpsatize in test/surveys/ (core, no R); new CircEResults tests; existing test_circe.py updated for the new return type

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
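The difference between the three centering methods named in this commit can be sketched as follows. This is an illustrative sketch only, not the soundscapy implementation: the function name `ipsatize_sketch`, the `participant` column, and the two-column toy frame are assumptions made for the example, and skipping NaN values in the grand mean is likewise an assumption of the sketch.

```python
import pandas as pd


def ipsatize_sketch(df, value_cols, participant_col="participant", method="grand_mean"):
    """Hypothetical helper illustrating the three centering methods."""
    out = df.copy()
    values = out[value_cols].astype(float)
    groups = out[participant_col]
    if method == "grand_mean":
        # One scalar per participant: the mean over all of that participant's
        # rows and all value columns (NaN entries are skipped).
        row_sum = values.sum(axis=1, skipna=True)
        row_count = values.notna().sum(axis=1)
        grand = (row_sum.groupby(groups).transform("sum")
                 / row_count.groupby(groups).transform("sum"))
        out[value_cols] = values.sub(grand, axis=0)
    elif method == "column_wise":
        # One scalar per participant per column.
        out[value_cols] = values - values.groupby(groups).transform("mean")
    elif method == "row_wise":
        # One scalar per row: the mean across the value columns of that row.
        out[value_cols] = values.sub(values.mean(axis=1), axis=0)
    else:
        raise ValueError(f"unknown method: {method!r}")
    return out


# Toy data: participant A rated two stimuli, participant B rated one.
toy = pd.DataFrame(
    {"participant": ["A", "A", "B"], "x": [1.0, 3.0, 4.0], "y": [2.0, 6.0, 4.0]}
)
grand = ipsatize_sketch(toy, ["x", "y"], method="grand_mean")
column = ipsatize_sketch(toy, ["x", "y"], method="column_wise")
row = ipsatize_sketch(toy, ["x", "y"], method="row_wise")
```

With eight rating columns instead of two, the column-wise branch subtracts eight different means per participant, while the grand-mean branch subtracts a single one, which is the distinction the commit message draws.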
- fix(ipsatize): use np.nanmean() for grand-mean centering so participants with partial NaN data get a valid grand mean from the available values, rather than the whole participant being silently wiped when any NaN is present
- fix(fit_circe): use .loc[~index.isin()] instead of .drop(index=) in the errors='warn' path to correctly handle duplicate DataFrame indices; report an accurate remaining row count; wrap the second validate() call to prevent unhandled SchemaErrors leaking from the warn path
- docs: clarify that models=[] returns CircEResults, not a DataFrame; disambiguate 'the default' in the person_center deprecation note

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
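Both fixes in this commit can be illustrated on toy data. The sketch below is not the project's code: the arrays, the `score` column, and the `bad_index` list are made up for the example. It shows how a single NaN poisons `np.mean` but not `np.nanmean`, and how a boolean mask built with `Index.isin` filters a frame whose index contains duplicate labels, with the removed-row count taken from the frames themselves rather than from the number of flagged labels.

```python
import numpy as np
import pandas as pd

# One participant's ratings with a single missing response.
ratings = np.array([[1.0, 2.0, np.nan], [3.0, 4.0, 5.0]])

# np.mean propagates the NaN, so centering wipes every value.
plain_centered = ratings - np.mean(ratings)
# np.nanmean averages the five observed values; only the missing cell stays NaN.
nan_centered = ratings - np.nanmean(ratings)

# A frame whose index contains a duplicated label.
df = pd.DataFrame({"score": [10, 20, 30, 40]}, index=[0, 1, 1, 2])
bad_index = [1]  # hypothetical labels flagged as invalid

# The boolean mask keeps every row whose label is not flagged.
kept = df.loc[~df.index.isin(bad_index)]

# Counting from the frames gives the true number of removed rows, which can
# exceed len(bad_index) when labels are duplicated.
n_removed = len(df) - len(kept)
```

Here one flagged label corresponds to two rows, so a count based on `len(bad_index)` would under-report what was removed.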
The test introduces NaN values to verify listwise deletion, but with drop_invalid_rows=False (the new default), NaN PAQ values fail schema validation before the assertion is reached. Passing errors="warn" keeps the test focused on its intent (listwise deletion) rather than schema validation.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
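The strict-by-default, lenient-on-request behaviour that this test relies on can be sketched in plain pandas. Everything here is an assumption for illustration: `filter_invalid_rows` is a hypothetical helper, "invalid" is reduced to "contains NaN", and `ValueError` stands in for the schema library's own exception type.

```python
import warnings

import pandas as pd


def filter_invalid_rows(df, columns, errors="raise"):
    """Hypothetical helper: raise on invalid rows, or warn and drop them."""
    if errors not in ("raise", "warn"):
        raise ValueError("errors must be 'raise' or 'warn'")
    # For this sketch a row is invalid if any checked column is NaN.
    invalid = df[columns].isna().any(axis=1)
    if not invalid.any():
        return df
    message = f"{int(invalid.sum())} of {len(df)} rows failed validation"
    if errors == "raise":
        # Strict default: nothing is dropped silently.
        raise ValueError(message)
    # Lenient path: tell the caller what was removed, then continue.
    warnings.warn(message, UserWarning, stacklevel=2)
    return df.loc[~invalid]


frame = pd.DataFrame({"a": [1.0, None, 3.0]})
with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    cleaned = filter_invalid_rows(frame, ["a"], errors="warn")
```

A test that wants to exercise what happens after invalid rows are gone would take the `errors="warn"` path, as the commit above does.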
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
v0.24.2 had a bug handling dotted keys in inline tables that surfaced under Python 3.14 in CI, causing toml-sort-fix to modify pyproject.toml on every run. v0.24.3 fixes this.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Causes spurious CI failures under Python 3.14: modifies pyproject.toml on every run despite the file already being correctly sorted.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Summary
- `SATPSchema` silent drop: changed `drop_invalid_rows = False` so invalid rows raise a `SchemaErrors` exception rather than being silently discarded; added an `errors: Literal["raise", "warn"]` parameter to `fit_circe`, following pandas convention
- `ipsatize()`: `fit_circe` now uses grand-mean centering (one scalar per participant, matching the published R analysis) instead of the previous column-wise centering; `ipsatize(method=...)` added to `surveys.processing` with `"grand_mean"` (default), `"column_wise"`, and `"row_wise"` methods; `person_center()` becomes a thin wrapper with a deprecation note
- `CircEResults` container: `fit_circe` now returns a typed `CircEResults` dataclass (`.table`, `.for_model()`, `.__len__()`, `._repr_html_()`) instead of a bare `pd.DataFrame`; exported from `soundscapy.satp` and lazily from `soundscapy`

Test plan

- `uv run pytest test/satp/ -m optional_deps`: CircE integration tests (requires R + CircE)
- `uv run pytest test/surveys/test_survey_processing.py::TestIpsatize`: new ipsatize unit tests (no R required)
- `uv run pytest --xdoctest src/soundscapy/satp/circe.py`: doctests
- `quarto render docs/tutorials/SATP_CircE_Analysis.qmd`: verified locally, all 16 languages complete, df values match canonical exactly

🤖 Generated with Claude Code
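For readers unfamiliar with the container pattern described in the Summary, a minimal sketch follows. It is not the `CircEResults` class from this PR: the class names `FitSketch` and `ResultsSketch`, the three fields, the model and language labels, and the SRMR numbers are all invented for illustration, and whether the real `.for_model()` returns a container or something else is an assumption here.

```python
from dataclasses import asdict, dataclass, field

import pandas as pd


@dataclass(frozen=True)
class FitSketch:
    """Hypothetical stand-in for one typed fit result."""
    model: str
    language: str
    srmr: float


@dataclass
class ResultsSketch:
    """Hypothetical container that keeps typed results and offers a table view."""
    results: list = field(default_factory=list)

    @property
    def table(self) -> pd.DataFrame:
        # Flat DataFrame view for users who want the old tabular output.
        rows = [asdict(r) for r in self.results]
        return pd.DataFrame(rows, columns=["model", "language", "srmr"])

    def for_model(self, model: str) -> "ResultsSketch":
        # Filter while keeping the typed result objects intact.
        return ResultsSketch([r for r in self.results if r.model == model])

    def __len__(self) -> int:
        return len(self.results)

    def _repr_html_(self) -> str:
        # Lets notebook front ends render the container as a table.
        return self.table.to_html()


# Made-up values, for illustration only.
fits = ResultsSketch(
    [
        FitSketch("model_a", "lang_1", 0.05),
        FitSketch("model_a", "lang_2", 0.07),
        FitSketch("model_b", "lang_1", 0.04),
    ]
)
subset = fits.for_model("model_a")
```

The design point is that the typed objects remain the source of truth, while `.table` is derived on demand, so attribute access and tabular access cannot drift apart.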