feat: SATP circumplex SEM module — functional API and architecture refactor by MitchellAcoustics · Pull Request #130 · MitchellAcoustics/Soundscapy

MitchellAcoustics · 2026-03-01T03:17:52Z

Summary

Replaces the SATP class-based API with a clean functional design and delivers a canonical verification notebook confirming numerical parity with the original R analysis.

- New fit_circe(data, language, datasource): the primary public API. It validates, ipsatizes, fits all four circumplex model types, and returns a tidy DataFrame directly.
- Deleted the SATP class and ModelType class, replaced with a stateless function; equal_ang/equal_com folded into CircModelE enum properties.
- CircE converted from a pydantic dataclass to a stdlib dataclasses.dataclass, which removes dead BeforeValidator machinery (already handled by extract_bfgs_fit).
- polar_angles: pd.Series | None (was pd.DataFrame | None), with a PAQ_IDS index and estimates only; the gdiff property computes RMSD against the ideal circumplex.
- ipsatize() promoted to a public module-level function.
- Listwise deletion (complete.dropna() before correlation), consistent with R's na.omit.
- Unified case-insensitive column normalization via the _COLUMN_ALIASES constant: handles PAQ label names ("Pleasant" → "PAQ1"), PAQ IDs ("paq1" → "PAQ1"), and the participant field ("PARTICIPANT" → "participant").
- SATP CircE Analysis notebook: a Quarto .qmd verifying 16 languages × 4 models against the canonical R CSV; confirms the RMSEA.L/U swap bug in the canonical data.
- 41 tests covering numerical regression anchors, all new API paths, and edge cases (n=0, error rows, ipsatize_data=False, RMSEA bounds ordering, gdiff, case-insensitive schema).
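The listwise-deletion point can be illustrated with a small pandas sketch. The toy ratings below are made up for illustration and are not SATP data; the variable names are hypothetical and the snippet is not taken from the module itself.

```python
import numpy as np
import pandas as pd

# Toy ratings with scattered missing values (illustrative data, not SATP).
ratings = pd.DataFrame(
    {
        "PAQ1": [1.0, 2.0, 3.0, 4.0, 5.0],
        "PAQ2": [2.0, 1.0, 4.0, np.nan, 5.0],
        "PAQ3": [5.0, 3.0, np.nan, 2.0, 1.0],
    }
)

# Listwise deletion: keep only rows with no missing value in any column,
# so n and every correlation are based on the same set of cases.
complete = ratings.dropna()
n = len(complete)
listwise_corr = complete.corr()

# pandas' default behaviour: each pair of columns uses every row where
# both values are present, so different cells can rest on different cases.
pairwise_corr = ratings.corr()

print(n)
print(listwise_corr.loc["PAQ1", "PAQ2"], pairwise_corr.loc["PAQ1", "PAQ2"])
```

With listwise deletion the reported n and the correlation matrix describe the same cases, which is what R's na.omit followed by cor() would give.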

Test plan

- uv run pytest test/satp/test_circe.py -v (41 tests pass)
- uv run quarto render docs/tutorials/SATP_CircE_Analysis.qmd (all 12 cells execute cleanly)
- Verify fit_circe(data, language=..., datasource=...) returns a 4-row DataFrame with correct columns
- Verify CircModelE.UNCONSTRAINED.equal_ang is False, CircModelE.CIRCUMPLEX.equal_ang is True
- Verify "PARTICIPANT" and "Pleasant" column names are accepted by SATPSchema
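As a rough sketch of what folding equal_ang/equal_com into the enum could look like: the class name CircModel, the EQUAL_COM member name, and the string values are assumptions made for illustration, and the real CircModelE may be defined differently.

```python
from enum import Enum


class CircModel(Enum):
    """Hypothetical stand-in for CircModelE; member values are assumed."""

    UNCONSTRAINED = "unconstrained"
    EQUAL_ANG = "equal_ang"
    EQUAL_COM = "equal_com"  # assumed member name
    CIRCUMPLEX = "circumplex"

    @property
    def equal_ang(self) -> bool:
        # Angles are constrained to equal spacing in these two models.
        return self in (CircModel.EQUAL_ANG, CircModel.CIRCUMPLEX)

    @property
    def equal_com(self) -> bool:
        # Communalities are constrained to be equal in these two models.
        return self in (CircModel.EQUAL_COM, CircModel.CIRCUMPLEX)


for model in CircModel:
    print(model.name, model.equal_ang, model.equal_com)
```

Keeping the flags as properties on the enum means there is no separate wrapper object to keep in sync with the model type.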

…ebook

- Add `gdiff` computed property to `CircE` dataclass: RMSD between fitted polar angles and ideal 45°-spaced circumplex positions. Returns None for models with fixed angles (EQUAL_ANG, CIRCUMPLEX). Adds module-level `_IDEAL_ANGLES` and `_IDEAL_ANGLES_REV` constants mirroring the R `sem_funcs.R` implementation.
- Add `test/satp/fixtures/sem-fit-ipsatized-canonical.csv`: canonical reference output from the original R analysis (2024-06-13, SATP v1.4, 16 languages × 4 models). Documents a known RMSEA.L/RMSEA.U swap bug in the original R CSV export code.
- Add `docs/tutorials/SATP_CircE_Analysis.qmd`: Quarto notebook replicating the SATP circumplex SEM analysis using Soundscapy. Confirms numerical consistency against the canonical reference: all df values match exactly, RMSEA bounds are correctly ordered, and 6 reflected (equivalent) angular solutions are detected and documented.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
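A minimal sketch of the RMSD idea behind gdiff, assuming the eight ideal positions are 0°, 45°, ..., 315° in PAQ1..PAQ8 order and that plain (non-circular) differences are used. Both are assumptions; the function and constant names here are illustrative rather than the module's own.

```python
import math

# Assumed ideal positions: eight items spaced 45 degrees apart.
IDEAL_ANGLES = [45.0 * i for i in range(8)]


def gdiff(fitted_angles):
    """Root-mean-square deviation (degrees) from the ideal positions.

    Returns None when no fitted angles are available, e.g. for models
    whose angles are fixed rather than estimated.
    """
    if fitted_angles is None:
        return None
    squared = [(f - i) ** 2 for f, i in zip(fitted_angles, IDEAL_ANGLES)]
    return math.sqrt(sum(squared) / len(squared))


print(gdiff(IDEAL_ANGLES))
print(gdiff([a + 2.0 for a in IDEAL_ANGLES]))
```

A value of zero means the fitted angles sit exactly on the ideal circumplex; larger values mean larger average angular departure.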

…rchitecture

- Delete SATP class and ModelType class; replace with fit_circe() function returning a tidy DataFrame directly (one row per model)
- Fold equal_ang/equal_com boolean properties into CircModelE enum directly, removing the redundant ModelType wrapper
- Convert CircE from pydantic dataclass to stdlib dataclasses.dataclass; remove dead BeforeValidator/length_1_array_to_number machinery (already handled by extract_bfgs_fit())
- Change polar_angles: pd.DataFrame|None → pd.Series|None with PAQ_IDS index; fix extraction to correctly use pd.DataFrame(raw_pa).T.iloc[0] for R matrix orientation (variables × stats)
- Add CircE.to_dict() with PAQ angle columns expanded for DataFrame construction
- Add public ipsatize() function (was private SATP._ipsatize_df())
- Fix n/correlation to use listwise deletion (complete cases), consistent with R's na.omit; this resolves n discrepancies for languages with NaN PAQ values
- Update exports: fit_circe, ipsatize added; SATP, ModelType removed
- Rewrite test suite: preserve all numerical regression anchors in TestBfgsWrapper unchanged; replace TestSATP with TestFitCirce using new API; add TestCircModelEProperties; add to_dict and listwise deletion tests
- Update SATP_CircE_Analysis.qmd: use fit_circe() loop, normalize canonical CSV columns to lowercase/snake_case for comparison (language, model, chisq_can etc.), remove mixed-case column gymnastics

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

- circe.py: generalize SATPSchema.column_alias to normalize all schema field names case-insensitively via a lowercase→canonical mapping dict, covering PAQ_IDS and 'participant' without hardcoded special cases
- circe.py: add pre-validation empty-data guard in fit_circe(), which raises ValueError immediately rather than producing 4 cryptic R error rows
- circe.py: add post-ipsatization n=0 guard for cases where validation passes but no complete PAQ rows survive listwise deletion
- circe.py: fix to_dict() return annotation dict → dict[str, Any]
- _circe_wrapper.py: fix docstring example (sspy.spi.bfgs → sspyr.bfgs), add Any import, fix extract_bfgs_fit() return annotation dict → dict[str, Any]
- test_circe.py: add 8 new tests: gdiff None/float for constrained/free-angle models, rmsea_l≤rmsea≤rmsea_u invariant, ipsatize_data=False path, models=[] returns empty DataFrame, error row structure via mock, n=0 raises ValueError, case-insensitive PARTICIPANT → participant schema normalization

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

…ions

- Extract _COLUMN_ALIASES module-level constant combining PAQ label names, PAQ IDs, and participant field into a single lowercase→canonical lookup. Built once at import time instead of inside column_alias on every call.
- Extend case-insensitive normalization to PAQ label names: 'Pleasant', 'PLEASANT' etc. now correctly map to 'PAQ1' (previously only exact-match lowercase labels like 'pleasant' were handled).
- Simplify column_alias parser to a single dict comprehension over _COLUMN_ALIASES replacing the two-pass rename_dict construction.
- Fix CircE dataclass field type annotations: m, chisq, d, p, cfi, gfi, agfi, srmr, mcsc, rmsea, rmsea_l, rmsea_u declared as T|None to match from_bfgs() which uses .get(key, None) for all fit statistics.
- Add test_satp_schema_paq_label_case_insensitive: verifies title-cased PAQ label names ('Pleasant', 'Vibrant', ...) are normalized to PAQ_IDS.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
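The lowercase→canonical lookup described above might be built along these lines. This is a sketch only: the label order beyond 'Pleasant' and 'Vibrant' is an assumption, and COLUMN_ALIASES and rename_map are illustrative names rather than the module's private ones.

```python
PAQ_IDS = [f"PAQ{i}" for i in range(1, 9)]

# Assumed label order; only 'Pleasant' -> PAQ1 is stated in the PR text.
PAQ_LABELS = [
    "pleasant", "vibrant", "eventful", "chaotic",
    "annoying", "monotonous", "uneventful", "calm",
]

# Built once at import time: every accepted spelling, lowercased,
# mapped to its canonical column name.
COLUMN_ALIASES = {
    **{label.lower(): paq for label, paq in zip(PAQ_LABELS, PAQ_IDS)},
    **{paq.lower(): paq for paq in PAQ_IDS},
    "participant": "participant",
}


def rename_map(columns):
    """Map recognised columns to canonical names; leave others out."""
    return {
        col: COLUMN_ALIASES[col.lower()]
        for col in columns
        if col.lower() in COLUMN_ALIASES
    }


print(rename_map(["Pleasant", "paq2", "PARTICIPANT", "RecordID"]))
```

The resulting dict can be passed to DataFrame.rename(columns=...); unrecognised columns such as "RecordID" are simply left untouched.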

SATP class was replaced by fit_circe() in the refactor; the smoke test hadn't been updated and was failing in CI. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

…pant optional

- polar_angles extraction: use label-based column access ("estimates") with iloc[:, 0] fallback instead of fragile positional .T.iloc[0]
- extract_bfgs_fit: explicit int() cast for m/d/dfnull stats to guarantee annotation holds regardless of rpy2 storage type
- fit_circe error rows: populate all expected columns with None to prevent pandas from promoting numeric dtypes across successful rows
- SATPSchema: make participant Optional so ipsatize_data=False callers do not need a participant column; add runtime ValueError if ipsatize_data=True without participant
- Tests: 3 new tests covering ipsatize_data=False sans participant, the ValueError path, and dtype preservation under partial failure

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* Initial plan * fix: code review fixes — install_r_packages empty list, docstring format, error message Co-authored-by: MitchellAcoustics <22335636+MitchellAcoustics@users.noreply.github.com> --------- Co-authored-by: MitchellAcoustics <22335636+MitchellAcoustics@users.noreply.github.com>

The old name clashed with circumplex.ipsatize(), which implements a different operation (row-wise centering within a single observation vs. column-wise centering per participant across observations).

Changes:
- Rename ipsatize() -> person_center() throughout; update module header, __init__.py exports, and all test references
- Rename fit_circe() parameter ipsatize_data -> center_by_participant
- Expand docstring with psychometric background: explains the distinction between column-wise within-person centering (SATP) and row-wise ipsatization (circumplex package), and the rationale for each
- Improve implementation: replace lambda-based groupby.transform with explicit column selection + transform("mean") + vectorised subtraction (~2.3x faster, forward-compatible with pandas 3.x include_groups changes)

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
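The column-wise, per-participant centering described above could be sketched as follows. The signature, default column name, and toy data are assumptions for illustration and may differ from the actual person_center().

```python
import pandas as pd


def person_center(df, cols, participant="participant"):
    """Subtract each participant's own mean from each rating column.

    Hypothetical signature; shown only to illustrate the
    transform("mean") + vectorised subtraction approach.
    """
    # Per-participant column means, aligned back to the original rows.
    means = df.groupby(participant)[cols].transform("mean")
    out = df.copy()
    out[cols] = df[cols] - means
    return out


toy = pd.DataFrame(
    {
        "participant": ["a", "a", "b", "b"],
        "PAQ1": [1.0, 3.0, 2.0, 6.0],
        "PAQ2": [4.0, 4.0, 0.0, 2.0],
    }
)
centered = person_center(toy, ["PAQ1", "PAQ2"])
print(centered)
```

After centering, each participant's ratings on a given item average to zero across their observations, which is different from row-wise ipsatization where each single observation is centered across items.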

#1: RRuntimeError added to fit_exceptions
rpy2's RRuntimeError inherits from Exception, not RuntimeError, so R-level convergence failures were escaping the per-model except block and crashing the entire fit_circe() call. Added import and included it in the fit_exceptions tuple.

#2: Guard p-value key access in extract_bfgs_fit
Direct py_res["chisq"] / py_res["d"] bracket access replaced with .get() + None guard so a missing or None key raises a clear diagnostic rather than a bare KeyError or scipy TypeError.

#4: dtype preservation for d and m in mixed error/success DataFrames
numpy int64 cannot hold NaN, so any error row with "d": None promoted the entire column to float64. Fixed by casting n, d, m to pd.Int64Dtype() (pandas nullable integer) after building the DataFrame. Extended test_fit_circe_error_row_preserves_numeric_dtypes to assert integer dtype for all three columns (n, d, m), not just n.

Also: pandera participant field comment (| None vs nullable=True semantics) and models=[] edge case noted in fit_circe docstring.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
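Points #1 and #4 can be shown together in a small sketch. FakeRRuntimeError is a stand-in defined here so the example runs without rpy2 or R; fit_one, fit_all, and the column values are likewise hypothetical and not the module's code.

```python
import pandas as pd


class FakeRRuntimeError(Exception):
    """Stand-in for rpy2's RRuntimeError, which derives from Exception."""


# An `except RuntimeError` alone would not catch the class above, so it
# has to be listed explicitly in the tuple of fit exceptions.
FIT_EXCEPTIONS = (RuntimeError, ValueError, FakeRRuntimeError)


def fit_one(model):
    # Hypothetical fit: one model "fails to converge", the other succeeds.
    if model == "bad":
        raise FakeRRuntimeError("did not converge")
    return {"model": model, "n": 120, "d": 12, "m": 8, "error": None}


def fit_all(models):
    rows = []
    for model in models:
        try:
            rows.append(fit_one(model))
        except FIT_EXCEPTIONS as exc:
            # Error row: same columns as a success row, stats left empty.
            rows.append(
                {"model": model, "n": 120, "d": None, "m": None, "error": str(exc)}
            )
    df = pd.DataFrame(rows)
    # Nullable integers keep d and m integral even with missing entries.
    return df.astype({"n": "Int64", "d": "Int64", "m": "Int64"})


result = fit_all(["good", "bad"])
print(result.dtypes)
```

The nullable Int64 dtype represents the failed model's missing d and m as pd.NA, so the successful row keeps integer values instead of being promoted to float.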

CircE's BFGS optimisation can converge to a reflected solution where polar angles decrease (clockwise) rather than increase (counter-clockwise). Add normalize_polar_angles() which detects this via a monotonicity check (PAQ2 < PAQ3 < PAQ4) and corrects by applying 360 - angle to PAQ2-PAQ8. Apply normalization automatically in CircE.from_bfgs() so polar_angles are always in canonical orientation. Simplify gdiff to compare directly against _IDEAL_ANGLES without the fragile sum-threshold heuristic. Remove _IDEAL_ANGLES_REV and _ANGLE_REV_THRESHOLD constants. Export normalize_polar_angles from soundscapy.satp and soundscapy top-level (lazy-loaded) so downstream analyses can use the same correction without re-implementing it. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
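A minimal sketch of the reflection correction described in this commit, following the stated rule (check PAQ2 < PAQ3 < PAQ4, otherwise apply 360 - angle to PAQ2-PAQ8). It assumes PAQ1 is the reference item at 0° and treats any failure of the check as a reflection; the real normalize_polar_angles() may handle more cases.

```python
import pandas as pd

PAQ_IDS = [f"PAQ{i}" for i in range(1, 9)]


def normalize_polar_angles(angles: pd.Series) -> pd.Series:
    """Return angles in counter-clockwise (increasing) orientation."""
    increasing = angles["PAQ2"] < angles["PAQ3"] < angles["PAQ4"]
    out = angles.copy()
    if not increasing:
        # Mirror every angle except the reference item PAQ1.
        rest = PAQ_IDS[1:]
        out.loc[rest] = 360.0 - out.loc[rest]
    return out


# Float literals: an integer Series would produce integer output.
reflected = pd.Series(
    [0.0, 315.0, 270.0, 225.0, 180.0, 135.0, 90.0, 45.0], index=PAQ_IDS
)
fixed = normalize_polar_angles(reflected)
print(fixed.tolist())
```

Applying the function to an already canonical solution leaves it unchanged, so it is safe to call unconditionally after every fit.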

The notebook lives in a separate analysis repo and should not be part of the soundscapy package. Add gitignore entries for both the .qmd source and .html output so they stay excluded going forward. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

Integer Series input produces integer output; the doctest expected floats. Switching to float literals in the example makes got/want match. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

Andrew Mitchell and others added 14 commits March 1, 2026 02:06

fix: update test_basic to check fit_circe instead of removed SATP class

1c13eb3


minor linting fixes

2dd2ba7

pre-commit lint

400e1eb

fix: use float literals in normalize_polar_angles doctest

adfadab


MitchellAcoustics merged commit 6fa45e6 into dev Mar 1, 2026
17 checks passed

MitchellAcoustics merged 14 commits into dev from analysis/satp-circe-notebook


2 participants
