
test: CSV-ingestion, scale, and cross-file-collision stress guards#106

Merged: jodeleeuw merged 3 commits into main from test/more-stress-tests on Jun 18, 2026

@Mandyx22

Contributor

Stacked on #104 (test/stress-tests-in-ci). Review/merge #104 first, then this. Test-only — no library or CLI behavior changes.

Adds three more Jest stress suites, filling gaps the earlier suites left:

@jspsych/metadata: csv-input.stress.test.ts

Pins how generate(data, {}, "csv") re-infers types from string cells (the path the nested/scale suites don't exercise, since they feed JSON):

  • numeric coercion incl. whitespace-trim, scientific notation, negatives
  • Infinity/NaN rejected as non-numeric → string levels
  • mixed-column numeric→categorical downgrade (boundary preserved as a level)
  • "true"/"false" stay categorical (the behavior after #90, "fix(metadata): no levels for boolean variables; keep string true/false as levels")
  • RFC-4180 quoting: embedded commas, quotes, newlines
  • unicode, empty + literal-"null" cells (stay unknown)
  • 50-char per-level cap
  • JSON-object/array embedded in a cell → extracted
  • CSV/JSON parity for unambiguously-typed columns
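
The re-inference rules listed above can be sketched roughly as follows. This is an illustrative stand-in, not the @jspsych/metadata implementation: `inferColumn`, `toFiniteNumber`, and the `Inferred` type are hypothetical names, and trimming cells before comparing them is an assumption.

```typescript
// Hypothetical result shape; the library's real metadata model is not shown in this PR.
type Inferred =
  | { type: "numeric"; min: number; max: number }
  | { type: "categorical"; levels: string[] }
  | { type: "unknown" };

// A cell counts as numeric only if it parses to a finite number after trimming.
// "Infinity" and "NaN" parse, but are not finite, so they are rejected here.
function toFiniteNumber(cell: string): number | null {
  const trimmed = cell.trim();
  if (trimmed === "") return null;
  const n = Number(trimmed);
  return Number.isFinite(n) ? n : null;
}

function inferColumn(cells: string[]): Inferred {
  // Empty cells and the literal string "null" carry no type information.
  const present = cells.map((c) => c.trim()).filter((c) => c !== "" && c !== "null");
  if (present.length === 0) return { type: "unknown" };

  const numbers: number[] = [];
  for (const cell of present) {
    const n = toFiniteNumber(cell);
    if (n === null) {
      // One non-numeric cell downgrades the whole column to categorical;
      // every distinct value, numeric-looking or not, is kept as a level.
      return { type: "categorical", levels: Array.from(new Set(present)) };
    }
    numbers.push(n);
  }
  return { type: "numeric", min: Math.min(...numbers), max: Math.max(...numbers) };
}
```

Under these assumptions, `inferColumn([" 1e3 ", "-2.5", "7"])` is numeric, while `inferColumn(["true", "false"])` and `inferColumn(["Infinity", "NaN"])` both fall through to categorical.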

@jspsych/metadata: scale.stress.test.ts

5,000-row dataset: exact numeric extremes, low-cardinality dedup, high-cardinality level accumulation (no count cap — only the per-level length cap), boolean carries no levels/range, and a throughput ceiling guarding against accidental O(n²) regressions.
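
A minimal sketch of the shape such a scale guard might take, assuming a single-pass summary over synthetic rows. `summarize`, `ColumnSummary`, the `rt`/`condition` columns, and the 2,000 ms budget are all hypothetical; the actual suite calls the library's generate() and uses its own ceiling.

```typescript
// Hypothetical per-column summary; not the library's data model.
interface ColumnSummary {
  min: number;
  max: number;
  levels: Set<string>;
}

// Single pass over the rows: O(n) time, with Set membership handling dedup.
function summarize(rows: { rt: number; condition: string }[]): ColumnSummary {
  let min = Infinity;
  let max = -Infinity;
  const levels = new Set<string>();
  for (const row of rows) {
    if (row.rt < min) min = row.rt;
    if (row.rt > max) max = row.rt;
    levels.add(row.condition);
  }
  return { min, max, levels };
}

const ROWS = 5000;
// Synthetic data with known extremes (-2500 and 2499) and four distinct levels.
const rows = Array.from({ length: ROWS }, (_, i) => ({
  rt: i - 2500,
  condition: `c${i % 4}`,
}));

const started = Date.now();
const summary = summarize(rows);
const elapsedMs = Date.now() - started;

// Deliberately generous ceiling: it should only trip on a gross (e.g. quadratic) slowdown.
const BUDGET_MS = 2000;
```

A wall-clock ceiling is inherently machine-dependent, so a guard like this is only useful when the budget sits far above normal run time.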

@jspsych/metadata-cli: array-collision.stress.test.ts

The cross-file collision gap the rename suite explicitly left untested: two subject-001.json files in different subdirectories, both with a nested array column. Asserts processDirectory disambiguates the main CSV, the array sidecar, and the preserved raw original — no overwrites, mains+sidecars still Psych-DS compliant.
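
The no-overwrite property can be illustrated with a small naming helper. This is a sketch only: `assignOutputNames` and its numeric-suffix scheme are hypothetical, the `_data.csv` suffix is assumed, and the code makes no attempt at Psych-DS compliance. The naming that processDirectory actually uses is not shown in this PR.

```typescript
// Hypothetical helper: maps each input path to an output name that no other
// input has already claimed, so same-stem files cannot overwrite each other.
function assignOutputNames(relativePaths: string[]): Map<string, string> {
  const used = new Set<string>();
  const out = new Map<string, string>();
  for (const rel of relativePaths) {
    const base = rel.split("/").pop() ?? rel;
    const stem = base.replace(/\.json$/, "");
    let candidate = `${stem}_data.csv`;
    // On a clash, append an increasing counter until the name is free.
    for (let n = 2; used.has(candidate); n++) {
      candidate = `${stem}-${n}_data.csv`;
    }
    used.add(candidate);
    out.set(rel, candidate);
  }
  return out;
}

// Two same-stem files in different subdirectories, as in the suite's scenario.
const inputs = ["site-a/subject-001.json", "site-b/subject-001.json"];
const names = assignOutputNames(inputs);
const distinct = new Set(Array.from(names.values()));
```

The check that matters is the last one: the number of distinct output names must equal the number of inputs. The same uniqueness check would apply separately to sidecars and preserved raw originals.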

Auto-runs in CI

All three match packages/*/tests/*.test.ts, so the root jest run (CI's test job) discovers them with no new wiring, and they honor the no-build constraint (import from src; CLI uses the prepare-built @jspsych/metadata).
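
To see why no new wiring is needed, here is a simplified check of the glob against the new file names. `globToRegExp` is a hypothetical helper that handles only `*` within a single path segment; Jest's real matching is more general. The `packages/metadata-cli` directory name is an assumption, since this PR gives only the package name.

```typescript
// Simplified glob matcher: `*` matches any run of characters except "/".
function globToRegExp(glob: string): RegExp {
  // Escape regex metacharacters other than `*`, then expand `*`.
  const escaped = glob.replace(/[.+?^${}()|[\]\\]/g, "\\$&");
  return new RegExp("^" + escaped.replace(/\*/g, "[^/]*") + "$");
}

const pattern = globToRegExp("packages/*/tests/*.test.ts");

// Paths are assumed from the file names in the description.
const newSuites = [
  "packages/metadata/tests/csv-input.stress.test.ts",
  "packages/metadata/tests/scale.stress.test.ts",
  "packages/metadata-cli/tests/array-collision.stress.test.ts",
];

const allDiscovered = newSuites.every((p) => pattern.test(p));
```

Because `.stress.test.ts` still ends in `.test.ts`, the extra `.stress` segment does not affect discovery.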

Verification

  • New tests: 16 metadata + 7 cli, all green
  • Full local suites: 186 metadata + 140 cli passing (only the pre-existing interactive-rename.e2e fails locally; it needs node-pty, which is installed in CI)

Includes a changeset (both packages, patch).

🤖 Generated with Claude Code

Mandyx22 and others added 2 commits June 12, 2026 14:15
Port the standalone stress-tests/ harnesses into the automated Jest suite so
regressions in already-fixed behavior are caught by plain `npm test` (and CI).
The tests import from source (the CI test job runs no build) and add no library
or CLI behavior.

- metadata: nested-generation coherence over a comprehensive fixture, and the
  Psych-DS filename-normalization helper invariants.
- cli: processDirectory end-to-end (compliant main CSV, data/raw/ preservation,
  variableMeasured <-> CSV-column cross-check, best-effort Psych-DS validation),
  and refusal to write a non-compliant filename non-interactively.

Shared fixture lives at dev/stress/.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Stacked on the nested-data/filename stress suites (PR #104). Three more
Jest suites, all test-only (no library/CLI behavior change):

- metadata csv-input.stress: type re-inference from CSV string cells
  (numeric coercion, Infinity/NaN rejection, mixed-column downgrade,
  RFC-4180 quoting, unicode, empty/literal-null, 50-char level cap,
  JSON-in-cell extraction) + CSV/JSON parity.
- metadata scale.stress: 5,000-row exact extremes, dedup, high-cardinality
  levels, boolean handling, throughput ceiling.
- cli array-collision.stress: two same-stem files in different subdirs
  sharing a nested array column; asserts main CSV, sidecar, and raw
  originals each disambiguate without overwrite, staying Psych-DS valid.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
@changeset-bot

changeset-bot Bot commented Jun 12, 2026


🦋 Changeset detected

Latest commit: a9061a9

The changes in this PR will be included in the next version bump.

This PR includes changesets to release 3 packages

| Name | Type |
| --- | --- |
| @jspsych/metadata | Patch |
| @jspsych/metadata-cli | Patch |
| frontend | Patch |


@jodeleeuw jodeleeuw changed the base branch from test/stress-tests-in-ci to main June 18, 2026 22:31
# Conflicts:
#	packages/metadata/tests/nested-generation.stress.test.ts
@jodeleeuw jodeleeuw merged commit ca8dc75 into main Jun 18, 2026
2 checks passed
jodeleeuw added a commit that referenced this pull request Jun 18, 2026
…ayer overlap

- Rename dataProcessingStress.test.ts -> dataProcessing.stress.test.ts to match the
  repo's *.stress.test.ts convention (#104/#106).
- Add a header note explaining the intentional overlap with packages/metadata's stress
  suites: these guard generate() through the frontend bundle's own resolution path, plus
  frontend-only coverage (multi-file CSV+JSON accumulation, validator partitioning).

No changeset: the frontend package is private and unversioned, so changesets ignores it.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>