test: CSV-ingestion, scale, and cross-file-collision stress guards #106
Merged
Port the standalone stress-tests/ harnesses into the automated Jest suite so regressions in already-fixed behavior are caught by plain `npm test` (and CI). The tests import from source (the CI test job runs no build) and add no library or CLI behavior.

- metadata: nested-generation coherence over a comprehensive fixture, and the Psych-DS filename-normalization helper invariants.
- cli: processDirectory end-to-end (compliant main CSV, data/raw/ preservation, variableMeasured <-> CSV-column cross-check, best-effort Psych-DS validation), and refusal to write a non-compliant filename non-interactively.

Shared fixture lives at dev/stress/.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Stacked on the nested-data/filename stress suites (PR #104). Three more Jest suites, all test-only (no library/CLI behavior change):

- metadata csv-input.stress: type re-inference from CSV string cells (numeric coercion, Infinity/NaN rejection, mixed-column downgrade, RFC-4180 quoting, unicode, empty/literal-null, 50-char level cap, JSON-in-cell extraction) + CSV/JSON parity.
- metadata scale.stress: 5,000-row exact extremes, dedup, high-cardinality levels, boolean handling, throughput ceiling.
- cli array-collision.stress: two same-stem files in different subdirs sharing a nested array column; asserts main CSV, sidecar, and raw originals each disambiguate without overwrite, staying Psych-DS valid.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
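To make the re-inference rules named in this commit message concrete, here is a minimal sketch of the kind of behavior the csv-input suite pins. It is not the `@jspsych/metadata` implementation: `inferCellType`, `inferColumnType`, the `InferredType` labels, and the regular expression are hypothetical names invented for illustration. The sketch also assumes that a mixed column downgrades to string and that empty or literal `null` cells carry no type information.

```typescript
// Hypothetical sketch only: NOT the @jspsych/metadata implementation.
type InferredType = "numeric" | "string" | "unknown";

// Strict decimal pattern, so "Infinity", "NaN", and "0x10" are not treated as numbers.
const NUMERIC_CELL = /^[+-]?(\d+\.?\d*|\.\d+)([eE][+-]?\d+)?$/;

function inferCellType(cell: string): InferredType {
  const trimmed = cell.trim();
  // Assumption: empty cells and the literal text "null" carry no type information.
  if (trimmed === "" || trimmed === "null") return "unknown";
  return NUMERIC_CELL.test(trimmed) ? "numeric" : "string";
}

function inferColumnType(cells: string[]): InferredType {
  let result: InferredType = "unknown";
  for (const cell of cells) {
    const cellType = inferCellType(cell);
    if (cellType === "unknown") continue; // uninformative cell, keep scanning
    if (cellType === "string") return "string"; // assumption: one non-numeric cell downgrades the column
    result = "numeric";
  }
  return result;
}
```

Under these assumptions, a column of `"true"`/`"false"` cells stays a string (categorical) column, and a column containing `"Infinity"` is never treated as numeric.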
🦋 Changeset detected

Latest commit: a9061a9

The changes in this PR will be included in the next version bump.

This PR includes changesets to release 3 packages
# Conflicts: # packages/metadata/tests/nested-generation.stress.test.ts
jodeleeuw added a commit that referenced this pull request Jun 18, 2026
…ayer overlap

- Rename dataProcessingStress.test.ts -> dataProcessing.stress.test.ts to match the repo's *.stress.test.ts convention (#104/#106).
- Add a header note explaining the intentional overlap with packages/metadata's stress suites: these guard generate() through the frontend bundle's own resolution path, plus frontend-only coverage (multi-file CSV+JSON accumulation, validator partitioning).

No changeset: the frontend package is private and unversioned, so changesets ignores it.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Stacked on #104 (`test/stress-tests-in-ci`). Review/merge #104 first, then this. Test-only: no library or CLI behavior changes.

Adds three more Jest stress suites, filling gaps the earlier suites left:

`@jspsych/metadata`: `csv-input.stress.test.ts`

Pins how `generate(data, {}, "csv")` re-infers types from string cells (the path the nested/scale suites don't exercise, since they feed JSON):

- `Infinity`/`NaN` rejected as non-numeric → string levels
- `"true"`/`"false"` stay categorical (post-#90, "fix(metadata): no levels for boolean variables; keep string true/false as levels")
- `"null"` cells (stay `unknown`)

`@jspsych/metadata`: `scale.stress.test.ts`

5,000-row dataset: exact numeric extremes, low-cardinality dedup, high-cardinality level accumulation (no count cap, only the per-level length cap), boolean carries no levels/range, and a throughput ceiling guarding against accidental O(n²) regressions.

`@jspsych/metadata-cli`: `array-collision.stress.test.ts`

The cross-file collision gap the rename suite explicitly left untested: two `subject-001.json` files in different subdirectories, both with a nested array column. Asserts `processDirectory` disambiguates the main CSV, the array sidecar, and the preserved raw original: no overwrites, mains and sidecars still Psych-DS compliant.

Auto-runs in CI

All three match `packages/*/tests/*.test.ts`, so the root `jest` run (CI's `test` job) discovers them with no new wiring, and they honor the no-build constraint (import from `src`; CLI uses the `prepare`-built `@jspsych/metadata`).

Verification

- `interactive-rename.e2e` fails locally (needs `node-pty`; installs in CI)

Includes a changeset (both packages, patch).
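The scale checks described above (exact extremes, dedup, high-cardinality levels, throughput ceiling) can be sketched roughly as follows. This is a self-contained illustration, not the suite's code: `buildRows`, `summarize`, the `Row` shape, and the 2,000 ms ceiling are all hypothetical choices, and the real suite goes through `generate()` rather than a hand-written summarizer.

```typescript
// Hypothetical sketch only: names, row shape, and the ceiling value are illustrative.
interface Row {
  rt: number;
  condition: string;
  id: string;
}

// Build n rows with known extremes planted at the first and last positions.
function buildRows(n: number): Row[] {
  const rows: Row[] = [];
  for (let i = 0; i < n; i++) {
    const rt = i === 0 ? -1.5 : i === n - 1 ? 99999.25 : (i % 1000) + 1;
    rows.push({ rt, condition: i % 2 === 0 ? "a" : "b", id: `id-${i}` });
  }
  return rows;
}

// Single linear pass: Sets give O(1) dedup, so the whole summary stays O(n).
function summarize(rows: Row[]) {
  let min = Infinity;
  let max = -Infinity;
  const conditions = new Set<string>();
  const ids = new Set<string>();
  for (const row of rows) {
    if (row.rt < min) min = row.rt;
    if (row.rt > max) max = row.rt;
    conditions.add(row.condition);
    ids.add(row.id);
  }
  return { min, max, conditionLevels: Array.from(conditions), idLevelCount: ids.size };
}

// Throughput guard: a generous wall-clock ceiling that a quadratic regression would likely exceed.
const CEILING_MS = 2000;
const startedAt = Date.now();
const summary = summarize(buildRows(5000));
const elapsedMs = Date.now() - startedAt;
if (elapsedMs > CEILING_MS) {
  throw new Error(`summarize took ${elapsedMs} ms, ceiling is ${CEILING_MS} ms`);
}
```

A wall-clock ceiling is a coarse guard: it will not catch small slowdowns, so it is best kept loose enough to avoid flaky failures on slow CI runners.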
🤖 Generated with Claude Code
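
The "no overwrites" assertion in the collision suite amounts to checking that no two planned outputs resolve to the same path. A minimal sketch of such a check is below. `findCollisions` is a hypothetical helper, not part of `@jspsych/metadata-cli`, and the case-insensitive comparison is an added assumption for filesystems that ignore case.

```typescript
// Hypothetical helper: NOT part of @jspsych/metadata-cli.
// Returns every output path that is claimed by more than one file.
function findCollisions(outputPaths: string[]): string[] {
  const seen = new Set<string>();
  const collisions = new Set<string>();
  for (const outputPath of outputPaths) {
    // Assumption: compare case-insensitively to stay safe on case-insensitive filesystems.
    const key = outputPath.toLowerCase();
    if (seen.has(key)) collisions.add(key);
    seen.add(key);
  }
  return Array.from(collisions);
}
```

A test in this style would collect the main CSV, sidecar, and raw-copy paths produced for both `subject-001.json` inputs and expect `findCollisions` to return an empty array.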