Version Packages #47
Open
github-actions[bot] wants to merge 1 commit into
This PR was opened by the Changesets release GitHub action. When you're ready to do a release, you can merge this and the packages will be published to npm automatically. If you're not ready to do a release yet, that's fine, whenever you add more changesets to main, this PR will be updated.
Releases
@jspsych/metadata-cli@0.2.0
Minor Changes
a5af08c: Detect and expand JSON-serialized nested columns in generate(). Flat JSON objects (e.g. response: {"Q0":4,"Q1":3}) are expanded into dotted sub-variables (response.Q0, response.Q1) in variableMeasured with correct types and min/max tracking. JSON arrays of objects are extracted into separate Psych-DS compliant CSV files ({stem}_measure-{col}_data.csv) with trial_index and element_index as join keys.
aab8da8: Extract plain (non-array) object columns into separate Psych-DS CSV files so their expanded sub-variables resolve to real columns. expandObjectFields registers dotted sub-variables for object columns (e.g. response.cb_1, calibration_data.type), but those names previously had no corresponding CSV column, so Psych-DS validation reported VARIABLE_MISSING_FROM_CSV_COLUMNS for every one. Object columns are now accumulated into a new extractedObjects map (exposed via getExtractedObjects()) as one row per trial, and the CLI writes a per-file sidecar CSV ({stem}_measure-{col}_data.csv), mirroring the existing array-of-objects extraction. The row is threaded through the recursive expansion so a column is recorded for every registered descendant (leaf scalars, intermediate object nodes, and nested-array parents), and it reuses the same configurable arrayJoinKeys (one row per trial, no element_index).
35de4b6: Extract arrays of primitives into sidecar CSVs so their elements become real, typed variables. Previously an array of numbers or strings (block_order: [16,100,4,1], images: [...]) was recorded only as a single value:"array" column with no per-element detail. Such arrays are now extracted like arrays-of-objects, but, since primitives have no field name, each element is recorded under a synthetic <column>.value column (distinct from the array parent, which stays value:"array"). The element variable gets its proper type with minValue/maxValue (numeric) or levels (string), joinable to its row via the existing join keys + element_index. This composes with the nested-array recursion (an array of arrays of numbers yields a grandchild table with a .value column) and completes Psych-DS round-tripping for all four cell shapes: scalar, object, array-of-objects, and array-of-primitives.
Tradeoff: every non-empty primitive-array column now produces its own sidecar CSV, so datasets with many such columns generate substantially more files (e.g. one eye-tracking export grew from 304 to 380 data files). Extraction is the default and there is no new prompt. A future opt-in primitiveArrayMode: "extract" | "summarize" could offer an in-place summary alternative, but is intentionally not added here to avoid complicating the CLI flow.
686093e: Convert jsPsych JSON data files to CSV and normalize all generated data filenames to the Psych-DS [keyword-value_]+data.csv pattern, so generated projects pass the Psych-DS validator.
.json data file is converted to a .csv in data/ (nested objects/arrays serialized as JSON strings via objectsToCSV, so no data is lost), with the untouched original preserved under data/raw/. The project scaffold creates the data/raw/ directory, and dataset_description.json is left untouched.
data/ and the original in data/raw/, instead of being skipped or silently overwritten. Non-interactive runs fail with a clear message rather than inventing a keyword.
3960e63: Integrate psych-ds validator: the CLI now runs Psych-DS validation after loading an existing dataset and after writing the final dataset_description.json, printing a compliance summary with errors always shown and warnings shown under --verbose.
d9e4485: Recursively unnest nested data inside extracted array elements. Previously an array-of-objects column was extracted one level deep, so an element field that was itself an object (pointData.point) or an array (pointData.gazeSamples) was kept as a single opaque JSON column. Now element fields recurse: a nested plain object is expanded into deeper dotted columns in the same sidecar row (pointData.point.x, pointData.point.y), and a nested array-of-objects is extracted into its own grandchild CSV (..._measure-...GazeSamples_data.csv). Grandchild tables remain joinable to their specific parent element via a qualified <column>.element_index key carried alongside the existing join keys (e.g. trial_index + validation_data.pointData.element_index + the grandchild's own element_index), and every such key/column is registered in variableMeasured. This completes Psych-DS round-tripping for arbitrarily nested object/array data: arrays nested inside arrays inside objects now fully expand instead of bottoming out as JSON.
5fcce14: Register array-of-objects element fields in variableMeasured so extracted sidecar CSVs have no undeclared columns. Previously accumulateArrayColumn wrote each element's fields as bare columns (e.g. x, y) plus element_index into the extracted-array CSV, but never added them to variableMeasured, so Psych-DS validation reported CSV_COLUMN_MISSING_FROM_METADATA. Element fields are now emitted under dotted names (tobii_data.x, validation_data.pointData.point), avoiding collisions between same-named fields of different array columns, and each is registered with its correct type and min/max/levels tracking. element_index is registered once. Object- and array-valued element fields are recorded one level deep (a single dotted JSON column, value:"object"/"array"); they are not further expanded or extracted. This is the array-side counterpart to the plain-object sidecar fix and completes Psych-DS column/variable round-tripping for nested data.
7921a10: Add smart rename strategies for data files whose names don't follow the Psych-DS pattern. Instead of a single keyword prompt, the CLI now offers a strategy menu with a live old → new sample preview per option: use an identifier column found inside the data (e.g. participant_id, recommended when available), keep only the part that differs between the filenames, give the files fresh sequential names (subject-001, subject-002, …), or keep the whole old filename as the value. Every strategy ends in a full rename preview with collision detection, per-file manual editing, and the option to switch strategies before anything is written. The preview now also lists the sidecar CSVs each file will produce (one per extracted array/object column, e.g. subject-01_measure-mouseTracking_data.csv), and a single planner resolves every output name (mains and sidecars together) so the names shown are exactly the names written; if the data and the approved plan ever disagree the run aborts rather than writing an unapproved name. Files whose names are technically valid but use unofficial keywords (e.g. data-xyz.json, which draw a validator warning) now get an opt-in to join the rename flow instead of being silently kept.
58ebde8: Add interactive prompt for unknown variable descriptions. After data processing and metadata options, the CLI now detects user-data variables whose descriptions could not be resolved from plugin source and asks whether to fill them in. Users can skip the entire step or skip individual variables by pressing Enter.
1435184: Exit code 1 on validation errors; re-prompt for missing required fields; suggest missing recommended fields.
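As a rough illustration of the dotted sub-variable naming described in the entries above, the following TypeScript sketch flattens a nested object cell into dotted names. The helper flattenToDotted and its types are hypothetical and are not the package's expandObjectFields; arrays are only marked as "array" here, whereas the entries above describe extracting them into sidecar CSVs.

```typescript
// Hypothetical illustration only; not the implementation in @jspsych/metadata.
type Scalar = string | number | boolean | null;
type Json = Scalar | Json[] | { [key: string]: Json };

// Walks a cell value and records one dotted name per leaf.
// Arrays are marked with the placeholder "array" and not descended into.
function flattenToDotted(
  column: string,
  value: Json,
  out: Record<string, Scalar> = {}
): Record<string, Scalar> {
  if (value === null || typeof value !== "object") {
    out[column] = value; // leaf scalar keeps its original type
    return out;
  }
  if (Array.isArray(value)) {
    out[column] = "array"; // assumption: arrays are handled elsewhere
    return out;
  }
  for (const [key, child] of Object.entries(value)) {
    flattenToDotted(`${column}.${key}`, child, out);
  }
  return out;
}

const cell = JSON.parse('{"Q0":{"score":4,"meta":{"valid":true}},"Q1":3}');
console.log(flattenToDotted("response", cell));
```

With the sample cell, the resulting names are response.Q0.score, response.Q0.meta.valid, and response.Q1, which matches the naming pattern the changelog entries use.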
Patch Changes
28f1d57: Improve CLI prompt wording for clarity. Messages, choice labels, descriptions, and error text have been rewritten to use plain language, avoid jargon, and be more actionable for researchers unfamiliar with Psych-DS terminology.
585d337: The CLI now writes a .psychds-ignore at the dataset root when it preserves raw jsPsych originals under data/raw/, so the validator no longer flags them as FILE_NOT_CHECKED. This mirrors the behavior the frontend already had.
The .psychds-ignore filename and content (**/raw/ plus a self-reference, dictated by validator quirks) are now exported from @jspsych/metadata as PSYCHDS_IGNORE_FILENAME and PSYCHDS_IGNORE_CONTENT, so the CLI and frontend share one definition instead of duplicating the literal string.
3752739: Send Psych-DS validation errors and warnings to stderr. The failure header and error details use console.error; warning details and the verbose hint use console.warn. The success line remains on stdout.
2706ca7: Add unit tests for validatePsychDS covering clean pass, errors, warnings (verbose and non-verbose), plural forms, and validator-throws scenarios.
da2e8d2: Add Jest test infrastructure and tests for the CLI package. Tests cover utils.ts, validatefunctions.ts, and data.ts (27 tests). Also modernizes saveTextToPath from a fire-and-forget callback to an async function returning Promise<void>.
07b78e5: Add unit tests for preAnalyzeDirectory in data.ts, covering unreadable directories, JSON and CSV duplicate detection, ignored files, worst-file selection, one-subdirectory-deep traversal, and custom join keys.
9e02b78: Deduplicate directory traversal logic in data.ts. Extracts a shared collectDataFiles helper used by processDirectory, enumerateDataFiles, and preAnalyzeDirectory, replacing three near-identical implementations of the top-level + one-subdir-deep walk. Behavior is preserved: processDirectory still sorts dataset_description.json first and counts directory read errors as failures. Diagnostics (the "can only read subdirectories one level deep" warning and directory-read errors) are gated behind a warn flag that only processDirectory sets, so the silent pre-passes (enumerateDataFiles, preAnalyzeDirectory) don't duplicate warnings the user already sees once on the same directory in the same run.
8edc7c2: Drop unnamed columns so R-exported datasets validate. R's write.csv (with the default row.names = TRUE) prepends an unnamed row-index column, so the exported CSV header starts with a bare comma, i.e. an empty-string column name. Psych-DS variables require a name, so the column can never appear in variableMeasured; left in the on-disk CSV it fails validation with CSV_COLUMN_MISSING_FROM_METADATA.
The strip now lives in the shared data-file path so the CLI and frontend behave identically:
generate() strips empty/whitespace-only columns from the parsed data up front, with a single warning instead of per-row spam (keeps variableMeasured clean and standalone library use safe), via a new exported stripUnnamedColumns helper.
buildPsychDSDataFiles strips the main table before emitting it: a clean CSV keeps its exact bytes (verbatim mainContent), while a file with an unnamed column is re-serialised from the cleaned rows. Both the CLI (rename-plan and non-plan paths) and the frontend feed parsed mainRows, so the written/zipped/validated CSV always matches the metadata.
Fixes finding #2 of #109 (Validation fails on real jsPsych dataset: time_elapsed always seeded, unnamed columns dropped, join-key prompt not gated).
06a84fb: fix(cli): don't print a spurious validation failure for existing projects
When opening an existing project, validation ran before the data files were copied into the project, so it always failed with MISSING_DATA_DIRECTORY and printed a misleading ✘ Psych-DS validation failed to stderr even when the final output was valid. Removed that pre-write call; the post-write validation that actually gates the result is unchanged.
a5311ba: Fix Psych-DS validation always failing on Windows. The relative path passed to the validator contained backslashes on Windows, which the validator could not resolve — causing spurious MISSING_DATAFILE and MISSING_DATASET_DESCRIPTION errors even when the project was generated correctly. Normalize path separators to forward slashes before validation.
585d337: Convert uploaded JSON data to Psych-DS CSV in the frontend so datasets validate instead of failing with MISSING_DATAFILE.
Previously the frontend placed uploaded jsPsych JSON files into data/ unchanged, so the in-browser validator (and the downloadable zip) always failed: Psych-DS only recognises CSV/TSV datafiles whose names match its keyword pattern.
@jspsych/metadata gains two shared, filesystem-agnostic helpers, buildPsychDSDataFiles and deriveFallbackBase, that turn a parsed data file (plus any extracted nested array/object columns) into its set of Psych-DS-named CSV outputs. Used by both the CLI and the frontend so the conversion lives in one place.
data/ payload during generation (a compliant main CSV, one sidecar per nested array/object column, and the original JSON preserved under data/raw/), and Review uses it for both validation and the zip. Auto-derived filenames use the official subject keyword (subject-<stem>) to avoid the unofficial-keyword warning, and a .psychds-ignore is emitted so the preserved data/raw/ originals don't surface as FILE_NOT_CHECKED.
buildPsychDSDataFiles. No behaviour change.
3c7d1f7: Accept JSON-Lines (JSONL) experiment data, not just a single JSON array. Several jsPsych labs (and JATOS exports) write data as newline-delimited JSON, with one JSON value per line (typically one participant's full trial array per line) rather than one big array. Previously generate() ran JSON.parse on the whole string, so every such file failed with Unexpected non-whitespace character after JSON and produced no metadata.
A new exported parseJsonData helper handles both shapes: a well-formed single document is returned unchanged (no behaviour change for existing single-array callers), and only when whole-string parsing fails does it fall back to parsing line by line, flattening any per-line arrays into one observation stream. It is now used wherever JSON data files are parsed: generate() (the library) for the main ingestion path.
The .jsonl file extension is now also recognised as a JSON data file (these exports are conventionally named .jsonl). The CLI processes .jsonl exactly like .json (including filename-normalization, raw-original preservation, and CSV conversion), and the frontend normalises a .jsonl upload to the JSON path.
Verified end to end against the raw .jsonl exports in vucml/online_experiments: all 15 files now generate metadata and pass the Psych-DS validator with zero errors (they failed at parse time before).
3c7d1f7: Synthesize a source_record_id join key for multi-record JSON-Lines exports. Raw jsPsych exports carry no per-row identifier, so once JSONL is flattened (one record per line) trial_index repeats across records and can't uniquely key the extracted array/object sidecar CSVs: every record's trial 0 collapsed onto the same (trial_index, element_index) key, making the sidecars impossible to join back to a single parent trial.
The synthesized column is named source_record_id rather than participant_id because a JSON-Lines line is only guaranteed to be one source record (usually, but not always, one participant). The honest name avoids overclaiming for exports where a line isn't a single subject.
parseJsonData now takes an opt-in { tagSourceRecordId } flag: in the JSON-Lines path it stamps each line's object rows with a 0-based source_record_id (a no-op on the single-array fast path), and reports via an optional stats out-param whether it actually synthesized the id. A line that already carries a source_record_id or a real participant_id is left untouched, since the experiment's own identifier already groups those rows.
generate() enables this for JSON input and promotes the identifier to the leading join key, preferring the synthesized source_record_id and falling back to a real participant_id already present in the export (['source_record_id', 'trial_index'] or ['participant_id', 'trial_index']), so the sidecars join unambiguously. CSV inputs are unaffected.
When, and only when, the id was actually synthesized (i.e. absent from the source), it is given an explicit description that makes its synthetic origin unmistakable ("Synthetic source-record identifier … NOT a real subject ID from the experiment …") so a downstream user can't mistake it for a real subject ID; this also avoids serializing an empty {} description (an object with no @type, which trips the validator's OBJECT_TYPE_MISSING). The CLI's join-key pre-analysis/prompt and the frontend's pre-flight mirror this promotion so multi-record JSONL is no longer falsely flagged as having a non-unique join key.
Verified end to end against the raw .jsonl exports in vucml/online_experiments (block_cat): the combined 30-record export generates metadata, passes the Psych-DS validator (0 errors), synthesizes source_record_id 0–29, and writes sidecars whose (source_record_id, trial_index, element_index) keys are fully unique, including the doubly-nested recall_responses case. Notably subjectId collides across the two merged datasets (two records share 601), which source_record_id correctly keeps distinct.
ca8dc75: Extend the stress-test regression guards with three more Jest suites covering the CSV ingestion path, generation at scale, and cross-file output-name collisions.
@jspsych/metadata, csv-input.stress: pins how generate(data, {}, "csv") re-infers types from string cells (numeric coercion incl. whitespace/scientific-notation/Infinity/NaN rejection, mixed-column downgrade, "true"/"false" staying categorical, RFC-4180 quoting, unicode, empty/literal-null cells, the 50-char level cap, JSON-in-a-cell extraction), and asserts CSV/JSON parity for unambiguously-typed columns.
@jspsych/metadata, scale.stress: feeds a 5,000-row dataset and checks exact numeric extremes, categorical dedup, high-cardinality level accumulation, boolean handling, and a throughput ceiling that guards against accidental O(n²) regressions.
@jspsych/metadata-cli, array-collision.stress: two same-stem files in different subdirectories sharing a nested array column, asserting processDirectory disambiguates every main CSV, sidecar, and preserved raw original (no overwrites, all still Psych-DS compliant), the cross-file collision gap left by the earlier rename suite.
Test-only change; no library or CLI behavior is modified.
5fcd392: Don't block non-interactive runs on the join-key prompt. When trial_index isn't unique (the norm for multi-subject data, where it restarts per subject), the CLI previously always opened an interactive checkbox to pick additional join keys, even in a fully-flagged headless run (--psych-ds-dir + --data-dir + --metadata-options, no TTY), which aborted with ✘ User force closed the prompt. The prompt is now gated on having a terminal; without one, join keys are resolved deterministically via resolveJoinKeysNonInteractive (add a sufficient single column, else a minimal sufficient combination, else proceed with a warning that extracted CSVs may contain duplicate rows). Fixes finding #3 of #109 (Validation fails on real jsPsych dataset: time_elapsed always seeded, unnamed columns dropped, join-key prompt not gated).
Also hardens the rest of the non-interactive path so that "no terminal ⇒ never prompt" holds universally, not just when all three flags are supplied. The remaining prompts (metadata-options fallback, unknown-variable descriptions, missing-required-field loop) now gate on canPrompt rather than the flag-only isNonInteractive, so a no-TTY run that omits --metadata-options falls back to generated defaults with a notice instead of aborting with ✘ User force closed the prompt.
fa17a9e: Add stress-test regression guards to the automated suite so previously-fixed nested-data and filename-normalization behavior can't silently regress.
Four Jest suites, ported from the standalone stress-tests/ harnesses so they run under plain npm test (and CI) without a build step:
@jspsych/metadata: generate() coherence over a comprehensive nested-data fixture (deep objects, arrays of objects/arrays, mixed-type columns, a trial_type-less row, unicode, empties), plus the Psych-DS filename-normalization helper invariants.
@jspsych/metadata-cli: the processDirectory conversion end-to-end (compliant main CSV, data/raw/ preservation, two-way variableMeasured ↔ CSV-column cross-check, and a best-effort Psych-DS validation pass), plus the refusal to write a non-compliant filename non-interactively.
Test-only change; no library or CLI behavior is modified. The shared fixture lives at dev/stress/.
4fa760d: Add unit tests for createDirectoryWithStructure in handlefiles.ts.
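The unnamed-column fix in 8edc7c2 above can be pictured with a small TypeScript sketch. It assumes rows have already been parsed into plain objects; dropUnnamedColumns and DataRow are hypothetical names for illustration, not the exported stripUnnamedColumns helper, whose signature is not shown in this changelog.

```typescript
// Hypothetical illustration only; not the helper exported by @jspsych/metadata.
type DataRow = Record<string, unknown>;

// Removes columns whose name is empty or whitespace-only and warns once,
// rather than once per row.
function dropUnnamedColumns(rows: DataRow[]): { rows: DataRow[]; droppedNames: string[] } {
  const dropped = new Set<string>();
  const cleaned = rows.map((row) => {
    const kept: DataRow = {};
    for (const [name, value] of Object.entries(row)) {
      if (name.trim() === "") {
        dropped.add(name);
      } else {
        kept[name] = value;
      }
    }
    return kept;
  });
  if (dropped.size > 0) {
    console.warn(`Dropped ${dropped.size} unnamed column(s) from the parsed data.`);
  }
  return { rows: cleaned, droppedNames: Array.from(dropped) };
}

// Rows shaped like an R write.csv export with the default row.names = TRUE.
const parsedRows: DataRow[] = [
  { "": "1", trial_index: 0, rt: 350 },
  { "": "2", trial_index: 1, rt: 412 },
];
console.log(Object.keys(dropUnnamedColumns(parsedRows).rows[0]));
```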
Updated dependencies [8731c30]
Updated dependencies [585d337]
Updated dependencies [f96e1e6]
Updated dependencies [ed9c25c]
Updated dependencies [0f4cc4a]
Updated dependencies [1511d20]
Updated dependencies [8edc7c2]
Updated dependencies [a5af08c]
Updated dependencies [aab8da8]
Updated dependencies [35de4b6]
Updated dependencies [e80e57c]
Updated dependencies [06a84fb]
Updated dependencies [03a3ce4]
Updated dependencies [ae0d01c]
Updated dependencies [c2426be]
Updated dependencies [e1cb44e]
Updated dependencies [585d337]
Updated dependencies [3c7d1f7]
Updated dependencies [3c7d1f7]
Updated dependencies [72f8a4b]
Updated dependencies [6b0d1d4]
Updated dependencies [ca8dc75]
Updated dependencies [d9e4485]
Updated dependencies [5fcce14]
Updated dependencies [fa17a9e]
Updated dependencies [55f2f91]
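The JSON-Lines fallback and source_record_id tagging described in the two 3c7d1f7 entries above can be sketched roughly as follows. This is a simplified TypeScript illustration under stated assumptions: parseJsonOrJsonLines is a hypothetical name, it always tags in the line-by-line path (the real parseJsonData is described as taking an opt-in { tagSourceRecordId } flag and a stats out-param), and it wraps a non-array single document in an array for convenience.

```typescript
// Hypothetical illustration only; not the parseJsonData helper itself.
type Row = Record<string, unknown>;

function parseJsonOrJsonLines(text: string): Row[] {
  try {
    // Fast path: the whole string is one well-formed JSON document.
    const doc = JSON.parse(text);
    return Array.isArray(doc) ? doc : [doc];
  } catch {
    // Fallback: treat each non-blank line as its own JSON value.
    const rows: Row[] = [];
    let recordId = 0;
    for (const line of text.split(/\r?\n/)) {
      if (line.trim() === "") continue;
      const value = JSON.parse(line); // still throws on a malformed line
      const lineRows: unknown[] = Array.isArray(value) ? value : [value];
      for (const row of lineRows) {
        if (typeof row !== "object" || row === null) continue; // only object rows are kept
        const obj = row as Row;
        const hasOwnId = "source_record_id" in obj || "participant_id" in obj;
        // Stamp a 0-based id unless the row already carries an identifier.
        rows.push(hasOwnId ? obj : { source_record_id: recordId, ...obj });
      }
      recordId += 1;
    }
    return rows;
  }
}

const exportText = '[{"trial_index":0},{"trial_index":1}]\n[{"trial_index":0}]\n';
console.log(parseJsonOrJsonLines(exportText));
```

In the sample, both lines contain a trial_index of 0, and the synthesized source_record_id is what keeps those two rows distinguishable once the lines are flattened into one stream.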
@jspsych/metadata@0.1.0
Minor Changes
0f4cc4a: Recursively expand nested JSON objects more than one level deep. Previously expandObjectFields only expanded a single level, so a value like response: {"Q0":{"score":4,"meta":{"valid":true}}} registered response.Q0 as an opaque value:"object" leaf and lost its sub-fields. Now nested plain objects are fully expanded into dotted sub-variables (response.Q0.score, response.Q0.meta.valid) with correct types and min/max/levels tracking at any depth. Arrays nested inside objects are now correctly typed as value:"array" instead of "object", and nested arrays-of-objects are extracted into their own Psych-DS CSV files keyed by their dotted column name, mirroring how top-level array columns are handled.
a5af08c: Detect and expand JSON-serialized nested columns in generate(). Flat JSON objects (e.g. response: {"Q0":4,"Q1":3}) are expanded into dotted sub-variables (response.Q0, response.Q1) in variableMeasured with correct types and min/max tracking. JSON arrays of objects are extracted into separate Psych-DS compliant CSV files ({stem}_measure-{col}_data.csv) with trial_index and element_index as join keys.
aab8da8: Extract plain (non-array) object columns into separate Psych-DS CSV files so their expanded sub-variables resolve to real columns. expandObjectFields registers dotted sub-variables for object columns (e.g. response.cb_1, calibration_data.type), but those names previously had no corresponding CSV column, so Psych-DS validation reported VARIABLE_MISSING_FROM_CSV_COLUMNS for every one. Object columns are now accumulated into a new extractedObjects map (exposed via getExtractedObjects()) as one row per trial, and the CLI writes a per-file sidecar CSV ({stem}_measure-{col}_data.csv), mirroring the existing array-of-objects extraction. The row is threaded through the recursive expansion so a column is recorded for every registered descendant (leaf scalars, intermediate object nodes, and nested-array parents), and it reuses the same configurable arrayJoinKeys (one row per trial, no element_index).
35de4b6: Extract arrays of primitives into sidecar CSVs so their elements become real, typed variables. Previously an array of numbers or strings (block_order: [16,100,4,1], images: [...]) was recorded only as a single value:"array" column with no per-element detail. Such arrays are now extracted like arrays-of-objects, but, since primitives have no field name, each element is recorded under a synthetic <column>.value column (distinct from the array parent, which stays value:"array"). The element variable gets its proper type with minValue/maxValue (numeric) or levels (string), joinable to its row via the existing join keys + element_index. This composes with the nested-array recursion (an array of arrays of numbers yields a grandchild table with a .value column) and completes Psych-DS round-tripping for all four cell shapes: scalar, object, array-of-objects, and array-of-primitives.
Tradeoff: every non-empty primitive-array column now produces its own sidecar CSV, so datasets with many such columns generate substantially more files (e.g. one eye-tracking export grew from 304 to 380 data files). Extraction is the default and there is no new prompt. A future opt-in primitiveArrayMode: "extract" | "summarize" could offer an in-place summary alternative, but is intentionally not added here to avoid complicating the CLI flow.
585d337: Convert uploaded JSON data to Psych-DS CSV in the frontend so datasets validate instead of failing with MISSING_DATAFILE.
Previously the frontend placed uploaded jsPsych JSON files into data/ unchanged, so the in-browser validator (and the downloadable zip) always failed: Psych-DS only recognises CSV/TSV datafiles whose names match its keyword pattern.
@jspsych/metadata gains two shared, filesystem-agnostic helpers, buildPsychDSDataFiles and deriveFallbackBase, that turn a parsed data file (plus any extracted nested array/object columns) into its set of Psych-DS-named CSV outputs. Used by both the CLI and the frontend so the conversion lives in one place.
data/ payload during generation (a compliant main CSV, one sidecar per nested array/object column, and the original JSON preserved under data/raw/), and Review uses it for both validation and the zip. Auto-derived filenames use the official subject keyword (subject-<stem>) to avoid the unofficial-keyword warning, and a .psychds-ignore is emitted so the preserved data/raw/ originals don't surface as FILE_NOT_CHECKED.
buildPsychDSDataFiles. No behaviour change.
6b0d1d4: Export Psych-DS utility functions from the core package: isValidPsychDSDataFilename, toPsychDSValue, deriveArrayFilename, objectsToCSV, disambiguateArrayFilename. Previously these lived only in the CLI. Moving them to core makes them available to any downstream consumer (e.g. the frontend) and ensures the CLI and any future tools share a single implementation.
The CLI now imports these functions from @jspsych/metadata instead of defining them locally. No behaviour change.
d9e4485: Recursively unnest nested data inside extracted array elements. Previously an array-of-objects column was extracted one level deep, so an element field that was itself an object (pointData.point) or an array (pointData.gazeSamples) was kept as a single opaque JSON column. Now element fields recurse: a nested plain object is expanded into deeper dotted columns in the same sidecar row (pointData.point.x, pointData.point.y), and a nested array-of-objects is extracted into its own grandchild CSV (..._measure-...GazeSamples_data.csv). Grandchild tables remain joinable to their specific parent element via a qualified <column>.element_index key carried alongside the existing join keys (e.g. trial_index + validation_data.pointData.element_index + the grandchild's own element_index), and every such key/column is registered in variableMeasured. This completes Psych-DS round-tripping for arbitrarily nested object/array data: arrays nested inside arrays inside objects now fully expand instead of bottoming out as JSON.
5fcce14: Register array-of-objects element fields in variableMeasured so extracted sidecar CSVs have no undeclared columns. Previously accumulateArrayColumn wrote each element's fields as bare columns (e.g. x, y) plus element_index into the extracted-array CSV, but never added them to variableMeasured, so Psych-DS validation reported CSV_COLUMN_MISSING_FROM_METADATA. Element fields are now emitted under dotted names (tobii_data.x, validation_data.pointData.point), avoiding collisions between same-named fields of different array columns, and each is registered with its correct type and min/max/levels tracking. element_index is registered once. Object- and array-valued element fields are recorded one level deep (a single dotted JSON column, value:"object"/"array"); they are not further expanded or extracted. This is the array-side counterpart to the plain-object sidecar fix and completes Psych-DS column/variable round-tripping for nested data.
Patch Changes
8731c30: Boolean variables no longer record levels. Genuine boolean values (typeof === "boolean") are typed value:"boolean" with no levels/minValue/maxValue, and string "true"/"false" values are kept as strings so they surface as levels: ["true","false"] (no longer coerced to boolean). A manual value:"boolean" override now drops any detected levels and warns when the detected values don't map cleanly to true/false (anything other than true/false/0/1). This also fixes a bug where raw booleans were pushed into the levels array, producing inconsistent [false]/empty output.
585d337: The CLI now writes a .psychds-ignore at the dataset root when it preserves raw jsPsych originals under data/raw/, so the validator no longer flags them as FILE_NOT_CHECKED. This mirrors the behavior the frontend already had.
The .psychds-ignore filename and content (**/raw/ plus a self-reference, dictated by validator quirks) are now exported from @jspsych/metadata as PSYCHDS_IGNORE_FILENAME and PSYCHDS_IGNORE_CONTENT, so the CLI and frontend share one definition instead of duplicating the literal string.
f96e1e6: Add tests verifying variableMeasured completeness for CSV input. Covers always-empty columns, null-string columns, partially-empty columns, and sparse multi-trial-type CSVs where different trial types populate different columns.
ed9c25c: Fix stray empty-string expression in parseCSV and remove stale tsconfig paths entry for csv-parse/browser/esm (was pointing to a non-existent path in the installed csv-parse version).
1511d20: variableMeasured.description is now always serialized as a single schema.org Text value. When a column accumulated genuinely different descriptions from multiple plugins, getList() previously emitted description as an object ({ pluginType: text }), which made the Psych-DS validator raise an OBJECT_TYPE_MISSING warning. The distinct descriptions are now joined into one string with " | ". getList() is also idempotent now (a second call no longer mangles an already-collapsed string description), and empty descriptions collapse to "unknown".
8edc7c2: Drop unnamed columns so R-exported datasets validate. R's write.csv (with the default row.names = TRUE) prepends an unnamed row-index column, so the exported CSV header starts with a bare comma, i.e. an empty-string column name. Psych-DS variables require a name, so the column can never appear in variableMeasured; left in the on-disk CSV it fails validation with CSV_COLUMN_MISSING_FROM_METADATA.
The strip now lives in the shared data-file path so the CLI and frontend behave identically:
generate() strips empty/whitespace-only columns from the parsed data up front, with a single warning instead of per-row spam (keeps variableMeasured clean and standalone library use safe), via a new exported stripUnnamedColumns helper.
buildPsychDSDataFiles strips the main table before emitting it: a clean CSV keeps its exact bytes (verbatim mainContent), while a file with an unnamed column is re-serialised from the cleaned rows. Both the CLI (rename-plan and non-plan paths) and the frontend feed parsed mainRows, so the written/zipped/validated CSV always matches the metadata.
Fixes finding #2 of #109 (Validation fails on real jsPsych dataset: time_elapsed always seeded, unnamed columns dropped, join-key prompt not gated).
e80e57c: Fix always-empty columns being silently dropped from variableMeasured. Columns whose values are null or empty across all rows in a dataset now appear in variableMeasured with a minimal "value": "unknown" entry, satisfying the Psych-DS requirement that every CSV column header has a corresponding entry.
06a84fb: fix(metadata): make the Node ESM entry (dist/index.js) loadable
The build runs esbuild (which emits the bundled dist/index.js) followed by tsc. With declaration: true and outDir: ./dist but no emitDeclarationOnly, tsc re-emitted an unbundled dist/index.js over esbuild's bundle, leaving extensionless relative imports (e.g. ./utils) that Node's ESM loader rejects.
Added emitDeclarationOnly: true so tsc emits only the .d.ts declarations and esbuild's working bundle survives; type-checking and dist/index.d.ts are unchanged.
03a3ce4: fix(metadata): preserve string descriptions and primitive column types across generate() calls
Two related bugs fixed in metadata generation:
String descriptions wiped on re-generate: VariablesMap.updateDescription previously replaced any non-object description with {} before merging, discarding user-written descriptions loaded from an existing dataset_description.json. Non-object descriptions are now promoted to { default: string } so they survive subsequent generate() calls.
Mixed-type column typed as "array" instead of "string": when a column's rows contain a mix of primitive values and arrays/objects (e.g. a response column with keyboard-trial strings and survey-trial objects), later rows previously overwrote the column type to "array". The array-type override now only fires when the existing type is not already a concrete primitive ("string", "number", or "boolean").
ae0d01c: fix(metadata): treat mixed-type columns as categorical, not numeric+categorical
A column containing both numeric and non-numeric values previously produced contradictory metadata: value: "number" alongside both minValue/maxValue and levels. The fix decides at the cell level. Once a non-numeric value arrives in a column that had numeric min/max (or vice versa), the column is downgraded to categorical: min/max fields are removed, boundary values are preserved as string levels, and a console.warn is emitted once per column.
c2426be: Fix PluginCache parsing errors for standard and custom jsPsych plugins. The data block was extracted with a lazy regex that overshot into the rest of the info object; replaced with brace-counting extraction that handles any nesting depth. Non-ok HTTP responses (e.g. 404 for unknown plugins) are now caught before reaching the parser rather than passing HTML error pages as source code. Additionally, JSDoc descriptions for parameters inside a nested: sub-object (e.g. view_history's page_index and viewing_time in jsPsych-instructions) are now correctly extracted; previously the first nested parameter was silently consumed by the parent variable's regex match and never added to the cache.
e1cb44e: Fix whitespace-only string values being misdetected as numeric (#70). A cell containing only whitespace (e.g. a single space) passed the isNaN(Number(value)) check because Number(" ") is 0, but parseFloat(" ") is NaN, leaking through as NaN minValue/maxValue (serialized to null) on otherwise-categorical string columns. The numeric check now requires non-empty trimmed content and uses Number for both the test and the conversion so they cannot disagree.
3c7d1f7: Accept JSON-Lines (JSONL) experiment data, not just a single JSON array. Several jsPsych labs (and JATOS exports) write data as newline-delimited JSON, with one JSON value per line (typically one participant's full trial array per line) rather than one big array. Previously
generate() ran JSON.parse on the whole string, so every such file failed with Unexpected non-whitespace character after JSON and produced no metadata.
A new exported parseJsonData helper handles both shapes: a well-formed single document is returned unchanged (no behaviour change for existing single-array callers), and only when whole-string parsing fails does it fall back to parsing line by line, flattening any per-line arrays into one observation stream. It is now used wherever JSON data files are parsed:
generate() (the library) for the main ingestion path.
The .jsonl file extension is now also recognised as a JSON data file (these exports are conventionally named .jsonl). The CLI processes .jsonl exactly like .json (including filename-normalization, raw-original preservation, and CSV conversion), and the frontend normalises a .jsonl upload to the JSON path.
Verified end to end against the raw .jsonl exports in vucml/online_experiments: all 15 files now generate metadata and pass the Psych-DS validator with zero errors (they failed at parse time before).
3c7d1f7: Synthesize a
source_record_id join key for multi-record JSON-Lines exports. Raw jsPsych exports carry no per-row identifier, so once JSONL is flattened (one record per line) trial_index repeats across records and can't uniquely key the extracted array/object sidecar CSVs: every record's trial 0 collapsed onto the same (trial_index, element_index) key, making the sidecars impossible to join back to a single parent trial.
The synthesized column is named source_record_id rather than participant_id because a JSON-Lines line is only guaranteed to be one source record (usually, but not always, one participant). The honest name avoids overclaiming for exports where a line isn't a single subject.
parseJsonData now takes an opt-in { tagSourceRecordId } flag: in the JSON-Lines path it stamps each line's object rows with a 0-based source_record_id (a no-op on the single-array fast path), and reports via an optional stats out-param whether it actually synthesized the id. A line that already carries a source_record_id or a real participant_id is left untouched, since the experiment's own identifier already groups those rows.
generate() enables this for JSON input and promotes the identifier to the leading join key, preferring the synthesized source_record_id and falling back to a real participant_id already present in the export (['source_record_id', 'trial_index'] or ['participant_id', 'trial_index']), so the sidecars join unambiguously. CSV inputs are unaffected.
When, and only when, the id was actually synthesized (i.e. absent from the source), it is given an explicit description that makes its synthetic origin unmistakable ("Synthetic source-record identifier … NOT a real subject ID from the experiment …") so a downstream user can't mistake it for a real subject ID; this also avoids serializing an empty {} description (an object with no @type, which trips the validator's OBJECT_TYPE_MISSING). The CLI's join-key pre-analysis/prompt and the frontend's pre-flight mirror this promotion so multi-record JSONL is no longer falsely flagged as having a non-unique join key.
Verified end to end against the raw .jsonl exports in vucml/online_experiments (block_cat): the combined 30-record export generates metadata, passes the Psych-DS validator (0 errors), synthesizes source_record_id 0–29, and writes sidecars whose (source_record_id, trial_index, element_index) keys are fully unique, including the doubly-nested recall_responses case. Notably subjectId collides across the two merged datasets (two records share 601), which source_record_id correctly keeps distinct.
72f8a4b: Register jsPsych system variables (
trial_type, trial_index, time_elapsed, extension_type, extension_version) lazily instead of seeding them in the VariablesMap constructor. They now appear in variableMeasured only when their column is actually present in the data. Previously time_elapsed (and the others) were always emitted, so any dataset whose CSVs omit time_elapsed (common for processed/aggregated jsPsych exports) failed Psych-DS validation with VARIABLE_MISSING_FROM_CSV_COLUMNS. Datasets that do contain these columns are unaffected.
This also removes the eager generateDefaultExtensionVariables() seeding path, which registered both extension_type and extension_version whenever extension_type was observed, orphaning extension_version for any dataset that lacked that column. The extension variables now register lazily per-column like the other system variables.
ca8dc75: Extend the stress-test regression guards with three more Jest suites covering the CSV ingestion path, generation at scale, and cross-file output-name collisions.
@jspsych/metadata, csv-input.stress: pins how generate(data, {}, "csv") re-infers types from string cells (numeric coercion incl. whitespace/scientific-notation/Infinity/NaN rejection, mixed-column downgrade, "true"/"false" staying categorical, RFC-4180 quoting, unicode, empty/literal-null cells, the 50-char level cap, JSON-in-a-cell extraction), and asserts CSV/JSON parity for unambiguously-typed columns.
@jspsych/metadata, scale.stress: feeds a 5,000-row dataset and checks exact numeric extremes, categorical dedup, high-cardinality level accumulation, boolean handling, and a throughput ceiling that guards against accidental O(n²) regressions.
@jspsych/metadata-cli, array-collision.stress: two same-stem files in different subdirectories sharing a nested array column, asserting processDirectory disambiguates every main CSV, sidecar, and preserved raw original (no overwrites, all still Psych-DS compliant), which covers the cross-file collision gap left by the earlier rename suite.
Test-only change; no library or CLI behavior is modified.
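The numeric-coercion rule that the CSV stress suite pins down, and that the e1cb44e fix introduced (non-empty trimmed content, with Number used for both the test and the conversion), might look roughly like the sketch below. The function name toNumericCell is hypothetical, and rejecting Infinity and NaN here is an assumption based on the suite description, not a copy of the library's code.

```typescript
// Hypothetical helper, not the library's actual function: returns the numeric
// value of a cell, or undefined when the cell should stay non-numeric.
function toNumericCell(value: unknown): number | undefined {
  if (typeof value === "number") {
    return isFinite(value) ? value : undefined;
  }
  if (typeof value !== "string") {
    return undefined;
  }
  const trimmed = value.trim();
  if (trimmed === "") {
    // Number(" ") is 0, so whitespace-only cells must be rejected explicitly.
    return undefined;
  }
  // The same Number() call serves as both the test and the conversion,
  // so the two can never disagree the way Number and parseFloat did.
  const parsed = Number(trimmed);
  return isFinite(parsed) ? parsed : undefined;
}

const cells = [" ", "", "42", " 3.5 ", "1e3", "NaN", "Infinity", "abc"];
const numeric = cells.map((cell) => toNumericCell(cell));
```

Only "42", " 3.5 " and "1e3" come back as numbers in this sketch; every other cell stays non-numeric and would contribute to levels rather than minValue/maxValue.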
fa17a9e: Add stress-test regression guards to the automated suite so previously-fixed nested-data and filename-normalization behavior can't silently regress.
Four Jest suites, ported from the standalone stress-tests/ harnesses so they run under plain npm test (and CI) without a build step:
@jspsych/metadata: generate() coherence over a comprehensive nested-data fixture (deep objects, arrays of objects/arrays, mixed-type columns, a trial_type-less row, unicode, empties), plus the Psych-DS filename-normalization helper invariants.
@jspsych/metadata-cli: the processDirectory conversion end-to-end (compliant main CSV, data/raw/ preservation, two-way variableMeasured ↔ CSV-column cross-check, and a best-effort Psych-DS validation pass), plus the refusal to write a non-compliant filename non-interactively.
Test-only change; no library or CLI behavior is modified. The shared fixture lives at dev/stress/.
55f2f91: Strip JSDoc continuation * markers when parsing multi-line plugin/extension variable descriptions, so descriptions like the webgazer extension's webgazer_data no longer contain stray asterisks. Adds a regression test for webgazer-shaped multi-line JSDoc.
frontend@0.1.0
Minor Changes
585d337: Convert uploaded JSON data to Psych-DS CSV in the frontend so datasets validate instead of failing with MISSING_DATAFILE.
Previously the frontend placed uploaded jsPsych JSON files into data/ unchanged, so the in-browser validator (and the downloadable zip) always failed: Psych-DS only recognises CSV/TSV datafiles whose names match its keyword pattern.
@jspsych/metadata gains two shared, filesystem-agnostic helpers, buildPsychDSDataFiles and deriveFallbackBase, that turn a parsed data file (plus any extracted nested array/object columns) into its set of Psych-DS-named CSV outputs. Used by both the CLI and the frontend so the conversion lives in one place.
data/ payload during generation (a compliant main CSV, one sidecar per nested array/object column, and the original JSON preserved under data/raw/), and Review uses it for both validation and the zip. Auto-derived filenames use the official subject keyword (subject-<stem>) to avoid the unofficial-keyword warning, and a .psychds-ignore is emitted so the preserved data/raw/ originals don't surface as FILE_NOT_CHECKED.
buildPsychDSDataFiles. No behaviour change.
03a3ce4: Add in-browser Psych-DS validation to the Review step. A "Validate dataset" button runs the official psychds-validator web bundle directly in the browser against the generated dataset_description.json and the uploaded data files, showing a pass/error/warning report inline instead of only pointing users to the CLI. The validator bundle is code-split and lazy-loaded on first use, and the command-line instructions remain available as a fallback.
Patch Changes
8edc7c2: Drop unnamed columns so R-exported datasets validate. R's write.csv (with the default row.names = TRUE) prepends an unnamed row-index column, so the exported CSV header starts with a bare comma, i.e. an empty-string column name. Psych-DS variables require a name, so the column can never appear in variableMeasured; left in the on-disk CSV it fails validation with CSV_COLUMN_MISSING_FROM_METADATA.
The strip now lives in the shared data-file path so the CLI and frontend behave identically:
generate() strips empty/whitespace-only columns from the parsed data up front, with a single warning instead of per-row spam (keeps variableMeasured clean and standalone library use safe), via a new exported stripUnnamedColumns helper.
buildPsychDSDataFiles strips the main table before emitting it: a clean CSV keeps its exact bytes (verbatim mainContent), while a file with an unnamed column is re-serialised from the cleaned rows. Both the CLI (rename-plan and non-plan paths) and the frontend feed parsed mainRows, so the written/zipped/validated CSV always matches the metadata.
Fixes finding #2 of #109 (Validation fails on real jsPsych dataset: time_elapsed always seeded, unnamed columns dropped, join-key prompt not gated).
3c7d1f7: Accept JSON-Lines (JSONL) experiment data, not just a single JSON array. Several jsPsych labs (and JATOS exports) write data as newline-delimited JSON, with one JSON value per line (typically one participant's full trial array per line) rather than one big array. Previously generate() ran JSON.parse on the whole string, so every such file failed with Unexpected non-whitespace character after JSON and produced no metadata.
A new exported parseJsonData helper handles both shapes: a well-formed single document is returned unchanged (no behaviour change for existing single-array callers), and only when whole-string parsing fails does it fall back to parsing line by line, flattening any per-line arrays into one observation stream. It is now used wherever JSON data files are parsed:
generate() (the library) for the main ingestion path.
The .jsonl file extension is now also recognised as a JSON data file (these exports are conventionally named .jsonl). The CLI processes .jsonl exactly like .json (including filename-normalization, raw-original preservation, and CSV conversion), and the frontend normalises a .jsonl upload to the JSON path.
Verified end to end against the raw .jsonl exports in vucml/online_experiments: all 15 files now generate metadata and pass the Psych-DS validator with zero errors (they failed at parse time before).
3c7d1f7: Synthesize a source_record_id join key for multi-record JSON-Lines exports. Raw jsPsych exports carry no per-row identifier, so once JSONL is flattened (one record per line) trial_index repeats across records and can't uniquely key the extracted array/object sidecar CSVs: every record's trial 0 collapsed onto the same (trial_index, element_index) key, making the sidecars impossible to join back to a single parent trial.
The synthesized column is named source_record_id rather than participant_id because a JSON-Lines line is only guaranteed to be one source record (usually, but not always, one participant). The honest name avoids overclaiming for exports where a line isn't a single subject.
parseJsonData now takes an opt-in { tagSourceRecordId } flag: in the JSON-Lines path it stamps each line's object rows with a 0-based source_record_id (a no-op on the single-array fast path), and reports via an optional stats out-param whether it actually synthesized the id. A line that already carries a source_record_id or a real participant_id is left untouched, since the experiment's own identifier already groups those rows.
generate() enables this for JSON input and promotes the identifier to the leading join key, preferring the synthesized source_record_id and falling back to a real participant_id already present in the export (['source_record_id', 'trial_index'] or ['participant_id', 'trial_index']), so the sidecars join unambiguously. CSV inputs are unaffected.
When, and only when, the id was actually synthesized (i.e. absent from the source), it is given an explicit description that makes its synthetic origin unmistakable ("Synthetic source-record identifier … NOT a real subject ID from the experiment …") so a downstream user can't mistake it for a real subject ID; this also avoids serializing an empty {} description (an object with no @type, which trips the validator's OBJECT_TYPE_MISSING). The CLI's join-key pre-analysis/prompt and the frontend's pre-flight mirror this promotion so multi-record JSONL is no longer falsely flagged as having a non-unique join key.
Verified end to end against the raw .jsonl exports in vucml/online_experiments (block_cat): the combined 30-record export generates metadata, passes the Psych-DS validator (0 errors), synthesizes source_record_id 0–29, and writes sidecars whose (source_record_id, trial_index, element_index) keys are fully unique, including the doubly-nested recall_responses case. Notably subjectId collides across the two merged datasets (two records share 601), which source_record_id correctly keeps distinct.
Updated dependencies [8731c30]
Updated dependencies [585d337]
Updated dependencies [f96e1e6]
Updated dependencies [ed9c25c]
Updated dependencies [0f4cc4a]
Updated dependencies [1511d20]
Updated dependencies [8edc7c2]
Updated dependencies [a5af08c]
Updated dependencies [aab8da8]
Updated dependencies [35de4b6]
Updated dependencies [e80e57c]
Updated dependencies [06a84fb]
Updated dependencies [03a3ce4]
Updated dependencies [ae0d01c]
Updated dependencies [c2426be]
Updated dependencies [e1cb44e]
Updated dependencies [585d337]
Updated dependencies [3c7d1f7]
Updated dependencies [3c7d1f7]
Updated dependencies [72f8a4b]
Updated dependencies [6b0d1d4]
Updated dependencies [ca8dc75]
Updated dependencies [d9e4485]
Updated dependencies [5fcce14]
Updated dependencies [fa17a9e]
Updated dependencies [55f2f91]
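For reviewers, the JSON-Lines fallback and source_record_id tagging described in the 3c7d1f7 entries can be pictured with the sketch below. It is an illustration under stated assumptions, not the exported parseJsonData: the function name parseJsonOrJsonLines, its boolean parameter, and the choice to number records by their position among non-blank lines are all hypothetical.

```typescript
type Row = { [key: string]: unknown };

function isPlainObject(value: unknown): value is Row {
  return typeof value === "object" && value !== null && !Array.isArray(value);
}

// Hypothetical sketch: try the whole string first, then fall back to one
// JSON value per line, flattening per-line arrays into a single row stream.
function parseJsonOrJsonLines(text: string, tagSourceRecordId: boolean): unknown {
  try {
    // Fast path: a well-formed single document is returned unchanged.
    return JSON.parse(text);
  } catch {
    // Not a single document; continue with line-by-line parsing below.
  }
  const rows: unknown[] = [];
  let recordId = 0; // assumption: 0-based position among non-blank lines
  for (const line of text.split(/\r?\n/)) {
    if (line.trim() === "") continue;
    const parsed: unknown = JSON.parse(line);
    const lineRows: unknown[] = Array.isArray(parsed) ? parsed : [parsed];
    // A line that already carries its own identifier is left untouched.
    let hasOwnId = false;
    for (const row of lineRows) {
      if (isPlainObject(row) && ("source_record_id" in row || "participant_id" in row)) {
        hasOwnId = true;
      }
    }
    for (const row of lineRows) {
      if (tagSourceRecordId && !hasOwnId && isPlainObject(row)) {
        rows.push({ source_record_id: recordId, ...row });
      } else {
        rows.push(row);
      }
    }
    recordId += 1;
  }
  return rows;
}

// Two records, one trial array per line: trial_index 0 repeats across records.
const jsonl = '[{"trial_index":0}]\n[{"trial_index":0},{"trial_index":1}]';
const tagged = JSON.stringify(parseJsonOrJsonLines(jsonl, true));
const single = JSON.stringify(parseJsonOrJsonLines('[{"trial_index":0}]', true));
```

In this sketch the pair (source_record_id, trial_index) is unique across the flattened rows even though trial_index alone repeats, and a single-array document passes through without any tagging.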