Property-based testing framework for generating OHDSI cohort definitions that conform to the Atlas/Circe JSON schema.
This project uses test.check to generate random, valid OHDSI cohort definitions. Instead of manually writing cohort definitions, property-based testing automatically generates hundreds of valid cohorts to ensure your cohort processing code can handle a wide variety of inputs.
Generators for all cohort definition components:
- Concepts with OMOP vocabulary metadata
- Concept sets with unique sequential IDs
- Primary criteria with various domain types (always references existing concept sets)
- Correlated criteria with temporal windows and occurrence constraints
- Inclusion rules with complex correlation expressions
- Optional components (observation windows, limits, collapse settings, demographic criteria)
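The guarantee that primary criteria always reference existing concept sets suggests dependent generation: build the concept sets first, then draw CodesetIds from the IDs that were actually produced. The following is a minimal sketch of that idea with test.check, not the project's implementation. The names `concept-set-with-id`, `gen-concept-sets-sketch`, and `gen-cohort-sketch` are hypothetical, and the concept set expression is left empty for brevity.

```clojure
(require '[clojure.test.check.generators :as gen])

;; Hypothetical helper: build a concept set map for a caller-supplied id.
;; The expression is left empty here; the real generators fill it with items.
(defn concept-set-with-id [id set-name]
  {"id" id "name" set-name "expression" {"items" []}})

;; Generate 1-4 names, then number them 0, 1, 2, ... so ids are unique by construction.
(def gen-concept-sets-sketch
  (gen/fmap #(vec (map-indexed concept-set-with-id %))
            (gen/vector gen/string-alphanumeric 1 4)))

;; gen/let lets the second binding depend on the first, so every CodesetId
;; is drawn from the ids of the concept sets that were just generated.
(def gen-cohort-sketch
  (gen/let [concept-sets gen-concept-sets-sketch
            codeset-ids  (gen/vector
                           (gen/elements (mapv #(get % "id") concept-sets))
                           1 3)]
    {"ConceptSets" concept-sets
     "PrimaryCriteria"
     {"CriteriaList" (mapv (fn [id] {"ConditionOccurrence" {"CodesetId" id}})
                           codeset-ids)}}))

;; Produce one sample cohort at the REPL.
(gen/generate gen-cohort-sketch)
```

Building the reference after the thing it refers to avoids filtering out invalid cohorts later, which tends to keep generation fast and shrinking predictable.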
Property-based tests that verify:
- Generated cohorts have required fields
- All nested structures are valid
- Primary criteria is never empty
- Concept set IDs are unique
- CodesetIds always reference existing concept sets (in both primary criteria and inclusion rules)
- Correlated criteria have valid temporal windows
- JSON serialization/deserialization preserves structure
- Schema compliance (ConceptSets, PrimaryCriteria, InclusionRules, etc.)
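As an illustration of how checks like these can be phrased, here is a hedged sketch of a single property covering ID uniqueness and CodesetId references. It uses a deliberately tiny stand-in generator (`gen-tiny-cohort`, hypothetical) so the block runs on its own; in the project, the property would presumably be written against `gen-cohort-definition` instead. The helper names are also assumptions.

```clojure
(require '[clojure.test.check :as tc]
         '[clojure.test.check.generators :as gen]
         '[clojure.test.check.properties :as prop])

;; Ids of all concept sets in a cohort map (string keys, as in the JSON output).
(defn concept-set-ids [cohort]
  (map #(get % "id") (get cohort "ConceptSets")))

;; CodesetIds used by the primary criteria, whatever the domain key is.
(defn primary-codeset-ids [cohort]
  (for [criterion      (get-in cohort ["PrimaryCriteria" "CriteriaList"])
        [_domain body] criterion]
    (get body "CodesetId")))

;; Stand-in generator (hypothetical): n concept sets with ids 0..n-1,
;; and 1-3 criteria that pick ids from that range.
(def gen-tiny-cohort
  (gen/let [n   (gen/choose 1 5)
            ids (gen/vector (gen/choose 0 (dec n)) 1 3)]
    {"ConceptSets" (mapv (fn [i] {"id" i}) (range n))
     "PrimaryCriteria"
     {"CriteriaList" (mapv (fn [id] {"ConditionOccurrence" {"CodesetId" id}}) ids)}}))

;; The property: ids are distinct and every CodesetId names an existing set.
(def ids-unique-and-referenced
  (prop/for-all [cohort gen-tiny-cohort]
    (let [ids (set (concept-set-ids cohort))]
      (and (apply distinct? (concept-set-ids cohort))
           (every? #(contains? ids %) (primary-codeset-ids cohort))))))

;; Run 100 iterations; the result map reports :pass? and the seed used.
(def result (tc/quick-check 100 ids-unique-and-referenced))
```

If the property fails, test.check shrinks the failing cohort to a smaller counterexample, which is usually easier to debug than the original random input.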
The suite contains 21 property tests, each running 100+ iterations (2100+ generated cohorts in total), to validate cohort structure.
Generate cohorts and display them in the console:
clj -M:run 10
Generate cohorts and save them to disk (one JSON file per cohort):
clj -M:run 20 --output-dir output
Save to a custom directory:
clj -M:run 5 --output-dir C:\my-cohorts
Arguments:
- <count> - Number of cohorts to generate (required)
- --output-dir <directory> - Optional directory to save cohorts
Output files (when using --output-dir):
- cohort-0.json
- cohort-1.json
- cohort-2.json
- etc.
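The one-file-per-cohort layout can be sketched roughly as follows. This is an assumption about how such output could be written with data.json, not the project's actual code; `save-cohorts-sketch` is a hypothetical name.

```clojure
(require '[clojure.data.json :as json]
         '[clojure.java.io :as io])

;; Hypothetical helper: write each cohort map to <output-dir>/cohort-<index>.json
;; and return the paths that were written.
(defn save-cohorts-sketch [cohorts output-dir]
  ;; Create the directory (and any missing parents) if it does not exist yet.
  (.mkdirs (io/file output-dir))
  (vec
    (map-indexed
      (fn [i cohort]
        (let [f (io/file output-dir (str "cohort-" i ".json"))]
          (spit f (json/write-str cohort))
          (.getPath f)))
      cohorts)))
```

Indexing from zero matches the file names listed above.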
Build a standalone executable JAR file:
clj -T:build uber
This creates target/cohort-tests-0.1.0-standalone.jar.
Run the uberjar:
# Generate 10 cohorts
java -jar target/cohort-tests-0.1.0-standalone.jar 10
# Generate 20 cohorts and save to directory
java -jar target/cohort-tests-0.1.0-standalone.jar 20 --output-dir output
# Generate 5 cohorts and save to custom location
java -jar target/cohort-tests-0.1.0-standalone.jar 5 --output-dir C:\cohorts
Clean build artifacts:
clj -T:build clean
The uberjar includes all dependencies and can be distributed as a single file. Running it requires only Java, not a Clojure installation.
Run the tests:
clj -M:test
This runs both property-based tests (1000+ generated cohorts) and unit tests.
Alternatively, run tests directly:
clj -M:test -e '(require (quote cohort-tests.core-test)) (clojure.test/run-tests (quote cohort-tests.core-test))'
Start a REPL:
clj -M:repl
Then in the REPL:
(require '[cohort-tests.core :as cohort])
(require '[clojure.test.check.generators :as gen])
;; Generate a single cohort
(gen/generate cohort/gen-cohort-definition)
;; Generate 10 cohorts
(cohort/generate-cohorts 10)
;; Generate and validate
(def cohorts (cohort/generate-cohorts 5))
(every? cohort/validate-cohort-definition cohorts)
; => true
;; Convert to JSON
(cohort/cohort->json (first cohorts))
;; Generate specific components
(gen/generate cohort/gen-concept)
(gen/generate cohort/gen-concept-set)
(gen/generate cohort/gen-primary-criteria)
Generated cohorts conform to the OHDSI Atlas/Circe JSON schema (draft-04):
- ConceptSets - Array of concept set definitions
- PrimaryCriteria - Entry criteria with CriteriaList
- QualifiedLimit - String or object with Type
- ExpressionLimit - String or object with Type
- InclusionRules - Array of inclusion rule objects
- CollapseSettings - Era collapse configuration
- cdmVersionRange - CDM version specification
Basic Components:
- gen-concept - OMOP concept with ID, name, domain, vocabulary
- gen-concept-set-item - Concept with inclusion/exclusion flags
- gen-concept-set - Named set with concept expression
- gen-observation-window - Prior/post days window
- gen-criteria-item - Domain criteria (Condition, Drug, Procedure, Measurement, Observation)
- gen-primary-criteria - Entry criteria with optional window (always references valid concept sets)
Correlated Criteria Components:
- gen-window - Temporal windows (StartWindow/EndWindow) with start/end days
- gen-correlated-criteria-item - Criteria with temporal windows and occurrence constraints
- gen-demographic-criteria - Age and gender criteria
- gen-correlated-criteria - Complete correlation expression with criteria lists and groups
- gen-inclusion-rule - Inclusion rule with name and correlated criteria expression
- gen-inclusion-rules - List of inclusion rules
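To make the idea of a temporal window generator concrete, here is a rough sketch. The field layout ("Start" and "End", each with "Days" and a "Coeff" of -1 for before or 1 for after the index event) is an assumption based on the Circe window format, and requiring the start to precede the end is an extra assumption of this sketch. `offset->endpoint`, `signed-days`, and `gen-window-sketch` are hypothetical names, not the project's `gen-window`.

```clojure
(require '[clojure.test.check.generators :as gen])

;; Convert a signed day offset into an assumed Circe-style endpoint:
;; a non-negative day count plus a direction coefficient.
(defn offset->endpoint [offset]
  {"Days"  (Math/abs (long offset))
   "Coeff" (if (neg? offset) -1 1)})

;; Inverse of offset->endpoint, used to compare endpoints.
(defn signed-days [endpoint]
  (* (get endpoint "Days") (get endpoint "Coeff")))

;; Generate two offsets within a year either side of the index event and
;; sort them, so the window's start never falls after its end.
(def gen-window-sketch
  (gen/fmap (fn [offsets]
              (let [[start end] (sort offsets)]
                {"Start" (offset->endpoint start)
                 "End"   (offset->endpoint end)}))
            (gen/vector (gen/choose -365 365) 2)))

;; Produce one sample window at the REPL.
(gen/generate gen-window-sketch)
```

Sorting the two offsets constructs an ordered window directly, rather than generating arbitrary pairs and discarding the invalid ones with a filter.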
Top-Level Generator:
- gen-cohort-definition - Complete valid cohort with:
  - Unique sequential concept set IDs (0, 1, 2, ...)
  - Primary criteria that always reference existing concept sets
  - Optional inclusion rules with correlated criteria
  - Optional qualified/expression limits, collapse settings, CDM version
21 property-based tests verify structural correctness (2100+ cohorts validated):
Basic Structure:
1. generated-cohorts-are-valid - All required fields present
2. generated-cohorts-have-concept-sets - ConceptSets is a non-empty vector
3. generated-cohorts-have-primary-criteria - PrimaryCriteria is a valid map
4. concept-sets-have-required-fields - Each set has id, name, expression
5. concept-set-items-have-concepts - Items contain concept objects
6. concepts-have-required-fields - Concepts have ID and name
7. criteria-list-is-vector - CriteriaList is a vector
Enhanced Validations:
8. concept-set-ids-are-unique - No duplicate concept set IDs
9. primary-criteria-never-empty - CriteriaList always has ≥1 item
10. codeset-ids-reference-existing-concept-sets - All CodesetIds in primary criteria are valid
Correlated Criteria:
11. inclusion-rules-have-valid-structure - Rules have name and expression
12. inclusion-rule-codeset-ids-reference-existing-concept-sets - CodesetIds in inclusion rules are valid
13. correlated-criteria-have-valid-windows - Temporal windows have Start and End
Optional Fields:
14. observation-window-has-valid-structure - Window has PriorDays/PostDays
15. qualified-limit-is-valid - Limit is string or object
16. expression-limit-is-valid - Limit is string or object
17. collapse-settings-is-valid - Settings have valid structure
Serialization:
18. json-roundtrip-preserves-structure - Serialization is lossless
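A lossless round trip can be checked along these lines. This is a minimal sketch using data.json directly; `roundtrips?` and `example-cohort` are hypothetical names, and the project's `cohort->json` may differ in its options.

```clojure
(require '[clojure.data.json :as json])

;; A small hand-written cohort with string keys, mirroring the JSON field names.
(def example-cohort
  {"ConceptSets" [{"id" 0 "name" "abc123" "expression" {"items" []}}]
   "PrimaryCriteria" {"CriteriaList" [{"ConditionOccurrence" {"CodesetId" 0}}]
                      "ObservationWindow" {"PriorDays" 30 "PostDays" 0}}})

;; Serialize to a JSON string, parse it back, and compare with the original.
(defn roundtrips? [cohort]
  (= cohort (json/read-str (json/write-str cohort))))

(roundtrips? example-cohort)
;; => true
```

The comparison holds here because the map uses string keys. If cohorts were represented with keyword keys, `json/read-str` would need `:key-fn keyword` for the parsed result to compare equal.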
Example of a generated cohort:
{
  "ConceptSets": [
    {
      "id": 0,
      "name": "abc123",
      "expression": {
        "items": [
          {
            "concept": {
              "CONCEPT_ID": 42,
              "CONCEPT_NAME": "xyz789",
              "DOMAIN_ID": "Condition",
              "VOCABULARY_ID": "SNOMED",
              "CONCEPT_CODE": "def456"
            },
            "includeDescendants": true,
            "isExcluded": false,
            "includeMapped": false
          }
        ]
      }
    }
  ],
  "PrimaryCriteria": {
    "CriteriaList": [
      {
        "ConditionOccurrence": {
          "CodesetId": 0
        }
      }
    ],
    "ObservationWindow": {
      "PriorDays": 30,
      "PostDays": 0
    }
  }
}
Dependencies:
- Clojure 1.11.1
- test.check 1.1.1 (property-based testing)
- data.json 2.4.0 (JSON serialization)
Copyright © 2026