Context
Concept sets in this repository use a local integer id that is only meaningful within our system. The OHDSI Concept Set Specification explicitly states that id is "a unique identifier for the concept set within a given system" and does not define any cross-system identifier.
This makes it difficult to:
- Share concept sets with other organizations or OHDSI community members
- Reference a specific concept set unambiguously across systems (e.g., in publications, study packages, or federated analyses)
- Track provenance and attribution when concept sets are reused
This is a well-known gap in the OHDSI ecosystem. See in particular:
- OHDSI/Atlas#496 — extensive discussion on adding UUID/GUID to cohort definitions (2017–2019, never implemented)
- OHDSI/Strategus#114 — modern JSON schema using URI references for portable artifacts
- OHDSI/Athena#48 — proposal for a shared design repository with versioning
Proposal
Add two new fields to the concept set metadata object:
1. uniqueId (string, UUID v4)
A globally unique identifier for the concept set, generated once at creation and stable across versions.
{
  "metadata": {
    "uniqueId": "550e8400-e29b-41d4-a716-446655440000",
    ...
  }
}
Rationale: The OHDSI community debated UUID vs content-hash (Atlas #496). A UUID v4 is simpler and more practical for our use case:
- It stays stable when the concept set is updated (content hashes would change on every edit)
- It can be generated offline without any central registry
- It is universally understood and supported
The local id (integer) would remain for internal use (file naming, URLs, backward compatibility).
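As a rough illustration of "generated once at creation and stable across versions", here is a minimal Python sketch. The helper name `assign_unique_id` and the in-memory dict layout are assumptions made for this example, not existing code in this repository; the only library call used is the standard `uuid.uuid4()`.

```python
import uuid


def assign_unique_id(concept_set: dict) -> dict:
    """Add metadata.uniqueId if it is missing; never overwrite an existing one.

    Hypothetical helper for illustration only.
    """
    metadata = concept_set.setdefault("metadata", {})
    if "uniqueId" not in metadata:
        # uuid.uuid4() produces a random (version 4) UUID locally,
        # so no central registry or network access is needed.
        metadata["uniqueId"] = str(uuid.uuid4())
    return concept_set


# Creation: the identifier is generated once.
concept_set = {"id": 1, "version": "1.0.0", "metadata": {}}
assign_unique_id(concept_set)
first = concept_set["metadata"]["uniqueId"]

# A later edit bumps the version; calling the helper again is a no-op,
# so uniqueId and the local integer id both stay unchanged.
concept_set["version"] = "1.1.0"
assign_unique_id(concept_set)
```

The same helper could in principle be used to backfill existing concept sets, since it only writes the field when it is absent.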
2. organization (object)
Attribution to the organization that created/maintains the concept set.
{
  "metadata": {
    "organization": {
      "name": "INDICATE Consortium",
      "url": "https://indicate-eu.org"
    },
    ...
  }
}
This would make it clear where a concept set comes from when shared externally.
Example of a full metadata block after these changes
{
  "id": 1,
  "name": "3-minute Diagnostic Interview for CAM-defined Delirium (3D-CAM) score",
  "version": "1.0.0",
  "metadata": {
    "uniqueId": "550e8400-e29b-41d4-a716-446655440000",
    "organization": {
      "name": "INDICATE Consortium",
      "url": "https://indicate-eu.org"
    },
    "translations": { "..." : "..." },
    "createdByDetails": { "..." : "..." },
    "reviews": [],
    "versions": []
  }
}
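If these fields are adopted, a small check could catch malformed values before a concept set is shared. The sketch below is one possible shape for it. The function name `check_metadata` is hypothetical, and treating both `organization.name` and `organization.url` as required is an assumption of this example; the proposal itself does not say which fields are mandatory.

```python
import uuid
from urllib.parse import urlparse


def check_metadata(concept_set: dict) -> list[str]:
    """Return a list of problems found in the two proposed metadata fields.

    Hypothetical validator for illustration only.
    """
    problems = []
    metadata = concept_set.get("metadata", {})

    unique_id = metadata.get("uniqueId")
    try:
        parsed = uuid.UUID(unique_id)
        # Require version 4 and the canonical hyphenated form.
        if parsed.version != 4 or str(parsed) != unique_id.lower():
            problems.append("uniqueId is not a canonical UUID v4")
    except (TypeError, ValueError, AttributeError):
        problems.append("uniqueId is missing or not a UUID")

    organization = metadata.get("organization")
    if not isinstance(organization, dict):
        problems.append("organization is missing or not an object")
    else:
        # Assumption: both name and url are required.
        if not organization.get("name"):
            problems.append("organization.name is empty")
        url = urlparse(str(organization.get("url", "")))
        if url.scheme not in ("http", "https") or not url.netloc:
            problems.append("organization.url is not an http(s) URL")
    return problems


example = {
    "id": 1,
    "version": "1.0.0",
    "metadata": {
        "uniqueId": "550e8400-e29b-41d4-a716-446655440000",
        "organization": {
            "name": "INDICATE Consortium",
            "url": "https://indicate-eu.org",
        },
    },
}
incomplete = {"id": 2, "metadata": {"uniqueId": "not-a-uuid"}}
```

Here `check_metadata(example)` returns an empty list, while `check_metadata(incomplete)` reports both the invalid `uniqueId` and the missing `organization`.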
Design decisions
- UUID v4 for uniqueId (not a content hash): stable across edits, generated offline, universally supported.
- uniqueId inside metadata: respects the current OHDSI Concept Set Specification, which treats metadata as extensible. This avoids deviating from the spec while we propose changes upstream.
Open questions
- Additional organization fields? Should we add fields like contactEmail, license, or a ROR ID?
References
Next step: If we reach consensus on this proposal, we will open an issue on OHDSI/TAB to propose extending the official Concept Set Specification accordingly.