Unifying Subject-Sample mapping files for HDD data  #56

@nboukharov

Description

Subject-Sample mapping files for Expression, Metabolomics, MIRNA_QPCR, MIRNA_SEQ, Protein, RBM and RNASeq data have slight differences that create unnecessary issues for curators. We would like to have one format for all HDD data mapping files, the same as is used for Expression data: STUDY_ID, SITE_ID, SUBJECT_ID, SAMPLE_ID, PLATFORM, TISSUETYPE, ATTR1, ATTR2, CATEGORY_CD, SOURCE_CD.
The only mandatory fields should be STUDY_ID, SUBJECT_ID, SAMPLE_ID, PLATFORM and CATEGORY_CD. Other columns should be allowed to be null. If a specific loading procedure requires one of the optional columns to have a value, a default value should be inserted (e.g. "Unknown" for TISSUETYPE, "STD" for CATEGORY_CD). The unified mapping file loading procedure should be backward compatible and flexible: both ATTR1 and ATTRIBUTE_1, and both STUDY_ID and TRIAL_NAME, should be accepted as names for the respective columns. All "tokens" (SITE_ID, PLATFORM, TISSUETYPE, ATTR1, ATTR2) should be usable in CATEGORY_CD in any order (ATTR1 does not need a value for ATTR2 to be used).
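
To make the request concrete, here is a minimal Python sketch of what such a unified reader could look like. It is an illustration only, not the existing loader. The function names (`normalize_mapping`, `expand_category`), the tab delimiter, the `ATTRIBUTE_2` alias (added by analogy with `ATTRIBUTE_1`), and the `+` separator between CATEGORY_CD segments are all assumptions, not taken from the current loading code.

```python
import csv
import io

# Canonical column order proposed in this issue (the Expression format).
CANONICAL_COLUMNS = [
    "STUDY_ID", "SITE_ID", "SUBJECT_ID", "SAMPLE_ID", "PLATFORM",
    "TISSUETYPE", "ATTR1", "ATTR2", "CATEGORY_CD", "SOURCE_CD",
]
MANDATORY = ["STUDY_ID", "SUBJECT_ID", "SAMPLE_ID", "PLATFORM", "CATEGORY_CD"]

# Legacy header names mapped to canonical ones.
# ATTRIBUTE_2 is an assumption made by analogy with ATTRIBUTE_1.
ALIASES = {"TRIAL_NAME": "STUDY_ID", "ATTRIBUTE_1": "ATTR1", "ATTRIBUTE_2": "ATTR2"}

# Column names that may appear as placeholders inside CATEGORY_CD.
TOKENS = ("SITE_ID", "PLATFORM", "TISSUETYPE", "ATTR1", "ATTR2")


def normalize_mapping(text, defaults=None, delimiter="\t"):
    """Parse a mapping file into rows keyed by canonical column names.

    `defaults` holds the values a specific loader needs for optional
    columns, e.g. {"TISSUETYPE": "Unknown"}.
    """
    defaults = defaults or {}
    reader = csv.DictReader(io.StringIO(text), delimiter=delimiter)
    rows = []
    for line_no, raw in enumerate(reader, start=2):  # line 1 is the header
        row = {col: "" for col in CANONICAL_COLUMNS}
        for header, value in raw.items():
            if header is None:  # extra cells beyond the header row
                continue
            name = header.strip().upper()
            name = ALIASES.get(name, name)
            if name in row:
                row[name] = (value or "").strip()
        missing = [col for col in MANDATORY if not row[col]]
        if missing:
            raise ValueError(
                f"line {line_no}: missing mandatory value(s): {', '.join(missing)}"
            )
        # Fill optional columns only where the loader asked for a default.
        for col, default in defaults.items():
            if not row[col]:
                row[col] = default
        rows.append(row)
    return rows


def expand_category(row, sep="+"):
    """Replace token segments in CATEGORY_CD with the row's values.

    Tokens may appear in any order; a token with an empty value is dropped.
    The "+" separator is an assumption of this sketch.
    """
    parts = []
    for segment in row["CATEGORY_CD"].split(sep):
        value = row[segment] if segment in TOKENS else segment
        if value:
            parts.append(value)
    return sep.join(parts)
```

With this sketch, a file that uses the legacy `TRIAL_NAME` and `ATTRIBUTE_1` headers, omits SITE_ID, TISSUETYPE and SOURCE_CD entirely, and uses ATTR2 in CATEGORY_CD while ATTR1 is empty would still load, which is the behavior requested above.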
