Overture Playground Dataset  #167

@MitchellShiell

Outline and organize a representative open-access mock data set to include with the DMS.

  • Metadata fields should be representative of a cancer genomics research group
  • Should allow us to show all Arranger faceted search options (quick search, radial filters, date range)
  • Must comply with Song's base schema and base meta schema
  • For demo purposes, the metadata should exercise Song functionality, including the required analysis types (separate ticket) and publication states. Most data will be published, but we should include some unpublished and some suppressed records.
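
As a rough sketch of the publication-state mix described above, the snippet below assigns a state to each mock record. The state spellings and the 90/6/4 split are assumptions for illustration only, and `assign_states` is a hypothetical helper, not part of Song.

```python
import random
from collections import Counter

# Publication states named in this issue; the exact spelling Song uses is assumed.
STATES = ("PUBLISHED", "UNPUBLISHED", "SUPPRESSED")


def assign_states(n_records, pct_unpublished=6, pct_suppressed=4, seed=0):
    """Return a shuffled list of states with a fixed, reproducible split.

    The percentages are hypothetical defaults: most records are published,
    with a small number unpublished and suppressed.
    """
    n_unpublished = n_records * pct_unpublished // 100
    n_suppressed = n_records * pct_suppressed // 100
    n_published = n_records - n_unpublished - n_suppressed

    states = (
        ["PUBLISHED"] * n_published
        + ["UNPUBLISHED"] * n_unpublished
        + ["SUPPRESSED"] * n_suppressed
    )
    # Seeded shuffle so the mock data set is the same on every run.
    random.Random(seed).shuffle(states)
    return states


states = assign_states(550)
counts = Counter(states)
```

Fixing the counts up front, rather than sampling each record independently, guarantees that every state appears in the demo data regardless of the seed.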


  • File data should be only as large as necessary (the smaller, the better)
  • File data should support all JBrowse functionality (VCFs, BAMs, BCFs, BEDs, etc.)
  • To show Arranger's pagination, we are looking for ~550 records; however, we do not need unique file data for all ~550 records
  • For demo purposes, the file data should also exercise Score functionality (BAM/CRAM slicing, etc.)
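
One possible way to reach ~550 records without unique file data is to cycle through a small pool of files. This is a sketch only: the file names are placeholders, and the page size of 20 is an assumed value rather than a confirmed Arranger setting.

```python
import math
from itertools import cycle

# Hypothetical pool of small demo files; the names are placeholders, not real paths.
FILE_POOL = ["demo_01.vcf.gz", "demo_02.bam", "demo_03.bcf", "demo_04.bed"]


def assign_files(n_records, pool):
    """Pair each record index with a file, reusing the pool round-robin."""
    return [
        {"record": index, "file": file_name}
        for index, file_name in zip(range(n_records), cycle(pool))
    ]


def page_count(n_records, page_size):
    """Number of pages needed to display n_records at page_size rows per page."""
    return math.ceil(n_records / page_size)


records = assign_files(550, FILE_POOL)
pages = page_count(len(records), page_size=20)  # assumed page size
```

With four shared files and 550 records, each file is reused roughly 137 times, so the repository only needs to hold a handful of small files.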

Metadata Tasks

  • Outline required metadata fields
  • Review Outline
  • Produce Song Schema
  • Generate metadata payloads
  • Review schema + update Google Sheet (Mitchell)

File Data Tasks

  • Outline required file data
  • Review Outline
  • Organize/generate file data (synthetic VCFs + links to open-access file data)

General Tasks

  • Place the Song schemas, metadata payloads, and dummy file data (with links to open-access file data) in a folder within the DMS repo, and link to the relevant PR

Links

Co-ordination doc
Song Metadata Fields
Base Schema Updates
Link to issue
