JBrowse Configuration
The main configuration file (the "JBrowse metadata file") is in the pombase-config Git repo.
This file is included in every nightly load.
users see (in documentation, submission template, etc.): Data type
e.g. "Chromatin binding" or "Poly(A) sites"
If "yes" this track will be displayed in JBrowse. Use "no" for tracks that aren't ready. Hidden from users.
If "yes" the label will be shown beside the feature in the track. Hidden from users.
users see: Track label
The track label displayed to the user and also the key used by JBrowse to identify tracks internally. Labels can't include commas (use semicolons to separate parts).
A free-text comment. Ignored by JBrowse.
Only required for chromatin binding data to specify the protein binding to chromatin.
users see: Strain background
Any background alleles present in cells for technical purposes; can include mating type, ploidy, markers, etc. For vegetative cells haploid is assumed unless diploid is specified.
e.g. h90 or h+/h+ pat1-114 or h- ura4-D18 leu1-32
users see: WT or mutant
Notes whether the strain is wild type (WT) or has one or more mutations of experimental interest. WT designation ignores background mutations.
users see: Mutant alleles
Specify mutant alleles of interest to the experiment.
e.g. prp2-1 or pcr1delta
e.g. YES, high temperature or glucose MM
e.g. vegetative growth or meiosis or quiescence or glucose starvation or oxidative stress or heat shock
"forward" or "reverse". The default is no strand.
e.g. "DamID / tiling array", "NGS" or "mRNA end sequencing"
PubMed ID if known.
e.g. GEO or ArrayExpress
e.g. GSE41773 or E-MTAB-1154
e.g. GSM1024004 or ERS078453
"bigwig", "rnaseq", "GFF", "bed" or "vcf".
The data files for "rnaseq" should be .bam format.
The URL of the data file on pombase.org.
Raw files from publications should be placed on oliver1 in a subdirectory of /data/pombase/external_datasets/originals/ like Thodberg_2018_PMID_30566651.
These original files often need to be fixed so that they use the correct chromosome IDs. Once fixed, they need to be processed for use by JBrowse (sorted, compressed and indexed, depending on the format). Once processed, they should be copied to a subdirectory of /data/pombase/external_datasets/processed/, for example Thodberg_2018_PMID_30566651.
Name directories following the convention: Author_YYYY_PMID_nnnnnnn_(optional-short-description).
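The naming convention can be checked mechanically before a directory is created. The following is a minimal sketch, not part of the PomBase tooling: the function name `parse_dataset_dir` and the exact character classes allowed in the author and description parts are assumptions made for illustration.

```python
import re

# Assumed reading of the convention Author_YYYY_PMID_nnnnnnn_(optional-short-description):
# an author name, a 4-digit year, the literal "PMID", a numeric PubMed ID,
# and an optional trailing description. The allowed characters are a guess.
DATASET_DIR_RE = re.compile(
    r"(?P<author>[A-Za-z][A-Za-z-]*)"
    r"_(?P<year>\d{4})"
    r"_PMID_(?P<pmid>\d+)"
    r"(?:_(?P<description>[A-Za-z0-9_-]+))?"
)


def parse_dataset_dir(name):
    """Return the parts of a dataset directory name, or None if it doesn't fit."""
    match = DATASET_DIR_RE.fullmatch(name)
    return match.groupdict() if match else None


if __name__ == "__main__":
    print(parse_dataset_dir("Thodberg_2018_PMID_30566651"))
```

A name that does not follow the pattern (for example, one with a two-digit year) yields `None`, which makes it easy to flag before files are copied.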
The URL should begin with https://www.pombase.org/external_datasets/, then have the dataset directory name (Thodberg_2018_PMID_30566651), then the file name (GSE110976_TSSs.sorted.bed.gz).
So if the data file has this path:
/data/pombase/external_datasets/processed/Thodberg_2018_PMID_30566651/GSE110976_TSSs.sorted.bed.gz
use this URL:
https://www.pombase.org/external_datasets/Thodberg_2018_PMID_30566651/GSE110976_TSSs.sorted.bed.gz
(Note that the URL doesn't include the "processed" path component.)
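The path-to-URL mapping described above can be sketched as a small helper. This is an illustrative sketch only: the function name `source_url_for` and the two constants are hypothetical, and the code assumes the file already sits under the processed directory given in the text.

```python
from pathlib import PurePosixPath

# Both values are taken from the text above; the constant names are made up.
PROCESSED_ROOT = PurePosixPath("/data/pombase/external_datasets/processed")
URL_PREFIX = "https://www.pombase.org/external_datasets/"


def source_url_for(path):
    """Map a processed data file path to its public URL.

    Raises ValueError if the path is not under PROCESSED_ROOT.
    """
    # relative_to() strips the processed root, so "processed" never
    # appears in the resulting URL.
    relative = PurePosixPath(path).relative_to(PROCESSED_ROOT)
    return URL_PREFIX + relative.as_posix()


if __name__ == "__main__":
    print(source_url_for(
        "/data/pombase/external_datasets/processed/"
        "Thodberg_2018_PMID_30566651/GSE110976_TSSs.sorted.bed.gz"
    ))
```

Passing a path from the originals directory raises `ValueError`, which guards against accidentally publishing a URL for an unprocessed file.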
See Formatting-data-files-for-JBrowse for notes on processing files for JBrowse.
Users submit metadata in a file that collects entries for most of the columns (see https://www.pombase.org/documentation/data-submission-form-for-HTP-sequence-linked-data); we have to add display_in_jbrowse, show_feature_label, and source_url.
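As a rough illustration of that curation step, the sketch below adds the three curator-supplied columns to one submitted row. It assumes a row is held as a Python dict keyed by column name; the function name `add_curator_columns` and its defaults are hypothetical, and how the metadata file is actually read and written is not covered here.

```python
def add_curator_columns(row, source_url, display=False, show_label=False):
    """Return a copy of a submitted row with the curator-only columns filled in.

    The columns take "yes"/"no" values as described above; defaulting
    display_in_jbrowse to "no" is an assumption, chosen so that a track
    stays hidden until it is ready.
    """
    completed = dict(row)  # leave the submitted row untouched
    completed["display_in_jbrowse"] = "yes" if display else "no"
    completed["show_feature_label"] = "yes" if show_label else "no"
    completed["source_url"] = source_url
    return completed


if __name__ == "__main__":
    submitted = {"pubmed_id": "30566651"}  # hypothetical column name
    print(add_curator_columns(
        submitted,
        "https://www.pombase.org/external_datasets/"
        "Thodberg_2018_PMID_30566651/GSE110976_TSSs.sorted.bed.gz",
    ))
```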
Two columns, ensembl_source_name and short_description, were carried over from the Ensembl Genomes configuration file for a while but have now been deprecated and removed.
PomBase is funded by the Wellcome Trust