Data Organization

Directory Structure

repo/
├── data/                    # Raw and processed data
│   ├── raw/                # Original, unmodified data
│   ├── processed/          # Cleaned, intermediate datasets
│   └── metadata/           # Sample info, metadata, mappings
├── results/                # Analysis outputs, statistics, tables
├── figures/                # Generated plots and visualizations
├── src/                    # Source code and scripts
│   ├── data_processing/    # Data cleaning and preparation
│   ├── analysis/           # Statistical analysis
│   └── utils/              # Helper functions
├── paper/                  # Manuscripts and writeups
└── README.md               # Project overview
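
The layout above could be bootstrapped with a short script. This is only a sketch: the function name `create_layout` and the `LAYOUT` list are illustrative and not part of the project, and Python is used here purely for convenience even though the project's analyses appear to be written in R.

```python
from pathlib import Path

# Directories copied from the tree above.
LAYOUT = [
    "data/raw",
    "data/processed",
    "data/metadata",
    "results",
    "figures",
    "src/data_processing",
    "src/analysis",
    "src/utils",
    "paper",
]


def create_layout(root):
    """Create the project skeleton under root; existing directories are left untouched."""
    root = Path(root)
    for rel in LAYOUT:
        (root / rel).mkdir(parents=True, exist_ok=True)
    # Create an empty README.md only if one does not already exist.
    readme = root / "README.md"
    if not readme.exists():
        readme.touch()
    return [root / rel for rel in LAYOUT]
```

Calling `create_layout("repo")` a second time is harmless, since `exist_ok=True` skips directories that are already present.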

File Naming

  • Use snake_case for all filenames
  • Include an ISO 8601 date (YYYY-MM-DD) in versioned analyses: analysis_2025-03-10.R
  • Separate logical units with underscores: combined_dose_data_processed.csv
  • Use meaningful descriptors; avoid names like data1.csv or final_final.csv
  • Data formats: .csv (comma-delimited text), .fst (fast storage), .rds (R objects), .tsv (tab-delimited text)

Example good names:

  • combined_dose_data_LFQ_only.fst
  • samples_model_data.tsv
  • kinome_profiling_novartis_combined.fst
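
One way to enforce these naming rules is a small check that can run before files are committed. The sketch below is an assumption about how the rules might be encoded, not a tool the project ships: `check_filename`, the regular expressions, and the "vague name" patterns are all hypothetical. It deliberately accepts uppercase acronyms and hyphenated dates, because the examples above include `LFQ` and `2025-03-10`.

```python
import re

# Extensions listed under "Data formats", plus .R for analysis scripts.
ALLOWED_EXTENSIONS = {".csv", ".fst", ".rds", ".tsv", ".R"}

# Words joined by underscores; hyphens are tolerated so ISO dates pass.
NAME_PATTERN = re.compile(r"^[A-Za-z0-9]+(?:[_-][A-Za-z0-9]+)*$")

# Hypothetical patterns for the names the guide warns against.
VAGUE_PATTERNS = [re.compile(r"^data\d*$"), re.compile(r"^(final_?)+$")]


def check_filename(filename):
    """Return a list of problems found in filename; an empty list means it passes."""
    stem, dot, ext = filename.rpartition(".")
    if not dot or not stem:
        return ["missing stem or extension"]
    problems = []
    if "." + ext not in ALLOWED_EXTENSIONS:
        problems.append(f"unexpected extension .{ext}")
    if not NAME_PATTERN.match(stem):
        problems.append("stem is not underscore-separated words")
    if any(p.match(stem) for p in VAGUE_PATTERNS):
        problems.append("name is not descriptive")
    return problems
```

For instance, `check_filename("samples_model_data.tsv")` returns an empty list, while `check_filename("data1.csv")` reports that the name is not descriptive.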

Data Files

  • Raw data is stored in data/raw/ and is never modified
  • Processed data is stored in data/processed/; the scripts that create it are tracked in version control
  • Metadata and mappings go in data/metadata/, or in explicitly named *_mapping.csv or *_metadata.tsv files
  • Create a *_mapping.csv or *_metadata.tsv file for sample IDs, library IDs, and similar lookups
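
The "never modified" rule for data/raw/ can be checked mechanically by comparing checksums over time. The guide does not prescribe this; the sketch below is one possible approach, and the helper names (`sha256_of`, `snapshot`, `changed_files`) are made up for illustration.

```python
import hashlib
from pathlib import Path


def sha256_of(path):
    """Hash a file in 1 MiB chunks so large raw files need not fit in memory."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(1 << 20), b""):
            digest.update(chunk)
    return digest.hexdigest()


def snapshot(raw_dir):
    """Map each file under raw_dir (relative POSIX path) to its SHA-256 digest."""
    raw_dir = Path(raw_dir)
    return {
        p.relative_to(raw_dir).as_posix(): sha256_of(p)
        for p in sorted(raw_dir.rglob("*"))
        if p.is_file()
    }


def changed_files(old, new):
    """Return paths that were added, removed, or modified between two snapshots."""
    return sorted(k for k in old.keys() | new.keys() if old.get(k) != new.get(k))
```

A snapshot taken when raw data is first deposited could be saved alongside the metadata and compared against a fresh `snapshot("data/raw")` later; any path returned by `changed_files` indicates that the raw directory no longer matches its original state.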