Every project should have a README with sections in this order:
# Project Name
Brief 1-2 sentence description of what this project does.
## Overview
Expanded description: problem it solves, methods used, key findings.
## Quick Start
Commands to run the analysis from scratch:
- List dependencies (R packages, Python packages, external tools)
- Installation steps if non-standard
- Command to reproduce main results
## Directory Structure
Brief description of key folders (link to docs if detailed).
See [data-organization.md](./data-organization.md).
## Key Files
| File | Purpose |
|------|---------|
| analysis/main.R | Primary analysis |
| R/utils.R | Helper functions |
## Data
Source of raw data, any preprocessing notes, where to find metadata mappings.
## Results
Summary of main outputs. Link to manuscripts or figure descriptions.
## References
Citations to papers, external data sources, related projects.
### Setting up bibliography
All new projects should include the lab master bibliography as a git submodule:
```bash
git submodule add https://github.com/cujoisa/master_bibliography refs/
```

In Quarto `.qmd` files, reference it in the YAML frontmatter:

```yaml
---
title: "Analysis Title"
bibliography: refs/master_bibliography/master_compressed.bib
---
```

Then cite in text using `[@citation_key]` syntax:

```markdown
This analysis uses limma [@Ritchie2015limma] and pathway analysis [@Subramanian2005gsea].
```

Quarto will automatically format citations and generate a References section at the end.

How to update or extend this project.
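
The section order listed at the start of this guide can be checked mechanically. The following is a minimal sketch, not an established lab tool: `check_readme_sections` and `REQUIRED_SECTIONS` are hypothetical names, and the check only looks at level-2 headings (`## Title`). It does not skip headings that appear inside fenced code blocks.

```python
import re

# Level-2 section names, in the order given by the README template.
REQUIRED_SECTIONS = [
    "Overview",
    "Quick Start",
    "Directory Structure",
    "Key Files",
    "Data",
    "Results",
    "References",
]


def check_readme_sections(readme_text: str) -> list[str]:
    """Return a list of problems; an empty list means the README conforms."""
    # Collect level-2 headings in the order they appear ("###" is not matched).
    found = re.findall(r"^##[ \t]+(.+?)[ \t]*$", readme_text, flags=re.MULTILINE)

    problems = [
        f"missing section: {name}" for name in REQUIRED_SECTIONS if name not in found
    ]

    # Compare the relative order of the required sections that are present.
    present = [name for name in found if name in REQUIRED_SECTIONS]
    expected = [name for name in REQUIRED_SECTIONS if name in found]
    if present != expected:
        problems.append("sections out of order: " + ", ".join(present))
    return problems
```

A typical call would read the file first, for example `check_readme_sections(open("README.md").read())`, and treat a non-empty result as a failed check.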
## Code Comments
- Comment the **why**, not the **what**
  - Bad: `x <- x + 1 # Add 1 to x`
  - Good: `# Adjust for batch effect by adding offset`
- Section headers: `# ---- Data Loading ----` with 4 dashes on each side
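
A consistent section-header format also makes scripts easy to index. As a rough sketch, the pattern below recognizes headers written like `# ---- Data Loading ----`; it assumes exactly four dashes on each side, as in the example above. `list_sections` is a hypothetical helper name.

```python
import re

# Matches lines such as "# ---- Data Loading ----" (four dashes on each side).
SECTION_HEADER = re.compile(r"^#\s-{4}\s(.+?)\s-{4}\s*$")


def list_sections(script_text: str) -> list[str]:
    """Return section titles in the order they appear in a script."""
    titles = []
    for line in script_text.splitlines():
        match = SECTION_HEADER.match(line.strip())
        if match:
            titles.append(match.group(1))
    return titles
```

Running it over a script gives a quick outline, and a header that is missing from the outline is usually one that does not follow the convention.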
## R Function Documentation
Use roxygen comments for functions:
```r
#' Calculate response metric
#'
#' @param dose_response tibble with columns dose_nM, response_pct
#' @param threshold numeric, response cutoff (default 50)
#'
#' @return tibble with responders classified
#' @examples
#' calculate_response(dose_data, threshold = 20)
#'
#' @export
calculate_response <- function(dose_response, threshold = 50) {
  # Implementation
}
```

## Python Function Documentation

Use Google-style docstrings:

```python
import pandas as pd


def calculate_response(dose_response: pd.DataFrame, threshold: float = 50) -> pd.DataFrame:
    """Calculate response metric.

    Args:
        dose_response: DataFrame with columns dose_nM, response_pct
        threshold: Response cutoff percentage (default 50)

    Returns:
        DataFrame with responders classified

    Examples:
        >>> result = calculate_response(dose_data, threshold=20)
    """
    # Implementation
```

For data files, include a README or metadata file:
- `samples.tsv` → paired with `samples_metadata.md` or `_sample_key.txt`
- `combined_dose_data.csv` → describe columns, units, missing value codes
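
A metadata file for a CSV can start from a generated stub that is then filled in by hand. The sketch below uses only the standard library. `column_metadata_stub` is a hypothetical helper, the `TODO` cells are placeholders to replace manually, and the default missing-value codes (`NA` and the empty string) are assumptions to adjust per dataset.

```python
import csv
import io


def column_metadata_stub(csv_text: str, missing_codes=("NA", "")) -> str:
    """Build a Markdown table with one row per CSV column for manual annotation."""
    reader = csv.reader(io.StringIO(csv_text))
    header = next(reader, [])
    # Skip completely blank lines, which csv.reader yields as empty lists.
    rows = [row for row in reader if row]

    lines = [
        "| Column | Description | Units | Missing values |",
        "|--------|-------------|-------|----------------|",
    ]
    for index, name in enumerate(header):
        # Count cells that match one of the assumed missing-value codes.
        n_missing = sum(
            1 for row in rows if index < len(row) and row[index] in missing_codes
        )
        lines.append(f"| {name} | TODO | TODO | {n_missing} of {len(rows)} |")
    return "\n".join(lines)
```

The returned text can be pasted into the metadata file next to the data, with descriptions and units written in place of each `TODO`.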