# Documentation Standards

## README.md Structure

Every project should have a README with sections in this order:

# Project Name

Brief 1-2 sentence description of what this project does.

## Overview

Expanded description: problem it solves, methods used, key findings.

## Quick Start

Commands to run the analysis from scratch:
- List dependencies (R packages, Python packages, external tools)
- Installation steps if non-standard
- Command to reproduce main results

## Directory Structure

Brief description of key folders (link to docs if detailed).
See [data-organization.md](./data-organization.md).

## Key Files

| File | Purpose |
|------|---------|
| analysis/main.R | Primary analysis |
| R/utils.R | Helper functions |

## Data

Source of raw data, any preprocessing notes, where to find metadata mappings.

## Results

Summary of main outputs. Link to manuscripts or figure descriptions.

## References

Citations to papers, external data sources, related projects.

### Setting up bibliography

All new projects should include the lab master bibliography as a git submodule:

```bash
git submodule add https://github.com/cujoisa/master_bibliography refs/
```

In Quarto `.qmd` files, reference the bibliography in the YAML front matter:

```yaml
---
title: "Analysis Title"
bibliography: refs/master_bibliography/master_compressed.bib
---
```

Then cite in text using the `[@citation_key]` syntax:

This analysis uses limma [@Ritchie2015limma] and pathway analysis [@Subramanian2005gsea].

Quarto will automatically format citations and generate a References section at the end.
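
Before rendering, it can help to confirm that every key cited in a document exists in the bibliography file. The sketch below is one possible check using only the Python standard library. The function names (`cited_keys`, `bib_keys`, `missing_keys`) are hypothetical, the BibTeX entry is a stub for illustration, and the regular expressions cover the common `[@key]` and `@type{key,` forms rather than the full Pandoc or BibTeX grammars.

```python
import re

# Matches Pandoc-style citation keys such as [@Ritchie2015limma].
# The lookbehind skips e-mail addresses like user@example.org.
CITATION_RE = re.compile(r"(?<![\w.])@([A-Za-z0-9_]+(?:[:.\-][A-Za-z0-9_]+)*)")

# Matches the key of a BibTeX entry such as "@article{Ritchie2015limma,".
BIB_ENTRY_RE = re.compile(r"@\w+\s*\{\s*([^,\s{}]+)\s*,")


def cited_keys(document_text: str) -> set[str]:
    """Return every citation key used in the document text."""
    return set(CITATION_RE.findall(document_text))


def bib_keys(bib_text: str) -> set[str]:
    """Return every entry key defined in the BibTeX text."""
    return set(BIB_ENTRY_RE.findall(bib_text))


def missing_keys(document_text: str, bib_text: str) -> list[str]:
    """Return cited keys that have no matching BibTeX entry, sorted."""
    return sorted(cited_keys(document_text) - bib_keys(bib_text))


# Illustration only: the sentence from this guide and a stub bibliography.
qmd = "This analysis uses limma [@Ritchie2015limma] and pathway analysis [@Subramanian2005gsea]."
bib = "@article{Ritchie2015limma,\n  title = {placeholder}\n}\n"
print(missing_keys(qmd, bib))
```

In a real project the two strings would be read from the `.qmd` file and from the `.bib` file in the submodule.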

## Contributing

How to update or extend this project.
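
The section order in the template above can be checked mechanically. The following is a minimal sketch, assuming every required section is a level-2 heading (`## Title`) and that `Contributing` is one of them. The function name `check_readme_sections` is hypothetical and not part of any existing lab tooling.

```python
import re

# Section headings from the README template, in the required order.
REQUIRED_SECTIONS = [
    "Overview", "Quick Start", "Directory Structure", "Key Files",
    "Data", "Results", "References", "Contributing",
]


def check_readme_sections(readme_text: str) -> list[str]:
    """Return a list of problems; an empty list means the README conforms."""
    # Collect level-2 headings ("## Title") in the order they appear.
    found = re.findall(r"^##[ \t]+(.+?)[ \t]*$", readme_text, flags=re.MULTILINE)
    problems = [f"missing section: {name}" for name in REQUIRED_SECTIONS if name not in found]
    # Compare the order of the required sections that are present.
    present = [name for name in found if name in REQUIRED_SECTIONS]
    expected = [name for name in REQUIRED_SECTIONS if name in present]
    if present != expected:
        problems.append("sections out of order: " + ", ".join(present))
    return problems


# Illustration only: a README with missing and misordered sections.
example = "# Demo\n\n## Overview\n\n## Data\n\n## Quick Start\n"
for problem in check_readme_sections(example):
    print(problem)
```

Extra sections and deeper headings such as `### Setting up bibliography` are ignored, so projects can still add material without failing the check.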


## Code Comments

- Comment the **why**, not the **what**
  - Bad: `x <- x + 1  # Add 1 to x`
  - Good: `# Adjust for batch effect by adding offset`
- Section headers: `# ---- Data Loading ----`, with four dashes on each side
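
The same conventions carry over to Python scripts. The short sketch below applies them to made-up plate readings; the variable names and the blank-well value are hypothetical and exist only to show the comment style.

```python
# ---- Data Loading ----
# Hypothetical plate readings; a real script would read these from a file.
raw_signal = [102.0, 98.5, 110.25]

# ---- Normalization ----
# A "what" comment would only restate the code: "subtract 5 from each value".
# The "why" comment below explains where the number comes from instead.
# Subtract the blank-well signal so values reflect the sample alone.
BLANK_WELL_SIGNAL = 5.0
corrected_signal = [value - BLANK_WELL_SIGNAL for value in raw_signal]
```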

## R Function Documentation

Use roxygen comments for functions:

```r
#' Calculate response metric
#'
#' @param dose_response tibble with columns dose_nM, response_pct
#' @param threshold numeric, response cutoff (default 50)
#'
#' @return tibble with responders classified
#' @examples
#' calculate_response(dose_data, threshold = 20)
#'
#' @export
calculate_response <- function(dose_response, threshold = 50) {
  # Implementation
}
```

## Python Docstrings

Use Google-style docstrings:

```python
import pandas as pd


def calculate_response(dose_response: pd.DataFrame, threshold: float = 50) -> pd.DataFrame:
    """Calculate response metric.

    Args:
        dose_response: DataFrame with columns dose_nM, response_pct
        threshold: Response cutoff percentage (default 50)

    Returns:
        DataFrame with responders classified

    Examples:
        >>> result = calculate_response(dose_data, threshold=20)
    """
    # Implementation
```
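
Examples written in `>>>` form can be run as tests with the standard-library `doctest` module, which keeps docstrings from drifting out of date. The sketch below shows a complete Google-style docstring on a hypothetical function, `classify_responders`. It uses a plain list of dicts instead of a DataFrame to stay dependency-free, and the classification rule (`response_pct >= threshold`) is an assumption made for illustration, not the lab's definition of a responder.

```python
import doctest


def classify_responders(records: list[dict], threshold: float = 50.0) -> list[dict]:
    """Flag each record as a responder or non-responder.

    Args:
        records: Rows with keys ``dose_nM`` and ``response_pct``.
        threshold: Response cutoff percentage (default 50).

    Returns:
        Copies of the input rows with an added boolean ``responder`` key.

    Raises:
        KeyError: If a row lacks ``response_pct``.

    Examples:
        >>> rows = [{"dose_nM": 10, "response_pct": 15.0},
        ...         {"dose_nM": 100, "response_pct": 62.5}]
        >>> [row["responder"] for row in classify_responders(rows, threshold=20)]
        [False, True]
    """
    # Assumed rule: a row is a responder when it meets or exceeds the cutoff.
    return [{**row, "responder": row["response_pct"] >= threshold} for row in records]


if __name__ == "__main__":
    # Runs the Examples section above as tests; silent when they all pass.
    doctest.testmod()
```

Adding a `Raises:` section is optional in Google style, but it is worth including whenever callers need to handle a specific failure.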

## Inline Metadata

For each data file, include a README or metadata file that documents it:

- `samples.tsv` → paired with `samples_metadata.md` or `_sample_key.txt`
- `combined_dose_data.csv` → describe columns, units, and missing value codes
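
A metadata file is easier to start from a generated skeleton than from a blank page. The sketch below reads the header row of a delimited file and produces a Markdown table with one row per column, leaving description, units, and missing value code as `TODO` for a person to fill in. The function name `metadata_skeleton` and the table layout are assumptions, not an existing lab convention, and the sample columns are made up.

```python
import csv
import io
from typing import TextIO


def metadata_skeleton(file_name: str, handle: TextIO, delimiter: str = ",") -> str:
    """Build a Markdown table with one row per column of a delimited file."""
    # Only the header row is read; the data rows are left untouched.
    header = next(csv.reader(handle, delimiter=delimiter))
    lines = [
        f"# Metadata for {file_name}",
        "",
        "| Column | Description | Units | Missing value code |",
        "|--------|-------------|-------|--------------------|",
    ]
    # Descriptive cells stay as TODO so that a person fills them in.
    lines += [f"| {column} | TODO | TODO | TODO |" for column in header]
    return "\n".join(lines) + "\n"


# Hypothetical two-column file standing in for combined_dose_data.csv.
sample = io.StringIO("dose_nM,response_pct\n10,15.0\n")
print(metadata_skeleton("combined_dose_data.csv", sample))
```

For a tab-separated file such as `samples.tsv`, pass `delimiter="\t"`. An empty file raises `StopIteration`, which this sketch does not handle.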