fchen13/catalyze

Catalyze Smart Catheter — Analysis Guide

NIH Catalyze · Pig biomarker and titration pipeline


1. Pipeline and run order

| Step | What | Input | Main outputs |
|---|---|---|---|
| 1 | ammonia_analysis.ipynb (or .py) | Master sheets/*.xlsx (10 workbooks) | output/ammonia_concentration.csv, output/co2_concentration.csv, output/ammonia_co2_ratio.csv |
| 2 | spline_interpolation.py (or .ipynb) | Titration_Data_*.xlsx (ACIDITY + ALKALINITY sheets) | output/naoh_pH11_results.csv, output/citric_acid_pH4_results.csv, output/interpolation_parameters_table.csv, output/titration_curves_healthy_vs_nonhealthy.png |
| 3 | biomarker_analysis.py (or .ipynb) | Titration workbook in repo root + CSVs in output/ | output/biomarker_summary.csv, by-pig / pooled PNGs in output/ |

Typical order: (1) ammonia notebook → (2) spline → (3) biomarker. Step 1 can run in parallel with step 2 if paths are set; step 3 needs outputs from both.
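
Because step 3 depends on the CSVs from steps 1 and 2, a quick pre-flight check can save a failed run. The sketch below is not part of the repository: the function name missing_step3_inputs is hypothetical, the file names are copied from the table above, and it assumes step 3 needs every listed CSV (it ignores the alternate citric file name and the PNG and parameter-table outputs).

```python
from pathlib import Path

# File names copied from the "Main outputs" column of the pipeline table.
STEP1_OUTPUTS = [
    "ammonia_concentration.csv",
    "co2_concentration.csv",
    "ammonia_co2_ratio.csv",
]
STEP2_OUTPUTS = [
    "naoh_pH11_results.csv",
    "citric_acid_pH4_results.csv",
]


def missing_step3_inputs(output_dir="output"):
    """Return the step 1 and step 2 CSVs that are absent from output_dir.

    Hypothetical helper: assumes step 3 requires all of these files.
    """
    out = Path(output_dir)
    return [name for name in STEP1_OUTPUTS + STEP2_OUTPUTS
            if not (out / name).is_file()]


if __name__ == "__main__":
    missing = missing_step3_inputs()
    print("Ready for step 3" if not missing else f"Missing: {missing}")
```

An empty list means both upstream steps have written their CSVs to the checked folder.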

Shared code: All loading, splines, stats, and plotting live in shared_utils.py.

pip install -r requirements.txt

2. Quick start

From the catalyze directory (with Master sheets/, titration .xlsx in the repo root, and output/ for artifacts):

# Step 1 via the CLI (alternative: run ammonia_analysis.ipynb): write ammonia/CO2/ratio CSVs to output/
python ammonia_analysis.py --no-plots

# Then run spline + biomarker:
python spline_interpolation.py
python biomarker_analysis.py

Scripts create output/ if needed (shared_utils.DEFAULT_OUTPUT_DIR). The titration workbook is read from shared_utils.CATALYZE_DIR (same folder as shared_utils.py) unless you pass absolute paths.

Notebooks (ammonia_analysis.ipynb, spline_interpolation.ipynb, biomarker_analysis.ipynb) mirror the same logic for interactive use.

CLI overrides (common)

python ammonia_analysis.py --master-sheets-dir "Master sheets" --output-dir "output" --no-plots
python spline_interpolation.py --titration-xlsx "Titration_Data_02052026.xlsx" --output-dir "output"
python biomarker_analysis.py --titration-xlsx "Titration_Data_02052026.xlsx" --input-dir "output" --output-dir "output"

Each script supports --help.


3. Spline interpolation (step 2)

  • NaOH @ pH 11: PCHIP (monotonic) on ACIDITY sheet — NaOH vs pH.
  • Citric @ pH 4: Linear interpolation on ALKALINITY sheet.
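
As a rough illustration of the two interpolation modes above, the sketch below reads a titrant amount off a measured curve at a target pH. It is not the code in shared_utils.py: the function name volume_at_target_ph and the sample numbers are made up for illustration, and it assumes the pH readings are distinct (PCHIP needs strictly increasing x values). SciPy's PchipInterpolator and NumPy's interp are the only library calls used.

```python
import numpy as np
from scipy.interpolate import PchipInterpolator


def volume_at_target_ph(ph, volume, target_ph, method="pchip"):
    """Interpolate the titrant amount at target_ph from (pH, amount) pairs.

    Hypothetical helper. Returns NaN when target_ph lies outside the
    measured pH range instead of extrapolating.
    """
    ph = np.asarray(ph, dtype=float)
    volume = np.asarray(volume, dtype=float)
    order = np.argsort(ph)  # both interpolators expect increasing x
    ph, volume = ph[order], volume[order]
    if method == "pchip":
        # PCHIP is shape-preserving: it does not overshoot monotonic data.
        interpolator = PchipInterpolator(ph, volume, extrapolate=False)
        return float(interpolator(target_ph))
    # Linear interpolation; left/right=NaN disables flat extrapolation.
    return float(np.interp(target_ph, ph, volume, left=np.nan, right=np.nan))


# Made-up titration points, for illustration only.
ph_readings = [6.0, 8.0, 10.0, 12.0]
titrant = [0.0, 1.0, 2.0, 3.0]
print(volume_at_target_ph(ph_readings, titrant, 11.0))            # inside range
print(volume_at_target_ph(ph_readings, titrant, 13.0))            # nan: out of range
print(volume_at_target_ph(ph_readings, titrant, 5.0, "linear"))   # nan: out of range
```

Returning NaN rather than extrapolating matches the troubleshooting note about curves that never reach the target pH, although the repository may handle that case differently.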

Default outputs (from spline_interpolation.py): naoh_pH11_results.csv, citric_acid_pH4_results.csv, interpolation_parameters_table.csv, titration_curves_healthy_vs_nonhealthy.png (300 DPI).

Edit constants at the top of spline_interpolation.py if needed: INPUT_FILE, SHEET_ACIDITY, SHEET_ALKALINITY, TARGET_pH_NAOH, TARGET_pH_CITRIC, HEALTHY_PIGS / NONHEALTHY_PIGS, and output filenames.

biomarker_analysis reads the citric results from citric_acid_pH4_results.csv, or from the alternate file name citric_pH4_results.csv if that is present.


4. Group definitions (all analyses)

HEALTHY_PIGS     = PIG-04, PIG-06, PIG-07, PIG-08
NONHEALTHY_PIGS  = PIG-01, PIG-02, PIG-03, PIG-05, PIG-09, PIG-10

5. Statistics (how to read outputs)

| p-value | Label | Symbol |
|---|---|---|
| < 0.001 | Highly significant | *** |
| < 0.01 | Very significant | ** |
| < 0.05 | Significant | * |
| ≥ 0.05 | Not significant | ns |

| Cohen’s d | Effect size |
|---|---|
| < 0.2 | None |
| 0.2–0.5 | Small |
| 0.5–0.8 | Medium |
| > 0.8 | Large |
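
The thresholds above translate into two small lookup functions. This is a sketch with hypothetical function names, not the repository's code. The table leaves the exact boundaries ambiguous (0.5 appears in two bands and 0.8 in none), so the sketch assumes a boundary value belongs to the higher band and that the sign of d is ignored.

```python
def significance_label(p):
    """Map a p-value to the star notation used in the table."""
    if p < 0.001:
        return "***"
    if p < 0.01:
        return "**"
    if p < 0.05:
        return "*"
    return "ns"


def effect_size_label(d):
    """Map Cohen's d to an effect-size band (assumed: boundaries round up)."""
    d = abs(d)  # assumption: only the magnitude matters for the label
    if d < 0.2:
        return "None"
    if d < 0.5:
        return "Small"
    if d < 0.8:
        return "Medium"
    return "Large"


print(significance_label(0.045), effect_size_label(0.46))  # * Small
print(significance_label(0.054), effect_size_label(1.14))  # ns Large
```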

Percent difference (reported in pipeline):
100 × (Healthy mean − Non-healthy mean) / Non-healthy mean
Positive ⇒ healthy higher.
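
A minimal worked example of the formula, using the phosphate group means quoted in section 6 (the function name percent_difference is a hypothetical stand-in, not necessarily what the pipeline calls it):

```python
def percent_difference(healthy_mean, nonhealthy_mean):
    """100 x (healthy - non-healthy) / non-healthy; positive means healthy is higher."""
    return 100.0 * (healthy_mean - nonhealthy_mean) / nonhealthy_mean


# Phosphate means from the summary in section 6 (mM).
print(round(percent_difference(176.95, 120.6), 1))  # 46.7
```

The non-healthy mean is the denominator, so swapping the arguments changes the magnitude as well as the sign.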

Tests: shared_utils.calculate_statistics uses independent t-test, Mann–Whitney U, and Cohen’s d (pooled SD).
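
For readers who want to reproduce these numbers outside the pipeline, the sketch below combines the three quantities with SciPy. It is an approximation of what calculate_statistics is described as doing, not a copy of it: the function name compare_groups and the returned keys are hypothetical, and it assumes a two-sided Student's t-test with equal variances and a two-sided Mann–Whitney U test, which may differ from the options used in shared_utils.py.

```python
import numpy as np
from scipy import stats


def compare_groups(healthy, nonhealthy):
    """Two-group comparison: t-test, Mann-Whitney U, Cohen's d (pooled SD)."""
    healthy = np.asarray(healthy, dtype=float)
    nonhealthy = np.asarray(nonhealthy, dtype=float)
    n1, n2 = len(healthy), len(nonhealthy)

    # Assumption: Student's t-test (equal variances), two-sided.
    _, t_p = stats.ttest_ind(healthy, nonhealthy)
    # Assumption: two-sided Mann-Whitney U.
    _, mw_p = stats.mannwhitneyu(healthy, nonhealthy, alternative="two-sided")

    # Pooled SD: sample variances (ddof=1) weighted by degrees of freedom.
    pooled_var = ((n1 - 1) * healthy.var(ddof=1)
                  + (n2 - 1) * nonhealthy.var(ddof=1)) / (n1 + n2 - 2)
    cohens_d = (healthy.mean() - nonhealthy.mean()) / np.sqrt(pooled_var)

    return {"t_p": float(t_p), "mw_p": float(mw_p), "cohens_d": float(cohens_d)}


# Made-up values, for illustration only.
result = compare_groups([2.0, 4.0, 6.0], [1.0, 3.0, 5.0])
print(result["cohens_d"])  # 0.5
```

If a past result cannot be reproduced, the equal-variance assumption and the one- versus two-sided setting are the first options to compare against the repository code.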


6. Biomarker results (summary)

| Rank | Biomarker | p (approx.) | Cohen’s d | Note |
|---|---|---|---|---|
| 1 | Phosphate | < 0.001 | ~1.14 | Primary discriminator |
| 2 | Creatinine | ~0.045 | ~0.46 | Weak |
| 3 | NaOH @ pH 11 | ~0.054 | ~0.46 | NS |
| 4 | Urine pH | ~0.62 | ~0.08 | NS |
| 5 | Min pH (if computed) | ~0.19 | ~0.28 | NS |
| 6 | Citric @ pH 4 | ~0.97 | ~0.01 | NS |

Phosphate (primary): healthy ~176.95 vs non-healthy ~120.6 mM; ~+47% in healthy; large effect.
Interpretation (short): Non-healthy animals show lower urinary phosphate, pointing to altered tubular phosphate handling; urine pH and citric buffering did not separate the groups, which argues against a generic “global metabolic failure” story and toward a selective tubular signal.

Manuscript snippets (templates):

  • Phosphate: “Urinary phosphate was lower in non-healthy animals (e.g. ~120.6 vs ~176.95 mM; p < 0.001, Cohen’s d ~ 1.1), consistent with altered tubular phosphate handling.”
  • pH (negative): “Direct urine pH did not differ between groups (p > 0.5), consistent with preserved baseline acid–base readout in this dataset.”

7. Key files

| File | Role |
|---|---|
| output/ | Default folder for CSV and PNG artifacts |
| shared_utils.py | Ammonia/CO₂ loaders, splines, biomarker stats, figures |
| ammonia_analysis.py | CLI: export ammonia/CO₂/ratio CSVs from Master sheets |
| spline_interpolation.py | CLI: NaOH + citric CSVs + plot |
| biomarker_analysis.py | CLI: full biomarker run → biomarker_summary.csv |
| *.ipynb | Interactive runs |

8. Extending with new biomarkers

  1. Inspect the Excel layout (pd.read_excel, sheet_name=…, print columns).
  2. Pool values across HEALTHY_PIGS / NONHEALTHY_PIGS (see read_phosphate_creatinine_by_pig, load_naoh_from_csv, etc. in shared_utils.py).
  3. Call calculate_statistics(healthy_arr, nonhealthy_arr).
  4. Add your step to run_full_biomarker_analysis (or a separate script) and append to the summary dict passed to create_biomarker_summary.

Reuse existing plot helpers (plot_biomarkers_by_pig_figure, …) for consistent styling.
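
Steps 1 and 2 above can be sketched as follows. This is an illustration under stated assumptions, not repository code: the helper pool_by_group and the column names "Pig" and "NewMarker" are hypothetical, and a small in-memory DataFrame with made-up values stands in for the result of pd.read_excel. The group lists are the ones given in section 4.

```python
import pandas as pd

# Group definitions from section 4.
HEALTHY_PIGS = ["PIG-04", "PIG-06", "PIG-07", "PIG-08"]
NONHEALTHY_PIGS = ["PIG-01", "PIG-02", "PIG-03", "PIG-05", "PIG-09", "PIG-10"]


def pool_by_group(df, pig_col, value_col):
    """Split one biomarker column into healthy / non-healthy arrays.

    Hypothetical helper: rows with missing values or pig IDs outside
    both groups are dropped.
    """
    values = df[[pig_col, value_col]].dropna()
    healthy = values.loc[values[pig_col].isin(HEALTHY_PIGS), value_col]
    nonhealthy = values.loc[values[pig_col].isin(NONHEALTHY_PIGS), value_col]
    return healthy.to_numpy(dtype=float), nonhealthy.to_numpy(dtype=float)


# Stand-in for pd.read_excel(path, sheet_name=...); values are made up.
df = pd.DataFrame({
    "Pig": ["PIG-04", "PIG-01", "PIG-06", "PIG-02", "PIG-99"],
    "NewMarker": [1.0, 2.0, 3.0, None, 5.0],
})
print(df.columns.tolist())  # step 1: inspect the layout

healthy_arr, nonhealthy_arr = pool_by_group(df, "Pig", "NewMarker")
print(healthy_arr, nonhealthy_arr)
# Step 3 in the repository would then be:
#   calculate_statistics(healthy_arr, nonhealthy_arr)
```

Dropping unknown pig IDs silently is a design choice for this sketch; raising an error instead would catch labelling typos earlier.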


9. Troubleshooting

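  • Import errors: install the dependencies with pip install -r requirements.txt (section 1).
  • Missing CSV: run the steps in order and check the working directory.
  • Sheet not found: list the available sheet names with pd.ExcelFile(path).sheet_names and compare them with SHEET_ACIDITY / SHEET_ALKALINITY.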
  • NaN at target pH: measured pH range may not reach the target (e.g. 11); inspect raw titration rows.
  • p or d disagrees with a past run: confirm same file version, same group labels, and same test (t-test vs Mann–Whitney).

License: Proprietary to the Catalyze project; internal use unless otherwise agreed.

When publishing, replace the summary numbers with those from your current biomarker_summary.csv and the locked analysis date.
