DTW DynaCLR Monorepo #398
Open
edyoshikun wants to merge 146 commits into modular-viscy-staging from
…thods (i.e. PHATE, PCA, UMAP)
Add normalization columns (norm_mean/std/median/iqr/max/min), z_focus_mean, and TCZYX shape columns to the cell index schema. preprocess_cell_index reads per-FOV zattrs and writes stats as parquet columns for fast per-row normalization at training time. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
- ExperimentRegistry.from_cell_index: build registry directly from preprocessed parquet + zarr metadata (no collection YAML needed)
- datamodule: cell_index_path as primary entry point; _train_final_crop changed from BatchedRandSpatialCropd to BatchedCenterSpatialCropd (random crop for Z/XY translation is now a user-configured augmentation)
- dataset: read norm stats from parquet columns, build_norm_meta fallback
- index: _align_parquet_columns, _resolve_dims from parquet Y/X_shape
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
- DynaCLR-3D-BagOfChannels-v2: z_window=32, yx_patch=256, RandSpatialCrop(40,228,228) after affine for Z focus invariance + XY translation, CenterCrop(32,160,160) auto-appended. batch_size=256, 2 GPUs, 2-day wall time.
- Add dataloader_demo.py: Jupyter-style visualization of raw vs augmented anchor/positive batches with per-sample metadata
- Update demo configs and inspection scripts for new pipeline
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
np.nanmin/nanmax fail on scipy sparse arrays. Convert to dense before computing range stats so the command works on Seurat-exported anndata zarr stores. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
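The densify-before-stats fix can be sketched as follows; `dense_range_stats` is a hypothetical helper name, not the actual function in the codebase:

```python
import numpy as np
from scipy import sparse

def dense_range_stats(x):
    """Return (min, max): np.nanmin/np.nanmax do not work on scipy sparse
    matrices, so convert to a dense array first (hypothetical helper)."""
    if sparse.issparse(x):
        x = x.toarray()
    return float(np.nanmin(x)), float(np.nanmax(x))
```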
- CLI for running evals
- DAG for evals
- yaml files for evals
… 3 base callbacks
- model/contrastive_encoder_convnext_tiny.yml: ConvNeXt-Tiny class_paths
- model/dinov3_frozen_mlp.yml: frozen DINOv3 + MLP projection block
- augmentations/ops_2d_mild.yml: OPS-specific mild augmentation pipeline
- data/ops_gene_reporter.yml: OPS data defaults (patch sizes, sampling)
- train_linear_classifier() now returns a third value: raw val outputs (y_val, y_val_proba, classes) for downstream ROC curve plotting
- orchestrated run-linear-classifiers generates metrics_summary.pdf alongside the CSV: bar chart of AUROC/accuracy/F1 + per-task ROC curves
- Delete evaluate_dataset.py (argparse-based, not in CLI, superseded by orchestrator) and its example config
- Strip generate_comparison_report and its helpers from report.py; file is now CV-only
- Remove dead _detect_n_features() from cross_validation.py
- Update all callers of train_linear_classifier() to unpack the 3-tuple
- Update DAG doc and linear classifiers README
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
- FOVRecord.channel_markers: dict[str, str] maps zarr channel name to marker for a specific well (populated from Airtable channel_N_marker fields)
- ChannelEntry.wells: list[str] restricts a channel to a subset of wells; empty means valid in all wells
- build_collection auto-populates wells by comparing which wells have a non-None marker for each channel across all FOVRecords
- _build_experiment_tracks skips channel rows where ch.wells is non-empty and the current well is not in that set, preventing noise rows from mixed-plate experiments (e.g. viral sensor only in B/3, C/2)
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
The glob */*/* on zarr v3 stores yields zarr.json files (e.g. A/2/zarr.json) in addition to position directories. The previous check only stripped names starting with "." (.zattrs, .zgroup) but missed zarr.json. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
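The corrected filter can be sketched as a pure function over glob results; `filter_positions` is an illustrative name, not the actual helper:

```python
def filter_positions(names):
    """Drop zarr metadata entries from a */*/* glob over a plate store:
    dot-files (.zattrs, .zgroup) and zarr v3's zarr.json, keeping only
    position paths. Illustrative sketch of the described check."""
    def leaf(name):
        return name.rsplit("/", 1)[-1]
    return [n for n in names
            if not leaf(n).startswith(".") and leaf(n) != "zarr.json"]
```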
…ollection
- DynaCLR-2D-MIP-BagOfChannels: add viral_sensor + Phase3D for 2025_01_28, 2024_10_09, 2024_10_16; fix dragonfly tracks_path to point to inner zarr store (tracking.zarr/2024_08_14_...zarr)
- DynaCLR-3D-BagOfChannels-v2: add viral_sensor + Phase3D for 2025_01_28, 2024_10_09, 2024_10_16
- DynaCLR-3D-BagOfChannels-v3: new collection copied from v2 with dragonfly tracks_path fix; v2 left intact for running training job
- DynaCLR-BoC-lc-evaluation-v1: add viral_sensor for all datasets; add Phase3D for 2025_01_28
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
- Wire load_config to delegate to load_composed_config so eval configs support base: recipe inheritance (same mechanism as training configs)
- Extract shared eval settings into 4 recipes: predict.yml, reduce.yml, plot_infectomics.yml, linear_classifiers_infectomics.yml
- Slim down DynaCLR-2D-BagOfChannels-v3, DynaCLR-2D-MIP-BagOfChannels-v1, DINOv3-temporal-MLP-2D-BagOfChannels-v1, and test_evaluation configs to use base: references, eliminating copy-pasted 14-experiment annotation blocks and shared step configs
- Fix ONNX inference to use GPU (CUDAExecutionProvider) and suppress pthread_setaffinity_np noise with intra/inter_op_num_threads=1
- Switch CTC tracking SLURM script to gpu partition
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
- Fix \bbf[\b_] -> \bbf(\b|_): inside a character class, \b is a backspace character, not a word boundary - Add \bphc\b to detect phase-contrast (PhC) as label-free Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
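A minimal sketch of the corrected pattern (the IGNORECASE flag and the surrounding matcher are assumptions for illustration):

```python
import re

# Inside a character class, \b is the backspace character (\x08), not a
# word boundary, so r"\bbf[\b_]" never matched names like "bf_..." as
# intended. The fix uses an alternation of a real boundary or underscore,
# plus a \bphc\b branch for phase contrast.
LABEL_FREE = re.compile(r"\bbf(\b|_)|\bphc\b", re.IGNORECASE)
```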
pandas 3+ uses Arrow-backed strings by default, which breaks anndata's
zarr writer. Apply the same fix in two code paths:
- embedding_writer.py: replace select_dtypes("string") with per-column
isinstance checks for pd.StringDtype and Arrow-backed Categoricals
- zarr_utils.py: convert ArrowStringArray columns and index to object
dtype before calling append_to_anndata_zarr
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
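The per-column dtype conversion can be sketched like this; `arrow_strings_to_object` is an illustrative name, and the real code also handles Arrow-backed Categoricals:

```python
import pandas as pd

def arrow_strings_to_object(df: pd.DataFrame) -> pd.DataFrame:
    """Cast string-dtype columns (Arrow-backed by default in pandas 3)
    and a string-dtype index to object dtype so anndata's zarr writer
    can serialize the frame. Hypothetical sketch of the described fix."""
    out = df.copy()
    for col in out.columns:
        if isinstance(out[col].dtype, pd.StringDtype):
            out[col] = out[col].astype(object)
    if isinstance(out.index.dtype, pd.StringDtype):
        out.index = out.index.astype(object)
    return out
```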
- PHATE: default n_jobs from -1 (all cores) to 1 to prevent hogging shared SLURM nodes; exposed in PHATEConfig and compute_phate()
- Annotation: support (fov_name, t, track_id) join as fallback when both sides lack an 'id' column; normalize fov_name by stripping leading/trailing slashes to prevent join mismatches
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
For multiclass problems, compute one-vs-rest AUROC per class and report
as val_{class_name}_auroc columns in the results DataFrame.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
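The one-vs-rest reporting can be sketched as below. The real code likely uses sklearn's roc_auc_score; here a self-contained rank-based (Mann-Whitney) AUROC stands in, and all names are illustrative:

```python
import numpy as np

def _auroc(y_bin, scores):
    # Rank-based AUROC: mean rank of positives vs negatives (no tie handling).
    order = np.argsort(scores, kind="mergesort")
    ranks = np.empty(len(scores))
    ranks[order] = np.arange(1, len(scores) + 1)
    n_pos = y_bin.sum()
    n_neg = len(y_bin) - n_pos
    return (ranks[y_bin == 1].sum() - n_pos * (n_pos + 1) / 2) / (n_pos * n_neg)

def per_class_auroc(y_true, y_proba, classes):
    """One-vs-rest AUROC per class, keyed as val_{class_name}_auroc
    columns as described above (sketch)."""
    y_true = np.asarray(y_true)
    return {f"val_{c}_auroc": _auroc((y_true == c).astype(int), y_proba[:, i])
            for i, c in enumerate(classes)}
```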
- viscy-utils: add onnx, onnxscript to core deps; copairs to eval extras
- dynaclr: add tracking optional group (gurobipy, onnxruntime-gpu, py-ctcmetrics, tabulate, tracksdata) for CTC tracking benchmark
- Regenerate uv.lock
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
- index.py: replace O(N*tau) Python loop in _compute_valid_anchors with vectorized pd.MultiIndex.isin(); add fit=False predict-mode fast path that skips anchor computation; add precomputed_valid_anchors to clone_with_subset() to avoid redundant recomputation; accept cell_index_df to avoid double-reading parquet
- dataset.py: replace per-row loops in _build_match_lookup with groupby().indices; skip lookup build in predict mode; add organelle, well, microscope to exported metadata columns
- datamodule.py: tune defaults (num_workers=4, cache_pool=500MB, pin_memory=True, buffer_size=4); use vectorized MultiIndex.isin for FOV split; reuse pre-loaded cell_index_df from ExperimentRegistry
- experiment.py: from_cell_index returns (registry, dataframe) tuple so callers can reuse the DataFrame without re-reading from disk
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
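The vectorized anchor check can be sketched as follows, assuming an anchor at (track, t) is valid iff the row (track, t + tau) also exists; the column names and function are illustrative:

```python
import pandas as pd

def valid_anchors(df: pd.DataFrame, tau: int) -> pd.Series:
    """Vectorized replacement for a per-row O(N*tau) loop: membership of
    each row's future (track_id, t + tau) key is tested in one
    MultiIndex.isin() call (sketch; names assumed)."""
    idx = pd.MultiIndex.from_arrays([df["track_id"], df["t"]])
    future = pd.MultiIndex.from_arrays([df["track_id"], df["t"] + tau])
    return pd.Series(future.isin(idx), index=df.index)
```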
Use .get() with None default for transcriptome_anndata and skip the barcode join when it is absent, allowing embeddings on datasets that lack paired scRNA-seq. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
- Centralize cell_index_path to shared /hpc/projects/.../collections/ dir across all training configs
- MIP model: extend z_extraction_window 11->20, z_focus_offset 0.5->0.3, yx_patch_size 192->256, add BatchedRandSpatialCropd for Z-invariance
- 3D BoC: num_workers 2->4; SLURM time limit 2d->4d
- Collection: mark DynaCLR-2D-BagOfChannels-v3 as [LEGACY]; fix well assignments in BoC-lc-evaluation-v1 (add A/1 for 07_24, remove incorrect B/1 and B/2 from 01_28)
- Add new collections: annotated MIP subset, test subset, alfi-eval (ALFI mitosis, 3 cell lines), microglia-eval (5 perturbations), benchmark_2exp (dataloader profiling)
- predict.yml: add TQDMProgressBar callback (refresh_rate=10)
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
- evaluate.py: remove all SLURM script generation (_generate_*_sh, _slurm_header, _run_local*); replace with prepare_configs() that generates YAML configs and prints a JSON manifest to stdout; rename CLI command evaluate -> prepare-eval-configs; add MMD config generators
- evaluate_config.py: remove SlurmConfig; add MMDStepConfig and ComparisonSpec imports; split PlotStepConfig.color_by into per-exp and combined_color_by; update TaskSpec.marker_filters docstring for auto-expand behaviour
- cli.py: add prepare-eval-configs, check-evals, append-annotations, append-predictions, split-embeddings, compute-mmd, plot-mmd-heatmap, evaluate-tracking-accuracy commands
- split_embeddings.py: new CLI to split combined embeddings.zarr by experiment, replacing inline SLURM script logic
- check_evals.py: new CLI to print eval completion status from registry
- eval_registry.yaml: declarative registry of models to evaluate
- Delete 4 stale SLURM-era eval configs (SlurmConfig schema removed)
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Three modes for measuring embedding-space distribution shifts:
- Per-experiment (explicit comparison pairs, faceted by marker)
- Combined (pairwise cross-experiment with batch centering)
- Pooled (concatenates all experiments, BH FDR correction)
Core implementation:
- viscy_utils/evaluation/mmd.py: kernel MMD with median heuristic, Gaussian RBF kernel, unbiased estimator, and vectorized permutation test (avoids Python loops via binary label matrix multiplication)
- viscy_utils/evaluation/embedding_map.py: mAP via copairs for phenotypic profiling (optional dependency)
- evaluation/mmd/config.py: Pydantic config hierarchy for all three modes; temporal binning, shared bandwidth, balance_samples
- evaluation/mmd/compute_mmd.py: orchestrates the three analysis modes; computes activity_zscore = (mmd2 - null_mean) / null_std for cross-marker comparability; outputs per-marker CSV files
- evaluation/mmd/plotting.py: kinetics lines, heatmaps, activity z-score heatmaps, combined cross-experiment heatmaps, multi-panel grids, paired heatmaps with shared colorbar
- configs/evaluation/recipes/mmd_defaults.yml: shared algorithm defaults (1000 permutations, max 2000 cells, seed 42) for YAML inheritance
- tests/test_mmd.py: unit tests for MMD implementation
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
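The core estimator can be sketched as below: an unbiased MMD² with a Gaussian RBF kernel and a median-heuristic bandwidth. This is a minimal dense-matrix sketch, not the vectorized permutation-test implementation in viscy_utils/evaluation/mmd.py, and the exact median-heuristic variant is an assumption:

```python
import numpy as np

def rbf_mmd2(x, y, bandwidth=None):
    """Unbiased MMD^2 between samples x (n, d) and y (m, d) with a
    Gaussian RBF kernel; bandwidth defaults to a median heuristic
    over pairwise squared distances (one common variant)."""
    z = np.vstack([x, y])
    d2 = ((z[:, None, :] - z[None, :, :]) ** 2).sum(-1)
    if bandwidth is None:
        off = d2[~np.eye(len(z), dtype=bool)]  # exclude self-distances
        bandwidth = np.sqrt(np.median(off) / 2)
    k = np.exp(-d2 / (2 * bandwidth**2))
    n, m = len(x), len(y)
    # Unbiased within-sample terms drop the diagonal; cross term uses all pairs.
    kxx = (k[:n, :n].sum() - np.trace(k[:n, :n])) / (n * (n - 1))
    kyy = (k[n:, n:].sum() - np.trace(k[n:, n:])) / (m * (m - 1))
    kxy = k[:n, n:].mean()
    return kxx + kyy - 2 * kxy
```

Because the estimator is unbiased, it can dip slightly below zero for identical distributions; the permutation test described above turns raw MMD² values into p-values and activity z-scores.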
…ver-time
- orchestrated.py: when marker_filters is None, auto-discover all unique
obs["marker"] values and run one classifier per marker; save trained
pipelines as {task}_{marker}.joblib with manifest.json; add
_plot_f1_over_time for per-class F1 at each timepoint; output one
{task}_summary.pdf per task (was a single merged PDF)
- orchestrated_test.py: update fixtures to expect 2 rows per task with
auto-expansion; add test for sparse-marker skipping and F1-over-time
plot generation
- append_annotations.py: new CLI to persist ground-truth annotation
columns directly into per-experiment zarr obs
- append_predictions.py: new CLI to apply saved classifier pipelines to
all cells in per-experiment zarrs, writing predicted_{task} to obs and
predicted_{task}_proba to obsm
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
When group_by is set (default "marker"), evaluate_smoothness iterates over unique group values, computes smoothness per group, saves per-group CSV, generates per-group plots, then aggregates via mean/std. Output filenames now include experiment_name for disambiguation. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Evaluates whether DynaCLR embeddings improve cell tracking on Cell Tracking Challenge datasets vs an IoU baseline.
- tracking_accuracy/config.py: Pydantic models for ONNX model entries, CTC dataset entries, ILP solver weights, and full benchmark config
- tracking_accuracy/utils.py: seg_dir layout helper, pad_to_shape, normalize_crop (z-score using whole-frame statistics)
- tracking_accuracy/evaluate_tracking.py: main benchmark driver
- ctc_tracking_2d_mip_boc.yaml: DynaCLR-2D-MIP vs IoU on DIC-C2DL-HeLa
- ctc_tracking_2d_mip_boc_all.yaml: all CTC sequences variant
- export_onnx_2d_mip_boc.yml: config for exporting the MIP model to ONNX
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
- Pairplot: change diag_kind kde -> hist; rasterize scatter points to prevent PDF bloat; improve legend (alpha=1.0, larger marker sizes)
- Scatter 2D: improve legend (markerscale=6, fontsize=10, framealpha=1.0, edgecolor="black")
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
…cohorts
Mock, bystander, abortive, and unannotated_productive cohorts have t_zero=NaN and therefore t_rel_minutes=NaN. The original per_cell_baseline_distance required pre-window frames in t_rel_minutes, which dropped the entire mock cohort and made fov_stratified_threshold crash with "Mock cohort has no finite signal values". Per discussion §3.7, mocks contribute as a per-frame null distribution and don't need a t_zero.
Fix: when t_rel_minutes is entirely NaN for a track, use the whole-track mean as the baseline. Productive lineages (and their sibling daughters) still use the pre-window mean.
End-to-end validation on zikv_productive_07_24:
- Path A-anno + G3BP1 readout produced 28 productive cells with oscillation metrics + 12-FOV mock thresholds.
- Path A-anno + phase readout produced 18,589-row signal parquet.
- SEC61 readouts still fail because the cohort's mock cells are all in C/1 (G3BP1 mock well); the SEC61 zarr does not contain SEC61 fluorescence for those wells. This is a cohort-construction issue (A/1 SEC61 mock not included in the candidate set), not a code bug.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Adds a DINOv3-frozen evaluation row that uses raw HuggingFace DINOv3 convnext-tiny weights with no contrastive fine-tuning and no projection head. Tests whether DynaCLR's contrastive training adds value over pre-trained features alone.
Required orchestrator changes for HF-loaded models (no Lightning checkpoint to restore from):
- evaluate_config.py: ckpt_path is now Optional[str] = None.
- evaluate.py: _generate_predict_yaml omits ckpt_path from the generated predict YAML when null, so Lightning skips checkpoint restoration cleanly.
New files under configs/evaluation/DINOv3-frozen/:
- infectomics-annotated.yaml leaf with ckpt_path: null.
- run_infectomics_annotated.sh SLURM submitter.
- training_config_dinov3_frozen.yaml: synthetic Lightning config mirroring DINOv3-temporal-MLP-v1 byte-for-byte except encoder.init_args.projection: null. Same backbone, same data pipeline; the only difference is the projection head, giving an apples-to-apples comparison that isolates the contribution of contrastive training.
eval_registry.yaml: refactored from check-evals format to compare_evals format (output_dir + per-entry eval_dir), with all 7 active rows listed for cross-model overlay.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
The cross-model comparison was silently producing empty smoothness plots and crashing on linear classifier plots because of two stale column/filename references:
1. _load_smoothness globbed *_smoothness_stats.csv but the smoothness step writes *_per_marker_smoothness.csv (one file per experiment-marker pair). Switched the glob and changed the aggregation to concat all per-marker CSVs and compute mean ± std across (experiment, marker) rows when plotting.
2. _plot_linear_classifiers indexed columns auroc and marker but the LC writer emits val_auroc / train_auroc and marker_filter. Switched to val_auroc (held-out generalization, the headline number) and marker_filter; updated the y-axis label.
Smoothness comparison now shows a bar per model with mean ± std error bars instead of a single value from a missing CSV. LC comparison renders the per-marker val_auroc grid as intended.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Adds the leaf YAML + SLURM run script pairs for the 4 BoC-family
sibling variants and the 2 other-architecture rows that complete
the infectomics-annotated column of the eval matrix:
- DynaCLR-2D-MIP-BagOfChannels-single-marker-fix-shuffler (192->160 zext11, ckpt jbrwhzr3)
- DynaCLR-2D-MIP-BagOfChannels-mixed-markers-fix-shuffler (192->160 zext11, ckpt dlzt3s65)
- DynaCLR-2D-MIP-BagOfChannels-single-marker-192 (384->192 zext16, ckpt p6vlebcu) — uses
predict.batch_size=128 to fit the larger 384px patches on gpu_2d (default 400 OOMs)
- DynaCLR-2D-BagOfChannels-v3/run_infectomics_annotated.sh
- DINOv3-temporal-MLP-2D-BagOfChannels-v1/run_infectomics_annotated.sh
Each leaf publishes LCs to its own per-model registry under
linear_classifiers/{model}/infectomics/ so siblings don't clobber
each other's vN/ bundles.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Two related fixes for the mock cohort selection on real data:
1. Per-dataset control well: replace global mock_well_patterns iteration with each dataset's own control_fov_pattern from datasets.yaml. Previously every (dataset_id, mock_well) pair was emitted, double-tagging C/1 cells under both 2025_07_24_SEC61 and 2025_07_24_G3BP1. Now each dataset claims only its own control well (A/1 for SEC61, C/1 for G3BP1). Fallback to cohort-level mock_well_patterns kept for configs without per-dataset control_fov_pattern.
2. Zarr fallback: new _select_mock_from_zarr() reads (fov, track_id, parent_track_id, t) directly from the channel embedding zarr's .obs when the source annotation CSV has no rows for the control well. Synthesizes infection_state="uninfected" by well design. Fixes the case where A/1 was tracked + LC-predicted but never manually annotated.
Validated on 2025_07_24: SEC61 mock cohort now sources from A/1 cells in the SEC61 zarr (6053 cells available); G3BP1 mock cohort still sources from C/1 cells via the annotation CSV.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Prints a markdown table at the start of each PHATE run showing resolved values and whether each came from the YAML or the default, so logs alone are enough to reproduce a run. Also relaxes n_pca to Optional[int] so configs can disable PHATE's internal pre-PCA via null. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
Stage 0's mock-from-zarr fallback was hardcoded to the sensor (NS3) zarr pattern. SEC61's A/1 control well is absent from the sensor zarr but present in the SEC61 organelle zarr, so the fallback returned empty and Stage 3 SEC61 readouts crashed with "Mock cohort empty". Pass the full embeddings dict from datasets.yaml into _build_dataset_cohorts and pick the per-dataset organelle pattern based on the dataset suffix (e.g. SEC61 -> organelle_sec61). Verified: SEC61 A/1 now contributes 123 mock tracks; total mock cohort grew from 184 to 206 lineages on zikv_productive_07_24. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
Previously assign_lineage_ids assigned a positional integer counter to
each lineage. The counter incremented in iteration order through tracks,
so a track's lineage_id changed whenever Stage 0 cohort composition
shifted (e.g., adding mock cells from a new control well). Downstream
parquets keyed on lineage_id then desynced from each other unless every
stage was re-run together — silently producing zero-row merges in
Stage 4 readouts.
Replace the integer counter with a deterministic string id derived from
the branch endpoints:
"{dataset_id}|{fov_name}|root{root_track_id}|leaf{leaf_track_id}"
Stable across re-runs as long as the underlying tracks haven't changed.
Orphan sentinel becomes "" instead of -1.
Update consumers to use string equality / emptiness checks:
- select_candidates.py / manual_candidates.py: t_zero + divides lookups
- align_anno.py / align_lc.py: per-lineage anchor dicts
- align_embedding.py: fillna("") instead of fillna(-1).astype(int)
- readout_common.py: per-cell metric rows
- compare_phase_to_fluor.py: per-cell onset rows
Verified end-to-end: Stage 0 -> 4 produces identical Spearman ρ values
for the A-anno track (g3bp1 ρ=0.943, sec61 ρ=-0.200) on the
zikv_productive_07_24 cohort.
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
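The deterministic id scheme above can be sketched as a one-line constructor (function name illustrative):

```python
def lineage_id(dataset_id, fov_name, root_track_id, leaf_track_id):
    """Deterministic lineage id from the branch endpoints, per the
    "{dataset_id}|{fov_name}|root{root}|leaf{leaf}" scheme: stable
    across re-runs as long as the underlying tracks are unchanged."""
    return f"{dataset_id}|{fov_name}|root{root_track_id}|leaf{leaf_track_id}"

# Orphan sentinel is the empty string "" instead of -1, so consumers use
# string equality / emptiness checks rather than integer comparisons.
ORPHAN = ""
```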
Three small changes that together let us inspect and pool comparisons:
1. fov_stratified_threshold gains a well-row fallback. Previously the per-FOV check fell back straight to global because mock cells live in sibling wells (A/1) that never share a FOV with productive cells (A/2). The new ladder is same_fov -> same_row -> global, recording which level supplied the threshold via n_mock_source. Numbers don't change for 07_24 because SEC61 mock only exists in A-row anyway.
2. plot_paired_traces.py renders per-cell phase + fluor cosine-distance traces side-by-side for every paired cell that contributes to the compare_phase_to_fluor Spearman ρ. Used to see *why* g3bp1 ρ is high and sec61 ρ is unstable.
3. zikv_productive_pooled candidate set adds 07_22_G3BP1 + 08_26_SEC61 alongside 07_24. Stage 0 now handles empty productive_df cleanly (was crashing on KeyError: 'fov_name').
Note: under the current productive_filter the additional datasets contribute 0 productive lineages. 07_22's 10-min frame interval makes the 600-min track length unreachable (1/73 candidates pass), and 08_26's annotations have no tracks spanning both infected and uninfected states. The pooled cohort therefore equals the 07_24-only cohort until per-dataset productive_filter overrides are introduced or the annotations are extended.
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
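The same_fov -> same_row -> global fallback ladder can be sketched as follows, assuming fov names like "A/2" whose first path component is the well row; all names here are illustrative:

```python
def pick_mock_threshold(cell_fov, mock_by_fov, mock_by_row, global_thr):
    """Fallback ladder for mock-derived thresholds: same FOV, then same
    well row, then global. Returns (threshold, source) so the source
    level can be recorded alongside the value (sketch; names assumed)."""
    if cell_fov in mock_by_fov:
        return mock_by_fov[cell_fov], "same_fov"
    row = cell_fov.split("/")[0]
    if row in mock_by_row:
        return mock_by_row[row], "same_row"
    return global_thr, "global"
```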
LC tasks are heavily imbalanced (cell_division_state ~99/1, organelle_state
~91/9), so val_auroc alone misleads — it ranks well on the minority class
but doesn't expose whether the classifier is usable at the decision
threshold.
- linear_classifier.py: persist train_<class>_support and val_<class>_support
in metrics_summary.csv so downstream plots can show per-class N.
- compare_evals.py: new linear_classifiers_per_class.pdf with grouped bars
for (class × {precision, recall, f1}) per (task, marker_filter), plus
a panel-title annotation "val N=… | minority <cls>=N (%)" when support
columns are present. Falls back gracefully on older metrics_summary.csv
files without support columns.
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
Curated registry listing every model with an evaluations/infectomics-annotated/linear_classifiers/metrics_summary.csv on the cluster. Drives compare_evals.py — produces smoothness and linear_classifiers comparison PDFs in one place rather than ad hoc per-pair invocations. Six models compared on commit: DINOv3-frozen, DINOv3-temporal-MLP-v1, and four DynaCLR-2D-MIP-BagOfChannels variants (zext11 baseline plus single-marker-fix, mixed-markers-fix, and 384to192/zext16 ablations). Add new entries here as more models are run. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
Earlier search missed this one (deeper path: SEC61_TOMM20_G3BP1_Sensor/time_interval/dynaclr_gfp_rfp_Ph/v3/). It uses the same standard layout and metrics_summary.csv schema, so the comparison now spans all 7 models with infectomics-annotated runs. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
Two structural changes that together let us pool 07_22 and 08_26 into
the productive cohort:
1. cohort_rules.min_non_productive_minutes (default 300) decouples the
bystander/abortive/mock track-length floor from the productive
filter's pre+post window. Previously _select_well_tracks was gated
by min_pre + min_post (600 min for 07_24), which meant short tracks
from comparison cohorts were silently dropped along with productives
that genuinely don't qualify. The decoupled gate stops over-filtering
the null distributions.
2. dataset_cfg.productive_source ("annotation_csv" default, or "lc_zarr")
selects whether productive cells come from the manual annotation CSV
or from predicted_infection_state in the embedding zarr. Used for
datasets like 08_26 whose annotation CSV is per-frame rather than
track-linked, making A-anno impossible but A-LC viable.
_select_productive_from_zarr applies the same pre/post window logic
as _select_productive_tracks, but uses the first sustained run of
min_run consecutive predicted-positive frames as the anchor (same
convention as align_lc.py).
Also: skip cells with NaN t_rel_minutes in compare_phase_to_fluor and
plot_paired_traces — when 08_26 LC-zarr productive cells flow through
the A-anno track, they have signal computed (whole-track-mean fallback)
but no anchor, so onset is undefined; previously this produced NaN ρ.
Pooled cohort (zikv_productive_pooled, 07_24 + 07_22 + 08_26) now
yields 164 productive lineages (vs 59 strict 07_24-only) and a
SEC61 A-LC ρ = 0.810, perm p < 0.001 (n=37) — first well-powered
correlation between phase and SEC61 onset times.
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
Each plot function used to build its own model_color dict by interpolating through tab10 with len(models) — producing slightly different mappings across plots, and outright wrong colors in the smoothness panel where np.arange(len(means.index)) walked the per-aggregate index rather than the global model list. Now _build_model_palette generates the dict once from the registry's model list (sorted, discrete tab10 indices, falls back to tab20 for >10 models) and main() threads it through smoothness, LC AUROC, LC per-class, and MMD plots. Same model is the same color everywhere. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
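The shared-palette idea can be sketched without matplotlib by hardcoding the tab10 hex cycle; `build_model_palette` and the modulo fallback are illustrative (the real code switches to tab20 for >10 models):

```python
# matplotlib's tab10 colors as hex, in order.
TAB10 = ["#1f77b4", "#ff7f0e", "#2ca02c", "#d62728", "#9467bd",
         "#8c564b", "#e377c2", "#7f7f7f", "#bcbd22", "#17becf"]

def build_model_palette(models):
    """One palette built once from the registry's model list: sorted
    names mapped to discrete tab10 colors, so the same model gets the
    same color in every plot (sketch)."""
    return {name: TAB10[i % len(TAB10)] for i, name in enumerate(sorted(models))}
```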
torch.as_tensor() crashes on numpy U/S/O dtypes. Take the all_gather_object path for string columns, keep the fast tensor path for numeric ones. Without this, training with any string metadata key (perturbation, marker) crashes on the first multi-rank online-eval step. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
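The dtype dispatch can be sketched with numpy alone (torch omitted for self-containment; the helper name is illustrative):

```python
import numpy as np

def needs_object_gather(arr: np.ndarray) -> bool:
    """True for string-like columns (numpy kinds U/S/O), which must take
    the all_gather_object path because torch.as_tensor() cannot convert
    them; numeric columns keep the fast tensor path (sketch)."""
    return arr.dtype.kind in ("U", "S", "O")
```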
os.cpu_count() returns the node's physical core count, not the SLURM-allocated count. On a 48-core node where SLURM gave us 16, ad-hoc users of os.cpu_count() oversubscribe. Centralize the SLURM_CPUS_PER_TASK fallback in viscy_utils.mp_utils.available_cpus and route MultiExperimentDataModule's tensorstore concurrency through it. Pin BLAS to 1 thread per process in REDUCE_COMBINED — PHATE's joblib n_jobs spawns one worker per allocated CPU, so unbounded BLAS would yield ~cores^2 threads. Standard sklearn parallelism pattern (one axis at a time). Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
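A minimal sketch of the centralized helper (mirroring the described viscy_utils.mp_utils.available_cpus behaviour; the exact implementation may differ):

```python
import os

def available_cpus() -> int:
    """SLURM-aware CPU count: prefer SLURM_CPUS_PER_TASK over
    os.cpu_count(), which reports the node's physical cores rather
    than the allocation (sketch of the described helper)."""
    slurm = os.environ.get("SLURM_CPUS_PER_TASK")
    if slurm:
        return int(slurm)
    return os.cpu_count() or 1
```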
PHATE's internal PCA pre-reduction (graphtools -> sklearn -> scipy.linalg.lu) deadlocks silently on scipy 1.17.1 + sklearn 1.8.0 — process sits at ~0% CPU forever. Wire X_pca_combined back into PHATE so it skips its own pre-PCA: when phate.n_pca is null, fit on the already-reduced PCA output instead of raw .X. Add caller-owned fit-set indexing (fit_idx) to viscy_utils.evaluation.compute_phate so the orchestrator can draw a per-store lineage cap. Whole lineages are sampled per store (cap=N each); PHATE fits on the union and transforms the full input. Re-enables PHATE in the eval recipe with a 100-cell per-store cap for fast iteration; bump for paper figures. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Cross-modal InfoNCE head pulls image features toward a paired per-cell vector target (e.g. transcriptomic embedding). Image and target sides each pass through a small projector into a shared space; samples whose target contains NaN (unpaired cells) are masked out so the head runs on partially-paired batches. Extend ContrastiveModule._get_labels to handle vector-valued metadata: list/tuple/array entries are stacked into (B, D) float tensors, scalars stay as (B,) long tensors. Required for the new head's paired-target lookup. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
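The unpaired-cell masking can be sketched with numpy (the real head operates on torch tensors; the helper name is illustrative):

```python
import numpy as np

def paired_mask(targets: np.ndarray) -> np.ndarray:
    """Boolean mask over a (B, D) target matrix: rows containing any NaN
    are unpaired cells and are excluded before the cross-modal InfoNCE
    loss, so the head runs on partially-paired batches (sketch)."""
    return ~np.isnan(targets).any(axis=1)
```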
CELL-DINO is a DINOv2-architecture ViT pretrained on fluorescence microscopy (Human Protein Atlas). The channel_adaptive_dino_vitl16 checkpoint processes one channel at a time through a single-channel ViT-L/16 stem; the wrapper reshapes (B, C, H, W) -> (B*C, 1, H, W), runs the backbone, and mean-pools the cls token across channels for a fixed-dim embedding regardless of input channel count. Weights load from a local .pth state_dict — nothing fetched from the network. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
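The channel-folding wrapper can be sketched with numpy shapes (the real wrapper runs a torch ViT; `backbone` here is any (N, 1, H, W) -> (N, D) callable and the function name is illustrative):

```python
import numpy as np

def channel_adaptive_embed(x: np.ndarray, backbone) -> np.ndarray:
    """Fold channels into the batch axis, run a single-channel backbone,
    then mean-pool the per-channel embeddings back to (B, D) so the
    output dim is fixed regardless of input channel count (sketch)."""
    b, c, h, w = x.shape
    tokens = backbone(x.reshape(b * c, 1, h, w))  # (B*C, D)
    return tokens.reshape(b, c, -1).mean(axis=1)  # (B, D)
```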
PCA pairplot rendering is per-coloring-variable independent; fan out across colorings using joblib loky workers (one worker per coloring, capped by available_cpus). Workers re-import matplotlib + seaborn (~1s overhead) so the gain only kicks in for pairplot_components >= 4 on >100k cells, which matches the paper-figure config. Add the pairplot_components knob to the infectomics recipe at 4 (PC1..PC4 grid = 16 panels per coloring); bump higher for final paper figures. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
The fixed coupling between PLOT and PLOT_COMBINED forced every infectomics run to fan out per-experiment scatter even when only the combined figure was needed. Make both stages independently togglable via steps:; the Nextflow DAG already checks `steps`, just had hardcoded behaviour assuming both always run. Switch infectomics-annotated to plot_combined only — the per-experiment scatter doesn't carry into paper figures. Drop the redundant marker_filters on cell_death_state (applies to all markers; the filter was leftover from when LC was only run on G3BP1/SEC61B sensors). Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Job 31449149 OOM'd in cgroup on rank 3 host RAM (not VRAM) on the 384² single-marker variant. Loader prefetch buffers scale with workers × prefetch_factor, not batch_size. - Drop prefetch_factor 2→1 in the BoC base config — halves in-flight batches per worker, restores earlier behaviour. - Drop the 192 sbatch from 4→2 GPUs and bump --mem-per-cpu 14G→17G (255 GB/rank, 510 GB/node) so each rank has more headroom; also eases queue priority. Pin trainer.devices=2 in the override yml so the Lightning config matches. Batch size kept at 256/rank — host RAM was the OOM driver. If this still OOMs, suspect a real leak (loky semaphores, tensorstore decoder scratch) rather than papering over with more RAM. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Same physical microglia cells appeared three times in the collection (BF, Phase3D, Retardance), tripling the experiment's row count and biasing marker/experiment sampling without adding biological signal. Keep Phase3D only. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Add status legend (✅ landed / 🔄 running / ⬜ pending) and inline notes per model so the registry reads as the current state of the bake-off. Stable name strings ensure the model→color palette matches across infectomics-annotated, alfi, and microglia registries. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Replace the primary_analysis.csv + cage_crop parsing path with a direct read from the cells AnnData zarr (dinov2.zarr / rna.zarr under a shared anndata_dir). The fov_name column is the zarr path; load_cells_anndata returns it as zarr_path so the rest of the pipeline is unchanged. Split CLI: data_paths.yml carries the shared zarr_store + anndata_dir + output_dir, and embed_<model>.yml carries per-model config (channels, output_key, target_pixel_size, batch_size). Both files are merged at startup. Add a max_cells smoke-test knob that truncates the cell table post-filter for fast iteration. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
This reverts commit 307371c.
- viscy-utils: pin anndata<0.12.9 across all/anndata/dev/test extras (matches pyproject; the constraint was added but the lock hadn't been refreshed)
- viscy: normalize gurobipy specifier to the same range
- nvidia-* and cuda-bindings: add platform_machine != 's390x' markers per uv solver auto-update
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
PCA-RGB timelapse MP4 export needs imageio's FFmpeg plugin; without it the timelapse CLI silently falls back to GIF. Bundle matplotlib so the visualization helpers don't pull it through a transitive eval-extra dependency. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
EmbeddingWriter's primary array (adata.X) is hard-coded to the
encoder backbone "features". DINOv3-temporal-MLP and similar
frozen-backbone-with-trained-head models put all the learned
task signal in the projection head — predicting features in
that case discards the only learned component.
Add a predict.embedding_key knob ("features" | "projections")
that the eval orchestrator threads into the generated predict
YAML. The unselected array remains as a sidecar in obsm.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
The MLP head is the only finetuned component — the DINOv3 backbone is frozen during training. Defaulting to features would make this row a duplicate of DINOv3-frozen and discard the only learned task signal. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>