Refactor applications/DynaCell into a reusable benchmark package #404
Draft
alxndrkalinin wants to merge 312 commits into modular-viscy-staging from
Conversation
Contributor
Pull request overview
This PR refactors applications/dynacell from a thin training app into a reusable dynacell benchmark package, reorganizing configs and bringing evaluation/reporting/preprocess/data modules (with tests) into VisCy while updating shared runtime utilities to support the new package shape.
Changes:
- Reorganize Dynacell configs into configs/recipes/ + configs/examples/, remove paper-/HPC-specific SEC61B artifacts, and update docs/tests for config discovery.
- Add reusable dynacell.data, dynacell.evaluation, dynacell.reporting, and dynacell.preprocess modules plus substantial test coverage.
- Update shared VisCy components (CLI checkpoint merging, prediction writer overwrite behavior, affine augmentation safety options, HCS training shape checks) to support the new workflows.
Reviewed changes
Copilot reviewed 74 out of 85 changed files in this pull request and generated 7 comments.
| File | Description |
|---|---|
| packages/viscy-utils/src/viscy_utils/cli.py | Adjust checkpoint-hparam vs user-config precedence during LightningCLI parsing. |
| packages/viscy-utils/src/viscy_utils/callbacks/prediction_writer.py | Add overwrite option to allow reusing existing output stores. |
| packages/viscy-transforms/src/viscy_transforms/_affine.py | Add configurable padding_mode and safe-crop scale clamping for GPU affine aug. |
| packages/viscy-transforms/tests/test_affine.py | Extend affine tests to cover scale-floor computation and safe-crop behavior. |
| packages/viscy-data/src/viscy_data/hcs.py | Skip strict training shape check when GPU augmentations handle cropping. |
| applications/dynacell/src/dynacell/main.py | Route evaluate/report to Hydra entry points; keep fit/predict via viscy_utils. |
| applications/dynacell/src/dynacell/engine.py | Add ckpt_path init-time loading to decouple predict-time settings from ckpt hparams. |
| applications/dynacell/src/dynacell/data/_yaml.py | Add shared OmegaConf→Pydantic YAML loader. |
| applications/dynacell/src/dynacell/data/manifests.py | Add dataset manifest + split schemas and path-based loaders. |
| applications/dynacell/src/dynacell/data/collections.py | Add frozen benchmark collection schemas and loader. |
| applications/dynacell/src/dynacell/data/specs.py | Add benchmark-spec schema/loader for tying pipeline stages together. |
| applications/dynacell/src/dynacell/data/__init__.py | Re-export dynacell.data public API. |
| applications/dynacell/src/dynacell/evaluation/__init__.py | Initialize evaluation package. |
| applications/dynacell/src/dynacell/evaluation/pipeline.py | Add Hydra-based evaluation pipeline with caching + output writing. |
| applications/dynacell/src/dynacell/evaluation/metrics.py | Add pixel/mask/feature metric computations (optional deps guarded). |
| applications/dynacell/src/dynacell/evaluation/utils.py | Add feature extractors + plotting and distribution-distance helpers. |
| applications/dynacell/src/dynacell/evaluation/io.py | Add eval I/O dispatch helpers (Zarr vs TIFF-like) and preprocessing utilities. |
| applications/dynacell/src/dynacell/evaluation/segmentation.py | Add segmentation workflows with optional dependency guards. |
| applications/dynacell/src/dynacell/evaluation/formatting.py | Add DataFrame formatting utilities for evaluation outputs. |
| applications/dynacell/src/dynacell/evaluation/torch_ssim.py | Add torch-native SSIM implementation used by metrics. |
| applications/dynacell/src/dynacell/evaluation/spectral_pcc/__init__.py | Initialize spectral_pcc subpackage. |
| applications/dynacell/src/dynacell/evaluation/spectral_pcc/plot_shading_analysis.py | Add analysis plotting script for shading artifact comparison. |
| applications/dynacell/src/dynacell/evaluation/spectral_pcc/plot_combined.py | Add combined metrics plotting + summary script. |
| applications/dynacell/src/dynacell/evaluation/spectral_pcc/diagnostic_real.py | Add Hydra diagnostic workflow for real-data spectra/metrics analysis. |
| applications/dynacell/src/dynacell/reporting/__init__.py | Re-export reporting public API. |
| applications/dynacell/src/dynacell/reporting/tables.py | Add table aggregation + LaTeX rendering utilities. |
| applications/dynacell/src/dynacell/reporting/figures.py | Add matplotlib figure generation for aggregated metrics. |
| applications/dynacell/src/dynacell/reporting/cli.py | Add Hydra reporting CLI entry point and config path wiring. |
| applications/dynacell/src/dynacell/preprocess/__init__.py | Re-export preprocess utilities. |
| applications/dynacell/src/dynacell/preprocess/config.py | Add OmegaConf-based preprocess config loader (with fallback). |
| applications/dynacell/src/dynacell/preprocess/zarr_utils.py | Add OME-Zarr rewrite utility (rechunk/reshard). |
| applications/dynacell/configs/examples/unext2/fit.yml | Add generic UNeXt2 fit example in new config layout. |
| applications/dynacell/configs/examples/unext2/predict.yml | Add generic UNeXt2 predict example with init-time ckpt_path. |
| applications/dynacell/configs/examples/fnet3d/fit.yml | Update FNet3D fit example paths to new layout. |
| applications/dynacell/configs/examples/fnet3d/predict.yml | Add generic FNet3D predict example with init-time ckpt_path. |
| applications/dynacell/configs/examples/unetvit3d/fit.yml | Update UNetViT3D fit example paths to new layout. |
| applications/dynacell/configs/examples/unetvit3d/predict.yml | Update UNetViT3D predict example paths and use init-time ckpt_path. |
| applications/dynacell/configs/examples/celldiff/fit.yml | Update CellDiff fit example paths to new layout. |
| applications/dynacell/configs/examples/celldiff/predict.yml | Add CellDiff predict example with init-time ckpt_path and overlap config. |
| applications/dynacell/configs/recipes/trainer/fit_1gpu.yml | Add reusable 1-GPU trainer recipe. |
| applications/dynacell/configs/recipes/trainer/fit_4gpu.yml | Add reusable 4-GPU trainer recipe. |
| applications/dynacell/configs/recipes/trainer/fit_fm_4gpu.yml | Add reusable 4-GPU flow-matching trainer recipe. |
| applications/dynacell/configs/recipes/trainer/predict_gpu.yml | Remove ckpt_path from trainer recipe (moved to model init args). |
| applications/dynacell/configs/recipes/models/unext2_3d.yml | Add reusable UNeXt2 model recipe (z=15). |
| applications/dynacell/configs/recipes/models/unext2_3d_z8.yml | Add reusable UNeXt2 model recipe (z=8). |
| applications/dynacell/configs/recipes/models/fnet3d.yml | Add reusable FNet3D model recipe. |
| applications/dynacell/configs/recipes/models/fnet3d_z8.yml | Add reusable FNet3D model recipe (z=8). |
| applications/dynacell/configs/recipes/models/unetvit3d.yml | Add reusable UNetViT3D model recipe. |
| applications/dynacell/configs/recipes/models/celldiff_fm.yml | Add reusable CellDiff flow-matching model recipe. |
| applications/dynacell/configs/recipes/modes/spotlight.yml | Add Spotlight mode recipe for foreground-aware loss. |
| applications/dynacell/configs/recipes/data/hcs_phase_fluor_3d.yml | Adjust data recipe defaults (e.g., preload false). |
| applications/dynacell/configs/evaluate/eval.yaml | Add Hydra eval base config. |
| applications/dynacell/configs/evaluate/spectral_pcc/base.yaml | Add Hydra config for spectral PCC workflows. |
| applications/dynacell/configs/evaluate/spectral_pcc/simulate.yaml | Add simulation config for metric validation. |
| applications/dynacell/configs/evaluate/spectral_pcc/diagnostic_real.yaml | Add real-data diagnostic config. |
| applications/dynacell/configs/report/base.yaml | Add Hydra report config. |
| applications/dynacell/tests/test_training_integration.py | Update config discovery test to new configs/examples location. |
| applications/dynacell/tests/test_cli_routing.py | Add tests for CLI routing between Lightning and Hydra subcommands. |
| applications/dynacell/tests/test_data_manifests.py | Add tests for dataset manifest/collection/spec schemas and loaders. |
| applications/dynacell/tests/test_evaluation_io.py | Add tests for eval I/O dispatching based on path type. |
| applications/dynacell/tests/test_evaluation_metrics.py | Add regression tests for shared-scale pixel metrics behavior. |
| applications/dynacell/tests/test_evaluation_pipeline.py | Add caching regression test for evaluation pipeline. |
| applications/dynacell/tests/test_reporting_tables.py | Add tests for reporting tables loading/aggregation/LaTeX output. |
| applications/dynacell/tests/test_reporting_tables_extended.py | Add extended reporting table tests (caption/label, lower-is-better). |
| applications/dynacell/tests/test_reporting_figures.py | Add tests for figure creation/saving and empty-data behavior. |
| applications/dynacell/tests/test_preprocess_config.py | Add tests for preprocess config loader fallback behavior. |
| applications/dynacell/tests/test_preprocess_zarr_utils.py | Add tests for zarr rewrite correctness (data, chunks, metadata, shards). |
| applications/dynacell/README.md | Update usage + config layout documentation; remove paper-specific SEC61B instructions. |
| applications/dynacell/.gitignore | Ignore runtime artifacts (lightning_logs, outputs, pycache). |
| applications/dynacell/pyproject.toml | Add optional extras for eval/report/preprocess and allow direct references. |
| applications/dynacell/examples/configs/sec61b/run_unext2_continue.slurm | Remove paper/HPC-specific SEC61B SLURM script. |
| applications/dynacell/examples/configs/sec61b/run_unext2.slurm | Remove paper/HPC-specific SEC61B SLURM script. |
| applications/dynacell/examples/configs/sec61b/run_fnet3d_paper.slurm | Remove paper/HPC-specific SEC61B SLURM script. |
| applications/dynacell/examples/configs/sec61b/run_fnet3d.slurm | Remove paper/HPC-specific SEC61B SLURM script. |
| applications/dynacell/examples/configs/sec61b/fit_unext2.yml | Remove paper-specific SEC61B config. |
| applications/dynacell/examples/configs/sec61b/fit_fnet3d_paper.yml | Remove paper-specific SEC61B config. |
| applications/dynacell/examples/configs/sec61b/fit_fnet3d.yml | Remove paper-specific SEC61B config. |
| applications/dynacell/examples/configs/sec61b/fit_celldiff.yml | Remove paper-specific SEC61B config. |
| applications/dynacell/examples/configs/recipes/data/hcs_sec61b_3d.yml | Remove hard-coded SEC61B data recipe. |
| applications/dynacell/examples/configs/fnet3d/predict.yml | Remove old predict example path (superseded by new configs layout). |
| applications/dynacell/examples/configs/celldiff/predict.yml | Remove old predict example path (superseded by new configs layout). |
| CLAUDE.md | Update repo/dev workflow documentation and conventions. |
Contributor
Pull request overview
Copilot reviewed 78 out of 91 changed files in this pull request and generated 5 comments.
The three near-duplicate blocks for mask/pixel/feature metrics are
collapsed into a single loop. Also guard the plot call with
`if not df.empty` — when feature metrics are disabled, the empty
DataFrame was previously crashing plot_metrics on a groupby("FOV")
lookup. Behavior is otherwise identical.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
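A minimal sketch of the collapsed-loop pattern this commit describes; the function names, the `compute_fns` mapping, and the DataFrame shape are illustrative stand-ins, not the dynacell source:

```python
import pandas as pd

def plot_metrics(df: pd.DataFrame) -> None:
    # Stand-in for the real plotting helper: groupby("FOV") raises KeyError
    # on an empty, column-less frame, which is the crash being guarded.
    for fov, group in df.groupby("FOV"):
        print(f"would plot {len(group)} rows for FOV {fov}")

def run_all_metrics(compute_fns: dict) -> None:
    # One loop over metric families instead of three copy-pasted blocks.
    for family, compute in compute_fns.items():
        df = compute()
        if not df.empty:  # disabled family -> empty frame -> skip plotting
            plot_metrics(df)

run_all_metrics({"feature": lambda: pd.DataFrame()})  # no crash, no plot
```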
Splits metrics.py feature functions so GT-side work can be cached separately from prediction-side work. New API: cp_target_regionprops, cp_pred_regionprops, cp_pairwise, deep_target_features, deep_pred_features, deep_pairwise. The old cp_feature_similarity / deep_feature_similarity / compute_feature_metrics entry points are removed — pipeline.py calls the split API directly via a thin _compute_feature_metrics_from_split helper. CP pairing preserves the target-side all-zero column drop and per-matrix z-score of the original.
Also renames eval.yaml's recalculate_metrics to force_recompute.final_metrics and introduces the full per-artifact force_recompute block (gt_masks / gt_cp / gt_dinov3 / gt_dynaclr / final_metrics / all). io.cell_segmentation_path is now optional (required only when compute_feature_metrics is true), and io.gt_cache_dir / io.require_complete_cache are introduced for the cache work in the next commit.
Bundled changes keep the per-commit test rule: removing the old metrics API without rewiring pipeline.py would break tests at this commit.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
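A hedged sketch of the pairing contract named above (target-side all-zero column drop, then per-matrix z-score); only the cp_pairwise name comes from the commit, and the placeholder statistic stands in for the real FID/KID/cosine metrics:

```python
import numpy as np

def cp_pairwise(target_feats: np.ndarray, pred_feats: np.ndarray) -> dict:
    # Drop feature columns that are all-zero on the *target* side, applying
    # the same mask to predictions so the column order stays aligned.
    keep = ~np.all(target_feats == 0, axis=0)
    t, p = target_feats[:, keep], pred_feats[:, keep]
    # Per-matrix z-score: each side is standardized with its own mean/std.
    zt = (t - t.mean(axis=0)) / (t.std(axis=0) + 1e-8)
    zp = (p - p.mean(axis=0)) / (p.std(axis=0) + 1e-8)
    # Placeholder statistic; the real code computes FID/KID/cosine metrics.
    return {"mean_abs_diff": float(np.abs(zt.mean(axis=0) - zp.mean(axis=0)).mean())}

rng = np.random.default_rng(0)
t = rng.normal(size=(32, 5)); t[:, 2] = 0.0  # one all-zero target column
print(cp_pairwise(t, rng.normal(size=(40, 5))))
```

The point of the split: GT-side work (cp_target_regionprops, deep_target_features) depends only on ground truth, so it can be computed once per FOV and reused across every model's pred-side and pairwise calls.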
Per-FOV helpers in new pipeline_cache.py (init_cache_context, fov_gt_masks, fov_gt_cp_features, fov_gt_deep_features, flush_manifest) wrap the raw cache I/O in cache.py and honor the per-artifact force_recompute.* flags plus io.require_complete_cache.
evaluate_predictions now pre-fetches GT masks and per-timepoint GT feature arrays from the cache before running the timepoint loop. On a hit it skips the expensive aicssegmentation + feature extraction; on a miss (and when caching is enabled) it computes and writes to cache in-place. Manifest is flushed after each position so an interrupted run preserves completed work.
io.gt_cache_dir remains optional (null = no-op caching, identical to the previous behavior), so one-off eval runs don't need any cache plumbing. require_complete_cache=true flips cache misses from fill to raise — the pattern for parallel sweeps where the cache has already been primed via precompute-gt.
Also fixes cache.write_mask to overwrite an existing position's image without tearing down the well group (which iohub's create_position cannot recreate on its own).
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
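Roughly how the cache-or-compute flow described above could look; the fov_gt_masks name and the require_complete_cache semantics come from the commit, while the file layout and signature are assumptions for illustration:

```python
from pathlib import Path
import numpy as np

class CacheMiss(RuntimeError):
    """Raised on a miss when require_complete_cache is set."""

def fov_gt_masks(cache_dir, fov, compute, *, force=False, require_complete=False):
    if cache_dir is None:            # null gt_cache_dir: no-op caching
        return compute()
    path = Path(cache_dir) / f"{fov.replace('/', '_')}_masks.npy"
    if path.exists() and not force:  # hit: skip expensive segmentation
        return np.load(path)
    if require_complete:             # primed-cache sweeps: miss is an error
        raise CacheMiss(f"missing cached GT masks for {fov}")
    masks = compute()                # miss: compute and fill in place
    np.save(path, masks)
    return masks
```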
Standalone Hydra entrypoint that iterates GT positions and fills the cache for the artifact families toggled in config.build (masks, cp, dinov3, dynaclr). Reuses the same pipeline_cache helpers that evaluate_predictions uses, so a position fills in identically whether it was built by precompute-gt or filled on-the-fly by evaluate. precompute.yaml inherits eval.yaml and requires io.gt_cache_dir (the whole point of the CLI). Designed as a one-time SLURM job ahead of many parallel evaluate runs with io.require_complete_cache=true. Routed via __main__._HYDRA_COMMANDS so 'dynacell precompute-gt' is the user-facing command. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Extends the evaluation README with the cache layout, a full flag reference for force_recompute, the parallel-sweep workflow using require_complete_cache, and a precompute-gt example. Also updates the components table to cover cache.py, pipeline_cache.py, and precompute_cli.py, and clarifies which inputs are optional after the cache changes. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
/simplify pass over the cache work. Three cleanups:
1. Open each feature zarr group once per FOV (not once per timepoint) via a new open_features_group context manager + two helpers (read_features_from_group, write_features_to_group). Shrinks the per-run zarr.open_group count from ~2T×N_artifacts×N_positions (~2600 on SEC61B) to N_artifacts×N_positions (~300). read_features / write_features stay as the single-shot convenience API.
2. Extract _load_or_compute_feature_timepoints shared loop so fov_gt_cp_features and fov_gt_deep_features stop copy-pasting the miss-detection + compute + cache-write logic.
3. Add FeatureKind = Literal["cp", "dinov3", "dynaclr"] and use it everywhere `kind` is accepted, so the three valid values are visible at type-check time.
Also consolidate the duplicated slug helper into cache.feature_slug (was _safe_slug in cache.py and _slug in pipeline_cache.py). Also drops one redundant narration comment in pipeline.py. All 142 non-training tests still pass.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
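A sketch of the once-per-FOV group handle plus the FeatureKind literal; the zarr calls are the standard v2-style API, but the store layout and helper signatures are assumptions:

```python
from contextlib import contextmanager
from typing import Iterator, Literal
import numpy as np
import zarr

FeatureKind = Literal["cp", "dinov3", "dynaclr"]

@contextmanager
def open_features_group(store_path: str, fov: str, kind: FeatureKind) -> Iterator[zarr.Group]:
    # One zarr.open_group per (FOV, kind) instead of one per timepoint.
    root = zarr.open_group(store_path, mode="a")
    yield root.require_group(f"{fov}/{kind}")

def write_features_to_group(group: zarr.Group, t: int, feats: np.ndarray) -> None:
    group.create_dataset(f"t{t}", data=feats, overwrite=True)

def read_features_from_group(group: zarr.Group, t: int) -> np.ndarray:
    return group[f"t{t}"][:]

# All timepoints of one FOV flow through a single open handle:
with open_features_group("gt_cache.zarr", "A/1/0", "cp") as grp:
    for t in range(3):
        write_features_to_group(grp, t, np.zeros((10, 4)))
```

With `kind: FeatureKind` in the signatures, a type checker flags any call site passing a fourth value instead of it failing at runtime.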
Addresses two PR #404 review findings:
1. `_CacheContext._manifest_dirty` was mutated directly from helper call sites, leaking implementation detail. Adds `mark_manifest_dirty` and `consume_manifest_dirty` methods and routes every external touch through them. Only the dataclass itself now references the private field.
2. `resolve_dynaclr_encoder_cfg` used `except Exception` to detect a missing nested config key — wider than needed and against CLAUDE.md guidance. Replaced with `OmegaConf.select(..., default=None)`, which handles missing keys natively without a try/except.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
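Both fixes in miniature: the dataclass shape is illustrative, while OmegaConf.select(..., default=...) is the real OmegaConf API the commit switches to:

```python
from dataclasses import dataclass
from omegaconf import OmegaConf

@dataclass
class _CacheContext:
    _manifest_dirty: bool = False

    def mark_manifest_dirty(self) -> None:
        self._manifest_dirty = True

    def consume_manifest_dirty(self) -> bool:
        # Return-and-reset, so flush sites cannot forget to clear the flag.
        dirty, self._manifest_dirty = self._manifest_dirty, False
        return dirty

cfg = OmegaConf.create({"model": {"init_args": {}}})
# Missing nested keys resolve to the default instead of raising:
print(OmegaConf.select(cfg, "model.init_args.encoder", default=None))  # None
```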
Addresses a PR #404 review finding: the split GT/pred feature API had structural tests (empty inputs, column-drop, shape mismatch) but no pinned-value regression guard. Adds two tests that seed deterministic synthetic inputs and assert exact output values for CP_FID / CP_KID / CP_Median_Cosine_Similarity and the DINOv3 equivalents. If anyone later changes the column-drop, per-side z-score, or FID/KID/cosine pairing logic — or a dependency shifts numerics — these tests will fail rather than silently drifting metrics. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Repeated ckpt_sha256_12 calls on multi-GB checkpoints dominate parallel sweep cache-key resolution. Write a sibling .sha256 sidecar after the first hash; on later calls, reuse the sidecar when its mtime >= the ckpt's. Falls back to recompute on any OSError (read-only dir, NFS flake) and on corrupt non-hex sidecars. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
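A self-contained sketch of the sidecar scheme, assuming the function streams the checkpoint through SHA-256 and truncates to 12 hex characters; the real dynacell implementation may differ in details:

```python
import hashlib
from pathlib import Path

def ckpt_sha256_12(ckpt: Path) -> str:
    sidecar = ckpt.parent / (ckpt.name + ".sha256")
    try:
        ckpt_mtime = ckpt.stat().st_mtime          # one stat, reused below
        if sidecar.stat().st_mtime >= ckpt_mtime:  # sidecar fresh enough?
            cached = sidecar.read_text().strip()
            if len(cached) == 12 and set(cached) <= set("0123456789abcdef"):
                return cached                      # valid sidecar: skip rehash
    except OSError:
        pass  # missing sidecar, read-only dir, NFS flake: fall through
    h = hashlib.sha256()
    with ckpt.open("rb") as f:
        for chunk in iter(lambda: f.read(1 << 20), b""):  # 1 MiB chunks
            h.update(chunk)
    digest = h.hexdigest()[:12]
    try:
        sidecar.write_text(digest)  # best-effort write for the next caller
    except OSError:
        pass
    return digest
```

Note the single `except OSError` covers the missing-sidecar case too, since FileNotFoundError is an OSError subclass (the same observation a later commit applies to the real function).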
dynacell's benchmark leaf YAMLs carry two reserved top-level keys: `launcher:` (sbatch/runtime metadata) and `benchmark:` (experiment identifiers). LightningCLI rejects unknown top-level keys, so these must be removed before the composed config reaches the CLI. Widen _maybe_compose_config to:
- strip both reserved keys whether or not `base:` is present
- extract _find_config_arg and _replace_config_path_in_argv helpers
This unblocks `uv run dynacell fit -c <benchmark-leaf.yml>` without requiring the dedicated benchmark submit tool.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
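The strip itself is small; a minimal sketch (the key names come from the commit, the helper shape is an assumption about _maybe_compose_config):

```python
RESERVED_TOP_LEVEL_KEYS = ("launcher", "benchmark")

def strip_reserved_keys(composed: dict) -> dict:
    # LightningCLI rejects unknown top-level keys, so drop the launcher/
    # benchmark metadata before the composed config reaches the CLI parser.
    return {k: v for k, v in composed.items() if k not in RESERVED_TOP_LEVEL_KEYS}

leaf = {"launcher": {"partition": "gpu"}, "benchmark": {"id": "er"}, "trainer": {}}
assert set(strip_reserved_keys(leaf)) == {"trainer"}
```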
Lands the benchmark config layout without any runnable leaves yet:
- BENCHMARK_CONFIG_SCHEMA.md — reference doc (previously untracked)
- virtual_staining/README.md — reserved-keys contract, compose+submit
docs
- shared/train_sets/ipsc_confocal.yml — imaging modality defaults
- shared/targets/{er_sec61b, mito_tomm20, nucleus, membrane}.yml — four
targets with channel names, train-side data paths, normalizations,
and RandWeightedCropd
- shared/model_overlays/celldiff_{fit,predict}.yml — model + trainer
recipe binding + mode-specific data hparams and GPU aug stack
- shared/launcher_profiles/{mode_fit, mode_predict, hardware_h200_single,
runtime_single_gpu}.yml — launcher metadata split across axes
- shared/predict_sets/ipsc_confocal.yml — predict-set metadata +
source_channel (duplicated from train_sets because predict leaves
don't compose train_sets)
Train/predict leaves land in the next two commits.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Four benchmark leaves at configs/benchmarks/virtual_staining/train/<org>/
ipsc_confocal/celldiff.yml — one per organelle. Each composes the shared
axes (train_set, target, celldiff_fit overlay, launcher profiles) and
carries organelle-specific WandB run name, checkpoint dirpath, and
launcher.{job_name, run_root} in the leaf body.
test_benchmark_config_composition.py composes both the pre-schema
fit_celldiff.yml and the new leaf through load_composed_config, strips
reserved keys, and asserts the full intersection of model/data/trainer
fields matches. All four organelles pass.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
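A sketch of the equivalence-test shape described above, assuming load_composed_config takes a path and returns a plain mapping; the field handling is illustrative:

```python
from viscy_utils.compose import load_composed_config

def assert_leaf_matches_reference(leaf_path: str, reference_path: str) -> None:
    leaf = dict(load_composed_config(leaf_path))
    ref = dict(load_composed_config(reference_path))
    for key in ("launcher", "benchmark"):  # reserved keys, stripped pre-CLI
        leaf.pop(key, None)
    shared = sorted(set(leaf) & set(ref))  # e.g. model/data/trainer
    assert {k: leaf[k] for k in shared} == {k: ref[k] for k in shared}
```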
Four predict leaves at configs/benchmarks/virtual_staining/predict/<org>/
ipsc_confocal/celldiff/ipsc_confocal.yml. Each overrides:
- data.init_args.data_path to the test_cropped store for the organelle
- data.init_args.normalizations to Phase3D-only (predict doesn't use
target normalization — target isn't loaded)
- data.init_args.augmentations to [] (clears target-inherited
RandWeightedCropd; predict has no CPU augs)
- trainer.callbacks to a single HCSPredictionWriter with the organelle's
output zarr
Extends test_benchmark_config_composition.py with a predict-side
equivalence test that asserts model.init_args.{num_generate_steps,
predict_method, predict_overlap, ckpt_path, net_config}, the predict
data.init_args key intersection, HCSPredictionWriter output_store
equality, and a 'test_cropped/' guard on data_path. All four predict
leaves pass.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
…late
Drives one benchmark leaf end-to-end: compose via load_composed_config, apply --override (stdlib dotlist, interpolation forbidden), validate launcher block, consistency-check trainer.devices vs sbatch.gpus, render sbatch from tools/sbatch_template.sbatch using a string.Template subclass with @@ delimiter (so shell $VARs pass through verbatim), and submit.
The SBATCH directive render order (job-name, time, nodes, ntasks, partition, cpus-per-task, gpus, mem, constraint, output, error) is pinned explicitly to match Dihan's run_celldiff.slurm. Byte-equivalence test against the SEC61B train leaf confirms the rendered sbatch differs only on the final srun --config path.
Flags: --dry-run, --print-script, --print-resolved-config, --override key.path=value (repeatable).
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
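The @@-delimiter trick relies on string.Template's documented delimiter override, so shell $VARs survive substitution untouched. A minimal, runnable illustration (the directive and variable names are examples, not the actual template):

```python
from string import Template

class SbatchTemplate(Template):
    delimiter = "@@"  # @@NAME is substituted; plain $VARS pass through

script = SbatchTemplate(
    "#SBATCH --job-name=@@JOB_NAME\n"
    'srun echo "$SLURM_JOB_ID"\n'  # left verbatim for the shell to expand
)
print(script.substitute(JOB_NAME="celldiff_sec61b"))
```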
git-renamed the four pre-schema CellDiff trees (memb/nucl/sec61b/tomm20, fit+predict YAMLs and run_celldiff.slurm) from applications/dynacell/examples/configs/ to applications/dynacell/tools/LEGACY/examples_configs/. Empty examples/ parent removed. Post-move, the eight YAMLs' base: paths needed one additional '..' to still resolve to configs/recipes/ — the only content change. This keeps the equivalence test in test_benchmark_config_composition.py able to compose the LEGACY files as the source-of-truth reference. Both test files' EXAMPLES paths updated to the new location. tools/LEGACY/README.md documents the contract: reference-only, not for direct launch; delete after one successful end-to-end submit run and 2026-06-30 at the earliest. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Adds the configs/benchmarks/virtual_staining/ layer to the config structure section, points at its own README for composition order, and documents the submit_benchmark_job.py tool with --dry-run examples. Also notes that launcher:/benchmark: reserved keys are stripped automatically by _maybe_compose_config. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
viscy_utils.compose._deep_merge was private, forcing dynacell's submit_benchmark_job.py to keep a byte-identical copy with a docstring explaining the duplication. Drop the underscore and export it. Prevents silent drift between the two copies if one is updated (e.g. changing list-replace to list-append semantics). Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Three substantive fixes plus cleanup:
- Drop the @@OVERRIDES tail from sbatch_template.sbatch. Previously --override tokens were both merged into the resolved YAML AND appended to the srun command line, applying the overrides twice. For scalar overrides this happened to be idempotent; for list overrides it would have silently duplicated entries.
- Make --print-script and --print-resolved-config imply skip-submission. Previously running submit_benchmark_job.py with --print-resolved-config alone (no --dry-run) would still sbatch the job — a surprising foot-gun.
- Use the newly-public deep_merge from viscy_utils.compose; drop the duplicated copy from submit_benchmark_job.py.
- Change _apply_override to return the merged dict instead of mutating in place via clear()+update(). Simpler contract matching _deep_merge.
- Deduplicate the stat() call in ckpt_sha256_12 (Path.exists() followed by Path.stat() was two syscalls for one result).
- Strip stale "# Equivalent to examples/configs/..." comments from the 8 leaf YAMLs — the referenced path was moved to tools/LEGACY/ in an earlier commit.
- Clean up author-referencing narration comments ("matches Dihan's ...") — the code is the contract now.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Three confirmed review findings:
- Remove sys.path.insert from test_submit_benchmark_job.py (CLAUDE.md bans sys.path mutation). Replace with pytest pythonpath config in the workspace pyproject.toml pointing at applications/dynacell/tools so the test can import submit_benchmark_job.
- Make --dry-run the mode that writes the resolved YAML and sbatch to disk (previously nothing wrote files outside the real-submit path, which meant --dry-run rendered a path it never populated). --print-* flags are now documented as preview-only: stdout inspection, no disk writes, no submission.
- Drop redundant FileNotFoundError from the except tuple in ckpt_sha256_12 — it's an OSError subclass.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
…ayout + feature metrics
Rewrites fnet3d/unetvit3d/celldiff a549 scripts to use the flat
mantis_v1/test/{ORGANELLE}_{infection} layout and adds seg_cleaned
paths + compute_feature_metrics=true. Adds new scripts for unext2
(fcmae_vscyto3d_scratch) and vscyto3d (fcmae_vscyto3d_pretrained).
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
HCSDataModule._train_transform raised when batch_size % num_samples != 0 because HCSDataModule.train_dataloader divides batch_size by num_samples and would round down silently.
BatchedConcatDataModule overrides train_dataloader to use batch_size as-is — it loads N indices, each yielding num_samples patches per the child transform, so N indices * num_samples patches = effective per-step samples and divisibility is irrelevant. The check still fired during the child's setup_fit, blocking joint configs that intentionally pick (N, num_samples) pairs hitting a target sample count without divisibility (e.g. fnet3d_paper joint at bs=6, num_samples=8 → 48 samples per step, matching the single-set fnet3d's 48//8=6 indices * 8 patches behaviour).
BatchedConcatDataModule.setup now flags each child before delegating to ConcatDataModule.setup, and the check honors that flag. ConcatDataModule (the parent class) keeps the original constraint because *its* train_dataloader does divide.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
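The arithmetic from the commit message, worked as a few lines of Python:

```python
num_samples = 8        # patches yielded per loaded index by the child transform
single_set_batch = 48  # HCSDataModule batch_size
indices = single_set_batch // num_samples  # 48 // 8 = 6 loader indices
joint_batch = 6        # BatchedConcatDataModule uses batch_size as-is
assert indices == joint_batch              # same number of loaded indices
assert joint_batch * num_samples == 48     # 6 indices * 8 patches per step
# 6 % 8 != 0, so the old divisibility check rejected this valid joint config.
```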
Trim the inline comment on the gated divisibility check to a single line — the why was already covered by the helper docstring on BatchedConcatDataModule.setup, the longer block was rehashing what the surrounding code already shows. Extend the existing _make_dm helper in test_combined.py with optional augmentations + source/target channel overrides instead of inlining a closure that re-derives the same construction. Drops the redundant f-string assertion message; pytest already prints the failing value. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Mirrors the existing nucleus/fnet3d_paper/ipsc_confocal/predict__*.yml
layout into the sibling a549_mantis/ leaf so the a549-trained
checkpoint produced by job 31858491 (FNet3DPaper_A549_NUCL, completed
2026-05-03 with val_loss=0.2088 at epoch 293) has predict configs for
all four test sets used by the evaluation pipeline.
Outputs are suffixed _a549trained to keep them distinct from the
existing ipsc-trained predictions in dynacell/{ipsc,a549}/predictions/
so downstream eval configs can reference the pair side-by-side.
A549 leaves keep the dataset_ref.target=h2b override the ipsc-trained
versions use, since the a549 manifest keys nucleus by gene.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Document the BatchedConcatDataModule.train_dataloader override that uses batch_size as-is (vs HCSDataModule and the parent ConcatDataModule which divide by train_patches_per_stack), the resulting joint.batch_size = single_set.batch_size / num_samples sizing rule, and the divisibility check that is now suppressed for joint children (commit 5a2a346).
Captured here because I misread this in two separate review passes during the fnet3d joint resubmission and shipped a wrong "FCMAE joint is 4x undersized" claim before re-tracing the dataloader code; the table + verified examples should prevent the same mistake on the next pass.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
predict_local_{a549,ipsc}.sh hardcoded ipsc_confocal as the train set,
so a549-trained or joint-trained models could not reuse them — the
new a549-trained nucleus fnet3d_paper predict configs (commit
4951fc0) had no driver. Collapse both into one
predict_local.sh that takes <organelle> <model> <train_set> <test_set>
and accepts shorthand (ipsc | a549 | joint), with the same compose +
ckpt-existence preflight + parallel batching.
Single-leaf (test_set=ipsc) and per-plate (test_set=a549, glob over
the 3 a549_mantis treatments) paths unify cleanly because the leaf
discovery is just a glob; the existing parallel batcher handles N=1
as a degenerate case.
Old scripts removed per the project's "avoid backwards compatibility"
convention (CLAUDE.md). No callers in applications/, docs/, or
adjacent tools/ scripts.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Translates between code names (used in YAML config keys, prediction zarr filenames, eval pipeline keys, W&B run names) and paper / display names (UNeXt2, VSCyto3D, FNet3D, UNetViT3D, CELL-Diff). Also explains the eval-pipeline directory naming convention which keys directories by paper name (eval_unext2_membrane → predictions from fcmae_vscyto3d_scratch, eval_vscyto3d_membrane → fcmae_vscyto3d_pretrained), so cross-references between training configs and eval results are unambiguous. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
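An illustrative mapping of the translation the doc covers; the pairs are inferred from the commit message and the eval-directory examples, and the dict form is an assumption, not the doc's actual format:

```python
# Hypothetical code-name -> paper/display-name table (inferred pairs):
CODE_TO_PAPER = {
    "fcmae_vscyto3d_scratch": "UNeXt2",
    "fcmae_vscyto3d_pretrained": "VSCyto3D",
    "fnet3d": "FNet3D",
    "unetvit3d": "UNetViT3D",
    "celldiff": "CELL-Diff",
}
# Eval directories are keyed by paper name (lower-cased), per the commit:
#   eval_unext2_membrane   -> predictions from fcmae_vscyto3d_scratch
#   eval_vscyto3d_membrane -> predictions from fcmae_vscyto3d_pretrained
```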
Single-GPU and 4-GPU DDP smoke variants for the fcmae_vscyto3d_scratch ER joint leaf, mirroring the existing celldiff and fnet3d_paper smoke patterns under the same er/.../joint_*/ tree. The 4-GPU variant was the probe that reproduced the heterogeneous-T mmap_preload bug (commit 10e5c16) on a small SEC61B_test48 + a549 SEC61B_all pair — keep it in tree as a regression check the next time someone touches prepare_data or BatchedConcatDataModule. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Operational script for delegating the 4 a549-only ER/MITO FCMAE trainings to a different submitter when the primary user's fairshare is pushing them hours-to-days into the future. The script checks that the venv exists and that HEAD contains the heterogeneous-T fix (dcfedfd) before submitting via submit_benchmark_job.py — without that commit the jobs crash again at prepare_data on the mixed-T SEC61B_all and TOMM20_all pools. The header documents the one-time per-user setup (clone, venv, wandb login, group-write access) and warns the primary submitter to scancel any prior pending submissions before this runs so they do not duplicate against the same run_root + wandb project. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Best val checkpoint from job 31822574 (epoch 119, loss/validate=0.2722, 146 epochs total — plateaued from ep 119 onward). Mirrors the existing nucleus/fnet3d_paper/a549_mantis predict pattern: ipsc-test leaf has no target override, a549 leaves override dataset_ref.target to gene-keyed `caax`. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Best val checkpoint from job 31858488 (epoch 281, loss/validate=0.3143, 289 epochs total — 7-epoch plateau). Mirrors the existing nucleus/fnet3d_paper/a549_mantis predict pattern: ipsc-test leaf has no target override, a549 leaves override dataset_ref.target to gene-keyed `caax`. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Best val checkpoint from job 31822562 (epoch 110, loss/validate=0.8345, 135 epochs total). Mirrors the existing membrane/fcmae_vscyto3d_scratch/ a549_mantis predict pattern: ipsc-test leaf has no target override, a549 leaves override dataset_ref.target to gene-keyed `h2b`. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Bring predict_batch.sh to parity with predict_local.sh: accept <train_set> and <test_set> args (with ipsc/a549/joint shorthand) instead of hardcoding ipsc-trained → A549-mantis prediction. Renamed from predict_all_a549.sh — the old name lies once test_set is variable. Verified with --dry-run on the original use case (er fnet3d_paper ipsc a549) and on cases that weren't possible before (nucleus fnet3d_paper a549 a549, ... a549 ipsc). submit_benchmark_batch.py itself was already test-set-agnostic; this is a wrapper-only change. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Track FOV name and timepoint per cell across CellProfiler, DINOv3, and DynaCLR embeddings, then save as .npz to embeddings/ under the eval output dir. Also redirects OUT_ROOT to evaluations_with_embeddings and enables ER organelle evals (disabling membrane/nucleus) across all 5 model eval scripts. Add viscy-data[mmap] extra to dynacell pyproject. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
…h size to 2
Add predict YAMLs for membrane and nucleus CellDiff across mock/denv/zikv infection variants. Reduce batch_size from 4 to 2 in the joint ipsc+confocal+a549+mantis train configs for both membrane and nucleus.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Best val checkpoint from job 31822558 (epoch 134, loss/validate=0.8142, 178 epochs total). Fills the last gap in the (Nucleus|Cell Membrane) × (F-net|UNeXt2 scratch|UNeXt2 FCMAE) a549-trained predict matrix; only the FCMAE Cell Membrane variant remains. ipsc-test leaf has no target override, a549 leaves override dataset_ref.target to gene-keyed `h2b`. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Adds a "Prediction zarr naming convention" section to dynacell/CLAUDE.md covering the iPSC / A549 / Joint training-set infix scheme and the historical SEC61B/TOMM20 double-underscore form. Settles the naming for forthcoming joint-trained predicts as `_jointtrained` (decided 2026-05-04). Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Predict configs for J31910356 (val 0.5716 @ ep132): VSCyto3D-pretrained FCMAE on a549_mantis sec61b. Covers iPSC test set + 3 a549 plates (mock/denv/zikv). Both manifests use sec61b natively — no dataset_ref override needed. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Predict configs for J31822521 (TIMEOUT @ ep102, val 0.6448 @ ep92): joint
iPSC+A549 UNeXt2 scratch on nucleus. Covers iPSC test set + 3 a549 plates
(mock/denv/zikv); a549 leaves override target to h2b for the gene-keyed
a549 manifest. Outputs land at nucl_fcmae_vscyto3d_scratch_jointtrained{,_<cond>}.zarr
per the _jointtrained convention.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Predict configs for J31822529 (finished @ 119 ep, val 0.3754 @ ep111): joint iPSC+A549 VSCyto3D-pretrained on cell membrane. Covers iPSC test set + 3 a549 plates (mock/denv/zikv); a549 leaves override target to caax for the gene-keyed a549 manifest. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Predict configs for J31822536 (finished @ 118 ep, val 0.3859 @ ep112): joint iPSC+A549 UNeXt2 scratch on cell membrane. Covers iPSC test set + 3 a549 plates (mock/denv/zikv); a549 leaves override target to caax. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Predict configs for J31910360 (cancelled @ ≥113 ep): A549-only UNeXt2 scratch on mito (TOMM20). Pinned to ep113 ckpt with val 0.7329 per the resume's re-evaluation in last.ckpt's best_model_score block. Carries a wandb-collision caveat: J31910360 (TOMM20) and J31910346 (SEC61B) landed on the same node gpu-f-5 within 9 seconds and the launcher's timestamp-derived wandb run id `20260502-204536` collided. The wandb dashboard for that run shows the TOMM20 display name but logged metrics align step-wise with SEC61B's training, not TOMM20's. The original-run wandb history is therefore unrecoverable for TOMM20; the on-disk Lightning trainer state is the source of truth. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Predict configs for J31910346 (completed cleanly @ ep199, val 0.6219 @ ep137): A549-only UNeXt2 scratch on ER (SEC61B). Carries the same wandb-collision caveat as the sibling TOMM20 configs: this SEC61B run shared wandb run id `20260502-204536` with J31910360 (TOMM20) on gpu-f-5 — the wandb history metrics actually belong to SEC61B (step-wise alignment confirms), but the dashboard display name was overwritten by TOMM20. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Predict configs for J31962520 (completed @ ep129 hitting max_steps, val 0.9709 @ ep126): joint iPSC+A549 F-net on nucleus. Covers iPSC test set + 3 a549 plates; a549 leaves override target to h2b. Note val late-stage drift (ep127→1.05, ep128→1.28) — best-val ckpt at ep126 used for predictions. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Predict configs for J31962519 (completed @ ep129 hitting max_steps, val 0.5759 @ ep116): joint iPSC+A549 F-net on cell membrane. Covers iPSC test set + 3 a549 plates; a549 leaves override target to caax. Note val drifted to 0.6751 at ep128 — best-val ckpt at ep116 used for predictions. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Predict configs for J31965119 (completed @ ep392 hitting max_steps, val 0.8291 @ ep248): A549-only F-net on mito (TOMM20). Final val drifted to 0.9493 — best-val ckpt at ep248 used for predictions. Both iPSC and a549 manifests use tomm20; no dataset_ref override needed. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
…r all models
Add mix-trained evaluation scripts for celldiff, vscyto3d, unext2, and fnet3d across iPSC and A549 (mock/denv/zikv) test sets, covering membrane and nucleus targets. Also add new CellDiff predict configs for movie/iPSC sets and fix nucleus CellDiff A549 output paths to use joint_predictions/.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Fills the two missing slots in the 6 train-set × 2 organelle FNet3D
membrane/nucleus eval matrix: joint→iPSC nucleus and joint→A549
nucleus (×3 infections). Predictions already existed under
{ipsc,a549}/joint_predictions/, but no eval scripts had been wired up.
Distinct from upstream's run_eval_mix_trained_a549_pred_nucleus_*.sh,
which run pixel-only (compute_feature_metrics=false, no seg path).
These compute all three metric tiers (pixel + mask + DynaCLR feature
metrics) for the supplementary table comparing FNet3D across training
pools.
Slurm runner submits the two scripts as a 2-task array so the iPSC
eval (1 condition) and the A549 eval (3 infections) run in parallel
on h100/h200/a100.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
…ics)
Adds the UNeXt2 (fcmae_vscyto3d_scratch) counterpart to the FNet3D
joint-trained membrane evals, completing the model coverage needed for
the 3-pool × 2-test comparison table in the paper appendix.
UNeXt2 joint membrane predictions already existed; only the eval
scripts were missing. Same shape as the joint nucleus runner: a 2-task
slurm array — task 0 = iPSC test, task 1 = A549 test × {mock, denv,
zikv} — with full metric tiers (pixel + mask + DynaCLR features).
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Summary
Makes applications/dynacell into the reusable dynacell package in VisCy. Splits config responsibilities more cleanly:
- VisCy keeps the dynacell code plus generic model recipes/examples.
- Benchmark-instance configs move to dynacell-paper.
This PR also adds opt-in validation loss support for CellDiff flow-matching so benchmark reruns can log a scalar validation curve without forcing extra validation work on every flow-matching job.
What changed
Config and package reorg
- Move applications/dynacell/examples/configs/ into applications/dynacell/configs/.
- Split configs/recipes/ for reusable fragments and configs/examples/ for generic runnable examples.
- Remove lightning_logs/, SEC61B paper configs, and the hard-coded hcs_sec61b_3d.yml recipe.
Absorb reusable benchmark modules from dynacell-paper
- dynacell.data with path-based manifest/collection/spec loaders and Pydantic schemas.
- dynacell.reporting with Hydra config, tables, figures, and tests.
- dynacell.evaluation with configs, spectral-PCC tooling, I/O helpers, segmentation, metrics, and tests.
- dynacell.preprocess with the reusable config.py and zarr_utils.py utilities.
Package and CLI shape
- Rework the dynacell CLI so evaluate and report route to Hydra entry points while fit/predict/test/validate still go through viscy_utils.
- Move _configs inside the Python package so installed dynacell evaluate/report commands resolve config from package data instead of the source tree.
- Make dynacell.__init__ lazy so lightweight imports do not pull in the full training stack.
- Add optional extras for eval, report, and preprocess, plus clearer missing-extra install hints from the CLI router.
Runtime, training, and transform fixes
- Add overwrite= support to HCSPredictionWriter, including explicit handling/logging when prediction channels already exist.
- Fix VisCyCLI checkpoint hparam precedence so predict/test/validate honor user config over stale checkpoint values, while fit resume still keeps checkpoint training hparams.
- Add init-time ckpt_path loading to DynacellFlowMatching so predict-time settings come from config rather than checkpoint metadata.
- Add opt-in compute_validation_loss to DynacellFlowMatching. When enabled it logs loss/val/<idx> and aggregate loss/validate while preserving epoch-end ODE sample generation; when disabled it keeps the previous cheaper validation path.
- Share an _aggregate_validation_losses() helper so UNet and flow-matching use the same weighted aggregation logic (see the sketch after this list).
- Extend BatchedRandAffined with configurable padding_mode, safe_crop_size, and safe_crop_coverage, plus warnings/tests around unsupported X/Y-rotation guarantees.
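A minimal sketch of a count-weighted aggregation like the _aggregate_validation_losses() helper named above; the signature and body are assumptions, not the PR's implementation:

```python
import torch

def _aggregate_validation_losses(losses: list[torch.Tensor], counts: list[int]) -> torch.Tensor:
    # Weight each dataloader's mean loss by its sample count so the
    # aggregate loss/validate is a per-sample mean, not a mean of means.
    weighted = torch.stack([loss * n for loss, n in zip(losses, counts)])
    return weighted.sum() / sum(counts)

print(_aggregate_validation_losses([torch.tensor(0.5), torch.tensor(1.0)], [30, 10]))  # 0.625
```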
Evaluation and reporting follow-up fixes
- Fix corr_coef/PCC computation and add regression tests.
- Make cached evaluate_model() returns consistent with fresh evaluation output types.
- Honor use_gpu in evaluation instead of always taking the CUDA path when visible.
- Raise FileNotFoundError for missing preprocess configs instead of failing less clearly.
Benchmark schema + launcher (landed earlier on this branch)
- Add the configs/benchmarks/virtual_staining/ layout with shared axes (train_sets/, targets/, model_overlays/, launcher_profiles/, predict_sets/) and per-leaf train/predict files composed via viscy_utils.compose.load_composed_config.
- Add applications/dynacell/tools/submit_benchmark_job.py with --dry-run, --print-script, --print-resolved-config, and --override. It composes a leaf, strips the launcher: + benchmark: reserved keys, freezes the resolved config to {run_root}/resolved/, and renders an sbatch script to {run_root}/slurm/.
- Widen _maybe_compose_config in viscy_utils/cli.py so direct dynacell fit -c <leaf> also strips those reserved keys.
- Add a ckpt_sha256_12 sidecar to the evaluation cache so repeated lookups skip full-file hashing.
Topology / trainer-recipe ownership cleanup
Four commits (9734d07 → f9b8f1e → 5b7eaae → 3fdb7cf) untangle three layers that were previously mingled in recipes/trainer/fit_*.yml:
- runtime_single_gpu.yml → runtime_shared.yml: the file's content (srun + PYTHONUNBUFFERED/NCCL_DEBUG/PYTHONFAULTHANDLER env vars) has nothing single-GPU-specific, and every leaf — including the 4-GPU UNeXt2 leaf — composed it. Rename + 14 leaf-base updates + schema-doc references.
- A new topology layer (recipes/topology/single_gpu.yml + ddp_4gpu.yml) owns trainer.accelerator/devices/strategy/num_nodes. Duplicated independently under applications/dynacell/configs/recipes/topology/ and applications/cytoland/examples/configs/recipes/topology/ because CLAUDE.md forbids cross-application imports.
- One mode recipe pair (recipes/trainer/fit.yml + predict.yml) per app, replacing fit_1gpu/fit_4gpu/fit_fm_4gpu/predict_gpu. Mode invariants only (seed, log cadence, callbacks, WandbLogger class pinned with per-app project). Topology and precision are owned by the topology recipes and model overlays respectively. Every benchmark model overlay and example leaf now composes [fit.yml + topology/*.yml] and sets precision + max_epochs explicitly.
- Hardware profiles (hardware_h200_single, hardware_gpu_any_long, hardware_4gpu) drop their trainer: block; they now own only launcher.sbatch.*.
- Behavior deltas: strategy: ddp → auto on every single-GPU leaf (Lightning-equivalent at devices=1, just intent-honest); dynacell examples fnet3d/unetvit3d/unext2/fit.yml and benchmark unext2.yml now pin WandbLogger project=dynacell instead of falling through to Lightning's default TensorBoard; fit_1gpu-derived paths move from save_top_k: 4 to save_top_k: 5.
- Preserved behavior: precision: 32-true; examples/celldiff/fit.yml keeps FM-style epoch-based checkpointing (save_top_k: -1, every_n_epochs: 10, no monitor); cytoland FCMAE pretraining keeps strategy: ddp_find_unused_parameters_true; cytoland→dynacell bridge configs keep project: dynacell override over cytoland's default.
- applications/dynacell/tools/LEGACY/examples_configs/ (pre-schema CellDiff configs) is removed outright per CLAUDE.md's "avoid backwards-compatibility hacks" rule. The equivalence tests that composed LEGACY (test_*_leaf_matches_legacy, test_fnet3d_paper_leaf_matches_ran_config, test_byte_equivalence_sec61b_train_leaf) are replaced with forward-looking composition sanity tests.
- Plan at ~/.claude/plans/vectorized-sleeping-clock.md; commits ran through /review-plan and /verify-plan subagent rounds before landing.
Dependency and packaging updates
- Pin microssim to a compatible Git revision and refresh uv.lock so the new dynacell[eval] extra resolves with the current workspace.
Why
This is the VisCy side of the dynacell-paper split:
- dynacell in VisCy becomes the reusable runtime.
- dynacell-paper owns benchmark-instance configs, orchestration, and manuscript-specific assets.
That gives us a cleaner boundary for future work like multi-experiment collections and benchmark-specific runners without keeping paper state in the shared application package.
The flow-matching validation-loss change is intentionally opt-in because it adds extra validation compute and the resulting metric is still stochastic; default flow-matching checkpointing remains epoch-based.
The topology / trainer-recipe cleanup at the end removes a silent-drop-DDP trap: before this change, pairing a mismatched trainer recipe and benchmark hardware profile let the hardware profile's trainer.devices win (both layers redundantly set it), which could silently run a DDP training on 1 GPU. Topology is now the single source of truth.
Verification
- uvx ruff check applications/dynacell/
- uvx ruff format --check applications/dynacell/
- .venv/bin/python -m pytest applications/dynacell/tests -q
- .venv/bin/python -m pytest applications/dynacell/tests/test_data_manifests.py applications/dynacell/tests/test_reporting_tables.py applications/dynacell/tests/test_reporting_tables_extended.py applications/dynacell/tests/test_reporting_figures.py applications/dynacell/tests/test_evaluation_metrics.py applications/dynacell/tests/test_evaluation_pipeline.py applications/dynacell/tests/test_evaluation_io.py applications/dynacell/tests/test_preprocess_config.py applications/dynacell/tests/test_preprocess_zarr_utils.py applications/dynacell/tests/test_cli_routing.py -q
- .venv/bin/python -m pytest applications/dynacell/tests/test_engine.py -q
- .venv/bin/python -m pytest applications/dynacell/tests/test_training_integration.py::test_celldiff_fm_validation_loss_keeps_generation applications/dynacell/tests/test_training_integration.py::test_celldiff_fm_warmup_cosine_fast_dev_run applications/dynacell/tests/test_training_integration.py::test_celldiff_fm_constant_schedule_fast_dev_run -q
- .venv/bin/dynacell fit --print_config -c applications/dynacell/configs/examples/unext2/fit.yml
- uv run pytest applications/dynacell/tests/test_benchmark_config_composition.py applications/dynacell/tests/test_submit_benchmark_job.py -v (benchmark schema + submit-tool tests)
- Composed-config diff review for the topology cleanup (ddp → auto at devices=1, WandbLogger added where previously default TB).
- uv run dynacell fit -c <benchmark-unext2-leaf> --trainer.fast_dev_run=true --trainer.devices=1 --trainer.strategy=auto ran one training step cleanly on an A40 interactive node (loss 1.048).
- submit_benchmark_job.py <benchmark-unext2-leaf> queued a real 4-GPU DDP job (31122607) via the new schema path.