diff --git a/.envrc b/.envrc new file mode 100644 index 000000000..5ee5204cc --- /dev/null +++ b/.envrc @@ -0,0 +1,3 @@ +export CUDA_PATH=/hpc/apps/cuda/12.8.0_570.86.10 +export PATH=$CUDA_PATH/bin:$PATH +export LD_LIBRARY_PATH=$CUDA_PATH/lib64:${LD_LIBRARY_PATH:-} \ No newline at end of file diff --git a/.github/workflows/test.yml b/.github/workflows/test.yml index 78f82e681..f475f7e7f 100644 --- a/.github/workflows/test.yml +++ b/.github/workflows/test.yml @@ -118,10 +118,33 @@ jobs: run: uv run --frozen pytest working-directory: applications/${{ matrix.application }} + test-dynacell-configs: + name: Test dynacell benchmark configs (Python 3.13, ubuntu-latest) + runs-on: ubuntu-latest + + steps: + - name: Checkout repository + uses: actions/checkout@v5 + + - name: Set up uv with Python 3.13 + uses: astral-sh/setup-uv@v7 + with: + python-version: "3.13" + enable-cache: true + cache-suffix: ubuntu-latest-3.13 + + - name: Install minimal dynacell (base deps + test group) + run: uv sync --frozen --group test + working-directory: applications/dynacell + + - name: Run benchmark-schema + submit-tool tests + run: uv run --frozen pytest tests/test_benchmark_config_composition.py tests/test_submit_benchmark_job.py -v + working-directory: applications/dynacell + check: name: All tests pass if: always() - needs: [test, test-data, test-data-extras, test-applications] + needs: [test, test-data, test-data-extras, test-applications, test-dynacell-configs] runs-on: ubuntu-latest steps: - name: Verify all test jobs succeeded diff --git a/.gitignore b/.gitignore index 699fa5be0..6fb890029 100644 --- a/.gitignore +++ b/.gitignore @@ -66,3 +66,7 @@ slurm*.out lightning_logs/ # NOTE: uv.lock is NOT ignored - it should be tracked for reproducibility + +checkpoints/ + +plot_related/ diff --git a/CLAUDE.md b/CLAUDE.md index 36847c0a6..832ac5707 100644 --- a/CLAUDE.md +++ b/CLAUDE.md @@ -1,16 +1,10 @@ -# CLAUDE.md +# VisCy — Claude Code Reference -Project-specific instructions for Claude 
Code sessions in this repository. +## Project -## Git Workflow -- **NEVER** use `git commit --amend` or `git push --force` / `--force-with-lease` unless the user explicitly requests it. Always create NEW commits. -- ALWAYS use atomic commits: one logical change per commit. Never bundle unrelated changes. -- Never use `git add -A` or `git add .`. Always stage specific files by name. -- Always pull before pushing. If push is rejected, pull and retry — never force-push. - -## Repository Structure +VisCy is a **uv workspace monorepo** for virtual staining and computational microscopy. Sub-packages live under `packages/`. -VisCy is a **uv workspace monorepo**. Sub-packages live under `packages/`: +## Repo Layout ``` pyproject.toml # Root config (ruff, pytest, uv workspace) @@ -28,51 +22,115 @@ applications/ # Self-contained research applications - **Applications must not import from each other.** If two applications need the same logic, move it to an existing package or create a new one. - Applications are consumers of packages — the dependency graph always flows `applications/ → packages/`, never sideways. -## Code Style +--- + +## Development +### Environment Setup -## Testing +Use `uv` package manager. Run commands with `uv run `. Edit `pyproject.toml` to modify dependencies and sync to update `uv.lock`. ```sh -uv run pytest # all tests -uv run pytest packages/viscy-data/ # single package (data) -uv run pytest packages/viscy-models/ # single package (models) +uv venv -p 3.13 +uv sync --all-packages --all-extras ``` -## Common Commands +If `uv` is not installed: +```sh +curl -LsSf https://astral.sh/uv/install.sh | sh +``` +On HPC, symlink the uv cache out of your home directory first: ```sh -uvx ruff check packages/ # lint +mkdir -p /hpc/mydata/firstname.lastname/.cache/uv && ln -s /hpc/mydata/firstname.lastname/.cache/uv ~/.cache/uv +``` + +For full setup instructions (installing uv, creating a venv, syncing dependencies), see [CONTRIBUTING.md](./CONTRIBUTING.md). 
+ +### SLURM scripts for Lightning DDP jobs + +When hand-writing `.slurm` scripts that launch Lightning via `srun`, always use `--ntasks-per-node=N` (not `--ntasks=N`). Lightning's `SLURMEnvironment` validates `SLURM_NTASKS_PER_NODE` at trainer init and raises `RuntimeError: You set --ntasks=N in your SLURM bash script, but this variable is not supported. HINT: Use --ntasks-per-node=N instead.` — the job then dies seconds into the allocation. + +Invariant: `#SBATCH --ntasks-per-node=N` must equal `trainer.devices` in the YAML config and `#SBATCH --gpus=N` (single-node) or `#SBATCH --gpus-per-node=N` (multi-node). + +The dynacell launcher (`applications/dynacell/tools/submit_benchmark_job.py`) already emits `--ntasks-per-node` correctly; this note is for hand-written scripts (e.g., `applications/cytoland/examples/configs/*/run_*.slurm`). + +### Joint vs single-set training batch semantics + +`HCSDataModule` and `BatchedConcatDataModule` produce the same number of GPU samples per training step — but the YAML `batch_size` value that gets there is **different by a factor of `num_samples`**. Easy to misread either by skimming. + +| DataModule | `train_dataloader` divides by `num_samples`? | Samples per step | +|---|---|---| +| `HCSDataModule` (single-set) | yes (`hcs.py` `train_dataloader`) | `batch_size` | +| `ConcatDataModule` (parent class) | yes (`combined.py` `train_dataloader`) | `batch_size` | +| `BatchedConcatDataModule` (joint) | **no** (`combined.py` overrides; uses `batch_size` as-is) | `batch_size * num_samples` | + +To match the same effective per-step samples between a single-set and a joint config, **set `joint.batch_size = single_set.batch_size / num_samples`**. 
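
The divide-vs-don't-divide behavior above can be sketched in a few lines. This is an illustrative sketch, not repo code — the function name `effective_samples_per_step` is hypothetical; the real logic lives in the `train_dataloader` overrides in `hcs.py` and `combined.py`:

```python
def effective_samples_per_step(batch_size: int, num_samples: int, batched_concat: bool) -> int:
    """Samples that reach the GPU per training step (illustrative model, not repo code)."""
    if batched_concat:
        # BatchedConcatDataModule uses batch_size as-is, so crops multiply it.
        return batch_size * num_samples
    # HCSDataModule / ConcatDataModule divide batch_size by num_samples in
    # train_dataloader; each loader item then yields num_samples crops.
    return (batch_size // num_samples) * num_samples

# FCMAE numbers from the overlays: single-set 32/4 and joint 8/4 both give 32.
assert effective_samples_per_step(32, 4, batched_concat=False) == 32
assert effective_samples_per_step(8, 4, batched_concat=True) == 32
```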
+ +Examples (verified against the `applications/dynacell/configs/benchmarks/virtual_staining/_internal/shared/model/data_overlays/` overlays + their joint leaves): + +- FCMAE (`fcmae_vscyto3d_*`): single-set `batch_size: 32, num_samples: 4` → joint `batch_size: 8, num_samples: 4` → both yield **32 samples/step**. +- FNet3D (`fnet3d_paper`): single-set `batch_size: 48, num_samples: 8` → joint `batch_size: 6, num_samples: 8` → both yield **48 samples/step**. + +`HCSDataModule._train_transform` enforces `batch_size % num_samples == 0` for single-set use because `train_dataloader` would otherwise round down silently. The check is suppressed for `BatchedConcatDataModule` children via the `_is_batched_concat_child` flag set in the wrapper's `setup()` — joint configs are free to pick any `(batch_size, num_samples)` pair as long as the product is the desired sample count. **Do not** "fix" a joint config by raising `batch_size` to satisfy the divisibility rule; it would multiply effective samples by `num_samples`. + +When in doubt, read both `train_dataloader` overrides directly — they are short. Don't infer from comments alone. + +### Common Commands + +```sh +uvx ruff check packages/ # lint uvx ruff check --fix packages/ # lint + auto-fix uvx ruff format packages/ # format +uv run pytest # all tests ``` -## Code Style +### Testing + +```sh +uv run pytest # all tests +uv run pytest packages/viscy-data/ # single package (data) +uv run pytest packages/viscy-models/ # single package (models) +``` + +Prefer `{file}_test.py` in the same directory as `{file}.py`, unless there are import issues, in which case use `tests/`. + +--- + +## Project Conventions + +- Ruff config is centralized in the root `pyproject.toml` only. Sub-packages must NOT have their own `[tool.ruff.*]` sections. Ruff does not inherit config — any `[tool.ruff.*]` in a sub-package silently overrides the entire root config (including `lint.select`, `per-file-ignores`, etc.). 
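
As an illustration of the centralization rule above — a sketch of the shape of the root config, assembled from the Code Style section of this document (the exact root `pyproject.toml` may differ):

```toml
# Root pyproject.toml — the ONLY place [tool.ruff.*] sections may appear.
[tool.ruff]
line-length = 120

[tool.ruff.lint]
select = ["D", "E", "F", "I", "NPY", "PD", "W"]

[tool.ruff.lint.pydocstyle]
convention = "numpy"

[tool.ruff.lint.per-file-ignores]
"**/tests/**" = ["D"]   # docstring rules ignored in tests (and notebooks)
```

Adding any `[tool.ruff.*]` table to a sub-package `pyproject.toml` — even an empty one — makes ruff ignore all of the above for files in that package.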
+- Run `uvx prek run --files {files_you_edited}` (unless the change was simple) and fix typing and linting errors. Use `# type: ignore` as needed. The precommit will give you type errors which is useful — especially to know if you have incorrect code — but for many minor changes it's better to do this after testing. Use a subagent to apply complex fixes. + +--- + +## Engineering Standards + +### Git Workflow + +- **NEVER** use `git commit --amend` or `git push --force` / `--force-with-lease` unless the user explicitly requests it. Always create NEW commits. +- ALWAYS use atomic commits: one logical change per commit. Never bundle unrelated changes. +- Never use `git add -A` or `git add .`. Always stage specific files by name. +- Always pull before pushing. If push is rejected, pull and retry — never force-push. + +### Code Style -### General -- **Ruff config is centralized in the root `pyproject.toml` only.** - Sub-packages must NOT have their own `[tool.ruff.*]` sections. - Ruff does not inherit config — any `[tool.ruff.*]` in a sub-package - silently overrides the entire root config (including `lint.select`, - `per-file-ignores`, etc.). - Docstrings use **numpy style** (`convention = "numpy"`). - Lint rules: `D, E, F, I, NPY, PD, W`. - `D` rules are ignored in `**/tests/**` and notebooks. - Format: double quotes, spaces, 120 char line length. -- Prefer {file}_test.py in the same directory as {file}.py, unless there are import issues, in which case use tests/... -- Run `uvx prek run --files {files_you_editted}` (unless the change was simple) and fix typing and linting errors, you make `# type: ignore` as needed. - The precommit will give you type errors which is nice - especially to know if you have incorrect code - but for many minor changes it's better to do this after testing. - Use a subagent to apply complex fixes. -- Use a subagent to run tests and complex bash commands, especially that which you think will return complex output. 
+- Use a subagent to run tests and complex bash commands, especially those expected to return complex output. -### Avoid Backwards Compatibility -In most cases it is incorrect to maintain backwards compatibility with a previous pipeline. This is a research codebase - changes are expected and encouraged. Keeping backwards compatibility risks MORE bugs, since someone can unknowingly run old code. +#### Avoid Backwards Compatibility + +In most cases it is incorrect to maintain backwards compatibility with a previous pipeline. This is a research codebase — changes are expected and encouraged. Keeping backwards compatibility risks MORE bugs, since someone can unknowingly run old code. If you believe it is important to maintain backwards compatibility, explicitly ask the user if you should do so during the planning stage. If the user says no, then do not maintain backwards compatibility. Delete and remove old code that is not used. -### Use Context Managers for Resources +#### Use Context Managers for Resources + Always use context managers (`with` statements) when opening external resources like zarr stores, files, or database connections. Never assign them to a variable without a context manager — this leaks file handles and locks. ```python @@ -84,95 +142,76 @@ with open_ome_zarr(path, mode="r") as plate: plate = open_ome_zarr(path, mode="r") ``` -### Prefer Raising Errors -In general, prefer raising errors instead of silently catching them. Errors are good and warn us of issues in the script. For example, prefer `value = my_dictionary['key']` over `value = my_dictionary.get('key')` since the former will raise a `KeyError` to signal that the underlying data is not behaving as expected. +#### Prefer Raising Errors + +Prefer raising errors instead of silently catching them. Errors are good and warn us of issues. 
For example, prefer `value = my_dictionary['key']` over `value = my_dictionary.get('key')` since the former will raise a `KeyError` to signal that the underlying data is not behaving as expected. Only catch errors when there is a good reason to do so: for example, catching HTTP errors in order to retry a request. If you find yourself writing an if statement, fallback, or except statement designed to avoid errors, ask yourself if it would be better to raise the error as a signal to the user. +#### Use Real Integration Tests -### Use Real Integration Tests -Tests should directly *import* the actual code we are trying to test. For example, if you are trying to test `my_function` on some sample data, your test should directly import `my_function` and run it on the sample data. AVOID testing "key behavior" or components of the pipeline, since this can miss bugs. +Tests should directly *import* the actual code we are trying to test. For example, if you are trying to test `my_function` on some sample data, your test should directly import `my_function` and run it on the sample data. Avoid testing "key behavior" or components in isolation when an integration test would catch more bugs. Ask yourself if your test is actually covering the true function. -### Imports -- Import at the top of the file. Don't use inline imports without strong reason. -- Use absolute imports (`from projects.my_directory.my_file`) instead of relative. -- Do not modify `sys.path` for imports. - -## Development Environment - -### Environment -Use `uv` package manager. Run commands with `uv run `. Edit `pyproject.toml` to modify dependencies and sync to update `uv.lock` - -For full setup instructions (installing uv, creating a venv, syncing dependencies), see [CONTRIBUTING.md](./CONTRIBUTING.md). 
- -Quick start: -```sh -uv venv -p 3.13 -uv sync --all-packages --all-extras -uv run pytest -``` +#### Imports -If `uv` is not installed: -```sh -curl -LsSf https://astral.sh/uv/install.sh | sh -``` +- Import at the top of the file. No inline imports without strong reason. +- Use absolute imports (`from packages.my_directory.my_file`) instead of relative. +- Do not modify `sys.path` for imports. -On HPC, symlink the uv cache out of your home directory first: -```sh -mkdir -p /hpc/mydata/firstname.lastname/.cache/uv && ln -s /hpc/mydata/firstname.lastname/.cache/uv ~/.cache/uv -``` +### Coding Philosophy -## Coding +#### 1. Think Before Coding -1. Think Before Coding Don't assume. Don't hide confusion. Surface tradeoffs. Before implementing: +- State your assumptions explicitly. If uncertain, ask. +- If multiple interpretations exist, present them — don't pick silently. +- If a simpler approach exists, say so. Push back when warranted. +- If something is unclear, stop. Name what's confusing. Ask. + +#### 2. Simplicity First -State your assumptions explicitly. If uncertain, ask. -If multiple interpretations exist, present them - don't pick silently. -If a simpler approach exists, say so. Push back when warranted. -If something is unclear, stop. Name what's confusing. Ask. -2. Simplicity First Minimum code that solves the problem. Nothing speculative. -No features beyond what was asked. -No abstractions for single-use code. -No "flexibility" or "configurability" that wasn't requested. -No error handling for impossible scenarios. -If you write 200 lines and it could be 50, rewrite it. -Ask yourself: "Would a senior engineer say this is overcomplicated?" If yes, simplify. +- No features beyond what was asked. +- No abstractions for single-use code. +- No "flexibility" or "configurability" that wasn't requested. +- No error handling for impossible scenarios. +- If you write 200 lines and it could be 50, rewrite it. 
+- Ask yourself: "Would a senior engineer say this is overcomplicated?" If yes, simplify. + +#### 3. Surgical Changes -3. Surgical Changes Touch only what you must. Clean up only your own mess. When editing existing code: +- Don't "improve" adjacent code, comments, or formatting. +- Don't refactor things that aren't broken. +- Match existing style, even if you'd do it differently. +- If you notice unrelated dead code, mention it — don't delete it. -Don't "improve" adjacent code, comments, or formatting. -Don't refactor things that aren't broken. -Match existing style, even if you'd do it differently. -If you notice unrelated dead code, mention it - don't delete it. When your changes create orphans: +- Remove imports/variables/functions that YOUR changes made unused. +- Don't remove pre-existing dead code unless asked. + +The test: every changed line should trace directly to the user's request. -Remove imports/variables/functions that YOUR changes made unused. -Don't remove pre-existing dead code unless asked. -The test: Every changed line should trace directly to the user's request. +#### 4. Goal-Driven Execution -4. Goal-Driven Execution Define success criteria. Loop until verified. Transform tasks into verifiable goals: +- "Add validation" → "Write tests for invalid inputs, then make them pass" +- "Fix the bug" → "Write a test that reproduces it, then make it pass" +- "Refactor X" → "Ensure tests pass before and after" -"Add validation" → "Write tests for invalid inputs, then make them pass" -"Fix the bug" → "Write a test that reproduces it, then make it pass" -"Refactor X" → "Ensure tests pass before and after" For multi-step tasks, state a brief plan: - 1. [Step] → verify: [check] 2. [Step] → verify: [check] -3. [Step] → verify: [check] + Strong success criteria let you loop independently. Weak criteria ("make it work") require constant clarification. 
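
The SLURM invariant from the "SLURM scripts for Lightning DDP jobs" section above can be sketched as a minimal sbatch header. Job name, partition, and config path are hypothetical; the fixed point is that `--ntasks-per-node` and `--gpus` both equal `trainer.devices` (here 4):

```sh
#!/bin/bash
# Hypothetical sbatch header for a single-node 4-GPU Lightning DDP job.
#SBATCH --job-name=example_ddp        # hypothetical name
#SBATCH --nodes=1
#SBATCH --ntasks-per-node=4           # NOT --ntasks=4: SLURMEnvironment raises RuntimeError
#SBATCH --gpus=4                      # single-node form; multi-node would use --gpus-per-node=4

# Must match trainer.devices: 4 in the YAML (e.g. recipes/topology/ddp_4gpu.yml).
srun uv run python -m cytoland fit --config fit.yml   # illustrative config path
```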
diff --git a/applications/airtable/pyproject.toml b/applications/airtable/pyproject.toml index ebd5a173c..9b11bc441 100644 --- a/applications/airtable/pyproject.toml +++ b/applications/airtable/pyproject.toml @@ -8,14 +8,13 @@ description = "Interface to the Computational Imaging Airtable database" keywords = [ "airtable", "metadata", "microscopy", "zarr" ] license = "BSD-3-Clause" authors = [ { name = "Biohub", email = "compmicro@czbiohub.org" } ] -requires-python = ">=3.11" +requires-python = ">=3.12" classifiers = [ "Development Status :: 3 - Alpha", "Intended Audience :: Science/Research", "License :: OSI Approved :: BSD License", "Operating System :: OS Independent", "Programming Language :: Python :: 3 :: Only", - "Programming Language :: Python :: 3.11", "Programming Language :: Python :: 3.12", "Programming Language :: Python :: 3.13", "Programming Language :: Python :: 3.14", diff --git a/applications/cytoland/examples/configs/dynacell/fit_fnet3d_sec61b.yml b/applications/cytoland/examples/configs/dynacell/fit_fnet3d_sec61b.yml index d354416d6..c3b8ff259 100644 --- a/applications/cytoland/examples/configs/dynacell/fit_fnet3d_sec61b.yml +++ b/applications/cytoland/examples/configs/dynacell/fit_fnet3d_sec61b.yml @@ -5,7 +5,8 @@ # Batch related launches with: # export VISCY_WANDB_LAUNCH=20260401-augfix-r1 base: - - ../recipes/trainer/fit_1gpu.yml + - ../recipes/trainer/fit.yml + - ../recipes/topology/single_gpu.yml - ../recipes/data/hcs_sec61b_3d.yml - ../recipes/models/fnet3d_z8.yml @@ -20,9 +21,12 @@ model: schedule: WarmupCosine trainer: + precision: bf16-mixed max_epochs: 100 logger: init_args: + # Override cytoland's default project: this bridge trains on a dynacell dataset (iPSC SEC61B). 
+ project: dynacell name: FNet3D_iPSC_SEC61B save_dir: /hpc/projects/comp.micro/virtual_staining/models/dynacell_cytoland/ipsc/sec61b/fnet3d callbacks: diff --git a/applications/cytoland/examples/configs/dynacell/fit_vscyto3d_sec61b.yml b/applications/cytoland/examples/configs/dynacell/fit_vscyto3d_sec61b.yml index 57e26577c..2e5b2e129 100644 --- a/applications/cytoland/examples/configs/dynacell/fit_vscyto3d_sec61b.yml +++ b/applications/cytoland/examples/configs/dynacell/fit_vscyto3d_sec61b.yml @@ -5,7 +5,8 @@ # Batch related launches with: # export VISCY_WANDB_LAUNCH=20260401-augfix-r1 base: - - ../recipes/trainer/fit_1gpu.yml + - ../recipes/trainer/fit.yml + - ../recipes/topology/single_gpu.yml - ../recipes/data/hcs_sec61b_3d.yml - ../recipes/models/unext2_3d_z8.yml @@ -20,9 +21,12 @@ model: schedule: WarmupCosine trainer: + precision: bf16-mixed max_epochs: 100 logger: init_args: + # Override cytoland's default project: this bridge trains on a dynacell dataset (iPSC SEC61B). + project: dynacell name: VSCyto3D_iPSC_SEC61B save_dir: /hpc/projects/comp.micro/virtual_staining/models/dynacell_cytoland/ipsc/sec61b/vscyto3d callbacks: diff --git a/applications/cytoland/examples/configs/fnet3d/fit.yml b/applications/cytoland/examples/configs/fnet3d/fit.yml index c5b98c266..61df4e08b 100644 --- a/applications/cytoland/examples/configs/fnet3d/fit.yml +++ b/applications/cytoland/examples/configs/fnet3d/fit.yml @@ -3,7 +3,8 @@ # FNet3D: supervised training (Ounkomol et al. 2018). 
# Usage: python -m cytoland fit --config fnet3d/fit.yml base: - - ../recipes/trainer/fit_4gpu.yml + - ../recipes/trainer/fit.yml + - ../recipes/topology/ddp_4gpu.yml - ../recipes/data/hcs_nuc_mem_3d.yml - ../recipes/models/fnet3d.yml @@ -13,6 +14,8 @@ model: schedule: Constant trainer: + precision: 16-mixed + max_epochs: 200 max_steps: 50000 data: diff --git a/applications/cytoland/examples/configs/fnet3d/predict.yml b/applications/cytoland/examples/configs/fnet3d/predict.yml index 62f22e4ff..05466f236 100644 --- a/applications/cytoland/examples/configs/fnet3d/predict.yml +++ b/applications/cytoland/examples/configs/fnet3d/predict.yml @@ -3,7 +3,8 @@ # FNet3D: inference. # Usage: python -m cytoland predict --config fnet3d/predict.yml base: - - ../recipes/trainer/predict_gpu.yml + - ../recipes/trainer/predict.yml + - ../recipes/topology/single_gpu.yml - ../recipes/data/hcs_nuc_mem_3d.yml - ../recipes/models/fnet3d.yml diff --git a/applications/cytoland/examples/configs/recipes/data/hcs_a549_infected_d1_hummingbird.yml b/applications/cytoland/examples/configs/recipes/data/hcs_a549_infected_d1_hummingbird.yml new file mode 100644 index 000000000..d376cfc93 --- /dev/null +++ b/applications/cytoland/examples/configs/recipes/data/hcs_a549_infected_d1_hummingbird.yml @@ -0,0 +1,78 @@ +# Data recipe: A549 infection finetune — Hummingbird, 2026-01-29 (D1). +# Phase3D -> DAPI (nucleus) + TXR (membrane). Per-timepoint normalization. +data: + class_path: viscy_data.hcs.HCSDataModule + init_args: + data_path: /hpc/projects/virtual_staining/training/a549/2026_05_infected_cell/2026_01_29_A549_H2B_CAAX_DAPI_DENV_ZIKV.zarr + source_channel: Phase3D + target_channel: [DAPI_Density3D, TXR_Density3D] + z_window_size: 20 + split_ratio: 0.8 + batch_size: 16 + num_workers: 8 + persistent_workers: true + mmap_preload: true + yx_patch_size: [384, 384] + # CPU: normalize, then weighted-sample 4 crops per FOV at full Z depth + # (20, 600, 600). 
Weighting by the DAPI (nucleus) channel biases crops + # to cell-dense regions. Matches the dynacell fcmae_vscyto3d recipe. + augmentations: + - class_path: viscy_transforms.RandWeightedCropd + init_args: + keys: [Phase3D, DAPI_Density3D, TXR_Density3D] + w_key: DAPI_Density3D + spatial_size: [20, 600, 600] + num_samples: 4 + normalizations: + - class_path: viscy_transforms.NormalizeSampled + init_args: + keys: [Phase3D] + level: timepoint_statistics + subtrahend: mean + divisor: std + - class_path: viscy_transforms.NormalizeSampled + init_args: + keys: [DAPI_Density3D, TXR_Density3D] + level: timepoint_statistics + subtrahend: median + divisor: iqr + gpu_augmentations: + - class_path: viscy_transforms.BatchedRandAffined + init_args: + keys: [source, target] + prob: 0.8 + rotate_range: [3.14, 0, 0] + shear_range: [0.0, 0.05, 0.05] + scale_range: [[0.7, 1.3], [0.5, 1.5], [0.5, 1.5]] + - class_path: viscy_transforms.BatchedCenterSpatialCropd + init_args: + keys: [source, target] + roi_size: [15, 384, 384] + - class_path: viscy_transforms.BatchedRandAdjustContrastd + init_args: + keys: [source] + prob: 0.5 + gamma: [0.8, 1.2] + - class_path: viscy_transforms.BatchedRandScaleIntensityd + init_args: + keys: [source] + prob: 0.5 + factors: 0.5 + - class_path: viscy_transforms.BatchedRandGaussianNoised + init_args: + keys: [source] + prob: 0.5 + mean: 0.0 + std: 0.3 + - class_path: viscy_transforms.BatchedRandGaussianSmoothd + init_args: + keys: [source] + prob: 0.5 + sigma_x: [0.25, 0.75] + sigma_y: [0.25, 0.75] + sigma_z: [0.25, 0.75] + val_gpu_augmentations: + - class_path: viscy_transforms.BatchedCenterSpatialCropd + init_args: + keys: [source, target] + roi_size: [15, 384, 384] diff --git a/applications/dynacell/examples/configs/recipes/data/hcs_sec61b_3d.yml b/applications/cytoland/examples/configs/recipes/data/hcs_a549_infected_d2_hummingbird.yml similarity index 50% rename from applications/dynacell/examples/configs/recipes/data/hcs_sec61b_3d.yml rename to 
applications/cytoland/examples/configs/recipes/data/hcs_a549_infected_d2_hummingbird.yml index a7b87b7d7..0cd2c5f7e 100644 --- a/applications/dynacell/examples/configs/recipes/data/hcs_sec61b_3d.yml +++ b/applications/cytoland/examples/configs/recipes/data/hcs_a549_infected_d2_hummingbird.yml @@ -1,65 +1,75 @@ -# Data recipe: HCSDataModule for Phase3D -> Structure (SEC61B), 3D (z=8). -# Uses mean/std (source) and median/iqr (target) normalization with GPU-side Batched* augmentations. +# Data recipe: A549 infection finetune — Hummingbird, 2026-03-10 (D2). +# Phase3D -> DAPI (nucleus) + TXR (membrane). Per-timepoint normalization. data: class_path: viscy_data.hcs.HCSDataModule init_args: - data_path: /hpc/projects/virtual_staining/training/dynacell/ipsc/dataset_v4/train/SEC61B.zarr + data_path: /hpc/projects/virtual_staining/training/a549/2026_05_infected_cell/2026_03_10_A549_H2B_CAXX_DAPI_DENV_ZIKV.zarr source_channel: Phase3D - target_channel: Structure - z_window_size: 8 + target_channel: [DAPI_Density3D, TXR_Density3D] + z_window_size: 20 + split_ratio: 0.8 + batch_size: 16 num_workers: 8 - yx_patch_size: [512, 512] + persistent_workers: true + mmap_preload: true + yx_patch_size: [384, 384] + augmentations: + - class_path: viscy_transforms.RandWeightedCropd + init_args: + keys: [Phase3D, DAPI_Density3D, TXR_Density3D] + w_key: DAPI_Density3D + spatial_size: [20, 600, 600] + num_samples: 4 normalizations: - class_path: viscy_transforms.NormalizeSampled init_args: keys: [Phase3D] - level: fov_statistics + level: timepoint_statistics subtrahend: mean divisor: std - class_path: viscy_transforms.NormalizeSampled init_args: - keys: [Structure] - level: fov_statistics + keys: [DAPI_Density3D, TXR_Density3D] + level: timepoint_statistics subtrahend: median divisor: iqr gpu_augmentations: - - class_path: viscy_transforms.BatchedRandWeightedCropd - init_args: - keys: [source, target] - w_key: target - spatial_size: [8, 384, 384] - class_path: 
viscy_transforms.BatchedRandAffined init_args: keys: [source, target] - prob: 0.5 + prob: 0.8 rotate_range: [3.14, 0, 0] - shear_range: [0.0, 3.0, 3.0] - scale_range: [[0.8, 1.2], [0.7, 1.3], [0.7, 1.3]] + shear_range: [0.0, 0.05, 0.05] + scale_range: [[0.7, 1.3], [0.5, 1.5], [0.5, 1.5]] - class_path: viscy_transforms.BatchedCenterSpatialCropd init_args: keys: [source, target] - roi_size: [8, 256, 256] + roi_size: [15, 384, 384] - class_path: viscy_transforms.BatchedRandAdjustContrastd init_args: keys: [source] - prob: 0.3 - gamma: [0.75, 1.5] + prob: 0.5 + gamma: [0.8, 1.2] - class_path: viscy_transforms.BatchedRandScaleIntensityd init_args: keys: [source] - factors: 0.5 prob: 0.5 + factors: 0.5 - class_path: viscy_transforms.BatchedRandGaussianNoised init_args: keys: [source] prob: 0.5 mean: 0.0 - std: 1.0 + std: 0.3 - class_path: viscy_transforms.BatchedRandGaussianSmoothd init_args: keys: [source] prob: 0.5 - sigma_x: [0.25, 1.5] - sigma_y: [0.25, 1.5] - sigma_z: [0.25, 1.5] - preload: true + sigma_x: [0.25, 0.75] + sigma_y: [0.25, 0.75] + sigma_z: [0.25, 0.75] + val_gpu_augmentations: + - class_path: viscy_transforms.BatchedCenterSpatialCropd + init_args: + keys: [source, target] + roi_size: [15, 384, 384] diff --git a/applications/cytoland/examples/configs/recipes/data/hcs_a549_infected_d3_mantis.yml b/applications/cytoland/examples/configs/recipes/data/hcs_a549_infected_d3_mantis.yml new file mode 100644 index 000000000..c2f8bdea8 --- /dev/null +++ b/applications/cytoland/examples/configs/recipes/data/hcs_a549_infected_d3_mantis.yml @@ -0,0 +1,112 @@ +# Data recipe: A549 infection finetune — Mantis, 2026-03-26 (D3). +# Phase3D -> raw mCherry (nucleus H2B) + raw Cy5 (membrane CAAX). Per-timepoint normalization. +# Excludes 27 FOVs held out as the Mantis test set (see 2026-02 Viral infection wiki). +# Run `uv run viscy preprocess` on the zarr first to populate normalization_metadata. 
+data: + class_path: viscy_data.hcs.HCSDataModule + init_args: + data_path: /hpc/projects/virtual_staining/training/a549/2026_05_infected_cell/2026_03_26_A549_CAAX_H2B_DENV_ZIKV.zarr + source_channel: Phase3D + target_channel: + - "raw mCherry EX561 EM600-37" + - "raw Cy5 EX639 EM698-70" + z_window_size: 20 + split_ratio: 0.8 + batch_size: 16 + num_workers: 8 + persistent_workers: true + mmap_preload: true + yx_patch_size: [384, 384] + augmentations: + - class_path: viscy_transforms.RandWeightedCropd + init_args: + keys: + - Phase3D + - "raw mCherry EX561 EM600-37" + - "raw Cy5 EX639 EM698-70" + w_key: "raw mCherry EX561 EM600-37" + spatial_size: [20, 600, 600] + num_samples: 4 + exclude_fov_names: + - B/2/000000 + - B/2/000001 + - B/2/000002 + - B/2/000003 + - B/2/000004 + - B/2/000005 + - B/2/000006 + - B/2/000007 + - B/2/000008 + - B/3/000004 + - B/3/000005 + - B/3/000006 + - B/3/000007 + - B/3/000008 + - B/3/001000 + - B/3/001001 + - B/3/001002 + - B/3/001003 + - B/4/000000 + - B/4/000001 + - B/4/000002 + - B/4/000003 + - B/4/000006 + - B/4/000007 + - B/4/000008 + - B/4/001000 + - B/4/002000 + normalizations: + - class_path: viscy_transforms.NormalizeSampled + init_args: + keys: [Phase3D] + level: timepoint_statistics + subtrahend: mean + divisor: std + - class_path: viscy_transforms.NormalizeSampled + init_args: + keys: + - "raw mCherry EX561 EM600-37" + - "raw Cy5 EX639 EM698-70" + level: timepoint_statistics + subtrahend: median + divisor: iqr + gpu_augmentations: + - class_path: viscy_transforms.BatchedRandAffined + init_args: + keys: [source, target] + prob: 0.8 + rotate_range: [3.14, 0, 0] + shear_range: [0.0, 0.05, 0.05] + scale_range: [[0.7, 1.3], [0.5, 1.5], [0.5, 1.5]] + - class_path: viscy_transforms.BatchedCenterSpatialCropd + init_args: + keys: [source, target] + roi_size: [15, 384, 384] + - class_path: viscy_transforms.BatchedRandAdjustContrastd + init_args: + keys: [source] + prob: 0.5 + gamma: [0.8, 1.2] + - class_path: 
viscy_transforms.BatchedRandScaleIntensityd + init_args: + keys: [source] + prob: 0.5 + factors: 0.5 + - class_path: viscy_transforms.BatchedRandGaussianNoised + init_args: + keys: [source] + prob: 0.5 + mean: 0.0 + std: 0.3 + - class_path: viscy_transforms.BatchedRandGaussianSmoothd + init_args: + keys: [source] + prob: 0.5 + sigma_x: [0.25, 0.75] + sigma_y: [0.25, 0.75] + sigma_z: [0.25, 0.75] + val_gpu_augmentations: + - class_path: viscy_transforms.BatchedCenterSpatialCropd + init_args: + keys: [source, target] + roi_size: [15, 384, 384] diff --git a/applications/cytoland/examples/configs/recipes/topology/ddp_4gpu.yml b/applications/cytoland/examples/configs/recipes/topology/ddp_4gpu.yml new file mode 100644 index 000000000..6ecdb4ad8 --- /dev/null +++ b/applications/cytoland/examples/configs/recipes/topology/ddp_4gpu.yml @@ -0,0 +1,6 @@ +# Topology recipe: 4-GPU DDP training on a single node. +trainer: + accelerator: gpu + strategy: ddp + devices: 4 + num_nodes: 1 diff --git a/applications/cytoland/examples/configs/recipes/topology/single_gpu.yml b/applications/cytoland/examples/configs/recipes/topology/single_gpu.yml new file mode 100644 index 000000000..a05fa451a --- /dev/null +++ b/applications/cytoland/examples/configs/recipes/topology/single_gpu.yml @@ -0,0 +1,7 @@ +# Single-GPU training. strategy=auto lets Lightning pick single_device; +# plain ddp at devices=1 would add pointless process-group overhead. 
+trainer: + accelerator: gpu + strategy: auto + devices: 1 + num_nodes: 1 diff --git a/applications/dynacell/examples/configs/recipes/trainer/fit_4gpu.yml b/applications/cytoland/examples/configs/recipes/trainer/fit.yml similarity index 58% rename from applications/dynacell/examples/configs/recipes/trainer/fit_4gpu.yml rename to applications/cytoland/examples/configs/recipes/trainer/fit.yml index 9184b862a..441dbfd49 100644 --- a/applications/dynacell/examples/configs/recipes/trainer/fit_4gpu.yml +++ b/applications/cytoland/examples/configs/recipes/trainer/fit.yml @@ -1,11 +1,15 @@ -# Trainer recipe: 4-GPU DDP training. +# Topology (accelerator / devices / strategy / num_nodes) lives in +# recipes/topology/*.yml. Precision lives in model overlays. +# max_epochs and max_steps also live in model overlays or leaves. seed_everything: 42 trainer: - accelerator: gpu - strategy: ddp - devices: 4 - num_nodes: 1 - precision: 16-mixed + log_every_n_steps: 10 + enable_checkpointing: true + inference_mode: true + logger: + class_path: lightning.pytorch.loggers.WandbLogger + init_args: + project: cytoland callbacks: - class_path: lightning.pytorch.callbacks.LearningRateMonitor init_args: @@ -16,8 +20,3 @@ trainer: every_n_epochs: 1 save_top_k: 5 save_last: true - fast_dev_run: false - max_epochs: 200 - log_every_n_steps: 10 - enable_checkpointing: true - inference_mode: true diff --git a/applications/cytoland/examples/configs/recipes/trainer/fit_1gpu.yml b/applications/cytoland/examples/configs/recipes/trainer/fit_1gpu.yml deleted file mode 100644 index 6ac1650fe..000000000 --- a/applications/cytoland/examples/configs/recipes/trainer/fit_1gpu.yml +++ /dev/null @@ -1,30 +0,0 @@ -# Legacy transitional config; new benchmark launches should use Dynacell. -# See: applications/dynacell/examples/configs/sec61b/ -# Trainer recipe: 1-GPU training with WandB logging and checkpointing. 
-# W&B convention: -# - run name: YYYYMMDD-HHMMSS_ -# - group: VISCY_WANDB_GROUP, else VISCY_WANDB_LAUNCH, else the base name -seed_everything: 42 -trainer: - accelerator: gpu - strategy: ddp - devices: 1 - num_nodes: 1 - precision: bf16-mixed - log_every_n_steps: 10 - logger: - class_path: lightning.pytorch.loggers.WandbLogger - init_args: - project: dynacell - callbacks: - - class_path: lightning.pytorch.callbacks.LearningRateMonitor - init_args: - logging_interval: step - - class_path: lightning.pytorch.callbacks.ModelCheckpoint - init_args: - monitor: loss/validate - every_n_epochs: 1 - save_top_k: 4 - save_last: true - enable_checkpointing: true - inference_mode: true diff --git a/applications/cytoland/examples/configs/recipes/trainer/fit_4gpu.yml b/applications/cytoland/examples/configs/recipes/trainer/fit_4gpu.yml deleted file mode 100644 index cb8da48c4..000000000 --- a/applications/cytoland/examples/configs/recipes/trainer/fit_4gpu.yml +++ /dev/null @@ -1,32 +0,0 @@ -# Trainer recipe: 4-GPU DDP training with WandB logging and checkpointing. 
-# W&B convention: -# - run name: YYYYMMDD-HHMMSS_ -# - group: VISCY_WANDB_GROUP, else VISCY_WANDB_LAUNCH, else the base name -seed_everything: 42 -trainer: - accelerator: gpu - strategy: ddp - devices: 4 - num_nodes: 1 - precision: 16-mixed - logger: - class_path: lightning.pytorch.loggers.WandbLogger - init_args: - project: cytoland - name: #TODO run name - save_dir: #TODO save directory - callbacks: - - class_path: lightning.pytorch.callbacks.LearningRateMonitor - init_args: - logging_interval: step - - class_path: lightning.pytorch.callbacks.ModelCheckpoint - init_args: - monitor: loss/validate - every_n_epochs: 1 - save_top_k: 5 - save_last: true - fast_dev_run: false - max_epochs: 200 - log_every_n_steps: 10 - enable_checkpointing: true - inference_mode: true diff --git a/applications/dynacell/examples/configs/recipes/trainer/predict_gpu.yml b/applications/cytoland/examples/configs/recipes/trainer/predict.yml similarity index 73% rename from applications/dynacell/examples/configs/recipes/trainer/predict_gpu.yml rename to applications/cytoland/examples/configs/recipes/trainer/predict.yml index a8baf2f63..52a1c6036 100644 --- a/applications/dynacell/examples/configs/recipes/trainer/predict_gpu.yml +++ b/applications/cytoland/examples/configs/recipes/trainer/predict.yml @@ -1,7 +1,6 @@ -# Trainer recipe: single-GPU prediction. +# Unified predict trainer recipe. +# Topology lives in recipes/topology/single_gpu.yml. trainer: - accelerator: gpu - devices: 1 precision: 32-true callbacks: - class_path: viscy_utils.callbacks.prediction_writer.HCSPredictionWriter diff --git a/applications/cytoland/examples/configs/vscyto2d/finetune.yml b/applications/cytoland/examples/configs/vscyto2d/finetune.yml index f00c3575f..d9838635b 100644 --- a/applications/cytoland/examples/configs/vscyto2d/finetune.yml +++ b/applications/cytoland/examples/configs/vscyto2d/finetune.yml @@ -1,7 +1,8 @@ # VSCyto2D: supervised fine-tuning from FCMAE-pretrained encoder. 
# Usage: python -m cytoland fit --config vscyto2d/finetune.yml base: - - ../recipes/trainer/fit_4gpu.yml + - ../recipes/trainer/fit.yml + - ../recipes/topology/ddp_4gpu.yml - ../recipes/data/hcs_nuc_mem_2d.yml - ../recipes/models/fcmae_2d.yml @@ -18,6 +19,10 @@ model: lr: 0.0002 schedule: WarmupCosine +trainer: + precision: 16-mixed + max_epochs: 200 + data: init_args: data_path: #TODO HCS OME-Zarr data diff --git a/applications/cytoland/examples/configs/vscyto2d/predict.yml b/applications/cytoland/examples/configs/vscyto2d/predict.yml index c865d1f66..b633b2243 100644 --- a/applications/cytoland/examples/configs/vscyto2d/predict.yml +++ b/applications/cytoland/examples/configs/vscyto2d/predict.yml @@ -2,7 +2,8 @@ # Checkpoint: https://public.czbiohub.org/comp.micro/viscy/VS_models/VSCyto2D/VSCyto2D/epoch=399-step=23200.ckpt # Usage: python -m cytoland predict --config vscyto2d/predict.yml base: - - ../recipes/trainer/predict_gpu.yml + - ../recipes/trainer/predict.yml + - ../recipes/topology/single_gpu.yml - ../recipes/data/hcs_nuc_mem_2d.yml - ../recipes/models/fcmae_2d.yml diff --git a/applications/cytoland/examples/configs/vscyto2d/pretrain.yml b/applications/cytoland/examples/configs/vscyto2d/pretrain.yml index 3ece1bc7e..c0b2c1d92 100644 --- a/applications/cytoland/examples/configs/vscyto2d/pretrain.yml +++ b/applications/cytoland/examples/configs/vscyto2d/pretrain.yml @@ -1,7 +1,8 @@ # VSCyto2D: FCMAE self-supervised pretraining (2D, in_stack_depth=1). # Usage: python -m cytoland fit --config vscyto2d/pretrain.yml base: - - ../recipes/trainer/fit_4gpu.yml + - ../recipes/trainer/fit.yml + - ../recipes/topology/ddp_4gpu.yml - ../recipes/data/cached_pretrain.yml model: @@ -24,7 +25,9 @@ model: log_samples_per_batch: 1 trainer: + # FCMAE pretraining requires find_unused_parameters=True (masked decoder). 
strategy: ddp_find_unused_parameters_true + precision: 16-mixed max_epochs: 400 use_distributed_sampler: false callbacks: diff --git a/applications/cytoland/examples/configs/vscyto3d/finetune.yml b/applications/cytoland/examples/configs/vscyto3d/finetune.yml index 8305babe3..d547f176d 100644 --- a/applications/cytoland/examples/configs/vscyto3d/finetune.yml +++ b/applications/cytoland/examples/configs/vscyto3d/finetune.yml @@ -1,7 +1,8 @@ # VSCyto3D: supervised fine-tuning from FCMAE-pretrained encoder. # Usage: python -m cytoland fit --config vscyto3d/finetune.yml base: - - ../recipes/trainer/fit_4gpu.yml + - ../recipes/trainer/fit.yml + - ../recipes/topology/ddp_4gpu.yml - ../recipes/data/hcs_nuc_mem_3d.yml - ../recipes/models/unext2_3d.yml @@ -16,6 +17,10 @@ model: lr: 0.0002 schedule: WarmupCosine +trainer: + precision: bf16-mixed + max_epochs: 200 + data: init_args: data_path: #TODO HCS OME-Zarr data diff --git a/applications/cytoland/examples/configs/vscyto3d/finetune_a549_infected.yml b/applications/cytoland/examples/configs/vscyto3d/finetune_a549_infected.yml new file mode 100644 index 000000000..fad85d814 --- /dev/null +++ b/applications/cytoland/examples/configs/vscyto3d/finetune_a549_infected.yml @@ -0,0 +1,353 @@ +# VSCyto3D: warm-start finetune on A549 infected-cell data from three microscopes. +# +# Sources: +# D1 = Hummingbird 2026-01-29, DAPI_Density3D + TXR_Density3D +# D2 = Hummingbird 2026-03-10, DAPI_Density3D + TXR_Density3D +# D3 = Mantis 2026-03-26, raw mCherry + raw Cy5 (27 FOVs held out for test) +# +# The three sub-DMs keep their native channel names; CombinedDataModule +# (MAX_SIZE_CYCLE) pulls one sub-batch from each per step so every +# microscope contributes equally to each gradient step regardless of +# dataset size. 
+# +# Run D3 preprocess first to populate normalization_metadata: +# uv run viscy preprocess --data_path \ +# --channel_names+ "Phase3D" \ +# --channel_names+ "raw mCherry EX561 EM600-37" \ +# --channel_names+ "raw Cy5 EX639 EM698-70" +# +# Usage: +# uv run python -m cytoland fit --config vscyto3d/finetune_a549_infected.yml +base: + - ../recipes/trainer/fit.yml + - ../recipes/topology/ddp_4gpu.yml + +# The published VSCyto3D is FullyConvolutionalMAE (architecture='fcmae') in +# supervised mode (pretraining=False), not UNeXt2 despite the dataset/model +# card nickname. VSUNet(architecture='fcmae') lets us warm-start from the +# published ckpt without the extra FCMAE pretraining validators on +# FcmaeUNet (which require a GPUTransformDataModule that HCSDataModule +# does not subclass). +model: + class_path: cytoland.engine.VSUNet + init_args: + architecture: fcmae + model_config: + in_channels: 1 + out_channels: 2 + encoder_blocks: [3, 3, 9, 3] + encoder_drop_path_rate: 0.1 + dims: [96, 192, 384, 768] + decoder_conv_blocks: 2 + stem_kernel_size: [5, 4, 4] + in_stack_depth: 15 + pretraining: false + loss_function: + class_path: viscy_utils.losses.MixedLoss + init_args: + l1_alpha: 0.5 + l2_alpha: 0.0 + ms_dssim_alpha: 0.5 + # lr=2e-4 matches the canonical vs_test/finetune_3d.py recipe that + # produced the published VSCyto3D ckpt we warm-start from. + # (Dynacell's fcmae_vscyto3d_fit.yml uses lr=4e-4 as a retune vs. a + # UNeXt2 throughput baseline — not the right anchor for this finetune.) + lr: 0.0002 + schedule: WarmupCosine + # D3 drives step count in MAX_SIZE_CYCLE: ~ (243-27) * 0.8 * 11 T / batch_size + # ~= 120 steps/epoch at batch_size=16 across 4 GPUs. One-epoch warmup. 
+ warmup_steps: 120 + warmup_multiplier: 1e-3 + ckpt_path: /hpc/projects/comp.micro/virtual_staining/models/fcmae-cyto3d-sensor/vscyto3d-logs/hek-a549-ipsc-finetune/checkpoints/epoch=83-step=14532-loss=0.492.ckpt + +trainer: + # FullyConvolutionalMAE(pretraining=False) has decoder/head params that + # only receive gradients on some forward paths; default ddp with + # find_unused_parameters=False errors at step 1. Matches dynacell + # fcmae_vscyto3d_fit.yml and vs_test/finetune_3d.py:215. + strategy: ddp_find_unused_parameters_true + # bf16-mixed avoids the Hopper fp16 cuDNN slowdown documented in + # applications/dynacell/configs/examples/fcmae_hopper_slowdown.md. + precision: bf16-mixed + # Matches the canonical vs_test finetune budget that produced the + # published VSCyto3D ckpt (stopped at epoch 83 on val loss). + max_epochs: 100 + logger: + init_args: + project: cytoland + name: VSCyto3D_ft_A549_infected + save_dir: /hpc/projects/comp.micro/virtual_staining/models/cytoland/a549_infected/vscyto3d + callbacks: + - class_path: lightning.pytorch.callbacks.LearningRateMonitor + init_args: + logging_interval: step + - class_path: lightning.pytorch.callbacks.ModelCheckpoint + init_args: + monitor: loss/validate + every_n_epochs: 1 + save_top_k: 5 + save_last: true + dirpath: /hpc/projects/comp.micro/virtual_staining/models/cytoland/a549_infected/vscyto3d/checkpoints + +data: + class_path: viscy_data.combined.CombinedDataModule + init_args: + train_mode: MAX_SIZE_CYCLE + val_mode: SEQUENTIAL + data_modules: + # D1 — Hummingbird 2026-01-29 + - class_path: viscy_data.hcs.HCSDataModule + init_args: + data_path: /hpc/projects/virtual_staining/training/a549/2026_05_infected_cell/2026_01_29_A549_H2B_CAAX_DAPI_DENV_ZIKV.zarr + source_channel: Phase3D + target_channel: [DAPI_Density3D, TXR_Density3D] + z_window_size: 20 + split_ratio: 0.8 + batch_size: 16 + num_workers: 8 + persistent_workers: true + mmap_preload: true + yx_patch_size: [384, 384] + augmentations: + - 
class_path: viscy_transforms.RandWeightedCropd + init_args: + keys: [Phase3D, DAPI_Density3D, TXR_Density3D] + w_key: DAPI_Density3D + spatial_size: [20, 600, 600] + num_samples: 4 + normalizations: + - class_path: viscy_transforms.NormalizeSampled + init_args: + keys: [Phase3D] + level: timepoint_statistics + subtrahend: mean + divisor: std + - class_path: viscy_transforms.NormalizeSampled + init_args: + keys: [DAPI_Density3D, TXR_Density3D] + level: timepoint_statistics + subtrahend: median + divisor: iqr + gpu_augmentations: + - class_path: viscy_transforms.BatchedRandAffined + init_args: + keys: [source, target] + prob: 0.8 + rotate_range: [3.14, 0, 0] + shear_range: [0.0, 0.05, 0.05] + scale_range: [[0.7, 1.3], [0.5, 1.5], [0.5, 1.5]] + - class_path: viscy_transforms.BatchedCenterSpatialCropd + init_args: + keys: [source, target] + roi_size: [15, 384, 384] + - class_path: viscy_transforms.BatchedRandAdjustContrastd + init_args: + keys: [source] + prob: 0.5 + gamma: [0.8, 1.2] + - class_path: viscy_transforms.BatchedRandScaleIntensityd + init_args: + keys: [source] + prob: 0.5 + factors: 0.5 + - class_path: viscy_transforms.BatchedRandGaussianNoised + init_args: + keys: [source] + prob: 0.5 + mean: 0.0 + std: 0.3 + - class_path: viscy_transforms.BatchedRandGaussianSmoothd + init_args: + keys: [source] + prob: 0.5 + sigma_x: [0.25, 0.75] + sigma_y: [0.25, 0.75] + sigma_z: [0.25, 0.75] + val_gpu_augmentations: + - class_path: viscy_transforms.BatchedCenterSpatialCropd + init_args: + keys: [source, target] + roi_size: [15, 384, 384] + + # D2 — Hummingbird 2026-03-10 + - class_path: viscy_data.hcs.HCSDataModule + init_args: + data_path: /hpc/projects/virtual_staining/training/a549/2026_05_infected_cell/2026_03_10_A549_H2B_CAXX_DAPI_DENV_ZIKV.zarr + source_channel: Phase3D + target_channel: [DAPI_Density3D, TXR_Density3D] + z_window_size: 20 + split_ratio: 0.8 + batch_size: 16 + num_workers: 8 + persistent_workers: true + mmap_preload: true + yx_patch_size: [384, 
384] + augmentations: + - class_path: viscy_transforms.RandWeightedCropd + init_args: + keys: [Phase3D, DAPI_Density3D, TXR_Density3D] + w_key: DAPI_Density3D + spatial_size: [20, 600, 600] + num_samples: 4 + normalizations: + - class_path: viscy_transforms.NormalizeSampled + init_args: + keys: [Phase3D] + level: timepoint_statistics + subtrahend: mean + divisor: std + - class_path: viscy_transforms.NormalizeSampled + init_args: + keys: [DAPI_Density3D, TXR_Density3D] + level: timepoint_statistics + subtrahend: median + divisor: iqr + gpu_augmentations: + - class_path: viscy_transforms.BatchedRandAffined + init_args: + keys: [source, target] + prob: 0.8 + rotate_range: [3.14, 0, 0] + shear_range: [0.0, 0.05, 0.05] + scale_range: [[0.7, 1.3], [0.5, 1.5], [0.5, 1.5]] + - class_path: viscy_transforms.BatchedCenterSpatialCropd + init_args: + keys: [source, target] + roi_size: [15, 384, 384] + - class_path: viscy_transforms.BatchedRandAdjustContrastd + init_args: + keys: [source] + prob: 0.5 + gamma: [0.8, 1.2] + - class_path: viscy_transforms.BatchedRandScaleIntensityd + init_args: + keys: [source] + prob: 0.5 + factors: 0.5 + - class_path: viscy_transforms.BatchedRandGaussianNoised + init_args: + keys: [source] + prob: 0.5 + mean: 0.0 + std: 0.3 + - class_path: viscy_transforms.BatchedRandGaussianSmoothd + init_args: + keys: [source] + prob: 0.5 + sigma_x: [0.25, 0.75] + sigma_y: [0.25, 0.75] + sigma_z: [0.25, 0.75] + val_gpu_augmentations: + - class_path: viscy_transforms.BatchedCenterSpatialCropd + init_args: + keys: [source, target] + roi_size: [15, 384, 384] + + # D3 — Mantis 2026-03-26 (27 FOVs held out for test) + - class_path: viscy_data.hcs.HCSDataModule + init_args: + data_path: /hpc/projects/virtual_staining/training/a549/2026_05_infected_cell/2026_03_26_A549_CAAX_H2B_DENV_ZIKV.zarr + source_channel: Phase3D + target_channel: + - "raw mCherry EX561 EM600-37" + - "raw Cy5 EX639 EM698-70" + z_window_size: 20 + split_ratio: 0.8 + batch_size: 16 + num_workers: 8 
+ persistent_workers: true + mmap_preload: true + yx_patch_size: [384, 384] + augmentations: + - class_path: viscy_transforms.RandWeightedCropd + init_args: + keys: + - Phase3D + - "raw mCherry EX561 EM600-37" + - "raw Cy5 EX639 EM698-70" + w_key: "raw mCherry EX561 EM600-37" + spatial_size: [20, 600, 600] + num_samples: 4 + exclude_fov_names: + - B/2/000000 + - B/2/000001 + - B/2/000002 + - B/2/000003 + - B/2/000004 + - B/2/000005 + - B/2/000006 + - B/2/000007 + - B/2/000008 + - B/3/000004 + - B/3/000005 + - B/3/000006 + - B/3/000007 + - B/3/000008 + - B/3/001000 + - B/3/001001 + - B/3/001002 + - B/3/001003 + - B/4/000000 + - B/4/000001 + - B/4/000002 + - B/4/000003 + - B/4/000006 + - B/4/000007 + - B/4/000008 + - B/4/001000 + - B/4/002000 + normalizations: + - class_path: viscy_transforms.NormalizeSampled + init_args: + keys: [Phase3D] + level: timepoint_statistics + subtrahend: mean + divisor: std + - class_path: viscy_transforms.NormalizeSampled + init_args: + keys: + - "raw mCherry EX561 EM600-37" + - "raw Cy5 EX639 EM698-70" + level: timepoint_statistics + subtrahend: median + divisor: iqr + gpu_augmentations: + - class_path: viscy_transforms.BatchedRandAffined + init_args: + keys: [source, target] + prob: 0.8 + rotate_range: [3.14, 0, 0] + shear_range: [0.0, 0.05, 0.05] + scale_range: [[0.7, 1.3], [0.5, 1.5], [0.5, 1.5]] + - class_path: viscy_transforms.BatchedCenterSpatialCropd + init_args: + keys: [source, target] + roi_size: [15, 384, 384] + - class_path: viscy_transforms.BatchedRandAdjustContrastd + init_args: + keys: [source] + prob: 0.5 + gamma: [0.8, 1.2] + - class_path: viscy_transforms.BatchedRandScaleIntensityd + init_args: + keys: [source] + prob: 0.5 + factors: 0.5 + - class_path: viscy_transforms.BatchedRandGaussianNoised + init_args: + keys: [source] + prob: 0.5 + mean: 0.0 + std: 0.3 + - class_path: viscy_transforms.BatchedRandGaussianSmoothd + init_args: + keys: [source] + prob: 0.5 + sigma_x: [0.25, 0.75] + sigma_y: [0.25, 0.75] + sigma_z: 
[0.25, 0.75] + val_gpu_augmentations: + - class_path: viscy_transforms.BatchedCenterSpatialCropd + init_args: + keys: [source, target] + roi_size: [15, 384, 384] diff --git a/applications/cytoland/examples/configs/vscyto3d/finetune_a549_infected_4gpu_batched.yml b/applications/cytoland/examples/configs/vscyto3d/finetune_a549_infected_4gpu_batched.yml new file mode 100644 index 000000000..d3efd9f82 --- /dev/null +++ b/applications/cytoland/examples/configs/vscyto3d/finetune_a549_infected_4gpu_batched.yml @@ -0,0 +1,237 @@ +# Production 4-GPU finetune of VSCyto3D on A549 infected-cell data. +# Mirrors the dynacell joint training pattern (see +# applications/dynacell/configs/benchmarks/virtual_staining/nucleus/ +# fcmae_vscyto3d_pretrained/joint_ipsc_confocal_a549_mantis/train.yml). +# +# Architecture: BatchedConcatDataModule pools D1+D2+D3 cropped zarrs +# (~323 GB total) into a single shuffled dataset with one +# ShardedDistributedSampler. Per dynacell joint convention, batch_size is +# NOT divided by num_samples in joint mode — bs=16 indices × num_samples=4 +# = 64 patches/step/rank × 4 ranks = 256 patches/step total. +# +# Backend: zarr-python with mmap_preload to /tmp. Stages all FOVs to a +# MemoryMappedTensor on local /tmp (28 TB) during prepare_data, then +# closes the zarr handles. DataLoader workers fork from a parent with no +# live zarr asyncio loop and read from the mmap'd tensor — fork-safe. +# /dev/shm (~126 GB) is too small even for the cropped 323 GB dataset. +# +# Cropped zarrs (zarrv3_cropped/) shrink each FOV from +# (T, 3ch, Z=126, 2048, 2048) full-tile down to (T, 3ch, Z=50, ~1500-2048, +# ~1300-2048) so total staging fits well under /tmp. +# +# Standalone (does NOT inherit finetune_a549_infected.yml) because the +# parent's data: block authors a CombinedDataModule with +# train_mode/val_mode init_args that BatchedConcatDataModule rejects. 
+base: + - ../recipes/trainer/fit.yml + - ../recipes/topology/ddp_4gpu.yml + +model: + class_path: cytoland.engine.VSUNet + init_args: + architecture: fcmae + model_config: + in_channels: 1 + out_channels: 2 + encoder_blocks: [3, 3, 9, 3] + encoder_drop_path_rate: 0.1 + dims: [96, 192, 384, 768] + decoder_conv_blocks: 2 + stem_kernel_size: [5, 4, 4] + in_stack_depth: 15 + pretraining: false + loss_function: + class_path: viscy_utils.losses.MixedLoss + init_args: + l1_alpha: 0.5 + l2_alpha: 0.0 + ms_dssim_alpha: 0.5 + # Smaller lr than the published VSCyto3D recipe because we're + # finetuning from a strong ckpt onto a smaller dataset (~50 FOVs) + # for a focused domain shift (A549 infected-cell phenotype). + lr: 2.0e-5 + schedule: WarmupCosine + # ~50 train FOVs × ~5 T (avg) × (50-19)=31 Z windows = ~7700 patches + # per epoch dataset-wide. At 256 patches/step (joint, 4 ranks) → + # ~30 steps/epoch. ~1 epoch warmup = 30. + warmup_steps: 30 + warmup_multiplier: 1e-3 + ckpt_path: /hpc/projects/comp.micro/virtual_staining/models/fcmae-cyto3d-sensor/vscyto3d-logs/hek-a549-ipsc-finetune/checkpoints/epoch=83-step=14532-loss=0.492.ckpt + +trainer: + strategy: ddp_find_unused_parameters_true + precision: bf16-mixed + # Finetune budget per user spec: 30 epochs is plenty for adapting from + # a converged ckpt onto this small domain-shifted set. + max_epochs: 30 + logger: + init_args: + project: cytoland + name: VSCyto3D_ft_A549_infected_4gpu_batched + save_dir: /hpc/mydata/eduardo.hirata/cytoland/a549_infected_4gpu_batched + callbacks: + - class_path: lightning.pytorch.callbacks.LearningRateMonitor + init_args: + logging_interval: step + - class_path: lightning.pytorch.callbacks.ModelCheckpoint + init_args: + monitor: loss/validate + every_n_epochs: 1 + save_top_k: 5 + save_last: true + dirpath: /hpc/mydata/eduardo.hirata/cytoland/a549_infected_4gpu_batched/checkpoints + +# Shared HCS init args — mirrors dynacell joint config defaults. 
+_hcs_init_args: &hcs_init_args + z_window_size: 20 + split_ratio: 0.8 + # Joint mode: batch_size is NOT divided by num_samples (see + # BatchedConcatDataModule.train_dataloader). 16 indices × num_samples=4 + # = 64 patches/step/rank × 4 ranks = 256 patches/step. + batch_size: 16 + # OOM-tuned for the cropped 459 GB mmap working set: each worker pulls + # full-FOV slabs (~660 MB read per index) before cropping. 16 indices + # × num_workers × prefetch_factor batches in-flight = the binding + # constraint, not the mmap virtual size. Halving each cuts ~4× per + # rank. + num_workers: 2 + prefetch_factor: 1 + persistent_workers: true + mmap_preload: true + scratch_dir: /tmp + pin_memory: false + yx_patch_size: [384, 384] + +# CPU-side weighted crop — dynacell pattern. RandWeightedCropd with +# num_samples=4 yields 4 patches per stack, all weighted by the nuclear +# marker channel. Runs in DataLoader workers (fork-safe because mmap_ +# preload closed zarr handles before fork). +_d1_d2_normalizations: &d1_d2_normalizations + - class_path: viscy_transforms.NormalizeSampled + init_args: + keys: [Phase3D] + level: timepoint_statistics + subtrahend: mean + divisor: std + - class_path: viscy_transforms.NormalizeSampled + init_args: + keys: [DAPI_Density3D, TXR_Density3D] + level: timepoint_statistics + subtrahend: median + divisor: iqr + +_d1_d2_augmentations: &d1_d2_augmentations + - class_path: viscy_transforms.RandWeightedCropd + init_args: + keys: [Phase3D, DAPI_Density3D, TXR_Density3D] + w_key: DAPI_Density3D + spatial_size: [20, 600, 600] + num_samples: 4 + +_gpu_augmentations: &gpu_augmentations + - class_path: viscy_transforms.BatchedRandAffined + init_args: + keys: [source, target] + prob: 0.8 + rotate_range: [3.14, 0, 0] + shear_range: [0.0, 0.05, 0.05] + scale_range: [[0.7, 1.3], [0.5, 1.5], [0.5, 1.5]] + - class_path: viscy_transforms.BatchedCenterSpatialCropd + init_args: + keys: [source, target] + roi_size: [15, 384, 384] + - class_path: 
viscy_transforms.BatchedRandAdjustContrastd + init_args: + keys: [source] + prob: 0.5 + gamma: [0.8, 1.2] + - class_path: viscy_transforms.BatchedRandScaleIntensityd + init_args: + keys: [source] + prob: 0.5 + factors: 0.5 + - class_path: viscy_transforms.BatchedRandGaussianNoised + init_args: + keys: [source] + prob: 0.5 + mean: 0.0 + std: 0.3 + - class_path: viscy_transforms.BatchedRandGaussianSmoothd + init_args: + keys: [source] + prob: 0.5 + sigma_x: [0.25, 0.75] + sigma_y: [0.25, 0.75] + sigma_z: [0.25, 0.75] + +_val_gpu_augmentations: &val_gpu_augmentations + - class_path: viscy_transforms.BatchedCenterSpatialCropd + init_args: + keys: [source, target] + roi_size: [15, 384, 384] + +data: + class_path: viscy_data.combined.BatchedConcatDataModule + init_args: + data_modules: + # D1 — Hummingbird 2026-01-29 (cropped: 15 FOVs, T=3, Z=50, 2048×2048) + - class_path: viscy_data.hcs.HCSDataModule + init_args: + <<: *hcs_init_args + data_path: /hpc/projects/virtual_staining/training/a549/2026_05_infected_cell/zarrv3_cropped/2026_01_29_A549_H2B_CAAX_DAPI_DENV_ZIKV.zarr + source_channel: Phase3D + target_channel: [DAPI_Density3D, TXR_Density3D] + normalizations: *d1_d2_normalizations + augmentations: *d1_d2_augmentations + gpu_augmentations: *gpu_augmentations + val_gpu_augmentations: *val_gpu_augmentations + + # D2 — Hummingbird 2026-03-10 (cropped: 15 FOVs, T=3, Z=50, 2025×1998) + - class_path: viscy_data.hcs.HCSDataModule + init_args: + <<: *hcs_init_args + data_path: /hpc/projects/virtual_staining/training/a549/2026_05_infected_cell/zarrv3_cropped/2026_03_10_A549_H2B_CAXX_DAPI_DENV_ZIKV.zarr + source_channel: Phase3D + target_channel: [DAPI_Density3D, TXR_Density3D] + normalizations: *d1_d2_normalizations + augmentations: *d1_d2_augmentations + gpu_augmentations: *gpu_augmentations + val_gpu_augmentations: *val_gpu_augmentations + + # D3 — Mantis 2026-03-26 (cropped: 17 FOVs, T=11, Z=50, 1600×1332). + # 27 FOVs held out for test on the un-cropped store. 
+ - class_path: viscy_data.hcs.HCSDataModule + init_args: + <<: *hcs_init_args + data_path: /hpc/projects/virtual_staining/training/a549/2026_05_infected_cell/zarrv3_cropped/2026_03_26_A549_CAAX_H2B_DENV_ZIKV.zarr + source_channel: Phase3D + target_channel: + - "raw Cy5 EX639 EM698-70" + - "raw mCherry EX561 EM600-37" + normalizations: + - class_path: viscy_transforms.NormalizeSampled + init_args: + keys: [Phase3D] + level: timepoint_statistics + subtrahend: mean + divisor: std + - class_path: viscy_transforms.NormalizeSampled + init_args: + keys: + - "raw mCherry EX561 EM600-37" + - "raw Cy5 EX639 EM698-70" + level: timepoint_statistics + subtrahend: median + divisor: iqr + augmentations: + - class_path: viscy_transforms.RandWeightedCropd + init_args: + keys: + - Phase3D + - "raw mCherry EX561 EM600-37" + - "raw Cy5 EX639 EM698-70" + w_key: "raw Cy5 EX639 EM698-70" # H2B nuclear marker; mCherry is membrane on D3 + spatial_size: [20, 600, 600] + num_samples: 4 + gpu_augmentations: *gpu_augmentations + val_gpu_augmentations: *val_gpu_augmentations diff --git a/applications/cytoland/examples/configs/vscyto3d/finetune_a549_infected_d2_smoke.yml b/applications/cytoland/examples/configs/vscyto3d/finetune_a549_infected_d2_smoke.yml new file mode 100644 index 000000000..ca7c5b620 --- /dev/null +++ b/applications/cytoland/examples/configs/vscyto3d/finetune_a549_infected_d2_smoke.yml @@ -0,0 +1,61 @@ +# Smoke-test leaf: warm-start VSCyto3D on D2 alone (Hummingbird 2026-03-10) +# — single HCSDataModule, tiny batch, limited steps. Purpose: verify model +# load / forward / backward / val on the simplest subset of the full +# finetune_a549_infected.yml stack before scaling up. +# +# Model + the D2 recipe's gpu_augmentations mirror the dynacell +# FCMAE-VSCyto3D warm-start recipe +# (applications/dynacell/configs/benchmarks/.../fcmae_vscyto3d_fit.yml) +# so this smoke exercises the same code paths the main leaf runs at scale. 
+# +# Usage: +# uv run python -m cytoland fit --config vscyto3d/finetune_a549_infected_d2_smoke.yml +base: + - ../recipes/trainer/fit.yml + - ../recipes/data/hcs_a549_infected_d2_hummingbird.yml + +model: + class_path: cytoland.engine.VSUNet + init_args: + architecture: fcmae + model_config: + in_channels: 1 + out_channels: 2 + encoder_blocks: [3, 3, 9, 3] + encoder_drop_path_rate: 0.1 + dims: [96, 192, 384, 768] + decoder_conv_blocks: 2 + stem_kernel_size: [5, 4, 4] + in_stack_depth: 15 + pretraining: false + loss_function: + class_path: viscy_utils.losses.MixedLoss + init_args: + l1_alpha: 0.5 + l2_alpha: 0.0 + ms_dssim_alpha: 0.5 + lr: 0.0002 + schedule: Constant + ckpt_path: /hpc/projects/comp.micro/virtual_staining/models/fcmae-cyto3d-sensor/vscyto3d-logs/hek-a549-ipsc-finetune/checkpoints/epoch=83-step=14532-loss=0.492.ckpt + +trainer: + accelerator: gpu + strategy: auto + devices: 1 + precision: bf16-mixed + max_epochs: 1 + limit_train_batches: 2 + limit_val_batches: 2 + num_sanity_val_steps: 0 + logger: null + callbacks: [] + +data: + init_args: + # batch_size must be divisible by augmentations.RandWeightedCropd.num_samples (4). + batch_size: 4 + num_workers: 2 + # Disable mmap_preload for the smoke — building the full D2 cache would + # dominate smoke walltime. Prod leaves keep mmap_preload: true. 
+ mmap_preload: false + persistent_workers: false diff --git a/applications/cytoland/examples/configs/vscyto3d/predict.yml b/applications/cytoland/examples/configs/vscyto3d/predict.yml index 892431a56..7728eb18a 100644 --- a/applications/cytoland/examples/configs/vscyto3d/predict.yml +++ b/applications/cytoland/examples/configs/vscyto3d/predict.yml @@ -2,7 +2,8 @@ # Checkpoint: https://public.czbiohub.org/comp.micro/viscy/VS_models/VSCyto3D/epoch=48-step=18130.ckpt # Usage: python -m cytoland predict --config vscyto3d/predict.yml base: - - ../recipes/trainer/predict_gpu.yml + - ../recipes/trainer/predict.yml + - ../recipes/topology/single_gpu.yml - ../recipes/data/hcs_nuc_mem_3d.yml - ../recipes/models/unext2_3d.yml diff --git a/applications/cytoland/examples/configs/vscyto3d/preprocess_a549_infected_d3.sh b/applications/cytoland/examples/configs/vscyto3d/preprocess_a549_infected_d3.sh new file mode 100755 index 000000000..4018dd3be --- /dev/null +++ b/applications/cytoland/examples/configs/vscyto3d/preprocess_a549_infected_d3.sh @@ -0,0 +1,23 @@ +#!/bin/bash +# One-shot preprocess for the A549 Mantis infection dataset (D3). +# +# Writes per-FOV and per-timepoint normalization statistics into the +# zarr's .zattrs for the three channels actually used by the VSCyto3D +# finetune (Phase3D source + raw mCherry / raw Cy5 targets). D1 and D2 +# already have normalization_metadata and do not need preprocessing. 
+# +# Usage: +# bash applications/cytoland/examples/configs/vscyto3d/preprocess_a549_infected_d3.sh +set -euo pipefail + +REPO_ROOT="$(git -C "$(dirname "${BASH_SOURCE[0]}")" rev-parse --show-toplevel)" +cd "$REPO_ROOT" +mkdir -p .tmp/preprocess_logs + +uv run viscy preprocess \ + --data_path /hpc/projects/virtual_staining/training/a549/2026_05_infected_cell/2026_03_26_A549_CAAX_H2B_DENV_ZIKV.zarr \ + --channel_names+ "Phase3D" \ + --channel_names+ "raw mCherry EX561 EM600-37" \ + --channel_names+ "raw Cy5 EX639 EM698-70" \ + --num_workers 16 \ + 2>&1 | tee .tmp/preprocess_logs/d3_preprocess.log diff --git a/applications/cytoland/examples/configs/vscyto3d/pretrain.yml b/applications/cytoland/examples/configs/vscyto3d/pretrain.yml index c9b0087d1..18e673362 100644 --- a/applications/cytoland/examples/configs/vscyto3d/pretrain.yml +++ b/applications/cytoland/examples/configs/vscyto3d/pretrain.yml @@ -1,7 +1,8 @@ # VSCyto3D: FCMAE self-supervised pretraining. # Usage: python -m cytoland fit --config vscyto3d/pretrain.yml base: - - ../recipes/trainer/fit_4gpu.yml + - ../recipes/trainer/fit.yml + - ../recipes/topology/ddp_4gpu.yml - ../recipes/data/cached_pretrain.yml model: @@ -24,7 +25,9 @@ model: log_samples_per_batch: 1 trainer: + # FCMAE pretraining requires find_unused_parameters=True (masked decoder). strategy: ddp_find_unused_parameters_true + precision: 16-mixed max_epochs: 400 use_distributed_sampler: false callbacks: diff --git a/applications/cytoland/examples/configs/vscyto3d/run_a549_4gpu_batched.slurm b/applications/cytoland/examples/configs/vscyto3d/run_a549_4gpu_batched.slurm new file mode 100644 index 000000000..97f3a1398 --- /dev/null +++ b/applications/cytoland/examples/configs/vscyto3d/run_a549_4gpu_batched.slurm @@ -0,0 +1,47 @@ +#!/bin/bash +# 4-GPU production training for VSCyto3D A549 infected-cell finetune. +# Uses cropped zarrs (~323 GB total) so mmap_preload stages to /tmp +# without OOM. Mirrors dynacell joint training pattern. 
+# +# sbatch applications/cytoland/examples/configs/vscyto3d/run_a549_4gpu_batched.slurm +# +# Architecture: BatchedConcatDataModule + zarr-python + mmap_preload + fork. +# - mmap_preload stages cropped FOVs to /tmp (~323 GB; node /tmp is 28 TB) +# - During training, DataLoader workers (fork) read from MemoryMappedTensor +# instead of zarr — no fork-after-asyncio issue. + +#SBATCH --job-name=VSCyto3D_A549_4gpu_batched +#SBATCH --time=22:00:00 +#SBATCH --nodes=1 +#SBATCH --ntasks-per-node=4 +#SBATCH --partition=gpu +#SBATCH --cpus-per-task=8 +#SBATCH --gpus=4 +#SBATCH --mem=1024G +#SBATCH --constraint='h200|h100' +#SBATCH --output=/home/eduardo.hirata/repos/viscy/slurm_logs/cytoland_4gpu_batched/run_%j.out +#SBATCH --error=/home/eduardo.hirata/repos/viscy/slurm_logs/cytoland_4gpu_batched/run_%j.err + +set -euo pipefail + +mkdir -p /home/eduardo.hirata/repos/viscy/slurm_logs/cytoland_4gpu_batched +mkdir -p /hpc/mydata/eduardo.hirata/cytoland/a549_infected_4gpu_batched/checkpoints + +ml uv + +export PYTHONUNBUFFERED=1 +export PYTHONNOUSERSITE=1 +# Limit threading libs so DataLoader workers don't oversubscribe the +# 8-core/rank allocation. +export OMP_NUM_THREADS=1 +export MKL_NUM_THREADS=1 +export NUMEXPR_NUM_THREADS=1 + +REPO=/hpc/mydata/eduardo.hirata/repos/viscy +cd "${REPO}" + +nvidia-smi +df -h /tmp + +srun uv run python -m cytoland fit \ + --config applications/cytoland/examples/configs/vscyto3d/finetune_a549_infected_4gpu_batched.yml diff --git a/applications/cytoland/examples/configs/vscyto3d/run_a549_infected.slurm b/applications/cytoland/examples/configs/vscyto3d/run_a549_infected.slurm new file mode 100755 index 000000000..bbc43182b --- /dev/null +++ b/applications/cytoland/examples/configs/vscyto3d/run_a549_infected.slurm @@ -0,0 +1,27 @@ +#!/bin/bash +# Full finetune: VSCyto3D warm-started on A549 infection data from three +# microscopes (D1/D2/D3). 
Run from repo root: sbatch applications/cytoland/examples/configs/vscyto3d/run_a549_infected.slurm + +#SBATCH --job-name=VSCyto3D_A549_infected +#SBATCH --time=5-00:00:00 +#SBATCH --nodes=1 +#SBATCH --ntasks-per-node=4 +#SBATCH --partition=gpu +#SBATCH --cpus-per-task=16 +#SBATCH --gpus=4 +#SBATCH --mem=1024G +# Need >=80 GB VRAM so the 3-DM cycled batch (16x4x3 = 192 crops/step/GPU) +# fits with bf16 + 35M-param FCMAE. Excludes 40 GB A100 / 48 GB A40/A6000. +#SBATCH --constraint='h200|h100_80|a100_80' +#SBATCH --output=/hpc/projects/comp.micro/virtual_staining/models/cytoland/a549_infected/vscyto3d/slurm/%j.out +#SBATCH --error=/hpc/projects/comp.micro/virtual_staining/models/cytoland/a549_infected/vscyto3d/slurm/%j.err + +mkdir -p -m 775 /hpc/projects/comp.micro/virtual_staining/models/cytoland/a549_infected/vscyto3d/slurm +mkdir -p -m 775 /hpc/projects/comp.micro/virtual_staining/models/cytoland/a549_infected/vscyto3d/checkpoints + +ml uv + +export PYTHONUNBUFFERED=1 + +nvidia-smi +srun uv run python -m cytoland fit --config applications/cytoland/examples/configs/vscyto3d/finetune_a549_infected.yml diff --git a/applications/cytoland/examples/configs/vscyto3d/train_spotlight.yml b/applications/cytoland/examples/configs/vscyto3d/train_spotlight.yml index f7ba5642f..a5cbdd25c 100644 --- a/applications/cytoland/examples/configs/vscyto3d/train_spotlight.yml +++ b/applications/cytoland/examples/configs/vscyto3d/train_spotlight.yml @@ -2,7 +2,8 @@ # Requires: viscy preprocess --compute_otsu --compute_fg_masks # Usage: python -m cytoland fit --config vscyto3d/train_spotlight.yml base: - - ../recipes/trainer/fit_4gpu.yml + - ../recipes/trainer/fit.yml + - ../recipes/topology/ddp_4gpu.yml - ../recipes/data/hcs_nuc_mem_3d.yml - ../recipes/modes/spotlight.yml - ../recipes/models/unext2_3d.yml @@ -12,6 +13,10 @@ model: lr: 0.0002 schedule: WarmupCosine +trainer: + precision: 16-mixed + max_epochs: 200 + data: init_args: data_path: #TODO HCS OME-Zarr data diff --git 
a/applications/cytoland/examples/configs/vsneuromast/fit.yml b/applications/cytoland/examples/configs/vsneuromast/fit.yml index cdbc41b9c..371c61904 100644 --- a/applications/cytoland/examples/configs/vsneuromast/fit.yml +++ b/applications/cytoland/examples/configs/vsneuromast/fit.yml @@ -1,7 +1,8 @@ # VSNeuromast: supervised training from scratch (no pretraining). # Usage: python -m cytoland fit --config vsneuromast/fit.yml base: - - ../recipes/trainer/fit_4gpu.yml + - ../recipes/trainer/fit.yml + - ../recipes/topology/ddp_4gpu.yml - ../recipes/data/hcs_nuc_mem_neuromast.yml - ../recipes/models/unext2_neuromast.yml @@ -16,6 +17,10 @@ model: lr: 0.001 schedule: Constant +trainer: + precision: 16-mixed + max_epochs: 200 + data: init_args: data_path: #TODO HCS OME-Zarr data diff --git a/applications/cytoland/examples/configs/vsneuromast/predict.yml b/applications/cytoland/examples/configs/vsneuromast/predict.yml index 273ebc002..2f56a67e9 100644 --- a/applications/cytoland/examples/configs/vsneuromast/predict.yml +++ b/applications/cytoland/examples/configs/vsneuromast/predict.yml @@ -2,7 +2,8 @@ # Checkpoint: https://public.czbiohub.org/comp.micro/viscy/VS_models/VSNeuromast/epoch=64-step=24960.ckpt # Usage: python -m cytoland predict --config vsneuromast/predict.yml base: - - ../recipes/trainer/predict_gpu.yml + - ../recipes/trainer/predict.yml + - ../recipes/topology/single_gpu.yml - ../recipes/data/hcs_nuc_mem_neuromast.yml - ../recipes/models/unext2_neuromast.yml diff --git a/applications/cytoland/pyproject.toml b/applications/cytoland/pyproject.toml index abce3c8d6..66ba77117 100644 --- a/applications/cytoland/pyproject.toml +++ b/applications/cytoland/pyproject.toml @@ -14,14 +14,13 @@ keywords = [ ] license = "BSD-3-Clause" authors = [ { name = "Biohub", email = "compmicro@czbiohub.org" } ] -requires-python = ">=3.11" +requires-python = ">=3.12" classifiers = [ "Development Status :: 4 - Beta", "Intended Audience :: Science/Research", "License :: OSI Approved 
:: BSD License", "Operating System :: OS Independent", "Programming Language :: Python :: 3 :: Only", - "Programming Language :: Python :: 3.11", "Programming Language :: Python :: 3.12", "Programming Language :: Python :: 3.13", "Programming Language :: Python :: 3.14", diff --git a/applications/cytoland/src/cytoland/engine.py b/applications/cytoland/src/cytoland/engine.py index 5087ea052..03b272c08 100644 --- a/applications/cytoland/src/cytoland/engine.py +++ b/applications/cytoland/src/cytoland/engine.py @@ -10,10 +10,8 @@ import torch.nn.functional as F from imageio import imwrite from lightning.pytorch import LightningModule -from monai.optimizers import WarmupCosineSchedule from monai.transforms import DivisiblePad, Rotate90 from torch import Tensor, nn -from torch.optim.lr_scheduler import ConstantLR from torchmetrics.functional import ( accuracy, cosine_similarity, @@ -31,6 +29,7 @@ from viscy_utils.callbacks.prediction_writer import _blend_in from viscy_utils.evaluation.metrics import mean_average_precision from viscy_utils.log_images import detach_sample, log_image_grid +from viscy_utils.optimizers import configure_adamw_scheduler from viscy_utils.tensor_utils import to_numpy _UNET_ARCHITECTURE = { @@ -141,6 +140,8 @@ def __init__( loss_function: nn.Module | None = None, lr: float = 1e-3, schedule: Literal["WarmupCosine", "Constant"] = "Constant", + warmup_steps: int = 3, + warmup_multiplier: float = 1e-3, freeze_encoder: bool = False, ckpt_path: str | None = None, log_batches_per_epoch: int = 8, @@ -153,7 +154,7 @@ def __init__( tta_type: Literal["mean", "median", "product"] = "mean", ) -> None: super().__init__() - self.save_hyperparameters(ignore=["loss_function"]) + self.save_hyperparameters(ignore=["loss_function", "ckpt_path"]) if model_config is None: model_config = {} net_class = _UNET_ARCHITECTURE.get(architecture) @@ -165,6 +166,8 @@ def __init__( self.loss_function = loss_function if loss_function else nn.MSELoss() self.lr = lr self.schedule = 
schedule + self.warmup_steps = warmup_steps + self.warmup_multiplier = warmup_multiplier self.log_batches_per_epoch = log_batches_per_epoch self.log_samples_per_batch = log_samples_per_batch self.training_step_outputs = [] @@ -510,18 +513,14 @@ def configure_optimizers(self): f"(e.g. FullyConvolutionalMAE), got {type(self.model).__name__}" ) self.model.encoder.requires_grad_(False) - optimizer = torch.optim.AdamW(self.model.parameters(), lr=self.lr) - if self.schedule == "WarmupCosine": - scheduler = WarmupCosineSchedule( - optimizer, - warmup_steps=3, - t_total=self.trainer.estimated_stepping_batches, - warmup_multiplier=1e-3, - ) - return [optimizer], [{"scheduler": scheduler, "interval": "step"}] - elif self.schedule == "Constant": - scheduler = ConstantLR(optimizer, factor=1, total_iters=self.trainer.max_epochs) - return [optimizer], [scheduler] + return configure_adamw_scheduler( + self, + self.model, + self.lr, + self.schedule, + warmup_steps=self.warmup_steps, + warmup_multiplier=self.warmup_multiplier, + ) def _log_samples(self, key: str, imgs: Sequence[Sequence[np.ndarray]]): """Log image sample grid to the active logger (TensorBoard or W&B).""" diff --git a/applications/dynacell/.gitignore b/applications/dynacell/.gitignore new file mode 100644 index 000000000..0cc49df5c --- /dev/null +++ b/applications/dynacell/.gitignore @@ -0,0 +1,3 @@ +lightning_logs/ +outputs/ +__pycache__/ diff --git a/applications/dynacell/CLAUDE.md b/applications/dynacell/CLAUDE.md new file mode 100644 index 000000000..47756a8fe --- /dev/null +++ b/applications/dynacell/CLAUDE.md @@ -0,0 +1,34 @@ +# dynacell — Claude Code reference + +## Model name conventions + +Code names (used in YAML config keys, prediction zarr filenames, eval pipeline keys, W&B run names) differ from the paper names. 
When writing/reading anything that crosses the code/paper boundary (figures, tables, Confluence pages, manuscripts), translate:
+
+| Code name (config / zarr / W&B) | Paper / display name |
+| --- | --- |
+| `fcmae_vscyto3d_scratch` | **UNeXt2** |
+| `fcmae_vscyto3d_pretrained` | **VSCyto3D** (FCMAE-pretrained is the canonical VSCyto3D variant) |
+| `unext2` | UNeXt2 (legacy zarr prefix; superseded by `fcmae_vscyto3d_scratch`) |
+| `vscyto3d` | VSCyto3D (display key in Dihan's eval pipeline; sources `*_fcmae_vscyto3d_pretrained` predictions) |
+| `unetvit3d` | UNetViT3D |
+| `fnet3d_paper` | FNet3D |
+| `celldiff` | CELL-Diff (variants: `iterative`, `sliding_window`, `denoise`/Mean Predictor) |
+
+Eval-pipeline directory naming (`/hpc/projects/virtual_staining/training/dynacell/{ipsc,a549}/evaluations/eval_<model>_<channel>[_<suffix>]`) uses the **paper key** (`unext2`, `vscyto3d`, `fnet3d`, `unetvit3d`, `celldiff_*`), not the config key. So `eval_unext2_membrane` maps to the `fcmae_vscyto3d_scratch` predictions, and `eval_vscyto3d_membrane` maps to `fcmae_vscyto3d_pretrained`.
+
+## Prediction zarr naming convention
+
+Set by `trainer.callbacks[…HCSPredictionWriter].init_args.output_store` in each leaf of `applications/dynacell/configs/benchmarks/virtual_staining/<organelle>/<model>/<train_set>/predict__*.yml`. The infix between model name and the optional plate condition flags the **training set** of the source model:
+
+| Trained on | Test set | Filename |
+| --- | --- | --- |
+| iPSC | iPSC | `<target>_<model>.zarr` |
+| iPSC | A549 plate | `<target>_<model>_<condition>.zarr` |
+| A549 | iPSC | `<target>_<model>_a549trained.zarr` |
+| A549 | A549 plate | `<target>_<model>_a549trained_<condition>.zarr` |
+| Joint (iPSC + A549) | iPSC | `<target>_<model>_jointtrained.zarr` |
+| Joint (iPSC + A549) | A549 plate | `<target>_<model>_jointtrained_<condition>.zarr` |
+
+Where `<target>` is `nucl` / `memb` / `sec61b` / `tomm20`, `<model>` is the **code name** from the table above (e.g. `fcmae_vscyto3d_scratch`, `fnet3d_paper`), and `<condition>` is `mock` / `denv` / `zikv`.
The (no-infix) iPSC-trained naming is historical baggage from before joint/A549 training existed; don't add a `_ipsctrained` infix retroactively. Output dirs: iPSC test predictions land under `ipsc/predictions/`, A549 plate predictions under `a549/predictions/`, regardless of training set.
+
+Caveat: Dihan's earlier ER + Mito iPSC-trained zarrs use a legacy `<target>_<model>__<target>_<condition>.zarr` shape (e.g. `sec61b_fcmae_vscyto3d_scratch__sec61b_mock.zarr`, double-underscore + redundant gene prefix). New leaves should follow the table above; do not propagate the legacy form.
diff --git a/applications/dynacell/README.md b/applications/dynacell/README.md
index c122cf2f5..764871776 100644
--- a/applications/dynacell/README.md
+++ b/applications/dynacell/README.md
@@ -7,17 +7,15 @@ Benchmark virtual staining application for deterministic and generative architec
 Set `data_path` in the config file or pass it on the command line:
 
 ```bash
-cd applications/dynacell/examples/configs
+cd applications/dynacell/configs/examples
 
 # Deterministic models
-uv run dynacell fit -c unetvit3d/fit.yml --data.init_args.data_path=/path/to/data.zarr
 uv run dynacell fit -c fnet3d/fit.yml --data.init_args.data_path=/path/to/data.zarr
-uv run dynacell predict -c unetvit3d/predict.yml --data.init_args.data_path=/path/to/data.zarr --ckpt_path=/path/to/checkpoint.ckpt
-uv run dynacell predict -c fnet3d/predict.yml --data.init_args.data_path=/path/to/data.zarr --ckpt_path=/path/to/checkpoint.ckpt
+uv run dynacell fit -c unext2/fit.yml --data.init_args.data_path=/path/to/data.zarr
+uv run dynacell fit -c unetvit3d/fit.yml --data.init_args.data_path=/path/to/data.zarr
 
 # Flow-matching CellDiff
 uv run dynacell fit -c celldiff/fit.yml --data.init_args.data_path=/path/to/data.zarr
-uv run dynacell predict -c celldiff/predict.yml --data.init_args.data_path=/path/to/data.zarr --ckpt_path=/path/to/checkpoint.ckpt
 ```
 
 ## Architectures
@@ -34,26 +32,80 @@ uv run dynacell predict -c celldiff/predict.yml --data.init_args.data_path=/path
Uses ODE sampling for inference. No external loss function needed — the flow-matching loss is computed internally. -## SEC61B Benchmark +## Config Structure + +- `configs/recipes/` — reusable fragments (model, trainer, data, modes) +- `configs/examples/` — generic fit/predict pair per model family (stubs with + `#TODO` placeholders) +- `configs/benchmarks/virtual_staining/` — runnable benchmark leaves composed + from shared axes. One file per (organelle, train_set, model) for fit and + one per (organelle, train_set, model, predict_set) for predict. See + `configs/benchmarks/virtual_staining/README.md` for the layout and + composition order. +- `tools/submit_benchmark_job.py` — drives one benchmark leaf end-to-end + (compose → strip launcher metadata → render sbatch → submit). Use + `--print-script` for a safe preview on any leaf, or `--dry-run` to + stage artifacts to `launcher.run_root` without submitting (requires + write permission on that path). -Launch SEC61B training from Dynacell (canonical location): +### Benchmark submit ```bash -# FNet3D benchmark config -uv run python -m dynacell fit --config applications/dynacell/examples/configs/sec61b/fit_fnet3d.yml +LEAF=applications/dynacell/configs/benchmarks/virtual_staining/er/celldiff/ipsc_confocal/train.yml -# FNet3D paper-native baseline config -uv run python -m dynacell fit --config applications/dynacell/examples/configs/sec61b/fit_fnet3d_paper.yml +# Preview the rendered sbatch to stdout — safe on any leaf, no disk writes: +uv run python applications/dynacell/tools/submit_benchmark_job.py $LEAF --print-script -# UNeXt2 (VSCyto3D) -uv run python -m dynacell fit --config applications/dynacell/examples/configs/sec61b/fit_unext2.yml +# Preview the resolved LightningCLI config (launcher+benchmark stripped): +uv run python applications/dynacell/tools/submit_benchmark_job.py $LEAF --print-resolved-config -# SLURM (H200) -sbatch applications/dynacell/examples/configs/sec61b/run_fnet3d.slurm -sbatch 
applications/dynacell/examples/configs/sec61b/run_fnet3d_paper.slurm
-sbatch applications/dynacell/examples/configs/sec61b/run_unext2.slurm
+# Stage artifacts to launcher.run_root without submitting (requires write perms):
+uv run python applications/dynacell/tools/submit_benchmark_job.py $LEAF --dry-run
+
+# Submit:
+uv run python applications/dynacell/tools/submit_benchmark_job.py $LEAF
+
+# Dotlist overrides deep-merge after compose (repeatable, no ${...} interpolation):
+uv run python applications/dynacell/tools/submit_benchmark_job.py $LEAF \
+  --override trainer.max_epochs=50 \
+  --override data.init_args.batch_size=2
+```
+
+Flag semantics:
+
+- `--print-script` / `--print-resolved-config` — pure preview: stdout
+  only, no disk writes, no submission. Safe against run_roots the caller
+  can't write to.
+- `--dry-run` alone — write resolved YAML + rendered sbatch under
+  `launcher.run_root`, but skip `sbatch`. Requires write permission on
+  that path.
+- `--dry-run` combined with any `--print-*` — preview wins (no writes).
+- Bare invocation — write artifacts **and** submit.
+
+Benchmark leaves carry two reserved top-level YAML keys (`launcher:` and
+`benchmark:`) that are stripped automatically before the config reaches
+LightningCLI, so `uv run dynacell fit -c <leaf>` also works
+without the submit tool.
+
+See `configs/benchmarks/virtual_staining/README.md` for the shared-axis
+layout, composition order, and reserved-key contract.
+
+## Manifest registry (drift policy)
+
+Benchmark leaves resolve `benchmark.dataset_ref` lookups against a bundled
+manifest registry shipped with the dynacell wheel at
+`applications/dynacell/src/dynacell/_manifests/`. The resolver auto-discovers
+this via the `dynacell.manifest_roots` entry point, so `uv run dynacell
+predict -c <leaf>` works out of the box without `DYNACELL_MANIFEST_ROOTS`.
+Override the env var to point at an alternate registry for testing.
+
+VisCy is the source of truth for manifest **content**; `dynacell-paper`
+remains the source of truth for manifest **authoring**. When a new plate
+is preprocessed in `dynacell-paper`, mirror the new manifest (and its
+`splits/` siblings) into `applications/dynacell/src/dynacell/_manifests/`.
+The `tests/test_manifest_sync.py` suite catches drift when run with
+`DYNACELL_PAPER_PATH=/path/to/dynacell-paper` set.
+
 ## Supported subcommands
 
 - `fit` and `validate`: fully supported for all architectures
diff --git a/applications/dynacell/configs/benchmarks/A549_EXPANSION_ROADMAP.md b/applications/dynacell/configs/benchmarks/A549_EXPANSION_ROADMAP.md
new file mode 100644
index 000000000..64b886767
--- /dev/null
+++ b/applications/dynacell/configs/benchmarks/A549_EXPANSION_ROADMAP.md
@@ -0,0 +1,160 @@
+# A549 Expansion Roadmap
+
+Multi-stage rollout adding A549/mantis-lightsheet alongside the
+existing iPSC/confocal benchmark cells, with a manifest-driven
+dataset resolver as the foundation.
+
+## Goal
+
+- **Two training sets per (organelle, model) cell**: `ipsc_confocal`
+  and `joint_ipsc_confocal_a549_mantis`.
+- **Two held-out evaluation splits per trained model**:
+  `ipsc_confocal` and `a549_mantis`. Every trained model evaluates on
+  both, regardless of training source, so cross-dataset transfer is
+  measurable.
+
+The post-reorg layout (`14f59f1`) supports this — each
+`<organelle>/<model>/<train_set>` dir is a training experiment with room
+for multiple `predict__<predict_set>.yml` and
+`eval__<predict_set>.yaml` leaves. The resolver removed the data-path
+duplication that would otherwise blow up across ~60 new leaves.
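The duplication the resolver removes can be sketched with an in-memory stand-in for the registry. The real registry is a set of YAML manifests discovered from registry roots; the `DatasetFacts` shape, keys, paths, and channel names below are illustrative, not copied from actual manifests:

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DatasetFacts:
    """Dataset-level facts a benchmark leaf no longer needs to repeat."""
    data_path: str
    source_channel: str
    target_channel: str


# Stand-in for the YAML manifest registry, keyed by (dataset, target).
_REGISTRY = {
    ("ipsc-confocal", "sec61b"): DatasetFacts(
        data_path="/hpc/illustrative/ipsc/er_sec61b.zarr",  # invented path
        source_channel="Phase3D",
        target_channel="GFP",
    ),
}


def resolve_dataset_ref(ref: dict[str, str]) -> DatasetFacts:
    """A leaf carries only {'dataset': ..., 'target': ...}; every repeated
    path/channel fact is looked up, so ~60 leaves stay a few lines each."""
    return _REGISTRY[(ref["dataset"], ref["target"])]
```

Each new leaf then only contributes a two-key `dataset_ref` instead of re-stating store paths and channels.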
+ +## Status snapshot (2026-04-26) + +| Stage | Description | Status | +|---|---|---| +| 1 | Resolver core + 1 migration (`er_sec61b`, `ipsc_confocal`) | **Done** — `38d47b3`, `4bb9f09` | +| 2 | Migrate `mito_tomm20`, `nucleus`, `membrane` to `dataset_ref` | **Done** — `11836c8`, `326b2d0`, `6273439` | +| 3 | Hydra-side hook + 4 eval target YAMLs migrated | **Done** — `8924ab2`, `f5a6e56`, `a984384` | +| 4 | (folded into Stage 3) | n/a | +| 5 | Register a549-mantis manifests | **Partial** — done in dynacell-paper (`aeef64c`, 7 per-plate manifests 2024_10_29 → 2025_08_26); VisCy fixture mirror missing. A549 zarr normalization-stats backfill closed 2026-04-24 (dynacell-paper `f4120e0` + 17-zarr backfill). | +| 6 | Single-dataset a549 predict + eval leaves | **Not started** | +| 7 | Joint training leaves (ipsc + a549) | **In flight — blocked**. First leaf + smoke variants shipped (`er/celldiff`: `9654e2b`, `4d399d5`, `234819a`). 4-GPU DDP smoke still hangs after PR #413 (`0b04b24`) — a second deadlock surface remains; see `.claude/handoffs/handoff-batched-concat-ddp-hang-followup-2026-04-26.md`. | + +## Remaining work + +### Stage 5 — VisCy bundled manifest registry + +The canonical a549-mantis manifests live in `dynacell-paper`. VisCy +ships its own copy of the canonical YAMLs as a bundled registry under +`applications/dynacell/src/dynacell/_manifests/`, registered as a +`dynacell.manifest_roots` entry-point provider in +`applications/dynacell/pyproject.toml`. The resolver auto-discovers +this without any `DYNACELL_MANIFEST_ROOTS` env var configuration — +works on a fresh clone for any Stage 6 a549 leaf. Drift between the +mirror and dynacell-paper canonical is guarded by +`tests/test_manifest_sync.py`, which is skipped unless +`DYNACELL_PAPER_PATH` is set (typical CI / local dev environment). 
+ +The a549 zarr normalization-stats gap (every `mantis_v1///.zarr` +missing `normalization` zattrs at plate and position level) closed on +2026-04-24: dynacell-paper `f4120e0` adds `generate_normalization_metadata` +as a post-write step in the assembly pipeline, and the 17 pre-hook +zarrs were backfilled in 5.7 min. Joint leaves consuming these stores +no longer fail or asymmetrically normalize at training time. Treat as +done; no VisCy-side action. + +### Stage 6 — single-dataset a549 predict + eval leaves + +Add `predict__a549_mantis.yml` + `eval__a549_mantis.yaml` to existing +`//ipsc_confocal/` cells so iPSC-trained models can +be evaluated on the a549 test split. + +Sub-scope (from the original roadmap, still unresolved): + +- **(iv) full-but-predictable-only** — the 8 cells that already have + `_predict.yml` overlays (celldiff + unetvit3d × 4 organelles). + Recommended starting point. +- **(iii) full-all-models** — additionally create skeleton + `fcmae_vscyto3d_predict.yml`, `fnet3d_paper_predict.yml`, + `unext2_predict.yml` overlays. Defer unless needed. + +Each leaf is 5–10 lines: + +```yaml +# eval__a549_mantis.yaml +defaults: + - override /target: er_sec61b +benchmark: + dataset_ref: {dataset: a549-mantis-2024_11_07, target: sec61b} +io: + pred_path: /hpc/.../sec61b_celldiff_on_a549.zarr +save: + save_dir: /hpc/.../eval_sec61b_celldiff_on_a549 +``` + +### Stage 7 — joint training leaf expansion + +The joint-loader infrastructure landed in `4bc2e53` (sharded sampler +in `BatchedConcatDataModule`) and `5950576` (split fit overlays). PR +#413 (`0b04b24`) addressed one DDP deadlock surface (the +`use_thread_workers=True` thread-shim under real `init_process_group`) +but the 4-GPU smoke still hangs at the same milestone — a second +deadlock surface remains; see +`.claude/handoffs/handoff-batched-concat-ddp-hang-followup-2026-04-26.md`. +Joint leaf expansion is blocked until this resolves. 
+
+The first joint leaf shipped at
+`er/celldiff/joint_ipsc_confocal_a549_mantis/train.yml` (`9654e2b`);
+smoke variants followed (single-GPU `4d399d5`, 4-GPU DDP `234819a`).
+The single-GPU smoke runs end-to-end against `_test48` debug zarrs;
+the 4-GPU DDP smoke is the failing reproducer for the open deadlock.
+
+Smoke leaves rely on the `_test48` debug-zarr convention documented
+in this app's `CLAUDE.md` and mirrored in `dynacell-paper`'s `CLAUDE.md`:
+short-wall validation jobs override `data_path` to the colocated
+`<store>_test48.zarr` so `mmap_preload` finishes staging in under a
+minute instead of 45+ min on the full 500-FOV stores.
+
+Joint leaves bypass the single-dataset `dataset_ref` resolver and
+author the data block inline because hparams live on each child.
+Shared HCS init_args factor via a YAML merge anchor.
+
+Remaining matrix:
+
+- Other organelles for `celldiff`: `mito`, `nucleus`, `membrane`.
+- Other models for `er`: `unetvit3d`, `fcmae_vscyto3d_{scratch,pretrained}`,
+  `fnet3d_paper`, `unext2`.
+- Cross-product: 4 organelles × 6 models = 24 cells (minus the one
+  already shipped).
+- Companion leaves per joint cell: `predict__ipsc_confocal.yml`,
+  `predict__a549_mantis.yml`, `eval__ipsc_confocal.yaml`,
+  `eval__a549_mantis.yaml`.
+
+Decision pending: order of expansion. Reasonable defaults are
+"finish the celldiff row first" (organelle sweep on a known-good
+model) or "finish the er column first" (model sweep on a known-good
+organelle). Pick when the next paper experiment lands.
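The merge-anchor factoring mentioned above can be sketched like this. It is a hedged illustration, not the shipped leaf: the class path, keys, and values are invented for the example; only `BatchedConcatDataModule` is a name taken from this roadmap.

```yaml
# Shared HCS init_args written once, merged into every child data module.
# `&hcs_shared` defines the anchor; `<<: *hcs_shared` merges it in, and any
# sibling key (data_path here) overrides the shared value per child.
_hcs_shared: &hcs_shared
  batch_size: 16
  num_workers: 8
  yx_patch_size: [384, 384]

data:
  class_path: dynacell.data.BatchedConcatDataModule  # illustrative dotted path
  init_args:
    data_modules:
      - init_args:
          <<: *hcs_shared
          data_path: /path/to/ipsc_confocal.zarr
      - init_args:
          <<: *hcs_shared
          data_path: /path/to/a549_mantis.zarr
```

YAML merge keys (`<<`) are a YAML 1.1 feature; most Python loaders, including PyYAML's safe loader, support them.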
+ +## Dependency graph + +``` +Stage 1 ✅ ─> Stage 2 ✅ ─> Stage 3 ✅ + └─> Stage 6 (predict/eval on a549) + ^ + │ +Stage 5 (a549 manifest) — partial ────┘ + canonical: done + VisCy fixture mirror: pending + +Stage 7 (joint training leaves) — independent of resolver path + first leaf + smoke variants: done + 4-GPU DDP smoke: blocked on remaining deadlock (see followup handoff) + expansion (24 cells + companion leaves): pending P0 deadlock fix +``` + +Stages 1–3 and 5 (canonical) blocked Stage 6. The remaining gap on +the VisCy side is the fixture mirror. Stage 7 has its own +infrastructure (`BatchedConcatDataModule` + `ShardedDistributedSampler`) +and is orthogonal to the resolver path. + +## Non-goals + +- FOV-level split resolution (Phase 5D of the dynacell-paper refactor — + about *FOV membership*, not *dataset facts*). +- New CLI flags on `dynacell fit` / `predict` — the resolver is implicit + via the composition hook. +- Reporting-side path resolution — reporting consumes eval outputs, not + source data. +- Changes to `_internal/shared/model/model_overlays/` or + `launcher_profiles/` — those are model/hardware concerns, orthogonal. diff --git a/applications/dynacell/configs/benchmarks/UNEXT2_VS_FCMAE_CLASSES.md b/applications/dynacell/configs/benchmarks/UNEXT2_VS_FCMAE_CLASSES.md new file mode 100644 index 000000000..eecfad8d9 --- /dev/null +++ b/applications/dynacell/configs/benchmarks/UNEXT2_VS_FCMAE_CLASSES.md @@ -0,0 +1,297 @@ +# `UNeXt2` vs `FullyConvolutionalMAE`: one paper architecture, two PyTorch models + +Reconciling the Cytoland paper +([Liu et al., *Nat. Mach. Intell.* 2025, doi:10.1038/s42256-025-01046-2](https://doi.org/10.1038/s42256-025-01046-2)) +with the two independent Python classes that claim to implement its +"UNeXt2" architecture. Needed while planning FCMAE-pretrained finetune +runs on `dynacell-models`, where the naming otherwise misleads. 
+ +## TL;DR + +- The paper (Fig 1b ↔ 1c) describes **one** architecture — "UNeXt2" — + trained twice: first self-supervised via FCMAE masking, then supervised + with the pretrained encoder transferred in. +- The code has **two independent Python classes** claiming to implement + that architecture: `viscy_models.unet.unext2.UNeXt2` (timm-backed) and + `viscy_models.unet.fcmae.FullyConvolutionalMAE` (custom masked + re-implementation). They have **incompatible state_dicts** AND + **structurally different models** — verified below by parameter count. +- The split predates the packaging refactor and predates the `UNeXt2` + rename. The supervised path started as `viscy/unet/networks/Unet21D.py` + in August 2023, and the masked FCMAE path was added as + `viscy/unet/networks/fcmae.py` in April 2024. The key reason for the + second implementation was masked pre-training: `timm.models.convnext` + did not expose the per-block masking hooks needed by FCMAE, so Ziwen + Liu (paper lead author) wrote a standalone masked ConvNeXtV2 encoder. + Some of the larger architectural divergence we see today is current + implementation reality, not necessarily the original motivation. +- In the paper's published workflow, + **`FcmaeUNet(architecture="fcmae")` is used for BOTH the self-supervised + pretrain AND the supervised finetune** (the `pretraining` boolean + toggles masking in `forward`). The timm-backed `UNeXt2` class is + **never** used with FCMAE-pretrained weights. +- The checkpoint matters. The published and current fine-tuning script + `/hpc/mydata/alex.kalinin/vs_test/finetune_3d.py` loads + `/hpc/projects/virtual_staining/models/mehta-lab/VSCyto3D/fcmae.ckpt`, + and that checkpoint **does** load into the current + `FullyConvolutionalMAE`/`FcmaeUNet` path. 
The other checkpoint explored + during planning, + `/hpc/projects/comp.micro/virtual_staining/models/fcmae-3d/fit_v1/.../last.ckpt`, + does **not** load into the current packaged FCMAE class because its + stem tensor shapes differ. +- **Setting `pretraining=False` on the FCMAE model does not produce the + same PyTorch model as `UNeXt2`.** They differ in stem (LayerNorm or + not), head (trainable Conv3d or pure PixelShuffle), num_blocks (6 vs 8), + total parameter count (32.4M vs 32.1M), and block forward numerics. + They are the same *conceptual* architecture from the paper's pen-and- + paper diagram, not the same PyTorch hypothesis class. +- So the currently-running dynacell `unext2.yml` job (timm-backed + `UNeXt2`) is a valid "from-scratch ConvNeXtV2-tiny baseline" but is + **not** the apples-to-apples random-init control for a FCMAE-pretrained + finetune. For a clean comparison, both runs must be + `FullyConvolutionalMAE(pretraining=False)`. + +## What the paper says (Fig 1b ↔ 1c) + +One architecture, called **UNeXt2** = +*3D projection stem + 2D encoder + 2D decoder + 3D head*. +Trained twice: + +- **1b (FCMAE pretrain):** masked input, reconstruction loss on masked + regions. +- **1c (virtual-staining supervised):** same net, pretrained encoder + weights copied in, decoder trained from scratch, phase→fluor regression. + +Unambiguous — it's the *same* network, two training regimes. 
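The state_dict incompatibility can be made concrete without instantiating either model by diffing parameter-name sets. A minimal sketch, using representative key names from the comparison table in this document (`state_dict_mismatch` is a hypothetical helper, not library code):

```python
def state_dict_mismatch(
    model_keys: set[str], ckpt_keys: set[str]
) -> dict[str, list[str]]:
    """Summarize why load_state_dict would fail: parameters the model
    expects but the checkpoint lacks, and vice versa."""
    return {
        "missing": sorted(model_keys - ckpt_keys),
        "unexpected": sorted(ckpt_keys - model_keys),
    }


# Representative parameter names from the two classes.
unext2_keys = {
    "stem.weight",
    "encoder_stages.stages_0.blocks.0.conv_dw.weight",
    "encoder_stages.stages_0.blocks.0.norm.weight",
}
fcmae_keys = {
    "encoder.stem.conv3d.weight",
    "encoder.stages.0.blocks.0.dwconv.weight",
    "encoder.stages.0.blocks.0.layernorm.weight",
}
report = state_dict_mismatch(unext2_keys, fcmae_keys)
# Zero overlap: every key lands in exactly one bucket.
```

The same zero-overlap pattern is what a `strict=True` load surfaces as missing/unexpected keys.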
+ +## What the code actually has + +Two independent classes under `packages/viscy-models/src/viscy_models/unet/`: + +| | `unext2.py::UNeXt2` | `fcmae.py::FullyConvolutionalMAE` | +|---|---|---| +| Encoder impl | `timm.create_model("convnextv2_tiny", features_only=True)` with `stem_0 → nn.Identity()`, separate `UNeXt2Stem` prepended | Custom `MaskedMultiscaleEncoder` built from `MaskedConvNeXtV2Block` + `MaskedAdaptiveProjection` — from-scratch re-implementation of ConvNeXtV2 with masking hooks in every block | +| Stem params | `stem.weight`, `stem_1.weight` | `encoder.stem.conv3d.*`, `encoder.stem.conv2d.*`, `encoder.stem.norm.*` | +| Block params | `encoder_stages.stages_0.blocks.0.conv_dw.weight`, `.norm.weight` | `encoder.stages.0.blocks.0.dwconv.weight`, `.layernorm.weight` | +| Masking hook | none — inference only | `unmasked: BoolTensor \| None` kwarg threaded through every block's `forward` | +| State_dict interchange | — | **Not compatible.** No adapter exists in the codebase. | + +## Why `pretraining=False` does **not** collapse the gap + +The natural intuition is that `FullyConvolutionalMAE(pretraining=False)` +with `mask_ratio=0.0, unmasked=None` degenerates to a plain ConvNeXtV2 +forward pass and should therefore be structurally equivalent to `UNeXt2` +(both wrap ConvNeXtV2-tiny). Probing both classes at matching config +(`backbone=convnextv2_tiny, in_stack_depth=15, stem_kernel_size=[5,4,4], +decoder_conv_blocks=2, in_channels=1, out_channels=1, drop_path_rate=0.1`) +shows that is not the case: + +``` +UNeXt2 total params: 32,426,277 num_blocks: 6 +FullyConvolutionalMAE(p=F) total params: 32,148,528 num_blocks: 8 + delta: -277,749 (-0.86%) + +UNeXt2 children FCMAE(p=F) children + encoder_stages: 27,860,256 encoder: 27,857,856 (stem folded in) + stem: 2,592 decoder: 4,290,672 + decoder: 4,561,616 head: 0 + head: 1,813 (no separate stem module) + +UNeXt2 stem has LayerNorm? False +FCMAE encoder.stem has norm? 
True +``` + +Concrete structural differences that survive `unmasked=None`: + +1. **Stem normalization.** `MaskedAdaptiveProjection` applies + `nn.LayerNorm(out_channels)` after the 3D→channels projection. + `UNeXt2Stem` is just `Conv3d + reshape` with no normalization. The + first activations handed to stage 0 have different statistics in the + two classes. + +2. **Head is structurally different.** `UNeXt2.head` is + `PixelToVoxelHead` = `UpSample(pixelshuffle) + Conv3d + icnr_init + + PixelShuffle` (1,813 trainable params). + `FullyConvolutionalMAE.head` defaults to `PixelToVoxelShuffleHead` = + a pure `UpSample(pixelshuffle)` (**0 trainable params**) and pushes + all channel math into the decoder's last stage. Not the same output + pathway. `FullyConvolutionalMAE(head_conv=True, ...)` would select + `PixelToVoxelHead` but with different channel wiring than `UNeXt2`. + +3. **`num_blocks` differs (6 vs 8).** Consumed by + `DynacellUNet._make_divisible_pad` / `VSUNet._make_divisible_pad` to + require input spatial dims divisible by `2**num_blocks`. UNeXt2 needs + multiples of 64; FCMAE needs multiples of 256. A YX patch size that + validates for one will not necessarily validate for the other. + +4. **Block forward numerics diverge.** `MaskedConvNeXtV2Block.forward` is + `shortcut → dwconv → masked_patchify(x, unmasked=None) (flatten to + BLC) → LayerNorm on channels-last → GlobalResponseNormMlp(unsqueeze→ + squeeze) → masked_unpatchify (reshape back to BCHW) → drop_path + + shortcut`. Timm's `ConvNeXtV2Block.forward` is `shortcut → conv_dw → + norm (as LayerNorm2d in channels-first, or permute-for-channels-last + if `use_conv_mlp`) → mlp → gamma-scale (LayerScale when + `ls_init_value` is set) → drop_path + shortcut`. The masked block + always pays the patchify↔unpatchify reshape even in the no-mask case; + timm stays channels-first throughout; the LayerScale `gamma` + parameter is present in timm and absent in the masked block. 
Given + identical parameter tensors the two forward passes would not produce + bit-identical outputs. + +5. **Parameter count delta of 277,749 is structural, not initialization + noise.** Sources: the stem LayerNorm (+2 params), the head/decoder + partition difference (UNeXt2 head 1,813 + decoder 4,561,616 = 4,563,429 + vs FCMAE head 0 + decoder 4,290,672 = 4,290,672, delta 272,757 in the + decoder-plus-head block), and the block-level presence/absence of the + LayerScale `gamma` parameter. + +Conclusion: these are the same *conceptual* architecture from Fig 1 but +not the same PyTorch hypothesis class. Training one from scratch does +not yield an equivalent starting point to training the other from +scratch — different parameter sets, different normalization pathways, +different forward numerics. + +## Archaeology: why two on pre-refactor `main` + +History on `origin/main` (all commits by Ziwen Liu, paper's lead author): + +| SHA | Date | PR | Change | +|---|---|---|---| +| `b4ec13c` | 2023-08-30 | #37 | `viscy/unet/networks/Unet21D.py` introduced — supervised ConvNeXt-backed virtual-staining model with custom 3D stem and 3D head. This is the ancestor of today's `UNeXt2` class. | +| **`0536d29`** | **2024-04-08** | **#67** | **`viscy/unet/networks/fcmae.py` added as a new file**, commit titled "Masked autoencoder pre-training for virtual staining models". Squashed commit text explicitly shows the new masked encoder work: `draft fcmae encoder` → `add stem to the encoder` → `wip: masked stem layernorm` → `wip: patchify masked features for linear` → `use mlp from timm`. This was a new implementation, not a refactor of `Unet21D.py`. | +| `9a0fe64` | 2024-06-11 | #84 | `viscy/unet/networks/Unet21D.py` → `viscy/unet/networks/unext2.py`; class lineage rebranded to `UNeXt2`. `fcmae.py` remained a separate file. 
| + +**Why a standalone class instead of reusing Unet21D / UNeXt2?** +`timm.models.convnext.ConvNeXtBlock` has no per-block mask argument — +its `forward` computes `dwconv → norm → mlp → residual` with no hooks +for zeroing out masked activations or for sparse-gradient propagation. +FCMAE requires all three: masked dwconv input, +`masked_patchify`/`masked_unpatchify` around the pointwise MLP (so the +MLP only runs on visible patches and GRN statistics aren't polluted by +masked zeros), and drop-path/shortcut that skip the masked regions. The +clean path was to write `MaskedConvNeXtV2Block` from scratch with those +hooks baked in; monkey-patching timm's ConvNeXtBlock would have been +fragile across timm upgrades. + +**Why didn't the two codepaths converge later?** +There is no evidence that state_dict compatibility between the two +classes was ever a goal. The paper and the published scripts use the +FCMAE-side class for FCMAE pre-train and FCMAE-initialized finetune, and +use the supervised/timm side for scratch supervised baselines. So the +code never needed a translation layer to support the published workflow. +That explains the persistent key mismatch: `UNeXt2` inherits timm-style +naming (`stages_N`, `conv_dw`, `norm`), whereas the masked path uses its +own naming (`stages.N`, `dwconv`, `layernorm`). No adapter or +equivalence tests were added because the two state_dicts were not +expected to cross in production. 
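The key mismatch above (`stages_N`/`conv_dw`/`norm` vs `stages.N`/`dwconv`/`layernorm`) is mechanical enough that a one-shot rename adapter could be sketched. The three rename rules below are taken from the key families named in this document; real checkpoints would likely need more rules, and the parameters with no counterpart (stem LayerNorm, LayerScale `gamma`) cannot be mapped at all — this is a hypothetical sketch, not library code:

```python
import re

# Hypothetical one-shot adapter: translate timm-style UNeXt2 encoder keys
# to the masked-path (FCMAE) naming. Rules come from the key families
# quoted above; anything without a structural counterpart (stem LayerNorm,
# LayerScale gamma) is out of scope for a pure rename.
RENAMES = [
    (re.compile(r"\bstages_(\d+)\b"), r"stages.\1"),  # stages_0 -> stages.0
    (re.compile(r"\bconv_dw\b"), "dwconv"),
    (re.compile(r"\bnorm\b"), "layernorm"),
]


def timm_to_masked(key: str) -> str:
    """Apply the rename rules in order to one state_dict key."""
    for pattern, repl in RENAMES:
        key = pattern.sub(repl, key)
    return key


def translate_state_dict(sd: dict) -> dict:
    """Rename every key; values (tensors) pass through untouched."""
    return {timm_to_masked(k): v for k, v in sd.items()}
```

Even with such an adapter, the structural deltas listed above mean the translated state_dict would still be a partial load at best, which is consistent with no adapter ever being added.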
+ +## How the paper's own workflow handles the split + +The published fine-tuning path as currently exercised by +`/hpc/mydata/alex.kalinin/vs_test/finetune_3d.py` uses +**`FcmaeUNet` for both regimes**: + +```python +unet = FcmaeUNet(model_config=dict( + in_channels=1, out_channels=2, + encoder_blocks=[3, 3, 9, 3], encoder_drop_path_rate=0.1, + dims=[96, 192, 384, 768], decoder_conv_blocks=2, + stem_kernel_size=(5, 4, 4), in_stack_depth=15, + pretraining=False, # supervised mode, no masking in forward +)) + +if encoder_only: + encoder_weights = { + k.split("model.encoder.")[1]: v + for k, v in pretrained["state_dict"].items() + if "encoder" in k + } + unet.model.encoder.load_state_dict(encoder_weights) # same class, trivial load +``` + +`FcmaeUNet` wraps `FullyConvolutionalMAE`. The `pretraining` flag inside +`model_config` toggles masking in `forward`: +- `pretraining=True` → masked input + reconstruction loss (Fig 1b regime) +- `pretraining=False` → no masking + supervised regression loss (Fig 1c regime) + +Weight transfer between the two regimes is **trivial** because both +sides are `FullyConvolutionalMAE` — identical parameter names throughout. +No key translation, no adapter needed. + +On pre-refactor `main`, the encoder-only transfer lived in *user code*, +inside the fine-tune script, not in the library. The +`encoder_only` / `_load_encoder_weights` helper on +`cytoland.engine.FcmaeUNet` was added later on the modular branch to +formalize that same pattern. + +## Implications for our benchmarks + +The two Python classes serve distinct roles: + +- `FullyConvolutionalMAE` (via `FcmaeUNet`) — the FCMAE pretrain ⇄ + finetune codepath. This is what the paper's Fig 1b/1c workflow uses, + on both sides. +- `UNeXt2` — from-scratch supervised training *without* FCMAE + pretraining. Used for baselines / ablations that skip FCMAE entirely. 
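A slightly more defensive variant of the key-collection pattern in the snippet above can be sketched as follows (hypothetical helper, not the fine-tune script's code): matching by exact prefix rather than substring avoids mis-capturing keys that merely contain the word "encoder".

```python
def extract_submodule_state(state_dict: dict, prefix: str = "model.encoder.") -> dict:
    """Collect encoder-only weights from a Lightning checkpoint state_dict.

    Defensive variant of the `k.split("model.encoder.")[1]` pattern shown
    above: keys are matched by startswith, so a hypothetical key like
    "model.decoder.encoder_proj.weight" is neither captured nor crashes
    the split.
    """
    return {k[len(prefix):]: v for k, v in state_dict.items() if k.startswith(prefix)}
```

Because both sides of the transfer are `FullyConvolutionalMAE`, the resulting dict loads with `load_state_dict` directly, exactly as the script does.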
+ +**"UNeXt2" in the paper refers to the conceptual architecture, not the +Python class of the same name.** The Python class `UNeXt2` has never +been used with FCMAE-pretrained weights in any checked-in script or +benchmark — not on main, not on this branch, not in the published +artifacts. + +Dynacell's currently-running from-scratch job +(`benchmarks/virtual_staining/er/unext2/ipsc_confocal/train.yml`, SLURM +31122607) uses `DynacellUNet(architecture="UNeXt2")` — the timm-backed +class. That's a valid "from-scratch baseline with a timm ConvNeXtV2-tiny +encoder," but it trains a structurally different model (stem without +LayerNorm, Conv3d-backed head, 277k extra params, num_blocks=6) from +the FCMAE codepath. It is **not** the apples-to-apples random-init +control for an FCMAE-pretrained-init finetune: it's a different +hypothesis class that happens to share the paper's conceptual name. A +paper-faithful comparison requires both runs to use +`FullyConvolutionalMAE(pretraining=False)`. + +### Recommended benchmark layout for dynacell + +Do **not** treat the current `unext2.yml` leaf as the random-init control +for an FCMAE-pretrained run. Keep it, but label it honestly as the +timm-backed supervised UNeXt2 baseline. 
+ +For the FCMAE question, add a separate pair of leaves that use the same +class on both sides: + +- `fcmae_vscyto3d_scratch` +- `fcmae_vscyto3d_pretrained` + +Those two leaves should be identical except for encoder initialization: + +- same `FullyConvolutionalMAE(pretraining=False)` / `FcmaeUNet`-style model +- same decoder config +- same LR / batch / crops / epochs +- only `encoder_only + ckpt_path` differs + +Use the compatible checkpoint from the latest fine-tuning script: + +- `/hpc/projects/virtual_staining/models/mehta-lab/VSCyto3D/fcmae.ckpt` + +Do **not** use the incompatible checkpoint: + +- `/hpc/projects/comp.micro/virtual_staining/models/fcmae-3d/fit_v1/lightning_logs/pretrain-neuro-aic-hek-200ep_maxsize_fry1_resume4/checkpoints/last.ckpt` + +### Alternative paths + +1. **Use `FullyConvolutionalMAE(pretraining=False)` for both the + random-init and FCMAE-pretrained-init leaves** (retire the + timm-backed `unext2.yml` leaf, or re-frame it as a separate + baseline). Paper-faithful. The only axis of comparison between the + two new leaves is the encoder init. +2. **Keep the existing timm-backed `unext2.yml` as an informal baseline**, + add a `FullyConvolutionalMAE(pretraining=False)` FCMAE-finetune leaf + on the side. Comparison has an architecture asterisk — same paper + concept, structurally different PyTorch models (param count, stem, + head, num_blocks). +3. **Unify the two classes in `viscy-models`** (replace `UNeXt2`'s timm + encoder with a shared backbone that supports optional masking, or + make the timm encoder's state_dict transformable to FCMAE naming via + a one-shot adapter). Clean but a separate `viscy-models` PR. 
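One practical detail when pinning "same crops" across leaves: the `num_blocks` asymmetry noted in the structural comparison (6 vs 8) changes which YX patch sizes validate. A minimal sketch, assuming the `2**num_blocks` divisibility behavior described for `_make_divisible_pad` (this helper is illustrative, not the library's):

```python
def yx_patch_valid(yx: tuple[int, int], num_blocks: int) -> bool:
    """Check the 2**num_blocks spatial-divisibility rule described above.

    num_blocks=6 (timm-backed UNeXt2 path) -> YX dims must be multiples of 64.
    num_blocks=8 (FCMAE path)              -> YX dims must be multiples of 256.
    """
    divisor = 2 ** num_blocks
    return all(dim % divisor == 0 for dim in yx)
```

For example, a 384x384 crop validates for `num_blocks=6` but not for `num_blocks=8`, so a crop shared by the two new leaves should be a multiple of 256.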
diff --git a/applications/dynacell/configs/benchmarks/virtual_staining/README.md b/applications/dynacell/configs/benchmarks/virtual_staining/README.md new file mode 100644 index 000000000..7faf6e356 --- /dev/null +++ b/applications/dynacell/configs/benchmarks/virtual_staining/README.md @@ -0,0 +1,227 @@

# Virtual Staining Benchmark Configs

Composable leaf-per-experiment configs for dynacell virtual-staining benchmarks. Train, predict, and eval leaves for one training run live side-by-side under `<target>/<model>/<train_set>/` — one subdir per training experiment so a trained model, its predictions, and its evaluations form one coherent unit.

## Reserved top-level keys

Two top-level YAML keys are **reserved for dynacell** and are stripped from the composed config before it reaches LightningCLI:

- `launcher:` — sbatch directives, runtime env, job metadata. Consumed by `applications/dynacell/tools/submit_benchmark_job.py`.
- `benchmark:` — informational experiment metadata (target, train_set, experiment_id). Readable by downstream reporting; not consumed by Lightning.

The strip happens inside `viscy_utils.cli._maybe_compose_config`. This means `uv run dynacell fit -c <leaf>` works for any benchmark leaf without the dedicated submit tool.

The reserved top-level YAML key `benchmark:` (above) is unrelated to the Hydra `leaf=` selector used for eval. The Hydra selector was previously named `benchmark=`; both names referring to "benchmark" were a source of confusion, so the eval selector has been renamed.
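The stripping behavior can be illustrated with a standalone sketch. The real logic lives in `viscy_utils.cli._maybe_compose_config`; this version is an assumption about its shape, not its code:

```python
# Dynacell-only top-level keys; they must never reach LightningCLI.
RESERVED_KEYS = ("launcher", "benchmark")


def strip_reserved(composed: dict) -> tuple[dict, dict]:
    """Split a composed config into (cli_config, reserved_sections).

    Sketch of the behavior described above: the reserved keys are removed
    before the config is handed to LightningCLI, which is why any
    benchmark leaf also works with a plain `dynacell fit -c` invocation.
    """
    reserved = {k: composed[k] for k in RESERVED_KEYS if k in composed}
    cli_config = {k: v for k, v in composed.items() if k not in RESERVED_KEYS}
    return cli_config, reserved
```

Anything under `launcher:` is then consumed by the submit tool, while `benchmark:` is left for downstream reporting to read.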
## Layout

```
virtual_staining/
  README.md
  <target>/<model>/<train_set>/
    train.yml                          # LightningCLI fit leaf
    predict__<predict_set>.yml         # LightningCLI predict leaf
    eval__<predict_set>.yaml           # Hydra eval leaf (canonical location)
  _internal/                           # hidden support tree — not for browsing
    shared/
      model/
        train_sets/<train_set>.yml     # train-set metadata + benchmark.dataset_ref.dataset + HCS defaults
        predict_sets/<predict_set>.yml # predict-set metadata + benchmark.dataset_ref.dataset
        targets/<target>.yml           # benchmark.dataset_ref.target + target-specific norms / CPU augs
        data_overlays/
          <model>_fit.yml              # per-model HCS data hparams (batch_size, z_window, gpu_augs)
        model_overlays/
          <model>_fit.yml              # model + fit trainer (no data: block — joint leaves compose
                                       # only this half and author their own data: block)
          <model>_predict.yml          # model + predict trainer + predict data hparams
        launcher_profiles/
          mode_<mode>.yml              # launcher.mode
          hardware_<hardware>.yml      # sbatch directives + trainer.devices
          runtime_shared.yml           # launcher.runtime + launcher.env
      eval/
        target/<target>.yaml           # target_name + benchmark.dataset_ref.target
        feature_extractor/dynaclr/     # DynaCLR checkpoint + encoder kwargs
    leaf/                              # symlink tree aliasing canonical eval leaves
      <target>/<model>/<train_set>/eval__<predict_set>.yaml -> ../../../../../<target>/<model>/<train_set>/eval__<predict_set>.yaml
```

Leaves are grouped by **train set** inside each `<target>/<model>/` cell so that a training experiment (train + the predict/eval variants fed by its checkpoint) lives in one directory. Adding a new training run — e.g. the planned `joint_ipsc_confocal_a549_mantis` mix — means creating one new subdir; deleting one is `rm -r`. Each train-set dir holds one `train.yml` plus one `predict__<predict_set>.yml` and `eval__<predict_set>.yaml` per held-out split the model is evaluated on.

The top level of `virtual_staining/` shows only biology (`er/`, `membrane/`, `mito/`, `nucleus/`) plus `_internal/` — a hidden support tree whose leading underscore signals "implementation detail; don't browse here for science."
All Hydra group files, all shared composition building blocks, and the `leaf/` symlink adapter live under `_internal/`.

Train/predict leaves use LightningCLI (`.yml`). Eval leaves use Hydra and keep `.yaml` because Hydra's group resolution only discovers `.yaml` files. The `_internal/leaf/` symlink tree aliases each canonical eval leaf so Hydra's `leaf=` selector can discover them at `leaf/<target>/<model>/<train_set>/eval__<predict_set>.yaml`.

Eval runtime uses two search paths injected by `dynacell.__main__`: `virtual_staining/_internal/` (for the `leaf/` tree) and `virtual_staining/_internal/shared/eval/` (for the `target/` and `feature_extractor/dynaclr/` groups). Schema-only eval configs ship inside the dynacell package; wheel installs without the repo don't see the HPC-bound groups, and external users provide their own via `--config-dir`. See `applications/dynacell/src/dynacell/evaluation/README.md`.

## Composition order

Last wins via deep-merge. Lists replace wholesale — layers that own list fields (`callbacks`, `augmentations`, etc.) own the **full** list.

**Single-store train leaf** (at `<target>/<model>/<train_set>/train.yml`):

```yaml
base:
  - ../../../_internal/shared/model/train_sets/<train_set>.yml
  - ../../../_internal/shared/model/targets/<target>.yml
  - ../../../_internal/shared/model/data_overlays/<model>_fit.yml
  - ../../../_internal/shared/model/model_overlays/<model>_fit.yml
  - ../../../_internal/shared/model/launcher_profiles/mode_fit.yml
  - ../../../_internal/shared/model/launcher_profiles/hardware_<hardware>.yml
  - ../../../_internal/shared/model/launcher_profiles/runtime_shared.yml
```

**Joint train leaf** (e.g. `er/celldiff/joint_ipsc_confocal_a549_mantis/train.yml`):

Joint leaves (multi-dataset fit) bypass the single-dataset `dataset_ref` resolver and use `viscy_data.BatchedConcatDataModule` with explicit child `viscy_data.HCSDataModule` blocks per zarr / experiment.
They compose only `model_overlays/<model>_fit.yml` + launcher profiles — the `data:` block is authored inline because joint hparams live on the children. See `MULTI_DATASET_TRAINING_RECOMMENDATION.md` for rationale.

**Joint smoke sibling** (e.g. `er/celldiff/joint_ipsc_confocal_a549_mantis/train_smoke.yml`):

A `train_smoke.yml` lives next to the production `train.yml` for any joint leaf that needs a smoke runner. The smoke sibling pre-swaps each child's `data_path` to its colocated `_test48.zarr` debug variant (or keeps the path when the train split is already small) and uses a single-GPU launcher profile (`hardware_h200_single`) instead of multi-GPU DDP. The reason it's a sibling leaf rather than `--override` flags at submit time: `submit_benchmark_job.py`'s dotlist override parser does not index into list elements (`data.init_args.data_modules.0.init_args.data_path=...` is parsed as a dict-with-string-key, not a list index), so swapping a single child's zarr at submit time is not supported. Pair the smoke leaf with `--override trainer.fast_dev_run=true` (or `trainer.max_steps=N`) to bound the run.
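The parser limitation above can be reproduced with a minimal dotlist-merge sketch. This is an assumption about the shape of the submit tool's logic, not its code; it follows the "lists replace wholesale" merge rule stated earlier:

```python
def dotlist_to_nested(dotted_key: str, value) -> dict:
    """Expand "a.b.0.c" + value into {"a": {"b": {"0": {"c": value}}}}.

    Every path segment becomes a *string dict key* — numeric segments
    included — so there is no way to address a list element.
    """
    node = value
    for seg in reversed(dotted_key.split(".")):
        node = {seg: node}
    return node


def deep_merge(base, override):
    """Last wins: dicts merge recursively; everything else (lists
    included) replaces wholesale."""
    if isinstance(base, dict) and isinstance(override, dict):
        merged = dict(base)
        for k, v in override.items():
            merged[k] = deep_merge(base[k], v) if k in base else v
        return merged
    return override


cfg = {"data": {"init_args": {"data_modules": [{"init_args": {"data_path": "a.zarr"}}]}}}
out = deep_merge(cfg, dotlist_to_nested(
    "data.init_args.data_modules.0.init_args.data_path", "b.zarr"))
# data_modules was a list; the override arrives as a dict {"0": {...}},
# so the whole list is replaced wholesale — the child's zarr path is
# never reached, which is why the smoke variant must be a sibling leaf.
```

Flat overrides like `trainer.max_epochs=50` work fine under the same logic because every segment on their path is a dict key.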
**Predict leaf** (at `<target>/<model>/<train_set>/predict__<predict_set>.yml`):

```yaml
base:
  - ../../../_internal/shared/model/predict_sets/<predict_set>.yml
  - ../../../_internal/shared/model/targets/<target>.yml
  - ../../../_internal/shared/model/model_overlays/<model>_predict.yml
  - ../../../_internal/shared/model/launcher_profiles/mode_predict.yml
  - ../../../_internal/shared/model/launcher_profiles/hardware_<hardware>.yml
  - ../../../_internal/shared/model/launcher_profiles/runtime_shared.yml
```

**Eval leaf** (at `<target>/<model>/<train_set>/eval__<predict_set>.yaml`):

```yaml
# @package _global_
defaults:
  - override /target: <target>
  - override /predict_set: <predict_set>
  - override /feature_extractor/dinov3: lvd1689m
  - override /feature_extractor/dynaclr: default

io:
  pred_path: /hpc/.../predictions.zarr

compute_feature_metrics: true

save:
  save_dir: /hpc/.../eval_results
```

## Running

The default `trainer.logger` in `configs/recipes/trainer/fit.yml` is `lightning.pytorch.loggers.WandbLogger`. Install dynacell with the `wandb` extra to satisfy this default (`uv add 'dynacell[wandb]'` / `pip install 'dynacell[wandb]'`). Without `wandb` installed, LightningCLI / jsonargparse rejects the leaf at schema-validation time. To opt out of W&B without installing it, override the logger in the leaf or via `--override trainer.logger.class_path=...` to a different Lightning logger (e.g. `lightning.pytorch.loggers.CSVLogger`).
Direct LightningCLI (no sbatch):

- `uv run dynacell fit -c configs/benchmarks/virtual_staining/<target>/<model>/<train_set>/train.yml`
- `uv run dynacell predict -c configs/benchmarks/virtual_staining/<target>/<model>/<train_set>/predict__<predict_set>.yml`

Hydra eval:

- `uv run dynacell evaluate leaf=<target>/<model>/<train_set>/eval__<predict_set>`

Via sbatch with `submit_benchmark_job.py`:

```bash
LEAF=configs/benchmarks/virtual_staining/er/celldiff/ipsc_confocal/train.yml

# Pure preview (no disk writes, safe on any run_root):
uv run python applications/dynacell/tools/submit_benchmark_job.py $LEAF --print-script
uv run python applications/dynacell/tools/submit_benchmark_job.py $LEAF --print-resolved-config

# Stage artifacts to launcher.run_root but skip submission (requires write perms):
uv run python applications/dynacell/tools/submit_benchmark_job.py $LEAF --dry-run

# Submit:
uv run python applications/dynacell/tools/submit_benchmark_job.py $LEAF

# Dotlist overrides deep-merge after compose (repeatable; ${...} interpolation is rejected):
uv run python applications/dynacell/tools/submit_benchmark_job.py $LEAF \
  --override trainer.max_epochs=50 --override data.init_args.batch_size=2
```

`--dry-run` combined with `--print-*` drops the disk writes (preview wins). `trainer.devices` and `launcher.sbatch.gpus` must match or submission fails fast.

## Dataset reference contract

Single-dataset train/predict leaves split `benchmark.dataset_ref` across shared fragments:

- `train_sets/<train_set>.yml` and `predict_sets/<predict_set>.yml` contribute `benchmark.dataset_ref.dataset` plus HCS defaults for that split.
- `targets/<target>.yml` contributes `benchmark.dataset_ref.target` plus target-specific normalizations and augmentations.
- The compose-time resolver fills `data.init_args.data_path`, `source_channel`, and `target_channel` from the manifest, so those fields are no longer duplicated across train/predict leaves.

Eval leaves follow the same split on the Hydra side:

- `target/<target>.yaml` contributes `benchmark.dataset_ref.target`.
- `predict_set/<predict_set>.yaml` contributes `benchmark.dataset_ref.dataset`.
- `dynacell.evaluation._ref_hook.apply_dataset_ref()` fills `io.gt_path`, `io.cell_segmentation_path`, `io.gt_channel_name`, `io.pred_channel_name`, `io.gt_cache_dir`, and `pixel_metrics.spacing` from the manifest.

diff --git a/applications/dynacell/configs/benchmarks/virtual_staining/_internal/leaf/er/celldiff/ipsc_confocal/eval__a549_mantis_denv.yaml b/applications/dynacell/configs/benchmarks/virtual_staining/_internal/leaf/er/celldiff/ipsc_confocal/eval__a549_mantis_denv.yaml new file mode 120000 index 000000000..bf10d7589 --- /dev/null +++ b/applications/dynacell/configs/benchmarks/virtual_staining/_internal/leaf/er/celldiff/ipsc_confocal/eval__a549_mantis_denv.yaml @@ -0,0 +1 @@ +../../../../../er/celldiff/ipsc_confocal/eval__a549_mantis_denv.yaml \ No newline at end of file diff --git a/applications/dynacell/configs/benchmarks/virtual_staining/_internal/leaf/er/celldiff/ipsc_confocal/eval__a549_mantis_mock.yaml b/applications/dynacell/configs/benchmarks/virtual_staining/_internal/leaf/er/celldiff/ipsc_confocal/eval__a549_mantis_mock.yaml new file mode 120000 index 000000000..a4c5ea9be --- /dev/null +++ b/applications/dynacell/configs/benchmarks/virtual_staining/_internal/leaf/er/celldiff/ipsc_confocal/eval__a549_mantis_mock.yaml @@ -0,0 +1 @@ +../../../../../er/celldiff/ipsc_confocal/eval__a549_mantis_mock.yaml \ No newline at end of file diff --git a/applications/dynacell/configs/benchmarks/virtual_staining/_internal/leaf/er/celldiff/ipsc_confocal/eval__a549_mantis_zikv.yaml b/applications/dynacell/configs/benchmarks/virtual_staining/_internal/leaf/er/celldiff/ipsc_confocal/eval__a549_mantis_zikv.yaml new file mode 120000 index 000000000..a15141b73 --- /dev/null +++ b/applications/dynacell/configs/benchmarks/virtual_staining/_internal/leaf/er/celldiff/ipsc_confocal/eval__a549_mantis_zikv.yaml @@ -0,0 +1 @@ +../../../../../er/celldiff/ipsc_confocal/eval__a549_mantis_zikv.yaml \ No
newline at end of file diff --git a/applications/dynacell/configs/benchmarks/virtual_staining/_internal/leaf/er/celldiff/ipsc_confocal/eval__ipsc_confocal.yaml b/applications/dynacell/configs/benchmarks/virtual_staining/_internal/leaf/er/celldiff/ipsc_confocal/eval__ipsc_confocal.yaml new file mode 120000 index 000000000..237286899 --- /dev/null +++ b/applications/dynacell/configs/benchmarks/virtual_staining/_internal/leaf/er/celldiff/ipsc_confocal/eval__ipsc_confocal.yaml @@ -0,0 +1 @@ +../../../../../er/celldiff/ipsc_confocal/eval__ipsc_confocal.yaml \ No newline at end of file diff --git a/applications/dynacell/configs/benchmarks/virtual_staining/_internal/leaf/er/fcmae_vscyto3d_pretrained/ipsc_confocal/eval__a549_mantis_denv.yaml b/applications/dynacell/configs/benchmarks/virtual_staining/_internal/leaf/er/fcmae_vscyto3d_pretrained/ipsc_confocal/eval__a549_mantis_denv.yaml new file mode 120000 index 000000000..2e2ea2bdf --- /dev/null +++ b/applications/dynacell/configs/benchmarks/virtual_staining/_internal/leaf/er/fcmae_vscyto3d_pretrained/ipsc_confocal/eval__a549_mantis_denv.yaml @@ -0,0 +1 @@ +../../../../../er/fcmae_vscyto3d_pretrained/ipsc_confocal/eval__a549_mantis_denv.yaml \ No newline at end of file diff --git a/applications/dynacell/configs/benchmarks/virtual_staining/_internal/leaf/er/fcmae_vscyto3d_pretrained/ipsc_confocal/eval__a549_mantis_mock.yaml b/applications/dynacell/configs/benchmarks/virtual_staining/_internal/leaf/er/fcmae_vscyto3d_pretrained/ipsc_confocal/eval__a549_mantis_mock.yaml new file mode 120000 index 000000000..f6177069d --- /dev/null +++ b/applications/dynacell/configs/benchmarks/virtual_staining/_internal/leaf/er/fcmae_vscyto3d_pretrained/ipsc_confocal/eval__a549_mantis_mock.yaml @@ -0,0 +1 @@ +../../../../../er/fcmae_vscyto3d_pretrained/ipsc_confocal/eval__a549_mantis_mock.yaml \ No newline at end of file diff --git 
a/applications/dynacell/configs/benchmarks/virtual_staining/_internal/leaf/er/fcmae_vscyto3d_pretrained/ipsc_confocal/eval__a549_mantis_zikv.yaml b/applications/dynacell/configs/benchmarks/virtual_staining/_internal/leaf/er/fcmae_vscyto3d_pretrained/ipsc_confocal/eval__a549_mantis_zikv.yaml new file mode 120000 index 000000000..42ee842e6 --- /dev/null +++ b/applications/dynacell/configs/benchmarks/virtual_staining/_internal/leaf/er/fcmae_vscyto3d_pretrained/ipsc_confocal/eval__a549_mantis_zikv.yaml @@ -0,0 +1 @@ +../../../../../er/fcmae_vscyto3d_pretrained/ipsc_confocal/eval__a549_mantis_zikv.yaml \ No newline at end of file diff --git a/applications/dynacell/configs/benchmarks/virtual_staining/_internal/leaf/er/fcmae_vscyto3d_scratch/ipsc_confocal/eval__a549_mantis_denv.yaml b/applications/dynacell/configs/benchmarks/virtual_staining/_internal/leaf/er/fcmae_vscyto3d_scratch/ipsc_confocal/eval__a549_mantis_denv.yaml new file mode 120000 index 000000000..252099ab4 --- /dev/null +++ b/applications/dynacell/configs/benchmarks/virtual_staining/_internal/leaf/er/fcmae_vscyto3d_scratch/ipsc_confocal/eval__a549_mantis_denv.yaml @@ -0,0 +1 @@ +../../../../../er/fcmae_vscyto3d_scratch/ipsc_confocal/eval__a549_mantis_denv.yaml \ No newline at end of file diff --git a/applications/dynacell/configs/benchmarks/virtual_staining/_internal/leaf/er/fcmae_vscyto3d_scratch/ipsc_confocal/eval__a549_mantis_mock.yaml b/applications/dynacell/configs/benchmarks/virtual_staining/_internal/leaf/er/fcmae_vscyto3d_scratch/ipsc_confocal/eval__a549_mantis_mock.yaml new file mode 120000 index 000000000..296ab3ad9 --- /dev/null +++ b/applications/dynacell/configs/benchmarks/virtual_staining/_internal/leaf/er/fcmae_vscyto3d_scratch/ipsc_confocal/eval__a549_mantis_mock.yaml @@ -0,0 +1 @@ +../../../../../er/fcmae_vscyto3d_scratch/ipsc_confocal/eval__a549_mantis_mock.yaml \ No newline at end of file diff --git 
a/applications/dynacell/configs/benchmarks/virtual_staining/_internal/leaf/er/fcmae_vscyto3d_scratch/ipsc_confocal/eval__a549_mantis_zikv.yaml b/applications/dynacell/configs/benchmarks/virtual_staining/_internal/leaf/er/fcmae_vscyto3d_scratch/ipsc_confocal/eval__a549_mantis_zikv.yaml new file mode 120000 index 000000000..ee8c4f389 --- /dev/null +++ b/applications/dynacell/configs/benchmarks/virtual_staining/_internal/leaf/er/fcmae_vscyto3d_scratch/ipsc_confocal/eval__a549_mantis_zikv.yaml @@ -0,0 +1 @@ +../../../../../er/fcmae_vscyto3d_scratch/ipsc_confocal/eval__a549_mantis_zikv.yaml \ No newline at end of file diff --git a/applications/dynacell/configs/benchmarks/virtual_staining/_internal/leaf/er/fnet3d_paper/ipsc_confocal/eval__a549_mantis_denv.yaml b/applications/dynacell/configs/benchmarks/virtual_staining/_internal/leaf/er/fnet3d_paper/ipsc_confocal/eval__a549_mantis_denv.yaml new file mode 120000 index 000000000..e52418d5e --- /dev/null +++ b/applications/dynacell/configs/benchmarks/virtual_staining/_internal/leaf/er/fnet3d_paper/ipsc_confocal/eval__a549_mantis_denv.yaml @@ -0,0 +1 @@ +../../../../../er/fnet3d_paper/ipsc_confocal/eval__a549_mantis_denv.yaml \ No newline at end of file diff --git a/applications/dynacell/configs/benchmarks/virtual_staining/_internal/leaf/er/fnet3d_paper/ipsc_confocal/eval__a549_mantis_mock.yaml b/applications/dynacell/configs/benchmarks/virtual_staining/_internal/leaf/er/fnet3d_paper/ipsc_confocal/eval__a549_mantis_mock.yaml new file mode 120000 index 000000000..3e637e458 --- /dev/null +++ b/applications/dynacell/configs/benchmarks/virtual_staining/_internal/leaf/er/fnet3d_paper/ipsc_confocal/eval__a549_mantis_mock.yaml @@ -0,0 +1 @@ +../../../../../er/fnet3d_paper/ipsc_confocal/eval__a549_mantis_mock.yaml \ No newline at end of file diff --git a/applications/dynacell/configs/benchmarks/virtual_staining/_internal/leaf/er/fnet3d_paper/ipsc_confocal/eval__a549_mantis_zikv.yaml 
b/applications/dynacell/configs/benchmarks/virtual_staining/_internal/leaf/er/fnet3d_paper/ipsc_confocal/eval__a549_mantis_zikv.yaml new file mode 120000 index 000000000..795e29808 --- /dev/null +++ b/applications/dynacell/configs/benchmarks/virtual_staining/_internal/leaf/er/fnet3d_paper/ipsc_confocal/eval__a549_mantis_zikv.yaml @@ -0,0 +1 @@ +../../../../../er/fnet3d_paper/ipsc_confocal/eval__a549_mantis_zikv.yaml \ No newline at end of file diff --git a/applications/dynacell/configs/benchmarks/virtual_staining/_internal/leaf/er/unetvit3d/ipsc_confocal/eval__a549_mantis_denv.yaml b/applications/dynacell/configs/benchmarks/virtual_staining/_internal/leaf/er/unetvit3d/ipsc_confocal/eval__a549_mantis_denv.yaml new file mode 120000 index 000000000..71612648e --- /dev/null +++ b/applications/dynacell/configs/benchmarks/virtual_staining/_internal/leaf/er/unetvit3d/ipsc_confocal/eval__a549_mantis_denv.yaml @@ -0,0 +1 @@ +../../../../../er/unetvit3d/ipsc_confocal/eval__a549_mantis_denv.yaml \ No newline at end of file diff --git a/applications/dynacell/configs/benchmarks/virtual_staining/_internal/leaf/er/unetvit3d/ipsc_confocal/eval__a549_mantis_mock.yaml b/applications/dynacell/configs/benchmarks/virtual_staining/_internal/leaf/er/unetvit3d/ipsc_confocal/eval__a549_mantis_mock.yaml new file mode 120000 index 000000000..83c1332e3 --- /dev/null +++ b/applications/dynacell/configs/benchmarks/virtual_staining/_internal/leaf/er/unetvit3d/ipsc_confocal/eval__a549_mantis_mock.yaml @@ -0,0 +1 @@ +../../../../../er/unetvit3d/ipsc_confocal/eval__a549_mantis_mock.yaml \ No newline at end of file diff --git a/applications/dynacell/configs/benchmarks/virtual_staining/_internal/leaf/er/unetvit3d/ipsc_confocal/eval__a549_mantis_zikv.yaml b/applications/dynacell/configs/benchmarks/virtual_staining/_internal/leaf/er/unetvit3d/ipsc_confocal/eval__a549_mantis_zikv.yaml new file mode 120000 index 000000000..ff3a386de --- /dev/null +++ 
b/applications/dynacell/configs/benchmarks/virtual_staining/_internal/leaf/er/unetvit3d/ipsc_confocal/eval__a549_mantis_zikv.yaml @@ -0,0 +1 @@ +../../../../../er/unetvit3d/ipsc_confocal/eval__a549_mantis_zikv.yaml \ No newline at end of file diff --git a/applications/dynacell/configs/benchmarks/virtual_staining/_internal/leaf/er/unetvit3d/ipsc_confocal/eval__ipsc_confocal.yaml b/applications/dynacell/configs/benchmarks/virtual_staining/_internal/leaf/er/unetvit3d/ipsc_confocal/eval__ipsc_confocal.yaml new file mode 120000 index 000000000..9c0e95db5 --- /dev/null +++ b/applications/dynacell/configs/benchmarks/virtual_staining/_internal/leaf/er/unetvit3d/ipsc_confocal/eval__ipsc_confocal.yaml @@ -0,0 +1 @@ +../../../../../er/unetvit3d/ipsc_confocal/eval__ipsc_confocal.yaml \ No newline at end of file diff --git a/applications/dynacell/configs/benchmarks/virtual_staining/_internal/leaf/membrane/celldiff/ipsc_confocal/eval__a549_mantis_denv.yaml b/applications/dynacell/configs/benchmarks/virtual_staining/_internal/leaf/membrane/celldiff/ipsc_confocal/eval__a549_mantis_denv.yaml new file mode 120000 index 000000000..f5fd532e1 --- /dev/null +++ b/applications/dynacell/configs/benchmarks/virtual_staining/_internal/leaf/membrane/celldiff/ipsc_confocal/eval__a549_mantis_denv.yaml @@ -0,0 +1 @@ +../../../../../membrane/celldiff/ipsc_confocal/eval__a549_mantis_denv.yaml \ No newline at end of file diff --git a/applications/dynacell/configs/benchmarks/virtual_staining/_internal/leaf/membrane/celldiff/ipsc_confocal/eval__a549_mantis_mock.yaml b/applications/dynacell/configs/benchmarks/virtual_staining/_internal/leaf/membrane/celldiff/ipsc_confocal/eval__a549_mantis_mock.yaml new file mode 120000 index 000000000..2353e2df1 --- /dev/null +++ b/applications/dynacell/configs/benchmarks/virtual_staining/_internal/leaf/membrane/celldiff/ipsc_confocal/eval__a549_mantis_mock.yaml @@ -0,0 +1 @@ +../../../../../membrane/celldiff/ipsc_confocal/eval__a549_mantis_mock.yaml \ No newline at 
end of file diff --git a/applications/dynacell/configs/benchmarks/virtual_staining/_internal/leaf/membrane/celldiff/ipsc_confocal/eval__a549_mantis_zikv.yaml b/applications/dynacell/configs/benchmarks/virtual_staining/_internal/leaf/membrane/celldiff/ipsc_confocal/eval__a549_mantis_zikv.yaml new file mode 120000 index 000000000..bee63297a --- /dev/null +++ b/applications/dynacell/configs/benchmarks/virtual_staining/_internal/leaf/membrane/celldiff/ipsc_confocal/eval__a549_mantis_zikv.yaml @@ -0,0 +1 @@ +../../../../../membrane/celldiff/ipsc_confocal/eval__a549_mantis_zikv.yaml \ No newline at end of file diff --git a/applications/dynacell/configs/benchmarks/virtual_staining/_internal/leaf/membrane/celldiff/ipsc_confocal/eval__ipsc_confocal.yaml b/applications/dynacell/configs/benchmarks/virtual_staining/_internal/leaf/membrane/celldiff/ipsc_confocal/eval__ipsc_confocal.yaml new file mode 120000 index 000000000..8ac201261 --- /dev/null +++ b/applications/dynacell/configs/benchmarks/virtual_staining/_internal/leaf/membrane/celldiff/ipsc_confocal/eval__ipsc_confocal.yaml @@ -0,0 +1 @@ +../../../../../membrane/celldiff/ipsc_confocal/eval__ipsc_confocal.yaml \ No newline at end of file diff --git a/applications/dynacell/configs/benchmarks/virtual_staining/_internal/leaf/membrane/fcmae_vscyto3d_pretrained/ipsc_confocal/eval__a549_mantis_denv.yaml b/applications/dynacell/configs/benchmarks/virtual_staining/_internal/leaf/membrane/fcmae_vscyto3d_pretrained/ipsc_confocal/eval__a549_mantis_denv.yaml new file mode 120000 index 000000000..ec7df7fac --- /dev/null +++ b/applications/dynacell/configs/benchmarks/virtual_staining/_internal/leaf/membrane/fcmae_vscyto3d_pretrained/ipsc_confocal/eval__a549_mantis_denv.yaml @@ -0,0 +1 @@ +../../../../../membrane/fcmae_vscyto3d_pretrained/ipsc_confocal/eval__a549_mantis_denv.yaml \ No newline at end of file diff --git 
a/applications/dynacell/configs/benchmarks/virtual_staining/_internal/leaf/membrane/fcmae_vscyto3d_pretrained/ipsc_confocal/eval__a549_mantis_mock.yaml b/applications/dynacell/configs/benchmarks/virtual_staining/_internal/leaf/membrane/fcmae_vscyto3d_pretrained/ipsc_confocal/eval__a549_mantis_mock.yaml new file mode 120000 index 000000000..4a18b3957 --- /dev/null +++ b/applications/dynacell/configs/benchmarks/virtual_staining/_internal/leaf/membrane/fcmae_vscyto3d_pretrained/ipsc_confocal/eval__a549_mantis_mock.yaml @@ -0,0 +1 @@ +../../../../../membrane/fcmae_vscyto3d_pretrained/ipsc_confocal/eval__a549_mantis_mock.yaml \ No newline at end of file diff --git a/applications/dynacell/configs/benchmarks/virtual_staining/_internal/leaf/membrane/fcmae_vscyto3d_pretrained/ipsc_confocal/eval__a549_mantis_zikv.yaml b/applications/dynacell/configs/benchmarks/virtual_staining/_internal/leaf/membrane/fcmae_vscyto3d_pretrained/ipsc_confocal/eval__a549_mantis_zikv.yaml new file mode 120000 index 000000000..55ad06bc1 --- /dev/null +++ b/applications/dynacell/configs/benchmarks/virtual_staining/_internal/leaf/membrane/fcmae_vscyto3d_pretrained/ipsc_confocal/eval__a549_mantis_zikv.yaml @@ -0,0 +1 @@ +../../../../../membrane/fcmae_vscyto3d_pretrained/ipsc_confocal/eval__a549_mantis_zikv.yaml \ No newline at end of file diff --git a/applications/dynacell/configs/benchmarks/virtual_staining/_internal/leaf/membrane/fcmae_vscyto3d_scratch/ipsc_confocal/eval__a549_mantis_denv.yaml b/applications/dynacell/configs/benchmarks/virtual_staining/_internal/leaf/membrane/fcmae_vscyto3d_scratch/ipsc_confocal/eval__a549_mantis_denv.yaml new file mode 120000 index 000000000..d825510f9 --- /dev/null +++ b/applications/dynacell/configs/benchmarks/virtual_staining/_internal/leaf/membrane/fcmae_vscyto3d_scratch/ipsc_confocal/eval__a549_mantis_denv.yaml @@ -0,0 +1 @@ +../../../../../membrane/fcmae_vscyto3d_scratch/ipsc_confocal/eval__a549_mantis_denv.yaml \ No newline at end of file diff --git 
a/applications/dynacell/configs/benchmarks/virtual_staining/_internal/leaf/membrane/fcmae_vscyto3d_scratch/ipsc_confocal/eval__a549_mantis_mock.yaml b/applications/dynacell/configs/benchmarks/virtual_staining/_internal/leaf/membrane/fcmae_vscyto3d_scratch/ipsc_confocal/eval__a549_mantis_mock.yaml new file mode 120000 index 000000000..d66c60b89 --- /dev/null +++ b/applications/dynacell/configs/benchmarks/virtual_staining/_internal/leaf/membrane/fcmae_vscyto3d_scratch/ipsc_confocal/eval__a549_mantis_mock.yaml @@ -0,0 +1 @@ +../../../../../membrane/fcmae_vscyto3d_scratch/ipsc_confocal/eval__a549_mantis_mock.yaml \ No newline at end of file diff --git a/applications/dynacell/configs/benchmarks/virtual_staining/_internal/leaf/membrane/fcmae_vscyto3d_scratch/ipsc_confocal/eval__a549_mantis_zikv.yaml b/applications/dynacell/configs/benchmarks/virtual_staining/_internal/leaf/membrane/fcmae_vscyto3d_scratch/ipsc_confocal/eval__a549_mantis_zikv.yaml new file mode 120000 index 000000000..b23a90c58 --- /dev/null +++ b/applications/dynacell/configs/benchmarks/virtual_staining/_internal/leaf/membrane/fcmae_vscyto3d_scratch/ipsc_confocal/eval__a549_mantis_zikv.yaml @@ -0,0 +1 @@ +../../../../../membrane/fcmae_vscyto3d_scratch/ipsc_confocal/eval__a549_mantis_zikv.yaml \ No newline at end of file diff --git a/applications/dynacell/configs/benchmarks/virtual_staining/_internal/leaf/membrane/fnet3d_paper/ipsc_confocal/eval__a549_mantis_denv.yaml b/applications/dynacell/configs/benchmarks/virtual_staining/_internal/leaf/membrane/fnet3d_paper/ipsc_confocal/eval__a549_mantis_denv.yaml new file mode 120000 index 000000000..12895ce10 --- /dev/null +++ b/applications/dynacell/configs/benchmarks/virtual_staining/_internal/leaf/membrane/fnet3d_paper/ipsc_confocal/eval__a549_mantis_denv.yaml @@ -0,0 +1 @@ +../../../../../membrane/fnet3d_paper/ipsc_confocal/eval__a549_mantis_denv.yaml \ No newline at end of file diff --git 
a/applications/dynacell/configs/benchmarks/virtual_staining/_internal/leaf/membrane/fnet3d_paper/ipsc_confocal/eval__a549_mantis_mock.yaml b/applications/dynacell/configs/benchmarks/virtual_staining/_internal/leaf/membrane/fnet3d_paper/ipsc_confocal/eval__a549_mantis_mock.yaml new file mode 120000 index 000000000..495ec10aa --- /dev/null +++ b/applications/dynacell/configs/benchmarks/virtual_staining/_internal/leaf/membrane/fnet3d_paper/ipsc_confocal/eval__a549_mantis_mock.yaml @@ -0,0 +1 @@ +../../../../../membrane/fnet3d_paper/ipsc_confocal/eval__a549_mantis_mock.yaml \ No newline at end of file diff --git a/applications/dynacell/configs/benchmarks/virtual_staining/_internal/leaf/membrane/fnet3d_paper/ipsc_confocal/eval__a549_mantis_zikv.yaml b/applications/dynacell/configs/benchmarks/virtual_staining/_internal/leaf/membrane/fnet3d_paper/ipsc_confocal/eval__a549_mantis_zikv.yaml new file mode 120000 index 000000000..a05677367 --- /dev/null +++ b/applications/dynacell/configs/benchmarks/virtual_staining/_internal/leaf/membrane/fnet3d_paper/ipsc_confocal/eval__a549_mantis_zikv.yaml @@ -0,0 +1 @@ +../../../../../membrane/fnet3d_paper/ipsc_confocal/eval__a549_mantis_zikv.yaml \ No newline at end of file diff --git a/applications/dynacell/configs/benchmarks/virtual_staining/_internal/leaf/membrane/unetvit3d/ipsc_confocal/eval__a549_mantis_denv.yaml b/applications/dynacell/configs/benchmarks/virtual_staining/_internal/leaf/membrane/unetvit3d/ipsc_confocal/eval__a549_mantis_denv.yaml new file mode 120000 index 000000000..69af3ff0d --- /dev/null +++ b/applications/dynacell/configs/benchmarks/virtual_staining/_internal/leaf/membrane/unetvit3d/ipsc_confocal/eval__a549_mantis_denv.yaml @@ -0,0 +1 @@ +../../../../../membrane/unetvit3d/ipsc_confocal/eval__a549_mantis_denv.yaml \ No newline at end of file diff --git a/applications/dynacell/configs/benchmarks/virtual_staining/_internal/leaf/membrane/unetvit3d/ipsc_confocal/eval__a549_mantis_mock.yaml 
b/applications/dynacell/configs/benchmarks/virtual_staining/_internal/leaf/membrane/unetvit3d/ipsc_confocal/eval__a549_mantis_mock.yaml new file mode 120000 index 000000000..1c891950f --- /dev/null +++ b/applications/dynacell/configs/benchmarks/virtual_staining/_internal/leaf/membrane/unetvit3d/ipsc_confocal/eval__a549_mantis_mock.yaml @@ -0,0 +1 @@ +../../../../../membrane/unetvit3d/ipsc_confocal/eval__a549_mantis_mock.yaml \ No newline at end of file diff --git a/applications/dynacell/configs/benchmarks/virtual_staining/_internal/leaf/membrane/unetvit3d/ipsc_confocal/eval__a549_mantis_zikv.yaml b/applications/dynacell/configs/benchmarks/virtual_staining/_internal/leaf/membrane/unetvit3d/ipsc_confocal/eval__a549_mantis_zikv.yaml new file mode 120000 index 000000000..b5384f051 --- /dev/null +++ b/applications/dynacell/configs/benchmarks/virtual_staining/_internal/leaf/membrane/unetvit3d/ipsc_confocal/eval__a549_mantis_zikv.yaml @@ -0,0 +1 @@ +../../../../../membrane/unetvit3d/ipsc_confocal/eval__a549_mantis_zikv.yaml \ No newline at end of file diff --git a/applications/dynacell/configs/benchmarks/virtual_staining/_internal/leaf/membrane/unetvit3d/ipsc_confocal/eval__ipsc_confocal.yaml b/applications/dynacell/configs/benchmarks/virtual_staining/_internal/leaf/membrane/unetvit3d/ipsc_confocal/eval__ipsc_confocal.yaml new file mode 120000 index 000000000..b6113f5d2 --- /dev/null +++ b/applications/dynacell/configs/benchmarks/virtual_staining/_internal/leaf/membrane/unetvit3d/ipsc_confocal/eval__ipsc_confocal.yaml @@ -0,0 +1 @@ +../../../../../membrane/unetvit3d/ipsc_confocal/eval__ipsc_confocal.yaml \ No newline at end of file diff --git a/applications/dynacell/configs/benchmarks/virtual_staining/_internal/leaf/mito/celldiff/ipsc_confocal/eval__a549_mantis_denv.yaml b/applications/dynacell/configs/benchmarks/virtual_staining/_internal/leaf/mito/celldiff/ipsc_confocal/eval__a549_mantis_denv.yaml new file mode 120000 index 000000000..f11672d97 --- /dev/null +++ 
b/applications/dynacell/configs/benchmarks/virtual_staining/_internal/leaf/mito/celldiff/ipsc_confocal/eval__a549_mantis_denv.yaml @@ -0,0 +1 @@ +../../../../../mito/celldiff/ipsc_confocal/eval__a549_mantis_denv.yaml \ No newline at end of file diff --git a/applications/dynacell/configs/benchmarks/virtual_staining/_internal/leaf/mito/celldiff/ipsc_confocal/eval__a549_mantis_mock.yaml b/applications/dynacell/configs/benchmarks/virtual_staining/_internal/leaf/mito/celldiff/ipsc_confocal/eval__a549_mantis_mock.yaml new file mode 120000 index 000000000..36476fb4e --- /dev/null +++ b/applications/dynacell/configs/benchmarks/virtual_staining/_internal/leaf/mito/celldiff/ipsc_confocal/eval__a549_mantis_mock.yaml @@ -0,0 +1 @@ +../../../../../mito/celldiff/ipsc_confocal/eval__a549_mantis_mock.yaml \ No newline at end of file diff --git a/applications/dynacell/configs/benchmarks/virtual_staining/_internal/leaf/mito/celldiff/ipsc_confocal/eval__a549_mantis_zikv.yaml b/applications/dynacell/configs/benchmarks/virtual_staining/_internal/leaf/mito/celldiff/ipsc_confocal/eval__a549_mantis_zikv.yaml new file mode 120000 index 000000000..ff457651d --- /dev/null +++ b/applications/dynacell/configs/benchmarks/virtual_staining/_internal/leaf/mito/celldiff/ipsc_confocal/eval__a549_mantis_zikv.yaml @@ -0,0 +1 @@ +../../../../../mito/celldiff/ipsc_confocal/eval__a549_mantis_zikv.yaml \ No newline at end of file diff --git a/applications/dynacell/configs/benchmarks/virtual_staining/_internal/leaf/mito/celldiff/ipsc_confocal/eval__ipsc_confocal.yaml b/applications/dynacell/configs/benchmarks/virtual_staining/_internal/leaf/mito/celldiff/ipsc_confocal/eval__ipsc_confocal.yaml new file mode 120000 index 000000000..de8318344 --- /dev/null +++ b/applications/dynacell/configs/benchmarks/virtual_staining/_internal/leaf/mito/celldiff/ipsc_confocal/eval__ipsc_confocal.yaml @@ -0,0 +1 @@ +../../../../../mito/celldiff/ipsc_confocal/eval__ipsc_confocal.yaml \ No newline at end of file diff --git 
a/applications/dynacell/configs/benchmarks/virtual_staining/_internal/leaf/mito/fcmae_vscyto3d_pretrained/ipsc_confocal/eval__a549_mantis_denv.yaml b/applications/dynacell/configs/benchmarks/virtual_staining/_internal/leaf/mito/fcmae_vscyto3d_pretrained/ipsc_confocal/eval__a549_mantis_denv.yaml new file mode 120000 index 000000000..58d6f5ef9 --- /dev/null +++ b/applications/dynacell/configs/benchmarks/virtual_staining/_internal/leaf/mito/fcmae_vscyto3d_pretrained/ipsc_confocal/eval__a549_mantis_denv.yaml @@ -0,0 +1 @@ +../../../../../mito/fcmae_vscyto3d_pretrained/ipsc_confocal/eval__a549_mantis_denv.yaml \ No newline at end of file diff --git a/applications/dynacell/configs/benchmarks/virtual_staining/_internal/leaf/mito/fcmae_vscyto3d_pretrained/ipsc_confocal/eval__a549_mantis_mock.yaml b/applications/dynacell/configs/benchmarks/virtual_staining/_internal/leaf/mito/fcmae_vscyto3d_pretrained/ipsc_confocal/eval__a549_mantis_mock.yaml new file mode 120000 index 000000000..d9abc8a70 --- /dev/null +++ b/applications/dynacell/configs/benchmarks/virtual_staining/_internal/leaf/mito/fcmae_vscyto3d_pretrained/ipsc_confocal/eval__a549_mantis_mock.yaml @@ -0,0 +1 @@ +../../../../../mito/fcmae_vscyto3d_pretrained/ipsc_confocal/eval__a549_mantis_mock.yaml \ No newline at end of file diff --git a/applications/dynacell/configs/benchmarks/virtual_staining/_internal/leaf/mito/fcmae_vscyto3d_pretrained/ipsc_confocal/eval__a549_mantis_zikv.yaml b/applications/dynacell/configs/benchmarks/virtual_staining/_internal/leaf/mito/fcmae_vscyto3d_pretrained/ipsc_confocal/eval__a549_mantis_zikv.yaml new file mode 120000 index 000000000..672c4b9b7 --- /dev/null +++ b/applications/dynacell/configs/benchmarks/virtual_staining/_internal/leaf/mito/fcmae_vscyto3d_pretrained/ipsc_confocal/eval__a549_mantis_zikv.yaml @@ -0,0 +1 @@ +../../../../../mito/fcmae_vscyto3d_pretrained/ipsc_confocal/eval__a549_mantis_zikv.yaml \ No newline at end of file diff --git 
a/applications/dynacell/configs/benchmarks/virtual_staining/_internal/leaf/mito/fcmae_vscyto3d_scratch/ipsc_confocal/eval__a549_mantis_denv.yaml b/applications/dynacell/configs/benchmarks/virtual_staining/_internal/leaf/mito/fcmae_vscyto3d_scratch/ipsc_confocal/eval__a549_mantis_denv.yaml new file mode 120000 index 000000000..1011b8fbe --- /dev/null +++ b/applications/dynacell/configs/benchmarks/virtual_staining/_internal/leaf/mito/fcmae_vscyto3d_scratch/ipsc_confocal/eval__a549_mantis_denv.yaml @@ -0,0 +1 @@ +../../../../../mito/fcmae_vscyto3d_scratch/ipsc_confocal/eval__a549_mantis_denv.yaml \ No newline at end of file diff --git a/applications/dynacell/configs/benchmarks/virtual_staining/_internal/leaf/mito/fcmae_vscyto3d_scratch/ipsc_confocal/eval__a549_mantis_mock.yaml b/applications/dynacell/configs/benchmarks/virtual_staining/_internal/leaf/mito/fcmae_vscyto3d_scratch/ipsc_confocal/eval__a549_mantis_mock.yaml new file mode 120000 index 000000000..37f841983 --- /dev/null +++ b/applications/dynacell/configs/benchmarks/virtual_staining/_internal/leaf/mito/fcmae_vscyto3d_scratch/ipsc_confocal/eval__a549_mantis_mock.yaml @@ -0,0 +1 @@ +../../../../../mito/fcmae_vscyto3d_scratch/ipsc_confocal/eval__a549_mantis_mock.yaml \ No newline at end of file diff --git a/applications/dynacell/configs/benchmarks/virtual_staining/_internal/leaf/mito/fcmae_vscyto3d_scratch/ipsc_confocal/eval__a549_mantis_zikv.yaml b/applications/dynacell/configs/benchmarks/virtual_staining/_internal/leaf/mito/fcmae_vscyto3d_scratch/ipsc_confocal/eval__a549_mantis_zikv.yaml new file mode 120000 index 000000000..fdddfb41f --- /dev/null +++ b/applications/dynacell/configs/benchmarks/virtual_staining/_internal/leaf/mito/fcmae_vscyto3d_scratch/ipsc_confocal/eval__a549_mantis_zikv.yaml @@ -0,0 +1 @@ +../../../../../mito/fcmae_vscyto3d_scratch/ipsc_confocal/eval__a549_mantis_zikv.yaml \ No newline at end of file diff --git 
a/applications/dynacell/configs/benchmarks/virtual_staining/_internal/leaf/mito/fnet3d_paper/ipsc_confocal/eval__a549_mantis_denv.yaml b/applications/dynacell/configs/benchmarks/virtual_staining/_internal/leaf/mito/fnet3d_paper/ipsc_confocal/eval__a549_mantis_denv.yaml new file mode 120000 index 000000000..7023b987f --- /dev/null +++ b/applications/dynacell/configs/benchmarks/virtual_staining/_internal/leaf/mito/fnet3d_paper/ipsc_confocal/eval__a549_mantis_denv.yaml @@ -0,0 +1 @@ +../../../../../mito/fnet3d_paper/ipsc_confocal/eval__a549_mantis_denv.yaml \ No newline at end of file diff --git a/applications/dynacell/configs/benchmarks/virtual_staining/_internal/leaf/mito/fnet3d_paper/ipsc_confocal/eval__a549_mantis_mock.yaml b/applications/dynacell/configs/benchmarks/virtual_staining/_internal/leaf/mito/fnet3d_paper/ipsc_confocal/eval__a549_mantis_mock.yaml new file mode 120000 index 000000000..665a14a37 --- /dev/null +++ b/applications/dynacell/configs/benchmarks/virtual_staining/_internal/leaf/mito/fnet3d_paper/ipsc_confocal/eval__a549_mantis_mock.yaml @@ -0,0 +1 @@ +../../../../../mito/fnet3d_paper/ipsc_confocal/eval__a549_mantis_mock.yaml \ No newline at end of file diff --git a/applications/dynacell/configs/benchmarks/virtual_staining/_internal/leaf/mito/fnet3d_paper/ipsc_confocal/eval__a549_mantis_zikv.yaml b/applications/dynacell/configs/benchmarks/virtual_staining/_internal/leaf/mito/fnet3d_paper/ipsc_confocal/eval__a549_mantis_zikv.yaml new file mode 120000 index 000000000..f6803af88 --- /dev/null +++ b/applications/dynacell/configs/benchmarks/virtual_staining/_internal/leaf/mito/fnet3d_paper/ipsc_confocal/eval__a549_mantis_zikv.yaml @@ -0,0 +1 @@ +../../../../../mito/fnet3d_paper/ipsc_confocal/eval__a549_mantis_zikv.yaml \ No newline at end of file diff --git a/applications/dynacell/configs/benchmarks/virtual_staining/_internal/leaf/mito/unetvit3d/ipsc_confocal/eval__a549_mantis_denv.yaml 
b/applications/dynacell/configs/benchmarks/virtual_staining/_internal/leaf/mito/unetvit3d/ipsc_confocal/eval__a549_mantis_denv.yaml new file mode 120000 index 000000000..9b19caf19 --- /dev/null +++ b/applications/dynacell/configs/benchmarks/virtual_staining/_internal/leaf/mito/unetvit3d/ipsc_confocal/eval__a549_mantis_denv.yaml @@ -0,0 +1 @@ +../../../../../mito/unetvit3d/ipsc_confocal/eval__a549_mantis_denv.yaml \ No newline at end of file diff --git a/applications/dynacell/configs/benchmarks/virtual_staining/_internal/leaf/mito/unetvit3d/ipsc_confocal/eval__a549_mantis_mock.yaml b/applications/dynacell/configs/benchmarks/virtual_staining/_internal/leaf/mito/unetvit3d/ipsc_confocal/eval__a549_mantis_mock.yaml new file mode 120000 index 000000000..8450554cc --- /dev/null +++ b/applications/dynacell/configs/benchmarks/virtual_staining/_internal/leaf/mito/unetvit3d/ipsc_confocal/eval__a549_mantis_mock.yaml @@ -0,0 +1 @@ +../../../../../mito/unetvit3d/ipsc_confocal/eval__a549_mantis_mock.yaml \ No newline at end of file diff --git a/applications/dynacell/configs/benchmarks/virtual_staining/_internal/leaf/mito/unetvit3d/ipsc_confocal/eval__a549_mantis_zikv.yaml b/applications/dynacell/configs/benchmarks/virtual_staining/_internal/leaf/mito/unetvit3d/ipsc_confocal/eval__a549_mantis_zikv.yaml new file mode 120000 index 000000000..ec036b12a --- /dev/null +++ b/applications/dynacell/configs/benchmarks/virtual_staining/_internal/leaf/mito/unetvit3d/ipsc_confocal/eval__a549_mantis_zikv.yaml @@ -0,0 +1 @@ +../../../../../mito/unetvit3d/ipsc_confocal/eval__a549_mantis_zikv.yaml \ No newline at end of file diff --git a/applications/dynacell/configs/benchmarks/virtual_staining/_internal/leaf/mito/unetvit3d/ipsc_confocal/eval__ipsc_confocal.yaml b/applications/dynacell/configs/benchmarks/virtual_staining/_internal/leaf/mito/unetvit3d/ipsc_confocal/eval__ipsc_confocal.yaml new file mode 120000 index 000000000..8438345ae --- /dev/null +++ 
b/applications/dynacell/configs/benchmarks/virtual_staining/_internal/leaf/mito/unetvit3d/ipsc_confocal/eval__ipsc_confocal.yaml @@ -0,0 +1 @@ +../../../../../mito/unetvit3d/ipsc_confocal/eval__ipsc_confocal.yaml \ No newline at end of file diff --git a/applications/dynacell/configs/benchmarks/virtual_staining/_internal/leaf/nucleus/celldiff/ipsc_confocal/eval__a549_mantis_denv.yaml b/applications/dynacell/configs/benchmarks/virtual_staining/_internal/leaf/nucleus/celldiff/ipsc_confocal/eval__a549_mantis_denv.yaml new file mode 120000 index 000000000..8edc2a391 --- /dev/null +++ b/applications/dynacell/configs/benchmarks/virtual_staining/_internal/leaf/nucleus/celldiff/ipsc_confocal/eval__a549_mantis_denv.yaml @@ -0,0 +1 @@ +../../../../../nucleus/celldiff/ipsc_confocal/eval__a549_mantis_denv.yaml \ No newline at end of file diff --git a/applications/dynacell/configs/benchmarks/virtual_staining/_internal/leaf/nucleus/celldiff/ipsc_confocal/eval__a549_mantis_mock.yaml b/applications/dynacell/configs/benchmarks/virtual_staining/_internal/leaf/nucleus/celldiff/ipsc_confocal/eval__a549_mantis_mock.yaml new file mode 120000 index 000000000..160a32964 --- /dev/null +++ b/applications/dynacell/configs/benchmarks/virtual_staining/_internal/leaf/nucleus/celldiff/ipsc_confocal/eval__a549_mantis_mock.yaml @@ -0,0 +1 @@ +../../../../../nucleus/celldiff/ipsc_confocal/eval__a549_mantis_mock.yaml \ No newline at end of file diff --git a/applications/dynacell/configs/benchmarks/virtual_staining/_internal/leaf/nucleus/celldiff/ipsc_confocal/eval__a549_mantis_zikv.yaml b/applications/dynacell/configs/benchmarks/virtual_staining/_internal/leaf/nucleus/celldiff/ipsc_confocal/eval__a549_mantis_zikv.yaml new file mode 120000 index 000000000..fb805e3ca --- /dev/null +++ b/applications/dynacell/configs/benchmarks/virtual_staining/_internal/leaf/nucleus/celldiff/ipsc_confocal/eval__a549_mantis_zikv.yaml @@ -0,0 +1 @@ 
+../../../../../nucleus/celldiff/ipsc_confocal/eval__a549_mantis_zikv.yaml \ No newline at end of file diff --git a/applications/dynacell/configs/benchmarks/virtual_staining/_internal/leaf/nucleus/celldiff/ipsc_confocal/eval__ipsc_confocal.yaml b/applications/dynacell/configs/benchmarks/virtual_staining/_internal/leaf/nucleus/celldiff/ipsc_confocal/eval__ipsc_confocal.yaml new file mode 120000 index 000000000..c530b0717 --- /dev/null +++ b/applications/dynacell/configs/benchmarks/virtual_staining/_internal/leaf/nucleus/celldiff/ipsc_confocal/eval__ipsc_confocal.yaml @@ -0,0 +1 @@ +../../../../../nucleus/celldiff/ipsc_confocal/eval__ipsc_confocal.yaml \ No newline at end of file diff --git a/applications/dynacell/configs/benchmarks/virtual_staining/_internal/leaf/nucleus/fcmae_vscyto3d_pretrained/ipsc_confocal/eval__a549_mantis_denv.yaml b/applications/dynacell/configs/benchmarks/virtual_staining/_internal/leaf/nucleus/fcmae_vscyto3d_pretrained/ipsc_confocal/eval__a549_mantis_denv.yaml new file mode 120000 index 000000000..09d15f98d --- /dev/null +++ b/applications/dynacell/configs/benchmarks/virtual_staining/_internal/leaf/nucleus/fcmae_vscyto3d_pretrained/ipsc_confocal/eval__a549_mantis_denv.yaml @@ -0,0 +1 @@ +../../../../../nucleus/fcmae_vscyto3d_pretrained/ipsc_confocal/eval__a549_mantis_denv.yaml \ No newline at end of file diff --git a/applications/dynacell/configs/benchmarks/virtual_staining/_internal/leaf/nucleus/fcmae_vscyto3d_pretrained/ipsc_confocal/eval__a549_mantis_mock.yaml b/applications/dynacell/configs/benchmarks/virtual_staining/_internal/leaf/nucleus/fcmae_vscyto3d_pretrained/ipsc_confocal/eval__a549_mantis_mock.yaml new file mode 120000 index 000000000..8d160143e --- /dev/null +++ b/applications/dynacell/configs/benchmarks/virtual_staining/_internal/leaf/nucleus/fcmae_vscyto3d_pretrained/ipsc_confocal/eval__a549_mantis_mock.yaml @@ -0,0 +1 @@ +../../../../../nucleus/fcmae_vscyto3d_pretrained/ipsc_confocal/eval__a549_mantis_mock.yaml \ No newline 
at end of file diff --git a/applications/dynacell/configs/benchmarks/virtual_staining/_internal/leaf/nucleus/fcmae_vscyto3d_pretrained/ipsc_confocal/eval__a549_mantis_zikv.yaml b/applications/dynacell/configs/benchmarks/virtual_staining/_internal/leaf/nucleus/fcmae_vscyto3d_pretrained/ipsc_confocal/eval__a549_mantis_zikv.yaml new file mode 120000 index 000000000..131a499e5 --- /dev/null +++ b/applications/dynacell/configs/benchmarks/virtual_staining/_internal/leaf/nucleus/fcmae_vscyto3d_pretrained/ipsc_confocal/eval__a549_mantis_zikv.yaml @@ -0,0 +1 @@ +../../../../../nucleus/fcmae_vscyto3d_pretrained/ipsc_confocal/eval__a549_mantis_zikv.yaml \ No newline at end of file diff --git a/applications/dynacell/configs/benchmarks/virtual_staining/_internal/leaf/nucleus/fcmae_vscyto3d_scratch/ipsc_confocal/eval__a549_mantis_denv.yaml b/applications/dynacell/configs/benchmarks/virtual_staining/_internal/leaf/nucleus/fcmae_vscyto3d_scratch/ipsc_confocal/eval__a549_mantis_denv.yaml new file mode 120000 index 000000000..d97f17296 --- /dev/null +++ b/applications/dynacell/configs/benchmarks/virtual_staining/_internal/leaf/nucleus/fcmae_vscyto3d_scratch/ipsc_confocal/eval__a549_mantis_denv.yaml @@ -0,0 +1 @@ +../../../../../nucleus/fcmae_vscyto3d_scratch/ipsc_confocal/eval__a549_mantis_denv.yaml \ No newline at end of file diff --git a/applications/dynacell/configs/benchmarks/virtual_staining/_internal/leaf/nucleus/fcmae_vscyto3d_scratch/ipsc_confocal/eval__a549_mantis_mock.yaml b/applications/dynacell/configs/benchmarks/virtual_staining/_internal/leaf/nucleus/fcmae_vscyto3d_scratch/ipsc_confocal/eval__a549_mantis_mock.yaml new file mode 120000 index 000000000..e14458f48 --- /dev/null +++ b/applications/dynacell/configs/benchmarks/virtual_staining/_internal/leaf/nucleus/fcmae_vscyto3d_scratch/ipsc_confocal/eval__a549_mantis_mock.yaml @@ -0,0 +1 @@ +../../../../../nucleus/fcmae_vscyto3d_scratch/ipsc_confocal/eval__a549_mantis_mock.yaml \ No newline at end of file diff --git 
a/applications/dynacell/configs/benchmarks/virtual_staining/_internal/leaf/nucleus/fcmae_vscyto3d_scratch/ipsc_confocal/eval__a549_mantis_zikv.yaml b/applications/dynacell/configs/benchmarks/virtual_staining/_internal/leaf/nucleus/fcmae_vscyto3d_scratch/ipsc_confocal/eval__a549_mantis_zikv.yaml new file mode 120000 index 000000000..847aa1ad0 --- /dev/null +++ b/applications/dynacell/configs/benchmarks/virtual_staining/_internal/leaf/nucleus/fcmae_vscyto3d_scratch/ipsc_confocal/eval__a549_mantis_zikv.yaml @@ -0,0 +1 @@ +../../../../../nucleus/fcmae_vscyto3d_scratch/ipsc_confocal/eval__a549_mantis_zikv.yaml \ No newline at end of file diff --git a/applications/dynacell/configs/benchmarks/virtual_staining/_internal/leaf/nucleus/fnet3d_paper/ipsc_confocal/eval__a549_mantis_denv.yaml b/applications/dynacell/configs/benchmarks/virtual_staining/_internal/leaf/nucleus/fnet3d_paper/ipsc_confocal/eval__a549_mantis_denv.yaml new file mode 120000 index 000000000..eada00bf5 --- /dev/null +++ b/applications/dynacell/configs/benchmarks/virtual_staining/_internal/leaf/nucleus/fnet3d_paper/ipsc_confocal/eval__a549_mantis_denv.yaml @@ -0,0 +1 @@ +../../../../../nucleus/fnet3d_paper/ipsc_confocal/eval__a549_mantis_denv.yaml \ No newline at end of file diff --git a/applications/dynacell/configs/benchmarks/virtual_staining/_internal/leaf/nucleus/fnet3d_paper/ipsc_confocal/eval__a549_mantis_mock.yaml b/applications/dynacell/configs/benchmarks/virtual_staining/_internal/leaf/nucleus/fnet3d_paper/ipsc_confocal/eval__a549_mantis_mock.yaml new file mode 120000 index 000000000..1071d3912 --- /dev/null +++ b/applications/dynacell/configs/benchmarks/virtual_staining/_internal/leaf/nucleus/fnet3d_paper/ipsc_confocal/eval__a549_mantis_mock.yaml @@ -0,0 +1 @@ +../../../../../nucleus/fnet3d_paper/ipsc_confocal/eval__a549_mantis_mock.yaml \ No newline at end of file diff --git 
a/applications/dynacell/configs/benchmarks/virtual_staining/_internal/leaf/nucleus/fnet3d_paper/ipsc_confocal/eval__a549_mantis_zikv.yaml b/applications/dynacell/configs/benchmarks/virtual_staining/_internal/leaf/nucleus/fnet3d_paper/ipsc_confocal/eval__a549_mantis_zikv.yaml new file mode 120000 index 000000000..524868764 --- /dev/null +++ b/applications/dynacell/configs/benchmarks/virtual_staining/_internal/leaf/nucleus/fnet3d_paper/ipsc_confocal/eval__a549_mantis_zikv.yaml @@ -0,0 +1 @@ +../../../../../nucleus/fnet3d_paper/ipsc_confocal/eval__a549_mantis_zikv.yaml \ No newline at end of file diff --git a/applications/dynacell/configs/benchmarks/virtual_staining/_internal/leaf/nucleus/unetvit3d/ipsc_confocal/eval__a549_mantis_denv.yaml b/applications/dynacell/configs/benchmarks/virtual_staining/_internal/leaf/nucleus/unetvit3d/ipsc_confocal/eval__a549_mantis_denv.yaml new file mode 120000 index 000000000..1715307a0 --- /dev/null +++ b/applications/dynacell/configs/benchmarks/virtual_staining/_internal/leaf/nucleus/unetvit3d/ipsc_confocal/eval__a549_mantis_denv.yaml @@ -0,0 +1 @@ +../../../../../nucleus/unetvit3d/ipsc_confocal/eval__a549_mantis_denv.yaml \ No newline at end of file diff --git a/applications/dynacell/configs/benchmarks/virtual_staining/_internal/leaf/nucleus/unetvit3d/ipsc_confocal/eval__a549_mantis_mock.yaml b/applications/dynacell/configs/benchmarks/virtual_staining/_internal/leaf/nucleus/unetvit3d/ipsc_confocal/eval__a549_mantis_mock.yaml new file mode 120000 index 000000000..019928b53 --- /dev/null +++ b/applications/dynacell/configs/benchmarks/virtual_staining/_internal/leaf/nucleus/unetvit3d/ipsc_confocal/eval__a549_mantis_mock.yaml @@ -0,0 +1 @@ +../../../../../nucleus/unetvit3d/ipsc_confocal/eval__a549_mantis_mock.yaml \ No newline at end of file diff --git a/applications/dynacell/configs/benchmarks/virtual_staining/_internal/leaf/nucleus/unetvit3d/ipsc_confocal/eval__a549_mantis_zikv.yaml 
b/applications/dynacell/configs/benchmarks/virtual_staining/_internal/leaf/nucleus/unetvit3d/ipsc_confocal/eval__a549_mantis_zikv.yaml new file mode 120000 index 000000000..6025b9bf4 --- /dev/null +++ b/applications/dynacell/configs/benchmarks/virtual_staining/_internal/leaf/nucleus/unetvit3d/ipsc_confocal/eval__a549_mantis_zikv.yaml @@ -0,0 +1 @@ +../../../../../nucleus/unetvit3d/ipsc_confocal/eval__a549_mantis_zikv.yaml \ No newline at end of file diff --git a/applications/dynacell/configs/benchmarks/virtual_staining/_internal/leaf/nucleus/unetvit3d/ipsc_confocal/eval__ipsc_confocal.yaml b/applications/dynacell/configs/benchmarks/virtual_staining/_internal/leaf/nucleus/unetvit3d/ipsc_confocal/eval__ipsc_confocal.yaml new file mode 120000 index 000000000..6fa52b633 --- /dev/null +++ b/applications/dynacell/configs/benchmarks/virtual_staining/_internal/leaf/nucleus/unetvit3d/ipsc_confocal/eval__ipsc_confocal.yaml @@ -0,0 +1 @@ +../../../../../nucleus/unetvit3d/ipsc_confocal/eval__ipsc_confocal.yaml \ No newline at end of file diff --git a/applications/dynacell/configs/benchmarks/virtual_staining/_internal/shared/eval/feature_extractor/dynaclr/default.yaml b/applications/dynacell/configs/benchmarks/virtual_staining/_internal/shared/eval/feature_extractor/dynaclr/default.yaml new file mode 100644 index 000000000..ddc153558 --- /dev/null +++ b/applications/dynacell/configs/benchmarks/virtual_staining/_internal/shared/eval/feature_extractor/dynaclr/default.yaml @@ -0,0 +1,14 @@ +# Canonical DynaCLR encoder for organelle-sensor virtual-staining eval. +# Encoder kwargs sourced from the pre-refactor dynaclr repo +# (github.com/czbiohub-sf/dynacell @ a9d5c5a76f25dd15d701ab720b62f93f3511ee51, +# dynacell/evaluation/utils.py DynaCLRFeatureExtractor.__init__). 
+checkpoint: /hpc/projects/organelle_phenotyping/models/SEC61_TOMM20_G3BP1_Sensor/time_interval/dynaclr_gfp_rfp_Ph/organelle_sensor_phase_maxproj_ver3_150epochs/saved_checkpoints/epoch=104-step=53760.ckpt +encoder: + backbone: convnext_tiny + in_channels: 1 + in_stack_depth: 1 + stem_kernel_size: [1, 4, 4] + stem_stride: [1, 4, 4] + embedding_dim: 768 + projection_dim: 32 + drop_path_rate: 0.0 diff --git a/applications/dynacell/configs/benchmarks/virtual_staining/_internal/shared/eval/target/er_sec61b.yaml b/applications/dynacell/configs/benchmarks/virtual_staining/_internal/shared/eval/target/er_sec61b.yaml new file mode 100644 index 000000000..55eb4c439 --- /dev/null +++ b/applications/dynacell/configs/benchmarks/virtual_staining/_internal/shared/eval/target/er_sec61b.yaml @@ -0,0 +1,6 @@ +# @package _global_ +# Target group: ER marked by SEC61B, iPSC dataset v4 test split. +target_name: er +benchmark: + dataset_ref: + target: sec61b diff --git a/applications/dynacell/configs/benchmarks/virtual_staining/_internal/shared/eval/target/membrane.yaml b/applications/dynacell/configs/benchmarks/virtual_staining/_internal/shared/eval/target/membrane.yaml new file mode 100644 index 000000000..3f62a7271 --- /dev/null +++ b/applications/dynacell/configs/benchmarks/virtual_staining/_internal/shared/eval/target/membrane.yaml @@ -0,0 +1,6 @@ +# @package _global_ +# Target group: membrane channel of the multi-marker cell.zarr, iPSC dataset v4 test split. 
+target_name: membrane +benchmark: + dataset_ref: + target: membrane diff --git a/applications/dynacell/configs/benchmarks/virtual_staining/_internal/shared/eval/target/mito_tomm20.yaml b/applications/dynacell/configs/benchmarks/virtual_staining/_internal/shared/eval/target/mito_tomm20.yaml new file mode 100644 index 000000000..07b266a23 --- /dev/null +++ b/applications/dynacell/configs/benchmarks/virtual_staining/_internal/shared/eval/target/mito_tomm20.yaml @@ -0,0 +1,6 @@ +# @package _global_ +# Target group: mitochondria marked by TOMM20, iPSC dataset v4 test split. +target_name: mitochondria +benchmark: + dataset_ref: + target: tomm20 diff --git a/applications/dynacell/configs/benchmarks/virtual_staining/_internal/shared/eval/target/nucleus.yaml b/applications/dynacell/configs/benchmarks/virtual_staining/_internal/shared/eval/target/nucleus.yaml new file mode 100644 index 000000000..c22230c6f --- /dev/null +++ b/applications/dynacell/configs/benchmarks/virtual_staining/_internal/shared/eval/target/nucleus.yaml @@ -0,0 +1,6 @@ +# @package _global_ +# Target group: nuclei channel of the multi-marker cell.zarr, iPSC dataset v4 test split. +target_name: nucleus +benchmark: + dataset_ref: + target: nucleus diff --git a/applications/dynacell/configs/benchmarks/virtual_staining/_internal/shared/model/data_overlays/celldiff_fit.yml b/applications/dynacell/configs/benchmarks/virtual_staining/_internal/shared/model/data_overlays/celldiff_fit.yml new file mode 100644 index 000000000..f262ec6ef --- /dev/null +++ b/applications/dynacell/configs/benchmarks/virtual_staining/_internal/shared/model/data_overlays/celldiff_fit.yml @@ -0,0 +1,57 @@ +# CellDiff fit-time HCS data hparams. +# Lifted from model_overlays/celldiff_fit.yml so the model+trainer half +# there stays composable by joint-dataset (BatchedConcatDataModule) +# leaves that author their own data: block. 
+data: + init_args: + z_window_size: 13 + batch_size: 4 + num_workers: 4 + yx_patch_size: [512, 512] + gpu_augmentations: + # GPU: affine on oversized patch → center crop to final 8×512×512. + # safe_crop_size clamps scale so the rotated 624px source always + # covers the 512px crop, eliminating zero-corner artifacts. + - class_path: viscy_transforms.BatchedRandAffined + init_args: + keys: [source, target] + prob: 0.8 + rotate_range: [3.14, 0, 0] + shear_range: [0.0, 0.05, 0.05] + scale_range: [[0.7, 1.3], [0.5, 1.5], [0.5, 1.5]] + safe_crop_size: [8, 512, 512] + safe_crop_coverage: 0.9 + - class_path: viscy_transforms.BatchedCenterSpatialCropd + init_args: + keys: [source, target] + roi_size: [8, 512, 512] + - class_path: viscy_transforms.BatchedRandAdjustContrastd + init_args: + keys: [source] + prob: 0.5 + gamma: [0.8, 1.2] + - class_path: viscy_transforms.BatchedRandScaleIntensityd + init_args: + keys: [source] + prob: 0.5 + factors: 0.5 + - class_path: viscy_transforms.BatchedRandGaussianNoised + init_args: + keys: [source] + prob: 0.5 + mean: 0.0 + std: 0.3 + - class_path: viscy_transforms.BatchedRandGaussianSmoothd + init_args: + keys: [source] + prob: 0.5 + sigma_x: [0.25, 0.75] + sigma_y: [0.25, 0.75] + sigma_z: [0.25, 0.75] + val_gpu_augmentations: + # CellDiff requires exact input_spatial_size (fixed ViT positional embeddings). + # DivisibleCropd is insufficient — must center-crop to exact model input size. 
+ - class_path: viscy_transforms.BatchedCenterSpatialCropd + init_args: + keys: [source, target] + roi_size: [8, 512, 512] diff --git a/applications/dynacell/configs/benchmarks/virtual_staining/_internal/shared/model/data_overlays/fcmae_vscyto3d_fit.yml b/applications/dynacell/configs/benchmarks/virtual_staining/_internal/shared/model/data_overlays/fcmae_vscyto3d_fit.yml new file mode 100644 index 000000000..8e21ccf59 --- /dev/null +++ b/applications/dynacell/configs/benchmarks/virtual_staining/_internal/shared/model/data_overlays/fcmae_vscyto3d_fit.yml @@ -0,0 +1,57 @@ +# FCMAE-class (VSCyto3D FullyConvolutionalMAE) fit-time HCS data hparams. +# Lifted from model_overlays/fcmae_vscyto3d_fit.yml so the model+trainer +# half there stays composable by joint-dataset +# (BatchedConcatDataModule) leaves that author their own data: block. +data: + init_args: + z_window_size: 20 + batch_size: 32 + num_workers: 4 + yx_patch_size: [384, 384] + augmentations: + - class_path: viscy_transforms.RandWeightedCropd + init_args: + keys: [Phase3D, Structure] + w_key: Structure + spatial_size: [20, 600, 600] + num_samples: 4 + gpu_augmentations: + - class_path: viscy_transforms.BatchedRandAffined + init_args: + keys: [source, target] + prob: 0.8 + rotate_range: [3.14, 0, 0] + shear_range: [0.0, 0.05, 0.05] + scale_range: [[0.7, 1.3], [0.5, 1.5], [0.5, 1.5]] + - class_path: viscy_transforms.BatchedCenterSpatialCropd + init_args: + keys: [source, target] + roi_size: [15, 384, 384] + - class_path: viscy_transforms.BatchedRandAdjustContrastd + init_args: + keys: [source] + prob: 0.5 + gamma: [0.8, 1.2] + - class_path: viscy_transforms.BatchedRandScaleIntensityd + init_args: + keys: [source] + prob: 0.5 + factors: 0.5 + - class_path: viscy_transforms.BatchedRandGaussianNoised + init_args: + keys: [source] + prob: 0.5 + mean: 0.0 + std: 0.3 + - class_path: viscy_transforms.BatchedRandGaussianSmoothd + init_args: + keys: [source] + prob: 0.5 + sigma_x: [0.25, 0.75] + sigma_y: [0.25, 0.75] + 
sigma_z: [0.25, 0.75] + val_gpu_augmentations: + - class_path: viscy_transforms.BatchedCenterSpatialCropd + init_args: + keys: [source, target] + roi_size: [15, 384, 384] diff --git a/applications/dynacell/configs/benchmarks/virtual_staining/_internal/shared/model/data_overlays/fnet3d_paper_fit.yml b/applications/dynacell/configs/benchmarks/virtual_staining/_internal/shared/model/data_overlays/fnet3d_paper_fit.yml new file mode 100644 index 000000000..4d751cd7a --- /dev/null +++ b/applications/dynacell/configs/benchmarks/virtual_staining/_internal/shared/model/data_overlays/fnet3d_paper_fit.yml @@ -0,0 +1,54 @@ +# FNet3D paper-baseline fit-time HCS data hparams. +# Lifted from model_overlays/fnet3d_paper_fit.yml so the model+trainer +# half there stays composable by joint-dataset +# (BatchedConcatDataModule) leaves that author their own data: block. +# +# Diverges from shared/model/targets/er_sec61b.yml on two fields because +# the paper's stats + sampling differ from the CellDiff/UNetViT +# conventions: Structure is normalized with mean/std (not median/iqr), +# and 8 small weighted crops per FOV replace the 2 oversized transformer +# crops. +data: + init_args: + z_window_size: 32 + batch_size: 48 + num_workers: 8 + yx_patch_size: [64, 64] + normalizations: + - class_path: viscy_transforms.NormalizeSampled + init_args: + keys: [Phase3D] + level: fov_statistics + subtrahend: mean + divisor: std + - class_path: viscy_transforms.NormalizeSampled + init_args: + keys: [Structure] + level: fov_statistics + subtrahend: mean + divisor: std + augmentations: + # CPU: 8 patches per FOV (amortizes zarr decompression). + # batch_size=48 → DataLoader loads 6 FOVs, each yields 8 patches = 48. 
+ - class_path: viscy_transforms.RandWeightedCropd + init_args: + keys: [Phase3D, Structure] + w_key: Structure + spatial_size: [32, 64, 64] + num_samples: 8 + gpu_augmentations: + - class_path: viscy_transforms.BatchedRandFlipd + init_args: + keys: [source, target] + spatial_axes: [1] + prob: 0.5 + - class_path: viscy_transforms.BatchedRandFlipd + init_args: + keys: [source, target] + spatial_axes: [2] + prob: 0.5 + val_augmentations: + - class_path: viscy_transforms.CenterSpatialCropd + init_args: + keys: [Phase3D, Structure] + roi_size: [32, 64, 64] diff --git a/applications/dynacell/configs/benchmarks/virtual_staining/_internal/shared/model/data_overlays/unetvit3d_fit.yml b/applications/dynacell/configs/benchmarks/virtual_staining/_internal/shared/model/data_overlays/unetvit3d_fit.yml new file mode 100644 index 000000000..70fb2fa99 --- /dev/null +++ b/applications/dynacell/configs/benchmarks/virtual_staining/_internal/shared/model/data_overlays/unetvit3d_fit.yml @@ -0,0 +1,60 @@ +# UNetViT3D fit-time HCS data hparams. +# Lifted from model_overlays/unetvit3d_fit.yml so the model+trainer half +# there stays composable by joint-dataset (BatchedConcatDataModule) +# leaves that author their own data: block. +# +# Identical to data_overlays/celldiff_fit.yml — divergence expected once +# UNetViT3D training data shape is retuned independently. +data: + init_args: + z_window_size: 13 + batch_size: 4 + num_workers: 4 + yx_patch_size: [512, 512] + gpu_augmentations: + # GPU: affine on oversized patch → center crop to final 8×512×512. + # safe_crop_size clamps scale so the rotated 624px source always + # covers the 512px crop, eliminating zero-corner artifacts. 
+ - class_path: viscy_transforms.BatchedRandAffined + init_args: + keys: [source, target] + prob: 0.8 + rotate_range: [3.14, 0, 0] + shear_range: [0.0, 0.05, 0.05] + scale_range: [[0.7, 1.3], [0.5, 1.5], [0.5, 1.5]] + safe_crop_size: [8, 512, 512] + safe_crop_coverage: 0.9 + - class_path: viscy_transforms.BatchedCenterSpatialCropd + init_args: + keys: [source, target] + roi_size: [8, 512, 512] + - class_path: viscy_transforms.BatchedRandAdjustContrastd + init_args: + keys: [source] + prob: 0.5 + gamma: [0.8, 1.2] + - class_path: viscy_transforms.BatchedRandScaleIntensityd + init_args: + keys: [source] + prob: 0.5 + factors: 0.5 + - class_path: viscy_transforms.BatchedRandGaussianNoised + init_args: + keys: [source] + prob: 0.5 + mean: 0.0 + std: 0.3 + - class_path: viscy_transforms.BatchedRandGaussianSmoothd + init_args: + keys: [source] + prob: 0.5 + sigma_x: [0.25, 0.75] + sigma_y: [0.25, 0.75] + sigma_z: [0.25, 0.75] + val_gpu_augmentations: + # UNetViT3D requires exact input_spatial_size (fixed ViT positional embeddings). + # DivisibleCropd is insufficient — must center-crop to exact model input size. + - class_path: viscy_transforms.BatchedCenterSpatialCropd + init_args: + keys: [source, target] + roi_size: [8, 512, 512] diff --git a/applications/dynacell/configs/benchmarks/virtual_staining/_internal/shared/model/data_overlays/unext2_fit.yml b/applications/dynacell/configs/benchmarks/virtual_staining/_internal/shared/model/data_overlays/unext2_fit.yml new file mode 100644 index 000000000..ba78e6fee --- /dev/null +++ b/applications/dynacell/configs/benchmarks/virtual_staining/_internal/shared/model/data_overlays/unext2_fit.yml @@ -0,0 +1,63 @@ +# UNeXt2 (VSCyto3D) fit-time HCS data hparams — Run 4 baseline +# (lr=0.0004, bs=32, z=20). Lifted from model_overlays/unext2_fit.yml so +# the model+trainer half there stays composable by joint-dataset +# (BatchedConcatDataModule) leaves that author their own data: block. 
+data: + init_args: + z_window_size: 20 + batch_size: 32 + num_workers: 8 + yx_patch_size: [384, 384] + augmentations: + # List-replaces target's default CPU augmentations with UNeXt2's + # z=20 / 600 YX oversized crop at 4 patches per FOV. + - class_path: viscy_transforms.RandWeightedCropd + init_args: + keys: [Phase3D, Structure] + w_key: Structure + spatial_size: [20, 600, 600] + num_samples: 4 + gpu_augmentations: + # Run 4 affine has no safe_crop_size — that's a later addition. The + # val_gpu_augmentations center-crop handles the post-affine cleanup. + - class_path: viscy_transforms.BatchedRandAffined + init_args: + keys: [source, target] + prob: 0.8 + rotate_range: [3.14, 0, 0] + shear_range: [0.0, 0.05, 0.05] + scale_range: [[0.7, 1.3], [0.5, 1.5], [0.5, 1.5]] + - class_path: viscy_transforms.BatchedCenterSpatialCropd + init_args: + keys: [source, target] + roi_size: [15, 384, 384] + - class_path: viscy_transforms.BatchedRandAdjustContrastd + init_args: + keys: [source] + prob: 0.5 + gamma: [0.8, 1.2] + - class_path: viscy_transforms.BatchedRandScaleIntensityd + init_args: + keys: [source] + prob: 0.5 + factors: 0.5 + - class_path: viscy_transforms.BatchedRandGaussianNoised + init_args: + keys: [source] + prob: 0.5 + mean: 0.0 + std: 0.3 + - class_path: viscy_transforms.BatchedRandGaussianSmoothd + init_args: + keys: [source] + prob: 0.5 + sigma_x: [0.25, 0.75] + sigma_y: [0.25, 0.75] + sigma_z: [0.25, 0.75] + val_gpu_augmentations: + # Center-crop to model input size: Z from 20→15, YX to 384×384. + # 384 is divisible by 64 (UNeXt2 downsampling factor). 
+ - class_path: viscy_transforms.BatchedCenterSpatialCropd + init_args: + keys: [source, target] + roi_size: [15, 384, 384] diff --git a/applications/dynacell/configs/benchmarks/virtual_staining/_internal/shared/model/launcher_profiles/hardware_4gpu.yml b/applications/dynacell/configs/benchmarks/virtual_staining/_internal/shared/model/launcher_profiles/hardware_4gpu.yml new file mode 100644 index 000000000..8a44737d9 --- /dev/null +++ b/applications/dynacell/configs/benchmarks/virtual_staining/_internal/shared/model/launcher_profiles/hardware_4gpu.yml @@ -0,0 +1,33 @@ +# Hardware profile: 4 GPU DDP on H100/H200, A100 excluded. +# +# 4 GPUs, DDP strategy, 512G host mem, 4-day wall-time per restart. +# +# host mem rationale (post-mmap_preload-fix, commit 6ec0d6f7): +# The earlier 1024G ceiling was sized for the oindex/CoordinateIndexer +# broadcast bloat in HCSDataModule.prepare_data, which inflated heap +# ~7x for sharded zarr reads. With per-channel BasicIndexer reads, +# joint cell.zarr + A549-pooled preload now peaks at ~185 GB +# (cell.zarr 500 FOVs ≈110 GB + a549 30 FOVs ≈75 GB tmpfs files in +# /dev/shm + small process baseline). 512G gives ~325 GB of headroom +# for worker buffers, persistent_workers transients, and validation- +# time spikes. Single-set workloads peak at ~110 GB of the 512G cap. +# +# GPU constraint rationale: +# Restricted to H100/H200 (80–96 GB VRAM) because FCMAE/UNeXt2 train +# at large spatial patches (e.g. 20×600×600) where a single DDP rank +# already needs 30–50 GB; A40/A6000/L40S (48 GB) leave no headroom +# for activation transients. A100 nodes are excluded separately due +# to repeat NCCL BROADCAST/ALLREDUCE hangs at first-batch coordination +# on this cluster's A100 partition. Leaves that intentionally want +# the smaller cards must opt out via +# `--override launcher.sbatch.constraint=h100|h200|a40|a6000|l40s`. 
+launcher: + sbatch: + partition: gpu + nodes: 1 + ntasks_per_node: 4 + cpus_per_task: 8 + gpus: 4 + mem: "512G" + constraint: "h100|h200" + time: "4-00:00:00" diff --git a/applications/dynacell/configs/benchmarks/virtual_staining/_internal/shared/model/launcher_profiles/hardware_gpu_any_long.yml b/applications/dynacell/configs/benchmarks/virtual_staining/_internal/shared/model/launcher_profiles/hardware_gpu_any_long.yml new file mode 100644 index 000000000..41b2b85d1 --- /dev/null +++ b/applications/dynacell/configs/benchmarks/virtual_staining/_internal/shared/model/launcher_profiles/hardware_gpu_any_long.yml @@ -0,0 +1,21 @@ +# Hardware profile: 1 GPU, any model (no constraint), long wall-time. +# +# Matches the FNet3D paper-baseline run's actual slurm directives: +# the paper runs were submitted without --constraint (they landed on +# RTX A6000s) and with a 20-day wall-time budget so the job wouldn't +# timeout across multi-day training. 32 CPUs and 256G mem are the same +# as hardware_h200_single; only constraint and time differ. +# +# Leaves whose training zarr is large enough to push mmap_preload over +# the 256G cap (e.g. cell.zarr-backed nucleus/membrane) override +# launcher.sbatch.mem in the leaf body. +launcher: + sbatch: + partition: gpu + nodes: 1 + ntasks_per_node: 1 + cpus_per_task: 32 + gpus: 1 + mem: "256G" + constraint: null + time: "20-00:00:00" diff --git a/applications/dynacell/configs/benchmarks/virtual_staining/_internal/shared/model/launcher_profiles/hardware_h200_single.yml b/applications/dynacell/configs/benchmarks/virtual_staining/_internal/shared/model/launcher_profiles/hardware_h200_single.yml new file mode 100644 index 000000000..baf4c4194 --- /dev/null +++ b/applications/dynacell/configs/benchmarks/virtual_staining/_internal/shared/model/launcher_profiles/hardware_h200_single.yml @@ -0,0 +1,13 @@ +# Hardware profile: single H200 GPU. Pair with recipes/topology/single_gpu.yml. 
+# launcher.sbatch.gpus must match the topology recipe's trainer.devices +# (enforced by submit_benchmark_job). +launcher: + sbatch: + partition: gpu + nodes: 1 + ntasks_per_node: 1 + cpus_per_task: 32 + gpus: 1 + mem: "256G" + constraint: "h200" + time: "4-00:00:00" diff --git a/applications/dynacell/configs/benchmarks/virtual_staining/_internal/shared/model/launcher_profiles/mode_fit.yml b/applications/dynacell/configs/benchmarks/virtual_staining/_internal/shared/model/launcher_profiles/mode_fit.yml new file mode 100644 index 000000000..77054287d --- /dev/null +++ b/applications/dynacell/configs/benchmarks/virtual_staining/_internal/shared/model/launcher_profiles/mode_fit.yml @@ -0,0 +1,3 @@ +# Launcher profile: fit mode. +launcher: + mode: fit diff --git a/applications/dynacell/configs/benchmarks/virtual_staining/_internal/shared/model/launcher_profiles/mode_predict.yml b/applications/dynacell/configs/benchmarks/virtual_staining/_internal/shared/model/launcher_profiles/mode_predict.yml new file mode 100644 index 000000000..0fedc1b62 --- /dev/null +++ b/applications/dynacell/configs/benchmarks/virtual_staining/_internal/shared/model/launcher_profiles/mode_predict.yml @@ -0,0 +1,3 @@ +# Launcher profile: predict mode. +launcher: + mode: predict diff --git a/applications/dynacell/configs/benchmarks/virtual_staining/_internal/shared/model/launcher_profiles/runtime_shared.yml b/applications/dynacell/configs/benchmarks/virtual_staining/_internal/shared/model/launcher_profiles/runtime_shared.yml new file mode 100644 index 000000000..3a6e99c20 --- /dev/null +++ b/applications/dynacell/configs/benchmarks/virtual_staining/_internal/shared/model/launcher_profiles/runtime_shared.yml @@ -0,0 +1,24 @@ +# Runtime profile: shared srun + env defaults (not topology-specific). +launcher: + runtime: + use_srun: true + cleanup_tmp: true + env: + PYTHONUNBUFFERED: "1" + NCCL_DEBUG: INFO + PYTHONFAULTHANDLER: "1" + # Use expandable VA segments in PyTorch's CUDA caching allocator. 
+ # Avoids "tried to allocate N GiB; X GiB free, Y GiB reserved" OOMs + # caused by allocator fragmentation across the variable-shape U-Net + # forward+backward (skip-concat doubles channel counts mid-decoder). + # Hit on J31821456 (A40 48GB, fnet3d joint nucl): 45 GB allocated + + # 2.4 GB free could not fit a 3 GB cat. PyTorch's own OOM message + # explicitly recommends this setting. No known regressions. + PYTORCH_ALLOC_CONF: "expandable_segments:True" + # Shared Hugging Face hub cache on project storage: the first user + # with gated-repo access downloads each model (e.g. DINOv3) once + # into this dir, and every subsequent job on any dynacell team + # account reuses those weights instead of re-downloading to per-user + # ~/.cache/huggingface/hub. HF_HUB_CACHE (not HF_HOME) so each user's + # auth token at ~/.cache/huggingface/token still controls gate ACLs. + HF_HUB_CACHE: /hpc/projects/comp.micro/virtual_staining/models/dynacell/evaluation/hf_cache diff --git a/applications/dynacell/configs/benchmarks/virtual_staining/_internal/shared/model/launcher_profiles/wall_smoke.yml b/applications/dynacell/configs/benchmarks/virtual_staining/_internal/shared/model/launcher_profiles/wall_smoke.yml new file mode 100644 index 000000000..14c281f34 --- /dev/null +++ b/applications/dynacell/configs/benchmarks/virtual_staining/_internal/shared/model/launcher_profiles/wall_smoke.yml @@ -0,0 +1,7 @@ +# Smoke wall override. Stack AFTER any hardware profile in `base:` to cap +# launcher.sbatch.time at 30 min so a smoke job cannot sit on a multi-day +# allocation. Pair with `--override trainer.fast_dev_run=true` or +# `--override trainer.max_steps=N` so the run actually exits inside the wall. 
+launcher: + sbatch: + time: "00:30:00" diff --git a/applications/dynacell/configs/benchmarks/virtual_staining/_internal/shared/model/model_overlays/celldiff_fit.yml b/applications/dynacell/configs/benchmarks/virtual_staining/_internal/shared/model/model_overlays/celldiff_fit.yml new file mode 100644 index 000000000..fd3d51dd2 --- /dev/null +++ b/applications/dynacell/configs/benchmarks/virtual_staining/_internal/shared/model/model_overlays/celldiff_fit.yml @@ -0,0 +1,20 @@ +# CellDiff fit overlay — model + trainer only. +# HCS data hparams live in data_overlays/celldiff_fit.yml; single-store +# train leaves compose both, joint (BatchedConcatDataModule) leaves +# compose only this one and author data: themselves. +base: + - ../../../../../../recipes/models/celldiff_fm.yml + - ../../../../../../recipes/trainer/fit.yml + - ../../../../../../recipes/topology/single_gpu.yml +model: + init_args: + net_config: + input_spatial_size: [8, 512, 512] + lr: 0.0003 + schedule: WarmupCosine + warmup_steps: 8500 + warmup_multiplier: 1e-3 + num_log_steps: 10 +trainer: + precision: bf16-mixed + max_epochs: 20 diff --git a/applications/dynacell/configs/benchmarks/virtual_staining/_internal/shared/model/model_overlays/celldiff_predict.yml b/applications/dynacell/configs/benchmarks/virtual_staining/_internal/shared/model/model_overlays/celldiff_predict.yml new file mode 100644 index 000000000..fbca171a2 --- /dev/null +++ b/applications/dynacell/configs/benchmarks/virtual_staining/_internal/shared/model/model_overlays/celldiff_predict.yml @@ -0,0 +1,22 @@ +# CellDiff predict overlay. +# Binds the flow-matching model recipe + predict trainer recipe, then layers +# predict-time model hparams and data-loader settings. +# Predict-time normalizations and data_path are leaf-owned (leaf overrides +# target-inherited values to match each organelle's test_cropped store). 
+base: + - ../../../../../../recipes/models/celldiff_fm.yml + - ../../../../../../recipes/trainer/predict.yml + - ../../../../../../recipes/topology/single_gpu.yml +model: + init_args: + net_config: + input_spatial_size: [8, 512, 512] + num_generate_steps: 100 + predict_method: iterative + predict_overlap: [4, 256, 256] +data: + init_args: + z_window_size: 40 + batch_size: 1 + num_workers: 0 + yx_patch_size: [512, 512] diff --git a/applications/dynacell/configs/benchmarks/virtual_staining/_internal/shared/model/model_overlays/fcmae_vscyto3d_fit.yml b/applications/dynacell/configs/benchmarks/virtual_staining/_internal/shared/model/model_overlays/fcmae_vscyto3d_fit.yml new file mode 100644 index 000000000..6148672f8 --- /dev/null +++ b/applications/dynacell/configs/benchmarks/virtual_staining/_internal/shared/model/model_overlays/fcmae_vscyto3d_fit.yml @@ -0,0 +1,48 @@ +# Shared FCMAE-class (FullyConvolutionalMAE with pretraining=False) fit +# overlay. Model/loss/schedule come from the canonical +# vs_test/finetune_3d.py:load_model recipe; data pipeline (bs=32, z=20, +# yx=384) and lr=0.0004 match the retuned unext2_fit.yml Run 4 baseline +# so the FCMAE runs are directly comparable to the timm-backed unext2 +# job on the same data throughput. Used by both +# fcmae_vscyto3d_scratch.yml and fcmae_vscyto3d_pretrained.yml — +# encoder_only + ckpt_path are set only in the pretrained leaf so init +# is the only difference between the two. 
+base: + - ../../../../../../recipes/trainer/fit.yml + - ../../../../../../recipes/topology/ddp_4gpu.yml +model: + class_path: dynacell.engine.DynacellUNet + init_args: + architecture: fcmae + model_config: + in_channels: 1 + out_channels: 1 + encoder_blocks: [3, 3, 9, 3] + encoder_drop_path_rate: 0.1 + dims: [96, 192, 384, 768] + decoder_conv_blocks: 2 + stem_kernel_size: [5, 4, 4] + in_stack_depth: 15 + pretraining: false + loss_function: + class_path: viscy_utils.losses.MixedLoss + init_args: + l1_alpha: 0.5 + l2_alpha: 0.0 + ms_dssim_alpha: 0.5 + lr: 0.0004 + schedule: WarmupCosine + warmup_steps: 8500 # ~1 epoch for FCMAE at bs=32, 4 GPUs + warmup_multiplier: 1e-3 +trainer: + # FullyConvolutionalMAE(pretraining=False) has decoder/head params that + # only receive gradients on some forward paths; default ddp with + # find_unused_parameters=False errors at step 1. Matches the canonical + # vs_test/finetune_3d.py:215 recipe. + strategy: ddp_find_unused_parameters_true + precision: 16-mixed + max_epochs: 200 +# HCS data hparams (bs=32, z=20, yx=384, augs) live in +# data_overlays/fcmae_vscyto3d_fit.yml; single-store train leaves compose +# both, joint (BatchedConcatDataModule) leaves compose only this one and +# author data: themselves. diff --git a/applications/dynacell/configs/benchmarks/virtual_staining/_internal/shared/model/model_overlays/fcmae_vscyto3d_predict.yml b/applications/dynacell/configs/benchmarks/virtual_staining/_internal/shared/model/model_overlays/fcmae_vscyto3d_predict.yml new file mode 100644 index 000000000..f59fd24c7 --- /dev/null +++ b/applications/dynacell/configs/benchmarks/virtual_staining/_internal/shared/model/model_overlays/fcmae_vscyto3d_predict.yml @@ -0,0 +1,43 @@ +# FCMAE-class (FullyConvolutionalMAE, pretraining=False) predict overlay. +# Mirrors the model block of fcmae_vscyto3d_fit.yml so Lightning's +# load_from_checkpoint instantiates the matching architecture, then layers +# predict-time hparams. 
Unlike celldiff_predict / unetvit3d_predict / +# fnet3d_paper_predict, FCMAE has no `recipes/models/.yml` — the +# model block is defined inline in the fit overlay. Duplicating here +# keeps both overlays standalone; consolidate via a shared recipe if +# fcmae cells need joint leaves too. +# Used by both fcmae_vscyto3d_pretrained and fcmae_vscyto3d_scratch +# predict leaves — predict loads the full trained checkpoint, so the +# pretrained-encoder warm-start path (encoder_only + ckpt_path) from the +# fit leaf does NOT belong here. +base: + - ../../../../../../recipes/trainer/predict.yml + - ../../../../../../recipes/topology/single_gpu.yml +model: + class_path: dynacell.engine.DynacellUNet + init_args: + architecture: fcmae + model_config: + in_channels: 1 + out_channels: 1 + encoder_blocks: [3, 3, 9, 3] + encoder_drop_path_rate: 0.1 + dims: [96, 192, 384, 768] + decoder_conv_blocks: 2 + stem_kernel_size: [5, 4, 4] + in_stack_depth: 15 + pretraining: false + loss_function: + class_path: viscy_utils.losses.MixedLoss + init_args: + l1_alpha: 0.5 + l2_alpha: 0.0 + ms_dssim_alpha: 0.5 + predict_method: full_image + predict_overlap: [4, 256, 256] +data: + init_args: + z_window_size: 15 + batch_size: 1 + num_workers: 0 + yx_patch_size: [512, 512] diff --git a/applications/dynacell/configs/benchmarks/virtual_staining/_internal/shared/model/model_overlays/fnet3d_paper_fit.yml b/applications/dynacell/configs/benchmarks/virtual_staining/_internal/shared/model/model_overlays/fnet3d_paper_fit.yml new file mode 100644 index 000000000..b475307a2 --- /dev/null +++ b/applications/dynacell/configs/benchmarks/virtual_staining/_internal/shared/model/model_overlays/fnet3d_paper_fit.yml @@ -0,0 +1,24 @@ +# FNet3D paper-baseline fit overlay — model + trainer only. 
+# HCS data hparams (including the mean/std Structure normalization and +# the 8-crops-per-FOV sampling that diverge from the CellDiff/UNetViT +# conventions) live in data_overlays/fnet3d_paper_fit.yml; single-store +# train leaves compose both, joint (BatchedConcatDataModule) leaves +# compose only this one and author data: themselves. +# +# Reproduces pytorch_fnet paper defaults on DynaCell data. Reference run +# (launched before this schema existed): +# /hpc/projects/comp.micro/virtual_staining/models/dynacell/ipsc/sec61b/fnet3d_paper/ +base: + - ../../../../../../recipes/models/fnet3d.yml + - ../../../../../../recipes/trainer/fit.yml + - ../../../../../../recipes/topology/single_gpu.yml +seed_everything: 0 +model: + init_args: + loss_function: + class_path: torch.nn.MSELoss + lr: 0.001 + schedule: Constant +trainer: + precision: 32-true + max_steps: 200000 diff --git a/applications/dynacell/configs/benchmarks/virtual_staining/_internal/shared/model/model_overlays/fnet3d_paper_predict.yml b/applications/dynacell/configs/benchmarks/virtual_staining/_internal/shared/model/model_overlays/fnet3d_paper_predict.yml new file mode 100644 index 000000000..90bee1b0e --- /dev/null +++ b/applications/dynacell/configs/benchmarks/virtual_staining/_internal/shared/model/model_overlays/fnet3d_paper_predict.yml @@ -0,0 +1,20 @@ +# FNet3D paper-baseline predict overlay. +# Binds the FNet3D model recipe + predict trainer recipe, then layers +# predict-time model hparams and data-loader settings. +# Predict-time normalizations and data_path are leaf-owned (leaf +# overrides target-inherited values to match each organelle's +# test_cropped store). 
+base: + - ../../../../../../recipes/models/fnet3d.yml + - ../../../../../../recipes/trainer/predict.yml + - ../../../../../../recipes/topology/single_gpu.yml +model: + init_args: + predict_method: full_image + predict_overlap: [4, 256, 256] +data: + init_args: + z_window_size: 32 + batch_size: 1 + num_workers: 0 + yx_patch_size: [512, 512] diff --git a/applications/dynacell/configs/benchmarks/virtual_staining/_internal/shared/model/model_overlays/unetvit3d_fit.yml b/applications/dynacell/configs/benchmarks/virtual_staining/_internal/shared/model/model_overlays/unetvit3d_fit.yml new file mode 100644 index 000000000..639bf794b --- /dev/null +++ b/applications/dynacell/configs/benchmarks/virtual_staining/_internal/shared/model/model_overlays/unetvit3d_fit.yml @@ -0,0 +1,20 @@ +# UNetViT3D fit overlay — model + trainer only. +# HCS data hparams live in data_overlays/unetvit3d_fit.yml; single-store +# train leaves compose both, joint (BatchedConcatDataModule) leaves +# compose only this one and author data: themselves. +# +# Hparams (lr, schedule, epochs) match celldiff_fit.yml — the only +# functional difference here is the model class. +base: + - ../../../../../../recipes/models/unetvit3d.yml + - ../../../../../../recipes/trainer/fit.yml + - ../../../../../../recipes/topology/single_gpu.yml +model: + init_args: + lr: 0.0003 + schedule: WarmupCosine + warmup_steps: 8500 + warmup_multiplier: 1e-3 +trainer: + precision: bf16-mixed + max_epochs: 20 diff --git a/applications/dynacell/configs/benchmarks/virtual_staining/_internal/shared/model/model_overlays/unetvit3d_predict.yml b/applications/dynacell/configs/benchmarks/virtual_staining/_internal/shared/model/model_overlays/unetvit3d_predict.yml new file mode 100644 index 000000000..8d784083b --- /dev/null +++ b/applications/dynacell/configs/benchmarks/virtual_staining/_internal/shared/model/model_overlays/unetvit3d_predict.yml @@ -0,0 +1,19 @@ +# UNetViT3D predict overlay. 
+# Binds the UNetViT3D model recipe + predict trainer recipe, then layers +# predict-time model hparams and data-loader settings. +# Predict-time normalizations and data_path are leaf-owned (leaf overrides +# target-inherited values to match each organelle's test_cropped store). +base: + - ../../../../../../recipes/models/unetvit3d.yml + - ../../../../../../recipes/trainer/predict.yml + - ../../../../../../recipes/topology/single_gpu.yml +model: + init_args: + predict_method: full_image + predict_overlap: [4, 256, 256] +data: + init_args: + z_window_size: 8 + batch_size: 1 + num_workers: 0 + yx_patch_size: [512, 512] diff --git a/applications/dynacell/configs/benchmarks/virtual_staining/_internal/shared/model/model_overlays/unext2_fit.yml b/applications/dynacell/configs/benchmarks/virtual_staining/_internal/shared/model/model_overlays/unext2_fit.yml new file mode 100644 index 000000000..6b3fe6cef --- /dev/null +++ b/applications/dynacell/configs/benchmarks/virtual_staining/_internal/shared/model/model_overlays/unext2_fit.yml @@ -0,0 +1,30 @@ +# UNeXt2 (VSCyto3D) fit overlay — reproduces the Run 4 SEC61B config +# from legacy commit 46e4c79 (`examples/configs/sec61b/fit_unext2.yml`). +# Architecture: convnextv2_tiny z=15, MixedLoss(L1+DSSIM), 4-GPU DDP. +# +# Earlier runs in the wandb series (20260403-210816, 20260406-094805, +# 20260406-225302) used lr=0.0002, bs=8, z=15; this overlay reproduces the +# retuned Run 4 (20260409-020023) with lr=0.0004, bs=32, z=20. 
+base: + - ../../../../../../recipes/models/unext2_3d.yml + - ../../../../../../recipes/trainer/fit.yml + - ../../../../../../recipes/topology/ddp_4gpu.yml +model: + init_args: + loss_function: + class_path: viscy_utils.losses.MixedLoss + init_args: + l1_alpha: 0.5 + l2_alpha: 0.0 + ms_dssim_alpha: 0.5 + lr: 0.0004 + schedule: WarmupCosine + warmup_steps: 8500 + warmup_multiplier: 1e-3 +trainer: + precision: 16-mixed + max_epochs: 200 +# HCS data hparams (bs=32, z=20, yx=384, augs) live in +# data_overlays/unext2_fit.yml; single-store train leaves compose both, +# joint (BatchedConcatDataModule) leaves compose only this one and +# author data: themselves. diff --git a/applications/dynacell/configs/benchmarks/virtual_staining/_internal/shared/model/predict_sets/a549_mantis_caax_denv.yml b/applications/dynacell/configs/benchmarks/virtual_staining/_internal/shared/model/predict_sets/a549_mantis_caax_denv.yml new file mode 100644 index 000000000..2bb5765d9 --- /dev/null +++ b/applications/dynacell/configs/benchmarks/virtual_staining/_internal/shared/model/predict_sets/a549_mantis_caax_denv.yml @@ -0,0 +1,12 @@ +# Predict set: A549 mantis CAAX on DENV (condition-pooled test store). +# data_path resolves to the test store in predict mode via dataset_ref. +# Pool naming inside the store is sequential 0/0/fov; plate +# provenance lives in per-position zattrs and the colocated +# .provenance.json sidecar (see dynacell-paper assemble-pool docs). 
+benchmark: + predict_set: a549_mantis_caax_denv + dataset_ref: + dataset: a549-mantis-caax-denv +data: + class_path: viscy_data.hcs.HCSDataModule + init_args: {} diff --git a/applications/dynacell/configs/benchmarks/virtual_staining/_internal/shared/model/predict_sets/a549_mantis_caax_mock.yml b/applications/dynacell/configs/benchmarks/virtual_staining/_internal/shared/model/predict_sets/a549_mantis_caax_mock.yml new file mode 100644 index 000000000..f027f45ec --- /dev/null +++ b/applications/dynacell/configs/benchmarks/virtual_staining/_internal/shared/model/predict_sets/a549_mantis_caax_mock.yml @@ -0,0 +1,12 @@ +# Predict set: A549 mantis CAAX on mock (condition-pooled test store). +# data_path resolves to the test store in predict mode via dataset_ref. +# Pool naming inside the store is sequential 0/0/fov; plate +# provenance lives in per-position zattrs and the colocated +# .provenance.json sidecar (see dynacell-paper assemble-pool docs). +benchmark: + predict_set: a549_mantis_caax_mock + dataset_ref: + dataset: a549-mantis-caax-mock +data: + class_path: viscy_data.hcs.HCSDataModule + init_args: {} diff --git a/applications/dynacell/configs/benchmarks/virtual_staining/_internal/shared/model/predict_sets/a549_mantis_caax_zikv.yml b/applications/dynacell/configs/benchmarks/virtual_staining/_internal/shared/model/predict_sets/a549_mantis_caax_zikv.yml new file mode 100644 index 000000000..e9b81a8ae --- /dev/null +++ b/applications/dynacell/configs/benchmarks/virtual_staining/_internal/shared/model/predict_sets/a549_mantis_caax_zikv.yml @@ -0,0 +1,12 @@ +# Predict set: A549 mantis CAAX on ZIKV (condition-pooled test store). +# data_path resolves to the test store in predict mode via dataset_ref. +# Pool naming inside the store is sequential 0/0/fov; plate +# provenance lives in per-position zattrs and the colocated +# .provenance.json sidecar (see dynacell-paper assemble-pool docs). 
+benchmark: + predict_set: a549_mantis_caax_zikv + dataset_ref: + dataset: a549-mantis-caax-zikv +data: + class_path: viscy_data.hcs.HCSDataModule + init_args: {} diff --git a/applications/dynacell/configs/benchmarks/virtual_staining/_internal/shared/model/predict_sets/a549_mantis_h2b_denv.yml b/applications/dynacell/configs/benchmarks/virtual_staining/_internal/shared/model/predict_sets/a549_mantis_h2b_denv.yml new file mode 100644 index 000000000..896520228 --- /dev/null +++ b/applications/dynacell/configs/benchmarks/virtual_staining/_internal/shared/model/predict_sets/a549_mantis_h2b_denv.yml @@ -0,0 +1,12 @@ +# Predict set: A549 mantis H2B on DENV (condition-pooled test store). +# data_path resolves to the test store in predict mode via dataset_ref. +# Pool naming inside the store is sequential 0/0/fov; plate +# provenance lives in per-position zattrs and the colocated +# .provenance.json sidecar (see dynacell-paper assemble-pool docs). +benchmark: + predict_set: a549_mantis_h2b_denv + dataset_ref: + dataset: a549-mantis-h2b-denv +data: + class_path: viscy_data.hcs.HCSDataModule + init_args: {} diff --git a/applications/dynacell/configs/benchmarks/virtual_staining/_internal/shared/model/predict_sets/a549_mantis_h2b_mock.yml b/applications/dynacell/configs/benchmarks/virtual_staining/_internal/shared/model/predict_sets/a549_mantis_h2b_mock.yml new file mode 100644 index 000000000..f54707ef4 --- /dev/null +++ b/applications/dynacell/configs/benchmarks/virtual_staining/_internal/shared/model/predict_sets/a549_mantis_h2b_mock.yml @@ -0,0 +1,12 @@ +# Predict set: A549 mantis H2B on mock (condition-pooled test store). +# data_path resolves to the test store in predict mode via dataset_ref. +# Pool naming inside the store is sequential 0/0/fov; plate +# provenance lives in per-position zattrs and the colocated +# .provenance.json sidecar (see dynacell-paper assemble-pool docs). 
+benchmark: + predict_set: a549_mantis_h2b_mock + dataset_ref: + dataset: a549-mantis-h2b-mock +data: + class_path: viscy_data.hcs.HCSDataModule + init_args: {} diff --git a/applications/dynacell/configs/benchmarks/virtual_staining/_internal/shared/model/predict_sets/a549_mantis_h2b_zikv.yml b/applications/dynacell/configs/benchmarks/virtual_staining/_internal/shared/model/predict_sets/a549_mantis_h2b_zikv.yml new file mode 100644 index 000000000..b067a18da --- /dev/null +++ b/applications/dynacell/configs/benchmarks/virtual_staining/_internal/shared/model/predict_sets/a549_mantis_h2b_zikv.yml @@ -0,0 +1,12 @@ +# Predict set: A549 mantis H2B on ZIKV (condition-pooled test store). +# data_path resolves to the test store in predict mode via dataset_ref. +# Pool naming inside the store is sequential 0/0/fov; plate +# provenance lives in per-position zattrs and the colocated +# .provenance.json sidecar (see dynacell-paper assemble-pool docs). +benchmark: + predict_set: a549_mantis_h2b_zikv + dataset_ref: + dataset: a549-mantis-h2b-zikv +data: + class_path: viscy_data.hcs.HCSDataModule + init_args: {} diff --git a/applications/dynacell/configs/benchmarks/virtual_staining/_internal/shared/model/predict_sets/a549_mantis_sec61b_denv.yml b/applications/dynacell/configs/benchmarks/virtual_staining/_internal/shared/model/predict_sets/a549_mantis_sec61b_denv.yml new file mode 100644 index 000000000..0b4910aba --- /dev/null +++ b/applications/dynacell/configs/benchmarks/virtual_staining/_internal/shared/model/predict_sets/a549_mantis_sec61b_denv.yml @@ -0,0 +1,12 @@ +# Predict set: A549 mantis SEC61B on DENV (condition-pooled test store). +# data_path resolves to the test store in predict mode via dataset_ref. +# Pool naming inside the store is sequential 0/0/fov; plate +# provenance lives in per-position zattrs and the colocated +# .provenance.json sidecar (see dynacell-paper assemble-pool docs). 
+benchmark: + predict_set: a549_mantis_sec61b_denv + dataset_ref: + dataset: a549-mantis-sec61b-denv +data: + class_path: viscy_data.hcs.HCSDataModule + init_args: {} diff --git a/applications/dynacell/configs/benchmarks/virtual_staining/_internal/shared/model/predict_sets/a549_mantis_sec61b_mock.yml b/applications/dynacell/configs/benchmarks/virtual_staining/_internal/shared/model/predict_sets/a549_mantis_sec61b_mock.yml new file mode 100644 index 000000000..d4ae2a646 --- /dev/null +++ b/applications/dynacell/configs/benchmarks/virtual_staining/_internal/shared/model/predict_sets/a549_mantis_sec61b_mock.yml @@ -0,0 +1,12 @@ +# Predict set: A549 mantis SEC61B on mock (condition-pooled test store). +# data_path resolves to the test store in predict mode via dataset_ref. +# Pool naming inside the store is sequential 0/0/fov; plate +# provenance lives in per-position zattrs and the colocated +# .provenance.json sidecar (see dynacell-paper assemble-pool docs). +benchmark: + predict_set: a549_mantis_sec61b_mock + dataset_ref: + dataset: a549-mantis-sec61b-mock +data: + class_path: viscy_data.hcs.HCSDataModule + init_args: {} diff --git a/applications/dynacell/configs/benchmarks/virtual_staining/_internal/shared/model/predict_sets/a549_mantis_sec61b_zikv.yml b/applications/dynacell/configs/benchmarks/virtual_staining/_internal/shared/model/predict_sets/a549_mantis_sec61b_zikv.yml new file mode 100644 index 000000000..c5e2ae9d4 --- /dev/null +++ b/applications/dynacell/configs/benchmarks/virtual_staining/_internal/shared/model/predict_sets/a549_mantis_sec61b_zikv.yml @@ -0,0 +1,12 @@ +# Predict set: A549 mantis SEC61B on ZIKV (condition-pooled test store). +# data_path resolves to the test store in predict mode via dataset_ref. +# Pool naming inside the store is sequential 0/0/fov; plate +# provenance lives in per-position zattrs and the colocated +# .provenance.json sidecar (see dynacell-paper assemble-pool docs). 
+benchmark: + predict_set: a549_mantis_sec61b_zikv + dataset_ref: + dataset: a549-mantis-sec61b-zikv +data: + class_path: viscy_data.hcs.HCSDataModule + init_args: {} diff --git a/applications/dynacell/configs/benchmarks/virtual_staining/_internal/shared/model/predict_sets/a549_mantis_tomm20_denv.yml b/applications/dynacell/configs/benchmarks/virtual_staining/_internal/shared/model/predict_sets/a549_mantis_tomm20_denv.yml new file mode 100644 index 000000000..c86d0293c --- /dev/null +++ b/applications/dynacell/configs/benchmarks/virtual_staining/_internal/shared/model/predict_sets/a549_mantis_tomm20_denv.yml @@ -0,0 +1,12 @@ +# Predict set: A549 mantis TOMM20 on DENV (condition-pooled test store). +# data_path resolves to the test store in predict mode via dataset_ref. +# Pool naming inside the store is sequential 0/0/fov; plate +# provenance lives in per-position zattrs and the colocated +# .provenance.json sidecar (see dynacell-paper assemble-pool docs). +benchmark: + predict_set: a549_mantis_tomm20_denv + dataset_ref: + dataset: a549-mantis-tomm20-denv +data: + class_path: viscy_data.hcs.HCSDataModule + init_args: {} diff --git a/applications/dynacell/configs/benchmarks/virtual_staining/_internal/shared/model/predict_sets/a549_mantis_tomm20_mock.yml b/applications/dynacell/configs/benchmarks/virtual_staining/_internal/shared/model/predict_sets/a549_mantis_tomm20_mock.yml new file mode 100644 index 000000000..ec58c2793 --- /dev/null +++ b/applications/dynacell/configs/benchmarks/virtual_staining/_internal/shared/model/predict_sets/a549_mantis_tomm20_mock.yml @@ -0,0 +1,12 @@ +# Predict set: A549 mantis TOMM20 on mock (condition-pooled test store). +# data_path resolves to the test store in predict mode via dataset_ref. +# Pool naming inside the store is sequential 0/0/fov; plate +# provenance lives in per-position zattrs and the colocated +# .provenance.json sidecar (see dynacell-paper assemble-pool docs). 
+benchmark: + predict_set: a549_mantis_tomm20_mock + dataset_ref: + dataset: a549-mantis-tomm20-mock +data: + class_path: viscy_data.hcs.HCSDataModule + init_args: {} diff --git a/applications/dynacell/configs/benchmarks/virtual_staining/_internal/shared/model/predict_sets/a549_mantis_tomm20_zikv.yml b/applications/dynacell/configs/benchmarks/virtual_staining/_internal/shared/model/predict_sets/a549_mantis_tomm20_zikv.yml new file mode 100644 index 000000000..b5508b18d --- /dev/null +++ b/applications/dynacell/configs/benchmarks/virtual_staining/_internal/shared/model/predict_sets/a549_mantis_tomm20_zikv.yml @@ -0,0 +1,12 @@ +# Predict set: A549 mantis TOMM20 on ZIKV (condition-pooled test store). +# data_path resolves to the test store in predict mode via dataset_ref. +# Pool naming inside the store is sequential 0/0/fov; plate +# provenance lives in per-position zattrs and the colocated +# .provenance.json sidecar (see dynacell-paper assemble-pool docs). +benchmark: + predict_set: a549_mantis_tomm20_zikv + dataset_ref: + dataset: a549-mantis-tomm20-zikv +data: + class_path: viscy_data.hcs.HCSDataModule + init_args: {} diff --git a/applications/dynacell/configs/benchmarks/virtual_staining/_internal/shared/model/predict_sets/ipsc_confocal.yml b/applications/dynacell/configs/benchmarks/virtual_staining/_internal/shared/model/predict_sets/ipsc_confocal.yml new file mode 100644 index 000000000..2fa30db3b --- /dev/null +++ b/applications/dynacell/configs/benchmarks/virtual_staining/_internal/shared/model/predict_sets/ipsc_confocal.yml @@ -0,0 +1,9 @@ +# Predict set: AICS iPSC confocal, self-predict against test_cropped/. +# data_path resolves to the test store in predict mode via dataset_ref. 
+benchmark: + predict_set: ipsc_confocal + dataset_ref: + dataset: aics-hipsc +data: + class_path: viscy_data.hcs.HCSDataModule + init_args: {} diff --git a/applications/dynacell/configs/benchmarks/virtual_staining/_internal/shared/model/targets/er_sec61b.yml b/applications/dynacell/configs/benchmarks/virtual_staining/_internal/shared/model/targets/er_sec61b.yml new file mode 100644 index 000000000..93d5def22 --- /dev/null +++ b/applications/dynacell/configs/benchmarks/virtual_staining/_internal/shared/model/targets/er_sec61b.yml @@ -0,0 +1,30 @@ +# Target: ER (SEC61B marker). data_path / source_channel / target_channel +# resolved from the manifest via dataset_ref. +benchmark: + target: er + gene: SEC61B + target_id: er_sec61b + dataset_ref: + target: sec61b +data: + init_args: + normalizations: + - class_path: viscy_transforms.NormalizeSampled + init_args: + keys: [Phase3D] + level: fov_statistics + subtrahend: mean + divisor: std + - class_path: viscy_transforms.NormalizeSampled + init_args: + keys: [Structure] + level: fov_statistics + subtrahend: median + divisor: iqr + augmentations: + - class_path: viscy_transforms.RandWeightedCropd + init_args: + keys: [Phase3D, Structure] + w_key: Structure + spatial_size: [13, 624, 624] + num_samples: 2 diff --git a/applications/dynacell/configs/benchmarks/virtual_staining/_internal/shared/model/targets/membrane.yml b/applications/dynacell/configs/benchmarks/virtual_staining/_internal/shared/model/targets/membrane.yml new file mode 100644 index 000000000..e4d9fc45a --- /dev/null +++ b/applications/dynacell/configs/benchmarks/virtual_staining/_internal/shared/model/targets/membrane.yml @@ -0,0 +1,30 @@ +# Target: membrane (Membrane channel of the multi-marker cell.zarr). data_path / +# source_channel / target_channel resolved from the manifest via dataset_ref. 
+benchmark: + target: membrane + gene: Membrane + target_id: membrane + dataset_ref: + target: membrane +data: + init_args: + normalizations: + - class_path: viscy_transforms.NormalizeSampled + init_args: + keys: [Phase3D] + level: fov_statistics + subtrahend: mean + divisor: std + - class_path: viscy_transforms.NormalizeSampled + init_args: + keys: [Membrane] + level: fov_statistics + subtrahend: median + divisor: iqr + augmentations: + - class_path: viscy_transforms.RandWeightedCropd + init_args: + keys: [Phase3D, Membrane] + w_key: Membrane + spatial_size: [13, 624, 624] + num_samples: 2 diff --git a/applications/dynacell/configs/benchmarks/virtual_staining/_internal/shared/model/targets/mito_tomm20.yml b/applications/dynacell/configs/benchmarks/virtual_staining/_internal/shared/model/targets/mito_tomm20.yml new file mode 100644 index 000000000..0a96af1bf --- /dev/null +++ b/applications/dynacell/configs/benchmarks/virtual_staining/_internal/shared/model/targets/mito_tomm20.yml @@ -0,0 +1,30 @@ +# Target: mitochondria (TOMM20 marker). data_path / source_channel / +# target_channel resolved from the manifest via dataset_ref. 
+benchmark: + target: mito + gene: TOMM20 + target_id: mito_tomm20 + dataset_ref: + target: tomm20 +data: + init_args: + normalizations: + - class_path: viscy_transforms.NormalizeSampled + init_args: + keys: [Phase3D] + level: fov_statistics + subtrahend: mean + divisor: std + - class_path: viscy_transforms.NormalizeSampled + init_args: + keys: [Structure] + level: fov_statistics + subtrahend: median + divisor: iqr + augmentations: + - class_path: viscy_transforms.RandWeightedCropd + init_args: + keys: [Phase3D, Structure] + w_key: Structure + spatial_size: [13, 624, 624] + num_samples: 2 diff --git a/applications/dynacell/configs/benchmarks/virtual_staining/_internal/shared/model/targets/nucleus.yml b/applications/dynacell/configs/benchmarks/virtual_staining/_internal/shared/model/targets/nucleus.yml new file mode 100644 index 000000000..156ee8e39 --- /dev/null +++ b/applications/dynacell/configs/benchmarks/virtual_staining/_internal/shared/model/targets/nucleus.yml @@ -0,0 +1,30 @@ +# Target: nucleus (Nuclei channel of the multi-marker cell.zarr). data_path / +# source_channel / target_channel resolved from the manifest via dataset_ref. 
+benchmark: + target: nucleus + gene: Nuclei + target_id: nucleus + dataset_ref: + target: nucleus +data: + init_args: + normalizations: + - class_path: viscy_transforms.NormalizeSampled + init_args: + keys: [Phase3D] + level: fov_statistics + subtrahend: mean + divisor: std + - class_path: viscy_transforms.NormalizeSampled + init_args: + keys: [Nuclei] + level: fov_statistics + subtrahend: median + divisor: iqr + augmentations: + - class_path: viscy_transforms.RandWeightedCropd + init_args: + keys: [Phase3D, Nuclei] + w_key: Nuclei + spatial_size: [13, 624, 624] + num_samples: 2 diff --git a/applications/dynacell/configs/benchmarks/virtual_staining/_internal/shared/model/train_sets/a549_mantis.yml b/applications/dynacell/configs/benchmarks/virtual_staining/_internal/shared/model/train_sets/a549_mantis.yml new file mode 100644 index 000000000..d87e637e5 --- /dev/null +++ b/applications/dynacell/configs/benchmarks/virtual_staining/_internal/shared/model/train_sets/a549_mantis.yml @@ -0,0 +1,30 @@ +# Train set: A549 mantis-lightsheet, condition-pooled (mock + DENV + ZIKV +# all in one store per target). Pooled stores live at +# `/hpc/projects/virtual_staining/training/dynacell/a549/mantis_v1/train/_all.zarr` +# and are NOT registered in the canonical manifest registry — the +# per-treatment manifests under `a549-mantis/-/` +# point at per-treatment ozx files (used for predict/eval), not the +# pooled train zarrs. +# +# Because there is no canonical manifest for the pooled train stores, +# leaves consuming this fragment author `data.init_args.{data_path, +# target_channel}` inline (no resolver) — same shape as the joint +# leaves, but with a single HCSDataModule child instead of two. 
+# +# `dataset_ref` is intentionally not set: the resolver hook +# (`_compose_hook._dynacell_ref_resolver`) is a strict partial-ref +# no-op when `dataset_ref.dataset` is missing, so composing +# `targets/.yml` (which sets `dataset_ref.target`) alongside this +# fragment is safe and keeps the per-target normalizations / +# augmentations from the target fragment. +benchmark: + train_set: a549_mantis + dataset_group: a549-mantis +data: + class_path: viscy_data.hcs.HCSDataModule + init_args: + source_channel: Phase3D + split_ratio: 0.8 + mmap_preload: true + scratch_dir: /dev/shm + persistent_workers: true diff --git a/applications/dynacell/configs/benchmarks/virtual_staining/_internal/shared/model/train_sets/ipsc_confocal.yml b/applications/dynacell/configs/benchmarks/virtual_staining/_internal/shared/model/train_sets/ipsc_confocal.yml new file mode 100644 index 000000000..e5e3f78a7 --- /dev/null +++ b/applications/dynacell/configs/benchmarks/virtual_staining/_internal/shared/model/train_sets/ipsc_confocal.yml @@ -0,0 +1,15 @@ +# Train set: AICS iPSC confocal. dataset_ref.dataset lives here (dataset +# identity is train_set-scoped); target fragments carry dataset_ref.target +# and declare source_channel themselves. +benchmark: + train_set: ipsc_confocal + dataset_group: aics-hipsc + dataset_ref: + dataset: aics-hipsc +data: + class_path: viscy_data.hcs.HCSDataModule + init_args: + split_ratio: 0.8 + mmap_preload: true + scratch_dir: /dev/shm + persistent_workers: true diff --git a/applications/dynacell/configs/benchmarks/virtual_staining/er/celldiff/a549_mantis/train.yml b/applications/dynacell/configs/benchmarks/virtual_staining/er/celldiff/a549_mantis/train.yml new file mode 100644 index 000000000..62b4964ee --- /dev/null +++ b/applications/dynacell/configs/benchmarks/virtual_staining/er/celldiff/a549_mantis/train.yml @@ -0,0 +1,42 @@ +# CellDiff fit on ER (SEC61B marker) — A549 mantis-lightsheet pooled (mock + DENV + ZIKV). 
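Reviewer note on the partial-ref no-op described in the train_sets/a549_mantis.yml comment above: the contract can be sketched in a few lines of Python. The real hook is `_compose_hook._dynacell_ref_resolver`, whose body is not part of this diff, so the function name, manifest shape, and keys below are illustrative assumptions, not the actual implementation.

```python
# Sketch of the strict partial-ref no-op: when dataset_ref.dataset is
# missing (e.g. a targets/*.yml fragment only sets dataset_ref.target),
# the resolver leaves the composed config untouched, so target fragments
# compose safely with resolver-less train sets. Hypothetical code.

def resolve_dataset_ref(config: dict, manifest: dict) -> dict:
    """Fill data paths from a manifest, or no-op on a partial ref."""
    ref = config.get("benchmark", {}).get("dataset_ref", {})
    if "dataset" not in ref:
        # Partial ref: strict no-op, keep inline-authored data_path etc.
        return config
    entry = manifest[ref["dataset"]]
    if "target" in ref:
        entry = entry["targets"][ref["target"]]
    config["data"]["init_args"]["data_path"] = entry["data_path"]
    return config
```

This keeps the a549_mantis train fragment (no `dataset_ref.dataset`) inert under the resolver while ipsc_confocal (which sets it) gets its paths filled in.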
+base: + - ../../../_internal/shared/model/train_sets/a549_mantis.yml + - ../../../_internal/shared/model/targets/er_sec61b.yml + - ../../../_internal/shared/model/data_overlays/celldiff_fit.yml + - ../../../_internal/shared/model/model_overlays/celldiff_fit.yml + - ../../../_internal/shared/model/launcher_profiles/mode_fit.yml + - ../../../_internal/shared/model/launcher_profiles/hardware_h200_single.yml + - ../../../_internal/shared/model/launcher_profiles/runtime_shared.yml + +benchmark: + task: virtual_staining + organelle: er + train_set: a549_mantis + model_name: celldiff + experiment_id: er__a549_mantis__celldiff + +trainer: + logger: + init_args: + name: CELLDiff_A549_SEC61B + save_dir: /hpc/projects/comp.micro/virtual_staining/models/cell_diff_vs_viscy/a549_mantis/sec61b/celldiff + callbacks: + - class_path: lightning.pytorch.callbacks.LearningRateMonitor + init_args: + logging_interval: step + - class_path: lightning.pytorch.callbacks.ModelCheckpoint + init_args: + every_n_epochs: 1 + save_top_k: -1 + save_last: true + dirpath: /hpc/projects/comp.micro/virtual_staining/models/cell_diff_vs_viscy/a549_mantis/sec61b/celldiff/checkpoints + +data: + init_args: + # A549 pooled store + target_channel — no resolver in this train_set. 
+ target_channel: Structure + data_path: /hpc/projects/virtual_staining/training/dynacell/a549/mantis_v1/train/SEC61B_all.zarr + +launcher: + job_name: CELLDiff_A549_SEC61B + run_root: /hpc/projects/comp.micro/virtual_staining/models/cell_diff_vs_viscy/a549_mantis/sec61b/celldiff diff --git a/applications/dynacell/configs/benchmarks/virtual_staining/er/celldiff/ipsc_confocal/eval__a549_mantis_denv.yaml b/applications/dynacell/configs/benchmarks/virtual_staining/er/celldiff/ipsc_confocal/eval__a549_mantis_denv.yaml new file mode 100644 index 000000000..21194f359 --- /dev/null +++ b/applications/dynacell/configs/benchmarks/virtual_staining/er/celldiff/ipsc_confocal/eval__a549_mantis_denv.yaml @@ -0,0 +1,15 @@ +# @package _global_ +# Benchmark eval leaf: ER (SEC61B) predicted by CellDiff on a549-mantis-sec61b-denv. +defaults: + - override /target: er_sec61b + - override /predict_set: a549_mantis_sec61b_denv + +io: + pred_path: /hpc/projects/virtual_staining/training/dynacell/a549/predictions/sec61b_celldiff_iterative__sec61b_denv.zarr + +# A549 manifests don't carry cell_segmentation paths (no segmentation +# pipeline yet). Skip feature metrics until segmentation lands. +compute_feature_metrics: false + +save: + save_dir: /hpc/projects/virtual_staining/training/dynacell/a549/predictions/eval_sec61b_celldiff_iterative__sec61b_denv diff --git a/applications/dynacell/configs/benchmarks/virtual_staining/er/celldiff/ipsc_confocal/eval__a549_mantis_mock.yaml b/applications/dynacell/configs/benchmarks/virtual_staining/er/celldiff/ipsc_confocal/eval__a549_mantis_mock.yaml new file mode 100644 index 000000000..24f954091 --- /dev/null +++ b/applications/dynacell/configs/benchmarks/virtual_staining/er/celldiff/ipsc_confocal/eval__a549_mantis_mock.yaml @@ -0,0 +1,15 @@ +# @package _global_ +# Benchmark eval leaf: ER (SEC61B) predicted by CellDiff on a549-mantis-sec61b-mock. 
+defaults: + - override /target: er_sec61b + - override /predict_set: a549_mantis_sec61b_mock + +io: + pred_path: /hpc/projects/virtual_staining/training/dynacell/a549/predictions/sec61b_celldiff_iterative__sec61b_mock.zarr + +# A549 manifests don't carry cell_segmentation paths (no segmentation +# pipeline yet). Skip feature metrics until segmentation lands. +compute_feature_metrics: false + +save: + save_dir: /hpc/projects/virtual_staining/training/dynacell/a549/predictions/eval_sec61b_celldiff_iterative__sec61b_mock diff --git a/applications/dynacell/configs/benchmarks/virtual_staining/er/celldiff/ipsc_confocal/eval__a549_mantis_zikv.yaml b/applications/dynacell/configs/benchmarks/virtual_staining/er/celldiff/ipsc_confocal/eval__a549_mantis_zikv.yaml new file mode 100644 index 000000000..6390b4106 --- /dev/null +++ b/applications/dynacell/configs/benchmarks/virtual_staining/er/celldiff/ipsc_confocal/eval__a549_mantis_zikv.yaml @@ -0,0 +1,15 @@ +# @package _global_ +# Benchmark eval leaf: ER (SEC61B) predicted by CellDiff on a549-mantis-sec61b-zikv. +defaults: + - override /target: er_sec61b + - override /predict_set: a549_mantis_sec61b_zikv + +io: + pred_path: /hpc/projects/virtual_staining/training/dynacell/a549/predictions/sec61b_celldiff_iterative__sec61b_zikv.zarr + +# A549 manifests don't carry cell_segmentation paths (no segmentation +# pipeline yet). Skip feature metrics until segmentation lands. 
+compute_feature_metrics: false + +save: + save_dir: /hpc/projects/virtual_staining/training/dynacell/a549/predictions/eval_sec61b_celldiff_iterative__sec61b_zikv diff --git a/applications/dynacell/configs/benchmarks/virtual_staining/er/celldiff/ipsc_confocal/eval__ipsc_confocal.yaml b/applications/dynacell/configs/benchmarks/virtual_staining/er/celldiff/ipsc_confocal/eval__ipsc_confocal.yaml new file mode 100644 index 000000000..e11c6a274 --- /dev/null +++ b/applications/dynacell/configs/benchmarks/virtual_staining/er/celldiff/ipsc_confocal/eval__ipsc_confocal.yaml @@ -0,0 +1,13 @@ +# @package _global_ +# Benchmark eval leaf: ER (SEC61B) predicted by CellDiff on iPSC confocal. +defaults: + - override /target: er_sec61b + - override /predict_set: ipsc_confocal + +io: + pred_path: /hpc/projects/virtual_staining/training/dynacell/ipsc/predictions/sec61b_celldiff_iterative.zarr + +compute_feature_metrics: true + +save: + save_dir: /hpc/projects/virtual_staining/training/dynacell/ipsc/predictions/eval_sec61b_celldiff_iterative diff --git a/applications/dynacell/configs/benchmarks/virtual_staining/er/celldiff/ipsc_confocal/predict__a549_mantis_denv.yml b/applications/dynacell/configs/benchmarks/virtual_staining/er/celldiff/ipsc_confocal/predict__a549_mantis_denv.yml new file mode 100644 index 000000000..952e8c5fa --- /dev/null +++ b/applications/dynacell/configs/benchmarks/virtual_staining/er/celldiff/ipsc_confocal/predict__a549_mantis_denv.yml @@ -0,0 +1,44 @@ +# CellDiff predict: ER (SEC61B) trained on iPSC, predicting against a549_mantis_sec61b_denv test. 
+base: + - ../../../_internal/shared/model/predict_sets/a549_mantis_sec61b_denv.yml + - ../../../_internal/shared/model/targets/er_sec61b.yml + - ../../../_internal/shared/model/model_overlays/celldiff_predict.yml + - ../../../_internal/shared/model/launcher_profiles/mode_predict.yml + - ../../../_internal/shared/model/launcher_profiles/hardware_h200_single.yml + - ../../../_internal/shared/model/launcher_profiles/runtime_shared.yml + +benchmark: + task: virtual_staining + organelle: er + trained_on: ipsc_confocal + predict_set: a549_mantis_sec61b_denv + model_name: celldiff + experiment_id: er__ipsc_confocal__celldiff__a549_mantis_sec61b_denv + +model: + init_args: + ckpt_path: /hpc/projects/comp.micro/virtual_staining/models/cell_diff_vs_viscy/ipsc/sec61b/celldiff/checkpoints/last.ckpt + predict_method: iterative # denoise, generate, sliding_window, or iterative + predict_overlap: [4, 256, 256] + +data: + init_args: + normalizations: + - class_path: viscy_transforms.NormalizeSampled + init_args: + keys: [Phase3D] + level: fov_statistics + subtrahend: mean + divisor: std + augmentations: [] + z_window_size: 48 # 8 for denoise and generate, 48 for iterative and sliding_window. 
+ +trainer: + callbacks: + - class_path: viscy_utils.callbacks.prediction_writer.HCSPredictionWriter + init_args: + output_store: /hpc/projects/virtual_staining/training/dynacell/a549/predictions/sec61b_celldiff_iterative__sec61b_denv.zarr + +launcher: + job_name: CELLDiff_PRED_SEC61B_ON_A549_sec61b_denv + run_root: /hpc/projects/virtual_staining/training/dynacell/a549/predictions diff --git a/applications/dynacell/configs/benchmarks/virtual_staining/er/celldiff/ipsc_confocal/predict__a549_mantis_mock.yml b/applications/dynacell/configs/benchmarks/virtual_staining/er/celldiff/ipsc_confocal/predict__a549_mantis_mock.yml new file mode 100644 index 000000000..d7bfa03b8 --- /dev/null +++ b/applications/dynacell/configs/benchmarks/virtual_staining/er/celldiff/ipsc_confocal/predict__a549_mantis_mock.yml @@ -0,0 +1,44 @@ +# CellDiff predict: ER (SEC61B) trained on iPSC, predicting against a549_mantis_sec61b_mock test. +base: + - ../../../_internal/shared/model/predict_sets/a549_mantis_sec61b_mock.yml + - ../../../_internal/shared/model/targets/er_sec61b.yml + - ../../../_internal/shared/model/model_overlays/celldiff_predict.yml + - ../../../_internal/shared/model/launcher_profiles/mode_predict.yml + - ../../../_internal/shared/model/launcher_profiles/hardware_h200_single.yml + - ../../../_internal/shared/model/launcher_profiles/runtime_shared.yml + +benchmark: + task: virtual_staining + organelle: er + trained_on: ipsc_confocal + predict_set: a549_mantis_sec61b_mock + model_name: celldiff + experiment_id: er__ipsc_confocal__celldiff__a549_mantis_sec61b_mock + +model: + init_args: + ckpt_path: /hpc/projects/comp.micro/virtual_staining/models/cell_diff_vs_viscy/ipsc/sec61b/celldiff/checkpoints/last.ckpt + predict_method: iterative # denoise, generate, sliding_window, or iterative + predict_overlap: [4, 256, 256] + +data: + init_args: + normalizations: + - class_path: viscy_transforms.NormalizeSampled + init_args: + keys: [Phase3D] + level: fov_statistics + subtrahend: mean 
+ divisor: std + augmentations: [] + z_window_size: 48 # 8 for denoise and generate, 48 for iterative and sliding_window. + +trainer: + callbacks: + - class_path: viscy_utils.callbacks.prediction_writer.HCSPredictionWriter + init_args: + output_store: /hpc/projects/virtual_staining/training/dynacell/a549/predictions/sec61b_celldiff_iterative__sec61b_mock.zarr + +launcher: + job_name: CELLDiff_PRED_SEC61B_ON_A549_sec61b_mock + run_root: /hpc/projects/virtual_staining/training/dynacell/a549/predictions diff --git a/applications/dynacell/configs/benchmarks/virtual_staining/er/celldiff/ipsc_confocal/predict__a549_mantis_zikv.yml b/applications/dynacell/configs/benchmarks/virtual_staining/er/celldiff/ipsc_confocal/predict__a549_mantis_zikv.yml new file mode 100644 index 000000000..6d5d7ef56 --- /dev/null +++ b/applications/dynacell/configs/benchmarks/virtual_staining/er/celldiff/ipsc_confocal/predict__a549_mantis_zikv.yml @@ -0,0 +1,44 @@ +# CellDiff predict: ER (SEC61B) trained on iPSC, predicting against a549_mantis_sec61b_zikv test.
+base: + - ../../../_internal/shared/model/predict_sets/a549_mantis_sec61b_zikv.yml + - ../../../_internal/shared/model/targets/er_sec61b.yml + - ../../../_internal/shared/model/model_overlays/celldiff_predict.yml + - ../../../_internal/shared/model/launcher_profiles/mode_predict.yml + - ../../../_internal/shared/model/launcher_profiles/hardware_h200_single.yml + - ../../../_internal/shared/model/launcher_profiles/runtime_shared.yml + +benchmark: + task: virtual_staining + organelle: er + trained_on: ipsc_confocal + predict_set: a549_mantis_sec61b_zikv + model_name: celldiff + experiment_id: er__ipsc_confocal__celldiff__a549_mantis_sec61b_zikv + +model: + init_args: + ckpt_path: /hpc/projects/comp.micro/virtual_staining/models/cell_diff_vs_viscy/ipsc/sec61b/celldiff/checkpoints/last.ckpt + predict_method: iterative # denoise, generate, sliding_window, or iterative + predict_overlap: [4, 256, 256] + +data: + init_args: + normalizations: + - class_path: viscy_transforms.NormalizeSampled + init_args: + keys: [Phase3D] + level: fov_statistics + subtrahend: mean + divisor: std + augmentations: [] + z_window_size: 48 # 8 for denoise and generate, 48 for iterative and sliding_window.
+ +trainer: + callbacks: + - class_path: viscy_utils.callbacks.prediction_writer.HCSPredictionWriter + init_args: + output_store: /hpc/projects/virtual_staining/training/dynacell/a549/predictions/sec61b_celldiff_iterative__sec61b_zikv.zarr + +launcher: + job_name: CELLDiff_PRED_SEC61B_ON_A549_sec61b_zikv + run_root: /hpc/projects/virtual_staining/training/dynacell/a549/predictions diff --git a/applications/dynacell/configs/benchmarks/virtual_staining/er/celldiff/ipsc_confocal/predict__ipsc_confocal.yml b/applications/dynacell/configs/benchmarks/virtual_staining/er/celldiff/ipsc_confocal/predict__ipsc_confocal.yml new file mode 100644 index 000000000..6076274b2 --- /dev/null +++ b/applications/dynacell/configs/benchmarks/virtual_staining/er/celldiff/ipsc_confocal/predict__ipsc_confocal.yml @@ -0,0 +1,46 @@ +# CellDiff predict: ER (SEC61B) against ipsc_confocal test_cropped. +base: + - ../../../_internal/shared/model/predict_sets/ipsc_confocal.yml + - ../../../_internal/shared/model/targets/er_sec61b.yml + - ../../../_internal/shared/model/model_overlays/celldiff_predict.yml + - ../../../_internal/shared/model/launcher_profiles/mode_predict.yml + - ../../../_internal/shared/model/launcher_profiles/hardware_h200_single.yml + - ../../../_internal/shared/model/launcher_profiles/runtime_shared.yml + +benchmark: + task: virtual_staining + organelle: er + trained_on: ipsc_confocal + predict_set: ipsc_confocal + model_name: celldiff + experiment_id: er__ipsc_confocal__celldiff__ipsc_confocal + +model: + init_args: + ckpt_path: /hpc/projects/comp.micro/virtual_staining/models/cell_diff_vs_viscy/ipsc/sec61b/celldiff/checkpoints/last.ckpt + predict_method: iterative # denoise, generate, sliding_window, or iterative + predict_overlap: [4, 256, 256] + +data: + init_args: + # override target-inherited normalizations: predict only reads source + normalizations: + - class_path: viscy_transforms.NormalizeSampled + init_args: + keys: [Phase3D] + level: fov_statistics + subtrahend: 
mean + divisor: std + # clear target-inherited RandWeightedCropd; predict has no CPU augs + augmentations: [] + z_window_size: 40 # 8 for denoise and generate, 40 for iterative and sliding_window. + +trainer: + callbacks: + - class_path: viscy_utils.callbacks.prediction_writer.HCSPredictionWriter + init_args: + output_store: /hpc/projects/virtual_staining/training/dynacell/ipsc/predictions/sec61b_celldiff_iterative.zarr + +launcher: + job_name: CELLDiff_PRED_SEC61B + run_root: /hpc/projects/virtual_staining/training/dynacell/ipsc/predictions diff --git a/applications/dynacell/configs/benchmarks/virtual_staining/er/celldiff/ipsc_confocal/train.yml b/applications/dynacell/configs/benchmarks/virtual_staining/er/celldiff/ipsc_confocal/train.yml new file mode 100644 index 000000000..39c2a50a2 --- /dev/null +++ b/applications/dynacell/configs/benchmarks/virtual_staining/er/celldiff/ipsc_confocal/train.yml @@ -0,0 +1,36 @@ +# CellDiff fit on ER (SEC61B marker) — AICS iPSC confocal. +base: + - ../../../_internal/shared/model/train_sets/ipsc_confocal.yml + - ../../../_internal/shared/model/targets/er_sec61b.yml + - ../../../_internal/shared/model/data_overlays/celldiff_fit.yml + - ../../../_internal/shared/model/model_overlays/celldiff_fit.yml + - ../../../_internal/shared/model/launcher_profiles/mode_fit.yml + - ../../../_internal/shared/model/launcher_profiles/hardware_h200_single.yml + - ../../../_internal/shared/model/launcher_profiles/runtime_shared.yml + +benchmark: + task: virtual_staining + organelle: er + train_set: ipsc_confocal + model_name: celldiff + experiment_id: er__ipsc_confocal__celldiff + +trainer: + logger: + init_args: + name: CELLDiff_iPSC_SEC61B + save_dir: /hpc/projects/comp.micro/virtual_staining/models/cell_diff_vs_viscy/ipsc/sec61b/celldiff + callbacks: + - class_path: lightning.pytorch.callbacks.LearningRateMonitor + init_args: + logging_interval: step + - class_path: lightning.pytorch.callbacks.ModelCheckpoint + init_args: + every_n_epochs: 1 + 
save_top_k: -1 + save_last: true + dirpath: /hpc/projects/comp.micro/virtual_staining/models/cell_diff_vs_viscy/ipsc/sec61b/celldiff/checkpoints + +launcher: + job_name: CELLDiff_SEC61B + run_root: /hpc/projects/comp.micro/virtual_staining/models/cell_diff_vs_viscy/ipsc/sec61b/celldiff diff --git a/applications/dynacell/configs/benchmarks/virtual_staining/er/celldiff/joint_ipsc_confocal_a549_mantis/train.yml b/applications/dynacell/configs/benchmarks/virtual_staining/er/celldiff/joint_ipsc_confocal_a549_mantis/train.yml new file mode 100644 index 000000000..cedecb8fa --- /dev/null +++ b/applications/dynacell/configs/benchmarks/virtual_staining/er/celldiff/joint_ipsc_confocal_a549_mantis/train.yml @@ -0,0 +1,149 @@ +# CellDiff fit on ER (SEC61B) — joint ipsc_confocal + a549_mantis_2024_11_07. +# +# First joint train leaf per Stage 7 of A549_EXPANSION_ROADMAP.md. +# Uses BatchedConcatDataModule with two explicit HCSDataModule children +# (no benchmark.dataset_ref — joint leaves bypass the single-dataset +# resolver). Only model_overlays/celldiff_fit.yml is composed; the data +# block is authored inline because joint hparams live on the children. +# +# Topology: single H200, single GPU — same as celldiff/ipsc_confocal/train.yml. +# The paper baseline pattern is single-GPU and we keep that here so +# iPSC-only and joint runs are apples-to-apples. 
+base: + - ../../../_internal/shared/model/model_overlays/celldiff_fit.yml + - ../../../_internal/shared/model/launcher_profiles/mode_fit.yml + - ../../../_internal/shared/model/launcher_profiles/hardware_h200_single.yml + - ../../../_internal/shared/model/launcher_profiles/runtime_shared.yml + +benchmark: + task: virtual_staining + organelle: er + gene: SEC61B + target: er + target_id: er_sec61b + train_set: joint_ipsc_confocal_a549_mantis + model_name: celldiff + experiment_id: er__joint_ipsc_confocal_a549_mantis__celldiff + +trainer: + logger: + init_args: + name: CELLDiff_JOINT_SEC61B + save_dir: /hpc/projects/comp.micro/virtual_staining/models/cell_diff_vs_viscy/joint_ipsc_confocal_a549_mantis/sec61b/celldiff + callbacks: + - class_path: lightning.pytorch.callbacks.LearningRateMonitor + init_args: + logging_interval: step + - class_path: lightning.pytorch.callbacks.ModelCheckpoint + init_args: + every_n_epochs: 1 + save_top_k: -1 + save_last: true + dirpath: /hpc/projects/comp.micro/virtual_staining/models/cell_diff_vs_viscy/joint_ipsc_confocal_a549_mantis/sec61b/celldiff/checkpoints + +# Child HCSDataModule init_args shared across both datasets (only data_path +# differs). Factored as a YAML anchor so the joint leaf stays auditable; +# this is the first joint leaf — if the pattern sticks we can promote to a +# reusable fragment. +# +# Naming convention: top-level keys starting with `_` are private to the +# YAML compose layer and are stripped by `load_composed_config` before +# the dict reaches LightningCLI / jsonargparse (which would reject them +# as unknown options). The merge expansion under `data:` survives. 
+_hcs_init_args: &hcs_init_args + source_channel: Phase3D + target_channel: Structure + z_window_size: 13 + batch_size: 4 + num_workers: 4 + yx_patch_size: [512, 512] + split_ratio: 0.8 + mmap_preload: true + scratch_dir: /dev/shm + persistent_workers: true + normalizations: + - class_path: viscy_transforms.NormalizeSampled + init_args: + keys: [Phase3D] + level: fov_statistics + subtrahend: mean + divisor: std + - class_path: viscy_transforms.NormalizeSampled + init_args: + keys: [Structure] + level: fov_statistics + subtrahend: median + divisor: iqr + augmentations: + - class_path: viscy_transforms.RandWeightedCropd + init_args: + keys: [Phase3D, Structure] + w_key: Structure + spatial_size: [13, 624, 624] + num_samples: 2 + gpu_augmentations: + - class_path: viscy_transforms.BatchedRandAffined + init_args: + keys: [source, target] + prob: 0.8 + rotate_range: [3.14, 0, 0] + shear_range: [0.0, 0.05, 0.05] + scale_range: [[0.7, 1.3], [0.5, 1.5], [0.5, 1.5]] + safe_crop_size: [8, 512, 512] + safe_crop_coverage: 0.9 + - class_path: viscy_transforms.BatchedCenterSpatialCropd + init_args: + keys: [source, target] + roi_size: [8, 512, 512] + - class_path: viscy_transforms.BatchedRandAdjustContrastd + init_args: + keys: [source] + prob: 0.5 + gamma: [0.8, 1.2] + - class_path: viscy_transforms.BatchedRandScaleIntensityd + init_args: + keys: [source] + prob: 0.5 + factors: 0.5 + - class_path: viscy_transforms.BatchedRandGaussianNoised + init_args: + keys: [source] + prob: 0.5 + mean: 0.0 + std: 0.3 + - class_path: viscy_transforms.BatchedRandGaussianSmoothd + init_args: + keys: [source] + prob: 0.5 + sigma_x: [0.25, 0.75] + sigma_y: [0.25, 0.75] + sigma_z: [0.25, 0.75] + val_gpu_augmentations: + - class_path: viscy_transforms.BatchedCenterSpatialCropd + init_args: + keys: [source, target] + roi_size: [8, 512, 512] + +data: + class_path: viscy_data.BatchedConcatDataModule + init_args: + data_modules: + # ipsc_confocal — aics-hipsc SEC61B train store + - class_path: 
viscy_data.hcs.HCSDataModule + init_args: + <<: *hcs_init_args + data_path: /hpc/projects/virtual_staining/training/dynacell/ipsc/dataset_v4/train/SEC61B.zarr + # a549_mantis — pooled SEC61B all-conditions train store (mantis_v1/train/SEC61B_all.zarr) + - class_path: viscy_data.hcs.HCSDataModule + init_args: + <<: *hcs_init_args + data_path: /hpc/projects/virtual_staining/training/dynacell/a549/mantis_v1/train/SEC61B_all.zarr + +launcher: + job_name: CELLDiff_JOINT_SEC61B + run_root: /hpc/projects/comp.micro/virtual_staining/models/cell_diff_vs_viscy/joint_ipsc_confocal_a549_mantis/sec61b/celldiff + # Joint preloads two stores (iPSC + A549 pool) into /dev/shm; the default + # 256G cap is too tight (256G iPSC mem + ~50G A549 + worker peak OOMs). + # 512G is the smallest tier that fits joint preload + worker overhead. + sbatch: + mem: "512G" diff --git a/applications/dynacell/configs/benchmarks/virtual_staining/er/celldiff/joint_ipsc_confocal_a549_mantis/train_smoke.yml b/applications/dynacell/configs/benchmarks/virtual_staining/er/celldiff/joint_ipsc_confocal_a549_mantis/train_smoke.yml new file mode 100644 index 000000000..8ab940773 --- /dev/null +++ b/applications/dynacell/configs/benchmarks/virtual_staining/er/celldiff/joint_ipsc_confocal_a549_mantis/train_smoke.yml @@ -0,0 +1,145 @@ +# Joint smoke variant of train.yml — small iPSC zarr, single H200, 30-min wall. +# +# Purpose: validate joint compose / instantiate / training-loop end-to-end +# without the 250 GB+ mmap_preload staging that blows a smoke wall. Pair +# this leaf with `--override trainer.fast_dev_run=true` (or +# `--override trainer.max_steps=5`) at submit time to bound the run. +# +# Why a sibling leaf rather than --override at submit time: dotlist / +# bracket syntax (`data.init_args.data_modules.0.init_args.data_path=...`) +# does not index into list elements via submit_benchmark_job.py's override +# parser. Pre-swapping data_paths in a sibling leaf is the supported fix. 
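The override-parser limitation motivating this sibling leaf can be modeled with a toy dotted-key resolver. This is an assumed minimal model of the behavior described in the comment, not the actual code in submit_benchmark_job.py:

```python
# Hypothetical sketch: a parser that only descends through dict keys has
# nowhere to land once the path reaches a list, so an override like
# `data_modules.0.init_args.data_path` cannot be applied.
def apply_override(config: dict, dotted_key: str, value) -> None:
    *path, leaf = dotted_key.split(".")
    node = config
    for part in path:
        if not isinstance(node, dict) or part not in node:
            raise KeyError(f"cannot descend into {part!r}: lists are not indexable")
        node = node[part]
    node[leaf] = value

cfg = {"data": {"init_args": {"data_modules": [{"init_args": {}}]}}}

apply_override(cfg, "data.init_args.batch_size", 1)  # plain dict path works
try:
    apply_override(cfg, "data.init_args.data_modules.0.init_args.data_path", "x")
    indexed = True
except KeyError:
    indexed = False  # list element unreachable — hence the sibling smoke leaf
assert indexed is False
```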
+base: + - ../../../_internal/shared/model/model_overlays/celldiff_fit.yml + - ../../../_internal/shared/model/launcher_profiles/mode_fit.yml + - ../../../_internal/shared/model/launcher_profiles/hardware_h200_single.yml + - ../../../_internal/shared/model/launcher_profiles/wall_smoke.yml + - ../../../_internal/shared/model/launcher_profiles/runtime_shared.yml + +benchmark: + task: virtual_staining + organelle: er + gene: SEC61B + target: er + target_id: er_sec61b + train_set: joint_ipsc_confocal_a549_mantis + model_name: celldiff + experiment_id: er__joint_ipsc_confocal_a549_mantis__celldiff__smoke + +trainer: + # Smoke runs don't need a logger. `false` disables the recipe's WandbLogger + # so consumers don't have to remember --override trainer.logger=false. + # LearningRateMonitor (recipe default) raises without a logger, so the + # callbacks list is replaced with only ModelCheckpoint (lists replace + # wholesale under deep_merge). + logger: false + callbacks: + - class_path: lightning.pytorch.callbacks.ModelCheckpoint + init_args: + every_n_epochs: 1 + save_top_k: -1 + save_last: true + dirpath: /hpc/projects/comp.micro/virtual_staining/models/cell_diff_vs_viscy/joint_ipsc_confocal_a549_mantis/sec61b/celldiff/smoke/checkpoints + +# `_`-prefixed top-level keys are stripped by load_composed_config; see +# train.yml in this directory for the full anchor-convention rationale. +_hcs_init_args: &hcs_init_args + source_channel: [Phase3D] + target_channel: [Structure] + z_window_size: 13 + # batch_size=1 (vs train.yml's 4) so the smoke fits a single H200. + # train.yml's heavier hparams can OOM on one GPU because per-step memory + # scales with batch_size * num_samples patches at [8, 512, 512]; scaling + # batch_size alone keeps patch shape identical to train.yml so the + # validation is apples-to-apples.
+ batch_size: 1 + num_workers: 4 + yx_patch_size: [512, 512] + split_ratio: 0.8 + mmap_preload: true + scratch_dir: /dev/shm + persistent_workers: true + normalizations: + - class_path: viscy_transforms.NormalizeSampled + init_args: + keys: [Phase3D] + level: fov_statistics + subtrahend: mean + divisor: std + - class_path: viscy_transforms.NormalizeSampled + init_args: + keys: [Structure] + level: fov_statistics + subtrahend: median + divisor: iqr + augmentations: + - class_path: viscy_transforms.RandWeightedCropd + init_args: + keys: [Phase3D, Structure] + w_key: Structure + spatial_size: [13, 624, 624] + # num_samples=1 (vs train.yml's 2) — HCSDataModule requires + # batch_size % num_samples == 0 and the smoke uses batch_size=1. + num_samples: 1 + gpu_augmentations: + - class_path: viscy_transforms.BatchedRandAffined + init_args: + keys: [source, target] + prob: 0.8 + rotate_range: [3.14, 0, 0] + shear_range: [0.0, 0.05, 0.05] + scale_range: [[0.7, 1.3], [0.5, 1.5], [0.5, 1.5]] + safe_crop_size: [8, 512, 512] + safe_crop_coverage: 0.9 + - class_path: viscy_transforms.BatchedCenterSpatialCropd + init_args: + keys: [source, target] + roi_size: [8, 512, 512] + - class_path: viscy_transforms.BatchedRandAdjustContrastd + init_args: + keys: [source] + prob: 0.5 + gamma: [0.8, 1.2] + - class_path: viscy_transforms.BatchedRandScaleIntensityd + init_args: + keys: [source] + prob: 0.5 + factors: 0.5 + - class_path: viscy_transforms.BatchedRandGaussianNoised + init_args: + keys: [source] + prob: 0.5 + mean: 0.0 + std: 0.3 + - class_path: viscy_transforms.BatchedRandGaussianSmoothd + init_args: + keys: [source] + prob: 0.5 + sigma_x: [0.25, 0.75] + sigma_y: [0.25, 0.75] + sigma_z: [0.25, 0.75] + val_gpu_augmentations: + - class_path: viscy_transforms.BatchedCenterSpatialCropd + init_args: + keys: [source, target] + roi_size: [8, 512, 512] + +data: + class_path: viscy_data.BatchedConcatDataModule + init_args: + data_modules: + # ipsc_confocal — aics-hipsc SEC61B test48 zarr 
(48 FOVs, smoke-sized). + - class_path: viscy_data.hcs.HCSDataModule + init_args: + <<: *hcs_init_args + data_path: /hpc/projects/virtual_staining/training/dynacell/ipsc/dataset_v4/train/SEC61B_test48.zarr + # a549_mantis — 2024_11_07 SEC61B train store. Already 4 FOVs, no + # smoke variant needed. + - class_path: viscy_data.hcs.HCSDataModule + init_args: + <<: *hcs_init_args + data_path: /hpc/projects/virtual_staining/training/dynacell/a549/mantis_v1/train/SEC61B_all.zarr + +launcher: + job_name: CELLDiff_JOINT_SEC61B_SMOKE + run_root: /hpc/projects/comp.micro/virtual_staining/models/cell_diff_vs_viscy/joint_ipsc_confocal_a549_mantis/sec61b/celldiff/smoke diff --git a/applications/dynacell/configs/benchmarks/virtual_staining/er/celldiff/joint_ipsc_confocal_a549_mantis/train_smoke_4gpu.yml b/applications/dynacell/configs/benchmarks/virtual_staining/er/celldiff/joint_ipsc_confocal_a549_mantis/train_smoke_4gpu.yml new file mode 100644 index 000000000..b94fd2b43 --- /dev/null +++ b/applications/dynacell/configs/benchmarks/virtual_staining/er/celldiff/joint_ipsc_confocal_a549_mantis/train_smoke_4gpu.yml @@ -0,0 +1,162 @@ +# 4-GPU DDP smoke variant of train.yml — small zarrs, 4 H200s, 30-min wall. +# +# Purpose: validate `BatchedConcatDataModule` + `ShardedDistributedSampler` +# integration on the real DDP topology used in production. The single-GPU +# `train_smoke.yml` already proved the joint loader and training/val loops +# work end-to-end; this leaf isolates the *sharding* behavior — that each +# rank pulls a disjoint slice of the joint dataset and the sampler attaches +# automatically once `torch.distributed` is initialized. +# +# Why a sibling leaf rather than --override on train.yml: train.yml points +# at the full 423-FOV iPSC SEC61B store, which `mmap_preload` stages to +# /dev/shm in 45+ min — blows the 30-min smoke wall before the first step. +# We swap the iPSC data_path to its `_test48` companion (48 FOVs, ~24 GB) +# so staging finishes in under a minute. 
submit_benchmark_job.py's --override +# parser cannot index into list elements (`data_modules.0` / `data_modules[0]` +# both fail), so pre-swapping in a sibling leaf is the supported fix — same +# rationale as train_smoke.yml. +# +# Why batch_size=1 / num_samples=1: matches train_smoke.yml. The point of +# this smoke is "does the sampler shard the joint dataset across ranks", +# not "does train.yml's heavier per-rank hparams (batch=4, num_samples=2) +# fit on H200". Validating sharding at small batch isolates the question; +# memory tuning is a follow-up smoke if needed. +# +# Why max_steps is baked in (not --override): the wall_smoke.yml docstring +# explicitly says to bound the run, and we just spent a 30-min wall on +# train_smoke.yml because we forgot the override at submit time. Bake it +# in so the leaf is self-bounded. +base: + - ../../../_internal/shared/model/model_overlays/celldiff_fit.yml + - ../../../_internal/shared/model/launcher_profiles/mode_fit.yml + - ../../../_internal/shared/model/launcher_profiles/hardware_4gpu.yml + - ../../../_internal/shared/model/launcher_profiles/wall_smoke.yml + - ../../../_internal/shared/model/launcher_profiles/runtime_shared.yml + +benchmark: + task: virtual_staining + organelle: er + gene: SEC61B + target: er + target_id: er_sec61b + train_set: joint_ipsc_confocal_a549_mantis + model_name: celldiff + experiment_id: er__joint_ipsc_confocal_a549_mantis__celldiff__smoke_4gpu + +trainer: + # Override the single_gpu topology pulled in by model_overlays/celldiff_fit.yml. + strategy: ddp + devices: 4 + # Bound the run so the leaf is self-contained (see header). + max_steps: 5 + # Smoke runs don't need a logger. `false` disables the recipe's WandbLogger + # so consumers don't have to remember --override trainer.logger=false. + # LearningRateMonitor (recipe default) raises without a logger, so the + # callbacks list is replaced with only ModelCheckpoint (lists replace + # wholesale under deep_merge). 
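The merge semantics the comment above relies on (dicts merge recursively, lists replace wholesale) can be illustrated with a toy `deep_merge`. The real implementation lives in the dynacell compose layer; treat this as an assumed model, not the actual function:

```python
# Sketch (assumed) of deep_merge: nested dicts merge key-by-key, while any
# non-dict value — including lists such as trainer.callbacks — replaces the
# base value wholesale. That is why overriding callbacks here drops the
# recipe's LearningRateMonitor instead of appending to it.
def deep_merge(base: dict, overlay: dict) -> dict:
    merged = dict(base)
    for key, value in overlay.items():
        if isinstance(value, dict) and isinstance(merged.get(key), dict):
            merged[key] = deep_merge(merged[key], value)
        else:
            merged[key] = value  # lists replace wholesale
    return merged

recipe = {"trainer": {"callbacks": ["LearningRateMonitor", "ModelCheckpoint"]}}
leaf = {"trainer": {"callbacks": ["ModelCheckpoint"], "logger": False}}
out = deep_merge(recipe, leaf)
assert out["trainer"]["callbacks"] == ["ModelCheckpoint"]
assert out["trainer"]["logger"] is False
```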
+ logger: false + callbacks: + - class_path: lightning.pytorch.callbacks.ModelCheckpoint + init_args: + every_n_epochs: 1 + save_top_k: -1 + save_last: true + dirpath: /hpc/projects/comp.micro/virtual_staining/models/cell_diff_vs_viscy/joint_ipsc_confocal_a549_mantis/sec61b/celldiff/smoke_4gpu/checkpoints + +# `_`-prefixed top-level keys are stripped by load_composed_config; see +# train.yml in this directory for the full anchor-convention rationale. +_hcs_init_args: &hcs_init_args + source_channel: [Phase3D] + target_channel: [Structure] + z_window_size: 13 + # See header — kept at 1 to isolate sharding from per-rank memory tuning. + batch_size: 1 + num_workers: 4 + yx_patch_size: [512, 512] + split_ratio: 0.8 + mmap_preload: true + scratch_dir: /dev/shm + persistent_workers: true + normalizations: + - class_path: viscy_transforms.NormalizeSampled + init_args: + keys: [Phase3D] + level: fov_statistics + subtrahend: mean + divisor: std + - class_path: viscy_transforms.NormalizeSampled + init_args: + keys: [Structure] + level: fov_statistics + subtrahend: median + divisor: iqr + augmentations: + - class_path: viscy_transforms.RandWeightedCropd + init_args: + keys: [Phase3D, Structure] + w_key: Structure + spatial_size: [13, 624, 624] + # Must satisfy batch_size % num_samples == 0; batch_size=1 forces 1. 
+ num_samples: 1 + gpu_augmentations: + - class_path: viscy_transforms.BatchedRandAffined + init_args: + keys: [source, target] + prob: 0.8 + rotate_range: [3.14, 0, 0] + shear_range: [0.0, 0.05, 0.05] + scale_range: [[0.7, 1.3], [0.5, 1.5], [0.5, 1.5]] + safe_crop_size: [8, 512, 512] + safe_crop_coverage: 0.9 + - class_path: viscy_transforms.BatchedCenterSpatialCropd + init_args: + keys: [source, target] + roi_size: [8, 512, 512] + - class_path: viscy_transforms.BatchedRandAdjustContrastd + init_args: + keys: [source] + prob: 0.5 + gamma: [0.8, 1.2] + - class_path: viscy_transforms.BatchedRandScaleIntensityd + init_args: + keys: [source] + prob: 0.5 + factors: 0.5 + - class_path: viscy_transforms.BatchedRandGaussianNoised + init_args: + keys: [source] + prob: 0.5 + mean: 0.0 + std: 0.3 + - class_path: viscy_transforms.BatchedRandGaussianSmoothd + init_args: + keys: [source] + prob: 0.5 + sigma_x: [0.25, 0.75] + sigma_y: [0.25, 0.75] + sigma_z: [0.25, 0.75] + val_gpu_augmentations: + - class_path: viscy_transforms.BatchedCenterSpatialCropd + init_args: + keys: [source, target] + roi_size: [8, 512, 512] + +data: + class_path: viscy_data.BatchedConcatDataModule + init_args: + data_modules: + # ipsc_confocal — aics-hipsc SEC61B test48 zarr (48 FOVs, smoke-sized). + - class_path: viscy_data.hcs.HCSDataModule + init_args: + <<: *hcs_init_args + data_path: /hpc/projects/virtual_staining/training/dynacell/ipsc/dataset_v4/train/SEC61B_test48.zarr + # a549_mantis — 2024_11_07 SEC61B train store. Already 4 FOVs, no + # smoke variant needed. 
+ - class_path: viscy_data.hcs.HCSDataModule + init_args: + <<: *hcs_init_args + data_path: /hpc/projects/virtual_staining/training/dynacell/a549/mantis_v1/train/SEC61B_all.zarr + +launcher: + job_name: CELLDiff_JOINT_SEC61B_SMOKE_4GPU + run_root: /hpc/projects/comp.micro/virtual_staining/models/cell_diff_vs_viscy/joint_ipsc_confocal_a549_mantis/sec61b/celldiff/smoke_4gpu diff --git a/applications/dynacell/configs/benchmarks/virtual_staining/er/fcmae_vscyto3d_pretrained/a549_mantis/predict__a549_mantis_denv.yml b/applications/dynacell/configs/benchmarks/virtual_staining/er/fcmae_vscyto3d_pretrained/a549_mantis/predict__a549_mantis_denv.yml new file mode 100644 index 000000000..19d2e2acc --- /dev/null +++ b/applications/dynacell/configs/benchmarks/virtual_staining/er/fcmae_vscyto3d_pretrained/a549_mantis/predict__a549_mantis_denv.yml @@ -0,0 +1,45 @@ +# FCMAE_VSCyto3D_Pretrained (VSCyto3D) predict: ER trained on a549_mantis (sec61b), +# predicting against a549-mantis-sec61b-denv test. +# Best val-loss checkpoint from job 31910356 (epoch 132, loss/validate=0.5716). +# Both iPSC and a549 manifests use `sec61b` for the ER target, so no +# dataset_ref override is needed (targets/er_sec61b.yml already sets it). 
+base: + - ../../../_internal/shared/model/predict_sets/a549_mantis_sec61b_denv.yml + - ../../../_internal/shared/model/targets/er_sec61b.yml + - ../../../_internal/shared/model/model_overlays/fcmae_vscyto3d_predict.yml + - ../../../_internal/shared/model/launcher_profiles/mode_predict.yml + - ../../../_internal/shared/model/launcher_profiles/hardware_h200_single.yml + - ../../../_internal/shared/model/launcher_profiles/runtime_shared.yml + +benchmark: + task: virtual_staining + organelle: er + trained_on: a549_mantis + predict_set: a549_mantis_sec61b_denv + model_name: fcmae_vscyto3d_pretrained + experiment_id: er__a549_mantis__fcmae_vscyto3d_pretrained__a549_mantis_sec61b_denv + +model: + init_args: + ckpt_path: /hpc/projects/comp.micro/virtual_staining/models/dynacell/a549_mantis/sec61b/fcmae_vscyto3d_pretrained_ws8500/checkpoints/epoch=132-step=22876.ckpt + +data: + init_args: + normalizations: + - class_path: viscy_transforms.NormalizeSampled + init_args: + keys: [Phase3D] + level: fov_statistics + subtrahend: mean + divisor: std + augmentations: [] + +trainer: + callbacks: + - class_path: viscy_utils.callbacks.prediction_writer.HCSPredictionWriter + init_args: + output_store: /hpc/projects/virtual_staining/training/dynacell/a549/predictions/sec61b_fcmae_vscyto3d_pretrained_a549trained_denv.zarr + +launcher: + job_name: FCMAE_VSCyto3D_Pretrained_PRED_SEC61B_A549TR_DENV + run_root: /hpc/projects/virtual_staining/training/dynacell/a549/predictions diff --git a/applications/dynacell/configs/benchmarks/virtual_staining/er/fcmae_vscyto3d_pretrained/a549_mantis/predict__a549_mantis_mock.yml b/applications/dynacell/configs/benchmarks/virtual_staining/er/fcmae_vscyto3d_pretrained/a549_mantis/predict__a549_mantis_mock.yml new file mode 100644 index 000000000..30fbe3de9 --- /dev/null +++ b/applications/dynacell/configs/benchmarks/virtual_staining/er/fcmae_vscyto3d_pretrained/a549_mantis/predict__a549_mantis_mock.yml @@ -0,0 +1,45 @@ +# FCMAE_VSCyto3D_Pretrained 
(VSCyto3D) predict: ER trained on a549_mantis (sec61b), +# predicting against a549-mantis-sec61b-mock test. +# Best val-loss checkpoint from job 31910356 (epoch 132, loss/validate=0.5716). +# Both iPSC and a549 manifests use `sec61b` for the ER target, so no +# dataset_ref override is needed (targets/er_sec61b.yml already sets it). +base: + - ../../../_internal/shared/model/predict_sets/a549_mantis_sec61b_mock.yml + - ../../../_internal/shared/model/targets/er_sec61b.yml + - ../../../_internal/shared/model/model_overlays/fcmae_vscyto3d_predict.yml + - ../../../_internal/shared/model/launcher_profiles/mode_predict.yml + - ../../../_internal/shared/model/launcher_profiles/hardware_h200_single.yml + - ../../../_internal/shared/model/launcher_profiles/runtime_shared.yml + +benchmark: + task: virtual_staining + organelle: er + trained_on: a549_mantis + predict_set: a549_mantis_sec61b_mock + model_name: fcmae_vscyto3d_pretrained + experiment_id: er__a549_mantis__fcmae_vscyto3d_pretrained__a549_mantis_sec61b_mock + +model: + init_args: + ckpt_path: /hpc/projects/comp.micro/virtual_staining/models/dynacell/a549_mantis/sec61b/fcmae_vscyto3d_pretrained_ws8500/checkpoints/epoch=132-step=22876.ckpt + +data: + init_args: + normalizations: + - class_path: viscy_transforms.NormalizeSampled + init_args: + keys: [Phase3D] + level: fov_statistics + subtrahend: mean + divisor: std + augmentations: [] + +trainer: + callbacks: + - class_path: viscy_utils.callbacks.prediction_writer.HCSPredictionWriter + init_args: + output_store: /hpc/projects/virtual_staining/training/dynacell/a549/predictions/sec61b_fcmae_vscyto3d_pretrained_a549trained_mock.zarr + +launcher: + job_name: FCMAE_VSCyto3D_Pretrained_PRED_SEC61B_A549TR_MOCK + run_root: /hpc/projects/virtual_staining/training/dynacell/a549/predictions diff --git a/applications/dynacell/configs/benchmarks/virtual_staining/er/fcmae_vscyto3d_pretrained/a549_mantis/predict__a549_mantis_zikv.yml 
b/applications/dynacell/configs/benchmarks/virtual_staining/er/fcmae_vscyto3d_pretrained/a549_mantis/predict__a549_mantis_zikv.yml new file mode 100644 index 000000000..63b057c18 --- /dev/null +++ b/applications/dynacell/configs/benchmarks/virtual_staining/er/fcmae_vscyto3d_pretrained/a549_mantis/predict__a549_mantis_zikv.yml @@ -0,0 +1,45 @@ +# FCMAE_VSCyto3D_Pretrained (VSCyto3D) predict: ER trained on a549_mantis (sec61b), +# predicting against a549-mantis-sec61b-zikv test. +# Best val-loss checkpoint from job 31910356 (epoch 132, loss/validate=0.5716). +# Both iPSC and a549 manifests use `sec61b` for the ER target, so no +# dataset_ref override is needed (targets/er_sec61b.yml already sets it). +base: + - ../../../_internal/shared/model/predict_sets/a549_mantis_sec61b_zikv.yml + - ../../../_internal/shared/model/targets/er_sec61b.yml + - ../../../_internal/shared/model/model_overlays/fcmae_vscyto3d_predict.yml + - ../../../_internal/shared/model/launcher_profiles/mode_predict.yml + - ../../../_internal/shared/model/launcher_profiles/hardware_h200_single.yml + - ../../../_internal/shared/model/launcher_profiles/runtime_shared.yml + +benchmark: + task: virtual_staining + organelle: er + trained_on: a549_mantis + predict_set: a549_mantis_sec61b_zikv + model_name: fcmae_vscyto3d_pretrained + experiment_id: er__a549_mantis__fcmae_vscyto3d_pretrained__a549_mantis_sec61b_zikv + +model: + init_args: + ckpt_path: /hpc/projects/comp.micro/virtual_staining/models/dynacell/a549_mantis/sec61b/fcmae_vscyto3d_pretrained_ws8500/checkpoints/epoch=132-step=22876.ckpt + +data: + init_args: + normalizations: + - class_path: viscy_transforms.NormalizeSampled + init_args: + keys: [Phase3D] + level: fov_statistics + subtrahend: mean + divisor: std + augmentations: [] + +trainer: + callbacks: + - class_path: viscy_utils.callbacks.prediction_writer.HCSPredictionWriter + init_args: + output_store: 
/hpc/projects/virtual_staining/training/dynacell/a549/predictions/sec61b_fcmae_vscyto3d_pretrained_a549trained_zikv.zarr + +launcher: + job_name: FCMAE_VSCyto3D_Pretrained_PRED_SEC61B_A549TR_ZIKV + run_root: /hpc/projects/virtual_staining/training/dynacell/a549/predictions diff --git a/applications/dynacell/configs/benchmarks/virtual_staining/er/fcmae_vscyto3d_pretrained/a549_mantis/predict__ipsc_confocal.yml b/applications/dynacell/configs/benchmarks/virtual_staining/er/fcmae_vscyto3d_pretrained/a549_mantis/predict__ipsc_confocal.yml new file mode 100644 index 000000000..9768e6e51 --- /dev/null +++ b/applications/dynacell/configs/benchmarks/virtual_staining/er/fcmae_vscyto3d_pretrained/a549_mantis/predict__ipsc_confocal.yml @@ -0,0 +1,43 @@ +# FCMAE_VSCyto3D_Pretrained (VSCyto3D) predict: ER trained on a549_mantis (sec61b), +# predicting against ipsc_confocal test_cropped. +# Best val-loss checkpoint from job 31910356 (epoch 132, loss/validate=0.5716). +base: + - ../../../_internal/shared/model/predict_sets/ipsc_confocal.yml + - ../../../_internal/shared/model/targets/er_sec61b.yml + - ../../../_internal/shared/model/model_overlays/fcmae_vscyto3d_predict.yml + - ../../../_internal/shared/model/launcher_profiles/mode_predict.yml + - ../../../_internal/shared/model/launcher_profiles/hardware_h200_single.yml + - ../../../_internal/shared/model/launcher_profiles/runtime_shared.yml + +benchmark: + task: virtual_staining + organelle: er + trained_on: a549_mantis + predict_set: ipsc_confocal + model_name: fcmae_vscyto3d_pretrained + experiment_id: er__a549_mantis__fcmae_vscyto3d_pretrained__ipsc_confocal + +model: + init_args: + ckpt_path: /hpc/projects/comp.micro/virtual_staining/models/dynacell/a549_mantis/sec61b/fcmae_vscyto3d_pretrained_ws8500/checkpoints/epoch=132-step=22876.ckpt + +data: + init_args: + normalizations: + - class_path: viscy_transforms.NormalizeSampled + init_args: + keys: [Phase3D] + level: fov_statistics + subtrahend: mean + divisor: std + 
augmentations: [] + +trainer: + callbacks: + - class_path: viscy_utils.callbacks.prediction_writer.HCSPredictionWriter + init_args: + output_store: /hpc/projects/virtual_staining/training/dynacell/ipsc/predictions/sec61b_fcmae_vscyto3d_pretrained_a549trained.zarr + +launcher: + job_name: FCMAE_VSCyto3D_Pretrained_PRED_SEC61B_A549TR_IPSC + run_root: /hpc/projects/virtual_staining/training/dynacell/ipsc/predictions diff --git a/applications/dynacell/configs/benchmarks/virtual_staining/er/fcmae_vscyto3d_pretrained/a549_mantis/train.yml b/applications/dynacell/configs/benchmarks/virtual_staining/er/fcmae_vscyto3d_pretrained/a549_mantis/train.yml new file mode 100644 index 000000000..c58b0ecd5 --- /dev/null +++ b/applications/dynacell/configs/benchmarks/virtual_staining/er/fcmae_vscyto3d_pretrained/a549_mantis/train.yml @@ -0,0 +1,55 @@ +# FCMAE-class (FullyConvolutionalMAE, pretraining=False) with FCMAE- +# pretrained encoder init on ER/SEC61B. Companion to +# fcmae_vscyto3d_scratch.yml — the two leaves are identical except this +# one loads encoder weights from the published VSCyto3D FCMAE ckpt +# (400 ep on HEK + A549 + iPSC phase data). See vs_test/finetune_3d.py +# for the canonical recipe. 
+base: + - ../../../_internal/shared/model/train_sets/a549_mantis.yml + - ../../../_internal/shared/model/targets/er_sec61b.yml + - ../../../_internal/shared/model/data_overlays/fcmae_vscyto3d_fit.yml + - ../../../_internal/shared/model/model_overlays/fcmae_vscyto3d_fit.yml + - ../../../_internal/shared/model/launcher_profiles/mode_fit.yml + - ../../../_internal/shared/model/launcher_profiles/hardware_4gpu.yml + - ../../../_internal/shared/model/launcher_profiles/runtime_shared.yml + +benchmark: + task: virtual_staining + organelle: er + train_set: a549_mantis + model_name: fcmae_vscyto3d_pretrained + experiment_id: er__a549_mantis__fcmae_vscyto3d_pretrained + +model: + init_args: + # Load only the encoder from the canonical VSCyto3D FCMAE ckpt — + # decoder/head stay at fresh init. Matches vs_test/finetune_3d.py:247. + encoder_only: true + ckpt_path: /hpc/projects/virtual_staining/models/mehta-lab/VSCyto3D/fcmae.ckpt + +trainer: + logger: + init_args: + name: FCMAE_VSCyto3D_Pretrained_A549_SEC61B_ws8500 + save_dir: /hpc/projects/comp.micro/virtual_staining/models/dynacell/a549_mantis/sec61b/fcmae_vscyto3d_pretrained_ws8500 + callbacks: + - class_path: lightning.pytorch.callbacks.LearningRateMonitor + init_args: + logging_interval: step + - class_path: lightning.pytorch.callbacks.ModelCheckpoint + init_args: + monitor: loss/validate + every_n_epochs: 1 + save_top_k: 5 + save_last: true + dirpath: /hpc/projects/comp.micro/virtual_staining/models/dynacell/a549_mantis/sec61b/fcmae_vscyto3d_pretrained_ws8500/checkpoints + +data: + init_args: + # A549 pooled store + target_channel — no resolver in this train_set. 
+ target_channel: Structure + data_path: /hpc/projects/virtual_staining/training/dynacell/a549/mantis_v1/train/SEC61B_all.zarr + +launcher: + job_name: FCMAE_VSCyto3D_Pretrained_A549_SEC61B_ws8500 + run_root: /hpc/projects/comp.micro/virtual_staining/models/dynacell/a549_mantis/sec61b/fcmae_vscyto3d_pretrained_ws8500 diff --git a/applications/dynacell/configs/benchmarks/virtual_staining/er/fcmae_vscyto3d_pretrained/ipsc_confocal/eval__a549_mantis_denv.yaml b/applications/dynacell/configs/benchmarks/virtual_staining/er/fcmae_vscyto3d_pretrained/ipsc_confocal/eval__a549_mantis_denv.yaml new file mode 100644 index 000000000..f55c63251 --- /dev/null +++ b/applications/dynacell/configs/benchmarks/virtual_staining/er/fcmae_vscyto3d_pretrained/ipsc_confocal/eval__a549_mantis_denv.yaml @@ -0,0 +1,15 @@ +# @package _global_ +# Benchmark eval leaf: ER (SEC61B) predicted by FCMAE_VSCyto3D_Pretrained on a549-mantis-sec61b-denv. +defaults: + - override /target: er_sec61b + - override /predict_set: a549_mantis_sec61b_denv + +io: + pred_path: /hpc/projects/virtual_staining/training/dynacell/a549/predictions/sec61b_fcmae_vscyto3d_pretrained__sec61b_denv.zarr + +# A549 manifests don't carry cell_segmentation paths (no segmentation +# pipeline yet). Skip feature metrics until segmentation lands. 
+compute_feature_metrics: false + +save: + save_dir: /hpc/projects/virtual_staining/training/dynacell/a549/predictions/eval_sec61b_fcmae_vscyto3d_pretrained__sec61b_denv diff --git a/applications/dynacell/configs/benchmarks/virtual_staining/er/fcmae_vscyto3d_pretrained/ipsc_confocal/eval__a549_mantis_mock.yaml b/applications/dynacell/configs/benchmarks/virtual_staining/er/fcmae_vscyto3d_pretrained/ipsc_confocal/eval__a549_mantis_mock.yaml new file mode 100644 index 000000000..29845f09c --- /dev/null +++ b/applications/dynacell/configs/benchmarks/virtual_staining/er/fcmae_vscyto3d_pretrained/ipsc_confocal/eval__a549_mantis_mock.yaml @@ -0,0 +1,15 @@ +# @package _global_ +# Benchmark eval leaf: ER (SEC61B) predicted by FCMAE_VSCyto3D_Pretrained on a549-mantis-sec61b-mock. +defaults: + - override /target: er_sec61b + - override /predict_set: a549_mantis_sec61b_mock + +io: + pred_path: /hpc/projects/virtual_staining/training/dynacell/a549/predictions/sec61b_fcmae_vscyto3d_pretrained__sec61b_mock.zarr + +# A549 manifests don't carry cell_segmentation paths (no segmentation +# pipeline yet). Skip feature metrics until segmentation lands. +compute_feature_metrics: false + +save: + save_dir: /hpc/projects/virtual_staining/training/dynacell/a549/predictions/eval_sec61b_fcmae_vscyto3d_pretrained__sec61b_mock diff --git a/applications/dynacell/configs/benchmarks/virtual_staining/er/fcmae_vscyto3d_pretrained/ipsc_confocal/eval__a549_mantis_zikv.yaml b/applications/dynacell/configs/benchmarks/virtual_staining/er/fcmae_vscyto3d_pretrained/ipsc_confocal/eval__a549_mantis_zikv.yaml new file mode 100644 index 000000000..24c65d717 --- /dev/null +++ b/applications/dynacell/configs/benchmarks/virtual_staining/er/fcmae_vscyto3d_pretrained/ipsc_confocal/eval__a549_mantis_zikv.yaml @@ -0,0 +1,15 @@ +# @package _global_ +# Benchmark eval leaf: ER (SEC61B) predicted by FCMAE_VSCyto3D_Pretrained on a549-mantis-sec61b-zikv. 
+defaults: + - override /target: er_sec61b + - override /predict_set: a549_mantis_sec61b_zikv + +io: + pred_path: /hpc/projects/virtual_staining/training/dynacell/a549/predictions/sec61b_fcmae_vscyto3d_pretrained__sec61b_zikv.zarr + +# A549 manifests don't carry cell_segmentation paths (no segmentation +# pipeline yet). Skip feature metrics until segmentation lands. +compute_feature_metrics: false + +save: + save_dir: /hpc/projects/virtual_staining/training/dynacell/a549/predictions/eval_sec61b_fcmae_vscyto3d_pretrained__sec61b_zikv diff --git a/applications/dynacell/configs/benchmarks/virtual_staining/er/fcmae_vscyto3d_pretrained/ipsc_confocal/predict__a549_mantis_denv.yml b/applications/dynacell/configs/benchmarks/virtual_staining/er/fcmae_vscyto3d_pretrained/ipsc_confocal/predict__a549_mantis_denv.yml new file mode 100644 index 000000000..ad511fb70 --- /dev/null +++ b/applications/dynacell/configs/benchmarks/virtual_staining/er/fcmae_vscyto3d_pretrained/ipsc_confocal/predict__a549_mantis_denv.yml @@ -0,0 +1,50 @@ +# FCMAE_VSCyto3D_Pretrained predict: ER (SEC61B) trained on iPSC, +# predicting against a549_mantis_sec61b_denv test. +# +# TODO: replace ckpt_path once iPSC FCMAE pretrained ER training +# completes. 
Expected output (per fit leaf): +# /hpc/projects/comp.micro/virtual_staining/models/dynacell/ipsc/sec61b/fcmae_vscyto3d_pretrained_ws8500/checkpoints/last.ckpt +base: + - ../../../_internal/shared/model/predict_sets/a549_mantis_sec61b_denv.yml + - ../../../_internal/shared/model/targets/er_sec61b.yml + - ../../../_internal/shared/model/model_overlays/fcmae_vscyto3d_predict.yml + - ../../../_internal/shared/model/launcher_profiles/mode_predict.yml + - ../../../_internal/shared/model/launcher_profiles/hardware_h200_single.yml + - ../../../_internal/shared/model/launcher_profiles/runtime_shared.yml + +benchmark: + task: virtual_staining + organelle: er + trained_on: ipsc_confocal + predict_set: a549_mantis_sec61b_denv + model_name: fcmae_vscyto3d_pretrained + experiment_id: er__ipsc_confocal__fcmae_vscyto3d_pretrained__a549_mantis_sec61b_denv + +model: + init_args: + # Best checkpoint from J31523022 (FCMAE_VSCyto3D_Pretrained_iPSC_SEC61B_ws8500): + # ep 123 / val_loss 0.40979 (49-epoch plateau, scancelled at 3d 1h elapsed). + # Hardlink alias at run_root; the underlying checkpoints/epoch=123-step=32736.ckpt + # is also preserved in checkpoints_frozen_ep123_20260501_004946/. 
+ ckpt_path: /hpc/projects/comp.micro/virtual_staining/models/dynacell/ipsc/sec61b/fcmae_vscyto3d_pretrained_ws8500/best_ep123_val0.40979.ckpt + +data: + init_args: + normalizations: + - class_path: viscy_transforms.NormalizeSampled + init_args: + keys: [Phase3D] + level: fov_statistics + subtrahend: mean + divisor: std + augmentations: [] + +trainer: + callbacks: + - class_path: viscy_utils.callbacks.prediction_writer.HCSPredictionWriter + init_args: + output_store: /hpc/projects/virtual_staining/training/dynacell/a549/predictions/sec61b_fcmae_vscyto3d_pretrained__sec61b_denv.zarr + +launcher: + job_name: FCMAE_VSCyto3D_Pretrained_PRED_SEC61B_ON_A549_sec61b_denv + run_root: /hpc/projects/virtual_staining/training/dynacell/a549/predictions diff --git a/applications/dynacell/configs/benchmarks/virtual_staining/er/fcmae_vscyto3d_pretrained/ipsc_confocal/predict__a549_mantis_mock.yml b/applications/dynacell/configs/benchmarks/virtual_staining/er/fcmae_vscyto3d_pretrained/ipsc_confocal/predict__a549_mantis_mock.yml new file mode 100644 index 000000000..21102d217 --- /dev/null +++ b/applications/dynacell/configs/benchmarks/virtual_staining/er/fcmae_vscyto3d_pretrained/ipsc_confocal/predict__a549_mantis_mock.yml @@ -0,0 +1,50 @@ +# FCMAE_VSCyto3D_Pretrained predict: ER (SEC61B) trained on iPSC, +# predicting against a549_mantis_sec61b_mock test. +# +# ckpt_path below is pinned to the best-val checkpoint from J31523022 +# (see the model block).
Expected output (per fit leaf): +# /hpc/projects/comp.micro/virtual_staining/models/dynacell/ipsc/sec61b/fcmae_vscyto3d_pretrained_ws8500/checkpoints/last.ckpt +base: + - ../../../_internal/shared/model/predict_sets/a549_mantis_sec61b_mock.yml + - ../../../_internal/shared/model/targets/er_sec61b.yml + - ../../../_internal/shared/model/model_overlays/fcmae_vscyto3d_predict.yml + - ../../../_internal/shared/model/launcher_profiles/mode_predict.yml + - ../../../_internal/shared/model/launcher_profiles/hardware_h200_single.yml + - ../../../_internal/shared/model/launcher_profiles/runtime_shared.yml + +benchmark: + task: virtual_staining + organelle: er + trained_on: ipsc_confocal + predict_set: a549_mantis_sec61b_mock + model_name: fcmae_vscyto3d_pretrained + experiment_id: er__ipsc_confocal__fcmae_vscyto3d_pretrained__a549_mantis_sec61b_mock + +model: + init_args: + # Best checkpoint from J31523022 (FCMAE_VSCyto3D_Pretrained_iPSC_SEC61B_ws8500): + # ep 123 / val_loss 0.40979 (49-epoch plateau, scancelled at 3d 1h elapsed). + # Hardlink alias at run_root; the underlying checkpoints/epoch=123-step=32736.ckpt + # is also preserved in checkpoints_frozen_ep123_20260501_004946/. 
+ ckpt_path: /hpc/projects/comp.micro/virtual_staining/models/dynacell/ipsc/sec61b/fcmae_vscyto3d_pretrained_ws8500/best_ep123_val0.40979.ckpt + +data: + init_args: + normalizations: + - class_path: viscy_transforms.NormalizeSampled + init_args: + keys: [Phase3D] + level: fov_statistics + subtrahend: mean + divisor: std + augmentations: [] + +trainer: + callbacks: + - class_path: viscy_utils.callbacks.prediction_writer.HCSPredictionWriter + init_args: + output_store: /hpc/projects/virtual_staining/training/dynacell/a549/predictions/sec61b_fcmae_vscyto3d_pretrained__sec61b_mock.zarr + +launcher: + job_name: FCMAE_VSCyto3D_Pretrained_PRED_SEC61B_ON_A549_sec61b_mock + run_root: /hpc/projects/virtual_staining/training/dynacell/a549/predictions diff --git a/applications/dynacell/configs/benchmarks/virtual_staining/er/fcmae_vscyto3d_pretrained/ipsc_confocal/predict__a549_mantis_zikv.yml b/applications/dynacell/configs/benchmarks/virtual_staining/er/fcmae_vscyto3d_pretrained/ipsc_confocal/predict__a549_mantis_zikv.yml new file mode 100644 index 000000000..def035b7c --- /dev/null +++ b/applications/dynacell/configs/benchmarks/virtual_staining/er/fcmae_vscyto3d_pretrained/ipsc_confocal/predict__a549_mantis_zikv.yml @@ -0,0 +1,50 @@ +# FCMAE_VSCyto3D_Pretrained predict: ER (SEC61B) trained on iPSC, +# predicting against a549_mantis_sec61b_zikv test. +# +# ckpt_path below is pinned to the best-val checkpoint from J31523022 +# (see the model block).
Expected output (per fit leaf): +# /hpc/projects/comp.micro/virtual_staining/models/dynacell/ipsc/sec61b/fcmae_vscyto3d_pretrained_ws8500/checkpoints/last.ckpt +base: + - ../../../_internal/shared/model/predict_sets/a549_mantis_sec61b_zikv.yml + - ../../../_internal/shared/model/targets/er_sec61b.yml + - ../../../_internal/shared/model/model_overlays/fcmae_vscyto3d_predict.yml + - ../../../_internal/shared/model/launcher_profiles/mode_predict.yml + - ../../../_internal/shared/model/launcher_profiles/hardware_h200_single.yml + - ../../../_internal/shared/model/launcher_profiles/runtime_shared.yml + +benchmark: + task: virtual_staining + organelle: er + trained_on: ipsc_confocal + predict_set: a549_mantis_sec61b_zikv + model_name: fcmae_vscyto3d_pretrained + experiment_id: er__ipsc_confocal__fcmae_vscyto3d_pretrained__a549_mantis_sec61b_zikv + +model: + init_args: + # Best checkpoint from J31523022 (FCMAE_VSCyto3D_Pretrained_iPSC_SEC61B_ws8500): + # ep 123 / val_loss 0.40979 (49-epoch plateau, scancelled at 3d 1h elapsed). + # Hardlink alias at run_root; the underlying checkpoints/epoch=123-step=32736.ckpt + # is also preserved in checkpoints_frozen_ep123_20260501_004946/. 
+ ckpt_path: /hpc/projects/comp.micro/virtual_staining/models/dynacell/ipsc/sec61b/fcmae_vscyto3d_pretrained_ws8500/best_ep123_val0.40979.ckpt + +data: + init_args: + normalizations: + - class_path: viscy_transforms.NormalizeSampled + init_args: + keys: [Phase3D] + level: fov_statistics + subtrahend: mean + divisor: std + augmentations: [] + +trainer: + callbacks: + - class_path: viscy_utils.callbacks.prediction_writer.HCSPredictionWriter + init_args: + output_store: /hpc/projects/virtual_staining/training/dynacell/a549/predictions/sec61b_fcmae_vscyto3d_pretrained__sec61b_zikv.zarr + +launcher: + job_name: FCMAE_VSCyto3D_Pretrained_PRED_SEC61B_ON_A549_sec61b_zikv + run_root: /hpc/projects/virtual_staining/training/dynacell/a549/predictions diff --git a/applications/dynacell/configs/benchmarks/virtual_staining/er/fcmae_vscyto3d_pretrained/ipsc_confocal/predict__ipsc_confocal.yml b/applications/dynacell/configs/benchmarks/virtual_staining/er/fcmae_vscyto3d_pretrained/ipsc_confocal/predict__ipsc_confocal.yml new file mode 100644 index 000000000..76f63d134 --- /dev/null +++ b/applications/dynacell/configs/benchmarks/virtual_staining/er/fcmae_vscyto3d_pretrained/ipsc_confocal/predict__ipsc_confocal.yml @@ -0,0 +1,49 @@ +# FCMAE_VSCyto3D_Pretrained predict: ER (SEC61B) against ipsc_confocal test_cropped. +# +# ckpt_path below is pinned to the best-val ckpt from iPSC FCMAE pretrained +# ER training (J31523022, ws8500 variant).
Expected dir: +# /hpc/projects/comp.micro/virtual_staining/models/dynacell/ipsc/sec61b/fcmae_vscyto3d_pretrained_ws8500/checkpoints/ +base: + - ../../../_internal/shared/model/predict_sets/ipsc_confocal.yml + - ../../../_internal/shared/model/targets/er_sec61b.yml + - ../../../_internal/shared/model/model_overlays/fcmae_vscyto3d_predict.yml + - ../../../_internal/shared/model/launcher_profiles/mode_predict.yml + - ../../../_internal/shared/model/launcher_profiles/hardware_h200_single.yml + - ../../../_internal/shared/model/launcher_profiles/runtime_shared.yml + +benchmark: + task: virtual_staining + organelle: er + trained_on: ipsc_confocal + predict_set: ipsc_confocal + model_name: fcmae_vscyto3d_pretrained + experiment_id: er__ipsc_confocal__fcmae_vscyto3d_pretrained__ipsc_confocal + +model: + init_args: + # Best checkpoint from J31523022 (FCMAE_VSCyto3D_Pretrained_iPSC_SEC61B_ws8500): + # ep 123 / val_loss 0.40979 (49-epoch plateau, scancelled at 3d 1h elapsed). + # Hardlink alias at run_root; the underlying checkpoints/epoch=123-step=32736.ckpt + # is also preserved in checkpoints_frozen_ep123_20260501_004946/. 
+ ckpt_path: /hpc/projects/comp.micro/virtual_staining/models/dynacell/ipsc/sec61b/fcmae_vscyto3d_pretrained_ws8500/best_ep123_val0.40979.ckpt + +data: + init_args: + normalizations: + - class_path: viscy_transforms.NormalizeSampled + init_args: + keys: [Phase3D] + level: fov_statistics + subtrahend: mean + divisor: std + augmentations: [] + +trainer: + callbacks: + - class_path: viscy_utils.callbacks.prediction_writer.HCSPredictionWriter + init_args: + output_store: /hpc/projects/virtual_staining/training/dynacell/ipsc/predictions/sec61b_fcmae_vscyto3d_pretrained.zarr + +launcher: + job_name: FCMAE_VSCyto3D_Pretrained_PRED_SEC61B + run_root: /hpc/projects/virtual_staining/training/dynacell/ipsc/predictions diff --git a/applications/dynacell/configs/benchmarks/virtual_staining/er/fcmae_vscyto3d_pretrained/ipsc_confocal/train.yml b/applications/dynacell/configs/benchmarks/virtual_staining/er/fcmae_vscyto3d_pretrained/ipsc_confocal/train.yml new file mode 100644 index 000000000..d8e03ca03 --- /dev/null +++ b/applications/dynacell/configs/benchmarks/virtual_staining/er/fcmae_vscyto3d_pretrained/ipsc_confocal/train.yml @@ -0,0 +1,49 @@ +# FCMAE-class (FullyConvolutionalMAE, pretraining=False) with FCMAE- +# pretrained encoder init on ER/SEC61B. Companion to +# fcmae_vscyto3d_scratch.yml — the two leaves are identical except this +# one loads encoder weights from the published VSCyto3D FCMAE ckpt +# (400 ep on HEK + A549 + iPSC phase data). See vs_test/finetune_3d.py +# for the canonical recipe. 
+base: + - ../../../_internal/shared/model/train_sets/ipsc_confocal.yml + - ../../../_internal/shared/model/targets/er_sec61b.yml + - ../../../_internal/shared/model/data_overlays/fcmae_vscyto3d_fit.yml + - ../../../_internal/shared/model/model_overlays/fcmae_vscyto3d_fit.yml + - ../../../_internal/shared/model/launcher_profiles/mode_fit.yml + - ../../../_internal/shared/model/launcher_profiles/hardware_4gpu.yml + - ../../../_internal/shared/model/launcher_profiles/runtime_shared.yml + +benchmark: + task: virtual_staining + organelle: er + train_set: ipsc_confocal + model_name: fcmae_vscyto3d_pretrained + experiment_id: er__ipsc_confocal__fcmae_vscyto3d_pretrained + +model: + init_args: + # Load only the encoder from the canonical VSCyto3D FCMAE ckpt — + # decoder/head stay at fresh init. Matches vs_test/finetune_3d.py:247. + encoder_only: true + ckpt_path: /hpc/projects/virtual_staining/models/mehta-lab/VSCyto3D/fcmae.ckpt + +trainer: + logger: + init_args: + name: FCMAE_VSCyto3D_Pretrained_iPSC_SEC61B_ws8500 + save_dir: /hpc/projects/comp.micro/virtual_staining/models/dynacell/ipsc/sec61b/fcmae_vscyto3d_pretrained_ws8500 + callbacks: + - class_path: lightning.pytorch.callbacks.LearningRateMonitor + init_args: + logging_interval: step + - class_path: lightning.pytorch.callbacks.ModelCheckpoint + init_args: + monitor: loss/validate + every_n_epochs: 1 + save_top_k: 5 + save_last: true + dirpath: /hpc/projects/comp.micro/virtual_staining/models/dynacell/ipsc/sec61b/fcmae_vscyto3d_pretrained_ws8500/checkpoints + +launcher: + job_name: FCMAE_VSCyto3D_Pretrained_SEC61B_ws8500 + run_root: /hpc/projects/comp.micro/virtual_staining/models/dynacell/ipsc/sec61b/fcmae_vscyto3d_pretrained_ws8500 diff --git a/applications/dynacell/configs/benchmarks/virtual_staining/er/fcmae_vscyto3d_pretrained/joint_ipsc_confocal_a549_mantis/train.yml b/applications/dynacell/configs/benchmarks/virtual_staining/er/fcmae_vscyto3d_pretrained/joint_ipsc_confocal_a549_mantis/train.yml new file 
mode 100644 index 000000000..44acd7050 --- /dev/null +++ b/applications/dynacell/configs/benchmarks/virtual_staining/er/fcmae_vscyto3d_pretrained/joint_ipsc_confocal_a549_mantis/train.yml @@ -0,0 +1,154 @@ +# FCMAE-class (FullyConvolutionalMAE, pretraining=False) with FCMAE- +# pretrained encoder init on er (SEC61B) — joint +# ipsc_confocal + a549_mantis pooled. Companion to +# fcmae_vscyto3d_scratch joint leaf — the two are identical except +# this one loads encoder weights from the published VSCyto3D FCMAE +# ckpt (400 ep on HEK + A549 + iPSC phase data). Mirrors +# er/fcmae_vscyto3d_pretrained/ipsc_confocal/train.yml on +# the joint train_set. +# +# Joint leaf per Stage 7 of A549_EXPANSION_ROADMAP.md. Uses +# BatchedConcatDataModule with two explicit HCSDataModule children +# (no benchmark.dataset_ref — joint leaves bypass the single-dataset +# resolver). Only model_overlays/fcmae_vscyto3d_fit.yml is composed; +# the data block is authored inline because joint hparams live on +# the children. +# +# Topology: 4-GPU DDP (inherited from +# model_overlays/fcmae_vscyto3d_fit.yml's ddp_4gpu base; the overlay +# also pins strategy=ddp_find_unused_parameters_true because +# FullyConvolutionalMAE has decoder/head params that only receive +# gradients on some forward paths). +base: + - ../../../_internal/shared/model/model_overlays/fcmae_vscyto3d_fit.yml + - ../../../_internal/shared/model/launcher_profiles/mode_fit.yml + - ../../../_internal/shared/model/launcher_profiles/hardware_4gpu.yml + - ../../../_internal/shared/model/launcher_profiles/runtime_shared.yml + +benchmark: + task: virtual_staining + organelle: er + gene: SEC61B + target: er + target_id: er_sec61b + train_set: joint_ipsc_confocal_a549_mantis + model_name: fcmae_vscyto3d_pretrained + experiment_id: er__joint_ipsc_confocal_a549_mantis__fcmae_vscyto3d_pretrained + +model: + init_args: + # Load only the encoder from the canonical VSCyto3D FCMAE ckpt — + # decoder/head stay at fresh init. 
Matches vs_test/finetune_3d.py:247. + encoder_only: true + ckpt_path: /hpc/projects/virtual_staining/models/mehta-lab/VSCyto3D/fcmae.ckpt + +trainer: + logger: + init_args: + name: FCMAE_VSCyto3D_Pretrained_JOINT_SEC61B_ws8500 + save_dir: /hpc/projects/comp.micro/virtual_staining/models/dynacell/joint_ipsc_confocal_a549_mantis/sec61b/fcmae_vscyto3d_pretrained_ws8500 + callbacks: + - class_path: lightning.pytorch.callbacks.LearningRateMonitor + init_args: + logging_interval: step + - class_path: lightning.pytorch.callbacks.ModelCheckpoint + init_args: + monitor: loss/validate + every_n_epochs: 1 + save_top_k: 5 + save_last: true + dirpath: /hpc/projects/comp.micro/virtual_staining/models/dynacell/joint_ipsc_confocal_a549_mantis/sec61b/fcmae_vscyto3d_pretrained_ws8500/checkpoints + +_hcs_init_args: &hcs_init_args + source_channel: Phase3D + target_channel: Structure + z_window_size: 20 + # See nucleus/fnet3d_paper/joint_*/train.yml for the rationale: joint + # mode does not divide batch_size by num_samples, so 8 * 4 = 32 GPU + # samples per DDP rank matches single-set effective batch. 
+ batch_size: 8 + num_workers: 4 + yx_patch_size: [384, 384] + split_ratio: 0.8 + mmap_preload: true + scratch_dir: /dev/shm + persistent_workers: true + normalizations: + - class_path: viscy_transforms.NormalizeSampled + init_args: + keys: [Phase3D] + level: fov_statistics + subtrahend: mean + divisor: std + - class_path: viscy_transforms.NormalizeSampled + init_args: + keys: [Structure] + level: fov_statistics + subtrahend: median + divisor: iqr + augmentations: + - class_path: viscy_transforms.RandWeightedCropd + init_args: + keys: [Phase3D, Structure] + w_key: Structure + spatial_size: [20, 600, 600] + num_samples: 4 + gpu_augmentations: + - class_path: viscy_transforms.BatchedRandAffined + init_args: + keys: [source, target] + prob: 0.8 + rotate_range: [3.14, 0, 0] + shear_range: [0.0, 0.05, 0.05] + scale_range: [[0.7, 1.3], [0.5, 1.5], [0.5, 1.5]] + - class_path: viscy_transforms.BatchedCenterSpatialCropd + init_args: + keys: [source, target] + roi_size: [15, 384, 384] + - class_path: viscy_transforms.BatchedRandAdjustContrastd + init_args: + keys: [source] + prob: 0.5 + gamma: [0.8, 1.2] + - class_path: viscy_transforms.BatchedRandScaleIntensityd + init_args: + keys: [source] + prob: 0.5 + factors: 0.5 + - class_path: viscy_transforms.BatchedRandGaussianNoised + init_args: + keys: [source] + prob: 0.5 + mean: 0.0 + std: 0.3 + - class_path: viscy_transforms.BatchedRandGaussianSmoothd + init_args: + keys: [source] + prob: 0.5 + sigma_x: [0.25, 0.75] + sigma_y: [0.25, 0.75] + sigma_z: [0.25, 0.75] + val_gpu_augmentations: + - class_path: viscy_transforms.BatchedCenterSpatialCropd + init_args: + keys: [source, target] + roi_size: [15, 384, 384] + +data: + class_path: viscy_data.BatchedConcatDataModule + init_args: + data_modules: + # ipsc_confocal — aics-hipsc SEC61B train store + - class_path: viscy_data.hcs.HCSDataModule + init_args: + <<: *hcs_init_args + data_path: /hpc/projects/virtual_staining/training/dynacell/ipsc/dataset_v4/train/SEC61B.zarr + # 
a549_mantis — pooled SEC61B all-conditions train store + - class_path: viscy_data.hcs.HCSDataModule + init_args: + <<: *hcs_init_args + data_path: /hpc/projects/virtual_staining/training/dynacell/a549/mantis_v1/train/SEC61B_all.zarr + +launcher: + job_name: FCMAE_VSCyto3D_Pretrained_JOINT_SEC61B_ws8500 + run_root: /hpc/projects/comp.micro/virtual_staining/models/dynacell/joint_ipsc_confocal_a549_mantis/sec61b/fcmae_vscyto3d_pretrained_ws8500 diff --git a/applications/dynacell/configs/benchmarks/virtual_staining/er/fcmae_vscyto3d_scratch/a549_mantis/predict__a549_mantis_denv.yml b/applications/dynacell/configs/benchmarks/virtual_staining/er/fcmae_vscyto3d_scratch/a549_mantis/predict__a549_mantis_denv.yml new file mode 100644 index 000000000..7f2e72871 --- /dev/null +++ b/applications/dynacell/configs/benchmarks/virtual_staining/er/fcmae_vscyto3d_scratch/a549_mantis/predict__a549_mantis_denv.yml @@ -0,0 +1,46 @@ +# FCMAE_VSCyto3D_Scratch (UNeXt2) predict: ER trained on a549_mantis (sec61b), +# predicting against a549-mantis-sec61b-denv test. +# Best val-loss checkpoint from job 31910346 (epoch 137, val 0.6219). See +# predict__ipsc_confocal.yml in this dir for the wandb collision caveat. +# Both iPSC and a549 manifests use `sec61b`; targets/er_sec61b.yml handles +# both natively, no dataset_ref override needed. 
+base: + - ../../../_internal/shared/model/predict_sets/a549_mantis_sec61b_denv.yml + - ../../../_internal/shared/model/targets/er_sec61b.yml + - ../../../_internal/shared/model/model_overlays/fcmae_vscyto3d_predict.yml + - ../../../_internal/shared/model/launcher_profiles/mode_predict.yml + - ../../../_internal/shared/model/launcher_profiles/hardware_h200_single.yml + - ../../../_internal/shared/model/launcher_profiles/runtime_shared.yml + +benchmark: + task: virtual_staining + organelle: er + trained_on: a549_mantis + predict_set: a549_mantis_sec61b_denv + model_name: fcmae_vscyto3d_scratch + experiment_id: er__a549_mantis__fcmae_vscyto3d_scratch__a549_mantis_sec61b_denv + +model: + init_args: + ckpt_path: /hpc/projects/comp.micro/virtual_staining/models/dynacell/a549_mantis/sec61b/fcmae_vscyto3d_scratch/checkpoints/epoch=137-step=23736.ckpt + +data: + init_args: + normalizations: + - class_path: viscy_transforms.NormalizeSampled + init_args: + keys: [Phase3D] + level: fov_statistics + subtrahend: mean + divisor: std + augmentations: [] + +trainer: + callbacks: + - class_path: viscy_utils.callbacks.prediction_writer.HCSPredictionWriter + init_args: + output_store: /hpc/projects/virtual_staining/training/dynacell/a549/predictions/sec61b_fcmae_vscyto3d_scratch_a549trained_denv.zarr + +launcher: + job_name: FCMAE_VSCyto3D_Scratch_PRED_SEC61B_A549TR_DENV + run_root: /hpc/projects/virtual_staining/training/dynacell/a549/predictions diff --git a/applications/dynacell/configs/benchmarks/virtual_staining/er/fcmae_vscyto3d_scratch/a549_mantis/predict__a549_mantis_mock.yml b/applications/dynacell/configs/benchmarks/virtual_staining/er/fcmae_vscyto3d_scratch/a549_mantis/predict__a549_mantis_mock.yml new file mode 100644 index 000000000..32df0df31 --- /dev/null +++ b/applications/dynacell/configs/benchmarks/virtual_staining/er/fcmae_vscyto3d_scratch/a549_mantis/predict__a549_mantis_mock.yml @@ -0,0 +1,46 @@ +# FCMAE_VSCyto3D_Scratch (UNeXt2) predict: ER trained on 
a549_mantis (sec61b), +# predicting against a549-mantis-sec61b-mock test. +# Best val-loss checkpoint from job 31910346 (epoch 137, val 0.6219). See +# predict__ipsc_confocal.yml in this dir for the wandb collision caveat. +# Both iPSC and a549 manifests use `sec61b`; targets/er_sec61b.yml handles +# both natively, no dataset_ref override needed. +base: + - ../../../_internal/shared/model/predict_sets/a549_mantis_sec61b_mock.yml + - ../../../_internal/shared/model/targets/er_sec61b.yml + - ../../../_internal/shared/model/model_overlays/fcmae_vscyto3d_predict.yml + - ../../../_internal/shared/model/launcher_profiles/mode_predict.yml + - ../../../_internal/shared/model/launcher_profiles/hardware_h200_single.yml + - ../../../_internal/shared/model/launcher_profiles/runtime_shared.yml + +benchmark: + task: virtual_staining + organelle: er + trained_on: a549_mantis + predict_set: a549_mantis_sec61b_mock + model_name: fcmae_vscyto3d_scratch + experiment_id: er__a549_mantis__fcmae_vscyto3d_scratch__a549_mantis_sec61b_mock + +model: + init_args: + ckpt_path: /hpc/projects/comp.micro/virtual_staining/models/dynacell/a549_mantis/sec61b/fcmae_vscyto3d_scratch/checkpoints/epoch=137-step=23736.ckpt + +data: + init_args: + normalizations: + - class_path: viscy_transforms.NormalizeSampled + init_args: + keys: [Phase3D] + level: fov_statistics + subtrahend: mean + divisor: std + augmentations: [] + +trainer: + callbacks: + - class_path: viscy_utils.callbacks.prediction_writer.HCSPredictionWriter + init_args: + output_store: /hpc/projects/virtual_staining/training/dynacell/a549/predictions/sec61b_fcmae_vscyto3d_scratch_a549trained_mock.zarr + +launcher: + job_name: FCMAE_VSCyto3D_Scratch_PRED_SEC61B_A549TR_MOCK + run_root: /hpc/projects/virtual_staining/training/dynacell/a549/predictions diff --git a/applications/dynacell/configs/benchmarks/virtual_staining/er/fcmae_vscyto3d_scratch/a549_mantis/predict__a549_mantis_zikv.yml 
b/applications/dynacell/configs/benchmarks/virtual_staining/er/fcmae_vscyto3d_scratch/a549_mantis/predict__a549_mantis_zikv.yml new file mode 100644 index 000000000..7c565f14f --- /dev/null +++ b/applications/dynacell/configs/benchmarks/virtual_staining/er/fcmae_vscyto3d_scratch/a549_mantis/predict__a549_mantis_zikv.yml @@ -0,0 +1,46 @@ +# FCMAE_VSCyto3D_Scratch (UNeXt2) predict: ER trained on a549_mantis (sec61b), +# predicting against a549-mantis-sec61b-zikv test. +# Best val-loss checkpoint from job 31910346 (epoch 137, val 0.6219). See +# predict__ipsc_confocal.yml in this dir for the wandb collision caveat. +# Both iPSC and a549 manifests use `sec61b`; targets/er_sec61b.yml handles +# both natively, no dataset_ref override needed. +base: + - ../../../_internal/shared/model/predict_sets/a549_mantis_sec61b_zikv.yml + - ../../../_internal/shared/model/targets/er_sec61b.yml + - ../../../_internal/shared/model/model_overlays/fcmae_vscyto3d_predict.yml + - ../../../_internal/shared/model/launcher_profiles/mode_predict.yml + - ../../../_internal/shared/model/launcher_profiles/hardware_h200_single.yml + - ../../../_internal/shared/model/launcher_profiles/runtime_shared.yml + +benchmark: + task: virtual_staining + organelle: er + trained_on: a549_mantis + predict_set: a549_mantis_sec61b_zikv + model_name: fcmae_vscyto3d_scratch + experiment_id: er__a549_mantis__fcmae_vscyto3d_scratch__a549_mantis_sec61b_zikv + +model: + init_args: + ckpt_path: /hpc/projects/comp.micro/virtual_staining/models/dynacell/a549_mantis/sec61b/fcmae_vscyto3d_scratch/checkpoints/epoch=137-step=23736.ckpt + +data: + init_args: + normalizations: + - class_path: viscy_transforms.NormalizeSampled + init_args: + keys: [Phase3D] + level: fov_statistics + subtrahend: mean + divisor: std + augmentations: [] + +trainer: + callbacks: + - class_path: viscy_utils.callbacks.prediction_writer.HCSPredictionWriter + init_args: + output_store: 
/hpc/projects/virtual_staining/training/dynacell/a549/predictions/sec61b_fcmae_vscyto3d_scratch_a549trained_zikv.zarr + +launcher: + job_name: FCMAE_VSCyto3D_Scratch_PRED_SEC61B_A549TR_ZIKV + run_root: /hpc/projects/virtual_staining/training/dynacell/a549/predictions diff --git a/applications/dynacell/configs/benchmarks/virtual_staining/er/fcmae_vscyto3d_scratch/a549_mantis/predict__ipsc_confocal.yml b/applications/dynacell/configs/benchmarks/virtual_staining/er/fcmae_vscyto3d_scratch/a549_mantis/predict__ipsc_confocal.yml new file mode 100644 index 000000000..5d85dfd09 --- /dev/null +++ b/applications/dynacell/configs/benchmarks/virtual_staining/er/fcmae_vscyto3d_scratch/a549_mantis/predict__ipsc_confocal.yml @@ -0,0 +1,50 @@ +# FCMAE_VSCyto3D_Scratch (UNeXt2) predict: ER trained on a549_mantis (sec61b), +# predicting against ipsc_confocal test_cropped. +# Best val-loss checkpoint from job 31910346 (epoch 137, val 0.6219). Job +# completed cleanly to ep199 at 2026-05-05T03:34:24 (elapsed 2d 6h 50m). +# NOTE: wandb run id `20260502-204536` collided with the simultaneous TOMM20 +# training (J31910360 on the same node gpu-f-5); the wandb dashboard for that +# run id shows the TOMM20 display name but the metrics actually belong to +# this SEC61B training (ep113 step=19607, ep137 step=23735 align with SEC61B's +# step counts, not TOMM20's). Both iPSC and a549 manifests use `sec61b`; +# targets/er_sec61b.yml handles both natively, no dataset_ref override needed. 
+base: + - ../../../_internal/shared/model/predict_sets/ipsc_confocal.yml + - ../../../_internal/shared/model/targets/er_sec61b.yml + - ../../../_internal/shared/model/model_overlays/fcmae_vscyto3d_predict.yml + - ../../../_internal/shared/model/launcher_profiles/mode_predict.yml + - ../../../_internal/shared/model/launcher_profiles/hardware_h200_single.yml + - ../../../_internal/shared/model/launcher_profiles/runtime_shared.yml + +benchmark: + task: virtual_staining + organelle: er + trained_on: a549_mantis + predict_set: ipsc_confocal + model_name: fcmae_vscyto3d_scratch + experiment_id: er__a549_mantis__fcmae_vscyto3d_scratch__ipsc_confocal + +model: + init_args: + ckpt_path: /hpc/projects/comp.micro/virtual_staining/models/dynacell/a549_mantis/sec61b/fcmae_vscyto3d_scratch/checkpoints/epoch=137-step=23736.ckpt + +data: + init_args: + normalizations: + - class_path: viscy_transforms.NormalizeSampled + init_args: + keys: [Phase3D] + level: fov_statistics + subtrahend: mean + divisor: std + augmentations: [] + +trainer: + callbacks: + - class_path: viscy_utils.callbacks.prediction_writer.HCSPredictionWriter + init_args: + output_store: /hpc/projects/virtual_staining/training/dynacell/ipsc/predictions/sec61b_fcmae_vscyto3d_scratch_a549trained.zarr + +launcher: + job_name: FCMAE_VSCyto3D_Scratch_PRED_SEC61B_A549TR_IPSC + run_root: /hpc/projects/virtual_staining/training/dynacell/ipsc/predictions diff --git a/applications/dynacell/configs/benchmarks/virtual_staining/er/fcmae_vscyto3d_scratch/a549_mantis/train.yml b/applications/dynacell/configs/benchmarks/virtual_staining/er/fcmae_vscyto3d_scratch/a549_mantis/train.yml new file mode 100644 index 000000000..fde2e8c5d --- /dev/null +++ b/applications/dynacell/configs/benchmarks/virtual_staining/er/fcmae_vscyto3d_scratch/a549_mantis/train.yml @@ -0,0 +1,47 @@ +# FCMAE-class (FullyConvolutionalMAE, pretraining=False) random-init +# baseline on ER/SEC61B. 
Scratch control for the pretrained counterpart — +# the two leaves are identical except this one does NOT load pretrained +# encoder weights. See UNEXT2_VS_FCMAE_CLASSES.md for why this is the +# paper-adjacent scratch baseline (and not unext2.yml). +base: + - ../../../_internal/shared/model/train_sets/a549_mantis.yml + - ../../../_internal/shared/model/targets/er_sec61b.yml + - ../../../_internal/shared/model/data_overlays/fcmae_vscyto3d_fit.yml + - ../../../_internal/shared/model/model_overlays/fcmae_vscyto3d_fit.yml + - ../../../_internal/shared/model/launcher_profiles/mode_fit.yml + - ../../../_internal/shared/model/launcher_profiles/hardware_4gpu.yml + - ../../../_internal/shared/model/launcher_profiles/runtime_shared.yml + +benchmark: + task: virtual_staining + organelle: er + train_set: a549_mantis + model_name: fcmae_vscyto3d_scratch + experiment_id: er__a549_mantis__fcmae_vscyto3d_scratch + +trainer: + logger: + init_args: + name: FCMAE_VSCyto3D_Scratch_A549_SEC61B + save_dir: /hpc/projects/comp.micro/virtual_staining/models/dynacell/a549_mantis/sec61b/fcmae_vscyto3d_scratch + callbacks: + - class_path: lightning.pytorch.callbacks.LearningRateMonitor + init_args: + logging_interval: step + - class_path: lightning.pytorch.callbacks.ModelCheckpoint + init_args: + monitor: loss/validate + every_n_epochs: 1 + save_top_k: 5 + save_last: true + dirpath: /hpc/projects/comp.micro/virtual_staining/models/dynacell/a549_mantis/sec61b/fcmae_vscyto3d_scratch/checkpoints + +data: + init_args: + # A549 pooled store + target_channel — no resolver in this train_set. 
+ target_channel: Structure + data_path: /hpc/projects/virtual_staining/training/dynacell/a549/mantis_v1/train/SEC61B_all.zarr + +launcher: + job_name: FCMAE_VSCyto3D_Scratch_A549_SEC61B + run_root: /hpc/projects/comp.micro/virtual_staining/models/dynacell/a549_mantis/sec61b/fcmae_vscyto3d_scratch diff --git a/applications/dynacell/configs/benchmarks/virtual_staining/er/fcmae_vscyto3d_scratch/ipsc_confocal/eval__a549_mantis_denv.yaml b/applications/dynacell/configs/benchmarks/virtual_staining/er/fcmae_vscyto3d_scratch/ipsc_confocal/eval__a549_mantis_denv.yaml new file mode 100644 index 000000000..1355a7f48 --- /dev/null +++ b/applications/dynacell/configs/benchmarks/virtual_staining/er/fcmae_vscyto3d_scratch/ipsc_confocal/eval__a549_mantis_denv.yaml @@ -0,0 +1,15 @@ +# @package _global_ +# Benchmark eval leaf: ER (SEC61B) predicted by FCMAE_VSCyto3D_Scratch on a549-mantis-sec61b-denv. +defaults: + - override /target: er_sec61b + - override /predict_set: a549_mantis_sec61b_denv + +io: + pred_path: /hpc/projects/virtual_staining/training/dynacell/a549/predictions/sec61b_fcmae_vscyto3d_scratch__sec61b_denv.zarr + +# A549 manifests don't carry cell_segmentation paths (no segmentation +# pipeline yet). Skip feature metrics until segmentation lands. 
+compute_feature_metrics: false + +save: + save_dir: /hpc/projects/virtual_staining/training/dynacell/a549/predictions/eval_sec61b_fcmae_vscyto3d_scratch__sec61b_denv diff --git a/applications/dynacell/configs/benchmarks/virtual_staining/er/fcmae_vscyto3d_scratch/ipsc_confocal/eval__a549_mantis_mock.yaml b/applications/dynacell/configs/benchmarks/virtual_staining/er/fcmae_vscyto3d_scratch/ipsc_confocal/eval__a549_mantis_mock.yaml new file mode 100644 index 000000000..b0cc3e8b8 --- /dev/null +++ b/applications/dynacell/configs/benchmarks/virtual_staining/er/fcmae_vscyto3d_scratch/ipsc_confocal/eval__a549_mantis_mock.yaml @@ -0,0 +1,15 @@ +# @package _global_ +# Benchmark eval leaf: ER (SEC61B) predicted by FCMAE_VSCyto3D_Scratch on a549-mantis-sec61b-mock. +defaults: + - override /target: er_sec61b + - override /predict_set: a549_mantis_sec61b_mock + +io: + pred_path: /hpc/projects/virtual_staining/training/dynacell/a549/predictions/sec61b_fcmae_vscyto3d_scratch__sec61b_mock.zarr + +# A549 manifests don't carry cell_segmentation paths (no segmentation +# pipeline yet). Skip feature metrics until segmentation lands. +compute_feature_metrics: false + +save: + save_dir: /hpc/projects/virtual_staining/training/dynacell/a549/predictions/eval_sec61b_fcmae_vscyto3d_scratch__sec61b_mock diff --git a/applications/dynacell/configs/benchmarks/virtual_staining/er/fcmae_vscyto3d_scratch/ipsc_confocal/eval__a549_mantis_zikv.yaml b/applications/dynacell/configs/benchmarks/virtual_staining/er/fcmae_vscyto3d_scratch/ipsc_confocal/eval__a549_mantis_zikv.yaml new file mode 100644 index 000000000..dd687bfae --- /dev/null +++ b/applications/dynacell/configs/benchmarks/virtual_staining/er/fcmae_vscyto3d_scratch/ipsc_confocal/eval__a549_mantis_zikv.yaml @@ -0,0 +1,15 @@ +# @package _global_ +# Benchmark eval leaf: ER (SEC61B) predicted by FCMAE_VSCyto3D_Scratch on a549-mantis-sec61b-zikv. 
+defaults: + - override /target: er_sec61b + - override /predict_set: a549_mantis_sec61b_zikv + +io: + pred_path: /hpc/projects/virtual_staining/training/dynacell/a549/predictions/sec61b_fcmae_vscyto3d_scratch__sec61b_zikv.zarr + +# A549 manifests don't carry cell_segmentation paths (no segmentation +# pipeline yet). Skip feature metrics until segmentation lands. +compute_feature_metrics: false + +save: + save_dir: /hpc/projects/virtual_staining/training/dynacell/a549/predictions/eval_sec61b_fcmae_vscyto3d_scratch__sec61b_zikv diff --git a/applications/dynacell/configs/benchmarks/virtual_staining/er/fcmae_vscyto3d_scratch/ipsc_confocal/predict__a549_mantis_denv.yml b/applications/dynacell/configs/benchmarks/virtual_staining/er/fcmae_vscyto3d_scratch/ipsc_confocal/predict__a549_mantis_denv.yml new file mode 100644 index 000000000..9af438657 --- /dev/null +++ b/applications/dynacell/configs/benchmarks/virtual_staining/er/fcmae_vscyto3d_scratch/ipsc_confocal/predict__a549_mantis_denv.yml @@ -0,0 +1,47 @@ +# FCMAE_VSCyto3D_Scratch predict: ER (SEC61B) trained on iPSC, +# predicting against a549_mantis_sec61b_denv test. +# +# Pinned to best-val checkpoint from training run J31483778 +# (val 0.4119, epoch 122). Run cancelled at epoch 164 — val plateaued +# at epoch 122 and never recovered (~42 epochs without improvement, +# drifting up in last 5 epochs). 
+base: + - ../../../_internal/shared/model/predict_sets/a549_mantis_sec61b_denv.yml + - ../../../_internal/shared/model/targets/er_sec61b.yml + - ../../../_internal/shared/model/model_overlays/fcmae_vscyto3d_predict.yml + - ../../../_internal/shared/model/launcher_profiles/mode_predict.yml + - ../../../_internal/shared/model/launcher_profiles/hardware_h200_single.yml + - ../../../_internal/shared/model/launcher_profiles/runtime_shared.yml + +benchmark: + task: virtual_staining + organelle: er + trained_on: ipsc_confocal + predict_set: a549_mantis_sec61b_denv + model_name: fcmae_vscyto3d_scratch + experiment_id: er__ipsc_confocal__fcmae_vscyto3d_scratch__a549_mantis_sec61b_denv + +model: + init_args: + ckpt_path: /hpc/projects/comp.micro/virtual_staining/models/dynacell/ipsc/sec61b/fcmae_vscyto3d_scratch/checkpoints/epoch=122-step=32472.ckpt + +data: + init_args: + normalizations: + - class_path: viscy_transforms.NormalizeSampled + init_args: + keys: [Phase3D] + level: fov_statistics + subtrahend: mean + divisor: std + augmentations: [] + +trainer: + callbacks: + - class_path: viscy_utils.callbacks.prediction_writer.HCSPredictionWriter + init_args: + output_store: /hpc/projects/virtual_staining/training/dynacell/a549/predictions/sec61b_fcmae_vscyto3d_scratch__sec61b_denv.zarr + +launcher: + job_name: FCMAE_VSCyto3D_Scratch_PRED_SEC61B_ON_A549_sec61b_denv + run_root: /hpc/projects/virtual_staining/training/dynacell/a549/predictions diff --git a/applications/dynacell/configs/benchmarks/virtual_staining/er/fcmae_vscyto3d_scratch/ipsc_confocal/predict__a549_mantis_mock.yml b/applications/dynacell/configs/benchmarks/virtual_staining/er/fcmae_vscyto3d_scratch/ipsc_confocal/predict__a549_mantis_mock.yml new file mode 100644 index 000000000..9935d25ff --- /dev/null +++ b/applications/dynacell/configs/benchmarks/virtual_staining/er/fcmae_vscyto3d_scratch/ipsc_confocal/predict__a549_mantis_mock.yml @@ -0,0 +1,47 @@ +# FCMAE_VSCyto3D_Scratch predict: ER (SEC61B) trained on 
iPSC, +# predicting against a549_mantis_sec61b_mock test. +# +# Pinned to best-val checkpoint from training run J31483778 +# (val 0.4119, epoch 122). Run cancelled at epoch 164 — val plateaued +# at epoch 122 and never recovered (~42 epochs without improvement, +# drifting up in last 5 epochs). +base: + - ../../../_internal/shared/model/predict_sets/a549_mantis_sec61b_mock.yml + - ../../../_internal/shared/model/targets/er_sec61b.yml + - ../../../_internal/shared/model/model_overlays/fcmae_vscyto3d_predict.yml + - ../../../_internal/shared/model/launcher_profiles/mode_predict.yml + - ../../../_internal/shared/model/launcher_profiles/hardware_h200_single.yml + - ../../../_internal/shared/model/launcher_profiles/runtime_shared.yml + +benchmark: + task: virtual_staining + organelle: er + trained_on: ipsc_confocal + predict_set: a549_mantis_sec61b_mock + model_name: fcmae_vscyto3d_scratch + experiment_id: er__ipsc_confocal__fcmae_vscyto3d_scratch__a549_mantis_sec61b_mock + +model: + init_args: + ckpt_path: /hpc/projects/comp.micro/virtual_staining/models/dynacell/ipsc/sec61b/fcmae_vscyto3d_scratch/checkpoints/epoch=122-step=32472.ckpt + +data: + init_args: + normalizations: + - class_path: viscy_transforms.NormalizeSampled + init_args: + keys: [Phase3D] + level: fov_statistics + subtrahend: mean + divisor: std + augmentations: [] + +trainer: + callbacks: + - class_path: viscy_utils.callbacks.prediction_writer.HCSPredictionWriter + init_args: + output_store: /hpc/projects/virtual_staining/training/dynacell/a549/predictions/sec61b_fcmae_vscyto3d_scratch__sec61b_mock.zarr + +launcher: + job_name: FCMAE_VSCyto3D_Scratch_PRED_SEC61B_ON_A549_sec61b_mock + run_root: /hpc/projects/virtual_staining/training/dynacell/a549/predictions diff --git a/applications/dynacell/configs/benchmarks/virtual_staining/er/fcmae_vscyto3d_scratch/ipsc_confocal/predict__a549_mantis_zikv.yml 
b/applications/dynacell/configs/benchmarks/virtual_staining/er/fcmae_vscyto3d_scratch/ipsc_confocal/predict__a549_mantis_zikv.yml new file mode 100644 index 000000000..6b6068d40 --- /dev/null +++ b/applications/dynacell/configs/benchmarks/virtual_staining/er/fcmae_vscyto3d_scratch/ipsc_confocal/predict__a549_mantis_zikv.yml @@ -0,0 +1,47 @@ +# FCMAE_VSCyto3D_Scratch predict: ER (SEC61B) trained on iPSC, +# predicting against a549_mantis_sec61b_zikv test. +# +# Pinned to best-val checkpoint from training run J31483778 +# (val 0.4119, epoch 122). Run cancelled at epoch 164 — val plateaued +# at epoch 122 and never recovered (~42 epochs without improvement, +# drifting up in last 5 epochs). +base: + - ../../../_internal/shared/model/predict_sets/a549_mantis_sec61b_zikv.yml + - ../../../_internal/shared/model/targets/er_sec61b.yml + - ../../../_internal/shared/model/model_overlays/fcmae_vscyto3d_predict.yml + - ../../../_internal/shared/model/launcher_profiles/mode_predict.yml + - ../../../_internal/shared/model/launcher_profiles/hardware_h200_single.yml + - ../../../_internal/shared/model/launcher_profiles/runtime_shared.yml + +benchmark: + task: virtual_staining + organelle: er + trained_on: ipsc_confocal + predict_set: a549_mantis_sec61b_zikv + model_name: fcmae_vscyto3d_scratch + experiment_id: er__ipsc_confocal__fcmae_vscyto3d_scratch__a549_mantis_sec61b_zikv + +model: + init_args: + ckpt_path: /hpc/projects/comp.micro/virtual_staining/models/dynacell/ipsc/sec61b/fcmae_vscyto3d_scratch/checkpoints/epoch=122-step=32472.ckpt + +data: + init_args: + normalizations: + - class_path: viscy_transforms.NormalizeSampled + init_args: + keys: [Phase3D] + level: fov_statistics + subtrahend: mean + divisor: std + augmentations: [] + +trainer: + callbacks: + - class_path: viscy_utils.callbacks.prediction_writer.HCSPredictionWriter + init_args: + output_store: /hpc/projects/virtual_staining/training/dynacell/a549/predictions/sec61b_fcmae_vscyto3d_scratch__sec61b_zikv.zarr + 
+launcher: + job_name: FCMAE_VSCyto3D_Scratch_PRED_SEC61B_ON_A549_sec61b_zikv + run_root: /hpc/projects/virtual_staining/training/dynacell/a549/predictions diff --git a/applications/dynacell/configs/benchmarks/virtual_staining/er/fcmae_vscyto3d_scratch/ipsc_confocal/predict__ipsc_confocal.yml b/applications/dynacell/configs/benchmarks/virtual_staining/er/fcmae_vscyto3d_scratch/ipsc_confocal/predict__ipsc_confocal.yml new file mode 100644 index 000000000..a5ee1a8ec --- /dev/null +++ b/applications/dynacell/configs/benchmarks/virtual_staining/er/fcmae_vscyto3d_scratch/ipsc_confocal/predict__ipsc_confocal.yml @@ -0,0 +1,46 @@ +# FCMAE_VSCyto3D_Scratch predict: ER (SEC61B) against ipsc_confocal test_cropped. +# +# Pinned to best-val checkpoint from training run J31483778 +# (val 0.4119, epoch 122). Run cancelled at epoch 164 — val plateaued +# at epoch 122 and never recovered (~42 epochs without improvement, +# drifting up in last 5 epochs). +base: + - ../../../_internal/shared/model/predict_sets/ipsc_confocal.yml + - ../../../_internal/shared/model/targets/er_sec61b.yml + - ../../../_internal/shared/model/model_overlays/fcmae_vscyto3d_predict.yml + - ../../../_internal/shared/model/launcher_profiles/mode_predict.yml + - ../../../_internal/shared/model/launcher_profiles/hardware_h200_single.yml + - ../../../_internal/shared/model/launcher_profiles/runtime_shared.yml + +benchmark: + task: virtual_staining + organelle: er + trained_on: ipsc_confocal + predict_set: ipsc_confocal + model_name: fcmae_vscyto3d_scratch + experiment_id: er__ipsc_confocal__fcmae_vscyto3d_scratch__ipsc_confocal + +model: + init_args: + ckpt_path: /hpc/projects/comp.micro/virtual_staining/models/dynacell/ipsc/sec61b/fcmae_vscyto3d_scratch/checkpoints/epoch=122-step=32472.ckpt + +data: + init_args: + normalizations: + - class_path: viscy_transforms.NormalizeSampled + init_args: + keys: [Phase3D] + level: fov_statistics + subtrahend: mean + divisor: std + augmentations: [] + +trainer: + callbacks: 
+ - class_path: viscy_utils.callbacks.prediction_writer.HCSPredictionWriter + init_args: + output_store: /hpc/projects/virtual_staining/training/dynacell/ipsc/predictions/sec61b_fcmae_vscyto3d_scratch.zarr + +launcher: + job_name: FCMAE_VSCyto3D_Scratch_PRED_SEC61B + run_root: /hpc/projects/virtual_staining/training/dynacell/ipsc/predictions diff --git a/applications/dynacell/configs/benchmarks/virtual_staining/er/fcmae_vscyto3d_scratch/ipsc_confocal/train.yml b/applications/dynacell/configs/benchmarks/virtual_staining/er/fcmae_vscyto3d_scratch/ipsc_confocal/train.yml new file mode 100644 index 000000000..f3f1cbe31 --- /dev/null +++ b/applications/dynacell/configs/benchmarks/virtual_staining/er/fcmae_vscyto3d_scratch/ipsc_confocal/train.yml @@ -0,0 +1,41 @@ +# FCMAE-class (FullyConvolutionalMAE, pretraining=False) random-init +# baseline on ER/SEC61B. Scratch control for the pretrained counterpart — +# the two leaves are identical except this one does NOT load pretrained +# encoder weights. See UNEXT2_VS_FCMAE_CLASSES.md for why this is the +# paper-adjacent scratch baseline (and not unext2.yml). 
+base: + - ../../../_internal/shared/model/train_sets/ipsc_confocal.yml + - ../../../_internal/shared/model/targets/er_sec61b.yml + - ../../../_internal/shared/model/data_overlays/fcmae_vscyto3d_fit.yml + - ../../../_internal/shared/model/model_overlays/fcmae_vscyto3d_fit.yml + - ../../../_internal/shared/model/launcher_profiles/mode_fit.yml + - ../../../_internal/shared/model/launcher_profiles/hardware_4gpu.yml + - ../../../_internal/shared/model/launcher_profiles/runtime_shared.yml + +benchmark: + task: virtual_staining + organelle: er + train_set: ipsc_confocal + model_name: fcmae_vscyto3d_scratch + experiment_id: er__ipsc_confocal__fcmae_vscyto3d_scratch + +trainer: + logger: + init_args: + name: FCMAE_VSCyto3D_Scratch_iPSC_SEC61B + save_dir: /hpc/projects/comp.micro/virtual_staining/models/dynacell/ipsc/sec61b/fcmae_vscyto3d_scratch + callbacks: + - class_path: lightning.pytorch.callbacks.LearningRateMonitor + init_args: + logging_interval: step + - class_path: lightning.pytorch.callbacks.ModelCheckpoint + init_args: + monitor: loss/validate + every_n_epochs: 1 + save_top_k: 5 + save_last: true + dirpath: /hpc/projects/comp.micro/virtual_staining/models/dynacell/ipsc/sec61b/fcmae_vscyto3d_scratch/checkpoints + +launcher: + job_name: FCMAE_VSCyto3D_Scratch_SEC61B + run_root: /hpc/projects/comp.micro/virtual_staining/models/dynacell/ipsc/sec61b/fcmae_vscyto3d_scratch diff --git a/applications/dynacell/configs/benchmarks/virtual_staining/er/fcmae_vscyto3d_scratch/joint_ipsc_confocal_a549_mantis/train.yml b/applications/dynacell/configs/benchmarks/virtual_staining/er/fcmae_vscyto3d_scratch/joint_ipsc_confocal_a549_mantis/train.yml new file mode 100644 index 000000000..282b99377 --- /dev/null +++ b/applications/dynacell/configs/benchmarks/virtual_staining/er/fcmae_vscyto3d_scratch/joint_ipsc_confocal_a549_mantis/train.yml @@ -0,0 +1,142 @@ +# FCMAE-class (FullyConvolutionalMAE, pretraining=False) random-init +# baseline on er (SEC61B) — joint ipsc_confocal + +# 
a549_mantis pooled. Scratch control for the pretrained counterpart +# — the two leaves are identical except this one does NOT load +# pretrained encoder weights. Mirrors +# er/fcmae_vscyto3d_scratch/ipsc_confocal/train.yml on the +# joint train_set. +# +# Joint leaf per Stage 7 of A549_EXPANSION_ROADMAP.md. +# BatchedConcatDataModule + two explicit HCSDataModule children; +# only model_overlays/fcmae_vscyto3d_fit.yml is composed; data +# block inline. +# +# Topology: 4-GPU DDP +# (strategy=ddp_find_unused_parameters_true inherited from +# model_overlays/fcmae_vscyto3d_fit.yml). +base: + - ../../../_internal/shared/model/model_overlays/fcmae_vscyto3d_fit.yml + - ../../../_internal/shared/model/launcher_profiles/mode_fit.yml + - ../../../_internal/shared/model/launcher_profiles/hardware_4gpu.yml + - ../../../_internal/shared/model/launcher_profiles/runtime_shared.yml + +benchmark: + task: virtual_staining + organelle: er + gene: SEC61B + target: er + target_id: er_sec61b + train_set: joint_ipsc_confocal_a549_mantis + model_name: fcmae_vscyto3d_scratch + experiment_id: er__joint_ipsc_confocal_a549_mantis__fcmae_vscyto3d_scratch + +trainer: + logger: + init_args: + name: FCMAE_VSCyto3D_Scratch_JOINT_SEC61B + save_dir: /hpc/projects/comp.micro/virtual_staining/models/dynacell/joint_ipsc_confocal_a549_mantis/sec61b/fcmae_vscyto3d_scratch + callbacks: + - class_path: lightning.pytorch.callbacks.LearningRateMonitor + init_args: + logging_interval: step + - class_path: lightning.pytorch.callbacks.ModelCheckpoint + init_args: + monitor: loss/validate + every_n_epochs: 1 + save_top_k: 5 + save_last: true + dirpath: /hpc/projects/comp.micro/virtual_staining/models/dynacell/joint_ipsc_confocal_a549_mantis/sec61b/fcmae_vscyto3d_scratch/checkpoints + +_hcs_init_args: &hcs_init_args + source_channel: Phase3D + target_channel: Structure + z_window_size: 20 + # See nucleus/fnet3d_paper/joint_*/train.yml for the rationale: joint + # mode does not divide batch_size by num_samples, so 
8 * 4 = 32 GPU + # samples per DDP rank matches single-set effective batch. + batch_size: 8 + num_workers: 4 + yx_patch_size: [384, 384] + split_ratio: 0.8 + mmap_preload: true + scratch_dir: /dev/shm + persistent_workers: true + normalizations: + - class_path: viscy_transforms.NormalizeSampled + init_args: + keys: [Phase3D] + level: fov_statistics + subtrahend: mean + divisor: std + - class_path: viscy_transforms.NormalizeSampled + init_args: + keys: [Structure] + level: fov_statistics + subtrahend: median + divisor: iqr + augmentations: + - class_path: viscy_transforms.RandWeightedCropd + init_args: + keys: [Phase3D, Structure] + w_key: Structure + spatial_size: [20, 600, 600] + num_samples: 4 + gpu_augmentations: + - class_path: viscy_transforms.BatchedRandAffined + init_args: + keys: [source, target] + prob: 0.8 + rotate_range: [3.14, 0, 0] + shear_range: [0.0, 0.05, 0.05] + scale_range: [[0.7, 1.3], [0.5, 1.5], [0.5, 1.5]] + - class_path: viscy_transforms.BatchedCenterSpatialCropd + init_args: + keys: [source, target] + roi_size: [15, 384, 384] + - class_path: viscy_transforms.BatchedRandAdjustContrastd + init_args: + keys: [source] + prob: 0.5 + gamma: [0.8, 1.2] + - class_path: viscy_transforms.BatchedRandScaleIntensityd + init_args: + keys: [source] + prob: 0.5 + factors: 0.5 + - class_path: viscy_transforms.BatchedRandGaussianNoised + init_args: + keys: [source] + prob: 0.5 + mean: 0.0 + std: 0.3 + - class_path: viscy_transforms.BatchedRandGaussianSmoothd + init_args: + keys: [source] + prob: 0.5 + sigma_x: [0.25, 0.75] + sigma_y: [0.25, 0.75] + sigma_z: [0.25, 0.75] + val_gpu_augmentations: + - class_path: viscy_transforms.BatchedCenterSpatialCropd + init_args: + keys: [source, target] + roi_size: [15, 384, 384] + +data: + class_path: viscy_data.BatchedConcatDataModule + init_args: + data_modules: + # ipsc_confocal — aics-hipsc SEC61B train store + - class_path: viscy_data.hcs.HCSDataModule + init_args: + <<: *hcs_init_args + data_path: 
/hpc/projects/virtual_staining/training/dynacell/ipsc/dataset_v4/train/SEC61B.zarr + # a549_mantis — pooled SEC61B all-conditions train store + - class_path: viscy_data.hcs.HCSDataModule + init_args: + <<: *hcs_init_args + data_path: /hpc/projects/virtual_staining/training/dynacell/a549/mantis_v1/train/SEC61B_all.zarr + +launcher: + job_name: FCMAE_VSCyto3D_Scratch_JOINT_SEC61B + run_root: /hpc/projects/comp.micro/virtual_staining/models/dynacell/joint_ipsc_confocal_a549_mantis/sec61b/fcmae_vscyto3d_scratch diff --git a/applications/dynacell/configs/benchmarks/virtual_staining/er/fcmae_vscyto3d_scratch/joint_ipsc_confocal_a549_mantis/train_smoke.yml b/applications/dynacell/configs/benchmarks/virtual_staining/er/fcmae_vscyto3d_scratch/joint_ipsc_confocal_a549_mantis/train_smoke.yml new file mode 100644 index 000000000..7a4b12c8b --- /dev/null +++ b/applications/dynacell/configs/benchmarks/virtual_staining/er/fcmae_vscyto3d_scratch/joint_ipsc_confocal_a549_mantis/train_smoke.yml @@ -0,0 +1,107 @@ +# FCMAE scratch joint smoke — minimal repro for the [N,2,48,640,960] +# val-shape mismatch hitting all 8 ER/MITO submissions (jobs 31857838-41 +# joint + 31858456-61 a549-only). 
Use: +# uv run python applications/dynacell/tools/submit_benchmark_job.py \ +# applications/dynacell/configs/benchmarks/virtual_staining/er/fcmae_vscyto3d_scratch/joint_ipsc_confocal_a549_mantis/train_smoke.yml \ +# --dry-run --print-resolved > /tmp/fcmae_er_smoke.yaml +# uv run python -m dynacell fit --config /tmp/fcmae_er_smoke.yaml \ +# --trainer.devices=1 --trainer.strategy=auto +base: + - ../../../_internal/shared/model/model_overlays/fcmae_vscyto3d_fit.yml + - ../../../_internal/shared/model/launcher_profiles/mode_fit.yml + - ../../../_internal/shared/model/launcher_profiles/hardware_h200_single.yml + - ../../../_internal/shared/model/launcher_profiles/runtime_shared.yml + +benchmark: + task: virtual_staining + organelle: er + gene: SEC61B + target: er + target_id: er_sec61b + train_set: joint_ipsc_confocal_a549_mantis + model_name: fcmae_vscyto3d_scratch + experiment_id: er__joint_ipsc_confocal_a549_mantis__fcmae_vscyto3d_scratch__smoke + +trainer: + devices: 1 + num_nodes: 1 + strategy: auto + max_steps: 2 + limit_val_batches: 1 + num_sanity_val_steps: 1 + logger: false + callbacks: + - class_path: lightning.pytorch.callbacks.ModelCheckpoint + init_args: + every_n_epochs: 1 + save_top_k: -1 + save_last: true + dirpath: /tmp/fcmae_er_smoke/checkpoints + +_hcs_init_args: &hcs_init_args + source_channel: Phase3D + target_channel: Structure + z_window_size: 20 + batch_size: 4 + num_workers: 0 + pin_memory: false + yx_patch_size: [384, 384] + split_ratio: 0.8 + mmap_preload: false + persistent_workers: false + normalizations: + - class_path: viscy_transforms.NormalizeSampled + init_args: + keys: [Phase3D] + level: fov_statistics + subtrahend: mean + divisor: std + - class_path: viscy_transforms.NormalizeSampled + init_args: + keys: [Structure] + level: fov_statistics + subtrahend: median + divisor: iqr + augmentations: + - class_path: viscy_transforms.RandWeightedCropd + init_args: + keys: [Phase3D, Structure] + w_key: Structure + spatial_size: [20, 600, 600] + 
num_samples: 4 + gpu_augmentations: + - class_path: viscy_transforms.BatchedRandAffined + init_args: + keys: [source, target] + prob: 0.8 + rotate_range: [3.14, 0, 0] + shear_range: [0.0, 0.05, 0.05] + scale_range: [[0.7, 1.3], [0.5, 1.5], [0.5, 1.5]] + - class_path: viscy_transforms.BatchedCenterSpatialCropd + init_args: + keys: [source, target] + roi_size: [15, 384, 384] + val_gpu_augmentations: + - class_path: viscy_transforms.BatchedCenterSpatialCropd + init_args: + keys: [source, target] + roi_size: [15, 384, 384] + +data: + class_path: viscy_data.BatchedConcatDataModule + init_args: + data_modules: + # ipsc_confocal — aics-hipsc SEC61B test48 zarr (48 FOVs). + - class_path: viscy_data.hcs.HCSDataModule + init_args: + <<: *hcs_init_args + data_path: /hpc/projects/virtual_staining/training/dynacell/ipsc/dataset_v4/train/SEC61B_test48.zarr + # a549_mantis — pooled SEC61B all-conditions train store (T=7). + - class_path: viscy_data.hcs.HCSDataModule + init_args: + <<: *hcs_init_args + data_path: /hpc/projects/virtual_staining/training/dynacell/a549/mantis_v1/train/SEC61B_all.zarr + +launcher: + job_name: FCMAE_VSCyto3D_Scratch_JOINT_SEC61B_SMOKE + run_root: /tmp/fcmae_er_smoke diff --git a/applications/dynacell/configs/benchmarks/virtual_staining/er/fcmae_vscyto3d_scratch/joint_ipsc_confocal_a549_mantis/train_smoke_4gpu.yml b/applications/dynacell/configs/benchmarks/virtual_staining/er/fcmae_vscyto3d_scratch/joint_ipsc_confocal_a549_mantis/train_smoke_4gpu.yml new file mode 100644 index 000000000..de515c145 --- /dev/null +++ b/applications/dynacell/configs/benchmarks/virtual_staining/er/fcmae_vscyto3d_scratch/joint_ipsc_confocal_a549_mantis/train_smoke_4gpu.yml @@ -0,0 +1,99 @@ +# FCMAE scratch joint smoke — 4-GPU DDP variant. Reproduces the +# [N,2,48,640,960] val-shape expand error from prod jobs 31857838-41 +# and 31858456-61. Single-GPU runs DO NOT reproduce; bug only appears +# under DDP. fast_dev_run-style: 2 train + 2 val + 2 sanity, no wandb. 
+# +# Use: +# uv run python applications/dynacell/tools/submit_benchmark_job.py \ +# applications/dynacell/configs/benchmarks/virtual_staining/er/fcmae_vscyto3d_scratch/joint_ipsc_confocal_a549_mantis/train_smoke_4gpu.yml +base: + - ../../../_internal/shared/model/model_overlays/fcmae_vscyto3d_fit.yml + - ../../../_internal/shared/model/launcher_profiles/mode_fit.yml + - ../../../_internal/shared/model/launcher_profiles/hardware_4gpu.yml + - ../../../_internal/shared/model/launcher_profiles/runtime_shared.yml + +benchmark: + task: virtual_staining + organelle: er + gene: SEC61B + target: er + target_id: er_sec61b + train_set: joint_ipsc_confocal_a549_mantis + model_name: fcmae_vscyto3d_scratch + experiment_id: er__joint_ipsc_confocal_a549_mantis__fcmae_vscyto3d_scratch__smoke4gpu + +trainer: + max_steps: 2 + limit_train_batches: 2 + limit_val_batches: 2 + num_sanity_val_steps: 2 + enable_checkpointing: false + logger: false + callbacks: [] + +_hcs_init_args: &hcs_init_args + source_channel: Phase3D + target_channel: Structure + z_window_size: 20 + batch_size: 8 + num_workers: 2 + yx_patch_size: [384, 384] + split_ratio: 0.8 + mmap_preload: false + persistent_workers: false + normalizations: + - class_path: viscy_transforms.NormalizeSampled + init_args: + keys: [Phase3D] + level: fov_statistics + subtrahend: mean + divisor: std + - class_path: viscy_transforms.NormalizeSampled + init_args: + keys: [Structure] + level: fov_statistics + subtrahend: median + divisor: iqr + augmentations: + - class_path: viscy_transforms.RandWeightedCropd + init_args: + keys: [Phase3D, Structure] + w_key: Structure + spatial_size: [20, 600, 600] + num_samples: 4 + gpu_augmentations: + - class_path: viscy_transforms.BatchedRandAffined + init_args: + keys: [source, target] + prob: 0.8 + rotate_range: [3.14, 0, 0] + shear_range: [0.0, 0.05, 0.05] + scale_range: [[0.7, 1.3], [0.5, 1.5], [0.5, 1.5]] + - class_path: viscy_transforms.BatchedCenterSpatialCropd + init_args: + keys: [source, 
target] + roi_size: [15, 384, 384] + val_gpu_augmentations: + - class_path: viscy_transforms.BatchedCenterSpatialCropd + init_args: + keys: [source, target] + roi_size: [15, 384, 384] + +data: + class_path: viscy_data.BatchedConcatDataModule + init_args: + data_modules: + - class_path: viscy_data.hcs.HCSDataModule + init_args: + <<: *hcs_init_args + data_path: /hpc/projects/virtual_staining/training/dynacell/ipsc/dataset_v4/train/SEC61B_test48.zarr + - class_path: viscy_data.hcs.HCSDataModule + init_args: + <<: *hcs_init_args + data_path: /hpc/projects/virtual_staining/training/dynacell/a549/mantis_v1/train/SEC61B_all.zarr + +launcher: + job_name: FCMAE_VSCyto3D_Scratch_JOINT_SEC61B_SMOKE4GPU + run_root: /hpc/mydata/alex.kalinin/VisCy/.tmp/fcmae_er_smoke_4gpu + sbatch: + time: "00:30:00" diff --git a/applications/dynacell/configs/benchmarks/virtual_staining/er/fnet3d_paper/a549_mantis/train.yml b/applications/dynacell/configs/benchmarks/virtual_staining/er/fnet3d_paper/a549_mantis/train.yml new file mode 100644 index 000000000..1b822de85 --- /dev/null +++ b/applications/dynacell/configs/benchmarks/virtual_staining/er/fnet3d_paper/a549_mantis/train.yml @@ -0,0 +1,51 @@ +# FNet3D paper-baseline fit on ER (SEC61B marker) — A549 mantis-lightsheet pooled (mock + DENV + ZIKV). +# Reproduces the trained run at +# /hpc/projects/comp.micro/virtual_staining/models/dynacell/a549_mantis/sec61b/fnet3d_paper/. 
+base: + - ../../../_internal/shared/model/train_sets/a549_mantis.yml + - ../../../_internal/shared/model/targets/er_sec61b.yml + - ../../../_internal/shared/model/data_overlays/fnet3d_paper_fit.yml + - ../../../_internal/shared/model/model_overlays/fnet3d_paper_fit.yml + - ../../../_internal/shared/model/launcher_profiles/mode_fit.yml + - ../../../_internal/shared/model/launcher_profiles/hardware_gpu_any_long.yml + - ../../../_internal/shared/model/launcher_profiles/runtime_shared.yml + +benchmark: + task: virtual_staining + organelle: er + train_set: a549_mantis + model_name: fnet3d_paper + experiment_id: er__a549_mantis__fnet3d_paper + +trainer: + logger: + init_args: + name: FNet3D_A549_SEC61B_paper + save_dir: /hpc/projects/comp.micro/virtual_staining/models/dynacell/a549_mantis/sec61b/fnet3d_paper + callbacks: + - class_path: lightning.pytorch.callbacks.LearningRateMonitor + init_args: + logging_interval: step + - class_path: lightning.pytorch.callbacks.ModelCheckpoint + init_args: + monitor: loss/validate + every_n_epochs: 1 + save_top_k: 4 + save_last: true + dirpath: /hpc/projects/comp.micro/virtual_staining/models/dynacell/a549_mantis/sec61b/fnet3d_paper/checkpoints + +data: + init_args: + # A549 pooled store + target_channel — no resolver in this train_set. + target_channel: Structure + data_path: /hpc/projects/virtual_staining/training/dynacell/a549/mantis_v1/train/SEC61B_all.zarr + +launcher: + job_name: FNet3DPaper_A549_SEC61B + run_root: /hpc/projects/comp.micro/virtual_staining/models/dynacell/a549_mantis/sec61b/fnet3d_paper + # 512G to match the shared headroom convention across the fnet3d + # leaves on a549/joint workloads. mmap_preload after the BasicIndexer + # fix peaks at ~75 GB for SEC61B_all alone (single-set); 512G gives + # generous headroom. 
+ sbatch: + mem: "512G" diff --git a/applications/dynacell/configs/benchmarks/virtual_staining/er/fnet3d_paper/ipsc_confocal/eval__a549_mantis_denv.yaml b/applications/dynacell/configs/benchmarks/virtual_staining/er/fnet3d_paper/ipsc_confocal/eval__a549_mantis_denv.yaml new file mode 100644 index 000000000..5b09da601 --- /dev/null +++ b/applications/dynacell/configs/benchmarks/virtual_staining/er/fnet3d_paper/ipsc_confocal/eval__a549_mantis_denv.yaml @@ -0,0 +1,15 @@ +# @package _global_ +# Benchmark eval leaf: ER (SEC61B) predicted by FNet3DPaper on a549-mantis-sec61b-denv. +defaults: + - override /target: er_sec61b + - override /predict_set: a549_mantis_sec61b_denv + +io: + pred_path: /hpc/projects/virtual_staining/training/dynacell/a549/predictions/sec61b_fnet3d_paper__sec61b_denv.zarr + +# A549 manifests don't carry cell_segmentation paths (no segmentation +# pipeline yet). Skip feature metrics until segmentation lands. +compute_feature_metrics: false + +save: + save_dir: /hpc/projects/virtual_staining/training/dynacell/a549/predictions/eval_sec61b_fnet3d_paper__sec61b_denv diff --git a/applications/dynacell/configs/benchmarks/virtual_staining/er/fnet3d_paper/ipsc_confocal/eval__a549_mantis_mock.yaml b/applications/dynacell/configs/benchmarks/virtual_staining/er/fnet3d_paper/ipsc_confocal/eval__a549_mantis_mock.yaml new file mode 100644 index 000000000..553e9bd00 --- /dev/null +++ b/applications/dynacell/configs/benchmarks/virtual_staining/er/fnet3d_paper/ipsc_confocal/eval__a549_mantis_mock.yaml @@ -0,0 +1,15 @@ +# @package _global_ +# Benchmark eval leaf: ER (SEC61B) predicted by FNet3DPaper on a549-mantis-sec61b-mock. +defaults: + - override /target: er_sec61b + - override /predict_set: a549_mantis_sec61b_mock + +io: + pred_path: /hpc/projects/virtual_staining/training/dynacell/a549/predictions/sec61b_fnet3d_paper__sec61b_mock.zarr + +# A549 manifests don't carry cell_segmentation paths (no segmentation +# pipeline yet). 
Skip feature metrics until segmentation lands. +compute_feature_metrics: false + +save: + save_dir: /hpc/projects/virtual_staining/training/dynacell/a549/predictions/eval_sec61b_fnet3d_paper__sec61b_mock diff --git a/applications/dynacell/configs/benchmarks/virtual_staining/er/fnet3d_paper/ipsc_confocal/eval__a549_mantis_zikv.yaml b/applications/dynacell/configs/benchmarks/virtual_staining/er/fnet3d_paper/ipsc_confocal/eval__a549_mantis_zikv.yaml new file mode 100644 index 000000000..b1204a79e --- /dev/null +++ b/applications/dynacell/configs/benchmarks/virtual_staining/er/fnet3d_paper/ipsc_confocal/eval__a549_mantis_zikv.yaml @@ -0,0 +1,15 @@ +# @package _global_ +# Benchmark eval leaf: ER (SEC61B) predicted by FNet3DPaper on a549-mantis-sec61b-zikv. +defaults: + - override /target: er_sec61b + - override /predict_set: a549_mantis_sec61b_zikv + +io: + pred_path: /hpc/projects/virtual_staining/training/dynacell/a549/predictions/sec61b_fnet3d_paper__sec61b_zikv.zarr + +# A549 manifests don't carry cell_segmentation paths (no segmentation +# pipeline yet). Skip feature metrics until segmentation lands. +compute_feature_metrics: false + +save: + save_dir: /hpc/projects/virtual_staining/training/dynacell/a549/predictions/eval_sec61b_fnet3d_paper__sec61b_zikv diff --git a/applications/dynacell/configs/benchmarks/virtual_staining/er/fnet3d_paper/ipsc_confocal/predict__a549_mantis_denv.yml b/applications/dynacell/configs/benchmarks/virtual_staining/er/fnet3d_paper/ipsc_confocal/predict__a549_mantis_denv.yml new file mode 100644 index 000000000..a84199039 --- /dev/null +++ b/applications/dynacell/configs/benchmarks/virtual_staining/er/fnet3d_paper/ipsc_confocal/predict__a549_mantis_denv.yml @@ -0,0 +1,42 @@ +# FNet3D paper-baseline predict: ER (SEC61B) trained on iPSC, predicting against a549_mantis_sec61b_denv test. +# Same iPSC best val-loss checkpoint as predict__ipsc_confocal.yml (epoch 183, loss/validate=0.5991). 
+base: + - ../../../_internal/shared/model/predict_sets/a549_mantis_sec61b_denv.yml + - ../../../_internal/shared/model/targets/er_sec61b.yml + - ../../../_internal/shared/model/model_overlays/fnet3d_paper_predict.yml + - ../../../_internal/shared/model/launcher_profiles/mode_predict.yml + - ../../../_internal/shared/model/launcher_profiles/hardware_h200_single.yml + - ../../../_internal/shared/model/launcher_profiles/runtime_shared.yml + +benchmark: + task: virtual_staining + organelle: er + trained_on: ipsc_confocal + predict_set: a549_mantis_sec61b_denv + model_name: fnet3d_paper + experiment_id: er__ipsc_confocal__fnet3d_paper__a549_mantis_sec61b_denv + +model: + init_args: + ckpt_path: /hpc/projects/comp.micro/virtual_staining/models/dynacell/ipsc/sec61b/fnet3d_paper/checkpoints/epoch=183-step=134688.ckpt + +data: + init_args: + normalizations: + - class_path: viscy_transforms.NormalizeSampled + init_args: + keys: [Phase3D] + level: fov_statistics + subtrahend: mean + divisor: std + augmentations: [] + +trainer: + callbacks: + - class_path: viscy_utils.callbacks.prediction_writer.HCSPredictionWriter + init_args: + output_store: /hpc/projects/virtual_staining/training/dynacell/a549/predictions/sec61b_fnet3d_paper__sec61b_denv.zarr + +launcher: + job_name: FNet3DPaper_PRED_SEC61B_ON_A549_sec61b_denv + run_root: /hpc/projects/virtual_staining/training/dynacell/a549/predictions diff --git a/applications/dynacell/configs/benchmarks/virtual_staining/er/fnet3d_paper/ipsc_confocal/predict__a549_mantis_mock.yml b/applications/dynacell/configs/benchmarks/virtual_staining/er/fnet3d_paper/ipsc_confocal/predict__a549_mantis_mock.yml new file mode 100644 index 000000000..8821e61af --- /dev/null +++ b/applications/dynacell/configs/benchmarks/virtual_staining/er/fnet3d_paper/ipsc_confocal/predict__a549_mantis_mock.yml @@ -0,0 +1,42 @@ +# FNet3D paper-baseline predict: ER (SEC61B) trained on iPSC, predicting against a549_mantis_sec61b_mock test. 
+# Same iPSC best val-loss checkpoint as predict__ipsc_confocal.yml (epoch 183, loss/validate=0.5991). +base: + - ../../../_internal/shared/model/predict_sets/a549_mantis_sec61b_mock.yml + - ../../../_internal/shared/model/targets/er_sec61b.yml + - ../../../_internal/shared/model/model_overlays/fnet3d_paper_predict.yml + - ../../../_internal/shared/model/launcher_profiles/mode_predict.yml + - ../../../_internal/shared/model/launcher_profiles/hardware_h200_single.yml + - ../../../_internal/shared/model/launcher_profiles/runtime_shared.yml + +benchmark: + task: virtual_staining + organelle: er + trained_on: ipsc_confocal + predict_set: a549_mantis_sec61b_mock + model_name: fnet3d_paper + experiment_id: er__ipsc_confocal__fnet3d_paper__a549_mantis_sec61b_mock + +model: + init_args: + ckpt_path: /hpc/projects/comp.micro/virtual_staining/models/dynacell/ipsc/sec61b/fnet3d_paper/checkpoints/epoch=183-step=134688.ckpt + +data: + init_args: + normalizations: + - class_path: viscy_transforms.NormalizeSampled + init_args: + keys: [Phase3D] + level: fov_statistics + subtrahend: mean + divisor: std + augmentations: [] + +trainer: + callbacks: + - class_path: viscy_utils.callbacks.prediction_writer.HCSPredictionWriter + init_args: + output_store: /hpc/projects/virtual_staining/training/dynacell/a549/predictions/sec61b_fnet3d_paper__sec61b_mock.zarr + +launcher: + job_name: FNet3DPaper_PRED_SEC61B_ON_A549_sec61b_mock + run_root: /hpc/projects/virtual_staining/training/dynacell/a549/predictions diff --git a/applications/dynacell/configs/benchmarks/virtual_staining/er/fnet3d_paper/ipsc_confocal/predict__a549_mantis_zikv.yml b/applications/dynacell/configs/benchmarks/virtual_staining/er/fnet3d_paper/ipsc_confocal/predict__a549_mantis_zikv.yml new file mode 100644 index 000000000..0d72506d5 --- /dev/null +++ b/applications/dynacell/configs/benchmarks/virtual_staining/er/fnet3d_paper/ipsc_confocal/predict__a549_mantis_zikv.yml @@ -0,0 +1,42 @@ +# FNet3D paper-baseline predict: ER 
(SEC61B) trained on iPSC, predicting against a549_mantis_sec61b_zikv test. +# Same iPSC best val-loss checkpoint as predict__ipsc_confocal.yml (epoch 183, loss/validate=0.5991). +base: + - ../../../_internal/shared/model/predict_sets/a549_mantis_sec61b_zikv.yml + - ../../../_internal/shared/model/targets/er_sec61b.yml + - ../../../_internal/shared/model/model_overlays/fnet3d_paper_predict.yml + - ../../../_internal/shared/model/launcher_profiles/mode_predict.yml + - ../../../_internal/shared/model/launcher_profiles/hardware_h200_single.yml + - ../../../_internal/shared/model/launcher_profiles/runtime_shared.yml + +benchmark: + task: virtual_staining + organelle: er + trained_on: ipsc_confocal + predict_set: a549_mantis_sec61b_zikv + model_name: fnet3d_paper + experiment_id: er__ipsc_confocal__fnet3d_paper__a549_mantis_sec61b_zikv + +model: + init_args: + ckpt_path: /hpc/projects/comp.micro/virtual_staining/models/dynacell/ipsc/sec61b/fnet3d_paper/checkpoints/epoch=183-step=134688.ckpt + +data: + init_args: + normalizations: + - class_path: viscy_transforms.NormalizeSampled + init_args: + keys: [Phase3D] + level: fov_statistics + subtrahend: mean + divisor: std + augmentations: [] + +trainer: + callbacks: + - class_path: viscy_utils.callbacks.prediction_writer.HCSPredictionWriter + init_args: + output_store: /hpc/projects/virtual_staining/training/dynacell/a549/predictions/sec61b_fnet3d_paper__sec61b_zikv.zarr + +launcher: + job_name: FNet3DPaper_PRED_SEC61B_ON_A549_sec61b_zikv + run_root: /hpc/projects/virtual_staining/training/dynacell/a549/predictions diff --git a/applications/dynacell/configs/benchmarks/virtual_staining/er/fnet3d_paper/ipsc_confocal/predict__ipsc_confocal.yml b/applications/dynacell/configs/benchmarks/virtual_staining/er/fnet3d_paper/ipsc_confocal/predict__ipsc_confocal.yml new file mode 100644 index 000000000..01395bee1 --- /dev/null +++ 
b/applications/dynacell/configs/benchmarks/virtual_staining/er/fnet3d_paper/ipsc_confocal/predict__ipsc_confocal.yml @@ -0,0 +1,42 @@ +# FNet3D paper-baseline predict: ER (SEC61B) against ipsc_confocal test_cropped. +# Uses best val-loss checkpoint (epoch 183, loss/validate=0.5991). +base: + - ../../../_internal/shared/model/predict_sets/ipsc_confocal.yml + - ../../../_internal/shared/model/targets/er_sec61b.yml + - ../../../_internal/shared/model/model_overlays/fnet3d_paper_predict.yml + - ../../../_internal/shared/model/launcher_profiles/mode_predict.yml + - ../../../_internal/shared/model/launcher_profiles/hardware_h200_single.yml + - ../../../_internal/shared/model/launcher_profiles/runtime_shared.yml + +benchmark: + task: virtual_staining + organelle: er + trained_on: ipsc_confocal + predict_set: ipsc_confocal + model_name: fnet3d_paper + experiment_id: er__ipsc_confocal__fnet3d_paper__ipsc_confocal + +model: + init_args: + ckpt_path: /hpc/projects/comp.micro/virtual_staining/models/dynacell/ipsc/sec61b/fnet3d_paper/checkpoints/epoch=183-step=134688.ckpt + +data: + init_args: + normalizations: + - class_path: viscy_transforms.NormalizeSampled + init_args: + keys: [Phase3D] + level: fov_statistics + subtrahend: mean + divisor: std + augmentations: [] + +trainer: + callbacks: + - class_path: viscy_utils.callbacks.prediction_writer.HCSPredictionWriter + init_args: + output_store: /hpc/projects/virtual_staining/training/dynacell/ipsc/predictions/sec61b_fnet3d_paper.zarr + +launcher: + job_name: FNet3DPaper_PRED_SEC61B + run_root: /hpc/projects/virtual_staining/training/dynacell/ipsc/predictions diff --git a/applications/dynacell/configs/benchmarks/virtual_staining/er/fnet3d_paper/ipsc_confocal/train.yml b/applications/dynacell/configs/benchmarks/virtual_staining/er/fnet3d_paper/ipsc_confocal/train.yml new file mode 100644 index 000000000..3d55701de --- /dev/null +++ b/applications/dynacell/configs/benchmarks/virtual_staining/er/fnet3d_paper/ipsc_confocal/train.yml 
@@ -0,0 +1,39 @@ +# FNet3D paper-baseline fit on ER (SEC61B marker) — AICS iPSC confocal. +# Reproduces the trained run at +# /hpc/projects/comp.micro/virtual_staining/models/dynacell/ipsc/sec61b/fnet3d_paper/. +base: + - ../../../_internal/shared/model/train_sets/ipsc_confocal.yml + - ../../../_internal/shared/model/targets/er_sec61b.yml + - ../../../_internal/shared/model/data_overlays/fnet3d_paper_fit.yml + - ../../../_internal/shared/model/model_overlays/fnet3d_paper_fit.yml + - ../../../_internal/shared/model/launcher_profiles/mode_fit.yml + - ../../../_internal/shared/model/launcher_profiles/hardware_gpu_any_long.yml + - ../../../_internal/shared/model/launcher_profiles/runtime_shared.yml + +benchmark: + task: virtual_staining + organelle: er + train_set: ipsc_confocal + model_name: fnet3d_paper + experiment_id: er__ipsc_confocal__fnet3d_paper + +trainer: + logger: + init_args: + name: FNet3D_iPSC_SEC61B_paper + save_dir: /hpc/projects/comp.micro/virtual_staining/models/dynacell/ipsc/sec61b/fnet3d_paper + callbacks: + - class_path: lightning.pytorch.callbacks.LearningRateMonitor + init_args: + logging_interval: step + - class_path: lightning.pytorch.callbacks.ModelCheckpoint + init_args: + monitor: loss/validate + every_n_epochs: 1 + save_top_k: 4 + save_last: true + dirpath: /hpc/projects/comp.micro/virtual_staining/models/dynacell/ipsc/sec61b/fnet3d_paper/checkpoints + +launcher: + job_name: FNet3DPaper_SEC61B + run_root: /hpc/projects/comp.micro/virtual_staining/models/dynacell/ipsc/sec61b/fnet3d_paper diff --git a/applications/dynacell/configs/benchmarks/virtual_staining/er/fnet3d_paper/joint_ipsc_confocal_a549_mantis/train.yml b/applications/dynacell/configs/benchmarks/virtual_staining/er/fnet3d_paper/joint_ipsc_confocal_a549_mantis/train.yml new file mode 100644 index 000000000..b430d6771 --- /dev/null +++ b/applications/dynacell/configs/benchmarks/virtual_staining/er/fnet3d_paper/joint_ipsc_confocal_a549_mantis/train.yml @@ -0,0 +1,123 @@ +# FNet3D 
paper-baseline fit on er (SEC61B) — joint +# ipsc_confocal + a549_mantis pooled. Mirrors +# er/fnet3d_paper/ipsc_confocal/train.yml on the joint +# train_set. +# +# Joint leaf per Stage 7 of A549_EXPANSION_ROADMAP.md. +# BatchedConcatDataModule + two explicit HCSDataModule children; +# only model_overlays/fnet3d_paper_fit.yml is composed; data block +# inline. Norms + 8-crops-per-FOV diverge from the CellDiff/UNetViT +# conventions: target channel uses mean/std (not median/iqr) and +# val augmentations are CPU CenterSpatialCropd on the raw keys (the +# baseline's training pipeline doesn't go through GPU val transforms). +# +# Topology: single GPU, any model, long wall — same as +# fnet3d_paper/ipsc_confocal/train.yml. The paper baseline is single-GPU +# and we keep that here so iPSC-only and joint runs are apples-to-apples. +base: + - ../../../_internal/shared/model/model_overlays/fnet3d_paper_fit.yml + - ../../../_internal/shared/model/launcher_profiles/mode_fit.yml + - ../../../_internal/shared/model/launcher_profiles/hardware_gpu_any_long.yml + - ../../../_internal/shared/model/launcher_profiles/runtime_shared.yml + +benchmark: + task: virtual_staining + organelle: er + gene: SEC61B + target: er + target_id: er_sec61b + train_set: joint_ipsc_confocal_a549_mantis + model_name: fnet3d_paper + experiment_id: er__joint_ipsc_confocal_a549_mantis__fnet3d_paper + +trainer: + logger: + init_args: + name: FNet3D_JOINT_SEC61B_paper + save_dir: /hpc/projects/comp.micro/virtual_staining/models/dynacell/joint_ipsc_confocal_a549_mantis/sec61b/fnet3d_paper + callbacks: + - class_path: lightning.pytorch.callbacks.LearningRateMonitor + init_args: + logging_interval: step + - class_path: lightning.pytorch.callbacks.ModelCheckpoint + init_args: + monitor: loss/validate + every_n_epochs: 1 + save_top_k: 4 + save_last: true + dirpath: /hpc/projects/comp.micro/virtual_staining/models/dynacell/joint_ipsc_confocal_a549_mantis/sec61b/fnet3d_paper/checkpoints + +_hcs_init_args: 
&hcs_init_args + source_channel: Phase3D + target_channel: Structure + z_window_size: 32 + # See nucleus/fnet3d_paper/joint_*/train.yml for the rationale: joint + # mode does not divide batch_size by num_samples (unlike single-set), + # so 6 * num_samples=8 = 48 GPU samples matches single-set effective. + batch_size: 6 + num_workers: 8 + yx_patch_size: [64, 64] + split_ratio: 0.8 + mmap_preload: true + scratch_dir: /dev/shm + persistent_workers: true + normalizations: + - class_path: viscy_transforms.NormalizeSampled + init_args: + keys: [Phase3D] + level: fov_statistics + subtrahend: mean + divisor: std + - class_path: viscy_transforms.NormalizeSampled + init_args: + keys: [Structure] + level: fov_statistics + subtrahend: mean + divisor: std + augmentations: + - class_path: viscy_transforms.RandWeightedCropd + init_args: + keys: [Phase3D, Structure] + w_key: Structure + spatial_size: [32, 64, 64] + num_samples: 8 + gpu_augmentations: + - class_path: viscy_transforms.BatchedRandFlipd + init_args: + keys: [source, target] + spatial_axes: [1] + prob: 0.5 + - class_path: viscy_transforms.BatchedRandFlipd + init_args: + keys: [source, target] + spatial_axes: [2] + prob: 0.5 + val_augmentations: + - class_path: viscy_transforms.CenterSpatialCropd + init_args: + keys: [Phase3D, Structure] + roi_size: [32, 64, 64] + +data: + class_path: viscy_data.BatchedConcatDataModule + init_args: + data_modules: + # ipsc_confocal — aics-hipsc SEC61B train store + - class_path: viscy_data.hcs.HCSDataModule + init_args: + <<: *hcs_init_args + data_path: /hpc/projects/virtual_staining/training/dynacell/ipsc/dataset_v4/train/SEC61B.zarr + # a549_mantis — pooled SEC61B all-conditions train store + - class_path: viscy_data.hcs.HCSDataModule + init_args: + <<: *hcs_init_args + data_path: /hpc/projects/virtual_staining/training/dynacell/a549/mantis_v1/train/SEC61B_all.zarr + +launcher: + job_name: FNet3DPaper_JOINT_SEC61B + run_root: 
/hpc/projects/comp.micro/virtual_staining/models/dynacell/joint_ipsc_confocal_a549_mantis/sec61b/fnet3d_paper + # Joint preloads two stores (iPSC + A549 pool) into /dev/shm; the default + # 256G cap is too tight (256G iPSC mem + ~50G A549 + worker peak OOMs). + # 512G is the smallest tier that fits joint preload + worker overhead. + sbatch: + mem: "512G" diff --git a/applications/dynacell/configs/benchmarks/virtual_staining/er/fnet3d_paper/joint_ipsc_confocal_a549_mantis/train_smoke.yml b/applications/dynacell/configs/benchmarks/virtual_staining/er/fnet3d_paper/joint_ipsc_confocal_a549_mantis/train_smoke.yml new file mode 100644 index 000000000..8659579f3 --- /dev/null +++ b/applications/dynacell/configs/benchmarks/virtual_staining/er/fnet3d_paper/joint_ipsc_confocal_a549_mantis/train_smoke.yml @@ -0,0 +1,122 @@ +# FNet3D paper-baseline joint smoke (single GPU, local interactive). +# Pairs the iPSC SEC61B_test12 zarr (12 FOVs, 2.4 GB) with the 4-FOV +# a549_mantis SEC61B store so a single A40 / H200 can iterate the joint +# loader end-to-end without the full ~250 GB iPSC SEC61B cache wait. +# +# Why a sibling leaf rather than --override at submit time: dotlist / +# bracket syntax (`data.init_args.data_modules.0.init_args.data_path=...`) +# does not index into list elements via submit_benchmark_job.py's override +# parser. Pre-swapping data_paths in a sibling leaf is the supported fix +# (same pattern as celldiff/joint_*/train_smoke.yml). 
+# +# Use: +# uv run python applications/dynacell/tools/submit_benchmark_job.py \ +# applications/dynacell/configs/benchmarks/virtual_staining/er/fnet3d_paper/joint_ipsc_confocal_a549_mantis/train_smoke.yml \ +# --dry-run --print-resolved > /tmp/fnet_joint_smoke.yaml +# uv run python -m dynacell fit --config /tmp/fnet_joint_smoke.yaml +base: + - ../../../_internal/shared/model/model_overlays/fnet3d_paper_fit.yml + - ../../../_internal/shared/model/launcher_profiles/mode_fit.yml + - ../../../_internal/shared/model/launcher_profiles/hardware_h200_single.yml + - ../../../_internal/shared/model/launcher_profiles/runtime_shared.yml + +benchmark: + task: virtual_staining + organelle: er + gene: SEC61B + target: er + target_id: er_sec61b + train_set: joint_ipsc_confocal_a549_mantis + model_name: fnet3d_paper + experiment_id: er__joint_ipsc_confocal_a549_mantis__fnet3d_paper__smoke + +trainer: + # Bound the run for a smoke; smoke doesn't need wandb logging. + max_steps: 10 + val_check_interval: 5 + limit_val_batches: 2 + logger: false + callbacks: + - class_path: lightning.pytorch.callbacks.ModelCheckpoint + init_args: + every_n_epochs: 1 + save_top_k: -1 + save_last: true + dirpath: /tmp/fnet_joint_smoke/checkpoints + +# `_`-prefixed top-level keys are stripped by load_composed_config. +_hcs_init_args: &hcs_init_args + source_channel: [Phase3D] + target_channel: [Structure] + z_window_size: 32 + # bs=8 conservatively fits a single A40 with FNet3D fp32 / 32×64×64 + # patches; bump to bs=16 or bs=32 if memory headroom is plentiful. + # bs % num_samples == 0 is required by HCSDataModule. + batch_size: 8 + # num_workers: 0 + pin_memory: false avoids fork-after-CUDA + pin-thread + # races that can deadlock the joint dataloader's first iter() on + # interactive GPU nodes. Plenty fast for a 10-step smoke; tune up + # (e.g. 2-4 workers, pin_memory=true) for full training. 
+ num_workers: 0 + pin_memory: false + yx_patch_size: [64, 64] + split_ratio: 0.8 + mmap_preload: false + persistent_workers: false + normalizations: + - class_path: viscy_transforms.NormalizeSampled + init_args: + keys: [Phase3D] + level: fov_statistics + subtrahend: mean + divisor: std + - class_path: viscy_transforms.NormalizeSampled + init_args: + keys: [Structure] + level: fov_statistics + subtrahend: mean + divisor: std + augmentations: + - class_path: viscy_transforms.RandWeightedCropd + init_args: + keys: [Phase3D, Structure] + w_key: Structure + spatial_size: [32, 64, 64] + num_samples: 8 + gpu_augmentations: + - class_path: viscy_transforms.BatchedRandFlipd + init_args: + keys: [source, target] + spatial_axes: [1] + prob: 0.5 + - class_path: viscy_transforms.BatchedRandFlipd + init_args: + keys: [source, target] + spatial_axes: [2] + prob: 0.5 + val_augmentations: + - class_path: viscy_transforms.CenterSpatialCropd + init_args: + keys: [Phase3D, Structure] + roi_size: [32, 64, 64] + +data: + class_path: viscy_data.BatchedConcatDataModule + init_args: + data_modules: + # ipsc_confocal — aics-hipsc SEC61B test12 zarr (12 FOVs, 2.4 GB). + - class_path: viscy_data.hcs.HCSDataModule + init_args: + <<: *hcs_init_args + data_path: /hpc/projects/virtual_staining/training/dynacell/ipsc/dataset_v4/train/SEC61B_test12.zarr + # a549_mantis — 2024_11_07 SEC61B train store (only 4 FOVs). + - class_path: viscy_data.hcs.HCSDataModule + init_args: + <<: *hcs_init_args + data_path: /hpc/projects/virtual_staining/training/dynacell/a549/mantis_v1/train/SEC61B_all.zarr + +# launcher block kept minimal — local smoke isn't submitted via sbatch. +# Set so submit_benchmark_job.py --dry-run still composes successfully. 
+launcher: + job_name: FNet3DPaper_JOINT_SEC61B_SMOKE_LOCAL + run_root: /tmp/fnet_joint_smoke diff --git a/applications/dynacell/configs/benchmarks/virtual_staining/er/unetvit3d/a549_mantis/train.yml b/applications/dynacell/configs/benchmarks/virtual_staining/er/unetvit3d/a549_mantis/train.yml new file mode 100644 index 000000000..8667ebe44 --- /dev/null +++ b/applications/dynacell/configs/benchmarks/virtual_staining/er/unetvit3d/a549_mantis/train.yml @@ -0,0 +1,43 @@ +# UNetViT3D fit on ER (SEC61B marker) — A549 mantis-lightsheet pooled (mock + DENV + ZIKV). +base: + - ../../../_internal/shared/model/train_sets/a549_mantis.yml + - ../../../_internal/shared/model/targets/er_sec61b.yml + - ../../../_internal/shared/model/data_overlays/unetvit3d_fit.yml + - ../../../_internal/shared/model/model_overlays/unetvit3d_fit.yml + - ../../../_internal/shared/model/launcher_profiles/mode_fit.yml + - ../../../_internal/shared/model/launcher_profiles/hardware_h200_single.yml + - ../../../_internal/shared/model/launcher_profiles/runtime_shared.yml + +benchmark: + task: virtual_staining + organelle: er + train_set: a549_mantis + model_name: unetvit3d + experiment_id: er__a549_mantis__unetvit3d + +trainer: + logger: + init_args: + name: UNetViT3D_A549_SEC61B + save_dir: /hpc/projects/comp.micro/virtual_staining/models/cell_diff_vs_viscy/a549_mantis/sec61b/unetvit3d + callbacks: + - class_path: lightning.pytorch.callbacks.LearningRateMonitor + init_args: + logging_interval: step + - class_path: lightning.pytorch.callbacks.ModelCheckpoint + init_args: + monitor: loss/validate + every_n_epochs: 1 + save_top_k: 4 + save_last: true + dirpath: /hpc/projects/comp.micro/virtual_staining/models/cell_diff_vs_viscy/a549_mantis/sec61b/unetvit3d/checkpoints + +data: + init_args: + # A549 pooled store + target_channel — no resolver in this train_set. 
+ target_channel: Structure + data_path: /hpc/projects/virtual_staining/training/dynacell/a549/mantis_v1/train/SEC61B_all.zarr + +launcher: + job_name: UNetViT3D_A549_SEC61B + run_root: /hpc/projects/comp.micro/virtual_staining/models/cell_diff_vs_viscy/a549_mantis/sec61b/unetvit3d diff --git a/applications/dynacell/configs/benchmarks/virtual_staining/er/unetvit3d/ipsc_confocal/eval__a549_mantis_denv.yaml b/applications/dynacell/configs/benchmarks/virtual_staining/er/unetvit3d/ipsc_confocal/eval__a549_mantis_denv.yaml new file mode 100644 index 000000000..8bc4b6c11 --- /dev/null +++ b/applications/dynacell/configs/benchmarks/virtual_staining/er/unetvit3d/ipsc_confocal/eval__a549_mantis_denv.yaml @@ -0,0 +1,15 @@ +# @package _global_ +# Benchmark eval leaf: ER (SEC61B) predicted by UNetViT3D on a549-mantis-sec61b-denv. +defaults: + - override /target: er_sec61b + - override /predict_set: a549_mantis_sec61b_denv + +io: + pred_path: /hpc/projects/virtual_staining/training/dynacell/a549/predictions/sec61b_unetvit3d__sec61b_denv.zarr + +# A549 manifests don't carry cell_segmentation paths (no segmentation +# pipeline yet). Skip feature metrics until segmentation lands. +compute_feature_metrics: false + +save: + save_dir: /hpc/projects/virtual_staining/training/dynacell/a549/predictions/eval_sec61b_unetvit3d__sec61b_denv diff --git a/applications/dynacell/configs/benchmarks/virtual_staining/er/unetvit3d/ipsc_confocal/eval__a549_mantis_mock.yaml b/applications/dynacell/configs/benchmarks/virtual_staining/er/unetvit3d/ipsc_confocal/eval__a549_mantis_mock.yaml new file mode 100644 index 000000000..415e93943 --- /dev/null +++ b/applications/dynacell/configs/benchmarks/virtual_staining/er/unetvit3d/ipsc_confocal/eval__a549_mantis_mock.yaml @@ -0,0 +1,15 @@ +# @package _global_ +# Benchmark eval leaf: ER (SEC61B) predicted by UNetViT3D on a549-mantis-sec61b-mock. 
+defaults: + - override /target: er_sec61b + - override /predict_set: a549_mantis_sec61b_mock + +io: + pred_path: /hpc/projects/virtual_staining/training/dynacell/a549/predictions/sec61b_unetvit3d__sec61b_mock.zarr + +# A549 manifests don't carry cell_segmentation paths (no segmentation +# pipeline yet). Skip feature metrics until segmentation lands. +compute_feature_metrics: false + +save: + save_dir: /hpc/projects/virtual_staining/training/dynacell/a549/predictions/eval_sec61b_unetvit3d__sec61b_mock diff --git a/applications/dynacell/configs/benchmarks/virtual_staining/er/unetvit3d/ipsc_confocal/eval__a549_mantis_zikv.yaml b/applications/dynacell/configs/benchmarks/virtual_staining/er/unetvit3d/ipsc_confocal/eval__a549_mantis_zikv.yaml new file mode 100644 index 000000000..1439255be --- /dev/null +++ b/applications/dynacell/configs/benchmarks/virtual_staining/er/unetvit3d/ipsc_confocal/eval__a549_mantis_zikv.yaml @@ -0,0 +1,15 @@ +# @package _global_ +# Benchmark eval leaf: ER (SEC61B) predicted by UNetViT3D on a549-mantis-sec61b-zikv. +defaults: + - override /target: er_sec61b + - override /predict_set: a549_mantis_sec61b_zikv + +io: + pred_path: /hpc/projects/virtual_staining/training/dynacell/a549/predictions/sec61b_unetvit3d__sec61b_zikv.zarr + +# A549 manifests don't carry cell_segmentation paths (no segmentation +# pipeline yet). Skip feature metrics until segmentation lands. 
+compute_feature_metrics: false + +save: + save_dir: /hpc/projects/virtual_staining/training/dynacell/a549/predictions/eval_sec61b_unetvit3d__sec61b_zikv diff --git a/applications/dynacell/configs/benchmarks/virtual_staining/er/unetvit3d/ipsc_confocal/eval__ipsc_confocal.yaml b/applications/dynacell/configs/benchmarks/virtual_staining/er/unetvit3d/ipsc_confocal/eval__ipsc_confocal.yaml new file mode 100644 index 000000000..8f4329e08 --- /dev/null +++ b/applications/dynacell/configs/benchmarks/virtual_staining/er/unetvit3d/ipsc_confocal/eval__ipsc_confocal.yaml @@ -0,0 +1,13 @@ +# @package _global_ +# Benchmark eval leaf: ER (SEC61B) predicted by UNetViT3D on iPSC confocal. +defaults: + - override /target: er_sec61b + - override /predict_set: ipsc_confocal + +io: + pred_path: /hpc/projects/virtual_staining/training/dynacell/ipsc/predictions/sec61b_unetvit3d.zarr + +compute_feature_metrics: true + +save: + save_dir: /hpc/projects/virtual_staining/training/dynacell/ipsc/predictions/eval_sec61b_unetvit3d diff --git a/applications/dynacell/configs/benchmarks/virtual_staining/er/unetvit3d/ipsc_confocal/predict__a549_mantis_denv.yml b/applications/dynacell/configs/benchmarks/virtual_staining/er/unetvit3d/ipsc_confocal/predict__a549_mantis_denv.yml new file mode 100644 index 000000000..e06039b8e --- /dev/null +++ b/applications/dynacell/configs/benchmarks/virtual_staining/er/unetvit3d/ipsc_confocal/predict__a549_mantis_denv.yml @@ -0,0 +1,43 @@ +# UNetViT3D predict: ER (SEC61B) trained on iPSC, predicting against a549_mantis_sec61b_denv test. 
+base: + - ../../../_internal/shared/model/predict_sets/a549_mantis_sec61b_denv.yml + - ../../../_internal/shared/model/targets/er_sec61b.yml + - ../../../_internal/shared/model/model_overlays/unetvit3d_predict.yml + - ../../../_internal/shared/model/launcher_profiles/mode_predict.yml + - ../../../_internal/shared/model/launcher_profiles/hardware_h200_single.yml + - ../../../_internal/shared/model/launcher_profiles/runtime_shared.yml + +benchmark: + task: virtual_staining + organelle: er + trained_on: ipsc_confocal + predict_set: a549_mantis_sec61b_denv + model_name: unetvit3d + experiment_id: er__ipsc_confocal__unetvit3d__a549_mantis_sec61b_denv + +model: + init_args: + ckpt_path: /hpc/projects/comp.micro/virtual_staining/models/cell_diff_vs_viscy/ipsc/sec61b/unetvit3d/checkpoints/last.ckpt + +data: + init_args: + # override target-inherited normalizations: predict only reads source + normalizations: + - class_path: viscy_transforms.NormalizeSampled + init_args: + keys: [Phase3D] + level: fov_statistics + subtrahend: mean + divisor: std + # clear target-inherited RandWeightedCropd; predict has no CPU augs + augmentations: [] + +trainer: + callbacks: + - class_path: viscy_utils.callbacks.prediction_writer.HCSPredictionWriter + init_args: + output_store: /hpc/projects/virtual_staining/training/dynacell/a549/predictions/sec61b_unetvit3d__sec61b_denv.zarr + +launcher: + job_name: UNetViT3D_PRED_SEC61B_ON_A549_sec61b_denv + run_root: /hpc/projects/virtual_staining/training/dynacell/a549/predictions diff --git a/applications/dynacell/configs/benchmarks/virtual_staining/er/unetvit3d/ipsc_confocal/predict__a549_mantis_mock.yml b/applications/dynacell/configs/benchmarks/virtual_staining/er/unetvit3d/ipsc_confocal/predict__a549_mantis_mock.yml new file mode 100644 index 000000000..a74e2026a --- /dev/null +++ b/applications/dynacell/configs/benchmarks/virtual_staining/er/unetvit3d/ipsc_confocal/predict__a549_mantis_mock.yml @@ -0,0 +1,43 @@ +# UNetViT3D predict: ER (SEC61B) 
trained on iPSC, predicting against a549_mantis_sec61b_mock test. +base: + - ../../../_internal/shared/model/predict_sets/a549_mantis_sec61b_mock.yml + - ../../../_internal/shared/model/targets/er_sec61b.yml + - ../../../_internal/shared/model/model_overlays/unetvit3d_predict.yml + - ../../../_internal/shared/model/launcher_profiles/mode_predict.yml + - ../../../_internal/shared/model/launcher_profiles/hardware_h200_single.yml + - ../../../_internal/shared/model/launcher_profiles/runtime_shared.yml + +benchmark: + task: virtual_staining + organelle: er + trained_on: ipsc_confocal + predict_set: a549_mantis_sec61b_mock + model_name: unetvit3d + experiment_id: er__ipsc_confocal__unetvit3d__a549_mantis_sec61b_mock + +model: + init_args: + ckpt_path: /hpc/projects/comp.micro/virtual_staining/models/cell_diff_vs_viscy/ipsc/sec61b/unetvit3d/checkpoints/last.ckpt + +data: + init_args: + # override target-inherited normalizations: predict only reads source + normalizations: + - class_path: viscy_transforms.NormalizeSampled + init_args: + keys: [Phase3D] + level: fov_statistics + subtrahend: mean + divisor: std + # clear target-inherited RandWeightedCropd; predict has no CPU augs + augmentations: [] + +trainer: + callbacks: + - class_path: viscy_utils.callbacks.prediction_writer.HCSPredictionWriter + init_args: + output_store: /hpc/projects/virtual_staining/training/dynacell/a549/predictions/sec61b_unetvit3d__sec61b_mock.zarr + +launcher: + job_name: UNetViT3D_PRED_SEC61B_ON_A549_sec61b_mock + run_root: /hpc/projects/virtual_staining/training/dynacell/a549/predictions diff --git a/applications/dynacell/configs/benchmarks/virtual_staining/er/unetvit3d/ipsc_confocal/predict__a549_mantis_zikv.yml b/applications/dynacell/configs/benchmarks/virtual_staining/er/unetvit3d/ipsc_confocal/predict__a549_mantis_zikv.yml new file mode 100644 index 000000000..19b0939bb --- /dev/null +++ 
b/applications/dynacell/configs/benchmarks/virtual_staining/er/unetvit3d/ipsc_confocal/predict__a549_mantis_zikv.yml @@ -0,0 +1,43 @@ +# UNetViT3D predict: ER (SEC61B) trained on iPSC, predicting against a549_mantis_sec61b_zikv test. +base: + - ../../../_internal/shared/model/predict_sets/a549_mantis_sec61b_zikv.yml + - ../../../_internal/shared/model/targets/er_sec61b.yml + - ../../../_internal/shared/model/model_overlays/unetvit3d_predict.yml + - ../../../_internal/shared/model/launcher_profiles/mode_predict.yml + - ../../../_internal/shared/model/launcher_profiles/hardware_h200_single.yml + - ../../../_internal/shared/model/launcher_profiles/runtime_shared.yml + +benchmark: + task: virtual_staining + organelle: er + trained_on: ipsc_confocal + predict_set: a549_mantis_sec61b_zikv + model_name: unetvit3d + experiment_id: er__ipsc_confocal__unetvit3d__a549_mantis_sec61b_zikv + +model: + init_args: + ckpt_path: /hpc/projects/comp.micro/virtual_staining/models/cell_diff_vs_viscy/ipsc/sec61b/unetvit3d/checkpoints/last.ckpt + +data: + init_args: + # override target-inherited normalizations: predict only reads source + normalizations: + - class_path: viscy_transforms.NormalizeSampled + init_args: + keys: [Phase3D] + level: fov_statistics + subtrahend: mean + divisor: std + # clear target-inherited RandWeightedCropd; predict has no CPU augs + augmentations: [] + +trainer: + callbacks: + - class_path: viscy_utils.callbacks.prediction_writer.HCSPredictionWriter + init_args: + output_store: /hpc/projects/virtual_staining/training/dynacell/a549/predictions/sec61b_unetvit3d__sec61b_zikv.zarr + +launcher: + job_name: UNetViT3D_PRED_SEC61B_ON_A549_sec61b_zikv + run_root: /hpc/projects/virtual_staining/training/dynacell/a549/predictions diff --git a/applications/dynacell/configs/benchmarks/virtual_staining/er/unetvit3d/ipsc_confocal/predict__ipsc_confocal.yml b/applications/dynacell/configs/benchmarks/virtual_staining/er/unetvit3d/ipsc_confocal/predict__ipsc_confocal.yml new 
file mode 100644 index 000000000..0a021b3d0 --- /dev/null +++ b/applications/dynacell/configs/benchmarks/virtual_staining/er/unetvit3d/ipsc_confocal/predict__ipsc_confocal.yml @@ -0,0 +1,43 @@ +# UNetViT3D predict: ER (SEC61B) against ipsc_confocal test_cropped. +base: + - ../../../_internal/shared/model/predict_sets/ipsc_confocal.yml + - ../../../_internal/shared/model/targets/er_sec61b.yml + - ../../../_internal/shared/model/model_overlays/unetvit3d_predict.yml + - ../../../_internal/shared/model/launcher_profiles/mode_predict.yml + - ../../../_internal/shared/model/launcher_profiles/hardware_h200_single.yml + - ../../../_internal/shared/model/launcher_profiles/runtime_shared.yml + +benchmark: + task: virtual_staining + organelle: er + trained_on: ipsc_confocal + predict_set: ipsc_confocal + model_name: unetvit3d + experiment_id: er__ipsc_confocal__unetvit3d__ipsc_confocal + +model: + init_args: + ckpt_path: /hpc/projects/comp.micro/virtual_staining/models/cell_diff_vs_viscy/ipsc/sec61b/unetvit3d/checkpoints/last.ckpt + +data: + init_args: + # override target-inherited normalizations: predict only reads source + normalizations: + - class_path: viscy_transforms.NormalizeSampled + init_args: + keys: [Phase3D] + level: fov_statistics + subtrahend: mean + divisor: std + # clear target-inherited RandWeightedCropd; predict has no CPU augs + augmentations: [] + +trainer: + callbacks: + - class_path: viscy_utils.callbacks.prediction_writer.HCSPredictionWriter + init_args: + output_store: /hpc/projects/virtual_staining/training/dynacell/ipsc/predictions/sec61b_unetvit3d.zarr + +launcher: + job_name: UNetViT3D_PRED_SEC61B + run_root: /hpc/projects/virtual_staining/training/dynacell/ipsc/predictions diff --git a/applications/dynacell/configs/benchmarks/virtual_staining/er/unetvit3d/ipsc_confocal/train.yml b/applications/dynacell/configs/benchmarks/virtual_staining/er/unetvit3d/ipsc_confocal/train.yml new file mode 100644 index 000000000..d0b03dfd2 --- /dev/null +++ 
b/applications/dynacell/configs/benchmarks/virtual_staining/er/unetvit3d/ipsc_confocal/train.yml @@ -0,0 +1,37 @@ +# UNetViT3D fit on ER (SEC61B marker) — AICS iPSC confocal. +base: + - ../../../_internal/shared/model/train_sets/ipsc_confocal.yml + - ../../../_internal/shared/model/targets/er_sec61b.yml + - ../../../_internal/shared/model/data_overlays/unetvit3d_fit.yml + - ../../../_internal/shared/model/model_overlays/unetvit3d_fit.yml + - ../../../_internal/shared/model/launcher_profiles/mode_fit.yml + - ../../../_internal/shared/model/launcher_profiles/hardware_h200_single.yml + - ../../../_internal/shared/model/launcher_profiles/runtime_shared.yml + +benchmark: + task: virtual_staining + organelle: er + train_set: ipsc_confocal + model_name: unetvit3d + experiment_id: er__ipsc_confocal__unetvit3d + +trainer: + logger: + init_args: + name: UNetViT3D_iPSC_SEC61B + save_dir: /hpc/projects/comp.micro/virtual_staining/models/cell_diff_vs_viscy/ipsc/sec61b/unetvit3d + callbacks: + - class_path: lightning.pytorch.callbacks.LearningRateMonitor + init_args: + logging_interval: step + - class_path: lightning.pytorch.callbacks.ModelCheckpoint + init_args: + monitor: loss/validate + every_n_epochs: 1 + save_top_k: 4 + save_last: true + dirpath: /hpc/projects/comp.micro/virtual_staining/models/cell_diff_vs_viscy/ipsc/sec61b/unetvit3d/checkpoints + +launcher: + job_name: UNetViT3D_SEC61B + run_root: /hpc/projects/comp.micro/virtual_staining/models/cell_diff_vs_viscy/ipsc/sec61b/unetvit3d diff --git a/applications/dynacell/configs/benchmarks/virtual_staining/er/unetvit3d/joint_ipsc_confocal_a549_mantis/train.yml b/applications/dynacell/configs/benchmarks/virtual_staining/er/unetvit3d/joint_ipsc_confocal_a549_mantis/train.yml new file mode 100644 index 000000000..7cc35bed1 --- /dev/null +++ b/applications/dynacell/configs/benchmarks/virtual_staining/er/unetvit3d/joint_ipsc_confocal_a549_mantis/train.yml @@ -0,0 +1,148 @@ +# UNetViT3D fit on ER (SEC61B) — joint ipsc_confocal + 
a549_mantis pooled. +# +# Joint leaf per Stage 7 of A549_EXPANSION_ROADMAP.md. Uses +# BatchedConcatDataModule with two explicit HCSDataModule children +# (no benchmark.dataset_ref — joint leaves bypass the single-dataset +# resolver). Only model_overlays/unetvit3d_fit.yml is composed; the data +# block is authored inline because joint hparams live on the children. +# +# Topology: single H200, single GPU — same as unetvit3d/ipsc_confocal/train.yml. +# The paper baseline pattern is single-GPU and we keep that here so +# iPSC-only and joint runs are apples-to-apples. +base: + - ../../../_internal/shared/model/model_overlays/unetvit3d_fit.yml + - ../../../_internal/shared/model/launcher_profiles/mode_fit.yml + - ../../../_internal/shared/model/launcher_profiles/hardware_h200_single.yml + - ../../../_internal/shared/model/launcher_profiles/runtime_shared.yml + +benchmark: + task: virtual_staining + organelle: er + gene: SEC61B + target: er + target_id: er_sec61b + train_set: joint_ipsc_confocal_a549_mantis + model_name: unetvit3d + experiment_id: er__joint_ipsc_confocal_a549_mantis__unetvit3d + +trainer: + logger: + init_args: + name: UNetViT3D_JOINT_SEC61B + save_dir: /hpc/projects/comp.micro/virtual_staining/models/cell_diff_vs_viscy/joint_ipsc_confocal_a549_mantis/sec61b/unetvit3d + callbacks: + - class_path: lightning.pytorch.callbacks.LearningRateMonitor + init_args: + logging_interval: step + - class_path: lightning.pytorch.callbacks.ModelCheckpoint + init_args: + monitor: loss/validate + every_n_epochs: 1 + save_top_k: 4 + save_last: true + dirpath: /hpc/projects/comp.micro/virtual_staining/models/cell_diff_vs_viscy/joint_ipsc_confocal_a549_mantis/sec61b/unetvit3d/checkpoints + +# Child HCSDataModule init_args shared across both datasets (only data_path +# differs). Factored as a YAML anchor so the joint leaf stays auditable. 
+# +# Naming convention: top-level keys starting with `_` are private to the +# YAML compose layer and are stripped by `load_composed_config` before +# the dict reaches LightningCLI / jsonargparse (which would reject them +# as unknown options). The merge expansion under `data:` survives. +_hcs_init_args: &hcs_init_args + source_channel: Phase3D + target_channel: Structure + z_window_size: 13 + batch_size: 4 + num_workers: 4 + yx_patch_size: [512, 512] + split_ratio: 0.8 + mmap_preload: true + scratch_dir: /dev/shm + persistent_workers: true + normalizations: + - class_path: viscy_transforms.NormalizeSampled + init_args: + keys: [Phase3D] + level: fov_statistics + subtrahend: mean + divisor: std + - class_path: viscy_transforms.NormalizeSampled + init_args: + keys: [Structure] + level: fov_statistics + subtrahend: median + divisor: iqr + augmentations: + - class_path: viscy_transforms.RandWeightedCropd + init_args: + keys: [Phase3D, Structure] + w_key: Structure + spatial_size: [13, 624, 624] + num_samples: 2 + gpu_augmentations: + - class_path: viscy_transforms.BatchedRandAffined + init_args: + keys: [source, target] + prob: 0.8 + rotate_range: [3.14, 0, 0] + shear_range: [0.0, 0.05, 0.05] + scale_range: [[0.7, 1.3], [0.5, 1.5], [0.5, 1.5]] + safe_crop_size: [8, 512, 512] + safe_crop_coverage: 0.9 + - class_path: viscy_transforms.BatchedCenterSpatialCropd + init_args: + keys: [source, target] + roi_size: [8, 512, 512] + - class_path: viscy_transforms.BatchedRandAdjustContrastd + init_args: + keys: [source] + prob: 0.5 + gamma: [0.8, 1.2] + - class_path: viscy_transforms.BatchedRandScaleIntensityd + init_args: + keys: [source] + prob: 0.5 + factors: 0.5 + - class_path: viscy_transforms.BatchedRandGaussianNoised + init_args: + keys: [source] + prob: 0.5 + mean: 0.0 + std: 0.3 + - class_path: viscy_transforms.BatchedRandGaussianSmoothd + init_args: + keys: [source] + prob: 0.5 + sigma_x: [0.25, 0.75] + sigma_y: [0.25, 0.75] + sigma_z: [0.25, 0.75] + 
val_gpu_augmentations: + - class_path: viscy_transforms.BatchedCenterSpatialCropd + init_args: + keys: [source, target] + roi_size: [8, 512, 512] + +data: + class_path: viscy_data.BatchedConcatDataModule + init_args: + data_modules: + # ipsc_confocal — aics-hipsc SEC61B train store + - class_path: viscy_data.hcs.HCSDataModule + init_args: + <<: *hcs_init_args + data_path: /hpc/projects/virtual_staining/training/dynacell/ipsc/dataset_v4/train/SEC61B.zarr + # a549_mantis — pooled SEC61B all-conditions train store + - class_path: viscy_data.hcs.HCSDataModule + init_args: + <<: *hcs_init_args + data_path: /hpc/projects/virtual_staining/training/dynacell/a549/mantis_v1/train/SEC61B_all.zarr + +launcher: + job_name: UNetViT3D_JOINT_SEC61B + run_root: /hpc/projects/comp.micro/virtual_staining/models/cell_diff_vs_viscy/joint_ipsc_confocal_a549_mantis/sec61b/unetvit3d + # Joint preloads two stores (iPSC + A549 pool) into /dev/shm; the default + # 256G cap is too tight (256G iPSC mem + ~50G A549 + worker peak OOMs). + # 512G is the smallest tier that fits joint preload + worker overhead. + sbatch: + mem: "512G" diff --git a/applications/dynacell/configs/benchmarks/virtual_staining/er/unext2/a549_mantis/train.yml b/applications/dynacell/configs/benchmarks/virtual_staining/er/unext2/a549_mantis/train.yml new file mode 100644 index 000000000..3b0e02a52 --- /dev/null +++ b/applications/dynacell/configs/benchmarks/virtual_staining/er/unext2/a549_mantis/train.yml @@ -0,0 +1,53 @@ +# Timm-backed UNeXt2 (viscy_models.unet.unext2:UNeXt2) supervised scratch +# baseline on ER/SEC61B — i.e. NOT FullyConvolutionalMAE(pretraining=False). +# This answers "how does the dynacell UNeXt2 recipe train at all?" — it is +# NOT the apples-to-apples scratch control for FCMAE-pretrained init. The +# FCMAE paper-adjacent scratch baseline lives at fcmae_vscyto3d_scratch.yml +# and uses a different model class. See +# applications/dynacell/configs/benchmarks/UNEXT2_VS_FCMAE_CLASSES.md. 
+#
+# Reuses the Run 4 hparams from wandb run 20260409-020023_UNeXt2_iPSC_SEC61B
+# (Dihan's Run 4, commit 46e4c79) on the a549_mantis train_set: lr=0.0004,
+# batch_size=32, z_window_size=20, 4-GPU DDP.
+# MixedLoss(L1 0.5 + DSSIM 0.5). max_epochs=200.
+base:
+  - ../../../_internal/shared/model/train_sets/a549_mantis.yml
+  - ../../../_internal/shared/model/targets/er_sec61b.yml
+  - ../../../_internal/shared/model/data_overlays/unext2_fit.yml
+  - ../../../_internal/shared/model/model_overlays/unext2_fit.yml
+  - ../../../_internal/shared/model/launcher_profiles/mode_fit.yml
+  - ../../../_internal/shared/model/launcher_profiles/hardware_4gpu.yml
+  - ../../../_internal/shared/model/launcher_profiles/runtime_shared.yml
+
+benchmark:
+  task: virtual_staining
+  organelle: er
+  train_set: a549_mantis
+  model_name: unext2_timm_scratch
+  experiment_id: er__a549_mantis__unext2_timm_scratch
+
+trainer:
+  logger:
+    init_args:
+      name: UNeXt2_A549_SEC61B
+      save_dir: /hpc/projects/comp.micro/virtual_staining/models/dynacell/a549_mantis/sec61b/unext2
+  callbacks:
+    - class_path: lightning.pytorch.callbacks.LearningRateMonitor
+      init_args:
+        logging_interval: step
+    - class_path: lightning.pytorch.callbacks.ModelCheckpoint
+      init_args:
+        monitor: loss/validate
+        every_n_epochs: 1
+        save_top_k: 5
+        save_last: true
+        dirpath: /hpc/projects/comp.micro/virtual_staining/models/dynacell/a549_mantis/sec61b/unext2/checkpoints
+
+data:
+  init_args:
+    # A549 pooled store + target_channel — no resolver in this train_set.
+ target_channel: Structure + data_path: /hpc/projects/virtual_staining/training/dynacell/a549/mantis_v1/train/SEC61B_all.zarr + +launcher: + job_name: UNeXt2_A549_SEC61B + run_root: /hpc/projects/comp.micro/virtual_staining/models/dynacell/a549_mantis/sec61b/unext2 diff --git a/applications/dynacell/configs/benchmarks/virtual_staining/er/unext2/ipsc_confocal/train.yml b/applications/dynacell/configs/benchmarks/virtual_staining/er/unext2/ipsc_confocal/train.yml new file mode 100644 index 000000000..9121f5692 --- /dev/null +++ b/applications/dynacell/configs/benchmarks/virtual_staining/er/unext2/ipsc_confocal/train.yml @@ -0,0 +1,47 @@ +# Timm-backed UNeXt2 (viscy_models.unet.unext2:UNeXt2) supervised scratch +# baseline on ER/SEC61B — i.e. NOT FullyConvolutionalMAE(pretraining=False). +# This answers "how does the dynacell UNeXt2 recipe train at all?" — it is +# NOT the apples-to-apples scratch control for FCMAE-pretrained init. The +# FCMAE paper-adjacent scratch baseline lives at fcmae_vscyto3d_scratch.yml +# and uses a different model class. See +# applications/dynacell/configs/benchmarks/UNEXT2_VS_FCMAE_CLASSES.md. +# +# Reproduces wandb run 20260409-020023_UNeXt2_iPSC_SEC61B (Dihan's Run 4, +# commit 46e4c79): lr=0.0004, batch_size=32, z_window_size=20, 4-GPU DDP. +# MixedLoss(L1 0.5 + DSSIM 0.5). max_epochs=200. 
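The `base:` lists in these leaves compose overlay files in order, with later overlays and the leaf's own keys winning on a recursive merge. A minimal sketch of the assumed semantics (the real loader may differ in detail; the example dicts mirror values from the configs above):

```python
# Hedged sketch of the assumed `base:` composition: each base file is
# deep-merged in order, then the leaf's own keys override the result.
def deep_merge(base: dict, overlay: dict) -> dict:
    out = dict(base)
    for key, value in overlay.items():
        if isinstance(value, dict) and isinstance(out.get(key), dict):
            out[key] = deep_merge(out[key], value)  # recurse into mappings
        else:
            out[key] = value  # overlay wins for scalars, lists, new keys
    return out


# Stand-ins for model_overlays/unext2_fit.yml and a leaf's trainer block:
model_overlay = {"model": {"init_args": {"lr": 0.0004}}, "trainer": {"max_epochs": 200}}
leaf = {"trainer": {"logger": {"init_args": {"name": "UNeXt2_iPSC_SEC61B"}}}}

merged = deep_merge(model_overlay, leaf)
assert merged["trainer"]["max_epochs"] == 200  # inherited from the overlay
assert merged["model"]["init_args"]["lr"] == 0.0004
assert merged["trainer"]["logger"]["init_args"]["name"] == "UNeXt2_iPSC_SEC61B"
```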
+base: + - ../../../_internal/shared/model/train_sets/ipsc_confocal.yml + - ../../../_internal/shared/model/targets/er_sec61b.yml + - ../../../_internal/shared/model/data_overlays/unext2_fit.yml + - ../../../_internal/shared/model/model_overlays/unext2_fit.yml + - ../../../_internal/shared/model/launcher_profiles/mode_fit.yml + - ../../../_internal/shared/model/launcher_profiles/hardware_4gpu.yml + - ../../../_internal/shared/model/launcher_profiles/runtime_shared.yml + +benchmark: + task: virtual_staining + organelle: er + train_set: ipsc_confocal + model_name: unext2_timm_scratch + experiment_id: er__ipsc_confocal__unext2_timm_scratch + +trainer: + logger: + init_args: + name: UNeXt2_iPSC_SEC61B + save_dir: /hpc/projects/comp.micro/virtual_staining/models/dynacell/ipsc/sec61b/unext2 + callbacks: + - class_path: lightning.pytorch.callbacks.LearningRateMonitor + init_args: + logging_interval: step + - class_path: lightning.pytorch.callbacks.ModelCheckpoint + init_args: + monitor: loss/validate + every_n_epochs: 1 + save_top_k: 5 + save_last: true + dirpath: /hpc/projects/comp.micro/virtual_staining/models/dynacell/ipsc/sec61b/unext2/checkpoints + +launcher: + job_name: UNeXt2_SEC61B + run_root: /hpc/projects/comp.micro/virtual_staining/models/dynacell/ipsc/sec61b/unext2 diff --git a/applications/dynacell/configs/benchmarks/virtual_staining/er/unext2/joint_ipsc_confocal_a549_mantis/train.yml b/applications/dynacell/configs/benchmarks/virtual_staining/er/unext2/joint_ipsc_confocal_a549_mantis/train.yml new file mode 100644 index 000000000..e1b905ebb --- /dev/null +++ b/applications/dynacell/configs/benchmarks/virtual_staining/er/unext2/joint_ipsc_confocal_a549_mantis/train.yml @@ -0,0 +1,142 @@ +# Timm-backed UNeXt2 (viscy_models.unet.unext2:UNeXt2) supervised +# scratch baseline on er (SEC61B) — joint ipsc_confocal + +# a549_mantis pooled. Mirrors +# er/unext2/ipsc_confocal/train.yml on the joint train_set. 
+# Reproduces Run 4 hparams (lr=0.0004, bs=32, z=20, 4-GPU DDP, +# MixedLoss(L1 0.5 + DSSIM 0.5), max_epochs=200) inherited from +# model_overlays/unext2_fit.yml. +# +# This is NOT the apples-to-apples scratch control for FCMAE-pretrained +# init. The FCMAE paper-adjacent scratch baseline lives at +# fcmae_vscyto3d_scratch/joint_*/train.yml and uses a different model +# class. See applications/dynacell/configs/benchmarks/UNEXT2_VS_FCMAE_CLASSES.md. +# +# Joint leaf per Stage 7 of A549_EXPANSION_ROADMAP.md. +# BatchedConcatDataModule + two explicit HCSDataModule children; +# only model_overlays/unext2_fit.yml is composed; data block inline. +# +# Topology: 4-GPU DDP (inherited from +# model_overlays/unext2_fit.yml's ddp_4gpu base). +base: + - ../../../_internal/shared/model/model_overlays/unext2_fit.yml + - ../../../_internal/shared/model/launcher_profiles/mode_fit.yml + - ../../../_internal/shared/model/launcher_profiles/hardware_4gpu.yml + - ../../../_internal/shared/model/launcher_profiles/runtime_shared.yml + +benchmark: + task: virtual_staining + organelle: er + gene: SEC61B + target: er + target_id: er_sec61b + train_set: joint_ipsc_confocal_a549_mantis + model_name: unext2_timm_scratch + experiment_id: er__joint_ipsc_confocal_a549_mantis__unext2_timm_scratch + +trainer: + logger: + init_args: + name: UNeXt2_JOINT_SEC61B + save_dir: /hpc/projects/comp.micro/virtual_staining/models/dynacell/joint_ipsc_confocal_a549_mantis/sec61b/unext2 + callbacks: + - class_path: lightning.pytorch.callbacks.LearningRateMonitor + init_args: + logging_interval: step + - class_path: lightning.pytorch.callbacks.ModelCheckpoint + init_args: + monitor: loss/validate + every_n_epochs: 1 + save_top_k: 5 + save_last: true + dirpath: /hpc/projects/comp.micro/virtual_staining/models/dynacell/joint_ipsc_confocal_a549_mantis/sec61b/unext2/checkpoints + +_hcs_init_args: &hcs_init_args + source_channel: Phase3D + target_channel: Structure + z_window_size: 20 + batch_size: 32 + num_workers: 8 
+ yx_patch_size: [384, 384] + split_ratio: 0.8 + mmap_preload: true + scratch_dir: /dev/shm + persistent_workers: true + normalizations: + - class_path: viscy_transforms.NormalizeSampled + init_args: + keys: [Phase3D] + level: fov_statistics + subtrahend: mean + divisor: std + - class_path: viscy_transforms.NormalizeSampled + init_args: + keys: [Structure] + level: fov_statistics + subtrahend: median + divisor: iqr + augmentations: + - class_path: viscy_transforms.RandWeightedCropd + init_args: + keys: [Phase3D, Structure] + w_key: Structure + spatial_size: [20, 600, 600] + num_samples: 4 + gpu_augmentations: + - class_path: viscy_transforms.BatchedRandAffined + init_args: + keys: [source, target] + prob: 0.8 + rotate_range: [3.14, 0, 0] + shear_range: [0.0, 0.05, 0.05] + scale_range: [[0.7, 1.3], [0.5, 1.5], [0.5, 1.5]] + - class_path: viscy_transforms.BatchedCenterSpatialCropd + init_args: + keys: [source, target] + roi_size: [15, 384, 384] + - class_path: viscy_transforms.BatchedRandAdjustContrastd + init_args: + keys: [source] + prob: 0.5 + gamma: [0.8, 1.2] + - class_path: viscy_transforms.BatchedRandScaleIntensityd + init_args: + keys: [source] + prob: 0.5 + factors: 0.5 + - class_path: viscy_transforms.BatchedRandGaussianNoised + init_args: + keys: [source] + prob: 0.5 + mean: 0.0 + std: 0.3 + - class_path: viscy_transforms.BatchedRandGaussianSmoothd + init_args: + keys: [source] + prob: 0.5 + sigma_x: [0.25, 0.75] + sigma_y: [0.25, 0.75] + sigma_z: [0.25, 0.75] + val_gpu_augmentations: + - class_path: viscy_transforms.BatchedCenterSpatialCropd + init_args: + keys: [source, target] + roi_size: [15, 384, 384] + +data: + class_path: viscy_data.BatchedConcatDataModule + init_args: + data_modules: + # ipsc_confocal — aics-hipsc SEC61B train store + - class_path: viscy_data.hcs.HCSDataModule + init_args: + <<: *hcs_init_args + data_path: /hpc/projects/virtual_staining/training/dynacell/ipsc/dataset_v4/train/SEC61B.zarr + # a549_mantis — pooled SEC61B 
all-conditions train store + - class_path: viscy_data.hcs.HCSDataModule + init_args: + <<: *hcs_init_args + data_path: /hpc/projects/virtual_staining/training/dynacell/a549/mantis_v1/train/SEC61B_all.zarr + +launcher: + job_name: UNeXt2_JOINT_SEC61B + run_root: /hpc/projects/comp.micro/virtual_staining/models/dynacell/joint_ipsc_confocal_a549_mantis/sec61b/unext2 diff --git a/applications/dynacell/configs/benchmarks/virtual_staining/membrane/celldiff/a549_mantis/train.yml b/applications/dynacell/configs/benchmarks/virtual_staining/membrane/celldiff/a549_mantis/train.yml new file mode 100644 index 000000000..a2e1efcd8 --- /dev/null +++ b/applications/dynacell/configs/benchmarks/virtual_staining/membrane/celldiff/a549_mantis/train.yml @@ -0,0 +1,42 @@ +# CellDiff fit on membrane (Membrane channel of cell.zarr) — A549 mantis-lightsheet pooled (mock + DENV + ZIKV). +base: + - ../../../_internal/shared/model/train_sets/a549_mantis.yml + - ../../../_internal/shared/model/targets/membrane.yml + - ../../../_internal/shared/model/data_overlays/celldiff_fit.yml + - ../../../_internal/shared/model/model_overlays/celldiff_fit.yml + - ../../../_internal/shared/model/launcher_profiles/mode_fit.yml + - ../../../_internal/shared/model/launcher_profiles/hardware_h200_single.yml + - ../../../_internal/shared/model/launcher_profiles/runtime_shared.yml + +benchmark: + task: virtual_staining + organelle: membrane + train_set: a549_mantis + model_name: celldiff + experiment_id: membrane__a549_mantis__celldiff + +trainer: + logger: + init_args: + name: CELLDiff_A549_MEMB + save_dir: /hpc/projects/comp.micro/virtual_staining/models/cell_diff_vs_viscy/a549_mantis/memb/celldiff + callbacks: + - class_path: lightning.pytorch.callbacks.LearningRateMonitor + init_args: + logging_interval: step + - class_path: lightning.pytorch.callbacks.ModelCheckpoint + init_args: + every_n_epochs: 1 + save_top_k: -1 + save_last: true + dirpath: 
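The joint leaves above pool two `HCSDataModule` children via `BatchedConcatDataModule`. A minimal sketch of the assumed concatenation semantics (ConcatDataset-style index space; `ConcatSketch` and the toy lists are hypothetical, not the viscy_data implementation):

```python
# Hedged sketch: samples are drawn from one concatenated index space, so
# the larger store contributes proportionally more samples per epoch.
class ConcatSketch:
    def __init__(self, children):
        self.children = children

    def __len__(self):
        return sum(len(c) for c in self.children)

    def __getitem__(self, i):
        # Walk children, subtracting each child's length until i falls inside one.
        for child in self.children:
            if i < len(child):
                return child[i]
            i -= len(child)
        raise IndexError(i)


ipsc = ["ipsc"] * 3  # stand-in for patches from SEC61B.zarr
a549 = ["a549"] * 2  # stand-in for patches from SEC61B_all.zarr
pool = ConcatSketch([ipsc, a549])
assert len(pool) == 5
assert pool[0] == "ipsc" and pool[4] == "a549"
```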
/hpc/projects/comp.micro/virtual_staining/models/cell_diff_vs_viscy/a549_mantis/memb/celldiff/checkpoints + +data: + init_args: + # A549 pooled store + target_channel — no resolver in this train_set. + target_channel: Membrane + data_path: /hpc/projects/virtual_staining/training/dynacell/a549/mantis_v1/train/CAAX_all.zarr + +launcher: + job_name: CELLDiff_A549_MEMB + run_root: /hpc/projects/comp.micro/virtual_staining/models/cell_diff_vs_viscy/a549_mantis/memb/celldiff diff --git a/applications/dynacell/configs/benchmarks/virtual_staining/membrane/celldiff/ipsc_confocal/eval__a549_mantis_denv.yaml b/applications/dynacell/configs/benchmarks/virtual_staining/membrane/celldiff/ipsc_confocal/eval__a549_mantis_denv.yaml new file mode 100644 index 000000000..cc1f59246 --- /dev/null +++ b/applications/dynacell/configs/benchmarks/virtual_staining/membrane/celldiff/ipsc_confocal/eval__a549_mantis_denv.yaml @@ -0,0 +1,22 @@ +# @package _global_ +# Benchmark eval leaf: Membrane (CAAX) predicted by CellDiff on a549-mantis-caax-denv. +# A549 manifest keys membrane by gene (`caax`); override the iPSC-side `membrane` +# target_id from the target group so the resolver finds caax on a549-mantis-caax-denv. +defaults: + - override /target: membrane + - override /predict_set: a549_mantis_caax_denv + +target_name: caax +benchmark: + dataset_ref: + target: caax + +io: + pred_path: /hpc/projects/virtual_staining/training/dynacell/a549/predictions/memb_celldiff_sliding_window_denv.zarr + +# A549 manifests don't carry cell_segmentation paths (no segmentation +# pipeline yet). Skip feature metrics until segmentation lands. 
+compute_feature_metrics: false + +save: + save_dir: /hpc/projects/virtual_staining/training/dynacell/a549/predictions/eval_memb_celldiff_sliding_window_denv diff --git a/applications/dynacell/configs/benchmarks/virtual_staining/membrane/celldiff/ipsc_confocal/eval__a549_mantis_mock.yaml b/applications/dynacell/configs/benchmarks/virtual_staining/membrane/celldiff/ipsc_confocal/eval__a549_mantis_mock.yaml new file mode 100644 index 000000000..b11c3ef2f --- /dev/null +++ b/applications/dynacell/configs/benchmarks/virtual_staining/membrane/celldiff/ipsc_confocal/eval__a549_mantis_mock.yaml @@ -0,0 +1,22 @@ +# @package _global_ +# Benchmark eval leaf: Membrane (CAAX) predicted by CellDiff on a549-mantis-caax-mock. +# A549 manifest keys membrane by gene (`caax`); override the iPSC-side `membrane` +# target_id from the target group so the resolver finds caax on a549-mantis-caax-mock. +defaults: + - override /target: membrane + - override /predict_set: a549_mantis_caax_mock + +target_name: caax +benchmark: + dataset_ref: + target: caax + +io: + pred_path: /hpc/projects/virtual_staining/training/dynacell/a549/predictions/memb_celldiff_sliding_window_mock.zarr + +# A549 manifests don't carry cell_segmentation paths (no segmentation +# pipeline yet). Skip feature metrics until segmentation lands. 
+compute_feature_metrics: false + +save: + save_dir: /hpc/projects/virtual_staining/training/dynacell/a549/predictions/eval_memb_celldiff_sliding_window_mock diff --git a/applications/dynacell/configs/benchmarks/virtual_staining/membrane/celldiff/ipsc_confocal/eval__a549_mantis_zikv.yaml b/applications/dynacell/configs/benchmarks/virtual_staining/membrane/celldiff/ipsc_confocal/eval__a549_mantis_zikv.yaml new file mode 100644 index 000000000..21cd29c38 --- /dev/null +++ b/applications/dynacell/configs/benchmarks/virtual_staining/membrane/celldiff/ipsc_confocal/eval__a549_mantis_zikv.yaml @@ -0,0 +1,22 @@ +# @package _global_ +# Benchmark eval leaf: Membrane (CAAX) predicted by CellDiff on a549-mantis-caax-zikv. +# A549 manifest keys membrane by gene (`caax`); override the iPSC-side `membrane` +# target_id from the target group so the resolver finds caax on a549-mantis-caax-zikv. +defaults: + - override /target: membrane + - override /predict_set: a549_mantis_caax_zikv + +target_name: caax +benchmark: + dataset_ref: + target: caax + +io: + pred_path: /hpc/projects/virtual_staining/training/dynacell/a549/predictions/memb_celldiff_sliding_window_zikv.zarr + +# A549 manifests don't carry cell_segmentation paths (no segmentation +# pipeline yet). Skip feature metrics until segmentation lands. 
+compute_feature_metrics: false + +save: + save_dir: /hpc/projects/virtual_staining/training/dynacell/a549/predictions/eval_memb_celldiff_sliding_window_zikv diff --git a/applications/dynacell/configs/benchmarks/virtual_staining/membrane/celldiff/ipsc_confocal/eval__ipsc_confocal.yaml b/applications/dynacell/configs/benchmarks/virtual_staining/membrane/celldiff/ipsc_confocal/eval__ipsc_confocal.yaml new file mode 100644 index 000000000..74852e701 --- /dev/null +++ b/applications/dynacell/configs/benchmarks/virtual_staining/membrane/celldiff/ipsc_confocal/eval__ipsc_confocal.yaml @@ -0,0 +1,13 @@ +# @package _global_ +# Benchmark eval leaf: Membrane predicted by CellDiff on iPSC confocal. +defaults: + - override /target: membrane + - override /predict_set: ipsc_confocal + +io: + pred_path: /hpc/projects/virtual_staining/training/dynacell/ipsc/predictions/memb_celldiff_sliding_window.zarr + +compute_feature_metrics: true + +save: + save_dir: /hpc/projects/virtual_staining/training/dynacell/ipsc/predictions/eval_memb_celldiff_sliding_window diff --git a/applications/dynacell/configs/benchmarks/virtual_staining/membrane/celldiff/ipsc_confocal/predict__a549_mantis_denv.yml b/applications/dynacell/configs/benchmarks/virtual_staining/membrane/celldiff/ipsc_confocal/predict__a549_mantis_denv.yml new file mode 100644 index 000000000..77068ff9b --- /dev/null +++ b/applications/dynacell/configs/benchmarks/virtual_staining/membrane/celldiff/ipsc_confocal/predict__a549_mantis_denv.yml @@ -0,0 +1,50 @@ +# CellDiff predict: membrane trained on iPSC, predicting against a549-mantis-caax-denv test. +# A549 manifest keys membrane by gene (`caax`); override the iPSC-side `membrane` +# target_id from targets/membrane.yml so the resolver finds the caax target on +# a549-mantis-caax-denv. 
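The gene-keyed override these leaves keep repeating (iPSC's `membrane` target id vs. A549's `caax`) can be sketched as a manifest lookup. The manifest dicts and `resolve_target` below are hypothetical stand-ins, not the real resolver's API:

```python
# Hedged sketch of the assumed dataset_ref resolution: each manifest keys
# its targets by an id, so a leaf that crosses manifests must override the
# target id (iPSC's organelle-keyed "membrane" -> A549's gene-keyed "caax").
manifests = {
    "ipsc_confocal": {"membrane": {"channel": "Membrane"}},      # organelle-keyed
    "a549_mantis_caax_denv": {"caax": {"channel": "Membrane"}},  # gene-keyed
}


def resolve_target(predict_set: str, target: str) -> dict:
    try:
        return manifests[predict_set][target]
    except KeyError:
        raise KeyError(f"{target!r} not found in manifest for {predict_set!r}")


# Without the dataset_ref override, the iPSC-side id misses on the A549 manifest:
try:
    resolve_target("a549_mantis_caax_denv", "membrane")
except KeyError:
    pass  # expected — this is what `dataset_ref: target: caax` avoids
# With the override, resolution succeeds:
assert resolve_target("a549_mantis_caax_denv", "caax")["channel"] == "Membrane"
```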
+base: + - ../../../_internal/shared/model/predict_sets/a549_mantis_caax_denv.yml + - ../../../_internal/shared/model/targets/membrane.yml + - ../../../_internal/shared/model/model_overlays/celldiff_predict.yml + - ../../../_internal/shared/model/launcher_profiles/mode_predict.yml + - ../../../_internal/shared/model/launcher_profiles/hardware_h200_single.yml + - ../../../_internal/shared/model/launcher_profiles/runtime_shared.yml + +benchmark: + task: virtual_staining + organelle: membrane + trained_on: ipsc_confocal + predict_set: a549_mantis_caax_denv + model_name: celldiff + experiment_id: membrane__ipsc_confocal__celldiff__a549_mantis_caax_denv + # Override the iPSC-side `membrane` target to a549's gene-keyed `caax`. + dataset_ref: + target: caax + +model: + init_args: + ckpt_path: /hpc/projects/comp.micro/virtual_staining/models/cell_diff_vs_viscy/ipsc/memb/celldiff/checkpoints/last.ckpt + predict_method: iterative # denoise, generate, sliding_window, or iterative + predict_overlap: [4, 256, 256] + +data: + init_args: + normalizations: + - class_path: viscy_transforms.NormalizeSampled + init_args: + keys: [Phase3D] + level: fov_statistics + subtrahend: mean + divisor: std + augmentations: [] + z_window_size: 48 # 8 for denoise and generate, 40 for iterative and sliding_window. 
+ +trainer: + callbacks: + - class_path: viscy_utils.callbacks.prediction_writer.HCSPredictionWriter + init_args: + output_store: /hpc/projects/virtual_staining/training/dynacell/a549/predictions/memb_celldiff_iterative_denv.zarr + +launcher: + job_name: CELLDiff_PRED_MEMB_ON_A549_DENV + run_root: /hpc/projects/virtual_staining/training/dynacell/a549/predictions diff --git a/applications/dynacell/configs/benchmarks/virtual_staining/membrane/celldiff/ipsc_confocal/predict__a549_mantis_mock.yml b/applications/dynacell/configs/benchmarks/virtual_staining/membrane/celldiff/ipsc_confocal/predict__a549_mantis_mock.yml new file mode 100644 index 000000000..fea91272a --- /dev/null +++ b/applications/dynacell/configs/benchmarks/virtual_staining/membrane/celldiff/ipsc_confocal/predict__a549_mantis_mock.yml @@ -0,0 +1,50 @@ +# CellDiff predict: membrane trained on iPSC, predicting against a549-mantis-caax-mock test. +# A549 manifest keys membrane by gene (`caax`); override the iPSC-side `membrane` +# target_id from targets/membrane.yml so the resolver finds the caax target on +# a549-mantis-caax-mock. +base: + - ../../../_internal/shared/model/predict_sets/a549_mantis_caax_mock.yml + - ../../../_internal/shared/model/targets/membrane.yml + - ../../../_internal/shared/model/model_overlays/celldiff_predict.yml + - ../../../_internal/shared/model/launcher_profiles/mode_predict.yml + - ../../../_internal/shared/model/launcher_profiles/hardware_h200_single.yml + - ../../../_internal/shared/model/launcher_profiles/runtime_shared.yml + +benchmark: + task: virtual_staining + organelle: membrane + trained_on: ipsc_confocal + predict_set: a549_mantis_caax_mock + model_name: celldiff + experiment_id: membrane__ipsc_confocal__celldiff__a549_mantis_caax_mock + # Override the iPSC-side `membrane` target to a549's gene-keyed `caax`. 
+ dataset_ref: + target: caax + +model: + init_args: + ckpt_path: /hpc/projects/comp.micro/virtual_staining/models/cell_diff_vs_viscy/ipsc/memb/celldiff/checkpoints/last.ckpt + predict_method: iterative # denoise, generate, sliding_window, or iterative + predict_overlap: [4, 256, 256] + +data: + init_args: + normalizations: + - class_path: viscy_transforms.NormalizeSampled + init_args: + keys: [Phase3D] + level: fov_statistics + subtrahend: mean + divisor: std + augmentations: [] + z_window_size: 48 # 8 for denoise and generate, 40 for iterative and sliding_window. + +trainer: + callbacks: + - class_path: viscy_utils.callbacks.prediction_writer.HCSPredictionWriter + init_args: + output_store: /hpc/projects/virtual_staining/training/dynacell/a549/predictions/memb_celldiff_iterative_mock.zarr + +launcher: + job_name: CELLDiff_PRED_MEMB_ON_A549_MOCK + run_root: /hpc/projects/virtual_staining/training/dynacell/a549/predictions diff --git a/applications/dynacell/configs/benchmarks/virtual_staining/membrane/celldiff/ipsc_confocal/predict__a549_mantis_zikv.yml b/applications/dynacell/configs/benchmarks/virtual_staining/membrane/celldiff/ipsc_confocal/predict__a549_mantis_zikv.yml new file mode 100644 index 000000000..4cbab5136 --- /dev/null +++ b/applications/dynacell/configs/benchmarks/virtual_staining/membrane/celldiff/ipsc_confocal/predict__a549_mantis_zikv.yml @@ -0,0 +1,50 @@ +# CellDiff predict: membrane trained on iPSC, predicting against a549-mantis-caax-zikv test. +# A549 manifest keys membrane by gene (`caax`); override the iPSC-side `membrane` +# target_id from targets/membrane.yml so the resolver finds the caax target on +# a549-mantis-caax-zikv. 
+base: + - ../../../_internal/shared/model/predict_sets/a549_mantis_caax_zikv.yml + - ../../../_internal/shared/model/targets/membrane.yml + - ../../../_internal/shared/model/model_overlays/celldiff_predict.yml + - ../../../_internal/shared/model/launcher_profiles/mode_predict.yml + - ../../../_internal/shared/model/launcher_profiles/hardware_h200_single.yml + - ../../../_internal/shared/model/launcher_profiles/runtime_shared.yml + +benchmark: + task: virtual_staining + organelle: membrane + trained_on: ipsc_confocal + predict_set: a549_mantis_caax_zikv + model_name: celldiff + experiment_id: membrane__ipsc_confocal__celldiff__a549_mantis_caax_zikv + # Override the iPSC-side `membrane` target to a549's gene-keyed `caax`. + dataset_ref: + target: caax + +model: + init_args: + ckpt_path: /hpc/projects/comp.micro/virtual_staining/models/cell_diff_vs_viscy/ipsc/memb/celldiff/checkpoints/last.ckpt + predict_method: iterative # denoise, generate, sliding_window, or iterative + predict_overlap: [4, 256, 256] + +data: + init_args: + normalizations: + - class_path: viscy_transforms.NormalizeSampled + init_args: + keys: [Phase3D] + level: fov_statistics + subtrahend: mean + divisor: std + augmentations: [] + z_window_size: 48 # 8 for denoise and generate, 40 for iterative and sliding_window. 
+ +trainer: + callbacks: + - class_path: viscy_utils.callbacks.prediction_writer.HCSPredictionWriter + init_args: + output_store: /hpc/projects/virtual_staining/training/dynacell/a549/predictions/memb_celldiff_iterative_zikv.zarr + +launcher: + job_name: CELLDiff_PRED_MEMB_ON_A549_ZIKV + run_root: /hpc/projects/virtual_staining/training/dynacell/a549/predictions diff --git a/applications/dynacell/configs/benchmarks/virtual_staining/membrane/celldiff/ipsc_confocal/predict__ipsc_confocal.yml b/applications/dynacell/configs/benchmarks/virtual_staining/membrane/celldiff/ipsc_confocal/predict__ipsc_confocal.yml new file mode 100644 index 000000000..3c10f5c9e --- /dev/null +++ b/applications/dynacell/configs/benchmarks/virtual_staining/membrane/celldiff/ipsc_confocal/predict__ipsc_confocal.yml @@ -0,0 +1,44 @@ +# CellDiff predict: membrane against ipsc_confocal test_cropped. +base: + - ../../../_internal/shared/model/predict_sets/ipsc_confocal.yml + - ../../../_internal/shared/model/targets/membrane.yml + - ../../../_internal/shared/model/model_overlays/celldiff_predict.yml + - ../../../_internal/shared/model/launcher_profiles/mode_predict.yml + - ../../../_internal/shared/model/launcher_profiles/hardware_h200_single.yml + - ../../../_internal/shared/model/launcher_profiles/runtime_shared.yml + +benchmark: + task: virtual_staining + organelle: membrane + trained_on: ipsc_confocal + predict_set: ipsc_confocal + model_name: celldiff + experiment_id: membrane__ipsc_confocal__celldiff__ipsc_confocal + +model: + init_args: + ckpt_path: /hpc/projects/comp.micro/virtual_staining/models/cell_diff_vs_viscy/ipsc/memb/celldiff/checkpoints/last.ckpt + predict_method: iterative # denoise, generate, sliding_window, or iterative + predict_overlap: [4, 256, 256] + +data: + init_args: + normalizations: + - class_path: viscy_transforms.NormalizeSampled + init_args: + keys: [Phase3D] + level: fov_statistics + subtrahend: mean + divisor: std + augmentations: [] + z_window_size: 40 # 8 for 
denoise and generate, 40 for iterative and sliding_window. + +trainer: + callbacks: + - class_path: viscy_utils.callbacks.prediction_writer.HCSPredictionWriter + init_args: + output_store: /hpc/projects/virtual_staining/training/dynacell/ipsc/predictions/memb_celldiff_sliding_window.zarr + +launcher: + job_name: CELLDiff_PRED_MEMB + run_root: /hpc/projects/virtual_staining/training/dynacell/ipsc/predictions diff --git a/applications/dynacell/configs/benchmarks/virtual_staining/membrane/celldiff/ipsc_confocal/train.yml b/applications/dynacell/configs/benchmarks/virtual_staining/membrane/celldiff/ipsc_confocal/train.yml new file mode 100644 index 000000000..e516ad7ca --- /dev/null +++ b/applications/dynacell/configs/benchmarks/virtual_staining/membrane/celldiff/ipsc_confocal/train.yml @@ -0,0 +1,36 @@ +# CellDiff fit on membrane (Membrane channel of cell.zarr) — AICS iPSC confocal. +base: + - ../../../_internal/shared/model/train_sets/ipsc_confocal.yml + - ../../../_internal/shared/model/targets/membrane.yml + - ../../../_internal/shared/model/data_overlays/celldiff_fit.yml + - ../../../_internal/shared/model/model_overlays/celldiff_fit.yml + - ../../../_internal/shared/model/launcher_profiles/mode_fit.yml + - ../../../_internal/shared/model/launcher_profiles/hardware_h200_single.yml + - ../../../_internal/shared/model/launcher_profiles/runtime_shared.yml + +benchmark: + task: virtual_staining + organelle: membrane + train_set: ipsc_confocal + model_name: celldiff + experiment_id: membrane__ipsc_confocal__celldiff + +trainer: + logger: + init_args: + name: CELLDiff_iPSC_MEMB + save_dir: /hpc/projects/comp.micro/virtual_staining/models/cell_diff_vs_viscy/ipsc/memb/celldiff + callbacks: + - class_path: lightning.pytorch.callbacks.LearningRateMonitor + init_args: + logging_interval: step + - class_path: lightning.pytorch.callbacks.ModelCheckpoint + init_args: + every_n_epochs: 1 + save_top_k: -1 + save_last: true + dirpath: 
/hpc/projects/comp.micro/virtual_staining/models/cell_diff_vs_viscy/ipsc/memb/celldiff/checkpoints + +launcher: + job_name: CELLDiff_MEMB + run_root: /hpc/projects/comp.micro/virtual_staining/models/cell_diff_vs_viscy/ipsc/memb/celldiff diff --git a/applications/dynacell/configs/benchmarks/virtual_staining/membrane/celldiff/joint_ipsc_confocal_a549_mantis/predict__a549_mantis_denv.yml b/applications/dynacell/configs/benchmarks/virtual_staining/membrane/celldiff/joint_ipsc_confocal_a549_mantis/predict__a549_mantis_denv.yml new file mode 100644 index 000000000..b965ce9ce --- /dev/null +++ b/applications/dynacell/configs/benchmarks/virtual_staining/membrane/celldiff/joint_ipsc_confocal_a549_mantis/predict__a549_mantis_denv.yml @@ -0,0 +1,50 @@ +# CellDiff predict: membrane trained on joint iPSC+A549, predicting against a549-mantis-caax-denv test. +# A549 manifest keys membrane by gene (`caax`); override the iPSC-side `membrane` +# target_id from targets/membrane.yml so the resolver finds the caax target on +# a549-mantis-caax-denv. +base: + - ../../../_internal/shared/model/predict_sets/a549_mantis_caax_denv.yml + - ../../../_internal/shared/model/targets/membrane.yml + - ../../../_internal/shared/model/model_overlays/celldiff_predict.yml + - ../../../_internal/shared/model/launcher_profiles/mode_predict.yml + - ../../../_internal/shared/model/launcher_profiles/hardware_h200_single.yml + - ../../../_internal/shared/model/launcher_profiles/runtime_shared.yml + +benchmark: + task: virtual_staining + organelle: membrane + trained_on: joint_ipsc_confocal_a549_mantis + predict_set: a549_mantis_caax_denv + model_name: celldiff + experiment_id: membrane__joint_ipsc_confocal_a549_mantis__celldiff__a549_mantis_caax_denv + # Override the iPSC-side `membrane` target to a549's gene-keyed `caax`. 
+ dataset_ref: + target: caax + +model: + init_args: + ckpt_path: /hpc/projects/comp.micro/virtual_staining/models/cell_diff_vs_viscy/joint_ipsc_confocal_a549_mantis/memb/celldiff/checkpoints/last.ckpt + predict_method: iterative # denoise, generate, sliding_window, or iterative + predict_overlap: [4, 256, 256] + +data: + init_args: + normalizations: + - class_path: viscy_transforms.NormalizeSampled + init_args: + keys: [Phase3D] + level: fov_statistics + subtrahend: mean + divisor: std + augmentations: [] + z_window_size: 48 # 8 for denoise and generate, 40 for iterative and sliding_window. + +trainer: + callbacks: + - class_path: viscy_utils.callbacks.prediction_writer.HCSPredictionWriter + init_args: + output_store: /hpc/projects/virtual_staining/training/dynacell/a549/joint_predictions/memb_celldiff_denv.zarr + +launcher: + job_name: CELLDiff_JOINT_PRED_MEMB_ON_A549_DENV + run_root: /hpc/projects/virtual_staining/training/dynacell/a549/joint_predictions diff --git a/applications/dynacell/configs/benchmarks/virtual_staining/membrane/celldiff/joint_ipsc_confocal_a549_mantis/predict__a549_mantis_mock.yml b/applications/dynacell/configs/benchmarks/virtual_staining/membrane/celldiff/joint_ipsc_confocal_a549_mantis/predict__a549_mantis_mock.yml new file mode 100644 index 000000000..3b0da5355 --- /dev/null +++ b/applications/dynacell/configs/benchmarks/virtual_staining/membrane/celldiff/joint_ipsc_confocal_a549_mantis/predict__a549_mantis_mock.yml @@ -0,0 +1,50 @@ +# CellDiff predict: membrane trained on joint iPSC+A549, predicting against a549-mantis-caax-mock test. +# A549 manifest keys membrane by gene (`caax`); override the iPSC-side `membrane` +# target_id from targets/membrane.yml so the resolver finds the caax target on +# a549-mantis-caax-mock. 
+base: + - ../../../_internal/shared/model/predict_sets/a549_mantis_caax_mock.yml + - ../../../_internal/shared/model/targets/membrane.yml + - ../../../_internal/shared/model/model_overlays/celldiff_predict.yml + - ../../../_internal/shared/model/launcher_profiles/mode_predict.yml + - ../../../_internal/shared/model/launcher_profiles/hardware_h200_single.yml + - ../../../_internal/shared/model/launcher_profiles/runtime_shared.yml + +benchmark: + task: virtual_staining + organelle: membrane + trained_on: joint_ipsc_confocal_a549_mantis + predict_set: a549_mantis_caax_mock + model_name: celldiff + experiment_id: membrane__joint_ipsc_confocal_a549_mantis__celldiff__a549_mantis_caax_mock + # Override the iPSC-side `membrane` target to a549's gene-keyed `caax`. + dataset_ref: + target: caax + +model: + init_args: + ckpt_path: /hpc/projects/comp.micro/virtual_staining/models/cell_diff_vs_viscy/joint_ipsc_confocal_a549_mantis/memb/celldiff/checkpoints/last.ckpt + predict_method: iterative # denoise, generate, sliding_window, or iterative + predict_overlap: [4, 256, 256] + +data: + init_args: + normalizations: + - class_path: viscy_transforms.NormalizeSampled + init_args: + keys: [Phase3D] + level: fov_statistics + subtrahend: mean + divisor: std + augmentations: [] + z_window_size: 48 # 8 for denoise and generate, 40 for iterative and sliding_window. 
+ +trainer: + callbacks: + - class_path: viscy_utils.callbacks.prediction_writer.HCSPredictionWriter + init_args: + output_store: /hpc/projects/virtual_staining/training/dynacell/a549/joint_predictions/memb_celldiff_mock.zarr + +launcher: + job_name: CELLDiff_JOINT_PRED_MEMB_ON_A549_MOCK + run_root: /hpc/projects/virtual_staining/training/dynacell/a549/joint_predictions diff --git a/applications/dynacell/configs/benchmarks/virtual_staining/membrane/celldiff/joint_ipsc_confocal_a549_mantis/predict__a549_mantis_movie.yml b/applications/dynacell/configs/benchmarks/virtual_staining/membrane/celldiff/joint_ipsc_confocal_a549_mantis/predict__a549_mantis_movie.yml new file mode 100644 index 000000000..2fb3dc554 --- /dev/null +++ b/applications/dynacell/configs/benchmarks/virtual_staining/membrane/celldiff/joint_ipsc_confocal_a549_mantis/predict__a549_mantis_movie.yml @@ -0,0 +1,49 @@ +# CellDiff predict: membrane trained on joint iPSC+A549, predicting on the +# cropped A549 TOMM20 DENV movie zarr (125 timepoints). +# No dataset_ref — data path is set directly to bypass the manifest system. 
+base: + - ../../../_internal/shared/model/targets/membrane.yml + - ../../../_internal/shared/model/model_overlays/celldiff_predict.yml + - ../../../_internal/shared/model/launcher_profiles/mode_predict.yml + - ../../../_internal/shared/model/launcher_profiles/hardware_h200_single.yml + - ../../../_internal/shared/model/launcher_profiles/runtime_shared.yml + +benchmark: + task: virtual_staining + organelle: membrane + trained_on: joint_ipsc_confocal_a549_mantis + predict_set: a549_mantis_movie + model_name: celldiff + experiment_id: membrane__joint_ipsc_confocal_a549_mantis__celldiff__a549_mantis_movie + +model: + init_args: + ckpt_path: /hpc/projects/comp.micro/virtual_staining/models/cell_diff_vs_viscy/joint_ipsc_confocal_a549_mantis/memb/celldiff/checkpoints/last.ckpt + predict_method: iterative + predict_overlap: [4, 256, 256] + +data: + class_path: viscy_data.hcs.HCSDataModule + init_args: + data_path: /hpc/projects/virtual_staining/training/dynacell/a549/movie/2024_11_21_A549_TOMM20_DENV_crop.zarr + source_channel: Phase3D + target_channel: Membrane + normalizations: + - class_path: viscy_transforms.NormalizeSampled + init_args: + keys: [Phase3D] + level: fov_statistics + subtrahend: mean + divisor: std + augmentations: [] + z_window_size: 48 + +trainer: + callbacks: + - class_path: viscy_utils.callbacks.prediction_writer.HCSPredictionWriter + init_args: + output_store: /hpc/projects/virtual_staining/training/dynacell/a549/movie_predictions/memb_celldiff_joint.zarr + +launcher: + job_name: CELLDiff_JOINT_PRED_MEMB_ON_A549_MOVIE + run_root: /hpc/projects/virtual_staining/training/dynacell/a549/movie_predictions diff --git a/applications/dynacell/configs/benchmarks/virtual_staining/membrane/celldiff/joint_ipsc_confocal_a549_mantis/predict__a549_mantis_zikv.yml b/applications/dynacell/configs/benchmarks/virtual_staining/membrane/celldiff/joint_ipsc_confocal_a549_mantis/predict__a549_mantis_zikv.yml new file mode 100644 index 000000000..be1158294 --- /dev/null 
+++ b/applications/dynacell/configs/benchmarks/virtual_staining/membrane/celldiff/joint_ipsc_confocal_a549_mantis/predict__a549_mantis_zikv.yml @@ -0,0 +1,50 @@ +# CellDiff predict: membrane trained on joint iPSC+A549, predicting against a549-mantis-caax-zikv test. +# A549 manifest keys membrane by gene (`caax`); override the iPSC-side `membrane` +# target_id from targets/membrane.yml so the resolver finds the caax target on +# a549-mantis-caax-zikv. +base: + - ../../../_internal/shared/model/predict_sets/a549_mantis_caax_zikv.yml + - ../../../_internal/shared/model/targets/membrane.yml + - ../../../_internal/shared/model/model_overlays/celldiff_predict.yml + - ../../../_internal/shared/model/launcher_profiles/mode_predict.yml + - ../../../_internal/shared/model/launcher_profiles/hardware_h200_single.yml + - ../../../_internal/shared/model/launcher_profiles/runtime_shared.yml + +benchmark: + task: virtual_staining + organelle: membrane + trained_on: joint_ipsc_confocal_a549_mantis + predict_set: a549_mantis_caax_zikv + model_name: celldiff + experiment_id: membrane__joint_ipsc_confocal_a549_mantis__celldiff__a549_mantis_caax_zikv + # Override the iPSC-side `membrane` target to a549's gene-keyed `caax`. + dataset_ref: + target: caax + +model: + init_args: + ckpt_path: /hpc/projects/comp.micro/virtual_staining/models/cell_diff_vs_viscy/joint_ipsc_confocal_a549_mantis/memb/celldiff/checkpoints/last.ckpt + predict_method: iterative # denoise, generate, sliding_window, or iterative + predict_overlap: [4, 256, 256] + +data: + init_args: + normalizations: + - class_path: viscy_transforms.NormalizeSampled + init_args: + keys: [Phase3D] + level: fov_statistics + subtrahend: mean + divisor: std + augmentations: [] + z_window_size: 48 # 8 for denoise and generate, 40 for iterative and sliding_window. 
+ +trainer: + callbacks: + - class_path: viscy_utils.callbacks.prediction_writer.HCSPredictionWriter + init_args: + output_store: /hpc/projects/virtual_staining/training/dynacell/a549/joint_predictions/memb_celldiff_zikv.zarr + +launcher: + job_name: CELLDiff_JOINT_PRED_MEMB_ON_A549_ZIKV + run_root: /hpc/projects/virtual_staining/training/dynacell/a549/joint_predictions diff --git a/applications/dynacell/configs/benchmarks/virtual_staining/membrane/celldiff/joint_ipsc_confocal_a549_mantis/predict__ipsc_confocal.yml b/applications/dynacell/configs/benchmarks/virtual_staining/membrane/celldiff/joint_ipsc_confocal_a549_mantis/predict__ipsc_confocal.yml new file mode 100644 index 000000000..16009543b --- /dev/null +++ b/applications/dynacell/configs/benchmarks/virtual_staining/membrane/celldiff/joint_ipsc_confocal_a549_mantis/predict__ipsc_confocal.yml @@ -0,0 +1,44 @@ +# CellDiff predict: membrane trained on joint iPSC+A549, predicting against ipsc_confocal test_cropped. +base: + - ../../../_internal/shared/model/predict_sets/ipsc_confocal.yml + - ../../../_internal/shared/model/targets/membrane.yml + - ../../../_internal/shared/model/model_overlays/celldiff_predict.yml + - ../../../_internal/shared/model/launcher_profiles/mode_predict.yml + - ../../../_internal/shared/model/launcher_profiles/hardware_h200_single.yml + - ../../../_internal/shared/model/launcher_profiles/runtime_shared.yml + +benchmark: + task: virtual_staining + organelle: membrane + trained_on: joint_ipsc_confocal_a549_mantis + predict_set: ipsc_confocal + model_name: celldiff + experiment_id: membrane__joint_ipsc_confocal_a549_mantis__celldiff__ipsc_confocal + +model: + init_args: + ckpt_path: /hpc/projects/comp.micro/virtual_staining/models/cell_diff_vs_viscy/joint_ipsc_confocal_a549_mantis/memb/celldiff/checkpoints/last.ckpt + predict_method: iterative # denoise, generate, sliding_window, or iterative + predict_overlap: [4, 256, 256] + +data: + init_args: + normalizations: + - class_path: 
viscy_transforms.NormalizeSampled + init_args: + keys: [Phase3D] + level: fov_statistics + subtrahend: mean + divisor: std + augmentations: [] + z_window_size: 40 # 8 for denoise and generate, 40 for iterative and sliding_window. + +trainer: + callbacks: + - class_path: viscy_utils.callbacks.prediction_writer.HCSPredictionWriter + init_args: + output_store: /hpc/projects/virtual_staining/training/dynacell/ipsc/joint_predictions/memb_celldiff.zarr + +launcher: + job_name: CELLDiff_JOINT_PRED_MEMB_ON_IPSC + run_root: /hpc/projects/virtual_staining/training/dynacell/ipsc/joint_predictions diff --git a/applications/dynacell/configs/benchmarks/virtual_staining/membrane/celldiff/joint_ipsc_confocal_a549_mantis/train.yml b/applications/dynacell/configs/benchmarks/virtual_staining/membrane/celldiff/joint_ipsc_confocal_a549_mantis/train.yml new file mode 100644 index 000000000..c308da338 --- /dev/null +++ b/applications/dynacell/configs/benchmarks/virtual_staining/membrane/celldiff/joint_ipsc_confocal_a549_mantis/train.yml @@ -0,0 +1,142 @@ +# CellDiff fit on membrane (Membrane) — joint ipsc_confocal + a549_mantis pooled. +# +# Joint leaf per Stage 7 of A549_EXPANSION_ROADMAP.md. Uses +# BatchedConcatDataModule with two explicit HCSDataModule children +# (no benchmark.dataset_ref — joint leaves bypass the single-dataset +# resolver). Only model_overlays/celldiff_fit.yml is composed; the data +# block is authored inline because joint hparams live on the children. +# +# iPSC source is the multi-marker cell.zarr (Brightfield, Nuclei, +# Membrane, Phase3D); A549 source is the CAAX-marker pooled store +# CAAX_all.zarr. The shared target_channel name is `Membrane` in both. +# +# Topology: single H200, single GPU — same as celldiff/ipsc_confocal/train.yml. +# The paper baseline pattern is single-GPU and we keep that here so +# iPSC-only and joint runs are apples-to-apples. 
+base: + - ../../../_internal/shared/model/model_overlays/celldiff_fit.yml + - ../../../_internal/shared/model/launcher_profiles/mode_fit.yml + - ../../../_internal/shared/model/launcher_profiles/hardware_h200_single.yml + - ../../../_internal/shared/model/launcher_profiles/runtime_shared.yml + +benchmark: + task: virtual_staining + organelle: membrane + gene: Membrane + target: membrane + target_id: membrane + train_set: joint_ipsc_confocal_a549_mantis + model_name: celldiff + experiment_id: membrane__joint_ipsc_confocal_a549_mantis__celldiff + +trainer: + logger: + init_args: + name: CELLDiff_JOINT_MEMB + save_dir: /hpc/projects/comp.micro/virtual_staining/models/cell_diff_vs_viscy/joint_ipsc_confocal_a549_mantis/memb/celldiff + callbacks: + - class_path: lightning.pytorch.callbacks.LearningRateMonitor + init_args: + logging_interval: step + - class_path: lightning.pytorch.callbacks.ModelCheckpoint + init_args: + every_n_epochs: 1 + save_top_k: -1 + save_last: true + dirpath: /hpc/projects/comp.micro/virtual_staining/models/cell_diff_vs_viscy/joint_ipsc_confocal_a549_mantis/memb/celldiff/checkpoints + +_hcs_init_args: &hcs_init_args + source_channel: Phase3D + target_channel: Membrane + z_window_size: 13 + batch_size: 2 + num_workers: 4 + yx_patch_size: [512, 512] + split_ratio: 0.8 + mmap_preload: true + scratch_dir: /dev/shm + persistent_workers: true + normalizations: + - class_path: viscy_transforms.NormalizeSampled + init_args: + keys: [Phase3D] + level: fov_statistics + subtrahend: mean + divisor: std + - class_path: viscy_transforms.NormalizeSampled + init_args: + keys: [Membrane] + level: fov_statistics + subtrahend: median + divisor: iqr + augmentations: + - class_path: viscy_transforms.RandWeightedCropd + init_args: + keys: [Phase3D, Membrane] + w_key: Membrane + spatial_size: [13, 624, 624] + num_samples: 2 + gpu_augmentations: + - class_path: viscy_transforms.BatchedRandAffined + init_args: + keys: [source, target] + prob: 0.8 + rotate_range: [3.14, 
0, 0] + shear_range: [0.0, 0.05, 0.05] + scale_range: [[0.7, 1.3], [0.5, 1.5], [0.5, 1.5]] + safe_crop_size: [8, 512, 512] + safe_crop_coverage: 0.9 + - class_path: viscy_transforms.BatchedCenterSpatialCropd + init_args: + keys: [source, target] + roi_size: [8, 512, 512] + - class_path: viscy_transforms.BatchedRandAdjustContrastd + init_args: + keys: [source] + prob: 0.5 + gamma: [0.8, 1.2] + - class_path: viscy_transforms.BatchedRandScaleIntensityd + init_args: + keys: [source] + prob: 0.5 + factors: 0.5 + - class_path: viscy_transforms.BatchedRandGaussianNoised + init_args: + keys: [source] + prob: 0.5 + mean: 0.0 + std: 0.3 + - class_path: viscy_transforms.BatchedRandGaussianSmoothd + init_args: + keys: [source] + prob: 0.5 + sigma_x: [0.25, 0.75] + sigma_y: [0.25, 0.75] + sigma_z: [0.25, 0.75] + val_gpu_augmentations: + - class_path: viscy_transforms.BatchedCenterSpatialCropd + init_args: + keys: [source, target] + roi_size: [8, 512, 512] + +data: + class_path: viscy_data.BatchedConcatDataModule + init_args: + data_modules: + - class_path: viscy_data.hcs.HCSDataModule + init_args: + <<: *hcs_init_args + data_path: /hpc/projects/virtual_staining/training/dynacell/ipsc/dataset_v4/train/cell.zarr + - class_path: viscy_data.hcs.HCSDataModule + init_args: + <<: *hcs_init_args + data_path: /hpc/projects/virtual_staining/training/dynacell/a549/mantis_v1/train/CAAX_all.zarr + +launcher: + job_name: CELLDiff_JOINT_MEMB + run_root: /hpc/projects/comp.micro/virtual_staining/models/cell_diff_vs_viscy/joint_ipsc_confocal_a549_mantis/memb/celldiff + # Joint preloads two stores (iPSC + A549 pool) into /dev/shm; the default + # 256G cap is too tight (256G iPSC mem + ~50G A549 + worker peak OOMs). + # 512G is the smallest tier that fits joint preload + worker overhead. 
+ sbatch: + mem: "512G" diff --git a/applications/dynacell/configs/benchmarks/virtual_staining/membrane/fcmae_vscyto3d_pretrained/a549_mantis/train.yml b/applications/dynacell/configs/benchmarks/virtual_staining/membrane/fcmae_vscyto3d_pretrained/a549_mantis/train.yml new file mode 100644 index 000000000..af89bcacb --- /dev/null +++ b/applications/dynacell/configs/benchmarks/virtual_staining/membrane/fcmae_vscyto3d_pretrained/a549_mantis/train.yml @@ -0,0 +1,67 @@ +# FCMAE-class (FullyConvolutionalMAE, pretraining=False) with FCMAE- +# pretrained encoder init on membrane (Membrane marker). Companion to +# fcmae_vscyto3d_scratch.yml — the two leaves are identical except this +# one loads encoder weights from the published VSCyto3D FCMAE ckpt +# (400 ep on HEK + A549 + iPSC phase data). See vs_test/finetune_3d.py +# for the canonical recipe. +base: + - ../../../_internal/shared/model/train_sets/a549_mantis.yml + - ../../../_internal/shared/model/targets/membrane.yml + - ../../../_internal/shared/model/data_overlays/fcmae_vscyto3d_fit.yml + - ../../../_internal/shared/model/model_overlays/fcmae_vscyto3d_fit.yml + - ../../../_internal/shared/model/launcher_profiles/mode_fit.yml + - ../../../_internal/shared/model/launcher_profiles/hardware_4gpu.yml + - ../../../_internal/shared/model/launcher_profiles/runtime_shared.yml + +benchmark: + task: virtual_staining + organelle: membrane + train_set: a549_mantis + model_name: fcmae_vscyto3d_pretrained + experiment_id: membrane__a549_mantis__fcmae_vscyto3d_pretrained + +# Override the FCMAE data overlay's hardcoded `Structure` augmentation +# keys (the overlay was authored for ER/Mito where target_channel == +# "Structure"). RandWeightedCropd needs the actual membrane channel +# name in keys/w_key. spatial_size + num_samples kept identical to the +# FCMAE overlay so the augmentation policy matches ER/Mito. +data: + init_args: + # A549 pooled store + target_channel — no resolver in this train_set. 
+ target_channel: Membrane + data_path: /hpc/projects/virtual_staining/training/dynacell/a549/mantis_v1/train/CAAX_all.zarr + augmentations: + - class_path: viscy_transforms.RandWeightedCropd + init_args: + keys: [Phase3D, Membrane] + w_key: Membrane + spatial_size: [20, 600, 600] + num_samples: 4 + +model: + init_args: + # Load only the encoder from the canonical VSCyto3D FCMAE ckpt — + # decoder/head stay at fresh init. Matches vs_test/finetune_3d.py:247. + encoder_only: true + ckpt_path: /hpc/projects/virtual_staining/models/mehta-lab/VSCyto3D/fcmae.ckpt + +trainer: + logger: + init_args: + name: FCMAE_VSCyto3D_Pretrained_A549_Membrane + save_dir: /hpc/projects/comp.micro/virtual_staining/models/dynacell/a549_mantis/memb/fcmae_vscyto3d_pretrained + callbacks: + - class_path: lightning.pytorch.callbacks.LearningRateMonitor + init_args: + logging_interval: step + - class_path: lightning.pytorch.callbacks.ModelCheckpoint + init_args: + monitor: loss/validate + every_n_epochs: 1 + save_top_k: 5 + save_last: true + dirpath: /hpc/projects/comp.micro/virtual_staining/models/dynacell/a549_mantis/memb/fcmae_vscyto3d_pretrained/checkpoints + +launcher: + job_name: FCMAE_VSCyto3D_Pretrained_A549_Membrane + run_root: /hpc/projects/comp.micro/virtual_staining/models/dynacell/a549_mantis/memb/fcmae_vscyto3d_pretrained diff --git a/applications/dynacell/configs/benchmarks/virtual_staining/membrane/fcmae_vscyto3d_pretrained/ipsc_confocal/eval__a549_mantis_denv.yaml b/applications/dynacell/configs/benchmarks/virtual_staining/membrane/fcmae_vscyto3d_pretrained/ipsc_confocal/eval__a549_mantis_denv.yaml new file mode 100644 index 000000000..16b577ccf --- /dev/null +++ b/applications/dynacell/configs/benchmarks/virtual_staining/membrane/fcmae_vscyto3d_pretrained/ipsc_confocal/eval__a549_mantis_denv.yaml @@ -0,0 +1,22 @@ +# @package _global_ +# Benchmark eval leaf: Membrane (CAAX) predicted by FCMAE_VSCyto3D_Pretrained on a549-mantis-caax-denv. 
+# A549 manifest keys membrane by gene (`caax`); override the iPSC-side `membrane` +# target_id from the target group so the resolver finds caax on a549-mantis-caax-denv. +defaults: + - override /target: membrane + - override /predict_set: a549_mantis_caax_denv + +target_name: caax +benchmark: + dataset_ref: + target: caax + +io: + pred_path: /hpc/projects/virtual_staining/training/dynacell/a549/predictions/memb_fcmae_vscyto3d_pretrained_denv.zarr + +# A549 manifests don't carry cell_segmentation paths (no segmentation +# pipeline yet). Skip feature metrics until segmentation lands. +compute_feature_metrics: false + +save: + save_dir: /hpc/projects/virtual_staining/training/dynacell/a549/predictions/eval_memb_fcmae_vscyto3d_pretrained_denv diff --git a/applications/dynacell/configs/benchmarks/virtual_staining/membrane/fcmae_vscyto3d_pretrained/ipsc_confocal/eval__a549_mantis_mock.yaml b/applications/dynacell/configs/benchmarks/virtual_staining/membrane/fcmae_vscyto3d_pretrained/ipsc_confocal/eval__a549_mantis_mock.yaml new file mode 100644 index 000000000..7ff244cf5 --- /dev/null +++ b/applications/dynacell/configs/benchmarks/virtual_staining/membrane/fcmae_vscyto3d_pretrained/ipsc_confocal/eval__a549_mantis_mock.yaml @@ -0,0 +1,22 @@ +# @package _global_ +# Benchmark eval leaf: Membrane (CAAX) predicted by FCMAE_VSCyto3D_Pretrained on a549-mantis-caax-mock. +# A549 manifest keys membrane by gene (`caax`); override the iPSC-side `membrane` +# target_id from the target group so the resolver finds caax on a549-mantis-caax-mock. +defaults: + - override /target: membrane + - override /predict_set: a549_mantis_caax_mock + +target_name: caax +benchmark: + dataset_ref: + target: caax + +io: + pred_path: /hpc/projects/virtual_staining/training/dynacell/a549/predictions/memb_fcmae_vscyto3d_pretrained_mock.zarr + +# A549 manifests don't carry cell_segmentation paths (no segmentation +# pipeline yet). Skip feature metrics until segmentation lands. 
+compute_feature_metrics: false + +save: + save_dir: /hpc/projects/virtual_staining/training/dynacell/a549/predictions/eval_memb_fcmae_vscyto3d_pretrained_mock diff --git a/applications/dynacell/configs/benchmarks/virtual_staining/membrane/fcmae_vscyto3d_pretrained/ipsc_confocal/eval__a549_mantis_zikv.yaml b/applications/dynacell/configs/benchmarks/virtual_staining/membrane/fcmae_vscyto3d_pretrained/ipsc_confocal/eval__a549_mantis_zikv.yaml new file mode 100644 index 000000000..1a8143be7 --- /dev/null +++ b/applications/dynacell/configs/benchmarks/virtual_staining/membrane/fcmae_vscyto3d_pretrained/ipsc_confocal/eval__a549_mantis_zikv.yaml @@ -0,0 +1,22 @@ +# @package _global_ +# Benchmark eval leaf: Membrane (CAAX) predicted by FCMAE_VSCyto3D_Pretrained on a549-mantis-caax-zikv. +# A549 manifest keys membrane by gene (`caax`); override the iPSC-side `membrane` +# target_id from the target group so the resolver finds caax on a549-mantis-caax-zikv. +defaults: + - override /target: membrane + - override /predict_set: a549_mantis_caax_zikv + +target_name: caax +benchmark: + dataset_ref: + target: caax + +io: + pred_path: /hpc/projects/virtual_staining/training/dynacell/a549/predictions/memb_fcmae_vscyto3d_pretrained_zikv.zarr + +# A549 manifests don't carry cell_segmentation paths (no segmentation +# pipeline yet). Skip feature metrics until segmentation lands. 
+compute_feature_metrics: false + +save: + save_dir: /hpc/projects/virtual_staining/training/dynacell/a549/predictions/eval_memb_fcmae_vscyto3d_pretrained_zikv diff --git a/applications/dynacell/configs/benchmarks/virtual_staining/membrane/fcmae_vscyto3d_pretrained/ipsc_confocal/predict__a549_mantis_denv.yml b/applications/dynacell/configs/benchmarks/virtual_staining/membrane/fcmae_vscyto3d_pretrained/ipsc_confocal/predict__a549_mantis_denv.yml new file mode 100644 index 000000000..59d75b8a7 --- /dev/null +++ b/applications/dynacell/configs/benchmarks/virtual_staining/membrane/fcmae_vscyto3d_pretrained/ipsc_confocal/predict__a549_mantis_denv.yml @@ -0,0 +1,48 @@ +# FCMAE_VSCyto3D_Pretrained predict: membrane trained on iPSC, +# predicting against a549-mantis-caax-denv test. +# A549 manifest keys membrane by gene (`caax`); override the iPSC-side +# `membrane` target_id from targets/membrane.yml so the resolver finds +# the caax target on a549-mantis-caax-denv. +base: + - ../../../_internal/shared/model/predict_sets/a549_mantis_caax_denv.yml + - ../../../_internal/shared/model/targets/membrane.yml + - ../../../_internal/shared/model/model_overlays/fcmae_vscyto3d_predict.yml + - ../../../_internal/shared/model/launcher_profiles/mode_predict.yml + - ../../../_internal/shared/model/launcher_profiles/hardware_h200_single.yml + - ../../../_internal/shared/model/launcher_profiles/runtime_shared.yml + +benchmark: + task: virtual_staining + organelle: membrane + trained_on: ipsc_confocal + predict_set: a549_mantis_caax_denv + model_name: fcmae_vscyto3d_pretrained + experiment_id: membrane__ipsc_confocal__fcmae_vscyto3d_pretrained__a549_mantis_caax_denv + # Override the iPSC-side `membrane` target to a549's gene-keyed `caax`. 
+ dataset_ref: + target: caax + +model: + init_args: + ckpt_path: /hpc/projects/comp.micro/virtual_staining/models/dynacell/ipsc/memb/fcmae_vscyto3d_pretrained/checkpoints/epoch=189-step=59280.ckpt + +data: + init_args: + normalizations: + - class_path: viscy_transforms.NormalizeSampled + init_args: + keys: [Phase3D] + level: fov_statistics + subtrahend: mean + divisor: std + augmentations: [] + +trainer: + callbacks: + - class_path: viscy_utils.callbacks.prediction_writer.HCSPredictionWriter + init_args: + output_store: /hpc/projects/virtual_staining/training/dynacell/a549/predictions/memb_fcmae_vscyto3d_pretrained_denv.zarr + +launcher: + job_name: FCMAE_VSCyto3D_Pretrained_PRED_MEMB_ON_A549_DENV + run_root: /hpc/projects/virtual_staining/training/dynacell/a549/predictions diff --git a/applications/dynacell/configs/benchmarks/virtual_staining/membrane/fcmae_vscyto3d_pretrained/ipsc_confocal/predict__a549_mantis_mock.yml b/applications/dynacell/configs/benchmarks/virtual_staining/membrane/fcmae_vscyto3d_pretrained/ipsc_confocal/predict__a549_mantis_mock.yml new file mode 100644 index 000000000..309d3dae6 --- /dev/null +++ b/applications/dynacell/configs/benchmarks/virtual_staining/membrane/fcmae_vscyto3d_pretrained/ipsc_confocal/predict__a549_mantis_mock.yml @@ -0,0 +1,48 @@ +# FCMAE_VSCyto3D_Pretrained predict: membrane trained on iPSC, +# predicting against a549-mantis-caax-mock test. +# A549 manifest keys membrane by gene (`caax`); override the iPSC-side +# `membrane` target_id from targets/membrane.yml so the resolver finds +# the caax target on a549-mantis-caax-mock. 
+base: + - ../../../_internal/shared/model/predict_sets/a549_mantis_caax_mock.yml + - ../../../_internal/shared/model/targets/membrane.yml + - ../../../_internal/shared/model/model_overlays/fcmae_vscyto3d_predict.yml + - ../../../_internal/shared/model/launcher_profiles/mode_predict.yml + - ../../../_internal/shared/model/launcher_profiles/hardware_h200_single.yml + - ../../../_internal/shared/model/launcher_profiles/runtime_shared.yml + +benchmark: + task: virtual_staining + organelle: membrane + trained_on: ipsc_confocal + predict_set: a549_mantis_caax_mock + model_name: fcmae_vscyto3d_pretrained + experiment_id: membrane__ipsc_confocal__fcmae_vscyto3d_pretrained__a549_mantis_caax_mock + # Override the iPSC-side `membrane` target to a549's gene-keyed `caax`. + dataset_ref: + target: caax + +model: + init_args: + ckpt_path: /hpc/projects/comp.micro/virtual_staining/models/dynacell/ipsc/memb/fcmae_vscyto3d_pretrained/checkpoints/epoch=189-step=59280.ckpt + +data: + init_args: + normalizations: + - class_path: viscy_transforms.NormalizeSampled + init_args: + keys: [Phase3D] + level: fov_statistics + subtrahend: mean + divisor: std + augmentations: [] + +trainer: + callbacks: + - class_path: viscy_utils.callbacks.prediction_writer.HCSPredictionWriter + init_args: + output_store: /hpc/projects/virtual_staining/training/dynacell/a549/predictions/memb_fcmae_vscyto3d_pretrained_mock.zarr + +launcher: + job_name: FCMAE_VSCyto3D_Pretrained_PRED_MEMB_ON_A549_MOCK + run_root: /hpc/projects/virtual_staining/training/dynacell/a549/predictions diff --git a/applications/dynacell/configs/benchmarks/virtual_staining/membrane/fcmae_vscyto3d_pretrained/ipsc_confocal/predict__a549_mantis_zikv.yml b/applications/dynacell/configs/benchmarks/virtual_staining/membrane/fcmae_vscyto3d_pretrained/ipsc_confocal/predict__a549_mantis_zikv.yml new file mode 100644 index 000000000..c97983bbb --- /dev/null +++ 
b/applications/dynacell/configs/benchmarks/virtual_staining/membrane/fcmae_vscyto3d_pretrained/ipsc_confocal/predict__a549_mantis_zikv.yml @@ -0,0 +1,48 @@ +# FCMAE_VSCyto3D_Pretrained predict: membrane trained on iPSC, +# predicting against a549-mantis-caax-zikv test. +# A549 manifest keys membrane by gene (`caax`); override the iPSC-side +# `membrane` target_id from targets/membrane.yml so the resolver finds +# the caax target on a549-mantis-caax-zikv. +base: + - ../../../_internal/shared/model/predict_sets/a549_mantis_caax_zikv.yml + - ../../../_internal/shared/model/targets/membrane.yml + - ../../../_internal/shared/model/model_overlays/fcmae_vscyto3d_predict.yml + - ../../../_internal/shared/model/launcher_profiles/mode_predict.yml + - ../../../_internal/shared/model/launcher_profiles/hardware_h200_single.yml + - ../../../_internal/shared/model/launcher_profiles/runtime_shared.yml + +benchmark: + task: virtual_staining + organelle: membrane + trained_on: ipsc_confocal + predict_set: a549_mantis_caax_zikv + model_name: fcmae_vscyto3d_pretrained + experiment_id: membrane__ipsc_confocal__fcmae_vscyto3d_pretrained__a549_mantis_caax_zikv + # Override the iPSC-side `membrane` target to a549's gene-keyed `caax`. 
+ dataset_ref: + target: caax + +model: + init_args: + ckpt_path: /hpc/projects/comp.micro/virtual_staining/models/dynacell/ipsc/memb/fcmae_vscyto3d_pretrained/checkpoints/epoch=189-step=59280.ckpt + +data: + init_args: + normalizations: + - class_path: viscy_transforms.NormalizeSampled + init_args: + keys: [Phase3D] + level: fov_statistics + subtrahend: mean + divisor: std + augmentations: [] + +trainer: + callbacks: + - class_path: viscy_utils.callbacks.prediction_writer.HCSPredictionWriter + init_args: + output_store: /hpc/projects/virtual_staining/training/dynacell/a549/predictions/memb_fcmae_vscyto3d_pretrained_zikv.zarr + +launcher: + job_name: FCMAE_VSCyto3D_Pretrained_PRED_MEMB_ON_A549_ZIKV + run_root: /hpc/projects/virtual_staining/training/dynacell/a549/predictions diff --git a/applications/dynacell/configs/benchmarks/virtual_staining/membrane/fcmae_vscyto3d_pretrained/ipsc_confocal/predict__ipsc_confocal.yml b/applications/dynacell/configs/benchmarks/virtual_staining/membrane/fcmae_vscyto3d_pretrained/ipsc_confocal/predict__ipsc_confocal.yml new file mode 100644 index 000000000..c8f9a2523 --- /dev/null +++ b/applications/dynacell/configs/benchmarks/virtual_staining/membrane/fcmae_vscyto3d_pretrained/ipsc_confocal/predict__ipsc_confocal.yml @@ -0,0 +1,41 @@ +# FCMAE_VSCyto3D_Pretrained predict: membrane (CAAX) against ipsc_confocal test_cropped. 
+base: + - ../../../_internal/shared/model/predict_sets/ipsc_confocal.yml + - ../../../_internal/shared/model/targets/membrane.yml + - ../../../_internal/shared/model/model_overlays/fcmae_vscyto3d_predict.yml + - ../../../_internal/shared/model/launcher_profiles/mode_predict.yml + - ../../../_internal/shared/model/launcher_profiles/hardware_h200_single.yml + - ../../../_internal/shared/model/launcher_profiles/runtime_shared.yml + +benchmark: + task: virtual_staining + organelle: membrane + trained_on: ipsc_confocal + predict_set: ipsc_confocal + model_name: fcmae_vscyto3d_pretrained + experiment_id: membrane__ipsc_confocal__fcmae_vscyto3d_pretrained__ipsc_confocal + +model: + init_args: + ckpt_path: /hpc/projects/comp.micro/virtual_staining/models/dynacell/ipsc/memb/fcmae_vscyto3d_pretrained/checkpoints/epoch=189-step=59280.ckpt + +data: + init_args: + normalizations: + - class_path: viscy_transforms.NormalizeSampled + init_args: + keys: [Phase3D] + level: fov_statistics + subtrahend: mean + divisor: std + augmentations: [] + +trainer: + callbacks: + - class_path: viscy_utils.callbacks.prediction_writer.HCSPredictionWriter + init_args: + output_store: /hpc/projects/virtual_staining/training/dynacell/ipsc/predictions/memb_fcmae_vscyto3d_pretrained.zarr + +launcher: + job_name: FCMAE_VSCyto3D_Pretrained_PRED_MEMB + run_root: /hpc/projects/virtual_staining/training/dynacell/ipsc/predictions diff --git a/applications/dynacell/configs/benchmarks/virtual_staining/membrane/fcmae_vscyto3d_pretrained/ipsc_confocal/train.yml b/applications/dynacell/configs/benchmarks/virtual_staining/membrane/fcmae_vscyto3d_pretrained/ipsc_confocal/train.yml new file mode 100644 index 000000000..4fb87dd32 --- /dev/null +++ b/applications/dynacell/configs/benchmarks/virtual_staining/membrane/fcmae_vscyto3d_pretrained/ipsc_confocal/train.yml @@ -0,0 +1,64 @@ +# FCMAE-class (FullyConvolutionalMAE, pretraining=False) with FCMAE- +# pretrained encoder init on membrane (Membrane marker). 
Companion to +# fcmae_vscyto3d_scratch.yml — the two leaves are identical except this +# one loads encoder weights from the published VSCyto3D FCMAE ckpt +# (400 ep on HEK + A549 + iPSC phase data). See vs_test/finetune_3d.py +# for the canonical recipe. +base: + - ../../../_internal/shared/model/train_sets/ipsc_confocal.yml + - ../../../_internal/shared/model/targets/membrane.yml + - ../../../_internal/shared/model/data_overlays/fcmae_vscyto3d_fit.yml + - ../../../_internal/shared/model/model_overlays/fcmae_vscyto3d_fit.yml + - ../../../_internal/shared/model/launcher_profiles/mode_fit.yml + - ../../../_internal/shared/model/launcher_profiles/hardware_4gpu.yml + - ../../../_internal/shared/model/launcher_profiles/runtime_shared.yml + +benchmark: + task: virtual_staining + organelle: membrane + train_set: ipsc_confocal + model_name: fcmae_vscyto3d_pretrained + experiment_id: membrane__ipsc_confocal__fcmae_vscyto3d_pretrained + +# Override the FCMAE data overlay's hardcoded `Structure` augmentation +# keys (the overlay was authored for ER/Mito where target_channel == +# "Structure"). RandWeightedCropd needs the actual membrane channel +# name in keys/w_key. spatial_size + num_samples kept identical to the +# FCMAE overlay so the augmentation policy matches ER/Mito. +data: + init_args: + augmentations: + - class_path: viscy_transforms.RandWeightedCropd + init_args: + keys: [Phase3D, Membrane] + w_key: Membrane + spatial_size: [20, 600, 600] + num_samples: 4 + +model: + init_args: + # Load only the encoder from the canonical VSCyto3D FCMAE ckpt — + # decoder/head stay at fresh init. Matches vs_test/finetune_3d.py:247. 
+ encoder_only: true + ckpt_path: /hpc/projects/virtual_staining/models/mehta-lab/VSCyto3D/fcmae.ckpt + +trainer: + logger: + init_args: + name: FCMAE_VSCyto3D_Pretrained_iPSC_Membrane + save_dir: /hpc/projects/comp.micro/virtual_staining/models/dynacell/ipsc/memb/fcmae_vscyto3d_pretrained + callbacks: + - class_path: lightning.pytorch.callbacks.LearningRateMonitor + init_args: + logging_interval: step + - class_path: lightning.pytorch.callbacks.ModelCheckpoint + init_args: + monitor: loss/validate + every_n_epochs: 1 + save_top_k: 5 + save_last: true + dirpath: /hpc/projects/comp.micro/virtual_staining/models/dynacell/ipsc/memb/fcmae_vscyto3d_pretrained/checkpoints + +launcher: + job_name: FCMAE_VSCyto3D_Pretrained_Membrane + run_root: /hpc/projects/comp.micro/virtual_staining/models/dynacell/ipsc/memb/fcmae_vscyto3d_pretrained diff --git a/applications/dynacell/configs/benchmarks/virtual_staining/membrane/fcmae_vscyto3d_pretrained/joint_ipsc_confocal_a549_mantis/predict__a549_mantis_denv.yml b/applications/dynacell/configs/benchmarks/virtual_staining/membrane/fcmae_vscyto3d_pretrained/joint_ipsc_confocal_a549_mantis/predict__a549_mantis_denv.yml new file mode 100644 index 000000000..720334ea4 --- /dev/null +++ b/applications/dynacell/configs/benchmarks/virtual_staining/membrane/fcmae_vscyto3d_pretrained/joint_ipsc_confocal_a549_mantis/predict__a549_mantis_denv.yml @@ -0,0 +1,49 @@ +# FCMAE_VSCyto3D_Pretrained (VSCyto3D) predict: cell membrane trained on joint +# iPSC+A549, predicting against a549-mantis-caax-denv test. +# Best val-loss checkpoint from job 31822529 (epoch 111, loss/validate=0.3754). +# A549 manifest keys membrane by gene (`caax`); override the iPSC-side `membrane` +# target_id from targets/membrane.yml so the resolver finds the caax target on +# a549-mantis-caax-denv. 
+base: + - ../../../_internal/shared/model/predict_sets/a549_mantis_caax_denv.yml + - ../../../_internal/shared/model/targets/membrane.yml + - ../../../_internal/shared/model/model_overlays/fcmae_vscyto3d_predict.yml + - ../../../_internal/shared/model/launcher_profiles/mode_predict.yml + - ../../../_internal/shared/model/launcher_profiles/hardware_h200_single.yml + - ../../../_internal/shared/model/launcher_profiles/runtime_shared.yml + +benchmark: + task: virtual_staining + organelle: membrane + trained_on: joint_ipsc_confocal_a549_mantis + predict_set: a549_mantis_caax_denv + model_name: fcmae_vscyto3d_pretrained + experiment_id: membrane__joint_ipsc_confocal_a549_mantis__fcmae_vscyto3d_pretrained__a549_mantis_caax_denv + # Override the iPSC-side `membrane` target to a549's gene-keyed `caax`. + dataset_ref: + target: caax + +model: + init_args: + ckpt_path: /hpc/projects/comp.micro/virtual_staining/models/dynacell/joint_ipsc_confocal_a549_mantis/memb/fcmae_vscyto3d_pretrained/checkpoints/epoch=111-step=59360.ckpt + +data: + init_args: + normalizations: + - class_path: viscy_transforms.NormalizeSampled + init_args: + keys: [Phase3D] + level: fov_statistics + subtrahend: mean + divisor: std + augmentations: [] + +trainer: + callbacks: + - class_path: viscy_utils.callbacks.prediction_writer.HCSPredictionWriter + init_args: + output_store: /hpc/projects/virtual_staining/training/dynacell/a549/predictions/memb_fcmae_vscyto3d_pretrained_jointtrained_denv.zarr + +launcher: + job_name: FCMAE_VSCyto3D_Pretrained_PRED_MEMB_JOINTTR_DENV + run_root: /hpc/projects/virtual_staining/training/dynacell/a549/predictions diff --git a/applications/dynacell/configs/benchmarks/virtual_staining/membrane/fcmae_vscyto3d_pretrained/joint_ipsc_confocal_a549_mantis/predict__a549_mantis_mock.yml b/applications/dynacell/configs/benchmarks/virtual_staining/membrane/fcmae_vscyto3d_pretrained/joint_ipsc_confocal_a549_mantis/predict__a549_mantis_mock.yml new file mode 100644 index 
000000000..90053e201 --- /dev/null +++ b/applications/dynacell/configs/benchmarks/virtual_staining/membrane/fcmae_vscyto3d_pretrained/joint_ipsc_confocal_a549_mantis/predict__a549_mantis_mock.yml @@ -0,0 +1,49 @@ +# FCMAE_VSCyto3D_Pretrained (VSCyto3D) predict: cell membrane trained on joint +# iPSC+A549, predicting against a549-mantis-caax-mock test. +# Best val-loss checkpoint from job 31822529 (epoch 111, loss/validate=0.3754). +# A549 manifest keys membrane by gene (`caax`); override the iPSC-side `membrane` +# target_id from targets/membrane.yml so the resolver finds the caax target on +# a549-mantis-caax-mock. +base: + - ../../../_internal/shared/model/predict_sets/a549_mantis_caax_mock.yml + - ../../../_internal/shared/model/targets/membrane.yml + - ../../../_internal/shared/model/model_overlays/fcmae_vscyto3d_predict.yml + - ../../../_internal/shared/model/launcher_profiles/mode_predict.yml + - ../../../_internal/shared/model/launcher_profiles/hardware_h200_single.yml + - ../../../_internal/shared/model/launcher_profiles/runtime_shared.yml + +benchmark: + task: virtual_staining + organelle: membrane + trained_on: joint_ipsc_confocal_a549_mantis + predict_set: a549_mantis_caax_mock + model_name: fcmae_vscyto3d_pretrained + experiment_id: membrane__joint_ipsc_confocal_a549_mantis__fcmae_vscyto3d_pretrained__a549_mantis_caax_mock + # Override the iPSC-side `membrane` target to a549's gene-keyed `caax`. 
+ dataset_ref: + target: caax + +model: + init_args: + ckpt_path: /hpc/projects/comp.micro/virtual_staining/models/dynacell/joint_ipsc_confocal_a549_mantis/memb/fcmae_vscyto3d_pretrained/checkpoints/epoch=111-step=59360.ckpt + +data: + init_args: + normalizations: + - class_path: viscy_transforms.NormalizeSampled + init_args: + keys: [Phase3D] + level: fov_statistics + subtrahend: mean + divisor: std + augmentations: [] + +trainer: + callbacks: + - class_path: viscy_utils.callbacks.prediction_writer.HCSPredictionWriter + init_args: + output_store: /hpc/projects/virtual_staining/training/dynacell/a549/predictions/memb_fcmae_vscyto3d_pretrained_jointtrained_mock.zarr + +launcher: + job_name: FCMAE_VSCyto3D_Pretrained_PRED_MEMB_JOINTTR_MOCK + run_root: /hpc/projects/virtual_staining/training/dynacell/a549/predictions diff --git a/applications/dynacell/configs/benchmarks/virtual_staining/membrane/fcmae_vscyto3d_pretrained/joint_ipsc_confocal_a549_mantis/predict__a549_mantis_zikv.yml b/applications/dynacell/configs/benchmarks/virtual_staining/membrane/fcmae_vscyto3d_pretrained/joint_ipsc_confocal_a549_mantis/predict__a549_mantis_zikv.yml new file mode 100644 index 000000000..4e092905c --- /dev/null +++ b/applications/dynacell/configs/benchmarks/virtual_staining/membrane/fcmae_vscyto3d_pretrained/joint_ipsc_confocal_a549_mantis/predict__a549_mantis_zikv.yml @@ -0,0 +1,49 @@ +# FCMAE_VSCyto3D_Pretrained (VSCyto3D) predict: cell membrane trained on joint +# iPSC+A549, predicting against a549-mantis-caax-zikv test. +# Best val-loss checkpoint from job 31822529 (epoch 111, loss/validate=0.3754). +# A549 manifest keys membrane by gene (`caax`); override the iPSC-side `membrane` +# target_id from targets/membrane.yml so the resolver finds the caax target on +# a549-mantis-caax-zikv. 
+base: + - ../../../_internal/shared/model/predict_sets/a549_mantis_caax_zikv.yml + - ../../../_internal/shared/model/targets/membrane.yml + - ../../../_internal/shared/model/model_overlays/fcmae_vscyto3d_predict.yml + - ../../../_internal/shared/model/launcher_profiles/mode_predict.yml + - ../../../_internal/shared/model/launcher_profiles/hardware_h200_single.yml + - ../../../_internal/shared/model/launcher_profiles/runtime_shared.yml + +benchmark: + task: virtual_staining + organelle: membrane + trained_on: joint_ipsc_confocal_a549_mantis + predict_set: a549_mantis_caax_zikv + model_name: fcmae_vscyto3d_pretrained + experiment_id: membrane__joint_ipsc_confocal_a549_mantis__fcmae_vscyto3d_pretrained__a549_mantis_caax_zikv + # Override the iPSC-side `membrane` target to a549's gene-keyed `caax`. + dataset_ref: + target: caax + +model: + init_args: + ckpt_path: /hpc/projects/comp.micro/virtual_staining/models/dynacell/joint_ipsc_confocal_a549_mantis/memb/fcmae_vscyto3d_pretrained/checkpoints/epoch=111-step=59360.ckpt + +data: + init_args: + normalizations: + - class_path: viscy_transforms.NormalizeSampled + init_args: + keys: [Phase3D] + level: fov_statistics + subtrahend: mean + divisor: std + augmentations: [] + +trainer: + callbacks: + - class_path: viscy_utils.callbacks.prediction_writer.HCSPredictionWriter + init_args: + output_store: /hpc/projects/virtual_staining/training/dynacell/a549/predictions/memb_fcmae_vscyto3d_pretrained_jointtrained_zikv.zarr + +launcher: + job_name: FCMAE_VSCyto3D_Pretrained_PRED_MEMB_JOINTTR_ZIKV + run_root: /hpc/projects/virtual_staining/training/dynacell/a549/predictions diff --git a/applications/dynacell/configs/benchmarks/virtual_staining/membrane/fcmae_vscyto3d_pretrained/joint_ipsc_confocal_a549_mantis/predict__ipsc_confocal.yml b/applications/dynacell/configs/benchmarks/virtual_staining/membrane/fcmae_vscyto3d_pretrained/joint_ipsc_confocal_a549_mantis/predict__ipsc_confocal.yml new file mode 100644 index 000000000..c238d0325 
--- /dev/null +++ b/applications/dynacell/configs/benchmarks/virtual_staining/membrane/fcmae_vscyto3d_pretrained/joint_ipsc_confocal_a549_mantis/predict__ipsc_confocal.yml @@ -0,0 +1,45 @@ +# FCMAE_VSCyto3D_Pretrained (VSCyto3D) predict: cell membrane trained on joint +# iPSC+A549, predicting against ipsc_confocal test_cropped. +# Best val-loss checkpoint from job 31822529 (epoch 111, loss/validate=0.3754). +# Wandb run 20260501-004706_FCMAE_VSCyto3D_Pretrained_JOINT_MEMB (state=finished, +# 119 ep / 63,179 steps). +base: + - ../../../_internal/shared/model/predict_sets/ipsc_confocal.yml + - ../../../_internal/shared/model/targets/membrane.yml + - ../../../_internal/shared/model/model_overlays/fcmae_vscyto3d_predict.yml + - ../../../_internal/shared/model/launcher_profiles/mode_predict.yml + - ../../../_internal/shared/model/launcher_profiles/hardware_h200_single.yml + - ../../../_internal/shared/model/launcher_profiles/runtime_shared.yml + +benchmark: + task: virtual_staining + organelle: membrane + trained_on: joint_ipsc_confocal_a549_mantis + predict_set: ipsc_confocal + model_name: fcmae_vscyto3d_pretrained + experiment_id: membrane__joint_ipsc_confocal_a549_mantis__fcmae_vscyto3d_pretrained__ipsc_confocal + +model: + init_args: + ckpt_path: /hpc/projects/comp.micro/virtual_staining/models/dynacell/joint_ipsc_confocal_a549_mantis/memb/fcmae_vscyto3d_pretrained/checkpoints/epoch=111-step=59360.ckpt + +data: + init_args: + normalizations: + - class_path: viscy_transforms.NormalizeSampled + init_args: + keys: [Phase3D] + level: fov_statistics + subtrahend: mean + divisor: std + augmentations: [] + +trainer: + callbacks: + - class_path: viscy_utils.callbacks.prediction_writer.HCSPredictionWriter + init_args: + output_store: /hpc/projects/virtual_staining/training/dynacell/ipsc/predictions/memb_fcmae_vscyto3d_pretrained_jointtrained.zarr + +launcher: + job_name: FCMAE_VSCyto3D_Pretrained_PRED_MEMB_JOINTTR_IPSC + run_root: 
/hpc/projects/virtual_staining/training/dynacell/ipsc/predictions diff --git a/applications/dynacell/configs/benchmarks/virtual_staining/membrane/fcmae_vscyto3d_pretrained/joint_ipsc_confocal_a549_mantis/train.yml b/applications/dynacell/configs/benchmarks/virtual_staining/membrane/fcmae_vscyto3d_pretrained/joint_ipsc_confocal_a549_mantis/train.yml new file mode 100644 index 000000000..9e543f5b1 --- /dev/null +++ b/applications/dynacell/configs/benchmarks/virtual_staining/membrane/fcmae_vscyto3d_pretrained/joint_ipsc_confocal_a549_mantis/train.yml @@ -0,0 +1,154 @@ +# FCMAE-class (FullyConvolutionalMAE, pretraining=False) with FCMAE- +# pretrained encoder init on membrane (MEMB) — joint +# ipsc_confocal + a549_mantis pooled. Companion to +# fcmae_vscyto3d_scratch joint leaf — the two are identical except +# this one loads encoder weights from the published VSCyto3D FCMAE +# ckpt (400 ep on HEK + A549 + iPSC phase data). Mirrors +# membrane/fcmae_vscyto3d_pretrained/ipsc_confocal/train.yml on +# the joint train_set. +# +# Joint leaf per Stage 7 of A549_EXPANSION_ROADMAP.md. Uses +# BatchedConcatDataModule with two explicit HCSDataModule children +# (no benchmark.dataset_ref — joint leaves bypass the single-dataset +# resolver). Only model_overlays/fcmae_vscyto3d_fit.yml is composed; +# the data block is authored inline because joint hparams live on +# the children. +# +# Topology: 4-GPU DDP (inherited from +# model_overlays/fcmae_vscyto3d_fit.yml's ddp_4gpu base; the overlay +# also pins strategy=ddp_find_unused_parameters_true because +# FullyConvolutionalMAE has decoder/head params that only receive +# gradients on some forward paths). 
+base: + - ../../../_internal/shared/model/model_overlays/fcmae_vscyto3d_fit.yml + - ../../../_internal/shared/model/launcher_profiles/mode_fit.yml + - ../../../_internal/shared/model/launcher_profiles/hardware_4gpu.yml + - ../../../_internal/shared/model/launcher_profiles/runtime_shared.yml + +benchmark: + task: virtual_staining + organelle: membrane + gene: Membrane + target: membrane + target_id: membrane + train_set: joint_ipsc_confocal_a549_mantis + model_name: fcmae_vscyto3d_pretrained + experiment_id: membrane__joint_ipsc_confocal_a549_mantis__fcmae_vscyto3d_pretrained + +model: + init_args: + # Load only the encoder from the canonical VSCyto3D FCMAE ckpt — + # decoder/head stay at fresh init. Matches vs_test/finetune_3d.py:247. + encoder_only: true + ckpt_path: /hpc/projects/virtual_staining/models/mehta-lab/VSCyto3D/fcmae.ckpt + +trainer: + logger: + init_args: + name: FCMAE_VSCyto3D_Pretrained_JOINT_MEMB + save_dir: /hpc/projects/comp.micro/virtual_staining/models/dynacell/joint_ipsc_confocal_a549_mantis/memb/fcmae_vscyto3d_pretrained + callbacks: + - class_path: lightning.pytorch.callbacks.LearningRateMonitor + init_args: + logging_interval: step + - class_path: lightning.pytorch.callbacks.ModelCheckpoint + init_args: + monitor: loss/validate + every_n_epochs: 1 + save_top_k: 5 + save_last: true + dirpath: /hpc/projects/comp.micro/virtual_staining/models/dynacell/joint_ipsc_confocal_a549_mantis/memb/fcmae_vscyto3d_pretrained/checkpoints + +_hcs_init_args: &hcs_init_args + source_channel: Phase3D + target_channel: Membrane + z_window_size: 20 + # See nucleus/fnet3d_paper/joint_*/train.yml for the rationale: joint + # mode does not divide batch_size by num_samples, so 8 * 4 = 32 GPU + # samples per DDP rank matches single-set effective batch. 
+ batch_size: 8 + num_workers: 4 + yx_patch_size: [384, 384] + split_ratio: 0.8 + mmap_preload: true + scratch_dir: /dev/shm + persistent_workers: true + normalizations: + - class_path: viscy_transforms.NormalizeSampled + init_args: + keys: [Phase3D] + level: fov_statistics + subtrahend: mean + divisor: std + - class_path: viscy_transforms.NormalizeSampled + init_args: + keys: [Membrane] + level: fov_statistics + subtrahend: median + divisor: iqr + augmentations: + - class_path: viscy_transforms.RandWeightedCropd + init_args: + keys: [Phase3D, Membrane] + w_key: Membrane + spatial_size: [20, 600, 600] + num_samples: 4 + gpu_augmentations: + - class_path: viscy_transforms.BatchedRandAffined + init_args: + keys: [source, target] + prob: 0.8 + rotate_range: [3.14, 0, 0] + shear_range: [0.0, 0.05, 0.05] + scale_range: [[0.7, 1.3], [0.5, 1.5], [0.5, 1.5]] + - class_path: viscy_transforms.BatchedCenterSpatialCropd + init_args: + keys: [source, target] + roi_size: [15, 384, 384] + - class_path: viscy_transforms.BatchedRandAdjustContrastd + init_args: + keys: [source] + prob: 0.5 + gamma: [0.8, 1.2] + - class_path: viscy_transforms.BatchedRandScaleIntensityd + init_args: + keys: [source] + prob: 0.5 + factors: 0.5 + - class_path: viscy_transforms.BatchedRandGaussianNoised + init_args: + keys: [source] + prob: 0.5 + mean: 0.0 + std: 0.3 + - class_path: viscy_transforms.BatchedRandGaussianSmoothd + init_args: + keys: [source] + prob: 0.5 + sigma_x: [0.25, 0.75] + sigma_y: [0.25, 0.75] + sigma_z: [0.25, 0.75] + val_gpu_augmentations: + - class_path: viscy_transforms.BatchedCenterSpatialCropd + init_args: + keys: [source, target] + roi_size: [15, 384, 384] + +data: + class_path: viscy_data.BatchedConcatDataModule + init_args: + data_modules: + # ipsc_confocal — aics-hipsc multi-marker cell.zarr (Membrane channel) + - class_path: viscy_data.hcs.HCSDataModule + init_args: + <<: *hcs_init_args + data_path: 
/hpc/projects/virtual_staining/training/dynacell/ipsc/dataset_v4/train/cell.zarr + # a549_mantis — pooled CAAX all-conditions train store (Membrane channel) + - class_path: viscy_data.hcs.HCSDataModule + init_args: + <<: *hcs_init_args + data_path: /hpc/projects/virtual_staining/training/dynacell/a549/mantis_v1/train/CAAX_all.zarr + +launcher: + job_name: FCMAE_VSCyto3D_Pretrained_JOINT_MEMB + run_root: /hpc/projects/comp.micro/virtual_staining/models/dynacell/joint_ipsc_confocal_a549_mantis/memb/fcmae_vscyto3d_pretrained diff --git a/applications/dynacell/configs/benchmarks/virtual_staining/membrane/fcmae_vscyto3d_scratch/a549_mantis/predict__a549_mantis_denv.yml b/applications/dynacell/configs/benchmarks/virtual_staining/membrane/fcmae_vscyto3d_scratch/a549_mantis/predict__a549_mantis_denv.yml new file mode 100644 index 000000000..e7f420a5e --- /dev/null +++ b/applications/dynacell/configs/benchmarks/virtual_staining/membrane/fcmae_vscyto3d_scratch/a549_mantis/predict__a549_mantis_denv.yml @@ -0,0 +1,49 @@ +# FCMAE_VSCyto3D_Scratch (UNeXt2) predict: membrane trained on a549_mantis (caax), +# predicting against a549-mantis-caax-denv test. +# Best val-loss checkpoint from job 31822574 (epoch 119, loss/validate=0.2722). +# A549 manifest keys membrane by gene (`caax`); override the iPSC-side `membrane` +# target_id from targets/membrane.yml so the resolver finds the caax target on +# a549-mantis-caax-denv. 
+base: + - ../../../_internal/shared/model/predict_sets/a549_mantis_caax_denv.yml + - ../../../_internal/shared/model/targets/membrane.yml + - ../../../_internal/shared/model/model_overlays/fcmae_vscyto3d_predict.yml + - ../../../_internal/shared/model/launcher_profiles/mode_predict.yml + - ../../../_internal/shared/model/launcher_profiles/hardware_h200_single.yml + - ../../../_internal/shared/model/launcher_profiles/runtime_shared.yml + +benchmark: + task: virtual_staining + organelle: membrane + trained_on: a549_mantis + predict_set: a549_mantis_caax_denv + model_name: fcmae_vscyto3d_scratch + experiment_id: membrane__a549_mantis__fcmae_vscyto3d_scratch__a549_mantis_caax_denv + # Override the iPSC-side `membrane` target to a549's gene-keyed `caax`. + dataset_ref: + target: caax + +model: + init_args: + ckpt_path: /hpc/projects/comp.micro/virtual_staining/models/dynacell/a549_mantis/memb/fcmae_vscyto3d_scratch/checkpoints/epoch=119-step=26040.ckpt + +data: + init_args: + normalizations: + - class_path: viscy_transforms.NormalizeSampled + init_args: + keys: [Phase3D] + level: fov_statistics + subtrahend: mean + divisor: std + augmentations: [] + +trainer: + callbacks: + - class_path: viscy_utils.callbacks.prediction_writer.HCSPredictionWriter + init_args: + output_store: /hpc/projects/virtual_staining/training/dynacell/a549/predictions/memb_fcmae_vscyto3d_scratch_a549trained_denv.zarr + +launcher: + job_name: FCMAE_VSCyto3D_Scratch_PRED_MEMB_A549TR_DENV + run_root: /hpc/projects/virtual_staining/training/dynacell/a549/predictions diff --git a/applications/dynacell/configs/benchmarks/virtual_staining/membrane/fcmae_vscyto3d_scratch/a549_mantis/predict__a549_mantis_mock.yml b/applications/dynacell/configs/benchmarks/virtual_staining/membrane/fcmae_vscyto3d_scratch/a549_mantis/predict__a549_mantis_mock.yml new file mode 100644 index 000000000..dc0121cd6 --- /dev/null +++ 
b/applications/dynacell/configs/benchmarks/virtual_staining/membrane/fcmae_vscyto3d_scratch/a549_mantis/predict__a549_mantis_mock.yml @@ -0,0 +1,49 @@ +# FCMAE_VSCyto3D_Scratch (UNeXt2) predict: membrane trained on a549_mantis (caax), +# predicting against a549-mantis-caax-mock test. +# Best val-loss checkpoint from job 31822574 (epoch 119, loss/validate=0.2722). +# A549 manifest keys membrane by gene (`caax`); override the iPSC-side `membrane` +# target_id from targets/membrane.yml so the resolver finds the caax target on +# a549-mantis-caax-mock. +base: + - ../../../_internal/shared/model/predict_sets/a549_mantis_caax_mock.yml + - ../../../_internal/shared/model/targets/membrane.yml + - ../../../_internal/shared/model/model_overlays/fcmae_vscyto3d_predict.yml + - ../../../_internal/shared/model/launcher_profiles/mode_predict.yml + - ../../../_internal/shared/model/launcher_profiles/hardware_h200_single.yml + - ../../../_internal/shared/model/launcher_profiles/runtime_shared.yml + +benchmark: + task: virtual_staining + organelle: membrane + trained_on: a549_mantis + predict_set: a549_mantis_caax_mock + model_name: fcmae_vscyto3d_scratch + experiment_id: membrane__a549_mantis__fcmae_vscyto3d_scratch__a549_mantis_caax_mock + # Override the iPSC-side `membrane` target to a549's gene-keyed `caax`. 
+ dataset_ref: + target: caax + +model: + init_args: + ckpt_path: /hpc/projects/comp.micro/virtual_staining/models/dynacell/a549_mantis/memb/fcmae_vscyto3d_scratch/checkpoints/epoch=119-step=26040.ckpt + +data: + init_args: + normalizations: + - class_path: viscy_transforms.NormalizeSampled + init_args: + keys: [Phase3D] + level: fov_statistics + subtrahend: mean + divisor: std + augmentations: [] + +trainer: + callbacks: + - class_path: viscy_utils.callbacks.prediction_writer.HCSPredictionWriter + init_args: + output_store: /hpc/projects/virtual_staining/training/dynacell/a549/predictions/memb_fcmae_vscyto3d_scratch_a549trained_mock.zarr + +launcher: + job_name: FCMAE_VSCyto3D_Scratch_PRED_MEMB_A549TR_MOCK + run_root: /hpc/projects/virtual_staining/training/dynacell/a549/predictions diff --git a/applications/dynacell/configs/benchmarks/virtual_staining/membrane/fcmae_vscyto3d_scratch/a549_mantis/predict__a549_mantis_zikv.yml b/applications/dynacell/configs/benchmarks/virtual_staining/membrane/fcmae_vscyto3d_scratch/a549_mantis/predict__a549_mantis_zikv.yml new file mode 100644 index 000000000..3139c23e6 --- /dev/null +++ b/applications/dynacell/configs/benchmarks/virtual_staining/membrane/fcmae_vscyto3d_scratch/a549_mantis/predict__a549_mantis_zikv.yml @@ -0,0 +1,49 @@ +# FCMAE_VSCyto3D_Scratch (UNeXt2) predict: membrane trained on a549_mantis (caax), +# predicting against a549-mantis-caax-zikv test. +# Best val-loss checkpoint from job 31822574 (epoch 119, loss/validate=0.2722). +# A549 manifest keys membrane by gene (`caax`); override the iPSC-side `membrane` +# target_id from targets/membrane.yml so the resolver finds the caax target on +# a549-mantis-caax-zikv. 
+base: + - ../../../_internal/shared/model/predict_sets/a549_mantis_caax_zikv.yml + - ../../../_internal/shared/model/targets/membrane.yml + - ../../../_internal/shared/model/model_overlays/fcmae_vscyto3d_predict.yml + - ../../../_internal/shared/model/launcher_profiles/mode_predict.yml + - ../../../_internal/shared/model/launcher_profiles/hardware_h200_single.yml + - ../../../_internal/shared/model/launcher_profiles/runtime_shared.yml + +benchmark: + task: virtual_staining + organelle: membrane + trained_on: a549_mantis + predict_set: a549_mantis_caax_zikv + model_name: fcmae_vscyto3d_scratch + experiment_id: membrane__a549_mantis__fcmae_vscyto3d_scratch__a549_mantis_caax_zikv + # Override the iPSC-side `membrane` target to a549's gene-keyed `caax`. + dataset_ref: + target: caax + +model: + init_args: + ckpt_path: /hpc/projects/comp.micro/virtual_staining/models/dynacell/a549_mantis/memb/fcmae_vscyto3d_scratch/checkpoints/epoch=119-step=26040.ckpt + +data: + init_args: + normalizations: + - class_path: viscy_transforms.NormalizeSampled + init_args: + keys: [Phase3D] + level: fov_statistics + subtrahend: mean + divisor: std + augmentations: [] + +trainer: + callbacks: + - class_path: viscy_utils.callbacks.prediction_writer.HCSPredictionWriter + init_args: + output_store: /hpc/projects/virtual_staining/training/dynacell/a549/predictions/memb_fcmae_vscyto3d_scratch_a549trained_zikv.zarr + +launcher: + job_name: FCMAE_VSCyto3D_Scratch_PRED_MEMB_A549TR_ZIKV + run_root: /hpc/projects/virtual_staining/training/dynacell/a549/predictions diff --git a/applications/dynacell/configs/benchmarks/virtual_staining/membrane/fcmae_vscyto3d_scratch/a549_mantis/predict__ipsc_confocal.yml b/applications/dynacell/configs/benchmarks/virtual_staining/membrane/fcmae_vscyto3d_scratch/a549_mantis/predict__ipsc_confocal.yml new file mode 100644 index 000000000..1ec19e263 --- /dev/null +++ 
b/applications/dynacell/configs/benchmarks/virtual_staining/membrane/fcmae_vscyto3d_scratch/a549_mantis/predict__ipsc_confocal.yml @@ -0,0 +1,43 @@ +# FCMAE_VSCyto3D_Scratch (UNeXt2) predict: membrane trained on a549_mantis (caax), +# predicting against ipsc_confocal test_cropped. +# Best val-loss checkpoint from job 31822574 (epoch 119, loss/validate=0.2722). +base: + - ../../../_internal/shared/model/predict_sets/ipsc_confocal.yml + - ../../../_internal/shared/model/targets/membrane.yml + - ../../../_internal/shared/model/model_overlays/fcmae_vscyto3d_predict.yml + - ../../../_internal/shared/model/launcher_profiles/mode_predict.yml + - ../../../_internal/shared/model/launcher_profiles/hardware_h200_single.yml + - ../../../_internal/shared/model/launcher_profiles/runtime_shared.yml + +benchmark: + task: virtual_staining + organelle: membrane + trained_on: a549_mantis + predict_set: ipsc_confocal + model_name: fcmae_vscyto3d_scratch + experiment_id: membrane__a549_mantis__fcmae_vscyto3d_scratch__ipsc_confocal + +model: + init_args: + ckpt_path: /hpc/projects/comp.micro/virtual_staining/models/dynacell/a549_mantis/memb/fcmae_vscyto3d_scratch/checkpoints/epoch=119-step=26040.ckpt + +data: + init_args: + normalizations: + - class_path: viscy_transforms.NormalizeSampled + init_args: + keys: [Phase3D] + level: fov_statistics + subtrahend: mean + divisor: std + augmentations: [] + +trainer: + callbacks: + - class_path: viscy_utils.callbacks.prediction_writer.HCSPredictionWriter + init_args: + output_store: /hpc/projects/virtual_staining/training/dynacell/ipsc/predictions/memb_fcmae_vscyto3d_scratch_a549trained.zarr + +launcher: + job_name: FCMAE_VSCyto3D_Scratch_PRED_MEMB_A549TR_IPSC + run_root: /hpc/projects/virtual_staining/training/dynacell/ipsc/predictions diff --git a/applications/dynacell/configs/benchmarks/virtual_staining/membrane/fcmae_vscyto3d_scratch/a549_mantis/train.yml 
b/applications/dynacell/configs/benchmarks/virtual_staining/membrane/fcmae_vscyto3d_scratch/a549_mantis/train.yml new file mode 100644 index 000000000..a01a7175d --- /dev/null +++ b/applications/dynacell/configs/benchmarks/virtual_staining/membrane/fcmae_vscyto3d_scratch/a549_mantis/train.yml @@ -0,0 +1,59 @@ +# FCMAE-class (FullyConvolutionalMAE, pretraining=False) random-init +# baseline on membrane (Membrane marker). Scratch control for the +# pretrained counterpart — the two leaves are identical except this one +# does NOT load pretrained encoder weights. See UNEXT2_VS_FCMAE_CLASSES.md +# for why this is the paper-adjacent scratch baseline (and not unext2.yml). +base: + - ../../../_internal/shared/model/train_sets/a549_mantis.yml + - ../../../_internal/shared/model/targets/membrane.yml + - ../../../_internal/shared/model/data_overlays/fcmae_vscyto3d_fit.yml + - ../../../_internal/shared/model/model_overlays/fcmae_vscyto3d_fit.yml + - ../../../_internal/shared/model/launcher_profiles/mode_fit.yml + - ../../../_internal/shared/model/launcher_profiles/hardware_4gpu.yml + - ../../../_internal/shared/model/launcher_profiles/runtime_shared.yml + +benchmark: + task: virtual_staining + organelle: membrane + train_set: a549_mantis + model_name: fcmae_vscyto3d_scratch + experiment_id: membrane__a549_mantis__fcmae_vscyto3d_scratch + +# Override the FCMAE data overlay's hardcoded `Structure` augmentation +# keys (the overlay was authored for ER/Mito where target_channel == +# "Structure"). RandWeightedCropd needs the actual membrane channel +# name in keys/w_key. spatial_size + num_samples kept identical to the +# FCMAE overlay so the augmentation policy matches ER/Mito. +data: + init_args: + # A549 pooled store + target_channel — no resolver in this train_set. 
+ target_channel: Membrane + data_path: /hpc/projects/virtual_staining/training/dynacell/a549/mantis_v1/train/CAAX_all.zarr + augmentations: + - class_path: viscy_transforms.RandWeightedCropd + init_args: + keys: [Phase3D, Membrane] + w_key: Membrane + spatial_size: [20, 600, 600] + num_samples: 4 + +trainer: + logger: + init_args: + name: FCMAE_VSCyto3D_Scratch_A549_Membrane + save_dir: /hpc/projects/comp.micro/virtual_staining/models/dynacell/a549_mantis/memb/fcmae_vscyto3d_scratch + callbacks: + - class_path: lightning.pytorch.callbacks.LearningRateMonitor + init_args: + logging_interval: step + - class_path: lightning.pytorch.callbacks.ModelCheckpoint + init_args: + monitor: loss/validate + every_n_epochs: 1 + save_top_k: 5 + save_last: true + dirpath: /hpc/projects/comp.micro/virtual_staining/models/dynacell/a549_mantis/memb/fcmae_vscyto3d_scratch/checkpoints + +launcher: + job_name: FCMAE_VSCyto3D_Scratch_A549_Membrane + run_root: /hpc/projects/comp.micro/virtual_staining/models/dynacell/a549_mantis/memb/fcmae_vscyto3d_scratch diff --git a/applications/dynacell/configs/benchmarks/virtual_staining/membrane/fcmae_vscyto3d_scratch/ipsc_confocal/eval__a549_mantis_denv.yaml b/applications/dynacell/configs/benchmarks/virtual_staining/membrane/fcmae_vscyto3d_scratch/ipsc_confocal/eval__a549_mantis_denv.yaml new file mode 100644 index 000000000..5631ef696 --- /dev/null +++ b/applications/dynacell/configs/benchmarks/virtual_staining/membrane/fcmae_vscyto3d_scratch/ipsc_confocal/eval__a549_mantis_denv.yaml @@ -0,0 +1,22 @@ +# @package _global_ +# Benchmark eval leaf: Membrane (CAAX) predicted by FCMAE_VSCyto3D_Scratch on a549-mantis-caax-denv. +# A549 manifest keys membrane by gene (`caax`); override the iPSC-side `membrane` +# target_id from the target group so the resolver finds caax on a549-mantis-caax-denv. 
+defaults: + - override /target: membrane + - override /predict_set: a549_mantis_caax_denv + +target_name: caax +benchmark: + dataset_ref: + target: caax + +io: + pred_path: /hpc/projects/virtual_staining/training/dynacell/a549/predictions/memb_fcmae_vscyto3d_scratch_denv.zarr + +# A549 manifests don't carry cell_segmentation paths (no segmentation +# pipeline yet). Skip feature metrics until segmentation lands. +compute_feature_metrics: false + +save: + save_dir: /hpc/projects/virtual_staining/training/dynacell/a549/predictions/eval_memb_fcmae_vscyto3d_scratch_denv diff --git a/applications/dynacell/configs/benchmarks/virtual_staining/membrane/fcmae_vscyto3d_scratch/ipsc_confocal/eval__a549_mantis_mock.yaml b/applications/dynacell/configs/benchmarks/virtual_staining/membrane/fcmae_vscyto3d_scratch/ipsc_confocal/eval__a549_mantis_mock.yaml new file mode 100644 index 000000000..089c61195 --- /dev/null +++ b/applications/dynacell/configs/benchmarks/virtual_staining/membrane/fcmae_vscyto3d_scratch/ipsc_confocal/eval__a549_mantis_mock.yaml @@ -0,0 +1,22 @@ +# @package _global_ +# Benchmark eval leaf: Membrane (CAAX) predicted by FCMAE_VSCyto3D_Scratch on a549-mantis-caax-mock. +# A549 manifest keys membrane by gene (`caax`); override the iPSC-side `membrane` +# target_id from the target group so the resolver finds caax on a549-mantis-caax-mock. +defaults: + - override /target: membrane + - override /predict_set: a549_mantis_caax_mock + +target_name: caax +benchmark: + dataset_ref: + target: caax + +io: + pred_path: /hpc/projects/virtual_staining/training/dynacell/a549/predictions/memb_fcmae_vscyto3d_scratch_mock.zarr + +# A549 manifests don't carry cell_segmentation paths (no segmentation +# pipeline yet). Skip feature metrics until segmentation lands. 
+compute_feature_metrics: false + +save: + save_dir: /hpc/projects/virtual_staining/training/dynacell/a549/predictions/eval_memb_fcmae_vscyto3d_scratch_mock diff --git a/applications/dynacell/configs/benchmarks/virtual_staining/membrane/fcmae_vscyto3d_scratch/ipsc_confocal/eval__a549_mantis_zikv.yaml b/applications/dynacell/configs/benchmarks/virtual_staining/membrane/fcmae_vscyto3d_scratch/ipsc_confocal/eval__a549_mantis_zikv.yaml new file mode 100644 index 000000000..d469ce9e6 --- /dev/null +++ b/applications/dynacell/configs/benchmarks/virtual_staining/membrane/fcmae_vscyto3d_scratch/ipsc_confocal/eval__a549_mantis_zikv.yaml @@ -0,0 +1,22 @@ +# @package _global_ +# Benchmark eval leaf: Membrane (CAAX) predicted by FCMAE_VSCyto3D_Scratch on a549-mantis-caax-zikv. +# A549 manifest keys membrane by gene (`caax`); override the iPSC-side `membrane` +# target_id from the target group so the resolver finds caax on a549-mantis-caax-zikv. +defaults: + - override /target: membrane + - override /predict_set: a549_mantis_caax_zikv + +target_name: caax +benchmark: + dataset_ref: + target: caax + +io: + pred_path: /hpc/projects/virtual_staining/training/dynacell/a549/predictions/memb_fcmae_vscyto3d_scratch_zikv.zarr + +# A549 manifests don't carry cell_segmentation paths (no segmentation +# pipeline yet). Skip feature metrics until segmentation lands. 
+compute_feature_metrics: false + +save: + save_dir: /hpc/projects/virtual_staining/training/dynacell/a549/predictions/eval_memb_fcmae_vscyto3d_scratch_zikv diff --git a/applications/dynacell/configs/benchmarks/virtual_staining/membrane/fcmae_vscyto3d_scratch/ipsc_confocal/predict__a549_mantis_denv.yml b/applications/dynacell/configs/benchmarks/virtual_staining/membrane/fcmae_vscyto3d_scratch/ipsc_confocal/predict__a549_mantis_denv.yml new file mode 100644 index 000000000..4435780ac --- /dev/null +++ b/applications/dynacell/configs/benchmarks/virtual_staining/membrane/fcmae_vscyto3d_scratch/ipsc_confocal/predict__a549_mantis_denv.yml @@ -0,0 +1,58 @@ +# FCMAE_VSCyto3D_Scratch predict: membrane trained on iPSC, +# predicting against a549-mantis-caax-denv test. +# A549 manifest keys membrane by gene (`caax`); override the iPSC-side +# `membrane` target_id from targets/membrane.yml so the resolver finds +# the caax target on a549-mantis-caax-denv. +# +# iPSC FCMAE scratch membrane training (J31710718) has completed; +# ckpt_path in the model block points at its best-val checkpoint +# (provenance documented there), superseding the fit leaf's last.ckpt. +base: + - ../../../_internal/shared/model/predict_sets/a549_mantis_caax_denv.yml + - ../../../_internal/shared/model/targets/membrane.yml + - ../../../_internal/shared/model/model_overlays/fcmae_vscyto3d_predict.yml + - ../../../_internal/shared/model/launcher_profiles/mode_predict.yml + - ../../../_internal/shared/model/launcher_profiles/hardware_h200_single.yml + - ../../../_internal/shared/model/launcher_profiles/runtime_shared.yml + +benchmark: + task: virtual_staining + organelle: membrane + trained_on: ipsc_confocal + predict_set: a549_mantis_caax_denv + model_name: fcmae_vscyto3d_scratch + experiment_id: membrane__ipsc_confocal__fcmae_vscyto3d_scratch__a549_mantis_caax_denv + # Override the iPSC-side `membrane` target to a549's gene-keyed `caax`.
+ dataset_ref: + target: caax + +model: + init_args: + # Best checkpoint from J31710718 (FCMAE_VSCyto3D_Scratch_iPSC_Membrane): + # ep 136 / val_loss 0.39590 (27-epoch plateau, scancelled at 1d 11h elapsed). + # Note: pretrained variant (J31795524, ep 194 = 0.37878) outperforms scratch + # by 4.3%; prefer the pretrained predict configs for downstream eval. + # Hardlink alias at run_root; underlying epoch=136-step=42744.ckpt also + # preserved in checkpoints_frozen_ep136_20260501_005505/. + ckpt_path: /hpc/projects/comp.micro/virtual_staining/models/dynacell/ipsc/memb/fcmae_vscyto3d_scratch/best_ep136_val0.39590.ckpt + +data: + init_args: + normalizations: + - class_path: viscy_transforms.NormalizeSampled + init_args: + keys: [Phase3D] + level: fov_statistics + subtrahend: mean + divisor: std + augmentations: [] + +trainer: + callbacks: + - class_path: viscy_utils.callbacks.prediction_writer.HCSPredictionWriter + init_args: + output_store: /hpc/projects/virtual_staining/training/dynacell/a549/predictions/memb_fcmae_vscyto3d_scratch_denv.zarr + +launcher: + job_name: FCMAE_VSCyto3D_Scratch_PRED_MEMB_ON_A549_DENV + run_root: /hpc/projects/virtual_staining/training/dynacell/a549/predictions diff --git a/applications/dynacell/configs/benchmarks/virtual_staining/membrane/fcmae_vscyto3d_scratch/ipsc_confocal/predict__a549_mantis_mock.yml b/applications/dynacell/configs/benchmarks/virtual_staining/membrane/fcmae_vscyto3d_scratch/ipsc_confocal/predict__a549_mantis_mock.yml new file mode 100644 index 000000000..b7dc40411 --- /dev/null +++ b/applications/dynacell/configs/benchmarks/virtual_staining/membrane/fcmae_vscyto3d_scratch/ipsc_confocal/predict__a549_mantis_mock.yml @@ -0,0 +1,58 @@ +# FCMAE_VSCyto3D_Scratch predict: membrane trained on iPSC, +# predicting against a549-mantis-caax-mock test. 
+# A549 manifest keys membrane by gene (`caax`); override the iPSC-side +# `membrane` target_id from targets/membrane.yml so the resolver finds +# the caax target on a549-mantis-caax-mock. +# +# iPSC FCMAE scratch membrane training (J31710718) has completed; +# ckpt_path in the model block points at its best-val checkpoint +# (provenance documented there), superseding the fit leaf's last.ckpt. +base: + - ../../../_internal/shared/model/predict_sets/a549_mantis_caax_mock.yml + - ../../../_internal/shared/model/targets/membrane.yml + - ../../../_internal/shared/model/model_overlays/fcmae_vscyto3d_predict.yml + - ../../../_internal/shared/model/launcher_profiles/mode_predict.yml + - ../../../_internal/shared/model/launcher_profiles/hardware_h200_single.yml + - ../../../_internal/shared/model/launcher_profiles/runtime_shared.yml + +benchmark: + task: virtual_staining + organelle: membrane + trained_on: ipsc_confocal + predict_set: a549_mantis_caax_mock + model_name: fcmae_vscyto3d_scratch + experiment_id: membrane__ipsc_confocal__fcmae_vscyto3d_scratch__a549_mantis_caax_mock + # Override the iPSC-side `membrane` target to a549's gene-keyed `caax`. + dataset_ref: + target: caax + +model: + init_args: + # Best checkpoint from J31710718 (FCMAE_VSCyto3D_Scratch_iPSC_Membrane): + # ep 136 / val_loss 0.39590 (27-epoch plateau, scancelled at 1d 11h elapsed). + # Note: pretrained variant (J31795524, ep 194 = 0.37878) outperforms scratch + # by 4.3%; prefer the pretrained predict configs for downstream eval. + # Hardlink alias at run_root; underlying epoch=136-step=42744.ckpt also + # preserved in checkpoints_frozen_ep136_20260501_005505/.
+ ckpt_path: /hpc/projects/comp.micro/virtual_staining/models/dynacell/ipsc/memb/fcmae_vscyto3d_scratch/best_ep136_val0.39590.ckpt + +data: + init_args: + normalizations: + - class_path: viscy_transforms.NormalizeSampled + init_args: + keys: [Phase3D] + level: fov_statistics + subtrahend: mean + divisor: std + augmentations: [] + +trainer: + callbacks: + - class_path: viscy_utils.callbacks.prediction_writer.HCSPredictionWriter + init_args: + output_store: /hpc/projects/virtual_staining/training/dynacell/a549/predictions/memb_fcmae_vscyto3d_scratch_mock.zarr + +launcher: + job_name: FCMAE_VSCyto3D_Scratch_PRED_MEMB_ON_A549_MOCK + run_root: /hpc/projects/virtual_staining/training/dynacell/a549/predictions diff --git a/applications/dynacell/configs/benchmarks/virtual_staining/membrane/fcmae_vscyto3d_scratch/ipsc_confocal/predict__a549_mantis_zikv.yml b/applications/dynacell/configs/benchmarks/virtual_staining/membrane/fcmae_vscyto3d_scratch/ipsc_confocal/predict__a549_mantis_zikv.yml new file mode 100644 index 000000000..2ed6cb656 --- /dev/null +++ b/applications/dynacell/configs/benchmarks/virtual_staining/membrane/fcmae_vscyto3d_scratch/ipsc_confocal/predict__a549_mantis_zikv.yml @@ -0,0 +1,58 @@ +# FCMAE_VSCyto3D_Scratch predict: membrane trained on iPSC, +# predicting against a549-mantis-caax-zikv test. +# A549 manifest keys membrane by gene (`caax`); override the iPSC-side +# `membrane` target_id from targets/membrane.yml so the resolver finds +# the caax target on a549-mantis-caax-zikv. +# +# iPSC FCMAE scratch membrane training (J31710718) has completed; +# ckpt_path in the model block points at its best-val checkpoint +# (provenance documented there), superseding the fit leaf's last.ckpt. +base: + - ../../../_internal/shared/model/predict_sets/a549_mantis_caax_zikv.yml + - ../../../_internal/shared/model/targets/membrane.yml + - ../../../_internal/shared/model/model_overlays/fcmae_vscyto3d_predict.yml + - ../../../_internal/shared/model/launcher_profiles/mode_predict.yml + - ../../../_internal/shared/model/launcher_profiles/hardware_h200_single.yml + - ../../../_internal/shared/model/launcher_profiles/runtime_shared.yml + +benchmark: + task: virtual_staining + organelle: membrane + trained_on: ipsc_confocal + predict_set: a549_mantis_caax_zikv + model_name: fcmae_vscyto3d_scratch + experiment_id: membrane__ipsc_confocal__fcmae_vscyto3d_scratch__a549_mantis_caax_zikv + # Override the iPSC-side `membrane` target to a549's gene-keyed `caax`. + dataset_ref: + target: caax + +model: + init_args: + # Best checkpoint from J31710718 (FCMAE_VSCyto3D_Scratch_iPSC_Membrane): + # ep 136 / val_loss 0.39590 (27-epoch plateau, scancelled at 1d 11h elapsed). + # Note: pretrained variant (J31795524, ep 194 = 0.37878) outperforms scratch + # by 4.3%; prefer the pretrained predict configs for downstream eval. + # Hardlink alias at run_root; underlying epoch=136-step=42744.ckpt also + # preserved in checkpoints_frozen_ep136_20260501_005505/.
+ ckpt_path: /hpc/projects/comp.micro/virtual_staining/models/dynacell/ipsc/memb/fcmae_vscyto3d_scratch/best_ep136_val0.39590.ckpt + +data: + init_args: + normalizations: + - class_path: viscy_transforms.NormalizeSampled + init_args: + keys: [Phase3D] + level: fov_statistics + subtrahend: mean + divisor: std + augmentations: [] + +trainer: + callbacks: + - class_path: viscy_utils.callbacks.prediction_writer.HCSPredictionWriter + init_args: + output_store: /hpc/projects/virtual_staining/training/dynacell/a549/predictions/memb_fcmae_vscyto3d_scratch_zikv.zarr + +launcher: + job_name: FCMAE_VSCyto3D_Scratch_PRED_MEMB_ON_A549_ZIKV + run_root: /hpc/projects/virtual_staining/training/dynacell/a549/predictions diff --git a/applications/dynacell/configs/benchmarks/virtual_staining/membrane/fcmae_vscyto3d_scratch/ipsc_confocal/predict__ipsc_confocal.yml b/applications/dynacell/configs/benchmarks/virtual_staining/membrane/fcmae_vscyto3d_scratch/ipsc_confocal/predict__ipsc_confocal.yml new file mode 100644 index 000000000..4348508b9 --- /dev/null +++ b/applications/dynacell/configs/benchmarks/virtual_staining/membrane/fcmae_vscyto3d_scratch/ipsc_confocal/predict__ipsc_confocal.yml @@ -0,0 +1,51 @@ +# FCMAE_VSCyto3D_Scratch predict: membrane (CAAX) against ipsc_confocal test_cropped. +# +# iPSC FCMAE scratch membrane training (J31710718, resumed from +# J31475106) has completed; ckpt_path below is its best-val checkpoint +# (provenance in the model block comment). +base: + - ../../../_internal/shared/model/predict_sets/ipsc_confocal.yml + - ../../../_internal/shared/model/targets/membrane.yml + - ../../../_internal/shared/model/model_overlays/fcmae_vscyto3d_predict.yml + - ../../../_internal/shared/model/launcher_profiles/mode_predict.yml + - ../../../_internal/shared/model/launcher_profiles/hardware_h200_single.yml + - ../../../_internal/shared/model/launcher_profiles/runtime_shared.yml + +benchmark: + task: virtual_staining + organelle: membrane + trained_on: ipsc_confocal + predict_set: ipsc_confocal + model_name: fcmae_vscyto3d_scratch + experiment_id: membrane__ipsc_confocal__fcmae_vscyto3d_scratch__ipsc_confocal + +model: + init_args: + # Best checkpoint from J31710718 (FCMAE_VSCyto3D_Scratch_iPSC_Membrane): + # ep 136 / val_loss 0.39590 (27-epoch plateau, scancelled at 1d 11h elapsed). + # Note: pretrained variant (J31795524, ep 194 = 0.37878) outperforms scratch + # by 4.3%; prefer the pretrained predict configs for downstream eval. + # Hardlink alias at run_root; underlying epoch=136-step=42744.ckpt also + # preserved in checkpoints_frozen_ep136_20260501_005505/.
+ ckpt_path: /hpc/projects/comp.micro/virtual_staining/models/dynacell/ipsc/memb/fcmae_vscyto3d_scratch/best_ep136_val0.39590.ckpt + +data: + init_args: + normalizations: + - class_path: viscy_transforms.NormalizeSampled + init_args: + keys: [Phase3D] + level: fov_statistics + subtrahend: mean + divisor: std + augmentations: [] + +trainer: + callbacks: + - class_path: viscy_utils.callbacks.prediction_writer.HCSPredictionWriter + init_args: + output_store: /hpc/projects/virtual_staining/training/dynacell/ipsc/predictions/memb_fcmae_vscyto3d_scratch.zarr + +launcher: + job_name: FCMAE_VSCyto3D_Scratch_PRED_MEMB + run_root: /hpc/projects/virtual_staining/training/dynacell/ipsc/predictions diff --git a/applications/dynacell/configs/benchmarks/virtual_staining/membrane/fcmae_vscyto3d_scratch/ipsc_confocal/train.yml b/applications/dynacell/configs/benchmarks/virtual_staining/membrane/fcmae_vscyto3d_scratch/ipsc_confocal/train.yml new file mode 100644 index 000000000..5b5c2866e --- /dev/null +++ b/applications/dynacell/configs/benchmarks/virtual_staining/membrane/fcmae_vscyto3d_scratch/ipsc_confocal/train.yml @@ -0,0 +1,56 @@ +# FCMAE-class (FullyConvolutionalMAE, pretraining=False) random-init +# baseline on membrane (Membrane marker). Scratch control for the +# pretrained counterpart — the two leaves are identical except this one +# does NOT load pretrained encoder weights. See UNEXT2_VS_FCMAE_CLASSES.md +# for why this is the paper-adjacent scratch baseline (and not unext2.yml). 
+base: + - ../../../_internal/shared/model/train_sets/ipsc_confocal.yml + - ../../../_internal/shared/model/targets/membrane.yml + - ../../../_internal/shared/model/data_overlays/fcmae_vscyto3d_fit.yml + - ../../../_internal/shared/model/model_overlays/fcmae_vscyto3d_fit.yml + - ../../../_internal/shared/model/launcher_profiles/mode_fit.yml + - ../../../_internal/shared/model/launcher_profiles/hardware_4gpu.yml + - ../../../_internal/shared/model/launcher_profiles/runtime_shared.yml + +benchmark: + task: virtual_staining + organelle: membrane + train_set: ipsc_confocal + model_name: fcmae_vscyto3d_scratch + experiment_id: membrane__ipsc_confocal__fcmae_vscyto3d_scratch + +# Override the FCMAE data overlay's hardcoded `Structure` augmentation +# keys (the overlay was authored for ER/Mito where target_channel == +# "Structure"). RandWeightedCropd needs the actual membrane channel +# name in keys/w_key. spatial_size + num_samples kept identical to the +# FCMAE overlay so the augmentation policy matches ER/Mito. 
+data: + init_args: + augmentations: + - class_path: viscy_transforms.RandWeightedCropd + init_args: + keys: [Phase3D, Membrane] + w_key: Membrane + spatial_size: [20, 600, 600] + num_samples: 4 + +trainer: + logger: + init_args: + name: FCMAE_VSCyto3D_Scratch_iPSC_Membrane + save_dir: /hpc/projects/comp.micro/virtual_staining/models/dynacell/ipsc/memb/fcmae_vscyto3d_scratch + callbacks: + - class_path: lightning.pytorch.callbacks.LearningRateMonitor + init_args: + logging_interval: step + - class_path: lightning.pytorch.callbacks.ModelCheckpoint + init_args: + monitor: loss/validate + every_n_epochs: 1 + save_top_k: 5 + save_last: true + dirpath: /hpc/projects/comp.micro/virtual_staining/models/dynacell/ipsc/memb/fcmae_vscyto3d_scratch/checkpoints + +launcher: + job_name: FCMAE_VSCyto3D_Scratch_Membrane + run_root: /hpc/projects/comp.micro/virtual_staining/models/dynacell/ipsc/memb/fcmae_vscyto3d_scratch diff --git a/applications/dynacell/configs/benchmarks/virtual_staining/membrane/fcmae_vscyto3d_scratch/joint_ipsc_confocal_a549_mantis/predict__a549_mantis_denv.yml b/applications/dynacell/configs/benchmarks/virtual_staining/membrane/fcmae_vscyto3d_scratch/joint_ipsc_confocal_a549_mantis/predict__a549_mantis_denv.yml new file mode 100644 index 000000000..4fadc338b --- /dev/null +++ b/applications/dynacell/configs/benchmarks/virtual_staining/membrane/fcmae_vscyto3d_scratch/joint_ipsc_confocal_a549_mantis/predict__a549_mantis_denv.yml @@ -0,0 +1,49 @@ +# FCMAE_VSCyto3D_Scratch (UNeXt2) predict: cell membrane trained on joint +# iPSC+A549, predicting against a549-mantis-caax-denv test. +# Best val-loss checkpoint from job 31822536 (epoch 112, loss/validate=0.3859). +# A549 manifest keys membrane by gene (`caax`); override the iPSC-side `membrane` +# target_id from targets/membrane.yml so the resolver finds the caax target on +# a549-mantis-caax-denv. 
+base: + - ../../../_internal/shared/model/predict_sets/a549_mantis_caax_denv.yml + - ../../../_internal/shared/model/targets/membrane.yml + - ../../../_internal/shared/model/model_overlays/fcmae_vscyto3d_predict.yml + - ../../../_internal/shared/model/launcher_profiles/mode_predict.yml + - ../../../_internal/shared/model/launcher_profiles/hardware_h200_single.yml + - ../../../_internal/shared/model/launcher_profiles/runtime_shared.yml + +benchmark: + task: virtual_staining + organelle: membrane + trained_on: joint_ipsc_confocal_a549_mantis + predict_set: a549_mantis_caax_denv + model_name: fcmae_vscyto3d_scratch + experiment_id: membrane__joint_ipsc_confocal_a549_mantis__fcmae_vscyto3d_scratch__a549_mantis_caax_denv + # Override the iPSC-side `membrane` target to a549's gene-keyed `caax`. + dataset_ref: + target: caax + +model: + init_args: + ckpt_path: /hpc/projects/comp.micro/virtual_staining/models/dynacell/joint_ipsc_confocal_a549_mantis/memb/fcmae_vscyto3d_scratch/checkpoints/epoch=112-step=59890.ckpt + +data: + init_args: + normalizations: + - class_path: viscy_transforms.NormalizeSampled + init_args: + keys: [Phase3D] + level: fov_statistics + subtrahend: mean + divisor: std + augmentations: [] + +trainer: + callbacks: + - class_path: viscy_utils.callbacks.prediction_writer.HCSPredictionWriter + init_args: + output_store: /hpc/projects/virtual_staining/training/dynacell/a549/predictions/memb_fcmae_vscyto3d_scratch_jointtrained_denv.zarr + +launcher: + job_name: FCMAE_VSCyto3D_Scratch_PRED_MEMB_JOINTTR_DENV + run_root: /hpc/projects/virtual_staining/training/dynacell/a549/predictions diff --git a/applications/dynacell/configs/benchmarks/virtual_staining/membrane/fcmae_vscyto3d_scratch/joint_ipsc_confocal_a549_mantis/predict__a549_mantis_mock.yml b/applications/dynacell/configs/benchmarks/virtual_staining/membrane/fcmae_vscyto3d_scratch/joint_ipsc_confocal_a549_mantis/predict__a549_mantis_mock.yml new file mode 100644 index 000000000..01ac28ba4 --- /dev/null 
+++ b/applications/dynacell/configs/benchmarks/virtual_staining/membrane/fcmae_vscyto3d_scratch/joint_ipsc_confocal_a549_mantis/predict__a549_mantis_mock.yml @@ -0,0 +1,49 @@ +# FCMAE_VSCyto3D_Scratch (UNeXt2) predict: cell membrane trained on joint +# iPSC+A549, predicting against a549-mantis-caax-mock test. +# Best val-loss checkpoint from job 31822536 (epoch 112, loss/validate=0.3859). +# A549 manifest keys membrane by gene (`caax`); override the iPSC-side `membrane` +# target_id from targets/membrane.yml so the resolver finds the caax target on +# a549-mantis-caax-mock. +base: + - ../../../_internal/shared/model/predict_sets/a549_mantis_caax_mock.yml + - ../../../_internal/shared/model/targets/membrane.yml + - ../../../_internal/shared/model/model_overlays/fcmae_vscyto3d_predict.yml + - ../../../_internal/shared/model/launcher_profiles/mode_predict.yml + - ../../../_internal/shared/model/launcher_profiles/hardware_h200_single.yml + - ../../../_internal/shared/model/launcher_profiles/runtime_shared.yml + +benchmark: + task: virtual_staining + organelle: membrane + trained_on: joint_ipsc_confocal_a549_mantis + predict_set: a549_mantis_caax_mock + model_name: fcmae_vscyto3d_scratch + experiment_id: membrane__joint_ipsc_confocal_a549_mantis__fcmae_vscyto3d_scratch__a549_mantis_caax_mock + # Override the iPSC-side `membrane` target to a549's gene-keyed `caax`. 
+ dataset_ref: + target: caax + +model: + init_args: + ckpt_path: /hpc/projects/comp.micro/virtual_staining/models/dynacell/joint_ipsc_confocal_a549_mantis/memb/fcmae_vscyto3d_scratch/checkpoints/epoch=112-step=59890.ckpt + +data: + init_args: + normalizations: + - class_path: viscy_transforms.NormalizeSampled + init_args: + keys: [Phase3D] + level: fov_statistics + subtrahend: mean + divisor: std + augmentations: [] + +trainer: + callbacks: + - class_path: viscy_utils.callbacks.prediction_writer.HCSPredictionWriter + init_args: + output_store: /hpc/projects/virtual_staining/training/dynacell/a549/predictions/memb_fcmae_vscyto3d_scratch_jointtrained_mock.zarr + +launcher: + job_name: FCMAE_VSCyto3D_Scratch_PRED_MEMB_JOINTTR_MOCK + run_root: /hpc/projects/virtual_staining/training/dynacell/a549/predictions diff --git a/applications/dynacell/configs/benchmarks/virtual_staining/membrane/fcmae_vscyto3d_scratch/joint_ipsc_confocal_a549_mantis/predict__a549_mantis_zikv.yml b/applications/dynacell/configs/benchmarks/virtual_staining/membrane/fcmae_vscyto3d_scratch/joint_ipsc_confocal_a549_mantis/predict__a549_mantis_zikv.yml new file mode 100644 index 000000000..fb3944f47 --- /dev/null +++ b/applications/dynacell/configs/benchmarks/virtual_staining/membrane/fcmae_vscyto3d_scratch/joint_ipsc_confocal_a549_mantis/predict__a549_mantis_zikv.yml @@ -0,0 +1,49 @@ +# FCMAE_VSCyto3D_Scratch (UNeXt2) predict: cell membrane trained on joint +# iPSC+A549, predicting against a549-mantis-caax-zikv test. +# Best val-loss checkpoint from job 31822536 (epoch 112, loss/validate=0.3859). +# A549 manifest keys membrane by gene (`caax`); override the iPSC-side `membrane` +# target_id from targets/membrane.yml so the resolver finds the caax target on +# a549-mantis-caax-zikv. 
+base: + - ../../../_internal/shared/model/predict_sets/a549_mantis_caax_zikv.yml + - ../../../_internal/shared/model/targets/membrane.yml + - ../../../_internal/shared/model/model_overlays/fcmae_vscyto3d_predict.yml + - ../../../_internal/shared/model/launcher_profiles/mode_predict.yml + - ../../../_internal/shared/model/launcher_profiles/hardware_h200_single.yml + - ../../../_internal/shared/model/launcher_profiles/runtime_shared.yml + +benchmark: + task: virtual_staining + organelle: membrane + trained_on: joint_ipsc_confocal_a549_mantis + predict_set: a549_mantis_caax_zikv + model_name: fcmae_vscyto3d_scratch + experiment_id: membrane__joint_ipsc_confocal_a549_mantis__fcmae_vscyto3d_scratch__a549_mantis_caax_zikv + # Override the iPSC-side `membrane` target to a549's gene-keyed `caax`. + dataset_ref: + target: caax + +model: + init_args: + ckpt_path: /hpc/projects/comp.micro/virtual_staining/models/dynacell/joint_ipsc_confocal_a549_mantis/memb/fcmae_vscyto3d_scratch/checkpoints/epoch=112-step=59890.ckpt + +data: + init_args: + normalizations: + - class_path: viscy_transforms.NormalizeSampled + init_args: + keys: [Phase3D] + level: fov_statistics + subtrahend: mean + divisor: std + augmentations: [] + +trainer: + callbacks: + - class_path: viscy_utils.callbacks.prediction_writer.HCSPredictionWriter + init_args: + output_store: /hpc/projects/virtual_staining/training/dynacell/a549/predictions/memb_fcmae_vscyto3d_scratch_jointtrained_zikv.zarr + +launcher: + job_name: FCMAE_VSCyto3D_Scratch_PRED_MEMB_JOINTTR_ZIKV + run_root: /hpc/projects/virtual_staining/training/dynacell/a549/predictions diff --git a/applications/dynacell/configs/benchmarks/virtual_staining/membrane/fcmae_vscyto3d_scratch/joint_ipsc_confocal_a549_mantis/predict__ipsc_confocal.yml b/applications/dynacell/configs/benchmarks/virtual_staining/membrane/fcmae_vscyto3d_scratch/joint_ipsc_confocal_a549_mantis/predict__ipsc_confocal.yml new file mode 100644 index 000000000..897c0677e --- /dev/null +++ 
b/applications/dynacell/configs/benchmarks/virtual_staining/membrane/fcmae_vscyto3d_scratch/joint_ipsc_confocal_a549_mantis/predict__ipsc_confocal.yml @@ -0,0 +1,45 @@ +# FCMAE_VSCyto3D_Scratch (UNeXt2) predict: cell membrane trained on joint +# iPSC+A549, predicting against ipsc_confocal test_cropped. +# Best val-loss checkpoint from job 31822536 (epoch 112, loss/validate=0.3859). +# Wandb run 20260501-011350_FCMAE_VSCyto3D_Scratch_JOINT_MEMB (state=finished, +# 118 ep / 62,799 steps). +base: + - ../../../_internal/shared/model/predict_sets/ipsc_confocal.yml + - ../../../_internal/shared/model/targets/membrane.yml + - ../../../_internal/shared/model/model_overlays/fcmae_vscyto3d_predict.yml + - ../../../_internal/shared/model/launcher_profiles/mode_predict.yml + - ../../../_internal/shared/model/launcher_profiles/hardware_h200_single.yml + - ../../../_internal/shared/model/launcher_profiles/runtime_shared.yml + +benchmark: + task: virtual_staining + organelle: membrane + trained_on: joint_ipsc_confocal_a549_mantis + predict_set: ipsc_confocal + model_name: fcmae_vscyto3d_scratch + experiment_id: membrane__joint_ipsc_confocal_a549_mantis__fcmae_vscyto3d_scratch__ipsc_confocal + +model: + init_args: + ckpt_path: /hpc/projects/comp.micro/virtual_staining/models/dynacell/joint_ipsc_confocal_a549_mantis/memb/fcmae_vscyto3d_scratch/checkpoints/epoch=112-step=59890.ckpt + +data: + init_args: + normalizations: + - class_path: viscy_transforms.NormalizeSampled + init_args: + keys: [Phase3D] + level: fov_statistics + subtrahend: mean + divisor: std + augmentations: [] + +trainer: + callbacks: + - class_path: viscy_utils.callbacks.prediction_writer.HCSPredictionWriter + init_args: + output_store: /hpc/projects/virtual_staining/training/dynacell/ipsc/predictions/memb_fcmae_vscyto3d_scratch_jointtrained.zarr + +launcher: + job_name: FCMAE_VSCyto3D_Scratch_PRED_MEMB_JOINTTR_IPSC + run_root: /hpc/projects/virtual_staining/training/dynacell/ipsc/predictions diff --git 
a/applications/dynacell/configs/benchmarks/virtual_staining/membrane/fcmae_vscyto3d_scratch/joint_ipsc_confocal_a549_mantis/train.yml b/applications/dynacell/configs/benchmarks/virtual_staining/membrane/fcmae_vscyto3d_scratch/joint_ipsc_confocal_a549_mantis/train.yml new file mode 100644 index 000000000..15d247bf7 --- /dev/null +++ b/applications/dynacell/configs/benchmarks/virtual_staining/membrane/fcmae_vscyto3d_scratch/joint_ipsc_confocal_a549_mantis/train.yml @@ -0,0 +1,142 @@ +# FCMAE-class (FullyConvolutionalMAE, pretraining=False) random-init +# baseline on membrane (MEMB) — joint ipsc_confocal + +# a549_mantis pooled. Scratch control for the pretrained counterpart +# — the two leaves are identical except this one does NOT load +# pretrained encoder weights. Mirrors +# membrane/fcmae_vscyto3d_scratch/ipsc_confocal/train.yml on the +# joint train_set. +# +# Joint leaf per Stage 7 of A549_EXPANSION_ROADMAP.md. +# BatchedConcatDataModule + two explicit HCSDataModule children; +# only model_overlays/fcmae_vscyto3d_fit.yml is composed; data +# block inline. +# +# Topology: 4-GPU DDP +# (strategy=ddp_find_unused_parameters_true inherited from +# model_overlays/fcmae_vscyto3d_fit.yml). 
+base: + - ../../../_internal/shared/model/model_overlays/fcmae_vscyto3d_fit.yml + - ../../../_internal/shared/model/launcher_profiles/mode_fit.yml + - ../../../_internal/shared/model/launcher_profiles/hardware_4gpu.yml + - ../../../_internal/shared/model/launcher_profiles/runtime_shared.yml + +benchmark: + task: virtual_staining + organelle: membrane + gene: Membrane + target: membrane + target_id: membrane + train_set: joint_ipsc_confocal_a549_mantis + model_name: fcmae_vscyto3d_scratch + experiment_id: membrane__joint_ipsc_confocal_a549_mantis__fcmae_vscyto3d_scratch + +trainer: + logger: + init_args: + name: FCMAE_VSCyto3D_Scratch_JOINT_MEMB + save_dir: /hpc/projects/comp.micro/virtual_staining/models/dynacell/joint_ipsc_confocal_a549_mantis/memb/fcmae_vscyto3d_scratch + callbacks: + - class_path: lightning.pytorch.callbacks.LearningRateMonitor + init_args: + logging_interval: step + - class_path: lightning.pytorch.callbacks.ModelCheckpoint + init_args: + monitor: loss/validate + every_n_epochs: 1 + save_top_k: 5 + save_last: true + dirpath: /hpc/projects/comp.micro/virtual_staining/models/dynacell/joint_ipsc_confocal_a549_mantis/memb/fcmae_vscyto3d_scratch/checkpoints + +_hcs_init_args: &hcs_init_args + source_channel: Phase3D + target_channel: Membrane + z_window_size: 20 + # See nucleus/fnet3d_paper/joint_*/train.yml for the rationale: joint + # mode does not divide batch_size by num_samples, so 8 * 4 = 32 GPU + # samples per DDP rank matches single-set effective batch. 
+ batch_size: 8 + num_workers: 4 + yx_patch_size: [384, 384] + split_ratio: 0.8 + mmap_preload: true + scratch_dir: /dev/shm + persistent_workers: true + normalizations: + - class_path: viscy_transforms.NormalizeSampled + init_args: + keys: [Phase3D] + level: fov_statistics + subtrahend: mean + divisor: std + - class_path: viscy_transforms.NormalizeSampled + init_args: + keys: [Membrane] + level: fov_statistics + subtrahend: median + divisor: iqr + augmentations: + - class_path: viscy_transforms.RandWeightedCropd + init_args: + keys: [Phase3D, Membrane] + w_key: Membrane + spatial_size: [20, 600, 600] + num_samples: 4 + gpu_augmentations: + - class_path: viscy_transforms.BatchedRandAffined + init_args: + keys: [source, target] + prob: 0.8 + rotate_range: [3.14, 0, 0] + shear_range: [0.0, 0.05, 0.05] + scale_range: [[0.7, 1.3], [0.5, 1.5], [0.5, 1.5]] + - class_path: viscy_transforms.BatchedCenterSpatialCropd + init_args: + keys: [source, target] + roi_size: [15, 384, 384] + - class_path: viscy_transforms.BatchedRandAdjustContrastd + init_args: + keys: [source] + prob: 0.5 + gamma: [0.8, 1.2] + - class_path: viscy_transforms.BatchedRandScaleIntensityd + init_args: + keys: [source] + prob: 0.5 + factors: 0.5 + - class_path: viscy_transforms.BatchedRandGaussianNoised + init_args: + keys: [source] + prob: 0.5 + mean: 0.0 + std: 0.3 + - class_path: viscy_transforms.BatchedRandGaussianSmoothd + init_args: + keys: [source] + prob: 0.5 + sigma_x: [0.25, 0.75] + sigma_y: [0.25, 0.75] + sigma_z: [0.25, 0.75] + val_gpu_augmentations: + - class_path: viscy_transforms.BatchedCenterSpatialCropd + init_args: + keys: [source, target] + roi_size: [15, 384, 384] + +data: + class_path: viscy_data.BatchedConcatDataModule + init_args: + data_modules: + # ipsc_confocal — aics-hipsc multi-marker cell.zarr (Membrane channel) + - class_path: viscy_data.hcs.HCSDataModule + init_args: + <<: *hcs_init_args + data_path: 
/hpc/projects/virtual_staining/training/dynacell/ipsc/dataset_v4/train/cell.zarr + # a549_mantis — pooled CAAX all-conditions train store (Membrane channel) + - class_path: viscy_data.hcs.HCSDataModule + init_args: + <<: *hcs_init_args + data_path: /hpc/projects/virtual_staining/training/dynacell/a549/mantis_v1/train/CAAX_all.zarr + +launcher: + job_name: FCMAE_VSCyto3D_Scratch_JOINT_MEMB + run_root: /hpc/projects/comp.micro/virtual_staining/models/dynacell/joint_ipsc_confocal_a549_mantis/memb/fcmae_vscyto3d_scratch diff --git a/applications/dynacell/configs/benchmarks/virtual_staining/membrane/fnet3d_paper/a549_mantis/predict__a549_mantis_denv.yml b/applications/dynacell/configs/benchmarks/virtual_staining/membrane/fnet3d_paper/a549_mantis/predict__a549_mantis_denv.yml new file mode 100644 index 000000000..31c8c60e9 --- /dev/null +++ b/applications/dynacell/configs/benchmarks/virtual_staining/membrane/fnet3d_paper/a549_mantis/predict__a549_mantis_denv.yml @@ -0,0 +1,49 @@ +# FNet3D paper-baseline predict: membrane trained on a549_mantis (caax), +# predicting against a549-mantis-caax-denv test. +# Best val-loss checkpoint from job 31858488 (epoch 281, loss/validate=0.3143). +# A549 manifest keys membrane by gene (`caax`); override the iPSC-side `membrane` +# target_id from targets/membrane.yml so the resolver finds the caax target on +# a549-mantis-caax-denv. 
+base: + - ../../../_internal/shared/model/predict_sets/a549_mantis_caax_denv.yml + - ../../../_internal/shared/model/targets/membrane.yml + - ../../../_internal/shared/model/model_overlays/fnet3d_paper_predict.yml + - ../../../_internal/shared/model/launcher_profiles/mode_predict.yml + - ../../../_internal/shared/model/launcher_profiles/hardware_h200_single.yml + - ../../../_internal/shared/model/launcher_profiles/runtime_shared.yml + +benchmark: + task: virtual_staining + organelle: membrane + trained_on: a549_mantis + predict_set: a549_mantis_caax_denv + model_name: fnet3d_paper + experiment_id: membrane__a549_mantis__fnet3d_paper__a549_mantis_caax_denv + # Override the iPSC-side `membrane` target to a549's gene-keyed `caax`. + dataset_ref: + target: caax + +model: + init_args: + ckpt_path: /hpc/projects/comp.micro/virtual_staining/models/dynacell/a549_mantis/memb/fnet3d_paper/checkpoints/epoch=281-step=191760.ckpt + +data: + init_args: + normalizations: + - class_path: viscy_transforms.NormalizeSampled + init_args: + keys: [Phase3D] + level: fov_statistics + subtrahend: mean + divisor: std + augmentations: [] + +trainer: + callbacks: + - class_path: viscy_utils.callbacks.prediction_writer.HCSPredictionWriter + init_args: + output_store: /hpc/projects/virtual_staining/training/dynacell/a549/predictions/memb_fnet3d_paper_a549trained_denv.zarr + +launcher: + job_name: FNet3DPaper_PRED_MEMB_A549TR_DENV + run_root: /hpc/projects/virtual_staining/training/dynacell/a549/predictions diff --git a/applications/dynacell/configs/benchmarks/virtual_staining/membrane/fnet3d_paper/a549_mantis/predict__a549_mantis_mock.yml b/applications/dynacell/configs/benchmarks/virtual_staining/membrane/fnet3d_paper/a549_mantis/predict__a549_mantis_mock.yml new file mode 100644 index 000000000..aa25897db --- /dev/null +++ b/applications/dynacell/configs/benchmarks/virtual_staining/membrane/fnet3d_paper/a549_mantis/predict__a549_mantis_mock.yml @@ -0,0 +1,49 @@ +# FNet3D paper-baseline 
predict: membrane trained on a549_mantis (caax), +# predicting against a549-mantis-caax-mock test. +# Best val-loss checkpoint from job 31858488 (epoch 281, loss/validate=0.3143). +# A549 manifest keys membrane by gene (`caax`); override the iPSC-side `membrane` +# target_id from targets/membrane.yml so the resolver finds the caax target on +# a549-mantis-caax-mock. +base: + - ../../../_internal/shared/model/predict_sets/a549_mantis_caax_mock.yml + - ../../../_internal/shared/model/targets/membrane.yml + - ../../../_internal/shared/model/model_overlays/fnet3d_paper_predict.yml + - ../../../_internal/shared/model/launcher_profiles/mode_predict.yml + - ../../../_internal/shared/model/launcher_profiles/hardware_h200_single.yml + - ../../../_internal/shared/model/launcher_profiles/runtime_shared.yml + +benchmark: + task: virtual_staining + organelle: membrane + trained_on: a549_mantis + predict_set: a549_mantis_caax_mock + model_name: fnet3d_paper + experiment_id: membrane__a549_mantis__fnet3d_paper__a549_mantis_caax_mock + # Override the iPSC-side `membrane` target to a549's gene-keyed `caax`. 
+ dataset_ref: + target: caax + +model: + init_args: + ckpt_path: /hpc/projects/comp.micro/virtual_staining/models/dynacell/a549_mantis/memb/fnet3d_paper/checkpoints/epoch=281-step=191760.ckpt + +data: + init_args: + normalizations: + - class_path: viscy_transforms.NormalizeSampled + init_args: + keys: [Phase3D] + level: fov_statistics + subtrahend: mean + divisor: std + augmentations: [] + +trainer: + callbacks: + - class_path: viscy_utils.callbacks.prediction_writer.HCSPredictionWriter + init_args: + output_store: /hpc/projects/virtual_staining/training/dynacell/a549/predictions/memb_fnet3d_paper_a549trained_mock.zarr + +launcher: + job_name: FNet3DPaper_PRED_MEMB_A549TR_MOCK + run_root: /hpc/projects/virtual_staining/training/dynacell/a549/predictions diff --git a/applications/dynacell/configs/benchmarks/virtual_staining/membrane/fnet3d_paper/a549_mantis/predict__a549_mantis_zikv.yml b/applications/dynacell/configs/benchmarks/virtual_staining/membrane/fnet3d_paper/a549_mantis/predict__a549_mantis_zikv.yml new file mode 100644 index 000000000..61070c591 --- /dev/null +++ b/applications/dynacell/configs/benchmarks/virtual_staining/membrane/fnet3d_paper/a549_mantis/predict__a549_mantis_zikv.yml @@ -0,0 +1,49 @@ +# FNet3D paper-baseline predict: membrane trained on a549_mantis (caax), +# predicting against a549-mantis-caax-zikv test. +# Best val-loss checkpoint from job 31858488 (epoch 281, loss/validate=0.3143). +# A549 manifest keys membrane by gene (`caax`); override the iPSC-side `membrane` +# target_id from targets/membrane.yml so the resolver finds the caax target on +# a549-mantis-caax-zikv. 
+base: + - ../../../_internal/shared/model/predict_sets/a549_mantis_caax_zikv.yml + - ../../../_internal/shared/model/targets/membrane.yml + - ../../../_internal/shared/model/model_overlays/fnet3d_paper_predict.yml + - ../../../_internal/shared/model/launcher_profiles/mode_predict.yml + - ../../../_internal/shared/model/launcher_profiles/hardware_h200_single.yml + - ../../../_internal/shared/model/launcher_profiles/runtime_shared.yml + +benchmark: + task: virtual_staining + organelle: membrane + trained_on: a549_mantis + predict_set: a549_mantis_caax_zikv + model_name: fnet3d_paper + experiment_id: membrane__a549_mantis__fnet3d_paper__a549_mantis_caax_zikv + # Override the iPSC-side `membrane` target to a549's gene-keyed `caax`. + dataset_ref: + target: caax + +model: + init_args: + ckpt_path: /hpc/projects/comp.micro/virtual_staining/models/dynacell/a549_mantis/memb/fnet3d_paper/checkpoints/epoch=281-step=191760.ckpt + +data: + init_args: + normalizations: + - class_path: viscy_transforms.NormalizeSampled + init_args: + keys: [Phase3D] + level: fov_statistics + subtrahend: mean + divisor: std + augmentations: [] + +trainer: + callbacks: + - class_path: viscy_utils.callbacks.prediction_writer.HCSPredictionWriter + init_args: + output_store: /hpc/projects/virtual_staining/training/dynacell/a549/predictions/memb_fnet3d_paper_a549trained_zikv.zarr + +launcher: + job_name: FNet3DPaper_PRED_MEMB_A549TR_ZIKV + run_root: /hpc/projects/virtual_staining/training/dynacell/a549/predictions diff --git a/applications/dynacell/configs/benchmarks/virtual_staining/membrane/fnet3d_paper/a549_mantis/predict__ipsc_confocal.yml b/applications/dynacell/configs/benchmarks/virtual_staining/membrane/fnet3d_paper/a549_mantis/predict__ipsc_confocal.yml new file mode 100644 index 000000000..1b4db25bb --- /dev/null +++ b/applications/dynacell/configs/benchmarks/virtual_staining/membrane/fnet3d_paper/a549_mantis/predict__ipsc_confocal.yml @@ -0,0 +1,43 @@ +# FNet3D paper-baseline predict: 
membrane trained on a549_mantis (caax), +# predicting against ipsc_confocal test_cropped. +# Best val-loss checkpoint from job 31858488 (epoch 281, loss/validate=0.3143). +base: + - ../../../_internal/shared/model/predict_sets/ipsc_confocal.yml + - ../../../_internal/shared/model/targets/membrane.yml + - ../../../_internal/shared/model/model_overlays/fnet3d_paper_predict.yml + - ../../../_internal/shared/model/launcher_profiles/mode_predict.yml + - ../../../_internal/shared/model/launcher_profiles/hardware_h200_single.yml + - ../../../_internal/shared/model/launcher_profiles/runtime_shared.yml + +benchmark: + task: virtual_staining + organelle: membrane + trained_on: a549_mantis + predict_set: ipsc_confocal + model_name: fnet3d_paper + experiment_id: membrane__a549_mantis__fnet3d_paper__ipsc_confocal + +model: + init_args: + ckpt_path: /hpc/projects/comp.micro/virtual_staining/models/dynacell/a549_mantis/memb/fnet3d_paper/checkpoints/epoch=281-step=191760.ckpt + +data: + init_args: + normalizations: + - class_path: viscy_transforms.NormalizeSampled + init_args: + keys: [Phase3D] + level: fov_statistics + subtrahend: mean + divisor: std + augmentations: [] + +trainer: + callbacks: + - class_path: viscy_utils.callbacks.prediction_writer.HCSPredictionWriter + init_args: + output_store: /hpc/projects/virtual_staining/training/dynacell/ipsc/predictions/memb_fnet3d_paper_a549trained.zarr + +launcher: + job_name: FNet3DPaper_PRED_MEMB_A549TR_IPSC + run_root: /hpc/projects/virtual_staining/training/dynacell/ipsc/predictions diff --git a/applications/dynacell/configs/benchmarks/virtual_staining/membrane/fnet3d_paper/a549_mantis/train.yml b/applications/dynacell/configs/benchmarks/virtual_staining/membrane/fnet3d_paper/a549_mantis/train.yml new file mode 100644 index 000000000..f1ccc4a5f --- /dev/null +++ b/applications/dynacell/configs/benchmarks/virtual_staining/membrane/fnet3d_paper/a549_mantis/train.yml @@ -0,0 +1,77 @@ +# FNet3D paper-baseline fit on membrane (Membrane 
channel of CAAX_all.zarr) — A549 mantis-lightsheet pooled (mock + DENV + ZIKV). +# The overlay's norm/aug/val_aug are keyed on Structure (the SEC61B/TOMM20 target +# channel). Membrane target_channel is Membrane, so we list-replace those three +# lists here to re-key them. +base: + - ../../../_internal/shared/model/train_sets/a549_mantis.yml + - ../../../_internal/shared/model/targets/membrane.yml + - ../../../_internal/shared/model/data_overlays/fnet3d_paper_fit.yml + - ../../../_internal/shared/model/model_overlays/fnet3d_paper_fit.yml + - ../../../_internal/shared/model/launcher_profiles/mode_fit.yml + - ../../../_internal/shared/model/launcher_profiles/hardware_gpu_any_long.yml + - ../../../_internal/shared/model/launcher_profiles/runtime_shared.yml + +benchmark: + task: virtual_staining + organelle: membrane + train_set: a549_mantis + model_name: fnet3d_paper + experiment_id: membrane__a549_mantis__fnet3d_paper + +data: + init_args: + # A549 pooled store + target_channel — no resolver in this train_set.
+ target_channel: Membrane + data_path: /hpc/projects/virtual_staining/training/dynacell/a549/mantis_v1/train/CAAX_all.zarr + normalizations: + - class_path: viscy_transforms.NormalizeSampled + init_args: + keys: [Phase3D] + level: fov_statistics + subtrahend: mean + divisor: std + - class_path: viscy_transforms.NormalizeSampled + init_args: + keys: [Membrane] + level: fov_statistics + subtrahend: mean + divisor: std + augmentations: + - class_path: viscy_transforms.RandWeightedCropd + init_args: + keys: [Phase3D, Membrane] + w_key: Membrane + spatial_size: [32, 64, 64] + num_samples: 8 + val_augmentations: + - class_path: viscy_transforms.CenterSpatialCropd + init_args: + keys: [Phase3D, Membrane] + roi_size: [32, 64, 64] + +trainer: + logger: + init_args: + name: FNet3D_A549_MEMB_paper + save_dir: /hpc/projects/comp.micro/virtual_staining/models/dynacell/a549_mantis/memb/fnet3d_paper + callbacks: + - class_path: lightning.pytorch.callbacks.LearningRateMonitor + init_args: + logging_interval: step + - class_path: lightning.pytorch.callbacks.ModelCheckpoint + init_args: + monitor: loss/validate + every_n_epochs: 1 + save_top_k: 4 + save_last: true + dirpath: /hpc/projects/comp.micro/virtual_staining/models/dynacell/a549_mantis/memb/fnet3d_paper/checkpoints + +launcher: + job_name: FNet3DPaper_A549_MEMB + run_root: /hpc/projects/comp.micro/virtual_staining/models/dynacell/a549_mantis/memb/fnet3d_paper + # 512G to match the shared headroom convention across the fnet3d + # leaves on a549/joint workloads. mmap_preload after the BasicIndexer + # fix peaks at ~75 GB for CAAX_all alone (single-set); 512G gives + # generous headroom. 
+ sbatch: + mem: "512G" diff --git a/applications/dynacell/configs/benchmarks/virtual_staining/membrane/fnet3d_paper/ipsc_confocal/eval__a549_mantis_denv.yaml b/applications/dynacell/configs/benchmarks/virtual_staining/membrane/fnet3d_paper/ipsc_confocal/eval__a549_mantis_denv.yaml new file mode 100644 index 000000000..e34390e82 --- /dev/null +++ b/applications/dynacell/configs/benchmarks/virtual_staining/membrane/fnet3d_paper/ipsc_confocal/eval__a549_mantis_denv.yaml @@ -0,0 +1,22 @@ +# @package _global_ +# Benchmark eval leaf: Membrane (CAAX) predicted by FNet3DPaper on a549-mantis-caax-denv. +# A549 manifest keys membrane by gene (`caax`); override the iPSC-side `membrane` +# target_id from the target group so the resolver finds caax on a549-mantis-caax-denv. +defaults: + - override /target: membrane + - override /predict_set: a549_mantis_caax_denv + +target_name: caax +benchmark: + dataset_ref: + target: caax + +io: + pred_path: /hpc/projects/virtual_staining/training/dynacell/a549/predictions/memb_fnet3d_paper_denv.zarr + +# A549 manifests don't carry cell_segmentation paths (no segmentation +# pipeline yet). Skip feature metrics until segmentation lands. +compute_feature_metrics: false + +save: + save_dir: /hpc/projects/virtual_staining/training/dynacell/a549/predictions/eval_memb_fnet3d_paper_denv diff --git a/applications/dynacell/configs/benchmarks/virtual_staining/membrane/fnet3d_paper/ipsc_confocal/eval__a549_mantis_mock.yaml b/applications/dynacell/configs/benchmarks/virtual_staining/membrane/fnet3d_paper/ipsc_confocal/eval__a549_mantis_mock.yaml new file mode 100644 index 000000000..8bd9c6f49 --- /dev/null +++ b/applications/dynacell/configs/benchmarks/virtual_staining/membrane/fnet3d_paper/ipsc_confocal/eval__a549_mantis_mock.yaml @@ -0,0 +1,22 @@ +# @package _global_ +# Benchmark eval leaf: Membrane (CAAX) predicted by FNet3DPaper on a549-mantis-caax-mock. 
+# A549 manifest keys membrane by gene (`caax`); override the iPSC-side `membrane` +# target_id from the target group so the resolver finds caax on a549-mantis-caax-mock. +defaults: + - override /target: membrane + - override /predict_set: a549_mantis_caax_mock + +target_name: caax +benchmark: + dataset_ref: + target: caax + +io: + pred_path: /hpc/projects/virtual_staining/training/dynacell/a549/predictions/memb_fnet3d_paper_mock.zarr + +# A549 manifests don't carry cell_segmentation paths (no segmentation +# pipeline yet). Skip feature metrics until segmentation lands. +compute_feature_metrics: false + +save: + save_dir: /hpc/projects/virtual_staining/training/dynacell/a549/predictions/eval_memb_fnet3d_paper_mock diff --git a/applications/dynacell/configs/benchmarks/virtual_staining/membrane/fnet3d_paper/ipsc_confocal/eval__a549_mantis_zikv.yaml b/applications/dynacell/configs/benchmarks/virtual_staining/membrane/fnet3d_paper/ipsc_confocal/eval__a549_mantis_zikv.yaml new file mode 100644 index 000000000..603d7b581 --- /dev/null +++ b/applications/dynacell/configs/benchmarks/virtual_staining/membrane/fnet3d_paper/ipsc_confocal/eval__a549_mantis_zikv.yaml @@ -0,0 +1,22 @@ +# @package _global_ +# Benchmark eval leaf: Membrane (CAAX) predicted by FNet3DPaper on a549-mantis-caax-zikv. +# A549 manifest keys membrane by gene (`caax`); override the iPSC-side `membrane` +# target_id from the target group so the resolver finds caax on a549-mantis-caax-zikv. +defaults: + - override /target: membrane + - override /predict_set: a549_mantis_caax_zikv + +target_name: caax +benchmark: + dataset_ref: + target: caax + +io: + pred_path: /hpc/projects/virtual_staining/training/dynacell/a549/predictions/memb_fnet3d_paper_zikv.zarr + +# A549 manifests don't carry cell_segmentation paths (no segmentation +# pipeline yet). Skip feature metrics until segmentation lands. 
+compute_feature_metrics: false + +save: + save_dir: /hpc/projects/virtual_staining/training/dynacell/a549/predictions/eval_memb_fnet3d_paper_zikv diff --git a/applications/dynacell/configs/benchmarks/virtual_staining/membrane/fnet3d_paper/ipsc_confocal/predict__a549_mantis_denv.yml b/applications/dynacell/configs/benchmarks/virtual_staining/membrane/fnet3d_paper/ipsc_confocal/predict__a549_mantis_denv.yml new file mode 100644 index 000000000..bd94247c2 --- /dev/null +++ b/applications/dynacell/configs/benchmarks/virtual_staining/membrane/fnet3d_paper/ipsc_confocal/predict__a549_mantis_denv.yml @@ -0,0 +1,48 @@ +# FNet3D paper-baseline predict: membrane trained on iPSC, predicting against a549-mantis-caax-denv test. +# A549 manifest keys membrane by gene (`caax`); override the iPSC-side `membrane` +# target_id from targets/membrane.yml so the resolver finds the caax target on +# a549-mantis-caax-denv. +# Same iPSC best val-loss checkpoint as predict__ipsc_confocal.yml (epoch 181, loss/validate=0.6214). +base: + - ../../../_internal/shared/model/predict_sets/a549_mantis_caax_denv.yml + - ../../../_internal/shared/model/targets/membrane.yml + - ../../../_internal/shared/model/model_overlays/fnet3d_paper_predict.yml + - ../../../_internal/shared/model/launcher_profiles/mode_predict.yml + - ../../../_internal/shared/model/launcher_profiles/hardware_h200_single.yml + - ../../../_internal/shared/model/launcher_profiles/runtime_shared.yml + +benchmark: + task: virtual_staining + organelle: membrane + trained_on: ipsc_confocal + predict_set: a549_mantis_caax_denv + model_name: fnet3d_paper + experiment_id: membrane__ipsc_confocal__fnet3d_paper__a549_mantis_caax_denv + # Override the iPSC-side `membrane` target to a549's gene-keyed `caax`. 
+ dataset_ref: + target: caax + +model: + init_args: + ckpt_path: /hpc/projects/comp.micro/virtual_staining/models/dynacell/ipsc/memb/fnet3d_paper/checkpoints/epoch=181-step=157612.ckpt + +data: + init_args: + normalizations: + - class_path: viscy_transforms.NormalizeSampled + init_args: + keys: [Phase3D] + level: fov_statistics + subtrahend: mean + divisor: std + augmentations: [] + +trainer: + callbacks: + - class_path: viscy_utils.callbacks.prediction_writer.HCSPredictionWriter + init_args: + output_store: /hpc/projects/virtual_staining/training/dynacell/a549/predictions/memb_fnet3d_paper_denv.zarr + +launcher: + job_name: FNet3DPaper_PRED_MEMB_ON_A549_DENV + run_root: /hpc/projects/virtual_staining/training/dynacell/a549/predictions diff --git a/applications/dynacell/configs/benchmarks/virtual_staining/membrane/fnet3d_paper/ipsc_confocal/predict__a549_mantis_mock.yml b/applications/dynacell/configs/benchmarks/virtual_staining/membrane/fnet3d_paper/ipsc_confocal/predict__a549_mantis_mock.yml new file mode 100644 index 000000000..c5303c088 --- /dev/null +++ b/applications/dynacell/configs/benchmarks/virtual_staining/membrane/fnet3d_paper/ipsc_confocal/predict__a549_mantis_mock.yml @@ -0,0 +1,48 @@ +# FNet3D paper-baseline predict: membrane trained on iPSC, predicting against a549-mantis-caax-mock test. +# A549 manifest keys membrane by gene (`caax`); override the iPSC-side `membrane` +# target_id from targets/membrane.yml so the resolver finds the caax target on +# a549-mantis-caax-mock. +# Same iPSC best val-loss checkpoint as predict__ipsc_confocal.yml (epoch 181, loss/validate=0.6214). 
+base: + - ../../../_internal/shared/model/predict_sets/a549_mantis_caax_mock.yml + - ../../../_internal/shared/model/targets/membrane.yml + - ../../../_internal/shared/model/model_overlays/fnet3d_paper_predict.yml + - ../../../_internal/shared/model/launcher_profiles/mode_predict.yml + - ../../../_internal/shared/model/launcher_profiles/hardware_h200_single.yml + - ../../../_internal/shared/model/launcher_profiles/runtime_shared.yml + +benchmark: + task: virtual_staining + organelle: membrane + trained_on: ipsc_confocal + predict_set: a549_mantis_caax_mock + model_name: fnet3d_paper + experiment_id: membrane__ipsc_confocal__fnet3d_paper__a549_mantis_caax_mock + # Override the iPSC-side `membrane` target to a549's gene-keyed `caax`. + dataset_ref: + target: caax + +model: + init_args: + ckpt_path: /hpc/projects/comp.micro/virtual_staining/models/dynacell/ipsc/memb/fnet3d_paper/checkpoints/epoch=181-step=157612.ckpt + +data: + init_args: + normalizations: + - class_path: viscy_transforms.NormalizeSampled + init_args: + keys: [Phase3D] + level: fov_statistics + subtrahend: mean + divisor: std + augmentations: [] + +trainer: + callbacks: + - class_path: viscy_utils.callbacks.prediction_writer.HCSPredictionWriter + init_args: + output_store: /hpc/projects/virtual_staining/training/dynacell/a549/predictions/memb_fnet3d_paper_mock.zarr + +launcher: + job_name: FNet3DPaper_PRED_MEMB_ON_A549_MOCK + run_root: /hpc/projects/virtual_staining/training/dynacell/a549/predictions diff --git a/applications/dynacell/configs/benchmarks/virtual_staining/membrane/fnet3d_paper/ipsc_confocal/predict__a549_mantis_zikv.yml b/applications/dynacell/configs/benchmarks/virtual_staining/membrane/fnet3d_paper/ipsc_confocal/predict__a549_mantis_zikv.yml new file mode 100644 index 000000000..0692cad41 --- /dev/null +++ b/applications/dynacell/configs/benchmarks/virtual_staining/membrane/fnet3d_paper/ipsc_confocal/predict__a549_mantis_zikv.yml @@ -0,0 +1,48 @@ +# FNet3D paper-baseline predict: 
membrane trained on iPSC, predicting against a549-mantis-caax-zikv test. +# A549 manifest keys membrane by gene (`caax`); override the iPSC-side `membrane` +# target_id from targets/membrane.yml so the resolver finds the caax target on +# a549-mantis-caax-zikv. +# Same iPSC best val-loss checkpoint as predict__ipsc_confocal.yml (epoch 181, loss/validate=0.6214). +base: + - ../../../_internal/shared/model/predict_sets/a549_mantis_caax_zikv.yml + - ../../../_internal/shared/model/targets/membrane.yml + - ../../../_internal/shared/model/model_overlays/fnet3d_paper_predict.yml + - ../../../_internal/shared/model/launcher_profiles/mode_predict.yml + - ../../../_internal/shared/model/launcher_profiles/hardware_h200_single.yml + - ../../../_internal/shared/model/launcher_profiles/runtime_shared.yml + +benchmark: + task: virtual_staining + organelle: membrane + trained_on: ipsc_confocal + predict_set: a549_mantis_caax_zikv + model_name: fnet3d_paper + experiment_id: membrane__ipsc_confocal__fnet3d_paper__a549_mantis_caax_zikv + # Override the iPSC-side `membrane` target to a549's gene-keyed `caax`. 
+ dataset_ref: + target: caax + +model: + init_args: + ckpt_path: /hpc/projects/comp.micro/virtual_staining/models/dynacell/ipsc/memb/fnet3d_paper/checkpoints/epoch=181-step=157612.ckpt + +data: + init_args: + normalizations: + - class_path: viscy_transforms.NormalizeSampled + init_args: + keys: [Phase3D] + level: fov_statistics + subtrahend: mean + divisor: std + augmentations: [] + +trainer: + callbacks: + - class_path: viscy_utils.callbacks.prediction_writer.HCSPredictionWriter + init_args: + output_store: /hpc/projects/virtual_staining/training/dynacell/a549/predictions/memb_fnet3d_paper_zikv.zarr + +launcher: + job_name: FNet3DPaper_PRED_MEMB_ON_A549_ZIKV + run_root: /hpc/projects/virtual_staining/training/dynacell/a549/predictions diff --git a/applications/dynacell/configs/benchmarks/virtual_staining/membrane/fnet3d_paper/ipsc_confocal/predict__ipsc_confocal.yml b/applications/dynacell/configs/benchmarks/virtual_staining/membrane/fnet3d_paper/ipsc_confocal/predict__ipsc_confocal.yml new file mode 100644 index 000000000..7676cb3e2 --- /dev/null +++ b/applications/dynacell/configs/benchmarks/virtual_staining/membrane/fnet3d_paper/ipsc_confocal/predict__ipsc_confocal.yml @@ -0,0 +1,42 @@ +# FNet3D paper-baseline predict: membrane against ipsc_confocal test_cropped. +# Uses best val-loss checkpoint (epoch 181, loss/validate=0.6214). 
+base: + - ../../../_internal/shared/model/predict_sets/ipsc_confocal.yml + - ../../../_internal/shared/model/targets/membrane.yml + - ../../../_internal/shared/model/model_overlays/fnet3d_paper_predict.yml + - ../../../_internal/shared/model/launcher_profiles/mode_predict.yml + - ../../../_internal/shared/model/launcher_profiles/hardware_h200_single.yml + - ../../../_internal/shared/model/launcher_profiles/runtime_shared.yml + +benchmark: + task: virtual_staining + organelle: membrane + trained_on: ipsc_confocal + predict_set: ipsc_confocal + model_name: fnet3d_paper + experiment_id: membrane__ipsc_confocal__fnet3d_paper__ipsc_confocal + +model: + init_args: + ckpt_path: /hpc/projects/comp.micro/virtual_staining/models/dynacell/ipsc/memb/fnet3d_paper/checkpoints/epoch=181-step=157612.ckpt + +data: + init_args: + normalizations: + - class_path: viscy_transforms.NormalizeSampled + init_args: + keys: [Phase3D] + level: fov_statistics + subtrahend: mean + divisor: std + augmentations: [] + +trainer: + callbacks: + - class_path: viscy_utils.callbacks.prediction_writer.HCSPredictionWriter + init_args: + output_store: /hpc/projects/virtual_staining/training/dynacell/ipsc/predictions/memb_fnet3d_paper.zarr + +launcher: + job_name: FNet3DPaper_PRED_MEMB + run_root: /hpc/projects/virtual_staining/training/dynacell/ipsc/predictions diff --git a/applications/dynacell/configs/benchmarks/virtual_staining/membrane/fnet3d_paper/ipsc_confocal/train.yml b/applications/dynacell/configs/benchmarks/virtual_staining/membrane/fnet3d_paper/ipsc_confocal/train.yml new file mode 100644 index 000000000..196645011 --- /dev/null +++ b/applications/dynacell/configs/benchmarks/virtual_staining/membrane/fnet3d_paper/ipsc_confocal/train.yml @@ -0,0 +1,72 @@ +# FNet3D paper-baseline fit on membrane (Membrane channel of cell.zarr) — AICS iPSC confocal. +# The overlay's norm/aug/val_aug are keyed on Structure (the SEC61B/TOMM20 target +# channel). 
Membrane target_channel is Membrane, so we list-replace those three +# lists here to re-key them. +base: + - ../../../_internal/shared/model/train_sets/ipsc_confocal.yml + - ../../../_internal/shared/model/targets/membrane.yml + - ../../../_internal/shared/model/data_overlays/fnet3d_paper_fit.yml + - ../../../_internal/shared/model/model_overlays/fnet3d_paper_fit.yml + - ../../../_internal/shared/model/launcher_profiles/mode_fit.yml + - ../../../_internal/shared/model/launcher_profiles/hardware_gpu_any_long.yml + - ../../../_internal/shared/model/launcher_profiles/runtime_shared.yml + +benchmark: + task: virtual_staining + organelle: membrane + train_set: ipsc_confocal + model_name: fnet3d_paper + experiment_id: membrane__ipsc_confocal__fnet3d_paper + +data: + init_args: + normalizations: + - class_path: viscy_transforms.NormalizeSampled + init_args: + keys: [Phase3D] + level: fov_statistics + subtrahend: mean + divisor: std + - class_path: viscy_transforms.NormalizeSampled + init_args: + keys: [Membrane] + level: fov_statistics + subtrahend: mean + divisor: std + augmentations: + - class_path: viscy_transforms.RandWeightedCropd + init_args: + keys: [Phase3D, Membrane] + w_key: Membrane + spatial_size: [32, 64, 64] + num_samples: 8 + val_augmentations: + - class_path: viscy_transforms.CenterSpatialCropd + init_args: + keys: [Phase3D, Membrane] + roi_size: [32, 64, 64] + +trainer: + logger: + init_args: + name: FNet3D_iPSC_MEMB_paper + save_dir: /hpc/projects/comp.micro/virtual_staining/models/dynacell/ipsc/memb/fnet3d_paper + callbacks: + - class_path: lightning.pytorch.callbacks.LearningRateMonitor + init_args: + logging_interval: step + - class_path: lightning.pytorch.callbacks.ModelCheckpoint + init_args: + monitor: loss/validate + every_n_epochs: 1 + save_top_k: 4 + save_last: true + dirpath: /hpc/projects/comp.micro/virtual_staining/models/dynacell/ipsc/memb/fnet3d_paper/checkpoints + +launcher: + job_name: FNet3DPaper_MEMB + run_root: 
/hpc/projects/comp.micro/virtual_staining/models/dynacell/ipsc/memb/fnet3d_paper + # cell.zarr-backed preload (same plate as nucleus) puts MaxVMSize over + # the shared 256G cap; bump to match nucleus. + sbatch: + mem: "512G" diff --git a/applications/dynacell/configs/benchmarks/virtual_staining/membrane/fnet3d_paper/joint_ipsc_confocal_a549_mantis/predict__a549_mantis_denv.yml b/applications/dynacell/configs/benchmarks/virtual_staining/membrane/fnet3d_paper/joint_ipsc_confocal_a549_mantis/predict__a549_mantis_denv.yml new file mode 100644 index 000000000..2bd51fb96 --- /dev/null +++ b/applications/dynacell/configs/benchmarks/virtual_staining/membrane/fnet3d_paper/joint_ipsc_confocal_a549_mantis/predict__a549_mantis_denv.yml @@ -0,0 +1,50 @@ +# FNet3D paper-baseline predict: cell membrane trained on joint iPSC+A549, +# predicting against a549-mantis-caax-denv test. +# Best val-loss checkpoint from job 31962519 (epoch 116, val 0.5759). See +# predict__ipsc_confocal.yml in this dir for full provenance. +# A549 manifest keys membrane by gene (`caax`); override the iPSC-side `membrane` +# target_id from targets/membrane.yml so the resolver finds the caax target on +# a549-mantis-caax-denv. +base: + - ../../../_internal/shared/model/predict_sets/a549_mantis_caax_denv.yml + - ../../../_internal/shared/model/targets/membrane.yml + - ../../../_internal/shared/model/model_overlays/fnet3d_paper_predict.yml + - ../../../_internal/shared/model/launcher_profiles/mode_predict.yml + - ../../../_internal/shared/model/launcher_profiles/hardware_h200_single.yml + - ../../../_internal/shared/model/launcher_profiles/runtime_shared.yml + +benchmark: + task: virtual_staining + organelle: membrane + trained_on: joint_ipsc_confocal_a549_mantis + predict_set: a549_mantis_caax_denv + model_name: fnet3d_paper + experiment_id: membrane__joint_ipsc_confocal_a549_mantis__fnet3d_paper__a549_mantis_caax_denv + # Override the iPSC-side `membrane` target to a549's gene-keyed `caax`. 
+ dataset_ref: + target: caax + +model: + init_args: + ckpt_path: /hpc/projects/comp.micro/virtual_staining/models/dynacell/joint_ipsc_confocal_a549_mantis/memb/fnet3d_paper/checkpoints/epoch=116-step=180882.ckpt + +data: + init_args: + normalizations: + - class_path: viscy_transforms.NormalizeSampled + init_args: + keys: [Phase3D] + level: fov_statistics + subtrahend: mean + divisor: std + augmentations: [] + +trainer: + callbacks: + - class_path: viscy_utils.callbacks.prediction_writer.HCSPredictionWriter + init_args: + output_store: /hpc/projects/virtual_staining/training/dynacell/a549/predictions/memb_fnet3d_paper_jointtrained_denv.zarr + +launcher: + job_name: FNet3DPaper_PRED_MEMB_JOINTTR_DENV + run_root: /hpc/projects/virtual_staining/training/dynacell/a549/predictions diff --git a/applications/dynacell/configs/benchmarks/virtual_staining/membrane/fnet3d_paper/joint_ipsc_confocal_a549_mantis/predict__a549_mantis_mock.yml b/applications/dynacell/configs/benchmarks/virtual_staining/membrane/fnet3d_paper/joint_ipsc_confocal_a549_mantis/predict__a549_mantis_mock.yml new file mode 100644 index 000000000..cfc58fab7 --- /dev/null +++ b/applications/dynacell/configs/benchmarks/virtual_staining/membrane/fnet3d_paper/joint_ipsc_confocal_a549_mantis/predict__a549_mantis_mock.yml @@ -0,0 +1,50 @@ +# FNet3D paper-baseline predict: cell membrane trained on joint iPSC+A549, +# predicting against a549-mantis-caax-mock test. +# Best val-loss checkpoint from job 31962519 (epoch 116, val 0.5759). See +# predict__ipsc_confocal.yml in this dir for full provenance. +# A549 manifest keys membrane by gene (`caax`); override the iPSC-side `membrane` +# target_id from targets/membrane.yml so the resolver finds the caax target on +# a549-mantis-caax-mock. 
+base: + - ../../../_internal/shared/model/predict_sets/a549_mantis_caax_mock.yml + - ../../../_internal/shared/model/targets/membrane.yml + - ../../../_internal/shared/model/model_overlays/fnet3d_paper_predict.yml + - ../../../_internal/shared/model/launcher_profiles/mode_predict.yml + - ../../../_internal/shared/model/launcher_profiles/hardware_h200_single.yml + - ../../../_internal/shared/model/launcher_profiles/runtime_shared.yml + +benchmark: + task: virtual_staining + organelle: membrane + trained_on: joint_ipsc_confocal_a549_mantis + predict_set: a549_mantis_caax_mock + model_name: fnet3d_paper + experiment_id: membrane__joint_ipsc_confocal_a549_mantis__fnet3d_paper__a549_mantis_caax_mock + # Override the iPSC-side `membrane` target to a549's gene-keyed `caax`. + dataset_ref: + target: caax + +model: + init_args: + ckpt_path: /hpc/projects/comp.micro/virtual_staining/models/dynacell/joint_ipsc_confocal_a549_mantis/memb/fnet3d_paper/checkpoints/epoch=116-step=180882.ckpt + +data: + init_args: + normalizations: + - class_path: viscy_transforms.NormalizeSampled + init_args: + keys: [Phase3D] + level: fov_statistics + subtrahend: mean + divisor: std + augmentations: [] + +trainer: + callbacks: + - class_path: viscy_utils.callbacks.prediction_writer.HCSPredictionWriter + init_args: + output_store: /hpc/projects/virtual_staining/training/dynacell/a549/predictions/memb_fnet3d_paper_jointtrained_mock.zarr + +launcher: + job_name: FNet3DPaper_PRED_MEMB_JOINTTR_MOCK + run_root: /hpc/projects/virtual_staining/training/dynacell/a549/predictions diff --git a/applications/dynacell/configs/benchmarks/virtual_staining/membrane/fnet3d_paper/joint_ipsc_confocal_a549_mantis/predict__a549_mantis_zikv.yml b/applications/dynacell/configs/benchmarks/virtual_staining/membrane/fnet3d_paper/joint_ipsc_confocal_a549_mantis/predict__a549_mantis_zikv.yml new file mode 100644 index 000000000..0a9b86ee0 --- /dev/null +++ 
b/applications/dynacell/configs/benchmarks/virtual_staining/membrane/fnet3d_paper/joint_ipsc_confocal_a549_mantis/predict__a549_mantis_zikv.yml @@ -0,0 +1,50 @@ +# FNet3D paper-baseline predict: cell membrane trained on joint iPSC+A549, +# predicting against a549-mantis-caax-zikv test. +# Best val-loss checkpoint from job 31962519 (epoch 116, val 0.5759). See +# predict__ipsc_confocal.yml in this dir for full provenance. +# A549 manifest keys membrane by gene (`caax`); override the iPSC-side `membrane` +# target_id from targets/membrane.yml so the resolver finds the caax target on +# a549-mantis-caax-zikv. +base: + - ../../../_internal/shared/model/predict_sets/a549_mantis_caax_zikv.yml + - ../../../_internal/shared/model/targets/membrane.yml + - ../../../_internal/shared/model/model_overlays/fnet3d_paper_predict.yml + - ../../../_internal/shared/model/launcher_profiles/mode_predict.yml + - ../../../_internal/shared/model/launcher_profiles/hardware_h200_single.yml + - ../../../_internal/shared/model/launcher_profiles/runtime_shared.yml + +benchmark: + task: virtual_staining + organelle: membrane + trained_on: joint_ipsc_confocal_a549_mantis + predict_set: a549_mantis_caax_zikv + model_name: fnet3d_paper + experiment_id: membrane__joint_ipsc_confocal_a549_mantis__fnet3d_paper__a549_mantis_caax_zikv + # Override the iPSC-side `membrane` target to a549's gene-keyed `caax`. 
+ dataset_ref: + target: caax + +model: + init_args: + ckpt_path: /hpc/projects/comp.micro/virtual_staining/models/dynacell/joint_ipsc_confocal_a549_mantis/memb/fnet3d_paper/checkpoints/epoch=116-step=180882.ckpt + +data: + init_args: + normalizations: + - class_path: viscy_transforms.NormalizeSampled + init_args: + keys: [Phase3D] + level: fov_statistics + subtrahend: mean + divisor: std + augmentations: [] + +trainer: + callbacks: + - class_path: viscy_utils.callbacks.prediction_writer.HCSPredictionWriter + init_args: + output_store: /hpc/projects/virtual_staining/training/dynacell/a549/predictions/memb_fnet3d_paper_jointtrained_zikv.zarr + +launcher: + job_name: FNet3DPaper_PRED_MEMB_JOINTTR_ZIKV + run_root: /hpc/projects/virtual_staining/training/dynacell/a549/predictions diff --git a/applications/dynacell/configs/benchmarks/virtual_staining/membrane/fnet3d_paper/joint_ipsc_confocal_a549_mantis/predict__ipsc_confocal.yml b/applications/dynacell/configs/benchmarks/virtual_staining/membrane/fnet3d_paper/joint_ipsc_confocal_a549_mantis/predict__ipsc_confocal.yml new file mode 100644 index 000000000..411e02233 --- /dev/null +++ b/applications/dynacell/configs/benchmarks/virtual_staining/membrane/fnet3d_paper/joint_ipsc_confocal_a549_mantis/predict__ipsc_confocal.yml @@ -0,0 +1,45 @@ +# FNet3D paper-baseline predict: cell membrane trained on joint iPSC+A549, +# predicting against ipsc_confocal test_cropped. +# Best val-loss checkpoint from job 31962519 (epoch 116, val 0.5759). +# Wandb run 20260503-181142_FNet3D_JOINT_MEMB_paper (state=finished, 129 ep / +# 199,999 steps; final val 0.6751 — drifted slightly past ep116 best). 
+base: + - ../../../_internal/shared/model/predict_sets/ipsc_confocal.yml + - ../../../_internal/shared/model/targets/membrane.yml + - ../../../_internal/shared/model/model_overlays/fnet3d_paper_predict.yml + - ../../../_internal/shared/model/launcher_profiles/mode_predict.yml + - ../../../_internal/shared/model/launcher_profiles/hardware_h200_single.yml + - ../../../_internal/shared/model/launcher_profiles/runtime_shared.yml + +benchmark: + task: virtual_staining + organelle: membrane + trained_on: joint_ipsc_confocal_a549_mantis + predict_set: ipsc_confocal + model_name: fnet3d_paper + experiment_id: membrane__joint_ipsc_confocal_a549_mantis__fnet3d_paper__ipsc_confocal + +model: + init_args: + ckpt_path: /hpc/projects/comp.micro/virtual_staining/models/dynacell/joint_ipsc_confocal_a549_mantis/memb/fnet3d_paper/checkpoints/epoch=116-step=180882.ckpt + +data: + init_args: + normalizations: + - class_path: viscy_transforms.NormalizeSampled + init_args: + keys: [Phase3D] + level: fov_statistics + subtrahend: mean + divisor: std + augmentations: [] + +trainer: + callbacks: + - class_path: viscy_utils.callbacks.prediction_writer.HCSPredictionWriter + init_args: + output_store: /hpc/projects/virtual_staining/training/dynacell/ipsc/predictions/memb_fnet3d_paper_jointtrained.zarr + +launcher: + job_name: FNet3DPaper_PRED_MEMB_JOINTTR_IPSC + run_root: /hpc/projects/virtual_staining/training/dynacell/ipsc/predictions diff --git a/applications/dynacell/configs/benchmarks/virtual_staining/membrane/fnet3d_paper/joint_ipsc_confocal_a549_mantis/train.yml b/applications/dynacell/configs/benchmarks/virtual_staining/membrane/fnet3d_paper/joint_ipsc_confocal_a549_mantis/train.yml new file mode 100644 index 000000000..e3426b997 --- /dev/null +++ b/applications/dynacell/configs/benchmarks/virtual_staining/membrane/fnet3d_paper/joint_ipsc_confocal_a549_mantis/train.yml @@ -0,0 +1,124 @@ +# FNet3D paper-baseline fit on membrane (MEMB) — joint +# ipsc_confocal + a549_mantis pooled. 
Mirrors +# membrane/fnet3d_paper/ipsc_confocal/train.yml on the joint +# train_set. +# +# Joint leaf per Stage 7 of A549_EXPANSION_ROADMAP.md. +# BatchedConcatDataModule + two explicit HCSDataModule children; +# only model_overlays/fnet3d_paper_fit.yml is composed; data block +# inline. Norms + 8-crops-per-FOV diverge from the CellDiff/UNetViT +# conventions: target channel uses mean/std (not median/iqr) and +# val augmentations are CPU CenterSpatialCropd on the raw keys (the +# baseline's training pipeline doesn't go through GPU val transforms). +# +# Topology: single GPU, any model, long wall — same as +# fnet3d_paper/ipsc_confocal/train.yml. The paper baseline is single-GPU +# and we keep that here so iPSC-only and joint runs are apples-to-apples. +base: + - ../../../_internal/shared/model/model_overlays/fnet3d_paper_fit.yml + - ../../../_internal/shared/model/launcher_profiles/mode_fit.yml + - ../../../_internal/shared/model/launcher_profiles/hardware_gpu_any_long.yml + - ../../../_internal/shared/model/launcher_profiles/runtime_shared.yml + +benchmark: + task: virtual_staining + organelle: membrane + gene: Membrane + target: membrane + target_id: membrane + train_set: joint_ipsc_confocal_a549_mantis + model_name: fnet3d_paper + experiment_id: membrane__joint_ipsc_confocal_a549_mantis__fnet3d_paper + +trainer: + logger: + init_args: + name: FNet3D_JOINT_MEMB_paper + save_dir: /hpc/projects/comp.micro/virtual_staining/models/dynacell/joint_ipsc_confocal_a549_mantis/memb/fnet3d_paper + callbacks: + - class_path: lightning.pytorch.callbacks.LearningRateMonitor + init_args: + logging_interval: step + - class_path: lightning.pytorch.callbacks.ModelCheckpoint + init_args: + monitor: loss/validate + every_n_epochs: 1 + save_top_k: 4 + save_last: true + dirpath: /hpc/projects/comp.micro/virtual_staining/models/dynacell/joint_ipsc_confocal_a549_mantis/memb/fnet3d_paper/checkpoints + +_hcs_init_args: &hcs_init_args + source_channel: Phase3D + target_channel: Membrane + 
z_window_size: 32 + # See nucleus/fnet3d_paper/joint_*/train.yml for the rationale: joint + # mode does not divide batch_size by num_samples (unlike single-set), + # so 6 * num_samples=8 = 48 GPU samples matches single-set effective. + batch_size: 6 + num_workers: 8 + yx_patch_size: [64, 64] + split_ratio: 0.8 + mmap_preload: true + scratch_dir: /dev/shm + persistent_workers: true + normalizations: + - class_path: viscy_transforms.NormalizeSampled + init_args: + keys: [Phase3D] + level: fov_statistics + subtrahend: mean + divisor: std + - class_path: viscy_transforms.NormalizeSampled + init_args: + keys: [Membrane] + level: fov_statistics + subtrahend: mean + divisor: std + augmentations: + - class_path: viscy_transforms.RandWeightedCropd + init_args: + keys: [Phase3D, Membrane] + w_key: Membrane + spatial_size: [32, 64, 64] + num_samples: 8 + gpu_augmentations: + - class_path: viscy_transforms.BatchedRandFlipd + init_args: + keys: [source, target] + spatial_axes: [1] + prob: 0.5 + - class_path: viscy_transforms.BatchedRandFlipd + init_args: + keys: [source, target] + spatial_axes: [2] + prob: 0.5 + val_augmentations: + - class_path: viscy_transforms.CenterSpatialCropd + init_args: + keys: [Phase3D, Membrane] + roi_size: [32, 64, 64] + +data: + class_path: viscy_data.BatchedConcatDataModule + init_args: + data_modules: + # ipsc_confocal — aics-hipsc multi-marker cell.zarr (Membrane channel) + - class_path: viscy_data.hcs.HCSDataModule + init_args: + <<: *hcs_init_args + data_path: /hpc/projects/virtual_staining/training/dynacell/ipsc/dataset_v4/train/cell.zarr + # a549_mantis — pooled CAAX all-conditions train store (Membrane channel) + - class_path: viscy_data.hcs.HCSDataModule + init_args: + <<: *hcs_init_args + data_path: /hpc/projects/virtual_staining/training/dynacell/a549/mantis_v1/train/CAAX_all.zarr + +launcher: + job_name: FNet3DPaper_JOINT_MEMB + run_root: 
/hpc/projects/comp.micro/virtual_staining/models/dynacell/joint_ipsc_confocal_a549_mantis/memb/fnet3d_paper + # 512G to match the shared headroom convention across the fnet3d + # leaves on a549/joint workloads. mmap_preload after the BasicIndexer + # fix peaks at ~185 GB for joint cell.zarr + CAAX_all; 512G gives + # generous headroom for worker buffers and validation transients. + sbatch: + mem: "512G" diff --git a/applications/dynacell/configs/benchmarks/virtual_staining/membrane/unetvit3d/a549_mantis/train.yml b/applications/dynacell/configs/benchmarks/virtual_staining/membrane/unetvit3d/a549_mantis/train.yml new file mode 100644 index 000000000..8bca97a50 --- /dev/null +++ b/applications/dynacell/configs/benchmarks/virtual_staining/membrane/unetvit3d/a549_mantis/train.yml @@ -0,0 +1,43 @@ +# UNetViT3D fit on membrane (Membrane channel of cell.zarr) — A549 mantis-lightsheet pooled (mock + DENV + ZIKV). +base: + - ../../../_internal/shared/model/train_sets/a549_mantis.yml + - ../../../_internal/shared/model/targets/membrane.yml + - ../../../_internal/shared/model/data_overlays/unetvit3d_fit.yml + - ../../../_internal/shared/model/model_overlays/unetvit3d_fit.yml + - ../../../_internal/shared/model/launcher_profiles/mode_fit.yml + - ../../../_internal/shared/model/launcher_profiles/hardware_h200_single.yml + - ../../../_internal/shared/model/launcher_profiles/runtime_shared.yml + +benchmark: + task: virtual_staining + organelle: membrane + train_set: a549_mantis + model_name: unetvit3d + experiment_id: membrane__a549_mantis__unetvit3d + +trainer: + logger: + init_args: + name: UNetViT3D_A549_MEMB + save_dir: /hpc/projects/comp.micro/virtual_staining/models/cell_diff_vs_viscy/ipsc/memb_temp/unetvit3d + callbacks: + - class_path: lightning.pytorch.callbacks.LearningRateMonitor + init_args: + logging_interval: step + - class_path: lightning.pytorch.callbacks.ModelCheckpoint + init_args: + monitor: loss/validate + every_n_epochs: 1 + save_top_k: 4 + save_last: true + 
dirpath: /hpc/projects/comp.micro/virtual_staining/models/cell_diff_vs_viscy/ipsc/memb_temp/unetvit3d/checkpoints + +data: + init_args: + # A549 pooled store + target_channel — no resolver in this train_set. + target_channel: Membrane + data_path: /hpc/projects/virtual_staining/training/dynacell/a549/mantis_v1/train/CAAX_all.zarr + +launcher: + job_name: UNetViT3D_A549_MEMB + run_root: /hpc/projects/comp.micro/virtual_staining/models/cell_diff_vs_viscy/ipsc/memb_temp/unetvit3d diff --git a/applications/dynacell/configs/benchmarks/virtual_staining/membrane/unetvit3d/ipsc_confocal/eval__a549_mantis_denv.yaml b/applications/dynacell/configs/benchmarks/virtual_staining/membrane/unetvit3d/ipsc_confocal/eval__a549_mantis_denv.yaml new file mode 100644 index 000000000..c5143063d --- /dev/null +++ b/applications/dynacell/configs/benchmarks/virtual_staining/membrane/unetvit3d/ipsc_confocal/eval__a549_mantis_denv.yaml @@ -0,0 +1,22 @@ +# @package _global_ +# Benchmark eval leaf: Membrane (CAAX) predicted by UNetViT3D on a549-mantis-caax-denv. +# A549 manifest keys membrane by gene (`caax`); override the iPSC-side `membrane` +# target_id from the target group so the resolver finds caax on a549-mantis-caax-denv. +defaults: + - override /target: membrane + - override /predict_set: a549_mantis_caax_denv + +target_name: caax +benchmark: + dataset_ref: + target: caax + +io: + pred_path: /hpc/projects/virtual_staining/training/dynacell/a549/predictions/memb_unetvit3d_denv.zarr + +# A549 manifests don't carry cell_segmentation paths (no segmentation +# pipeline yet). Skip feature metrics until segmentation lands. 
+compute_feature_metrics: false + +save: + save_dir: /hpc/projects/virtual_staining/training/dynacell/a549/predictions/eval_memb_unetvit3d_denv diff --git a/applications/dynacell/configs/benchmarks/virtual_staining/membrane/unetvit3d/ipsc_confocal/eval__a549_mantis_mock.yaml b/applications/dynacell/configs/benchmarks/virtual_staining/membrane/unetvit3d/ipsc_confocal/eval__a549_mantis_mock.yaml new file mode 100644 index 000000000..91de71fa3 --- /dev/null +++ b/applications/dynacell/configs/benchmarks/virtual_staining/membrane/unetvit3d/ipsc_confocal/eval__a549_mantis_mock.yaml @@ -0,0 +1,22 @@ +# @package _global_ +# Benchmark eval leaf: Membrane (CAAX) predicted by UNetViT3D on a549-mantis-caax-mock. +# A549 manifest keys membrane by gene (`caax`); override the iPSC-side `membrane` +# target_id from the target group so the resolver finds caax on a549-mantis-caax-mock. +defaults: + - override /target: membrane + - override /predict_set: a549_mantis_caax_mock + +target_name: caax +benchmark: + dataset_ref: + target: caax + +io: + pred_path: /hpc/projects/virtual_staining/training/dynacell/a549/predictions/memb_unetvit3d_mock.zarr + +# A549 manifests don't carry cell_segmentation paths (no segmentation +# pipeline yet). Skip feature metrics until segmentation lands. +compute_feature_metrics: false + +save: + save_dir: /hpc/projects/virtual_staining/training/dynacell/a549/predictions/eval_memb_unetvit3d_mock diff --git a/applications/dynacell/configs/benchmarks/virtual_staining/membrane/unetvit3d/ipsc_confocal/eval__a549_mantis_zikv.yaml b/applications/dynacell/configs/benchmarks/virtual_staining/membrane/unetvit3d/ipsc_confocal/eval__a549_mantis_zikv.yaml new file mode 100644 index 000000000..f78241efb --- /dev/null +++ b/applications/dynacell/configs/benchmarks/virtual_staining/membrane/unetvit3d/ipsc_confocal/eval__a549_mantis_zikv.yaml @@ -0,0 +1,22 @@ +# @package _global_ +# Benchmark eval leaf: Membrane (CAAX) predicted by UNetViT3D on a549-mantis-caax-zikv. 
+# A549 manifest keys membrane by gene (`caax`); override the iPSC-side `membrane` +# target_id from the target group so the resolver finds caax on a549-mantis-caax-zikv. +defaults: + - override /target: membrane + - override /predict_set: a549_mantis_caax_zikv + +target_name: caax +benchmark: + dataset_ref: + target: caax + +io: + pred_path: /hpc/projects/virtual_staining/training/dynacell/a549/predictions/memb_unetvit3d_zikv.zarr + +# A549 manifests don't carry cell_segmentation paths (no segmentation +# pipeline yet). Skip feature metrics until segmentation lands. +compute_feature_metrics: false + +save: + save_dir: /hpc/projects/virtual_staining/training/dynacell/a549/predictions/eval_memb_unetvit3d_zikv diff --git a/applications/dynacell/configs/benchmarks/virtual_staining/membrane/unetvit3d/ipsc_confocal/eval__ipsc_confocal.yaml b/applications/dynacell/configs/benchmarks/virtual_staining/membrane/unetvit3d/ipsc_confocal/eval__ipsc_confocal.yaml new file mode 100644 index 000000000..12c736435 --- /dev/null +++ b/applications/dynacell/configs/benchmarks/virtual_staining/membrane/unetvit3d/ipsc_confocal/eval__ipsc_confocal.yaml @@ -0,0 +1,13 @@ +# @package _global_ +# Benchmark eval leaf: Membrane predicted by UNetViT3D on iPSC confocal. 
+defaults: + - override /target: membrane + - override /predict_set: ipsc_confocal + +io: + pred_path: /hpc/projects/virtual_staining/training/dynacell/ipsc/predictions/memb_unetvit3d.zarr + +compute_feature_metrics: true + +save: + save_dir: /hpc/projects/virtual_staining/training/dynacell/ipsc/predictions/eval_memb_unetvit3d diff --git a/applications/dynacell/configs/benchmarks/virtual_staining/membrane/unetvit3d/ipsc_confocal/predict__a549_mantis_denv.yml b/applications/dynacell/configs/benchmarks/virtual_staining/membrane/unetvit3d/ipsc_confocal/predict__a549_mantis_denv.yml new file mode 100644 index 000000000..7095b8c04 --- /dev/null +++ b/applications/dynacell/configs/benchmarks/virtual_staining/membrane/unetvit3d/ipsc_confocal/predict__a549_mantis_denv.yml @@ -0,0 +1,49 @@ +# UNetViT3D predict: membrane trained on iPSC, predicting against a549-mantis-caax-denv test. +# A549 manifest keys membrane by gene (`caax`); override the iPSC-side `membrane` +# target_id from targets/membrane.yml so the resolver finds the caax target on +# a549-mantis-caax-denv. +base: + - ../../../_internal/shared/model/predict_sets/a549_mantis_caax_denv.yml + - ../../../_internal/shared/model/targets/membrane.yml + - ../../../_internal/shared/model/model_overlays/unetvit3d_predict.yml + - ../../../_internal/shared/model/launcher_profiles/mode_predict.yml + - ../../../_internal/shared/model/launcher_profiles/hardware_h200_single.yml + - ../../../_internal/shared/model/launcher_profiles/runtime_shared.yml + +benchmark: + task: virtual_staining + organelle: membrane + trained_on: ipsc_confocal + predict_set: a549_mantis_caax_denv + model_name: unetvit3d + experiment_id: membrane__ipsc_confocal__unetvit3d__a549_mantis_caax_denv + # Override the iPSC-side `membrane` target to a549's gene-keyed `caax`. 
+ dataset_ref: + target: caax + +model: + init_args: + ckpt_path: /hpc/projects/comp.micro/virtual_staining/models/cell_diff_vs_viscy/ipsc/memb/unetvit3d/checkpoints/last.ckpt + +data: + init_args: + # override target-inherited normalizations: predict only reads source + normalizations: + - class_path: viscy_transforms.NormalizeSampled + init_args: + keys: [Phase3D] + level: fov_statistics + subtrahend: mean + divisor: std + # clear target-inherited RandWeightedCropd; predict has no CPU augs + augmentations: [] + +trainer: + callbacks: + - class_path: viscy_utils.callbacks.prediction_writer.HCSPredictionWriter + init_args: + output_store: /hpc/projects/virtual_staining/training/dynacell/a549/predictions/memb_unetvit3d_denv.zarr + +launcher: + job_name: UNetViT3D_PRED_MEMB_ON_A549_DENV + run_root: /hpc/projects/virtual_staining/training/dynacell/a549/predictions diff --git a/applications/dynacell/configs/benchmarks/virtual_staining/membrane/unetvit3d/ipsc_confocal/predict__a549_mantis_mock.yml b/applications/dynacell/configs/benchmarks/virtual_staining/membrane/unetvit3d/ipsc_confocal/predict__a549_mantis_mock.yml new file mode 100644 index 000000000..b58ab02d5 --- /dev/null +++ b/applications/dynacell/configs/benchmarks/virtual_staining/membrane/unetvit3d/ipsc_confocal/predict__a549_mantis_mock.yml @@ -0,0 +1,49 @@ +# UNetViT3D predict: membrane trained on iPSC, predicting against a549-mantis-caax-mock test. +# A549 manifest keys membrane by gene (`caax`); override the iPSC-side `membrane` +# target_id from targets/membrane.yml so the resolver finds the caax target on +# a549-mantis-caax-mock. 
+base: + - ../../../_internal/shared/model/predict_sets/a549_mantis_caax_mock.yml + - ../../../_internal/shared/model/targets/membrane.yml + - ../../../_internal/shared/model/model_overlays/unetvit3d_predict.yml + - ../../../_internal/shared/model/launcher_profiles/mode_predict.yml + - ../../../_internal/shared/model/launcher_profiles/hardware_h200_single.yml + - ../../../_internal/shared/model/launcher_profiles/runtime_shared.yml + +benchmark: + task: virtual_staining + organelle: membrane + trained_on: ipsc_confocal + predict_set: a549_mantis_caax_mock + model_name: unetvit3d + experiment_id: membrane__ipsc_confocal__unetvit3d__a549_mantis_caax_mock + # Override the iPSC-side `membrane` target to a549's gene-keyed `caax`. + dataset_ref: + target: caax + +model: + init_args: + ckpt_path: /hpc/projects/comp.micro/virtual_staining/models/cell_diff_vs_viscy/ipsc/memb/unetvit3d/checkpoints/last.ckpt + +data: + init_args: + # override target-inherited normalizations: predict only reads source + normalizations: + - class_path: viscy_transforms.NormalizeSampled + init_args: + keys: [Phase3D] + level: fov_statistics + subtrahend: mean + divisor: std + # clear target-inherited RandWeightedCropd; predict has no CPU augs + augmentations: [] + +trainer: + callbacks: + - class_path: viscy_utils.callbacks.prediction_writer.HCSPredictionWriter + init_args: + output_store: /hpc/projects/virtual_staining/training/dynacell/a549/predictions/memb_unetvit3d_mock.zarr + +launcher: + job_name: UNetViT3D_PRED_MEMB_ON_A549_MOCK + run_root: /hpc/projects/virtual_staining/training/dynacell/a549/predictions diff --git a/applications/dynacell/configs/benchmarks/virtual_staining/membrane/unetvit3d/ipsc_confocal/predict__a549_mantis_zikv.yml b/applications/dynacell/configs/benchmarks/virtual_staining/membrane/unetvit3d/ipsc_confocal/predict__a549_mantis_zikv.yml new file mode 100644 index 000000000..222ad00ae --- /dev/null +++ 
b/applications/dynacell/configs/benchmarks/virtual_staining/membrane/unetvit3d/ipsc_confocal/predict__a549_mantis_zikv.yml @@ -0,0 +1,49 @@ +# UNetViT3D predict: membrane trained on iPSC, predicting against a549-mantis-caax-zikv test. +# A549 manifest keys membrane by gene (`caax`); override the iPSC-side `membrane` +# target_id from targets/membrane.yml so the resolver finds the caax target on +# a549-mantis-caax-zikv. +base: + - ../../../_internal/shared/model/predict_sets/a549_mantis_caax_zikv.yml + - ../../../_internal/shared/model/targets/membrane.yml + - ../../../_internal/shared/model/model_overlays/unetvit3d_predict.yml + - ../../../_internal/shared/model/launcher_profiles/mode_predict.yml + - ../../../_internal/shared/model/launcher_profiles/hardware_h200_single.yml + - ../../../_internal/shared/model/launcher_profiles/runtime_shared.yml + +benchmark: + task: virtual_staining + organelle: membrane + trained_on: ipsc_confocal + predict_set: a549_mantis_caax_zikv + model_name: unetvit3d + experiment_id: membrane__ipsc_confocal__unetvit3d__a549_mantis_caax_zikv + # Override the iPSC-side `membrane` target to a549's gene-keyed `caax`. 
+ dataset_ref: + target: caax + +model: + init_args: + ckpt_path: /hpc/projects/comp.micro/virtual_staining/models/cell_diff_vs_viscy/ipsc/memb/unetvit3d/checkpoints/last.ckpt + +data: + init_args: + # override target-inherited normalizations: predict only reads source + normalizations: + - class_path: viscy_transforms.NormalizeSampled + init_args: + keys: [Phase3D] + level: fov_statistics + subtrahend: mean + divisor: std + # clear target-inherited RandWeightedCropd; predict has no CPU augs + augmentations: [] + +trainer: + callbacks: + - class_path: viscy_utils.callbacks.prediction_writer.HCSPredictionWriter + init_args: + output_store: /hpc/projects/virtual_staining/training/dynacell/a549/predictions/memb_unetvit3d_zikv.zarr + +launcher: + job_name: UNetViT3D_PRED_MEMB_ON_A549_ZIKV + run_root: /hpc/projects/virtual_staining/training/dynacell/a549/predictions diff --git a/applications/dynacell/configs/benchmarks/virtual_staining/membrane/unetvit3d/ipsc_confocal/predict__ipsc_confocal.yml b/applications/dynacell/configs/benchmarks/virtual_staining/membrane/unetvit3d/ipsc_confocal/predict__ipsc_confocal.yml new file mode 100644 index 000000000..b7c546e29 --- /dev/null +++ b/applications/dynacell/configs/benchmarks/virtual_staining/membrane/unetvit3d/ipsc_confocal/predict__ipsc_confocal.yml @@ -0,0 +1,43 @@ +# UNetViT3D predict: membrane against ipsc_confocal test_cropped. 
+base: + - ../../../_internal/shared/model/predict_sets/ipsc_confocal.yml + - ../../../_internal/shared/model/targets/membrane.yml + - ../../../_internal/shared/model/model_overlays/unetvit3d_predict.yml + - ../../../_internal/shared/model/launcher_profiles/mode_predict.yml + - ../../../_internal/shared/model/launcher_profiles/hardware_h200_single.yml + - ../../../_internal/shared/model/launcher_profiles/runtime_shared.yml + +benchmark: + task: virtual_staining + organelle: membrane + trained_on: ipsc_confocal + predict_set: ipsc_confocal + model_name: unetvit3d + experiment_id: membrane__ipsc_confocal__unetvit3d__ipsc_confocal + +model: + init_args: + ckpt_path: /hpc/projects/comp.micro/virtual_staining/models/cell_diff_vs_viscy/ipsc/memb/unetvit3d/checkpoints/last.ckpt + +data: + init_args: + # override target-inherited normalizations: predict only reads source + normalizations: + - class_path: viscy_transforms.NormalizeSampled + init_args: + keys: [Phase3D] + level: fov_statistics + subtrahend: mean + divisor: std + # clear target-inherited RandWeightedCropd; predict has no CPU augs + augmentations: [] + +trainer: + callbacks: + - class_path: viscy_utils.callbacks.prediction_writer.HCSPredictionWriter + init_args: + output_store: /hpc/projects/virtual_staining/training/dynacell/ipsc/predictions/memb_unetvit3d.zarr + +launcher: + job_name: UNetViT3D_PRED_MEMB + run_root: /hpc/projects/virtual_staining/training/dynacell/ipsc/predictions diff --git a/applications/dynacell/configs/benchmarks/virtual_staining/membrane/unetvit3d/ipsc_confocal/train.yml b/applications/dynacell/configs/benchmarks/virtual_staining/membrane/unetvit3d/ipsc_confocal/train.yml new file mode 100644 index 000000000..daf2651d4 --- /dev/null +++ b/applications/dynacell/configs/benchmarks/virtual_staining/membrane/unetvit3d/ipsc_confocal/train.yml @@ -0,0 +1,37 @@ +# UNetViT3D fit on membrane (Membrane channel of cell.zarr) — AICS iPSC confocal. 
+base: + - ../../../_internal/shared/model/train_sets/ipsc_confocal.yml + - ../../../_internal/shared/model/targets/membrane.yml + - ../../../_internal/shared/model/data_overlays/unetvit3d_fit.yml + - ../../../_internal/shared/model/model_overlays/unetvit3d_fit.yml + - ../../../_internal/shared/model/launcher_profiles/mode_fit.yml + - ../../../_internal/shared/model/launcher_profiles/hardware_h200_single.yml + - ../../../_internal/shared/model/launcher_profiles/runtime_shared.yml + +benchmark: + task: virtual_staining + organelle: membrane + train_set: ipsc_confocal + model_name: unetvit3d + experiment_id: membrane__ipsc_confocal__unetvit3d + +trainer: + logger: + init_args: + name: UNetViT3D_iPSC_MEMB + save_dir: /hpc/projects/comp.micro/virtual_staining/models/cell_diff_vs_viscy/ipsc/memb_temp/unetvit3d + callbacks: + - class_path: lightning.pytorch.callbacks.LearningRateMonitor + init_args: + logging_interval: step + - class_path: lightning.pytorch.callbacks.ModelCheckpoint + init_args: + monitor: loss/validate + every_n_epochs: 1 + save_top_k: 4 + save_last: true + dirpath: /hpc/projects/comp.micro/virtual_staining/models/cell_diff_vs_viscy/ipsc/memb_temp/unetvit3d/checkpoints + +launcher: + job_name: UNetViT3D_MEMB + run_root: /hpc/projects/comp.micro/virtual_staining/models/cell_diff_vs_viscy/ipsc/memb_temp/unetvit3d diff --git a/applications/dynacell/configs/benchmarks/virtual_staining/membrane/unetvit3d/joint_ipsc_confocal_a549_mantis/train.yml b/applications/dynacell/configs/benchmarks/virtual_staining/membrane/unetvit3d/joint_ipsc_confocal_a549_mantis/train.yml new file mode 100644 index 000000000..79679628f --- /dev/null +++ b/applications/dynacell/configs/benchmarks/virtual_staining/membrane/unetvit3d/joint_ipsc_confocal_a549_mantis/train.yml @@ -0,0 +1,143 @@ +# UNetViT3D fit on membrane (Membrane) — joint ipsc_confocal + a549_mantis pooled. +# +# Joint leaf per Stage 7 of A549_EXPANSION_ROADMAP.md. 
Uses +# BatchedConcatDataModule with two explicit HCSDataModule children +# (no benchmark.dataset_ref — joint leaves bypass the single-dataset +# resolver). Only model_overlays/unetvit3d_fit.yml is composed; the data +# block is authored inline because joint hparams live on the children. +# +# iPSC source is the multi-marker cell.zarr (Brightfield, Nuclei, +# Membrane, Phase3D); A549 source is the CAAX-marker pooled store +# CAAX_all.zarr. The shared target_channel name is `Membrane` in both. +# +# Topology: single H200, single GPU — same as unetvit3d/ipsc_confocal/train.yml. +# The paper baseline pattern is single-GPU and we keep that here so +# iPSC-only and joint runs are apples-to-apples. +base: + - ../../../_internal/shared/model/model_overlays/unetvit3d_fit.yml + - ../../../_internal/shared/model/launcher_profiles/mode_fit.yml + - ../../../_internal/shared/model/launcher_profiles/hardware_h200_single.yml + - ../../../_internal/shared/model/launcher_profiles/runtime_shared.yml + +benchmark: + task: virtual_staining + organelle: membrane + gene: Membrane + target: membrane + target_id: membrane + train_set: joint_ipsc_confocal_a549_mantis + model_name: unetvit3d + experiment_id: membrane__joint_ipsc_confocal_a549_mantis__unetvit3d + +trainer: + logger: + init_args: + name: UNetViT3D_JOINT_MEMB + save_dir: /hpc/projects/comp.micro/virtual_staining/models/cell_diff_vs_viscy/joint_ipsc_confocal_a549_mantis/memb/unetvit3d + callbacks: + - class_path: lightning.pytorch.callbacks.LearningRateMonitor + init_args: + logging_interval: step + - class_path: lightning.pytorch.callbacks.ModelCheckpoint + init_args: + monitor: loss/validate + every_n_epochs: 1 + save_top_k: 4 + save_last: true + dirpath: /hpc/projects/comp.micro/virtual_staining/models/cell_diff_vs_viscy/joint_ipsc_confocal_a549_mantis/memb/unetvit3d/checkpoints + +_hcs_init_args: &hcs_init_args + source_channel: Phase3D + target_channel: Membrane + z_window_size: 13 + batch_size: 4 + num_workers: 4 + 
yx_patch_size: [512, 512] + split_ratio: 0.8 + mmap_preload: true + scratch_dir: /dev/shm + persistent_workers: true + normalizations: + - class_path: viscy_transforms.NormalizeSampled + init_args: + keys: [Phase3D] + level: fov_statistics + subtrahend: mean + divisor: std + - class_path: viscy_transforms.NormalizeSampled + init_args: + keys: [Membrane] + level: fov_statistics + subtrahend: median + divisor: iqr + augmentations: + - class_path: viscy_transforms.RandWeightedCropd + init_args: + keys: [Phase3D, Membrane] + w_key: Membrane + spatial_size: [13, 624, 624] + num_samples: 2 + gpu_augmentations: + - class_path: viscy_transforms.BatchedRandAffined + init_args: + keys: [source, target] + prob: 0.8 + rotate_range: [3.14, 0, 0] + shear_range: [0.0, 0.05, 0.05] + scale_range: [[0.7, 1.3], [0.5, 1.5], [0.5, 1.5]] + safe_crop_size: [8, 512, 512] + safe_crop_coverage: 0.9 + - class_path: viscy_transforms.BatchedCenterSpatialCropd + init_args: + keys: [source, target] + roi_size: [8, 512, 512] + - class_path: viscy_transforms.BatchedRandAdjustContrastd + init_args: + keys: [source] + prob: 0.5 + gamma: [0.8, 1.2] + - class_path: viscy_transforms.BatchedRandScaleIntensityd + init_args: + keys: [source] + prob: 0.5 + factors: 0.5 + - class_path: viscy_transforms.BatchedRandGaussianNoised + init_args: + keys: [source] + prob: 0.5 + mean: 0.0 + std: 0.3 + - class_path: viscy_transforms.BatchedRandGaussianSmoothd + init_args: + keys: [source] + prob: 0.5 + sigma_x: [0.25, 0.75] + sigma_y: [0.25, 0.75] + sigma_z: [0.25, 0.75] + val_gpu_augmentations: + - class_path: viscy_transforms.BatchedCenterSpatialCropd + init_args: + keys: [source, target] + roi_size: [8, 512, 512] + +data: + class_path: viscy_data.BatchedConcatDataModule + init_args: + data_modules: + - class_path: viscy_data.hcs.HCSDataModule + init_args: + <<: *hcs_init_args + data_path: /hpc/projects/virtual_staining/training/dynacell/ipsc/dataset_v4/train/cell.zarr + - class_path: viscy_data.hcs.HCSDataModule 
+ init_args: + <<: *hcs_init_args + data_path: /hpc/projects/virtual_staining/training/dynacell/a549/mantis_v1/train/CAAX_all.zarr + +launcher: + job_name: UNetViT3D_JOINT_MEMB + run_root: /hpc/projects/comp.micro/virtual_staining/models/cell_diff_vs_viscy/joint_ipsc_confocal_a549_mantis/memb/unetvit3d + # Joint preloads two stores (iPSC + A549 pool) into /dev/shm; the default + # 256G cap is too tight (~256G iPSC preload + ~50G A549 + worker peak overruns it and OOMs). + # 512G is the smallest tier that fits joint preload + worker overhead. + sbatch: + mem: "512G" diff --git a/applications/dynacell/configs/benchmarks/virtual_staining/mito/celldiff/a549_mantis/train.yml b/applications/dynacell/configs/benchmarks/virtual_staining/mito/celldiff/a549_mantis/train.yml new file mode 100644 index 000000000..ce92a5ee6 --- /dev/null +++ b/applications/dynacell/configs/benchmarks/virtual_staining/mito/celldiff/a549_mantis/train.yml @@ -0,0 +1,42 @@ +# CellDiff fit on mitochondria (TOMM20 marker) — A549 mantis-lightsheet pooled (mock + DENV + ZIKV).
+base: + - ../../../_internal/shared/model/train_sets/a549_mantis.yml + - ../../../_internal/shared/model/targets/mito_tomm20.yml + - ../../../_internal/shared/model/data_overlays/celldiff_fit.yml + - ../../../_internal/shared/model/model_overlays/celldiff_fit.yml + - ../../../_internal/shared/model/launcher_profiles/mode_fit.yml + - ../../../_internal/shared/model/launcher_profiles/hardware_h200_single.yml + - ../../../_internal/shared/model/launcher_profiles/runtime_shared.yml + +benchmark: + task: virtual_staining + organelle: mito + train_set: a549_mantis + model_name: celldiff + experiment_id: mito__a549_mantis__celldiff + +trainer: + logger: + init_args: + name: CELLDiff_A549_TOMM20 + save_dir: /hpc/projects/comp.micro/virtual_staining/models/cell_diff_vs_viscy/a549_mantis/tomm20/celldiff + callbacks: + - class_path: lightning.pytorch.callbacks.LearningRateMonitor + init_args: + logging_interval: step + - class_path: lightning.pytorch.callbacks.ModelCheckpoint + init_args: + every_n_epochs: 1 + save_top_k: -1 + save_last: true + dirpath: /hpc/projects/comp.micro/virtual_staining/models/cell_diff_vs_viscy/a549_mantis/tomm20/celldiff/checkpoints + +data: + init_args: + # A549 pooled store + target_channel — no resolver in this train_set. 
+ target_channel: Structure + data_path: /hpc/projects/virtual_staining/training/dynacell/a549/mantis_v1/train/TOMM20_all.zarr + +launcher: + job_name: CELLDiff_A549_TOMM20 + run_root: /hpc/projects/comp.micro/virtual_staining/models/cell_diff_vs_viscy/a549_mantis/tomm20/celldiff diff --git a/applications/dynacell/configs/benchmarks/virtual_staining/mito/celldiff/ipsc_confocal/eval__a549_mantis_denv.yaml b/applications/dynacell/configs/benchmarks/virtual_staining/mito/celldiff/ipsc_confocal/eval__a549_mantis_denv.yaml new file mode 100644 index 000000000..894495d04 --- /dev/null +++ b/applications/dynacell/configs/benchmarks/virtual_staining/mito/celldiff/ipsc_confocal/eval__a549_mantis_denv.yaml @@ -0,0 +1,15 @@ +# @package _global_ +# Benchmark eval leaf: Mitochondria (TOMM20) predicted by CellDiff on a549-mantis-tomm20-denv. +defaults: + - override /target: mito_tomm20 + - override /predict_set: a549_mantis_tomm20_denv + +io: + pred_path: /hpc/projects/virtual_staining/training/dynacell/a549/predictions/tomm20_celldiff_iterative__tomm20_denv.zarr + +# A549 manifests don't carry cell_segmentation paths (no segmentation +# pipeline yet). Skip feature metrics until segmentation lands. +compute_feature_metrics: false + +save: + save_dir: /hpc/projects/virtual_staining/training/dynacell/a549/predictions/eval_tomm20_celldiff_iterative__tomm20_denv diff --git a/applications/dynacell/configs/benchmarks/virtual_staining/mito/celldiff/ipsc_confocal/eval__a549_mantis_mock.yaml b/applications/dynacell/configs/benchmarks/virtual_staining/mito/celldiff/ipsc_confocal/eval__a549_mantis_mock.yaml new file mode 100644 index 000000000..f66818990 --- /dev/null +++ b/applications/dynacell/configs/benchmarks/virtual_staining/mito/celldiff/ipsc_confocal/eval__a549_mantis_mock.yaml @@ -0,0 +1,15 @@ +# @package _global_ +# Benchmark eval leaf: Mitochondria (TOMM20) predicted by CellDiff on a549-mantis-tomm20-mock. 
+defaults: + - override /target: mito_tomm20 + - override /predict_set: a549_mantis_tomm20_mock + +io: + pred_path: /hpc/projects/virtual_staining/training/dynacell/a549/predictions/tomm20_celldiff_iterative__tomm20_mock.zarr + +# A549 manifests don't carry cell_segmentation paths (no segmentation +# pipeline yet). Skip feature metrics until segmentation lands. +compute_feature_metrics: false + +save: + save_dir: /hpc/projects/virtual_staining/training/dynacell/a549/predictions/eval_tomm20_celldiff_iterative__tomm20_mock diff --git a/applications/dynacell/configs/benchmarks/virtual_staining/mito/celldiff/ipsc_confocal/eval__a549_mantis_zikv.yaml b/applications/dynacell/configs/benchmarks/virtual_staining/mito/celldiff/ipsc_confocal/eval__a549_mantis_zikv.yaml new file mode 100644 index 000000000..958188888 --- /dev/null +++ b/applications/dynacell/configs/benchmarks/virtual_staining/mito/celldiff/ipsc_confocal/eval__a549_mantis_zikv.yaml @@ -0,0 +1,15 @@ +# @package _global_ +# Benchmark eval leaf: Mitochondria (TOMM20) predicted by CellDiff on a549-mantis-tomm20-zikv. +defaults: + - override /target: mito_tomm20 + - override /predict_set: a549_mantis_tomm20_zikv + +io: + pred_path: /hpc/projects/virtual_staining/training/dynacell/a549/predictions/tomm20_celldiff_iterative__tomm20_zikv.zarr + +# A549 manifests don't carry cell_segmentation paths (no segmentation +# pipeline yet). Skip feature metrics until segmentation lands. 
+compute_feature_metrics: false + +save: + save_dir: /hpc/projects/virtual_staining/training/dynacell/a549/predictions/eval_tomm20_celldiff_iterative__tomm20_zikv diff --git a/applications/dynacell/configs/benchmarks/virtual_staining/mito/celldiff/ipsc_confocal/eval__ipsc_confocal.yaml b/applications/dynacell/configs/benchmarks/virtual_staining/mito/celldiff/ipsc_confocal/eval__ipsc_confocal.yaml new file mode 100644 index 000000000..7d72d571c --- /dev/null +++ b/applications/dynacell/configs/benchmarks/virtual_staining/mito/celldiff/ipsc_confocal/eval__ipsc_confocal.yaml @@ -0,0 +1,13 @@ +# @package _global_ +# Benchmark eval leaf: Mitochondria (TOMM20) predicted by CellDiff on iPSC confocal. +defaults: + - override /target: mito_tomm20 + - override /predict_set: ipsc_confocal + +io: + pred_path: /hpc/projects/virtual_staining/training/dynacell/ipsc/predictions/tomm20_celldiff_iterative.zarr + +compute_feature_metrics: true + +save: + save_dir: /hpc/projects/virtual_staining/training/dynacell/ipsc/predictions/eval_tomm20_celldiff_iterative diff --git a/applications/dynacell/configs/benchmarks/virtual_staining/mito/celldiff/ipsc_confocal/predict__a549_mantis_denv.yml b/applications/dynacell/configs/benchmarks/virtual_staining/mito/celldiff/ipsc_confocal/predict__a549_mantis_denv.yml new file mode 100644 index 000000000..afab52a3b --- /dev/null +++ b/applications/dynacell/configs/benchmarks/virtual_staining/mito/celldiff/ipsc_confocal/predict__a549_mantis_denv.yml @@ -0,0 +1,44 @@ +# CellDiff predict: mito (TOMM20) trained on iPSC, predicting against a549_mantis_tomm20_denv test. 
+base: + - ../../../_internal/shared/model/predict_sets/a549_mantis_tomm20_denv.yml + - ../../../_internal/shared/model/targets/mito_tomm20.yml + - ../../../_internal/shared/model/model_overlays/celldiff_predict.yml + - ../../../_internal/shared/model/launcher_profiles/mode_predict.yml + - ../../../_internal/shared/model/launcher_profiles/hardware_h200_single.yml + - ../../../_internal/shared/model/launcher_profiles/runtime_shared.yml + +benchmark: + task: virtual_staining + organelle: mito + trained_on: ipsc_confocal + predict_set: a549_mantis_tomm20_denv + model_name: celldiff + experiment_id: mito__ipsc_confocal__celldiff__a549_mantis_tomm20_denv + +model: + init_args: + ckpt_path: /hpc/projects/comp.micro/virtual_staining/models/cell_diff_vs_viscy/ipsc/tomm20/celldiff/checkpoints/last.ckpt + predict_method: iterative # denoise, generate, sliding_window, or iterative + predict_overlap: [4, 256, 256] + +data: + init_args: + normalizations: + - class_path: viscy_transforms.NormalizeSampled + init_args: + keys: [Phase3D] + level: fov_statistics + subtrahend: mean + divisor: std + augmentations: [] + z_window_size: 48 # 8 for denoise and generate, 40 for iterative and sliding_window. 
+ +trainer: + callbacks: + - class_path: viscy_utils.callbacks.prediction_writer.HCSPredictionWriter + init_args: + output_store: /hpc/projects/virtual_staining/training/dynacell/a549/predictions/tomm20_celldiff_iterative__tomm20_denv.zarr + +launcher: + job_name: CELLDiff_PRED_TOMM20_ON_A549_tomm20_denv + run_root: /hpc/projects/virtual_staining/training/dynacell/a549/predictions diff --git a/applications/dynacell/configs/benchmarks/virtual_staining/mito/celldiff/ipsc_confocal/predict__a549_mantis_mock.yml b/applications/dynacell/configs/benchmarks/virtual_staining/mito/celldiff/ipsc_confocal/predict__a549_mantis_mock.yml new file mode 100644 index 000000000..fef402aff --- /dev/null +++ b/applications/dynacell/configs/benchmarks/virtual_staining/mito/celldiff/ipsc_confocal/predict__a549_mantis_mock.yml @@ -0,0 +1,44 @@ +# CellDiff predict: mito (TOMM20) trained on iPSC, predicting against a549_mantis_tomm20_mock test. +base: + - ../../../_internal/shared/model/predict_sets/a549_mantis_tomm20_mock.yml + - ../../../_internal/shared/model/targets/mito_tomm20.yml + - ../../../_internal/shared/model/model_overlays/celldiff_predict.yml + - ../../../_internal/shared/model/launcher_profiles/mode_predict.yml + - ../../../_internal/shared/model/launcher_profiles/hardware_h200_single.yml + - ../../../_internal/shared/model/launcher_profiles/runtime_shared.yml + +benchmark: + task: virtual_staining + organelle: mito + trained_on: ipsc_confocal + predict_set: a549_mantis_tomm20_mock + model_name: celldiff + experiment_id: mito__ipsc_confocal__celldiff__a549_mantis_tomm20_mock + +model: + init_args: + ckpt_path: /hpc/projects/comp.micro/virtual_staining/models/cell_diff_vs_viscy/ipsc/tomm20/celldiff/checkpoints/last.ckpt + predict_method: iterative # denoise, generate, sliding_window, or iterative + predict_overlap: [4, 256, 256] + +data: + init_args: + normalizations: + - class_path: viscy_transforms.NormalizeSampled + init_args: + keys: [Phase3D] + level: fov_statistics + 
subtrahend: mean + divisor: std + augmentations: [] + z_window_size: 48 # 8 for denoise and generate, 40 for iterative and sliding_window. + +trainer: + callbacks: + - class_path: viscy_utils.callbacks.prediction_writer.HCSPredictionWriter + init_args: + output_store: /hpc/projects/virtual_staining/training/dynacell/a549/predictions/tomm20_celldiff_iterative__tomm20_mock.zarr + +launcher: + job_name: CELLDiff_PRED_TOMM20_ON_A549_tomm20_mock + run_root: /hpc/projects/virtual_staining/training/dynacell/a549/predictions diff --git a/applications/dynacell/configs/benchmarks/virtual_staining/mito/celldiff/ipsc_confocal/predict__a549_mantis_zikv.yml b/applications/dynacell/configs/benchmarks/virtual_staining/mito/celldiff/ipsc_confocal/predict__a549_mantis_zikv.yml new file mode 100644 index 000000000..dfa488a36 --- /dev/null +++ b/applications/dynacell/configs/benchmarks/virtual_staining/mito/celldiff/ipsc_confocal/predict__a549_mantis_zikv.yml @@ -0,0 +1,44 @@ +# CellDiff predict: mito (TOMM20) trained on iPSC, predicting against a549_mantis_tomm20_zikv test. 
+base: + - ../../../_internal/shared/model/predict_sets/a549_mantis_tomm20_zikv.yml + - ../../../_internal/shared/model/targets/mito_tomm20.yml + - ../../../_internal/shared/model/model_overlays/celldiff_predict.yml + - ../../../_internal/shared/model/launcher_profiles/mode_predict.yml + - ../../../_internal/shared/model/launcher_profiles/hardware_h200_single.yml + - ../../../_internal/shared/model/launcher_profiles/runtime_shared.yml + +benchmark: + task: virtual_staining + organelle: mito + trained_on: ipsc_confocal + predict_set: a549_mantis_tomm20_zikv + model_name: celldiff + experiment_id: mito__ipsc_confocal__celldiff__a549_mantis_tomm20_zikv + +model: + init_args: + ckpt_path: /hpc/projects/comp.micro/virtual_staining/models/cell_diff_vs_viscy/ipsc/tomm20/celldiff/checkpoints/last.ckpt + predict_method: iterative # denoise, generate, sliding_window, or iterative + predict_overlap: [4, 256, 256] + +data: + init_args: + normalizations: + - class_path: viscy_transforms.NormalizeSampled + init_args: + keys: [Phase3D] + level: fov_statistics + subtrahend: mean + divisor: std + augmentations: [] + z_window_size: 48 # 8 for denoise and generate, 40 for iterative and sliding_window. 
+ +trainer: + callbacks: + - class_path: viscy_utils.callbacks.prediction_writer.HCSPredictionWriter + init_args: + output_store: /hpc/projects/virtual_staining/training/dynacell/a549/predictions/tomm20_celldiff_iterative__tomm20_zikv.zarr + +launcher: + job_name: CELLDiff_PRED_TOMM20_ON_A549_tomm20_zikv + run_root: /hpc/projects/virtual_staining/training/dynacell/a549/predictions diff --git a/applications/dynacell/configs/benchmarks/virtual_staining/mito/celldiff/ipsc_confocal/predict__ipsc_confocal.yml b/applications/dynacell/configs/benchmarks/virtual_staining/mito/celldiff/ipsc_confocal/predict__ipsc_confocal.yml new file mode 100644 index 000000000..5b228321f --- /dev/null +++ b/applications/dynacell/configs/benchmarks/virtual_staining/mito/celldiff/ipsc_confocal/predict__ipsc_confocal.yml @@ -0,0 +1,44 @@ +# CellDiff predict: mito (TOMM20) against ipsc_confocal test_cropped. +base: + - ../../../_internal/shared/model/predict_sets/ipsc_confocal.yml + - ../../../_internal/shared/model/targets/mito_tomm20.yml + - ../../../_internal/shared/model/model_overlays/celldiff_predict.yml + - ../../../_internal/shared/model/launcher_profiles/mode_predict.yml + - ../../../_internal/shared/model/launcher_profiles/hardware_h200_single.yml + - ../../../_internal/shared/model/launcher_profiles/runtime_shared.yml + +benchmark: + task: virtual_staining + organelle: mito + trained_on: ipsc_confocal + predict_set: ipsc_confocal + model_name: celldiff + experiment_id: mito__ipsc_confocal__celldiff__ipsc_confocal + +model: + init_args: + ckpt_path: /hpc/projects/comp.micro/virtual_staining/models/cell_diff_vs_viscy/ipsc/tomm20/celldiff/checkpoints/last.ckpt + predict_method: iterative # denoise, generate, sliding_window, or iterative + predict_overlap: [4, 256, 256] + +data: + init_args: + normalizations: + - class_path: viscy_transforms.NormalizeSampled + init_args: + keys: [Phase3D] + level: fov_statistics + subtrahend: mean + divisor: std + augmentations: [] + z_window_size: 40 
# 8 for denoise and generate, 40 for iterative and sliding_window. + +trainer: + callbacks: + - class_path: viscy_utils.callbacks.prediction_writer.HCSPredictionWriter + init_args: + output_store: /hpc/projects/virtual_staining/training/dynacell/ipsc/predictions/tomm20_celldiff_iterative.zarr + +launcher: + job_name: CELLDiff_PRED_TOMM20 + run_root: /hpc/projects/virtual_staining/training/dynacell/ipsc/predictions diff --git a/applications/dynacell/configs/benchmarks/virtual_staining/mito/celldiff/ipsc_confocal/train.yml b/applications/dynacell/configs/benchmarks/virtual_staining/mito/celldiff/ipsc_confocal/train.yml new file mode 100644 index 000000000..30c92cc5e --- /dev/null +++ b/applications/dynacell/configs/benchmarks/virtual_staining/mito/celldiff/ipsc_confocal/train.yml @@ -0,0 +1,36 @@ +# CellDiff fit on mitochondria (TOMM20 marker) — AICS iPSC confocal. +base: + - ../../../_internal/shared/model/train_sets/ipsc_confocal.yml + - ../../../_internal/shared/model/targets/mito_tomm20.yml + - ../../../_internal/shared/model/data_overlays/celldiff_fit.yml + - ../../../_internal/shared/model/model_overlays/celldiff_fit.yml + - ../../../_internal/shared/model/launcher_profiles/mode_fit.yml + - ../../../_internal/shared/model/launcher_profiles/hardware_h200_single.yml + - ../../../_internal/shared/model/launcher_profiles/runtime_shared.yml + +benchmark: + task: virtual_staining + organelle: mito + train_set: ipsc_confocal + model_name: celldiff + experiment_id: mito__ipsc_confocal__celldiff + +trainer: + logger: + init_args: + name: CELLDiff_iPSC_TOMM20 + save_dir: /hpc/projects/comp.micro/virtual_staining/models/cell_diff_vs_viscy/ipsc/tomm20/celldiff + callbacks: + - class_path: lightning.pytorch.callbacks.LearningRateMonitor + init_args: + logging_interval: step + - class_path: lightning.pytorch.callbacks.ModelCheckpoint + init_args: + every_n_epochs: 1 + save_top_k: -1 + save_last: true + dirpath: 
/hpc/projects/comp.micro/virtual_staining/models/cell_diff_vs_viscy/ipsc/tomm20/celldiff/checkpoints + +launcher: + job_name: CELLDiff_TOMM20 + run_root: /hpc/projects/comp.micro/virtual_staining/models/cell_diff_vs_viscy/ipsc/tomm20/celldiff diff --git a/applications/dynacell/configs/benchmarks/virtual_staining/mito/celldiff/joint_ipsc_confocal_a549_mantis/train.yml b/applications/dynacell/configs/benchmarks/virtual_staining/mito/celldiff/joint_ipsc_confocal_a549_mantis/train.yml new file mode 100644 index 000000000..be0a43df0 --- /dev/null +++ b/applications/dynacell/configs/benchmarks/virtual_staining/mito/celldiff/joint_ipsc_confocal_a549_mantis/train.yml @@ -0,0 +1,142 @@ +# CellDiff fit on mitochondria (TOMM20) — joint ipsc_confocal + a549_mantis pooled. +# +# Joint leaf per Stage 7 of A549_EXPANSION_ROADMAP.md. Uses +# BatchedConcatDataModule with two explicit HCSDataModule children +# (no benchmark.dataset_ref — joint leaves bypass the single-dataset +# resolver). Only model_overlays/celldiff_fit.yml is composed; the data +# block is authored inline because joint hparams live on the children. +# +# Topology: single H200, single GPU — same as celldiff/ipsc_confocal/train.yml. +# The paper baseline pattern is single-GPU and we keep that here so +# iPSC-only and joint runs are apples-to-apples. 
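The header comment above describes how a joint leaf composes a `base:` list of shared overlay files under leaf-local overrides. As a rough sketch of that layered composition, assuming a last-wins recursive dict merge (the repo's actual entry point is `load_composed_config`, which this does not reproduce):

```python
# Minimal sketch of layered config composition: fold base overlays first,
# leaf last, merging nested dicts recursively with the later layer winning.
# Illustrative only; not the real load_composed_config implementation.
def deep_merge(base: dict, overlay: dict) -> dict:
    """Recursively merge `overlay` onto `base`; overlay wins on conflicts."""
    merged = dict(base)
    for key, value in overlay.items():
        if isinstance(value, dict) and isinstance(merged.get(key), dict):
            merged[key] = deep_merge(merged[key], value)
        else:
            merged[key] = value
    return merged


def compose(layers: list[dict]) -> dict:
    """Fold config layers left to right (bases first, leaf config last)."""
    result: dict = {}
    for layer in layers:
        result = deep_merge(result, layer)
    return result
```

With this shape, a leaf overriding `trainer.logger.init_args.name` keeps every sibling key contributed by the hardware and runtime overlays.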
+base: + - ../../../_internal/shared/model/model_overlays/celldiff_fit.yml + - ../../../_internal/shared/model/launcher_profiles/mode_fit.yml + - ../../../_internal/shared/model/launcher_profiles/hardware_h200_single.yml + - ../../../_internal/shared/model/launcher_profiles/runtime_shared.yml + +benchmark: + task: virtual_staining + organelle: mito + gene: TOMM20 + target: mito + target_id: mito_tomm20 + train_set: joint_ipsc_confocal_a549_mantis + model_name: celldiff + experiment_id: mito__joint_ipsc_confocal_a549_mantis__celldiff + +trainer: + logger: + init_args: + name: CELLDiff_JOINT_TOMM20 + save_dir: /hpc/projects/comp.micro/virtual_staining/models/cell_diff_vs_viscy/joint_ipsc_confocal_a549_mantis/tomm20/celldiff + callbacks: + - class_path: lightning.pytorch.callbacks.LearningRateMonitor + init_args: + logging_interval: step + - class_path: lightning.pytorch.callbacks.ModelCheckpoint + init_args: + every_n_epochs: 1 + save_top_k: -1 + save_last: true + dirpath: /hpc/projects/comp.micro/virtual_staining/models/cell_diff_vs_viscy/joint_ipsc_confocal_a549_mantis/tomm20/celldiff/checkpoints + +# `_`-prefixed top-level keys are stripped by load_composed_config; see +# er/celldiff/joint_*/train.yml for the full anchor-convention rationale. 
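The anchor convention noted in the comment above can be exercised end to end with PyYAML, which expands `<<: *anchor` merge keys at parse time. The stripping of `_`-prefixed top-level keys below is a sketch of the convention only, not the actual `load_composed_config` logic; the miniature document and its paths are invented for illustration:

```python
import yaml

# A `_`-prefixed top-level key hosts a YAML anchor; each child data module
# merges it in via `<<: *anchor` and overrides only its own data_path.
doc = """
_hcs_init_args: &hcs_init_args
  batch_size: 4
  num_workers: 4
data:
  modules:
    - <<: *hcs_init_args
      data_path: /store/a.zarr
    - <<: *hcs_init_args
      data_path: /store/b.zarr
"""
cfg = yaml.safe_load(doc)  # SafeLoader expands merge keys here

# Drop the scaffolding key after parsing so it never reaches the CLI parser,
# mirroring (not reproducing) what load_composed_config does.
cfg = {k: v for k, v in cfg.items() if not k.startswith("_")}
```

After parsing, both children carry the shared hparams plus their own `data_path`, and the `_hcs_init_args` scaffolding is gone.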
+_hcs_init_args: &hcs_init_args + source_channel: Phase3D + target_channel: Structure + z_window_size: 13 + batch_size: 4 + num_workers: 4 + yx_patch_size: [512, 512] + split_ratio: 0.8 + mmap_preload: true + scratch_dir: /dev/shm + persistent_workers: true + normalizations: + - class_path: viscy_transforms.NormalizeSampled + init_args: + keys: [Phase3D] + level: fov_statistics + subtrahend: mean + divisor: std + - class_path: viscy_transforms.NormalizeSampled + init_args: + keys: [Structure] + level: fov_statistics + subtrahend: median + divisor: iqr + augmentations: + - class_path: viscy_transforms.RandWeightedCropd + init_args: + keys: [Phase3D, Structure] + w_key: Structure + spatial_size: [13, 624, 624] + num_samples: 2 + gpu_augmentations: + - class_path: viscy_transforms.BatchedRandAffined + init_args: + keys: [source, target] + prob: 0.8 + rotate_range: [3.14, 0, 0] + shear_range: [0.0, 0.05, 0.05] + scale_range: [[0.7, 1.3], [0.5, 1.5], [0.5, 1.5]] + safe_crop_size: [8, 512, 512] + safe_crop_coverage: 0.9 + - class_path: viscy_transforms.BatchedCenterSpatialCropd + init_args: + keys: [source, target] + roi_size: [8, 512, 512] + - class_path: viscy_transforms.BatchedRandAdjustContrastd + init_args: + keys: [source] + prob: 0.5 + gamma: [0.8, 1.2] + - class_path: viscy_transforms.BatchedRandScaleIntensityd + init_args: + keys: [source] + prob: 0.5 + factors: 0.5 + - class_path: viscy_transforms.BatchedRandGaussianNoised + init_args: + keys: [source] + prob: 0.5 + mean: 0.0 + std: 0.3 + - class_path: viscy_transforms.BatchedRandGaussianSmoothd + init_args: + keys: [source] + prob: 0.5 + sigma_x: [0.25, 0.75] + sigma_y: [0.25, 0.75] + sigma_z: [0.25, 0.75] + val_gpu_augmentations: + - class_path: viscy_transforms.BatchedCenterSpatialCropd + init_args: + keys: [source, target] + roi_size: [8, 512, 512] + +data: + class_path: viscy_data.BatchedConcatDataModule + init_args: + data_modules: + # ipsc_confocal — aics-hipsc TOMM20 train store + - class_path: 
viscy_data.hcs.HCSDataModule + init_args: + <<: *hcs_init_args + data_path: /hpc/projects/virtual_staining/training/dynacell/ipsc/dataset_v4/train/TOMM20.zarr + # a549_mantis — pooled TOMM20 all-conditions train store + - class_path: viscy_data.hcs.HCSDataModule + init_args: + <<: *hcs_init_args + data_path: /hpc/projects/virtual_staining/training/dynacell/a549/mantis_v1/train/TOMM20_all.zarr + +launcher: + job_name: CELLDiff_JOINT_TOMM20 + run_root: /hpc/projects/comp.micro/virtual_staining/models/cell_diff_vs_viscy/joint_ipsc_confocal_a549_mantis/tomm20/celldiff + # Joint preloads two stores (iPSC + A549 pool) into /dev/shm; the default + # 256G cap is too tight (256G iPSC mem + ~50G A549 + worker peak OOMs). + # 512G is the smallest tier that fits joint preload + worker overhead. + sbatch: + mem: "512G" diff --git a/applications/dynacell/configs/benchmarks/virtual_staining/mito/fcmae_vscyto3d_pretrained/a549_mantis/train.yml b/applications/dynacell/configs/benchmarks/virtual_staining/mito/fcmae_vscyto3d_pretrained/a549_mantis/train.yml new file mode 100644 index 000000000..f19f0f1a6 --- /dev/null +++ b/applications/dynacell/configs/benchmarks/virtual_staining/mito/fcmae_vscyto3d_pretrained/a549_mantis/train.yml @@ -0,0 +1,55 @@ +# FCMAE-class (FullyConvolutionalMAE, pretraining=False) with FCMAE- +# pretrained encoder init on mito/TOMM20. Companion to +# fcmae_vscyto3d_scratch.yml — the two leaves are identical except this +# one loads encoder weights from the published VSCyto3D FCMAE ckpt +# (400 ep on HEK + A549 + iPSC phase data). Mirrors +# er/ipsc_confocal/fcmae_vscyto3d_pretrained.yml. 
+base: + - ../../../_internal/shared/model/train_sets/a549_mantis.yml + - ../../../_internal/shared/model/targets/mito_tomm20.yml + - ../../../_internal/shared/model/data_overlays/fcmae_vscyto3d_fit.yml + - ../../../_internal/shared/model/model_overlays/fcmae_vscyto3d_fit.yml + - ../../../_internal/shared/model/launcher_profiles/mode_fit.yml + - ../../../_internal/shared/model/launcher_profiles/hardware_4gpu.yml + - ../../../_internal/shared/model/launcher_profiles/runtime_shared.yml + +benchmark: + task: virtual_staining + organelle: mito + train_set: a549_mantis + model_name: fcmae_vscyto3d_pretrained + experiment_id: mito__a549_mantis__fcmae_vscyto3d_pretrained + +model: + init_args: + # Load only the encoder from the canonical VSCyto3D FCMAE ckpt — + # decoder/head stay at fresh init. Matches vs_test/finetune_3d.py:247. + encoder_only: true + ckpt_path: /hpc/projects/virtual_staining/models/mehta-lab/VSCyto3D/fcmae.ckpt + +trainer: + logger: + init_args: + name: FCMAE_VSCyto3D_Pretrained_A549_TOMM20_ws8500 + save_dir: /hpc/projects/comp.micro/virtual_staining/models/dynacell/a549_mantis/tomm20/fcmae_vscyto3d_pretrained_ws8500 + callbacks: + - class_path: lightning.pytorch.callbacks.LearningRateMonitor + init_args: + logging_interval: step + - class_path: lightning.pytorch.callbacks.ModelCheckpoint + init_args: + monitor: loss/validate + every_n_epochs: 1 + save_top_k: 5 + save_last: true + dirpath: /hpc/projects/comp.micro/virtual_staining/models/dynacell/a549_mantis/tomm20/fcmae_vscyto3d_pretrained_ws8500/checkpoints + +data: + init_args: + # A549 pooled store + target_channel — no resolver in this train_set. 
+ target_channel: Structure + data_path: /hpc/projects/virtual_staining/training/dynacell/a549/mantis_v1/train/TOMM20_all.zarr + +launcher: + job_name: FCMAE_VSCyto3D_Pretrained_A549_TOMM20_ws8500 + run_root: /hpc/projects/comp.micro/virtual_staining/models/dynacell/a549_mantis/tomm20/fcmae_vscyto3d_pretrained_ws8500 diff --git a/applications/dynacell/configs/benchmarks/virtual_staining/mito/fcmae_vscyto3d_pretrained/ipsc_confocal/eval__a549_mantis_denv.yaml b/applications/dynacell/configs/benchmarks/virtual_staining/mito/fcmae_vscyto3d_pretrained/ipsc_confocal/eval__a549_mantis_denv.yaml new file mode 100644 index 000000000..5a4030259 --- /dev/null +++ b/applications/dynacell/configs/benchmarks/virtual_staining/mito/fcmae_vscyto3d_pretrained/ipsc_confocal/eval__a549_mantis_denv.yaml @@ -0,0 +1,15 @@ +# @package _global_ +# Benchmark eval leaf: Mitochondria (TOMM20) predicted by FCMAE_VSCyto3D_Pretrained on a549-mantis-tomm20-denv. +defaults: + - override /target: mito_tomm20 + - override /predict_set: a549_mantis_tomm20_denv + +io: + pred_path: /hpc/projects/virtual_staining/training/dynacell/a549/predictions/tomm20_fcmae_vscyto3d_pretrained__tomm20_denv.zarr + +# A549 manifests don't carry cell_segmentation paths (no segmentation +# pipeline yet). Skip feature metrics until segmentation lands. 
+compute_feature_metrics: false + +save: + save_dir: /hpc/projects/virtual_staining/training/dynacell/a549/predictions/eval_tomm20_fcmae_vscyto3d_pretrained__tomm20_denv diff --git a/applications/dynacell/configs/benchmarks/virtual_staining/mito/fcmae_vscyto3d_pretrained/ipsc_confocal/eval__a549_mantis_mock.yaml b/applications/dynacell/configs/benchmarks/virtual_staining/mito/fcmae_vscyto3d_pretrained/ipsc_confocal/eval__a549_mantis_mock.yaml new file mode 100644 index 000000000..37f1b4e65 --- /dev/null +++ b/applications/dynacell/configs/benchmarks/virtual_staining/mito/fcmae_vscyto3d_pretrained/ipsc_confocal/eval__a549_mantis_mock.yaml @@ -0,0 +1,15 @@ +# @package _global_ +# Benchmark eval leaf: Mitochondria (TOMM20) predicted by FCMAE_VSCyto3D_Pretrained on a549-mantis-tomm20-mock. +defaults: + - override /target: mito_tomm20 + - override /predict_set: a549_mantis_tomm20_mock + +io: + pred_path: /hpc/projects/virtual_staining/training/dynacell/a549/predictions/tomm20_fcmae_vscyto3d_pretrained__tomm20_mock.zarr + +# A549 manifests don't carry cell_segmentation paths (no segmentation +# pipeline yet). Skip feature metrics until segmentation lands. +compute_feature_metrics: false + +save: + save_dir: /hpc/projects/virtual_staining/training/dynacell/a549/predictions/eval_tomm20_fcmae_vscyto3d_pretrained__tomm20_mock diff --git a/applications/dynacell/configs/benchmarks/virtual_staining/mito/fcmae_vscyto3d_pretrained/ipsc_confocal/eval__a549_mantis_zikv.yaml b/applications/dynacell/configs/benchmarks/virtual_staining/mito/fcmae_vscyto3d_pretrained/ipsc_confocal/eval__a549_mantis_zikv.yaml new file mode 100644 index 000000000..3b3f2e993 --- /dev/null +++ b/applications/dynacell/configs/benchmarks/virtual_staining/mito/fcmae_vscyto3d_pretrained/ipsc_confocal/eval__a549_mantis_zikv.yaml @@ -0,0 +1,15 @@ +# @package _global_ +# Benchmark eval leaf: Mitochondria (TOMM20) predicted by FCMAE_VSCyto3D_Pretrained on a549-mantis-tomm20-zikv. 
+defaults: + - override /target: mito_tomm20 + - override /predict_set: a549_mantis_tomm20_zikv + +io: + pred_path: /hpc/projects/virtual_staining/training/dynacell/a549/predictions/tomm20_fcmae_vscyto3d_pretrained__tomm20_zikv.zarr + +# A549 manifests don't carry cell_segmentation paths (no segmentation +# pipeline yet). Skip feature metrics until segmentation lands. +compute_feature_metrics: false + +save: + save_dir: /hpc/projects/virtual_staining/training/dynacell/a549/predictions/eval_tomm20_fcmae_vscyto3d_pretrained__tomm20_zikv diff --git a/applications/dynacell/configs/benchmarks/virtual_staining/mito/fcmae_vscyto3d_pretrained/ipsc_confocal/predict__a549_mantis_denv.yml b/applications/dynacell/configs/benchmarks/virtual_staining/mito/fcmae_vscyto3d_pretrained/ipsc_confocal/predict__a549_mantis_denv.yml new file mode 100644 index 000000000..3477eb06c --- /dev/null +++ b/applications/dynacell/configs/benchmarks/virtual_staining/mito/fcmae_vscyto3d_pretrained/ipsc_confocal/predict__a549_mantis_denv.yml @@ -0,0 +1,47 @@ +# FCMAE_VSCyto3D_Pretrained predict: mito (TOMM20) trained on iPSC, +# predicting against a549_mantis_tomm20_denv test. +# +# Pinned to best-val checkpoint from training run J31523064 (ws8500 +# variant; val 0.5543, epoch 58). Run cancelled at epoch 92 — val +# plateaued at epoch 58 and never recovered (~34 epochs without +# improvement, drifting up in last 5 epochs). 
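The checkpoint pinned above comes from a training leaf that sets `encoder_only: true`, loading only encoder weights from the published FCMAE checkpoint while the decoder and head stay at fresh init. A hedged sketch of that pattern, where the helper name, key prefixes, and tiny model are illustrative assumptions rather than the repo's actual API:

```python
import torch


def encoder_only_init(model: torch.nn.Module, state_dict: dict) -> list[str]:
    """Copy only encoder.* weights into `model`; all other submodules keep
    their fresh initialization. Sketch of the `encoder_only: true` idea;
    the "encoder." prefix is an assumption, not a verified key name."""
    encoder_state = {
        k: v for k, v in state_dict.items() if k.startswith("encoder.")
    }
    # strict=False tolerates the deliberately missing decoder/head keys.
    missing, unexpected = model.load_state_dict(encoder_state, strict=False)
    return missing  # decoder/head keys, reported but expected


class TinyNet(torch.nn.Module):
    """Toy stand-in with an encoder and a decoder submodule."""

    def __init__(self) -> None:
        super().__init__()
        self.encoder = torch.nn.Linear(2, 2, bias=False)
        self.decoder = torch.nn.Linear(2, 2, bias=False)
```

Filtering before `load_state_dict` (rather than relying on `strict=False` alone) also guards against a checkpoint's decoder weights silently overwriting the fresh head.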
+base: + - ../../../_internal/shared/model/predict_sets/a549_mantis_tomm20_denv.yml + - ../../../_internal/shared/model/targets/mito_tomm20.yml + - ../../../_internal/shared/model/model_overlays/fcmae_vscyto3d_predict.yml + - ../../../_internal/shared/model/launcher_profiles/mode_predict.yml + - ../../../_internal/shared/model/launcher_profiles/hardware_h200_single.yml + - ../../../_internal/shared/model/launcher_profiles/runtime_shared.yml + +benchmark: + task: virtual_staining + organelle: mito + trained_on: ipsc_confocal + predict_set: a549_mantis_tomm20_denv + model_name: fcmae_vscyto3d_pretrained + experiment_id: mito__ipsc_confocal__fcmae_vscyto3d_pretrained__a549_mantis_tomm20_denv + +model: + init_args: + ckpt_path: /hpc/projects/comp.micro/virtual_staining/models/dynacell/ipsc/tomm20/fcmae_vscyto3d_pretrained_ws8500/checkpoints/epoch=58-step=18408.ckpt + +data: + init_args: + normalizations: + - class_path: viscy_transforms.NormalizeSampled + init_args: + keys: [Phase3D] + level: fov_statistics + subtrahend: mean + divisor: std + augmentations: [] + +trainer: + callbacks: + - class_path: viscy_utils.callbacks.prediction_writer.HCSPredictionWriter + init_args: + output_store: /hpc/projects/virtual_staining/training/dynacell/a549/predictions/tomm20_fcmae_vscyto3d_pretrained__tomm20_denv.zarr + +launcher: + job_name: FCMAE_VSCyto3D_Pretrained_PRED_TOMM20_ON_A549_tomm20_denv + run_root: /hpc/projects/virtual_staining/training/dynacell/a549/predictions diff --git a/applications/dynacell/configs/benchmarks/virtual_staining/mito/fcmae_vscyto3d_pretrained/ipsc_confocal/predict__a549_mantis_mock.yml b/applications/dynacell/configs/benchmarks/virtual_staining/mito/fcmae_vscyto3d_pretrained/ipsc_confocal/predict__a549_mantis_mock.yml new file mode 100644 index 000000000..458d98fad --- /dev/null +++ b/applications/dynacell/configs/benchmarks/virtual_staining/mito/fcmae_vscyto3d_pretrained/ipsc_confocal/predict__a549_mantis_mock.yml @@ -0,0 +1,47 @@ +# 
FCMAE_VSCyto3D_Pretrained predict: mito (TOMM20) trained on iPSC, +# predicting against a549_mantis_tomm20_mock test. +# +# Pinned to best-val checkpoint from training run J31523064 (ws8500 +# variant; val 0.5543, epoch 58). Run cancelled at epoch 92 — val +# plateaued at epoch 58 and never recovered (~34 epochs without +# improvement, drifting up in last 5 epochs). +base: + - ../../../_internal/shared/model/predict_sets/a549_mantis_tomm20_mock.yml + - ../../../_internal/shared/model/targets/mito_tomm20.yml + - ../../../_internal/shared/model/model_overlays/fcmae_vscyto3d_predict.yml + - ../../../_internal/shared/model/launcher_profiles/mode_predict.yml + - ../../../_internal/shared/model/launcher_profiles/hardware_h200_single.yml + - ../../../_internal/shared/model/launcher_profiles/runtime_shared.yml + +benchmark: + task: virtual_staining + organelle: mito + trained_on: ipsc_confocal + predict_set: a549_mantis_tomm20_mock + model_name: fcmae_vscyto3d_pretrained + experiment_id: mito__ipsc_confocal__fcmae_vscyto3d_pretrained__a549_mantis_tomm20_mock + +model: + init_args: + ckpt_path: /hpc/projects/comp.micro/virtual_staining/models/dynacell/ipsc/tomm20/fcmae_vscyto3d_pretrained_ws8500/checkpoints/epoch=58-step=18408.ckpt + +data: + init_args: + normalizations: + - class_path: viscy_transforms.NormalizeSampled + init_args: + keys: [Phase3D] + level: fov_statistics + subtrahend: mean + divisor: std + augmentations: [] + +trainer: + callbacks: + - class_path: viscy_utils.callbacks.prediction_writer.HCSPredictionWriter + init_args: + output_store: /hpc/projects/virtual_staining/training/dynacell/a549/predictions/tomm20_fcmae_vscyto3d_pretrained__tomm20_mock.zarr + +launcher: + job_name: FCMAE_VSCyto3D_Pretrained_PRED_TOMM20_ON_A549_tomm20_mock + run_root: /hpc/projects/virtual_staining/training/dynacell/a549/predictions diff --git a/applications/dynacell/configs/benchmarks/virtual_staining/mito/fcmae_vscyto3d_pretrained/ipsc_confocal/predict__a549_mantis_zikv.yml 
b/applications/dynacell/configs/benchmarks/virtual_staining/mito/fcmae_vscyto3d_pretrained/ipsc_confocal/predict__a549_mantis_zikv.yml new file mode 100644 index 000000000..59bf88f61 --- /dev/null +++ b/applications/dynacell/configs/benchmarks/virtual_staining/mito/fcmae_vscyto3d_pretrained/ipsc_confocal/predict__a549_mantis_zikv.yml @@ -0,0 +1,47 @@ +# FCMAE_VSCyto3D_Pretrained predict: mito (TOMM20) trained on iPSC, +# predicting against a549_mantis_tomm20_zikv test. +# +# Pinned to best-val checkpoint from training run J31523064 (ws8500 +# variant; val 0.5543, epoch 58). Run cancelled at epoch 92 — val +# plateaued at epoch 58 and never recovered (~34 epochs without +# improvement, drifting up in last 5 epochs). +base: + - ../../../_internal/shared/model/predict_sets/a549_mantis_tomm20_zikv.yml + - ../../../_internal/shared/model/targets/mito_tomm20.yml + - ../../../_internal/shared/model/model_overlays/fcmae_vscyto3d_predict.yml + - ../../../_internal/shared/model/launcher_profiles/mode_predict.yml + - ../../../_internal/shared/model/launcher_profiles/hardware_h200_single.yml + - ../../../_internal/shared/model/launcher_profiles/runtime_shared.yml + +benchmark: + task: virtual_staining + organelle: mito + trained_on: ipsc_confocal + predict_set: a549_mantis_tomm20_zikv + model_name: fcmae_vscyto3d_pretrained + experiment_id: mito__ipsc_confocal__fcmae_vscyto3d_pretrained__a549_mantis_tomm20_zikv + +model: + init_args: + ckpt_path: /hpc/projects/comp.micro/virtual_staining/models/dynacell/ipsc/tomm20/fcmae_vscyto3d_pretrained_ws8500/checkpoints/epoch=58-step=18408.ckpt + +data: + init_args: + normalizations: + - class_path: viscy_transforms.NormalizeSampled + init_args: + keys: [Phase3D] + level: fov_statistics + subtrahend: mean + divisor: std + augmentations: [] + +trainer: + callbacks: + - class_path: viscy_utils.callbacks.prediction_writer.HCSPredictionWriter + init_args: + output_store: 
/hpc/projects/virtual_staining/training/dynacell/a549/predictions/tomm20_fcmae_vscyto3d_pretrained__tomm20_zikv.zarr + +launcher: + job_name: FCMAE_VSCyto3D_Pretrained_PRED_TOMM20_ON_A549_tomm20_zikv + run_root: /hpc/projects/virtual_staining/training/dynacell/a549/predictions diff --git a/applications/dynacell/configs/benchmarks/virtual_staining/mito/fcmae_vscyto3d_pretrained/ipsc_confocal/predict__ipsc_confocal.yml b/applications/dynacell/configs/benchmarks/virtual_staining/mito/fcmae_vscyto3d_pretrained/ipsc_confocal/predict__ipsc_confocal.yml new file mode 100644 index 000000000..1e616b8ee --- /dev/null +++ b/applications/dynacell/configs/benchmarks/virtual_staining/mito/fcmae_vscyto3d_pretrained/ipsc_confocal/predict__ipsc_confocal.yml @@ -0,0 +1,46 @@ +# FCMAE_VSCyto3D_Pretrained predict: mito (TOMM20) against ipsc_confocal test_cropped. +# +# Pinned to best-val checkpoint from training run J31523064 (ws8500 +# variant; val 0.5543, epoch 58). Run cancelled at epoch 92 — val +# plateaued at epoch 58 and never recovered (~34 epochs without +# improvement, drifting up in last 5 epochs). 
+base: + - ../../../_internal/shared/model/predict_sets/ipsc_confocal.yml + - ../../../_internal/shared/model/targets/mito_tomm20.yml + - ../../../_internal/shared/model/model_overlays/fcmae_vscyto3d_predict.yml + - ../../../_internal/shared/model/launcher_profiles/mode_predict.yml + - ../../../_internal/shared/model/launcher_profiles/hardware_h200_single.yml + - ../../../_internal/shared/model/launcher_profiles/runtime_shared.yml + +benchmark: + task: virtual_staining + organelle: mito + trained_on: ipsc_confocal + predict_set: ipsc_confocal + model_name: fcmae_vscyto3d_pretrained + experiment_id: mito__ipsc_confocal__fcmae_vscyto3d_pretrained__ipsc_confocal + +model: + init_args: + ckpt_path: /hpc/projects/comp.micro/virtual_staining/models/dynacell/ipsc/tomm20/fcmae_vscyto3d_pretrained_ws8500/checkpoints/epoch=58-step=18408.ckpt + +data: + init_args: + normalizations: + - class_path: viscy_transforms.NormalizeSampled + init_args: + keys: [Phase3D] + level: fov_statistics + subtrahend: mean + divisor: std + augmentations: [] + +trainer: + callbacks: + - class_path: viscy_utils.callbacks.prediction_writer.HCSPredictionWriter + init_args: + output_store: /hpc/projects/virtual_staining/training/dynacell/ipsc/predictions/tomm20_fcmae_vscyto3d_pretrained.zarr + +launcher: + job_name: FCMAE_VSCyto3D_Pretrained_PRED_TOMM20 + run_root: /hpc/projects/virtual_staining/training/dynacell/ipsc/predictions diff --git a/applications/dynacell/configs/benchmarks/virtual_staining/mito/fcmae_vscyto3d_pretrained/ipsc_confocal/train.yml b/applications/dynacell/configs/benchmarks/virtual_staining/mito/fcmae_vscyto3d_pretrained/ipsc_confocal/train.yml new file mode 100644 index 000000000..57c1002ff --- /dev/null +++ b/applications/dynacell/configs/benchmarks/virtual_staining/mito/fcmae_vscyto3d_pretrained/ipsc_confocal/train.yml @@ -0,0 +1,49 @@ +# FCMAE-class (FullyConvolutionalMAE, pretraining=False) with FCMAE- +# pretrained encoder init on mito/TOMM20. 
Companion to +# fcmae_vscyto3d_scratch.yml — the two leaves are identical except this +# one loads encoder weights from the published VSCyto3D FCMAE ckpt +# (400 ep on HEK + A549 + iPSC phase data). Mirrors +# er/ipsc_confocal/fcmae_vscyto3d_pretrained.yml. +base: + - ../../../_internal/shared/model/train_sets/ipsc_confocal.yml + - ../../../_internal/shared/model/targets/mito_tomm20.yml + - ../../../_internal/shared/model/data_overlays/fcmae_vscyto3d_fit.yml + - ../../../_internal/shared/model/model_overlays/fcmae_vscyto3d_fit.yml + - ../../../_internal/shared/model/launcher_profiles/mode_fit.yml + - ../../../_internal/shared/model/launcher_profiles/hardware_4gpu.yml + - ../../../_internal/shared/model/launcher_profiles/runtime_shared.yml + +benchmark: + task: virtual_staining + organelle: mito + train_set: ipsc_confocal + model_name: fcmae_vscyto3d_pretrained + experiment_id: mito__ipsc_confocal__fcmae_vscyto3d_pretrained + +model: + init_args: + # Load only the encoder from the canonical VSCyto3D FCMAE ckpt — + # decoder/head stay at fresh init. Matches vs_test/finetune_3d.py:247. 
+ encoder_only: true + ckpt_path: /hpc/projects/virtual_staining/models/mehta-lab/VSCyto3D/fcmae.ckpt + +trainer: + logger: + init_args: + name: FCMAE_VSCyto3D_Pretrained_iPSC_TOMM20_ws8500 + save_dir: /hpc/projects/comp.micro/virtual_staining/models/dynacell/ipsc/tomm20/fcmae_vscyto3d_pretrained_ws8500 + callbacks: + - class_path: lightning.pytorch.callbacks.LearningRateMonitor + init_args: + logging_interval: step + - class_path: lightning.pytorch.callbacks.ModelCheckpoint + init_args: + monitor: loss/validate + every_n_epochs: 1 + save_top_k: 5 + save_last: true + dirpath: /hpc/projects/comp.micro/virtual_staining/models/dynacell/ipsc/tomm20/fcmae_vscyto3d_pretrained_ws8500/checkpoints + +launcher: + job_name: FCMAE_VSCyto3D_Pretrained_TOMM20_ws8500 + run_root: /hpc/projects/comp.micro/virtual_staining/models/dynacell/ipsc/tomm20/fcmae_vscyto3d_pretrained_ws8500 diff --git a/applications/dynacell/configs/benchmarks/virtual_staining/mito/fcmae_vscyto3d_pretrained/joint_ipsc_confocal_a549_mantis/train.yml b/applications/dynacell/configs/benchmarks/virtual_staining/mito/fcmae_vscyto3d_pretrained/joint_ipsc_confocal_a549_mantis/train.yml new file mode 100644 index 000000000..d7241e6bd --- /dev/null +++ b/applications/dynacell/configs/benchmarks/virtual_staining/mito/fcmae_vscyto3d_pretrained/joint_ipsc_confocal_a549_mantis/train.yml @@ -0,0 +1,154 @@ +# FCMAE-class (FullyConvolutionalMAE, pretraining=False) with FCMAE- +# pretrained encoder init on mito (TOMM20) — joint +# ipsc_confocal + a549_mantis pooled. Companion to +# fcmae_vscyto3d_scratch joint leaf — the two are identical except +# this one loads encoder weights from the published VSCyto3D FCMAE +# ckpt (400 ep on HEK + A549 + iPSC phase data). Mirrors +# mito/fcmae_vscyto3d_pretrained/ipsc_confocal/train.yml on +# the joint train_set. +# +# Joint leaf per Stage 7 of A549_EXPANSION_ROADMAP.md. 
Uses +# BatchedConcatDataModule with two explicit HCSDataModule children +# (no benchmark.dataset_ref — joint leaves bypass the single-dataset +# resolver). Only model_overlays/fcmae_vscyto3d_fit.yml is composed; +# the data block is authored inline because joint hparams live on +# the children. +# +# Topology: 4-GPU DDP (inherited from +# model_overlays/fcmae_vscyto3d_fit.yml's ddp_4gpu base; the overlay +# also pins strategy=ddp_find_unused_parameters_true because +# FullyConvolutionalMAE has decoder/head params that only receive +# gradients on some forward paths). +base: + - ../../../_internal/shared/model/model_overlays/fcmae_vscyto3d_fit.yml + - ../../../_internal/shared/model/launcher_profiles/mode_fit.yml + - ../../../_internal/shared/model/launcher_profiles/hardware_4gpu.yml + - ../../../_internal/shared/model/launcher_profiles/runtime_shared.yml + +benchmark: + task: virtual_staining + organelle: mito + gene: TOMM20 + target: mito + target_id: mito_tomm20 + train_set: joint_ipsc_confocal_a549_mantis + model_name: fcmae_vscyto3d_pretrained + experiment_id: mito__joint_ipsc_confocal_a549_mantis__fcmae_vscyto3d_pretrained + +model: + init_args: + # Load only the encoder from the canonical VSCyto3D FCMAE ckpt — + # decoder/head stay at fresh init. Matches vs_test/finetune_3d.py:247. 
+ encoder_only: true + ckpt_path: /hpc/projects/virtual_staining/models/mehta-lab/VSCyto3D/fcmae.ckpt + +trainer: + logger: + init_args: + name: FCMAE_VSCyto3D_Pretrained_JOINT_TOMM20_ws8500 + save_dir: /hpc/projects/comp.micro/virtual_staining/models/dynacell/joint_ipsc_confocal_a549_mantis/tomm20/fcmae_vscyto3d_pretrained_ws8500 + callbacks: + - class_path: lightning.pytorch.callbacks.LearningRateMonitor + init_args: + logging_interval: step + - class_path: lightning.pytorch.callbacks.ModelCheckpoint + init_args: + monitor: loss/validate + every_n_epochs: 1 + save_top_k: 5 + save_last: true + dirpath: /hpc/projects/comp.micro/virtual_staining/models/dynacell/joint_ipsc_confocal_a549_mantis/tomm20/fcmae_vscyto3d_pretrained_ws8500/checkpoints + +_hcs_init_args: &hcs_init_args + source_channel: Phase3D + target_channel: Structure + z_window_size: 20 + # See nucleus/fnet3d_paper/joint_*/train.yml for the rationale: joint + # mode does not divide batch_size by num_samples, so 8 * 4 = 32 GPU + # samples per DDP rank matches single-set effective batch. 
+ batch_size: 8 + num_workers: 4 + yx_patch_size: [384, 384] + split_ratio: 0.8 + mmap_preload: true + scratch_dir: /dev/shm + persistent_workers: true + normalizations: + - class_path: viscy_transforms.NormalizeSampled + init_args: + keys: [Phase3D] + level: fov_statistics + subtrahend: mean + divisor: std + - class_path: viscy_transforms.NormalizeSampled + init_args: + keys: [Structure] + level: fov_statistics + subtrahend: median + divisor: iqr + augmentations: + - class_path: viscy_transforms.RandWeightedCropd + init_args: + keys: [Phase3D, Structure] + w_key: Structure + spatial_size: [20, 600, 600] + num_samples: 4 + gpu_augmentations: + - class_path: viscy_transforms.BatchedRandAffined + init_args: + keys: [source, target] + prob: 0.8 + rotate_range: [3.14, 0, 0] + shear_range: [0.0, 0.05, 0.05] + scale_range: [[0.7, 1.3], [0.5, 1.5], [0.5, 1.5]] + - class_path: viscy_transforms.BatchedCenterSpatialCropd + init_args: + keys: [source, target] + roi_size: [15, 384, 384] + - class_path: viscy_transforms.BatchedRandAdjustContrastd + init_args: + keys: [source] + prob: 0.5 + gamma: [0.8, 1.2] + - class_path: viscy_transforms.BatchedRandScaleIntensityd + init_args: + keys: [source] + prob: 0.5 + factors: 0.5 + - class_path: viscy_transforms.BatchedRandGaussianNoised + init_args: + keys: [source] + prob: 0.5 + mean: 0.0 + std: 0.3 + - class_path: viscy_transforms.BatchedRandGaussianSmoothd + init_args: + keys: [source] + prob: 0.5 + sigma_x: [0.25, 0.75] + sigma_y: [0.25, 0.75] + sigma_z: [0.25, 0.75] + val_gpu_augmentations: + - class_path: viscy_transforms.BatchedCenterSpatialCropd + init_args: + keys: [source, target] + roi_size: [15, 384, 384] + +data: + class_path: viscy_data.BatchedConcatDataModule + init_args: + data_modules: + # ipsc_confocal — aics-hipsc TOMM20 train store + - class_path: viscy_data.hcs.HCSDataModule + init_args: + <<: *hcs_init_args + data_path: /hpc/projects/virtual_staining/training/dynacell/ipsc/dataset_v4/train/TOMM20.zarr + # 
a549_mantis — pooled TOMM20 all-conditions train store + - class_path: viscy_data.hcs.HCSDataModule + init_args: + <<: *hcs_init_args + data_path: /hpc/projects/virtual_staining/training/dynacell/a549/mantis_v1/train/TOMM20_all.zarr + +launcher: + job_name: FCMAE_VSCyto3D_Pretrained_JOINT_TOMM20_ws8500 + run_root: /hpc/projects/comp.micro/virtual_staining/models/dynacell/joint_ipsc_confocal_a549_mantis/tomm20/fcmae_vscyto3d_pretrained_ws8500 diff --git a/applications/dynacell/configs/benchmarks/virtual_staining/mito/fcmae_vscyto3d_scratch/a549_mantis/predict__a549_mantis_denv.yml b/applications/dynacell/configs/benchmarks/virtual_staining/mito/fcmae_vscyto3d_scratch/a549_mantis/predict__a549_mantis_denv.yml new file mode 100644 index 000000000..780c47713 --- /dev/null +++ b/applications/dynacell/configs/benchmarks/virtual_staining/mito/fcmae_vscyto3d_scratch/a549_mantis/predict__a549_mantis_denv.yml @@ -0,0 +1,48 @@ +# FCMAE_VSCyto3D_Scratch (UNeXt2) predict: mito trained on a549_mantis (tomm20), +# predicting against a549-mantis-tomm20-denv test. +# Pinned to checkpoint epoch=113-step=18012 (val 0.7329 per resume's +# re-evaluation; wandb metrics for the original run are unrecoverable due +# to a run-id collision with the SEC61B training). See +# predict__ipsc_confocal.yml in this dir for full provenance. +# Both iPSC and a549 manifests use `tomm20`; targets/mito_tomm20.yml handles +# both natively, no dataset_ref override needed. 
+base: + - ../../../_internal/shared/model/predict_sets/a549_mantis_tomm20_denv.yml + - ../../../_internal/shared/model/targets/mito_tomm20.yml + - ../../../_internal/shared/model/model_overlays/fcmae_vscyto3d_predict.yml + - ../../../_internal/shared/model/launcher_profiles/mode_predict.yml + - ../../../_internal/shared/model/launcher_profiles/hardware_h200_single.yml + - ../../../_internal/shared/model/launcher_profiles/runtime_shared.yml + +benchmark: + task: virtual_staining + organelle: mito + trained_on: a549_mantis + predict_set: a549_mantis_tomm20_denv + model_name: fcmae_vscyto3d_scratch + experiment_id: mito__a549_mantis__fcmae_vscyto3d_scratch__a549_mantis_tomm20_denv + +model: + init_args: + ckpt_path: /hpc/projects/comp.micro/virtual_staining/models/dynacell/a549_mantis/tomm20/fcmae_vscyto3d_scratch/checkpoints/epoch=113-step=18012.ckpt + +data: + init_args: + normalizations: + - class_path: viscy_transforms.NormalizeSampled + init_args: + keys: [Phase3D] + level: fov_statistics + subtrahend: mean + divisor: std + augmentations: [] + +trainer: + callbacks: + - class_path: viscy_utils.callbacks.prediction_writer.HCSPredictionWriter + init_args: + output_store: /hpc/projects/virtual_staining/training/dynacell/a549/predictions/tomm20_fcmae_vscyto3d_scratch_a549trained_denv.zarr + +launcher: + job_name: FCMAE_VSCyto3D_Scratch_PRED_TOMM20_A549TR_DENV + run_root: /hpc/projects/virtual_staining/training/dynacell/a549/predictions diff --git a/applications/dynacell/configs/benchmarks/virtual_staining/mito/fcmae_vscyto3d_scratch/a549_mantis/predict__a549_mantis_mock.yml b/applications/dynacell/configs/benchmarks/virtual_staining/mito/fcmae_vscyto3d_scratch/a549_mantis/predict__a549_mantis_mock.yml new file mode 100644 index 000000000..41196158c --- /dev/null +++ b/applications/dynacell/configs/benchmarks/virtual_staining/mito/fcmae_vscyto3d_scratch/a549_mantis/predict__a549_mantis_mock.yml @@ -0,0 +1,48 @@ +# FCMAE_VSCyto3D_Scratch (UNeXt2) predict: mito trained 
on a549_mantis (tomm20), +# predicting against a549-mantis-tomm20-mock test. +# Pinned to checkpoint epoch=113-step=18012 (val 0.7329 per resume's +# re-evaluation; wandb metrics for the original run are unrecoverable due +# to a run-id collision with the SEC61B training). See +# predict__ipsc_confocal.yml in this dir for full provenance. +# Both iPSC and a549 manifests use `tomm20`; targets/mito_tomm20.yml handles +# both natively, no dataset_ref override needed. +base: + - ../../../_internal/shared/model/predict_sets/a549_mantis_tomm20_mock.yml + - ../../../_internal/shared/model/targets/mito_tomm20.yml + - ../../../_internal/shared/model/model_overlays/fcmae_vscyto3d_predict.yml + - ../../../_internal/shared/model/launcher_profiles/mode_predict.yml + - ../../../_internal/shared/model/launcher_profiles/hardware_h200_single.yml + - ../../../_internal/shared/model/launcher_profiles/runtime_shared.yml + +benchmark: + task: virtual_staining + organelle: mito + trained_on: a549_mantis + predict_set: a549_mantis_tomm20_mock + model_name: fcmae_vscyto3d_scratch + experiment_id: mito__a549_mantis__fcmae_vscyto3d_scratch__a549_mantis_tomm20_mock + +model: + init_args: + ckpt_path: /hpc/projects/comp.micro/virtual_staining/models/dynacell/a549_mantis/tomm20/fcmae_vscyto3d_scratch/checkpoints/epoch=113-step=18012.ckpt + +data: + init_args: + normalizations: + - class_path: viscy_transforms.NormalizeSampled + init_args: + keys: [Phase3D] + level: fov_statistics + subtrahend: mean + divisor: std + augmentations: [] + +trainer: + callbacks: + - class_path: viscy_utils.callbacks.prediction_writer.HCSPredictionWriter + init_args: + output_store: /hpc/projects/virtual_staining/training/dynacell/a549/predictions/tomm20_fcmae_vscyto3d_scratch_a549trained_mock.zarr + +launcher: + job_name: FCMAE_VSCyto3D_Scratch_PRED_TOMM20_A549TR_MOCK + run_root: /hpc/projects/virtual_staining/training/dynacell/a549/predictions diff --git 
a/applications/dynacell/configs/benchmarks/virtual_staining/mito/fcmae_vscyto3d_scratch/a549_mantis/predict__a549_mantis_zikv.yml b/applications/dynacell/configs/benchmarks/virtual_staining/mito/fcmae_vscyto3d_scratch/a549_mantis/predict__a549_mantis_zikv.yml new file mode 100644 index 000000000..83a8b2199 --- /dev/null +++ b/applications/dynacell/configs/benchmarks/virtual_staining/mito/fcmae_vscyto3d_scratch/a549_mantis/predict__a549_mantis_zikv.yml @@ -0,0 +1,48 @@ +# FCMAE_VSCyto3D_Scratch (UNeXt2) predict: mito trained on a549_mantis (tomm20), +# predicting against a549-mantis-tomm20-zikv test. +# Pinned to checkpoint epoch=113-step=18012 (val 0.7329 per resume's +# re-evaluation; wandb metrics for the original run are unrecoverable due +# to a run-id collision with the SEC61B training). See +# predict__ipsc_confocal.yml in this dir for full provenance. +# Both iPSC and a549 manifests use `tomm20`; targets/mito_tomm20.yml handles +# both natively, no dataset_ref override needed. +base: + - ../../../_internal/shared/model/predict_sets/a549_mantis_tomm20_zikv.yml + - ../../../_internal/shared/model/targets/mito_tomm20.yml + - ../../../_internal/shared/model/model_overlays/fcmae_vscyto3d_predict.yml + - ../../../_internal/shared/model/launcher_profiles/mode_predict.yml + - ../../../_internal/shared/model/launcher_profiles/hardware_h200_single.yml + - ../../../_internal/shared/model/launcher_profiles/runtime_shared.yml + +benchmark: + task: virtual_staining + organelle: mito + trained_on: a549_mantis + predict_set: a549_mantis_tomm20_zikv + model_name: fcmae_vscyto3d_scratch + experiment_id: mito__a549_mantis__fcmae_vscyto3d_scratch__a549_mantis_tomm20_zikv + +model: + init_args: + ckpt_path: /hpc/projects/comp.micro/virtual_staining/models/dynacell/a549_mantis/tomm20/fcmae_vscyto3d_scratch/checkpoints/epoch=113-step=18012.ckpt + +data: + init_args: + normalizations: + - class_path: viscy_transforms.NormalizeSampled + init_args: + keys: [Phase3D] + level: 
fov_statistics + subtrahend: mean + divisor: std + augmentations: [] + +trainer: + callbacks: + - class_path: viscy_utils.callbacks.prediction_writer.HCSPredictionWriter + init_args: + output_store: /hpc/projects/virtual_staining/training/dynacell/a549/predictions/tomm20_fcmae_vscyto3d_scratch_a549trained_zikv.zarr + +launcher: + job_name: FCMAE_VSCyto3D_Scratch_PRED_TOMM20_A549TR_ZIKV + run_root: /hpc/projects/virtual_staining/training/dynacell/a549/predictions diff --git a/applications/dynacell/configs/benchmarks/virtual_staining/mito/fcmae_vscyto3d_scratch/a549_mantis/predict__ipsc_confocal.yml b/applications/dynacell/configs/benchmarks/virtual_staining/mito/fcmae_vscyto3d_scratch/a549_mantis/predict__ipsc_confocal.yml new file mode 100644 index 000000000..08f2f895b --- /dev/null +++ b/applications/dynacell/configs/benchmarks/virtual_staining/mito/fcmae_vscyto3d_scratch/a549_mantis/predict__ipsc_confocal.yml @@ -0,0 +1,53 @@ +# FCMAE_VSCyto3D_Scratch (UNeXt2) predict: mito trained on a549_mantis (tomm20), +# predicting against ipsc_confocal test_cropped. +# Pinned to checkpoint epoch=113-step=18012 from the original training +# (J31910360, wandb display 20260502-204546_FCMAE_VSCyto3D_Scratch_A549_TOMM20). +# Best-val ckpt per resume's re-evaluation: 0.7329 (ep113) — virtually tied +# with ep118 at 0.7327. NOTE: the wandb run id `20260502-204536` collided with +# the simultaneous SEC61B training (J31910346 on the same node gpu-f-5); +# wandb's history values for that run are SEC61B's metrics, not TOMM20's, so +# the original-training's true val trajectory is not recoverable from wandb. +# The 0.7327/0.7329 figures come from the resume Lightning trainer's local +# best_k_models block in last.ckpt — that is the only TOMM20-pipeline number +# we have. Both iPSC and a549 manifests use `tomm20`; targets/mito_tomm20.yml +# handles both natively, no dataset_ref override needed. 
+base: + - ../../../_internal/shared/model/predict_sets/ipsc_confocal.yml + - ../../../_internal/shared/model/targets/mito_tomm20.yml + - ../../../_internal/shared/model/model_overlays/fcmae_vscyto3d_predict.yml + - ../../../_internal/shared/model/launcher_profiles/mode_predict.yml + - ../../../_internal/shared/model/launcher_profiles/hardware_h200_single.yml + - ../../../_internal/shared/model/launcher_profiles/runtime_shared.yml + +benchmark: + task: virtual_staining + organelle: mito + trained_on: a549_mantis + predict_set: ipsc_confocal + model_name: fcmae_vscyto3d_scratch + experiment_id: mito__a549_mantis__fcmae_vscyto3d_scratch__ipsc_confocal + +model: + init_args: + ckpt_path: /hpc/projects/comp.micro/virtual_staining/models/dynacell/a549_mantis/tomm20/fcmae_vscyto3d_scratch/checkpoints/epoch=113-step=18012.ckpt + +data: + init_args: + normalizations: + - class_path: viscy_transforms.NormalizeSampled + init_args: + keys: [Phase3D] + level: fov_statistics + subtrahend: mean + divisor: std + augmentations: [] + +trainer: + callbacks: + - class_path: viscy_utils.callbacks.prediction_writer.HCSPredictionWriter + init_args: + output_store: /hpc/projects/virtual_staining/training/dynacell/ipsc/predictions/tomm20_fcmae_vscyto3d_scratch_a549trained.zarr + +launcher: + job_name: FCMAE_VSCyto3D_Scratch_PRED_TOMM20_A549TR_IPSC + run_root: /hpc/projects/virtual_staining/training/dynacell/ipsc/predictions diff --git a/applications/dynacell/configs/benchmarks/virtual_staining/mito/fcmae_vscyto3d_scratch/a549_mantis/train.yml b/applications/dynacell/configs/benchmarks/virtual_staining/mito/fcmae_vscyto3d_scratch/a549_mantis/train.yml new file mode 100644 index 000000000..98e99984a --- /dev/null +++ b/applications/dynacell/configs/benchmarks/virtual_staining/mito/fcmae_vscyto3d_scratch/a549_mantis/train.yml @@ -0,0 +1,46 @@ +# FCMAE-class (FullyConvolutionalMAE, pretraining=False) random-init +# baseline on mito/TOMM20. 
Scratch control for the pretrained counterpart — +# the two leaves are identical except this one does NOT load pretrained +# encoder weights. Mirrors er/ipsc_confocal/fcmae_vscyto3d_scratch.yml. +base: + - ../../../_internal/shared/model/train_sets/a549_mantis.yml + - ../../../_internal/shared/model/targets/mito_tomm20.yml + - ../../../_internal/shared/model/data_overlays/fcmae_vscyto3d_fit.yml + - ../../../_internal/shared/model/model_overlays/fcmae_vscyto3d_fit.yml + - ../../../_internal/shared/model/launcher_profiles/mode_fit.yml + - ../../../_internal/shared/model/launcher_profiles/hardware_4gpu.yml + - ../../../_internal/shared/model/launcher_profiles/runtime_shared.yml + +benchmark: + task: virtual_staining + organelle: mito + train_set: a549_mantis + model_name: fcmae_vscyto3d_scratch + experiment_id: mito__a549_mantis__fcmae_vscyto3d_scratch + +trainer: + logger: + init_args: + name: FCMAE_VSCyto3D_Scratch_A549_TOMM20 + save_dir: /hpc/projects/comp.micro/virtual_staining/models/dynacell/a549_mantis/tomm20/fcmae_vscyto3d_scratch + callbacks: + - class_path: lightning.pytorch.callbacks.LearningRateMonitor + init_args: + logging_interval: step + - class_path: lightning.pytorch.callbacks.ModelCheckpoint + init_args: + monitor: loss/validate + every_n_epochs: 1 + save_top_k: 5 + save_last: true + dirpath: /hpc/projects/comp.micro/virtual_staining/models/dynacell/a549_mantis/tomm20/fcmae_vscyto3d_scratch/checkpoints + +data: + init_args: + # A549 pooled store + target_channel — no resolver in this train_set. 
+ target_channel: Structure + data_path: /hpc/projects/virtual_staining/training/dynacell/a549/mantis_v1/train/TOMM20_all.zarr + +launcher: + job_name: FCMAE_VSCyto3D_Scratch_A549_TOMM20 + run_root: /hpc/projects/comp.micro/virtual_staining/models/dynacell/a549_mantis/tomm20/fcmae_vscyto3d_scratch diff --git a/applications/dynacell/configs/benchmarks/virtual_staining/mito/fcmae_vscyto3d_scratch/ipsc_confocal/eval__a549_mantis_denv.yaml b/applications/dynacell/configs/benchmarks/virtual_staining/mito/fcmae_vscyto3d_scratch/ipsc_confocal/eval__a549_mantis_denv.yaml new file mode 100644 index 000000000..c30b3fe11 --- /dev/null +++ b/applications/dynacell/configs/benchmarks/virtual_staining/mito/fcmae_vscyto3d_scratch/ipsc_confocal/eval__a549_mantis_denv.yaml @@ -0,0 +1,15 @@ +# @package _global_ +# Benchmark eval leaf: Mitochondria (TOMM20) predicted by FCMAE_VSCyto3D_Scratch on a549-mantis-tomm20-denv. +defaults: + - override /target: mito_tomm20 + - override /predict_set: a549_mantis_tomm20_denv + +io: + pred_path: /hpc/projects/virtual_staining/training/dynacell/a549/predictions/tomm20_fcmae_vscyto3d_scratch__tomm20_denv.zarr + +# A549 manifests don't carry cell_segmentation paths (no segmentation +# pipeline yet). Skip feature metrics until segmentation lands. 
+compute_feature_metrics: false + +save: + save_dir: /hpc/projects/virtual_staining/training/dynacell/a549/predictions/eval_tomm20_fcmae_vscyto3d_scratch__tomm20_denv diff --git a/applications/dynacell/configs/benchmarks/virtual_staining/mito/fcmae_vscyto3d_scratch/ipsc_confocal/eval__a549_mantis_mock.yaml b/applications/dynacell/configs/benchmarks/virtual_staining/mito/fcmae_vscyto3d_scratch/ipsc_confocal/eval__a549_mantis_mock.yaml new file mode 100644 index 000000000..12944a943 --- /dev/null +++ b/applications/dynacell/configs/benchmarks/virtual_staining/mito/fcmae_vscyto3d_scratch/ipsc_confocal/eval__a549_mantis_mock.yaml @@ -0,0 +1,15 @@ +# @package _global_ +# Benchmark eval leaf: Mitochondria (TOMM20) predicted by FCMAE_VSCyto3D_Scratch on a549-mantis-tomm20-mock. +defaults: + - override /target: mito_tomm20 + - override /predict_set: a549_mantis_tomm20_mock + +io: + pred_path: /hpc/projects/virtual_staining/training/dynacell/a549/predictions/tomm20_fcmae_vscyto3d_scratch__tomm20_mock.zarr + +# A549 manifests don't carry cell_segmentation paths (no segmentation +# pipeline yet). Skip feature metrics until segmentation lands. +compute_feature_metrics: false + +save: + save_dir: /hpc/projects/virtual_staining/training/dynacell/a549/predictions/eval_tomm20_fcmae_vscyto3d_scratch__tomm20_mock diff --git a/applications/dynacell/configs/benchmarks/virtual_staining/mito/fcmae_vscyto3d_scratch/ipsc_confocal/eval__a549_mantis_zikv.yaml b/applications/dynacell/configs/benchmarks/virtual_staining/mito/fcmae_vscyto3d_scratch/ipsc_confocal/eval__a549_mantis_zikv.yaml new file mode 100644 index 000000000..872880ce4 --- /dev/null +++ b/applications/dynacell/configs/benchmarks/virtual_staining/mito/fcmae_vscyto3d_scratch/ipsc_confocal/eval__a549_mantis_zikv.yaml @@ -0,0 +1,15 @@ +# @package _global_ +# Benchmark eval leaf: Mitochondria (TOMM20) predicted by FCMAE_VSCyto3D_Scratch on a549-mantis-tomm20-zikv. 
+defaults: + - override /target: mito_tomm20 + - override /predict_set: a549_mantis_tomm20_zikv + +io: + pred_path: /hpc/projects/virtual_staining/training/dynacell/a549/predictions/tomm20_fcmae_vscyto3d_scratch__tomm20_zikv.zarr + +# A549 manifests don't carry cell_segmentation paths (no segmentation +# pipeline yet). Skip feature metrics until segmentation lands. +compute_feature_metrics: false + +save: + save_dir: /hpc/projects/virtual_staining/training/dynacell/a549/predictions/eval_tomm20_fcmae_vscyto3d_scratch__tomm20_zikv diff --git a/applications/dynacell/configs/benchmarks/virtual_staining/mito/fcmae_vscyto3d_scratch/ipsc_confocal/predict__a549_mantis_denv.yml b/applications/dynacell/configs/benchmarks/virtual_staining/mito/fcmae_vscyto3d_scratch/ipsc_confocal/predict__a549_mantis_denv.yml new file mode 100644 index 000000000..97a2cc4e7 --- /dev/null +++ b/applications/dynacell/configs/benchmarks/virtual_staining/mito/fcmae_vscyto3d_scratch/ipsc_confocal/predict__a549_mantis_denv.yml @@ -0,0 +1,46 @@ +# FCMAE_VSCyto3D_Scratch predict: mito (TOMM20) trained on iPSC, +# predicting against a549_mantis_tomm20_denv test. +# +# Pinned to best-val checkpoint from training run J31475715 +# (val 0.5527, epoch 69). Run cancelled at epoch 164 — val plateaued +# at epoch ~51 and never recovered, so later epochs are not better. 
+base: + - ../../../_internal/shared/model/predict_sets/a549_mantis_tomm20_denv.yml + - ../../../_internal/shared/model/targets/mito_tomm20.yml + - ../../../_internal/shared/model/model_overlays/fcmae_vscyto3d_predict.yml + - ../../../_internal/shared/model/launcher_profiles/mode_predict.yml + - ../../../_internal/shared/model/launcher_profiles/hardware_h200_single.yml + - ../../../_internal/shared/model/launcher_profiles/runtime_shared.yml + +benchmark: + task: virtual_staining + organelle: mito + trained_on: ipsc_confocal + predict_set: a549_mantis_tomm20_denv + model_name: fcmae_vscyto3d_scratch + experiment_id: mito__ipsc_confocal__fcmae_vscyto3d_scratch__a549_mantis_tomm20_denv + +model: + init_args: + ckpt_path: /hpc/projects/comp.micro/virtual_staining/models/dynacell/ipsc/tomm20/fcmae_vscyto3d_scratch/checkpoints/epoch=69-step=21840.ckpt + +data: + init_args: + normalizations: + - class_path: viscy_transforms.NormalizeSampled + init_args: + keys: [Phase3D] + level: fov_statistics + subtrahend: mean + divisor: std + augmentations: [] + +trainer: + callbacks: + - class_path: viscy_utils.callbacks.prediction_writer.HCSPredictionWriter + init_args: + output_store: /hpc/projects/virtual_staining/training/dynacell/a549/predictions/tomm20_fcmae_vscyto3d_scratch__tomm20_denv.zarr + +launcher: + job_name: FCMAE_VSCyto3D_Scratch_PRED_TOMM20_ON_A549_tomm20_denv + run_root: /hpc/projects/virtual_staining/training/dynacell/a549/predictions diff --git a/applications/dynacell/configs/benchmarks/virtual_staining/mito/fcmae_vscyto3d_scratch/ipsc_confocal/predict__a549_mantis_mock.yml b/applications/dynacell/configs/benchmarks/virtual_staining/mito/fcmae_vscyto3d_scratch/ipsc_confocal/predict__a549_mantis_mock.yml new file mode 100644 index 000000000..bb4c8573e --- /dev/null +++ b/applications/dynacell/configs/benchmarks/virtual_staining/mito/fcmae_vscyto3d_scratch/ipsc_confocal/predict__a549_mantis_mock.yml @@ -0,0 +1,46 @@ +# FCMAE_VSCyto3D_Scratch predict: mito (TOMM20) 
trained on iPSC, +# predicting against a549_mantis_tomm20_mock test. +# +# Pinned to best-val checkpoint from training run J31475715 +# (val 0.5527, epoch 69). Run cancelled at epoch 164 — val plateaued +# at epoch ~51 and never recovered, so later epochs are not better. +base: + - ../../../_internal/shared/model/predict_sets/a549_mantis_tomm20_mock.yml + - ../../../_internal/shared/model/targets/mito_tomm20.yml + - ../../../_internal/shared/model/model_overlays/fcmae_vscyto3d_predict.yml + - ../../../_internal/shared/model/launcher_profiles/mode_predict.yml + - ../../../_internal/shared/model/launcher_profiles/hardware_h200_single.yml + - ../../../_internal/shared/model/launcher_profiles/runtime_shared.yml + +benchmark: + task: virtual_staining + organelle: mito + trained_on: ipsc_confocal + predict_set: a549_mantis_tomm20_mock + model_name: fcmae_vscyto3d_scratch + experiment_id: mito__ipsc_confocal__fcmae_vscyto3d_scratch__a549_mantis_tomm20_mock + +model: + init_args: + ckpt_path: /hpc/projects/comp.micro/virtual_staining/models/dynacell/ipsc/tomm20/fcmae_vscyto3d_scratch/checkpoints/epoch=69-step=21840.ckpt + +data: + init_args: + normalizations: + - class_path: viscy_transforms.NormalizeSampled + init_args: + keys: [Phase3D] + level: fov_statistics + subtrahend: mean + divisor: std + augmentations: [] + +trainer: + callbacks: + - class_path: viscy_utils.callbacks.prediction_writer.HCSPredictionWriter + init_args: + output_store: /hpc/projects/virtual_staining/training/dynacell/a549/predictions/tomm20_fcmae_vscyto3d_scratch__tomm20_mock.zarr + +launcher: + job_name: FCMAE_VSCyto3D_Scratch_PRED_TOMM20_ON_A549_tomm20_mock + run_root: /hpc/projects/virtual_staining/training/dynacell/a549/predictions diff --git a/applications/dynacell/configs/benchmarks/virtual_staining/mito/fcmae_vscyto3d_scratch/ipsc_confocal/predict__a549_mantis_zikv.yml 
b/applications/dynacell/configs/benchmarks/virtual_staining/mito/fcmae_vscyto3d_scratch/ipsc_confocal/predict__a549_mantis_zikv.yml new file mode 100644 index 000000000..65ae0dc92 --- /dev/null +++ b/applications/dynacell/configs/benchmarks/virtual_staining/mito/fcmae_vscyto3d_scratch/ipsc_confocal/predict__a549_mantis_zikv.yml @@ -0,0 +1,46 @@ +# FCMAE_VSCyto3D_Scratch predict: mito (TOMM20) trained on iPSC, +# predicting against a549_mantis_tomm20_zikv test. +# +# Pinned to best-val checkpoint from training run J31475715 +# (val 0.5527, epoch 69). Run cancelled at epoch 164 — val plateaued +# at epoch ~51 and never recovered, so later epochs are not better. +base: + - ../../../_internal/shared/model/predict_sets/a549_mantis_tomm20_zikv.yml + - ../../../_internal/shared/model/targets/mito_tomm20.yml + - ../../../_internal/shared/model/model_overlays/fcmae_vscyto3d_predict.yml + - ../../../_internal/shared/model/launcher_profiles/mode_predict.yml + - ../../../_internal/shared/model/launcher_profiles/hardware_h200_single.yml + - ../../../_internal/shared/model/launcher_profiles/runtime_shared.yml + +benchmark: + task: virtual_staining + organelle: mito + trained_on: ipsc_confocal + predict_set: a549_mantis_tomm20_zikv + model_name: fcmae_vscyto3d_scratch + experiment_id: mito__ipsc_confocal__fcmae_vscyto3d_scratch__a549_mantis_tomm20_zikv + +model: + init_args: + ckpt_path: /hpc/projects/comp.micro/virtual_staining/models/dynacell/ipsc/tomm20/fcmae_vscyto3d_scratch/checkpoints/epoch=69-step=21840.ckpt + +data: + init_args: + normalizations: + - class_path: viscy_transforms.NormalizeSampled + init_args: + keys: [Phase3D] + level: fov_statistics + subtrahend: mean + divisor: std + augmentations: [] + +trainer: + callbacks: + - class_path: viscy_utils.callbacks.prediction_writer.HCSPredictionWriter + init_args: + output_store: /hpc/projects/virtual_staining/training/dynacell/a549/predictions/tomm20_fcmae_vscyto3d_scratch__tomm20_zikv.zarr + +launcher: + job_name: 
FCMAE_VSCyto3D_Scratch_PRED_TOMM20_ON_A549_tomm20_zikv + run_root: /hpc/projects/virtual_staining/training/dynacell/a549/predictions diff --git a/applications/dynacell/configs/benchmarks/virtual_staining/mito/fcmae_vscyto3d_scratch/ipsc_confocal/predict__ipsc_confocal.yml b/applications/dynacell/configs/benchmarks/virtual_staining/mito/fcmae_vscyto3d_scratch/ipsc_confocal/predict__ipsc_confocal.yml new file mode 100644 index 000000000..684020067 --- /dev/null +++ b/applications/dynacell/configs/benchmarks/virtual_staining/mito/fcmae_vscyto3d_scratch/ipsc_confocal/predict__ipsc_confocal.yml @@ -0,0 +1,45 @@ +# FCMAE_VSCyto3D_Scratch predict: mito (TOMM20) against ipsc_confocal test_cropped. +# +# Pinned to best-val checkpoint from training run J31475715 +# (val 0.5527, epoch 69). Run cancelled at epoch 164 — val plateaued +# at epoch ~51 and never recovered, so later epochs are not better. +base: + - ../../../_internal/shared/model/predict_sets/ipsc_confocal.yml + - ../../../_internal/shared/model/targets/mito_tomm20.yml + - ../../../_internal/shared/model/model_overlays/fcmae_vscyto3d_predict.yml + - ../../../_internal/shared/model/launcher_profiles/mode_predict.yml + - ../../../_internal/shared/model/launcher_profiles/hardware_h200_single.yml + - ../../../_internal/shared/model/launcher_profiles/runtime_shared.yml + +benchmark: + task: virtual_staining + organelle: mito + trained_on: ipsc_confocal + predict_set: ipsc_confocal + model_name: fcmae_vscyto3d_scratch + experiment_id: mito__ipsc_confocal__fcmae_vscyto3d_scratch__ipsc_confocal + +model: + init_args: + ckpt_path: /hpc/projects/comp.micro/virtual_staining/models/dynacell/ipsc/tomm20/fcmae_vscyto3d_scratch/checkpoints/epoch=69-step=21840.ckpt + +data: + init_args: + normalizations: + - class_path: viscy_transforms.NormalizeSampled + init_args: + keys: [Phase3D] + level: fov_statistics + subtrahend: mean + divisor: std + augmentations: [] + +trainer: + callbacks: + - class_path: 
viscy_utils.callbacks.prediction_writer.HCSPredictionWriter + init_args: + output_store: /hpc/projects/virtual_staining/training/dynacell/ipsc/predictions/tomm20_fcmae_vscyto3d_scratch.zarr + +launcher: + job_name: FCMAE_VSCyto3D_Scratch_PRED_TOMM20 + run_root: /hpc/projects/virtual_staining/training/dynacell/ipsc/predictions diff --git a/applications/dynacell/configs/benchmarks/virtual_staining/mito/fcmae_vscyto3d_scratch/ipsc_confocal/train.yml b/applications/dynacell/configs/benchmarks/virtual_staining/mito/fcmae_vscyto3d_scratch/ipsc_confocal/train.yml new file mode 100644 index 000000000..5f53c3e9d --- /dev/null +++ b/applications/dynacell/configs/benchmarks/virtual_staining/mito/fcmae_vscyto3d_scratch/ipsc_confocal/train.yml @@ -0,0 +1,40 @@ +# FCMAE-class (FullyConvolutionalMAE, pretraining=False) random-init +# baseline on mito/TOMM20. Scratch control for the pretrained counterpart — +# the two leaves are identical except this one does NOT load pretrained +# encoder weights. Mirrors er/ipsc_confocal/fcmae_vscyto3d_scratch.yml. 
+base: + - ../../../_internal/shared/model/train_sets/ipsc_confocal.yml + - ../../../_internal/shared/model/targets/mito_tomm20.yml + - ../../../_internal/shared/model/data_overlays/fcmae_vscyto3d_fit.yml + - ../../../_internal/shared/model/model_overlays/fcmae_vscyto3d_fit.yml + - ../../../_internal/shared/model/launcher_profiles/mode_fit.yml + - ../../../_internal/shared/model/launcher_profiles/hardware_4gpu.yml + - ../../../_internal/shared/model/launcher_profiles/runtime_shared.yml + +benchmark: + task: virtual_staining + organelle: mito + train_set: ipsc_confocal + model_name: fcmae_vscyto3d_scratch + experiment_id: mito__ipsc_confocal__fcmae_vscyto3d_scratch + +trainer: + logger: + init_args: + name: FCMAE_VSCyto3D_Scratch_iPSC_TOMM20 + save_dir: /hpc/projects/comp.micro/virtual_staining/models/dynacell/ipsc/tomm20/fcmae_vscyto3d_scratch + callbacks: + - class_path: lightning.pytorch.callbacks.LearningRateMonitor + init_args: + logging_interval: step + - class_path: lightning.pytorch.callbacks.ModelCheckpoint + init_args: + monitor: loss/validate + every_n_epochs: 1 + save_top_k: 5 + save_last: true + dirpath: /hpc/projects/comp.micro/virtual_staining/models/dynacell/ipsc/tomm20/fcmae_vscyto3d_scratch/checkpoints + +launcher: + job_name: FCMAE_VSCyto3D_Scratch_TOMM20 + run_root: /hpc/projects/comp.micro/virtual_staining/models/dynacell/ipsc/tomm20/fcmae_vscyto3d_scratch diff --git a/applications/dynacell/configs/benchmarks/virtual_staining/mito/fcmae_vscyto3d_scratch/joint_ipsc_confocal_a549_mantis/train.yml b/applications/dynacell/configs/benchmarks/virtual_staining/mito/fcmae_vscyto3d_scratch/joint_ipsc_confocal_a549_mantis/train.yml new file mode 100644 index 000000000..27bb8d0e3 --- /dev/null +++ b/applications/dynacell/configs/benchmarks/virtual_staining/mito/fcmae_vscyto3d_scratch/joint_ipsc_confocal_a549_mantis/train.yml @@ -0,0 +1,142 @@ +# FCMAE-class (FullyConvolutionalMAE, pretraining=False) random-init +# baseline on mito (TOMM20) — joint 
ipsc_confocal + +# a549_mantis pooled. Scratch control for the pretrained counterpart +# — the two leaves are identical except this one does NOT load +# pretrained encoder weights. Mirrors +# mito/fcmae_vscyto3d_scratch/ipsc_confocal/train.yml on the +# joint train_set. +# +# Joint leaf per Stage 7 of A549_EXPANSION_ROADMAP.md. +# BatchedConcatDataModule + two explicit HCSDataModule children; +# only model_overlays/fcmae_vscyto3d_fit.yml is composed; data +# block inline. +# +# Topology: 4-GPU DDP +# (strategy=ddp_find_unused_parameters_true inherited from +# model_overlays/fcmae_vscyto3d_fit.yml). +base: + - ../../../_internal/shared/model/model_overlays/fcmae_vscyto3d_fit.yml + - ../../../_internal/shared/model/launcher_profiles/mode_fit.yml + - ../../../_internal/shared/model/launcher_profiles/hardware_4gpu.yml + - ../../../_internal/shared/model/launcher_profiles/runtime_shared.yml + +benchmark: + task: virtual_staining + organelle: mito + gene: TOMM20 + target: mito + target_id: mito_tomm20 + train_set: joint_ipsc_confocal_a549_mantis + model_name: fcmae_vscyto3d_scratch + experiment_id: mito__joint_ipsc_confocal_a549_mantis__fcmae_vscyto3d_scratch + +trainer: + logger: + init_args: + name: FCMAE_VSCyto3D_Scratch_JOINT_TOMM20 + save_dir: /hpc/projects/comp.micro/virtual_staining/models/dynacell/joint_ipsc_confocal_a549_mantis/tomm20/fcmae_vscyto3d_scratch + callbacks: + - class_path: lightning.pytorch.callbacks.LearningRateMonitor + init_args: + logging_interval: step + - class_path: lightning.pytorch.callbacks.ModelCheckpoint + init_args: + monitor: loss/validate + every_n_epochs: 1 + save_top_k: 5 + save_last: true + dirpath: /hpc/projects/comp.micro/virtual_staining/models/dynacell/joint_ipsc_confocal_a549_mantis/tomm20/fcmae_vscyto3d_scratch/checkpoints + +_hcs_init_args: &hcs_init_args + source_channel: Phase3D + target_channel: Structure + z_window_size: 20 + # See nucleus/fnet3d_paper/joint_*/train.yml for the rationale: joint + # mode does not divide 
batch_size by num_samples, so 8 * 4 = 32 GPU + # samples per DDP rank matches single-set effective batch. + batch_size: 8 + num_workers: 4 + yx_patch_size: [384, 384] + split_ratio: 0.8 + mmap_preload: true + scratch_dir: /dev/shm + persistent_workers: true + normalizations: + - class_path: viscy_transforms.NormalizeSampled + init_args: + keys: [Phase3D] + level: fov_statistics + subtrahend: mean + divisor: std + - class_path: viscy_transforms.NormalizeSampled + init_args: + keys: [Structure] + level: fov_statistics + subtrahend: median + divisor: iqr + augmentations: + - class_path: viscy_transforms.RandWeightedCropd + init_args: + keys: [Phase3D, Structure] + w_key: Structure + spatial_size: [20, 600, 600] + num_samples: 4 + gpu_augmentations: + - class_path: viscy_transforms.BatchedRandAffined + init_args: + keys: [source, target] + prob: 0.8 + rotate_range: [3.14, 0, 0] + shear_range: [0.0, 0.05, 0.05] + scale_range: [[0.7, 1.3], [0.5, 1.5], [0.5, 1.5]] + - class_path: viscy_transforms.BatchedCenterSpatialCropd + init_args: + keys: [source, target] + roi_size: [15, 384, 384] + - class_path: viscy_transforms.BatchedRandAdjustContrastd + init_args: + keys: [source] + prob: 0.5 + gamma: [0.8, 1.2] + - class_path: viscy_transforms.BatchedRandScaleIntensityd + init_args: + keys: [source] + prob: 0.5 + factors: 0.5 + - class_path: viscy_transforms.BatchedRandGaussianNoised + init_args: + keys: [source] + prob: 0.5 + mean: 0.0 + std: 0.3 + - class_path: viscy_transforms.BatchedRandGaussianSmoothd + init_args: + keys: [source] + prob: 0.5 + sigma_x: [0.25, 0.75] + sigma_y: [0.25, 0.75] + sigma_z: [0.25, 0.75] + val_gpu_augmentations: + - class_path: viscy_transforms.BatchedCenterSpatialCropd + init_args: + keys: [source, target] + roi_size: [15, 384, 384] + +data: + class_path: viscy_data.BatchedConcatDataModule + init_args: + data_modules: + # ipsc_confocal — aics-hipsc TOMM20 train store + - class_path: viscy_data.hcs.HCSDataModule + init_args: + <<: *hcs_init_args + 
data_path: /hpc/projects/virtual_staining/training/dynacell/ipsc/dataset_v4/train/TOMM20.zarr + # a549_mantis — pooled TOMM20 all-conditions train store + - class_path: viscy_data.hcs.HCSDataModule + init_args: + <<: *hcs_init_args + data_path: /hpc/projects/virtual_staining/training/dynacell/a549/mantis_v1/train/TOMM20_all.zarr + +launcher: + job_name: FCMAE_VSCyto3D_Scratch_JOINT_TOMM20 + run_root: /hpc/projects/comp.micro/virtual_staining/models/dynacell/joint_ipsc_confocal_a549_mantis/tomm20/fcmae_vscyto3d_scratch diff --git a/applications/dynacell/configs/benchmarks/virtual_staining/mito/fnet3d_paper/a549_mantis/predict__a549_mantis_denv.yml b/applications/dynacell/configs/benchmarks/virtual_staining/mito/fnet3d_paper/a549_mantis/predict__a549_mantis_denv.yml new file mode 100644 index 000000000..d243585ea --- /dev/null +++ b/applications/dynacell/configs/benchmarks/virtual_staining/mito/fnet3d_paper/a549_mantis/predict__a549_mantis_denv.yml @@ -0,0 +1,46 @@ +# FNet3D paper-baseline predict: mito trained on a549_mantis (tomm20), +# predicting against a549-mantis-tomm20-denv test. +# Best val-loss checkpoint from job 31965119 (epoch 248, val 0.8291). See +# predict__ipsc_confocal.yml in this dir for full provenance. +# Both iPSC and a549 manifests use `tomm20`; targets/mito_tomm20.yml handles +# both natively, no dataset_ref override needed. 
+base: + - ../../../_internal/shared/model/predict_sets/a549_mantis_tomm20_denv.yml + - ../../../_internal/shared/model/targets/mito_tomm20.yml + - ../../../_internal/shared/model/model_overlays/fnet3d_paper_predict.yml + - ../../../_internal/shared/model/launcher_profiles/mode_predict.yml + - ../../../_internal/shared/model/launcher_profiles/hardware_h200_single.yml + - ../../../_internal/shared/model/launcher_profiles/runtime_shared.yml + +benchmark: + task: virtual_staining + organelle: mito + trained_on: a549_mantis + predict_set: a549_mantis_tomm20_denv + model_name: fnet3d_paper + experiment_id: mito__a549_mantis__fnet3d_paper__a549_mantis_tomm20_denv + +model: + init_args: + ckpt_path: /hpc/projects/comp.micro/virtual_staining/models/dynacell/a549_mantis/tomm20/fnet3d_paper/checkpoints/epoch=248-step=126990.ckpt + +data: + init_args: + normalizations: + - class_path: viscy_transforms.NormalizeSampled + init_args: + keys: [Phase3D] + level: fov_statistics + subtrahend: mean + divisor: std + augmentations: [] + +trainer: + callbacks: + - class_path: viscy_utils.callbacks.prediction_writer.HCSPredictionWriter + init_args: + output_store: /hpc/projects/virtual_staining/training/dynacell/a549/predictions/tomm20_fnet3d_paper_a549trained_denv.zarr + +launcher: + job_name: FNet3DPaper_PRED_TOMM20_A549TR_DENV + run_root: /hpc/projects/virtual_staining/training/dynacell/a549/predictions diff --git a/applications/dynacell/configs/benchmarks/virtual_staining/mito/fnet3d_paper/a549_mantis/predict__a549_mantis_mock.yml b/applications/dynacell/configs/benchmarks/virtual_staining/mito/fnet3d_paper/a549_mantis/predict__a549_mantis_mock.yml new file mode 100644 index 000000000..e7a5655ba --- /dev/null +++ b/applications/dynacell/configs/benchmarks/virtual_staining/mito/fnet3d_paper/a549_mantis/predict__a549_mantis_mock.yml @@ -0,0 +1,46 @@ +# FNet3D paper-baseline predict: mito trained on a549_mantis (tomm20), +# predicting against a549-mantis-tomm20-mock test. 
+# Best val-loss checkpoint from job 31965119 (epoch 248, val 0.8291). See +# predict__ipsc_confocal.yml in this dir for full provenance. +# Both iPSC and a549 manifests use `tomm20`; targets/mito_tomm20.yml handles +# both natively, no dataset_ref override needed. +base: + - ../../../_internal/shared/model/predict_sets/a549_mantis_tomm20_mock.yml + - ../../../_internal/shared/model/targets/mito_tomm20.yml + - ../../../_internal/shared/model/model_overlays/fnet3d_paper_predict.yml + - ../../../_internal/shared/model/launcher_profiles/mode_predict.yml + - ../../../_internal/shared/model/launcher_profiles/hardware_h200_single.yml + - ../../../_internal/shared/model/launcher_profiles/runtime_shared.yml + +benchmark: + task: virtual_staining + organelle: mito + trained_on: a549_mantis + predict_set: a549_mantis_tomm20_mock + model_name: fnet3d_paper + experiment_id: mito__a549_mantis__fnet3d_paper__a549_mantis_tomm20_mock + +model: + init_args: + ckpt_path: /hpc/projects/comp.micro/virtual_staining/models/dynacell/a549_mantis/tomm20/fnet3d_paper/checkpoints/epoch=248-step=126990.ckpt + +data: + init_args: + normalizations: + - class_path: viscy_transforms.NormalizeSampled + init_args: + keys: [Phase3D] + level: fov_statistics + subtrahend: mean + divisor: std + augmentations: [] + +trainer: + callbacks: + - class_path: viscy_utils.callbacks.prediction_writer.HCSPredictionWriter + init_args: + output_store: /hpc/projects/virtual_staining/training/dynacell/a549/predictions/tomm20_fnet3d_paper_a549trained_mock.zarr + +launcher: + job_name: FNet3DPaper_PRED_TOMM20_A549TR_MOCK + run_root: /hpc/projects/virtual_staining/training/dynacell/a549/predictions diff --git a/applications/dynacell/configs/benchmarks/virtual_staining/mito/fnet3d_paper/a549_mantis/predict__a549_mantis_zikv.yml b/applications/dynacell/configs/benchmarks/virtual_staining/mito/fnet3d_paper/a549_mantis/predict__a549_mantis_zikv.yml new file mode 100644 index 000000000..071eae92a --- /dev/null +++ 
b/applications/dynacell/configs/benchmarks/virtual_staining/mito/fnet3d_paper/a549_mantis/predict__a549_mantis_zikv.yml @@ -0,0 +1,46 @@ +# FNet3D paper-baseline predict: mito trained on a549_mantis (tomm20), +# predicting against a549-mantis-tomm20-zikv test. +# Best val-loss checkpoint from job 31965119 (epoch 248, val 0.8291). See +# predict__ipsc_confocal.yml in this dir for full provenance. +# Both iPSC and a549 manifests use `tomm20`; targets/mito_tomm20.yml handles +# both natively, no dataset_ref override needed. +base: + - ../../../_internal/shared/model/predict_sets/a549_mantis_tomm20_zikv.yml + - ../../../_internal/shared/model/targets/mito_tomm20.yml + - ../../../_internal/shared/model/model_overlays/fnet3d_paper_predict.yml + - ../../../_internal/shared/model/launcher_profiles/mode_predict.yml + - ../../../_internal/shared/model/launcher_profiles/hardware_h200_single.yml + - ../../../_internal/shared/model/launcher_profiles/runtime_shared.yml + +benchmark: + task: virtual_staining + organelle: mito + trained_on: a549_mantis + predict_set: a549_mantis_tomm20_zikv + model_name: fnet3d_paper + experiment_id: mito__a549_mantis__fnet3d_paper__a549_mantis_tomm20_zikv + +model: + init_args: + ckpt_path: /hpc/projects/comp.micro/virtual_staining/models/dynacell/a549_mantis/tomm20/fnet3d_paper/checkpoints/epoch=248-step=126990.ckpt + +data: + init_args: + normalizations: + - class_path: viscy_transforms.NormalizeSampled + init_args: + keys: [Phase3D] + level: fov_statistics + subtrahend: mean + divisor: std + augmentations: [] + +trainer: + callbacks: + - class_path: viscy_utils.callbacks.prediction_writer.HCSPredictionWriter + init_args: + output_store: /hpc/projects/virtual_staining/training/dynacell/a549/predictions/tomm20_fnet3d_paper_a549trained_zikv.zarr + +launcher: + job_name: FNet3DPaper_PRED_TOMM20_A549TR_ZIKV + run_root: /hpc/projects/virtual_staining/training/dynacell/a549/predictions diff --git 
a/applications/dynacell/configs/benchmarks/virtual_staining/mito/fnet3d_paper/a549_mantis/predict__ipsc_confocal.yml b/applications/dynacell/configs/benchmarks/virtual_staining/mito/fnet3d_paper/a549_mantis/predict__ipsc_confocal.yml new file mode 100644 index 000000000..601ea814f --- /dev/null +++ b/applications/dynacell/configs/benchmarks/virtual_staining/mito/fnet3d_paper/a549_mantis/predict__ipsc_confocal.yml @@ -0,0 +1,47 @@ +# FNet3D paper-baseline predict: mito trained on a549_mantis (tomm20), +# predicting against ipsc_confocal test_cropped. +# Best val-loss checkpoint from job 31965119 (epoch 248, val 0.8291). +# Wandb run 20260503-193857_FNet3D_A549_TOMM20_paper (state=finished, +# 392 ep / 199,999 steps; final val 0.9493 — drifted up from ep248 best). +# Both iPSC and a549 manifests use `tomm20`; targets/mito_tomm20.yml handles +# both natively, no dataset_ref override needed. +base: + - ../../../_internal/shared/model/predict_sets/ipsc_confocal.yml + - ../../../_internal/shared/model/targets/mito_tomm20.yml + - ../../../_internal/shared/model/model_overlays/fnet3d_paper_predict.yml + - ../../../_internal/shared/model/launcher_profiles/mode_predict.yml + - ../../../_internal/shared/model/launcher_profiles/hardware_h200_single.yml + - ../../../_internal/shared/model/launcher_profiles/runtime_shared.yml + +benchmark: + task: virtual_staining + organelle: mito + trained_on: a549_mantis + predict_set: ipsc_confocal + model_name: fnet3d_paper + experiment_id: mito__a549_mantis__fnet3d_paper__ipsc_confocal + +model: + init_args: + ckpt_path: /hpc/projects/comp.micro/virtual_staining/models/dynacell/a549_mantis/tomm20/fnet3d_paper/checkpoints/epoch=248-step=126990.ckpt + +data: + init_args: + normalizations: + - class_path: viscy_transforms.NormalizeSampled + init_args: + keys: [Phase3D] + level: fov_statistics + subtrahend: mean + divisor: std + augmentations: [] + +trainer: + callbacks: + - class_path: 
viscy_utils.callbacks.prediction_writer.HCSPredictionWriter + init_args: + output_store: /hpc/projects/virtual_staining/training/dynacell/ipsc/predictions/tomm20_fnet3d_paper_a549trained.zarr + +launcher: + job_name: FNet3DPaper_PRED_TOMM20_A549TR_IPSC + run_root: /hpc/projects/virtual_staining/training/dynacell/ipsc/predictions diff --git a/applications/dynacell/configs/benchmarks/virtual_staining/mito/fnet3d_paper/a549_mantis/train.yml b/applications/dynacell/configs/benchmarks/virtual_staining/mito/fnet3d_paper/a549_mantis/train.yml new file mode 100644 index 000000000..9750e6613 --- /dev/null +++ b/applications/dynacell/configs/benchmarks/virtual_staining/mito/fnet3d_paper/a549_mantis/train.yml @@ -0,0 +1,50 @@ +# FNet3D paper-baseline fit on mitochondria (TOMM20 marker) — A549 mantis-lightsheet pooled (mock + DENV + ZIKV). +# target_channel=Structure, so the overlay's default norms/augs apply unchanged. +base: + - ../../../_internal/shared/model/train_sets/a549_mantis.yml + - ../../../_internal/shared/model/targets/mito_tomm20.yml + - ../../../_internal/shared/model/data_overlays/fnet3d_paper_fit.yml + - ../../../_internal/shared/model/model_overlays/fnet3d_paper_fit.yml + - ../../../_internal/shared/model/launcher_profiles/mode_fit.yml + - ../../../_internal/shared/model/launcher_profiles/hardware_gpu_any_long.yml + - ../../../_internal/shared/model/launcher_profiles/runtime_shared.yml + +benchmark: + task: virtual_staining + organelle: mito + train_set: a549_mantis + model_name: fnet3d_paper + experiment_id: mito__a549_mantis__fnet3d_paper + +trainer: + logger: + init_args: + name: FNet3D_A549_TOMM20_paper + save_dir: /hpc/projects/comp.micro/virtual_staining/models/dynacell/a549_mantis/tomm20/fnet3d_paper + callbacks: + - class_path: lightning.pytorch.callbacks.LearningRateMonitor + init_args: + logging_interval: step + - class_path: lightning.pytorch.callbacks.ModelCheckpoint + init_args: + monitor: loss/validate + every_n_epochs: 1 + save_top_k: 4 + 
save_last: true + dirpath: /hpc/projects/comp.micro/virtual_staining/models/dynacell/a549_mantis/tomm20/fnet3d_paper/checkpoints + +data: + init_args: + # A549 pooled store + target_channel — no resolver in this train_set. + target_channel: Structure + data_path: /hpc/projects/virtual_staining/training/dynacell/a549/mantis_v1/train/TOMM20_all.zarr + +launcher: + job_name: FNet3DPaper_A549_TOMM20 + run_root: /hpc/projects/comp.micro/virtual_staining/models/dynacell/a549_mantis/tomm20/fnet3d_paper + # 512G to match the shared headroom convention across the fnet3d + # leaves on a549/joint workloads. mmap_preload after the BasicIndexer + # fix peaks at ~75 GB for TOMM20_all alone (single-set); 512G gives + # generous headroom. + sbatch: + mem: "512G" diff --git a/applications/dynacell/configs/benchmarks/virtual_staining/mito/fnet3d_paper/ipsc_confocal/eval__a549_mantis_denv.yaml b/applications/dynacell/configs/benchmarks/virtual_staining/mito/fnet3d_paper/ipsc_confocal/eval__a549_mantis_denv.yaml new file mode 100644 index 000000000..bb78bad96 --- /dev/null +++ b/applications/dynacell/configs/benchmarks/virtual_staining/mito/fnet3d_paper/ipsc_confocal/eval__a549_mantis_denv.yaml @@ -0,0 +1,15 @@ +# @package _global_ +# Benchmark eval leaf: Mitochondria (TOMM20) predicted by FNet3DPaper on a549-mantis-tomm20-denv. +defaults: + - override /target: mito_tomm20 + - override /predict_set: a549_mantis_tomm20_denv + +io: + pred_path: /hpc/projects/virtual_staining/training/dynacell/a549/predictions/tomm20_fnet3d_paper__tomm20_denv.zarr + +# A549 manifests don't carry cell_segmentation paths (no segmentation +# pipeline yet). Skip feature metrics until segmentation lands. 
+compute_feature_metrics: false + +save: + save_dir: /hpc/projects/virtual_staining/training/dynacell/a549/predictions/eval_tomm20_fnet3d_paper__tomm20_denv diff --git a/applications/dynacell/configs/benchmarks/virtual_staining/mito/fnet3d_paper/ipsc_confocal/eval__a549_mantis_mock.yaml b/applications/dynacell/configs/benchmarks/virtual_staining/mito/fnet3d_paper/ipsc_confocal/eval__a549_mantis_mock.yaml new file mode 100644 index 000000000..ada4d35a2 --- /dev/null +++ b/applications/dynacell/configs/benchmarks/virtual_staining/mito/fnet3d_paper/ipsc_confocal/eval__a549_mantis_mock.yaml @@ -0,0 +1,15 @@ +# @package _global_ +# Benchmark eval leaf: Mitochondria (TOMM20) predicted by FNet3DPaper on a549-mantis-tomm20-mock. +defaults: + - override /target: mito_tomm20 + - override /predict_set: a549_mantis_tomm20_mock + +io: + pred_path: /hpc/projects/virtual_staining/training/dynacell/a549/predictions/tomm20_fnet3d_paper__tomm20_mock.zarr + +# A549 manifests don't carry cell_segmentation paths (no segmentation +# pipeline yet). Skip feature metrics until segmentation lands. +compute_feature_metrics: false + +save: + save_dir: /hpc/projects/virtual_staining/training/dynacell/a549/predictions/eval_tomm20_fnet3d_paper__tomm20_mock diff --git a/applications/dynacell/configs/benchmarks/virtual_staining/mito/fnet3d_paper/ipsc_confocal/eval__a549_mantis_zikv.yaml b/applications/dynacell/configs/benchmarks/virtual_staining/mito/fnet3d_paper/ipsc_confocal/eval__a549_mantis_zikv.yaml new file mode 100644 index 000000000..b8cab5cdc --- /dev/null +++ b/applications/dynacell/configs/benchmarks/virtual_staining/mito/fnet3d_paper/ipsc_confocal/eval__a549_mantis_zikv.yaml @@ -0,0 +1,15 @@ +# @package _global_ +# Benchmark eval leaf: Mitochondria (TOMM20) predicted by FNet3DPaper on a549-mantis-tomm20-zikv. 
+defaults: + - override /target: mito_tomm20 + - override /predict_set: a549_mantis_tomm20_zikv + +io: + pred_path: /hpc/projects/virtual_staining/training/dynacell/a549/predictions/tomm20_fnet3d_paper__tomm20_zikv.zarr + +# A549 manifests don't carry cell_segmentation paths (no segmentation +# pipeline yet). Skip feature metrics until segmentation lands. +compute_feature_metrics: false + +save: + save_dir: /hpc/projects/virtual_staining/training/dynacell/a549/predictions/eval_tomm20_fnet3d_paper__tomm20_zikv diff --git a/applications/dynacell/configs/benchmarks/virtual_staining/mito/fnet3d_paper/ipsc_confocal/predict__a549_mantis_denv.yml b/applications/dynacell/configs/benchmarks/virtual_staining/mito/fnet3d_paper/ipsc_confocal/predict__a549_mantis_denv.yml new file mode 100644 index 000000000..eb89eaf39 --- /dev/null +++ b/applications/dynacell/configs/benchmarks/virtual_staining/mito/fnet3d_paper/ipsc_confocal/predict__a549_mantis_denv.yml @@ -0,0 +1,42 @@ +# FNet3D paper-baseline predict: mito (TOMM20) trained on iPSC, predicting against a549_mantis_tomm20_denv test. +# Same iPSC best val-loss checkpoint as predict__ipsc_confocal.yml (epoch 215, loss/validate=0.7571). 
+base: + - ../../../_internal/shared/model/predict_sets/a549_mantis_tomm20_denv.yml + - ../../../_internal/shared/model/targets/mito_tomm20.yml + - ../../../_internal/shared/model/model_overlays/fnet3d_paper_predict.yml + - ../../../_internal/shared/model/launcher_profiles/mode_predict.yml + - ../../../_internal/shared/model/launcher_profiles/hardware_h200_single.yml + - ../../../_internal/shared/model/launcher_profiles/runtime_shared.yml + +benchmark: + task: virtual_staining + organelle: mito + trained_on: ipsc_confocal + predict_set: a549_mantis_tomm20_denv + model_name: fnet3d_paper + experiment_id: mito__ipsc_confocal__fnet3d_paper__a549_mantis_tomm20_denv + +model: + init_args: + ckpt_path: /hpc/projects/comp.micro/virtual_staining/models/dynacell/ipsc/tomm20/fnet3d_paper/checkpoints/epoch=215-step=187056.ckpt + +data: + init_args: + normalizations: + - class_path: viscy_transforms.NormalizeSampled + init_args: + keys: [Phase3D] + level: fov_statistics + subtrahend: mean + divisor: std + augmentations: [] + +trainer: + callbacks: + - class_path: viscy_utils.callbacks.prediction_writer.HCSPredictionWriter + init_args: + output_store: /hpc/projects/virtual_staining/training/dynacell/a549/predictions/tomm20_fnet3d_paper__tomm20_denv.zarr + +launcher: + job_name: FNet3DPaper_PRED_TOMM20_ON_A549_tomm20_denv + run_root: /hpc/projects/virtual_staining/training/dynacell/a549/predictions diff --git a/applications/dynacell/configs/benchmarks/virtual_staining/mito/fnet3d_paper/ipsc_confocal/predict__a549_mantis_mock.yml b/applications/dynacell/configs/benchmarks/virtual_staining/mito/fnet3d_paper/ipsc_confocal/predict__a549_mantis_mock.yml new file mode 100644 index 000000000..500f4b8d1 --- /dev/null +++ b/applications/dynacell/configs/benchmarks/virtual_staining/mito/fnet3d_paper/ipsc_confocal/predict__a549_mantis_mock.yml @@ -0,0 +1,42 @@ +# FNet3D paper-baseline predict: mito (TOMM20) trained on iPSC, predicting against a549_mantis_tomm20_mock test. 
+# Same iPSC best val-loss checkpoint as predict__ipsc_confocal.yml (epoch 215, loss/validate=0.7571). +base: + - ../../../_internal/shared/model/predict_sets/a549_mantis_tomm20_mock.yml + - ../../../_internal/shared/model/targets/mito_tomm20.yml + - ../../../_internal/shared/model/model_overlays/fnet3d_paper_predict.yml + - ../../../_internal/shared/model/launcher_profiles/mode_predict.yml + - ../../../_internal/shared/model/launcher_profiles/hardware_h200_single.yml + - ../../../_internal/shared/model/launcher_profiles/runtime_shared.yml + +benchmark: + task: virtual_staining + organelle: mito + trained_on: ipsc_confocal + predict_set: a549_mantis_tomm20_mock + model_name: fnet3d_paper + experiment_id: mito__ipsc_confocal__fnet3d_paper__a549_mantis_tomm20_mock + +model: + init_args: + ckpt_path: /hpc/projects/comp.micro/virtual_staining/models/dynacell/ipsc/tomm20/fnet3d_paper/checkpoints/epoch=215-step=187056.ckpt + +data: + init_args: + normalizations: + - class_path: viscy_transforms.NormalizeSampled + init_args: + keys: [Phase3D] + level: fov_statistics + subtrahend: mean + divisor: std + augmentations: [] + +trainer: + callbacks: + - class_path: viscy_utils.callbacks.prediction_writer.HCSPredictionWriter + init_args: + output_store: /hpc/projects/virtual_staining/training/dynacell/a549/predictions/tomm20_fnet3d_paper__tomm20_mock.zarr + +launcher: + job_name: FNet3DPaper_PRED_TOMM20_ON_A549_tomm20_mock + run_root: /hpc/projects/virtual_staining/training/dynacell/a549/predictions diff --git a/applications/dynacell/configs/benchmarks/virtual_staining/mito/fnet3d_paper/ipsc_confocal/predict__a549_mantis_zikv.yml b/applications/dynacell/configs/benchmarks/virtual_staining/mito/fnet3d_paper/ipsc_confocal/predict__a549_mantis_zikv.yml new file mode 100644 index 000000000..9c68fe164 --- /dev/null +++ b/applications/dynacell/configs/benchmarks/virtual_staining/mito/fnet3d_paper/ipsc_confocal/predict__a549_mantis_zikv.yml @@ -0,0 +1,42 @@ +# FNet3D paper-baseline 
predict: mito (TOMM20) trained on iPSC, predicting against a549_mantis_tomm20_zikv test. +# Same iPSC best val-loss checkpoint as predict__ipsc_confocal.yml (epoch 215, loss/validate=0.7571). +base: + - ../../../_internal/shared/model/predict_sets/a549_mantis_tomm20_zikv.yml + - ../../../_internal/shared/model/targets/mito_tomm20.yml + - ../../../_internal/shared/model/model_overlays/fnet3d_paper_predict.yml + - ../../../_internal/shared/model/launcher_profiles/mode_predict.yml + - ../../../_internal/shared/model/launcher_profiles/hardware_h200_single.yml + - ../../../_internal/shared/model/launcher_profiles/runtime_shared.yml + +benchmark: + task: virtual_staining + organelle: mito + trained_on: ipsc_confocal + predict_set: a549_mantis_tomm20_zikv + model_name: fnet3d_paper + experiment_id: mito__ipsc_confocal__fnet3d_paper__a549_mantis_tomm20_zikv + +model: + init_args: + ckpt_path: /hpc/projects/comp.micro/virtual_staining/models/dynacell/ipsc/tomm20/fnet3d_paper/checkpoints/epoch=215-step=187056.ckpt + +data: + init_args: + normalizations: + - class_path: viscy_transforms.NormalizeSampled + init_args: + keys: [Phase3D] + level: fov_statistics + subtrahend: mean + divisor: std + augmentations: [] + +trainer: + callbacks: + - class_path: viscy_utils.callbacks.prediction_writer.HCSPredictionWriter + init_args: + output_store: /hpc/projects/virtual_staining/training/dynacell/a549/predictions/tomm20_fnet3d_paper__tomm20_zikv.zarr + +launcher: + job_name: FNet3DPaper_PRED_TOMM20_ON_A549_tomm20_zikv + run_root: /hpc/projects/virtual_staining/training/dynacell/a549/predictions diff --git a/applications/dynacell/configs/benchmarks/virtual_staining/mito/fnet3d_paper/ipsc_confocal/predict__ipsc_confocal.yml b/applications/dynacell/configs/benchmarks/virtual_staining/mito/fnet3d_paper/ipsc_confocal/predict__ipsc_confocal.yml new file mode 100644 index 000000000..1491202bd --- /dev/null +++ 
b/applications/dynacell/configs/benchmarks/virtual_staining/mito/fnet3d_paper/ipsc_confocal/predict__ipsc_confocal.yml @@ -0,0 +1,42 @@ +# FNet3D paper-baseline predict: mito (TOMM20) against ipsc_confocal test_cropped. +# Uses best val-loss checkpoint (epoch 215, loss/validate=0.7571). +base: + - ../../../_internal/shared/model/predict_sets/ipsc_confocal.yml + - ../../../_internal/shared/model/targets/mito_tomm20.yml + - ../../../_internal/shared/model/model_overlays/fnet3d_paper_predict.yml + - ../../../_internal/shared/model/launcher_profiles/mode_predict.yml + - ../../../_internal/shared/model/launcher_profiles/hardware_h200_single.yml + - ../../../_internal/shared/model/launcher_profiles/runtime_shared.yml + +benchmark: + task: virtual_staining + organelle: mito + trained_on: ipsc_confocal + predict_set: ipsc_confocal + model_name: fnet3d_paper + experiment_id: mito__ipsc_confocal__fnet3d_paper__ipsc_confocal + +model: + init_args: + ckpt_path: /hpc/projects/comp.micro/virtual_staining/models/dynacell/ipsc/tomm20/fnet3d_paper/checkpoints/epoch=215-step=187056.ckpt + +data: + init_args: + normalizations: + - class_path: viscy_transforms.NormalizeSampled + init_args: + keys: [Phase3D] + level: fov_statistics + subtrahend: mean + divisor: std + augmentations: [] + +trainer: + callbacks: + - class_path: viscy_utils.callbacks.prediction_writer.HCSPredictionWriter + init_args: + output_store: /hpc/projects/virtual_staining/training/dynacell/ipsc/predictions/tomm20_fnet3d_paper.zarr + +launcher: + job_name: FNet3DPaper_PRED_TOMM20 + run_root: /hpc/projects/virtual_staining/training/dynacell/ipsc/predictions diff --git a/applications/dynacell/configs/benchmarks/virtual_staining/mito/fnet3d_paper/ipsc_confocal/train.yml b/applications/dynacell/configs/benchmarks/virtual_staining/mito/fnet3d_paper/ipsc_confocal/train.yml new file mode 100644 index 000000000..db67cfe4c --- /dev/null +++ 
b/applications/dynacell/configs/benchmarks/virtual_staining/mito/fnet3d_paper/ipsc_confocal/train.yml @@ -0,0 +1,38 @@ +# FNet3D paper-baseline fit on mitochondria (TOMM20 marker) — AICS iPSC confocal. +# target_channel=Structure, so the overlay's default norms/augs apply unchanged. +base: + - ../../../_internal/shared/model/train_sets/ipsc_confocal.yml + - ../../../_internal/shared/model/targets/mito_tomm20.yml + - ../../../_internal/shared/model/data_overlays/fnet3d_paper_fit.yml + - ../../../_internal/shared/model/model_overlays/fnet3d_paper_fit.yml + - ../../../_internal/shared/model/launcher_profiles/mode_fit.yml + - ../../../_internal/shared/model/launcher_profiles/hardware_gpu_any_long.yml + - ../../../_internal/shared/model/launcher_profiles/runtime_shared.yml + +benchmark: + task: virtual_staining + organelle: mito + train_set: ipsc_confocal + model_name: fnet3d_paper + experiment_id: mito__ipsc_confocal__fnet3d_paper + +trainer: + logger: + init_args: + name: FNet3D_iPSC_TOMM20_paper + save_dir: /hpc/projects/comp.micro/virtual_staining/models/dynacell/ipsc/tomm20/fnet3d_paper + callbacks: + - class_path: lightning.pytorch.callbacks.LearningRateMonitor + init_args: + logging_interval: step + - class_path: lightning.pytorch.callbacks.ModelCheckpoint + init_args: + monitor: loss/validate + every_n_epochs: 1 + save_top_k: 4 + save_last: true + dirpath: /hpc/projects/comp.micro/virtual_staining/models/dynacell/ipsc/tomm20/fnet3d_paper/checkpoints + +launcher: + job_name: FNet3DPaper_TOMM20 + run_root: /hpc/projects/comp.micro/virtual_staining/models/dynacell/ipsc/tomm20/fnet3d_paper diff --git a/applications/dynacell/configs/benchmarks/virtual_staining/mito/fnet3d_paper/joint_ipsc_confocal_a549_mantis/train.yml b/applications/dynacell/configs/benchmarks/virtual_staining/mito/fnet3d_paper/joint_ipsc_confocal_a549_mantis/train.yml new file mode 100644 index 000000000..a4fa68f83 --- /dev/null +++ 
b/applications/dynacell/configs/benchmarks/virtual_staining/mito/fnet3d_paper/joint_ipsc_confocal_a549_mantis/train.yml @@ -0,0 +1,123 @@ +# FNet3D paper-baseline fit on mito (TOMM20) — joint +# ipsc_confocal + a549_mantis pooled. Mirrors +# mito/fnet3d_paper/ipsc_confocal/train.yml on the joint +# train_set. +# +# Joint leaf per Stage 7 of A549_EXPANSION_ROADMAP.md. +# BatchedConcatDataModule + two explicit HCSDataModule children; +# only model_overlays/fnet3d_paper_fit.yml is composed; data block +# inline. Norms + 8-crops-per-FOV diverge from the CellDiff/UNetViT +# conventions: target channel uses mean/std (not median/iqr) and +# val augmentations are CPU CenterSpatialCropd on the raw keys (the +# baseline's training pipeline doesn't go through GPU val transforms). +# +# Topology: single GPU, any model, long wall — same as +# fnet3d_paper/ipsc_confocal/train.yml. The paper baseline is single-GPU +# and we keep that here so iPSC-only and joint runs are apples-to-apples. +base: + - ../../../_internal/shared/model/model_overlays/fnet3d_paper_fit.yml + - ../../../_internal/shared/model/launcher_profiles/mode_fit.yml + - ../../../_internal/shared/model/launcher_profiles/hardware_gpu_any_long.yml + - ../../../_internal/shared/model/launcher_profiles/runtime_shared.yml + +benchmark: + task: virtual_staining + organelle: mito + gene: TOMM20 + target: mito + target_id: mito_tomm20 + train_set: joint_ipsc_confocal_a549_mantis + model_name: fnet3d_paper + experiment_id: mito__joint_ipsc_confocal_a549_mantis__fnet3d_paper + +trainer: + logger: + init_args: + name: FNet3D_JOINT_TOMM20_paper + save_dir: /hpc/projects/comp.micro/virtual_staining/models/dynacell/joint_ipsc_confocal_a549_mantis/tomm20/fnet3d_paper + callbacks: + - class_path: lightning.pytorch.callbacks.LearningRateMonitor + init_args: + logging_interval: step + - class_path: lightning.pytorch.callbacks.ModelCheckpoint + init_args: + monitor: loss/validate + every_n_epochs: 1 + save_top_k: 4 + save_last: true + 
dirpath: /hpc/projects/comp.micro/virtual_staining/models/dynacell/joint_ipsc_confocal_a549_mantis/tomm20/fnet3d_paper/checkpoints + +_hcs_init_args: &hcs_init_args + source_channel: Phase3D + target_channel: Structure + z_window_size: 32 + # See nucleus/fnet3d_paper/joint_*/train.yml for the rationale: joint + # mode does not divide batch_size by num_samples (unlike single-set), + # so 6 * num_samples=8 = 48 GPU samples matches single-set effective. + batch_size: 6 + num_workers: 8 + yx_patch_size: [64, 64] + split_ratio: 0.8 + mmap_preload: true + scratch_dir: /dev/shm + persistent_workers: true + normalizations: + - class_path: viscy_transforms.NormalizeSampled + init_args: + keys: [Phase3D] + level: fov_statistics + subtrahend: mean + divisor: std + - class_path: viscy_transforms.NormalizeSampled + init_args: + keys: [Structure] + level: fov_statistics + subtrahend: mean + divisor: std + augmentations: + - class_path: viscy_transforms.RandWeightedCropd + init_args: + keys: [Phase3D, Structure] + w_key: Structure + spatial_size: [32, 64, 64] + num_samples: 8 + gpu_augmentations: + - class_path: viscy_transforms.BatchedRandFlipd + init_args: + keys: [source, target] + spatial_axes: [1] + prob: 0.5 + - class_path: viscy_transforms.BatchedRandFlipd + init_args: + keys: [source, target] + spatial_axes: [2] + prob: 0.5 + val_augmentations: + - class_path: viscy_transforms.CenterSpatialCropd + init_args: + keys: [Phase3D, Structure] + roi_size: [32, 64, 64] + +data: + class_path: viscy_data.BatchedConcatDataModule + init_args: + data_modules: + # ipsc_confocal — aics-hipsc TOMM20 train store + - class_path: viscy_data.hcs.HCSDataModule + init_args: + <<: *hcs_init_args + data_path: /hpc/projects/virtual_staining/training/dynacell/ipsc/dataset_v4/train/TOMM20.zarr + # a549_mantis — pooled TOMM20 all-conditions train store + - class_path: viscy_data.hcs.HCSDataModule + init_args: + <<: *hcs_init_args + data_path: 
/hpc/projects/virtual_staining/training/dynacell/a549/mantis_v1/train/TOMM20_all.zarr + +launcher: + job_name: FNet3DPaper_JOINT_TOMM20 + run_root: /hpc/projects/comp.micro/virtual_staining/models/dynacell/joint_ipsc_confocal_a549_mantis/tomm20/fnet3d_paper + # Joint preloads two stores (iPSC + A549 pool) into /dev/shm; the default + # 256G cap is too tight (256G iPSC mem + ~50G A549 + worker peak OOMs). + # 512G is the smallest tier that fits joint preload + worker overhead. + sbatch: + mem: "512G" diff --git a/applications/dynacell/configs/benchmarks/virtual_staining/mito/unetvit3d/a549_mantis/train.yml b/applications/dynacell/configs/benchmarks/virtual_staining/mito/unetvit3d/a549_mantis/train.yml new file mode 100644 index 000000000..4b0dfed93 --- /dev/null +++ b/applications/dynacell/configs/benchmarks/virtual_staining/mito/unetvit3d/a549_mantis/train.yml @@ -0,0 +1,43 @@ +# UNetViT3D fit on mitochondria (TOMM20 marker) — A549 mantis-lightsheet pooled (mock + DENV + ZIKV). +base: + - ../../../_internal/shared/model/train_sets/a549_mantis.yml + - ../../../_internal/shared/model/targets/mito_tomm20.yml + - ../../../_internal/shared/model/data_overlays/unetvit3d_fit.yml + - ../../../_internal/shared/model/model_overlays/unetvit3d_fit.yml + - ../../../_internal/shared/model/launcher_profiles/mode_fit.yml + - ../../../_internal/shared/model/launcher_profiles/hardware_h200_single.yml + - ../../../_internal/shared/model/launcher_profiles/runtime_shared.yml + +benchmark: + task: virtual_staining + organelle: mito + train_set: a549_mantis + model_name: unetvit3d + experiment_id: mito__a549_mantis__unetvit3d + +trainer: + logger: + init_args: + name: UNetViT3D_A549_TOMM20 + save_dir: /hpc/projects/comp.micro/virtual_staining/models/cell_diff_vs_viscy/a549_mantis/tomm20/unetvit3d + callbacks: + - class_path: lightning.pytorch.callbacks.LearningRateMonitor + init_args: + logging_interval: step + - class_path: lightning.pytorch.callbacks.ModelCheckpoint + init_args: + 
monitor: loss/validate + every_n_epochs: 1 + save_top_k: 4 + save_last: true + dirpath: /hpc/projects/comp.micro/virtual_staining/models/cell_diff_vs_viscy/a549_mantis/tomm20/unetvit3d/checkpoints + +data: + init_args: + # A549 pooled store + target_channel — no resolver in this train_set. + target_channel: Structure + data_path: /hpc/projects/virtual_staining/training/dynacell/a549/mantis_v1/train/TOMM20_all.zarr + +launcher: + job_name: UNetViT3D_A549_TOMM20 + run_root: /hpc/projects/comp.micro/virtual_staining/models/cell_diff_vs_viscy/a549_mantis/tomm20/unetvit3d diff --git a/applications/dynacell/configs/benchmarks/virtual_staining/mito/unetvit3d/ipsc_confocal/eval__a549_mantis_denv.yaml b/applications/dynacell/configs/benchmarks/virtual_staining/mito/unetvit3d/ipsc_confocal/eval__a549_mantis_denv.yaml new file mode 100644 index 000000000..0c719bc66 --- /dev/null +++ b/applications/dynacell/configs/benchmarks/virtual_staining/mito/unetvit3d/ipsc_confocal/eval__a549_mantis_denv.yaml @@ -0,0 +1,15 @@ +# @package _global_ +# Benchmark eval leaf: Mitochondria (TOMM20) predicted by UNetViT3D on a549-mantis-tomm20-denv. +defaults: + - override /target: mito_tomm20 + - override /predict_set: a549_mantis_tomm20_denv + +io: + pred_path: /hpc/projects/virtual_staining/training/dynacell/a549/predictions/tomm20_unetvit3d__tomm20_denv.zarr + +# A549 manifests don't carry cell_segmentation paths (no segmentation +# pipeline yet). Skip feature metrics until segmentation lands. 
+compute_feature_metrics: false + +save: + save_dir: /hpc/projects/virtual_staining/training/dynacell/a549/predictions/eval_tomm20_unetvit3d__tomm20_denv diff --git a/applications/dynacell/configs/benchmarks/virtual_staining/mito/unetvit3d/ipsc_confocal/eval__a549_mantis_mock.yaml b/applications/dynacell/configs/benchmarks/virtual_staining/mito/unetvit3d/ipsc_confocal/eval__a549_mantis_mock.yaml new file mode 100644 index 000000000..93123288e --- /dev/null +++ b/applications/dynacell/configs/benchmarks/virtual_staining/mito/unetvit3d/ipsc_confocal/eval__a549_mantis_mock.yaml @@ -0,0 +1,15 @@ +# @package _global_ +# Benchmark eval leaf: Mitochondria (TOMM20) predicted by UNetViT3D on a549-mantis-tomm20-mock. +defaults: + - override /target: mito_tomm20 + - override /predict_set: a549_mantis_tomm20_mock + +io: + pred_path: /hpc/projects/virtual_staining/training/dynacell/a549/predictions/tomm20_unetvit3d__tomm20_mock.zarr + +# A549 manifests don't carry cell_segmentation paths (no segmentation +# pipeline yet). Skip feature metrics until segmentation lands. +compute_feature_metrics: false + +save: + save_dir: /hpc/projects/virtual_staining/training/dynacell/a549/predictions/eval_tomm20_unetvit3d__tomm20_mock diff --git a/applications/dynacell/configs/benchmarks/virtual_staining/mito/unetvit3d/ipsc_confocal/eval__a549_mantis_zikv.yaml b/applications/dynacell/configs/benchmarks/virtual_staining/mito/unetvit3d/ipsc_confocal/eval__a549_mantis_zikv.yaml new file mode 100644 index 000000000..08f72d9ea --- /dev/null +++ b/applications/dynacell/configs/benchmarks/virtual_staining/mito/unetvit3d/ipsc_confocal/eval__a549_mantis_zikv.yaml @@ -0,0 +1,15 @@ +# @package _global_ +# Benchmark eval leaf: Mitochondria (TOMM20) predicted by UNetViT3D on a549-mantis-tomm20-zikv. 
+defaults: + - override /target: mito_tomm20 + - override /predict_set: a549_mantis_tomm20_zikv + +io: + pred_path: /hpc/projects/virtual_staining/training/dynacell/a549/predictions/tomm20_unetvit3d__tomm20_zikv.zarr + +# A549 manifests don't carry cell_segmentation paths (no segmentation +# pipeline yet). Skip feature metrics until segmentation lands. +compute_feature_metrics: false + +save: + save_dir: /hpc/projects/virtual_staining/training/dynacell/a549/predictions/eval_tomm20_unetvit3d__tomm20_zikv diff --git a/applications/dynacell/configs/benchmarks/virtual_staining/mito/unetvit3d/ipsc_confocal/eval__ipsc_confocal.yaml b/applications/dynacell/configs/benchmarks/virtual_staining/mito/unetvit3d/ipsc_confocal/eval__ipsc_confocal.yaml new file mode 100644 index 000000000..e85266660 --- /dev/null +++ b/applications/dynacell/configs/benchmarks/virtual_staining/mito/unetvit3d/ipsc_confocal/eval__ipsc_confocal.yaml @@ -0,0 +1,13 @@ +# @package _global_ +# Benchmark eval leaf: Mitochondria (TOMM20) predicted by UNetViT3D on iPSC confocal. +defaults: + - override /target: mito_tomm20 + - override /predict_set: ipsc_confocal + +io: + pred_path: /hpc/projects/virtual_staining/training/dynacell/ipsc/predictions/tomm20_unetvit3d.zarr + +compute_feature_metrics: true + +save: + save_dir: /hpc/projects/virtual_staining/training/dynacell/ipsc/predictions/eval_tomm20_unetvit3d diff --git a/applications/dynacell/configs/benchmarks/virtual_staining/mito/unetvit3d/ipsc_confocal/predict__a549_mantis_denv.yml b/applications/dynacell/configs/benchmarks/virtual_staining/mito/unetvit3d/ipsc_confocal/predict__a549_mantis_denv.yml new file mode 100644 index 000000000..7d745d251 --- /dev/null +++ b/applications/dynacell/configs/benchmarks/virtual_staining/mito/unetvit3d/ipsc_confocal/predict__a549_mantis_denv.yml @@ -0,0 +1,43 @@ +# UNetViT3D predict: mito (TOMM20) trained on iPSC, predicting against a549_mantis_tomm20_denv test. 
+base: + - ../../../_internal/shared/model/predict_sets/a549_mantis_tomm20_denv.yml + - ../../../_internal/shared/model/targets/mito_tomm20.yml + - ../../../_internal/shared/model/model_overlays/unetvit3d_predict.yml + - ../../../_internal/shared/model/launcher_profiles/mode_predict.yml + - ../../../_internal/shared/model/launcher_profiles/hardware_h200_single.yml + - ../../../_internal/shared/model/launcher_profiles/runtime_shared.yml + +benchmark: + task: virtual_staining + organelle: mito + trained_on: ipsc_confocal + predict_set: a549_mantis_tomm20_denv + model_name: unetvit3d + experiment_id: mito__ipsc_confocal__unetvit3d__a549_mantis_tomm20_denv + +model: + init_args: + ckpt_path: /hpc/projects/comp.micro/virtual_staining/models/cell_diff_vs_viscy/ipsc/tomm20/unetvit3d/checkpoints/last.ckpt + +data: + init_args: + # override target-inherited normalizations: predict only reads source + normalizations: + - class_path: viscy_transforms.NormalizeSampled + init_args: + keys: [Phase3D] + level: fov_statistics + subtrahend: mean + divisor: std + # clear target-inherited RandWeightedCropd; predict has no CPU augs + augmentations: [] + +trainer: + callbacks: + - class_path: viscy_utils.callbacks.prediction_writer.HCSPredictionWriter + init_args: + output_store: /hpc/projects/virtual_staining/training/dynacell/a549/predictions/tomm20_unetvit3d__tomm20_denv.zarr + +launcher: + job_name: UNetViT3D_PRED_TOMM20_ON_A549_tomm20_denv + run_root: /hpc/projects/virtual_staining/training/dynacell/a549/predictions diff --git a/applications/dynacell/configs/benchmarks/virtual_staining/mito/unetvit3d/ipsc_confocal/predict__a549_mantis_mock.yml b/applications/dynacell/configs/benchmarks/virtual_staining/mito/unetvit3d/ipsc_confocal/predict__a549_mantis_mock.yml new file mode 100644 index 000000000..f59d5a996 --- /dev/null +++ b/applications/dynacell/configs/benchmarks/virtual_staining/mito/unetvit3d/ipsc_confocal/predict__a549_mantis_mock.yml @@ -0,0 +1,43 @@ +# UNetViT3D predict: 
mito (TOMM20) trained on iPSC, predicting against a549_mantis_tomm20_mock test. +base: + - ../../../_internal/shared/model/predict_sets/a549_mantis_tomm20_mock.yml + - ../../../_internal/shared/model/targets/mito_tomm20.yml + - ../../../_internal/shared/model/model_overlays/unetvit3d_predict.yml + - ../../../_internal/shared/model/launcher_profiles/mode_predict.yml + - ../../../_internal/shared/model/launcher_profiles/hardware_h200_single.yml + - ../../../_internal/shared/model/launcher_profiles/runtime_shared.yml + +benchmark: + task: virtual_staining + organelle: mito + trained_on: ipsc_confocal + predict_set: a549_mantis_tomm20_mock + model_name: unetvit3d + experiment_id: mito__ipsc_confocal__unetvit3d__a549_mantis_tomm20_mock + +model: + init_args: + ckpt_path: /hpc/projects/comp.micro/virtual_staining/models/cell_diff_vs_viscy/ipsc/tomm20/unetvit3d/checkpoints/last.ckpt + +data: + init_args: + # override target-inherited normalizations: predict only reads source + normalizations: + - class_path: viscy_transforms.NormalizeSampled + init_args: + keys: [Phase3D] + level: fov_statistics + subtrahend: mean + divisor: std + # clear target-inherited RandWeightedCropd; predict has no CPU augs + augmentations: [] + +trainer: + callbacks: + - class_path: viscy_utils.callbacks.prediction_writer.HCSPredictionWriter + init_args: + output_store: /hpc/projects/virtual_staining/training/dynacell/a549/predictions/tomm20_unetvit3d__tomm20_mock.zarr + +launcher: + job_name: UNetViT3D_PRED_TOMM20_ON_A549_tomm20_mock + run_root: /hpc/projects/virtual_staining/training/dynacell/a549/predictions diff --git a/applications/dynacell/configs/benchmarks/virtual_staining/mito/unetvit3d/ipsc_confocal/predict__a549_mantis_zikv.yml b/applications/dynacell/configs/benchmarks/virtual_staining/mito/unetvit3d/ipsc_confocal/predict__a549_mantis_zikv.yml new file mode 100644 index 000000000..85e77fdf1 --- /dev/null +++ 
b/applications/dynacell/configs/benchmarks/virtual_staining/mito/unetvit3d/ipsc_confocal/predict__a549_mantis_zikv.yml @@ -0,0 +1,43 @@ +# UNetViT3D predict: mito (TOMM20) trained on iPSC, predicting against a549_mantis_tomm20_zikv test. +base: + - ../../../_internal/shared/model/predict_sets/a549_mantis_tomm20_zikv.yml + - ../../../_internal/shared/model/targets/mito_tomm20.yml + - ../../../_internal/shared/model/model_overlays/unetvit3d_predict.yml + - ../../../_internal/shared/model/launcher_profiles/mode_predict.yml + - ../../../_internal/shared/model/launcher_profiles/hardware_h200_single.yml + - ../../../_internal/shared/model/launcher_profiles/runtime_shared.yml + +benchmark: + task: virtual_staining + organelle: mito + trained_on: ipsc_confocal + predict_set: a549_mantis_tomm20_zikv + model_name: unetvit3d + experiment_id: mito__ipsc_confocal__unetvit3d__a549_mantis_tomm20_zikv + +model: + init_args: + ckpt_path: /hpc/projects/comp.micro/virtual_staining/models/cell_diff_vs_viscy/ipsc/tomm20/unetvit3d/checkpoints/last.ckpt + +data: + init_args: + # override target-inherited normalizations: predict only reads source + normalizations: + - class_path: viscy_transforms.NormalizeSampled + init_args: + keys: [Phase3D] + level: fov_statistics + subtrahend: mean + divisor: std + # clear target-inherited RandWeightedCropd; predict has no CPU augs + augmentations: [] + +trainer: + callbacks: + - class_path: viscy_utils.callbacks.prediction_writer.HCSPredictionWriter + init_args: + output_store: /hpc/projects/virtual_staining/training/dynacell/a549/predictions/tomm20_unetvit3d__tomm20_zikv.zarr + +launcher: + job_name: UNetViT3D_PRED_TOMM20_ON_A549_tomm20_zikv + run_root: /hpc/projects/virtual_staining/training/dynacell/a549/predictions diff --git a/applications/dynacell/configs/benchmarks/virtual_staining/mito/unetvit3d/ipsc_confocal/predict__ipsc_confocal.yml 
b/applications/dynacell/configs/benchmarks/virtual_staining/mito/unetvit3d/ipsc_confocal/predict__ipsc_confocal.yml new file mode 100644 index 000000000..6fd8731b5 --- /dev/null +++ b/applications/dynacell/configs/benchmarks/virtual_staining/mito/unetvit3d/ipsc_confocal/predict__ipsc_confocal.yml @@ -0,0 +1,43 @@ +# UNetViT3D predict: mito (TOMM20) against ipsc_confocal test_cropped. +base: + - ../../../_internal/shared/model/predict_sets/ipsc_confocal.yml + - ../../../_internal/shared/model/targets/mito_tomm20.yml + - ../../../_internal/shared/model/model_overlays/unetvit3d_predict.yml + - ../../../_internal/shared/model/launcher_profiles/mode_predict.yml + - ../../../_internal/shared/model/launcher_profiles/hardware_h200_single.yml + - ../../../_internal/shared/model/launcher_profiles/runtime_shared.yml + +benchmark: + task: virtual_staining + organelle: mito + trained_on: ipsc_confocal + predict_set: ipsc_confocal + model_name: unetvit3d + experiment_id: mito__ipsc_confocal__unetvit3d__ipsc_confocal + +model: + init_args: + ckpt_path: /hpc/projects/comp.micro/virtual_staining/models/cell_diff_vs_viscy/ipsc/tomm20/unetvit3d/checkpoints/last.ckpt + +data: + init_args: + # override target-inherited normalizations: predict only reads source + normalizations: + - class_path: viscy_transforms.NormalizeSampled + init_args: + keys: [Phase3D] + level: fov_statistics + subtrahend: mean + divisor: std + # clear target-inherited RandWeightedCropd; predict has no CPU augs + augmentations: [] + +trainer: + callbacks: + - class_path: viscy_utils.callbacks.prediction_writer.HCSPredictionWriter + init_args: + output_store: /hpc/projects/virtual_staining/training/dynacell/ipsc/predictions/tomm20_unetvit3d.zarr + +launcher: + job_name: UNetViT3D_PRED_TOMM20 + run_root: /hpc/projects/virtual_staining/training/dynacell/ipsc/predictions diff --git a/applications/dynacell/configs/benchmarks/virtual_staining/mito/unetvit3d/ipsc_confocal/train.yml 
b/applications/dynacell/configs/benchmarks/virtual_staining/mito/unetvit3d/ipsc_confocal/train.yml new file mode 100644 index 000000000..47941d508 --- /dev/null +++ b/applications/dynacell/configs/benchmarks/virtual_staining/mito/unetvit3d/ipsc_confocal/train.yml @@ -0,0 +1,37 @@ +# UNetViT3D fit on mitochondria (TOMM20 marker) — AICS iPSC confocal. +base: + - ../../../_internal/shared/model/train_sets/ipsc_confocal.yml + - ../../../_internal/shared/model/targets/mito_tomm20.yml + - ../../../_internal/shared/model/data_overlays/unetvit3d_fit.yml + - ../../../_internal/shared/model/model_overlays/unetvit3d_fit.yml + - ../../../_internal/shared/model/launcher_profiles/mode_fit.yml + - ../../../_internal/shared/model/launcher_profiles/hardware_h200_single.yml + - ../../../_internal/shared/model/launcher_profiles/runtime_shared.yml + +benchmark: + task: virtual_staining + organelle: mito + train_set: ipsc_confocal + model_name: unetvit3d + experiment_id: mito__ipsc_confocal__unetvit3d + +trainer: + logger: + init_args: + name: UNetViT3D_iPSC_TOMM20 + save_dir: /hpc/projects/comp.micro/virtual_staining/models/cell_diff_vs_viscy/ipsc/tomm20/unetvit3d + callbacks: + - class_path: lightning.pytorch.callbacks.LearningRateMonitor + init_args: + logging_interval: step + - class_path: lightning.pytorch.callbacks.ModelCheckpoint + init_args: + monitor: loss/validate + every_n_epochs: 1 + save_top_k: 4 + save_last: true + dirpath: /hpc/projects/comp.micro/virtual_staining/models/cell_diff_vs_viscy/ipsc/tomm20/unetvit3d/checkpoints + +launcher: + job_name: UNetViT3D_TOMM20 + run_root: /hpc/projects/comp.micro/virtual_staining/models/cell_diff_vs_viscy/ipsc/tomm20/unetvit3d diff --git a/applications/dynacell/configs/benchmarks/virtual_staining/mito/unetvit3d/joint_ipsc_confocal_a549_mantis/train.yml b/applications/dynacell/configs/benchmarks/virtual_staining/mito/unetvit3d/joint_ipsc_confocal_a549_mantis/train.yml new file mode 100644 index 000000000..49f9a631d --- /dev/null +++ 
b/applications/dynacell/configs/benchmarks/virtual_staining/mito/unetvit3d/joint_ipsc_confocal_a549_mantis/train.yml @@ -0,0 +1,139 @@ +# UNetViT3D fit on mitochondria (TOMM20) — joint ipsc_confocal + a549_mantis pooled. +# +# Joint leaf per Stage 7 of A549_EXPANSION_ROADMAP.md. Uses +# BatchedConcatDataModule with two explicit HCSDataModule children +# (no benchmark.dataset_ref — joint leaves bypass the single-dataset +# resolver). Only model_overlays/unetvit3d_fit.yml is composed; the data +# block is authored inline because joint hparams live on the children. +# +# Topology: single H200, single GPU — same as unetvit3d/ipsc_confocal/train.yml. +# The paper baseline pattern is single-GPU and we keep that here so +# iPSC-only and joint runs are apples-to-apples. +base: + - ../../../_internal/shared/model/model_overlays/unetvit3d_fit.yml + - ../../../_internal/shared/model/launcher_profiles/mode_fit.yml + - ../../../_internal/shared/model/launcher_profiles/hardware_h200_single.yml + - ../../../_internal/shared/model/launcher_profiles/runtime_shared.yml + +benchmark: + task: virtual_staining + organelle: mito + gene: TOMM20 + target: mito + target_id: mito_tomm20 + train_set: joint_ipsc_confocal_a549_mantis + model_name: unetvit3d + experiment_id: mito__joint_ipsc_confocal_a549_mantis__unetvit3d + +trainer: + logger: + init_args: + name: UNetViT3D_JOINT_TOMM20 + save_dir: /hpc/projects/comp.micro/virtual_staining/models/cell_diff_vs_viscy/joint_ipsc_confocal_a549_mantis/tomm20/unetvit3d + callbacks: + - class_path: lightning.pytorch.callbacks.LearningRateMonitor + init_args: + logging_interval: step + - class_path: lightning.pytorch.callbacks.ModelCheckpoint + init_args: + monitor: loss/validate + every_n_epochs: 1 + save_top_k: 4 + save_last: true + dirpath: /hpc/projects/comp.micro/virtual_staining/models/cell_diff_vs_viscy/joint_ipsc_confocal_a549_mantis/tomm20/unetvit3d/checkpoints + +_hcs_init_args: &hcs_init_args + source_channel: Phase3D + target_channel: 
Structure + z_window_size: 13 + batch_size: 4 + num_workers: 4 + yx_patch_size: [512, 512] + split_ratio: 0.8 + mmap_preload: true + scratch_dir: /dev/shm + persistent_workers: true + normalizations: + - class_path: viscy_transforms.NormalizeSampled + init_args: + keys: [Phase3D] + level: fov_statistics + subtrahend: mean + divisor: std + - class_path: viscy_transforms.NormalizeSampled + init_args: + keys: [Structure] + level: fov_statistics + subtrahend: median + divisor: iqr + augmentations: + - class_path: viscy_transforms.RandWeightedCropd + init_args: + keys: [Phase3D, Structure] + w_key: Structure + spatial_size: [13, 624, 624] + num_samples: 2 + gpu_augmentations: + - class_path: viscy_transforms.BatchedRandAffined + init_args: + keys: [source, target] + prob: 0.8 + rotate_range: [3.14, 0, 0] + shear_range: [0.0, 0.05, 0.05] + scale_range: [[0.7, 1.3], [0.5, 1.5], [0.5, 1.5]] + safe_crop_size: [8, 512, 512] + safe_crop_coverage: 0.9 + - class_path: viscy_transforms.BatchedCenterSpatialCropd + init_args: + keys: [source, target] + roi_size: [8, 512, 512] + - class_path: viscy_transforms.BatchedRandAdjustContrastd + init_args: + keys: [source] + prob: 0.5 + gamma: [0.8, 1.2] + - class_path: viscy_transforms.BatchedRandScaleIntensityd + init_args: + keys: [source] + prob: 0.5 + factors: 0.5 + - class_path: viscy_transforms.BatchedRandGaussianNoised + init_args: + keys: [source] + prob: 0.5 + mean: 0.0 + std: 0.3 + - class_path: viscy_transforms.BatchedRandGaussianSmoothd + init_args: + keys: [source] + prob: 0.5 + sigma_x: [0.25, 0.75] + sigma_y: [0.25, 0.75] + sigma_z: [0.25, 0.75] + val_gpu_augmentations: + - class_path: viscy_transforms.BatchedCenterSpatialCropd + init_args: + keys: [source, target] + roi_size: [8, 512, 512] + +data: + class_path: viscy_data.BatchedConcatDataModule + init_args: + data_modules: + - class_path: viscy_data.hcs.HCSDataModule + init_args: + <<: *hcs_init_args + data_path: 
/hpc/projects/virtual_staining/training/dynacell/ipsc/dataset_v4/train/TOMM20.zarr + - class_path: viscy_data.hcs.HCSDataModule + init_args: + <<: *hcs_init_args + data_path: /hpc/projects/virtual_staining/training/dynacell/a549/mantis_v1/train/TOMM20_all.zarr + +launcher: + job_name: UNetViT3D_JOINT_TOMM20 + run_root: /hpc/projects/comp.micro/virtual_staining/models/cell_diff_vs_viscy/joint_ipsc_confocal_a549_mantis/tomm20/unetvit3d + # Joint preloads two stores (iPSC + A549 pool) into /dev/shm; the default + # 256G cap is too tight (256G iPSC mem + ~50G A549 + worker peak OOMs). + # 512G is the smallest tier that fits joint preload + worker overhead. + sbatch: + mem: "512G" diff --git a/applications/dynacell/configs/benchmarks/virtual_staining/nucleus/celldiff/a549_mantis/train.yml b/applications/dynacell/configs/benchmarks/virtual_staining/nucleus/celldiff/a549_mantis/train.yml new file mode 100644 index 000000000..57fba3822 --- /dev/null +++ b/applications/dynacell/configs/benchmarks/virtual_staining/nucleus/celldiff/a549_mantis/train.yml @@ -0,0 +1,42 @@ +# CellDiff fit on nucleus (Nuclei channel of cell.zarr) — A549 mantis-lightsheet pooled (mock + DENV + ZIKV). 
+base: + - ../../../_internal/shared/model/train_sets/a549_mantis.yml + - ../../../_internal/shared/model/targets/nucleus.yml + - ../../../_internal/shared/model/data_overlays/celldiff_fit.yml + - ../../../_internal/shared/model/model_overlays/celldiff_fit.yml + - ../../../_internal/shared/model/launcher_profiles/mode_fit.yml + - ../../../_internal/shared/model/launcher_profiles/hardware_h200_single.yml + - ../../../_internal/shared/model/launcher_profiles/runtime_shared.yml + +benchmark: + task: virtual_staining + organelle: nucleus + train_set: a549_mantis + model_name: celldiff + experiment_id: nucleus__a549_mantis__celldiff + +trainer: + logger: + init_args: + name: CELLDiff_A549_NUCL + save_dir: /hpc/projects/comp.micro/virtual_staining/models/cell_diff_vs_viscy/a549_mantis/nucl/celldiff + callbacks: + - class_path: lightning.pytorch.callbacks.LearningRateMonitor + init_args: + logging_interval: step + - class_path: lightning.pytorch.callbacks.ModelCheckpoint + init_args: + every_n_epochs: 1 + save_top_k: -1 + save_last: true + dirpath: /hpc/projects/comp.micro/virtual_staining/models/cell_diff_vs_viscy/a549_mantis/nucl/celldiff/checkpoints + +data: + init_args: + # A549 pooled store + target_channel — no resolver in this train_set. 
+ target_channel: Nuclei + data_path: /hpc/projects/virtual_staining/training/dynacell/a549/mantis_v1/train/H2B_all.zarr + +launcher: + job_name: CELLDiff_A549_NUCL + run_root: /hpc/projects/comp.micro/virtual_staining/models/cell_diff_vs_viscy/a549_mantis/nucl/celldiff diff --git a/applications/dynacell/configs/benchmarks/virtual_staining/nucleus/celldiff/ipsc_confocal/eval__a549_mantis_denv.yaml b/applications/dynacell/configs/benchmarks/virtual_staining/nucleus/celldiff/ipsc_confocal/eval__a549_mantis_denv.yaml new file mode 100644 index 000000000..c6474091a --- /dev/null +++ b/applications/dynacell/configs/benchmarks/virtual_staining/nucleus/celldiff/ipsc_confocal/eval__a549_mantis_denv.yaml @@ -0,0 +1,22 @@ +# @package _global_ +# Benchmark eval leaf: Nucleus (H2B) predicted by CellDiff on a549-mantis-h2b-denv. +# A549 manifest keys nucleus by gene (`h2b`); override the iPSC-side `nucleus` +# target_id from the target group so the resolver finds h2b on a549-mantis-h2b-denv. +defaults: + - override /target: nucleus + - override /predict_set: a549_mantis_h2b_denv + +target_name: h2b +benchmark: + dataset_ref: + target: h2b + +io: + pred_path: /hpc/projects/virtual_staining/training/dynacell/a549/predictions/nucl_celldiff_denoise_denv.zarr + +# A549 manifests don't carry cell_segmentation paths (no segmentation +# pipeline yet). Skip feature metrics until segmentation lands. 
+compute_feature_metrics: false + +save: + save_dir: /hpc/projects/virtual_staining/training/dynacell/a549/predictions/eval_nucl_celldiff_denoise_denv diff --git a/applications/dynacell/configs/benchmarks/virtual_staining/nucleus/celldiff/ipsc_confocal/eval__a549_mantis_mock.yaml b/applications/dynacell/configs/benchmarks/virtual_staining/nucleus/celldiff/ipsc_confocal/eval__a549_mantis_mock.yaml new file mode 100644 index 000000000..75516b970 --- /dev/null +++ b/applications/dynacell/configs/benchmarks/virtual_staining/nucleus/celldiff/ipsc_confocal/eval__a549_mantis_mock.yaml @@ -0,0 +1,22 @@ +# @package _global_ +# Benchmark eval leaf: Nucleus (H2B) predicted by CellDiff on a549-mantis-h2b-mock. +# A549 manifest keys nucleus by gene (`h2b`); override the iPSC-side `nucleus` +# target_id from the target group so the resolver finds h2b on a549-mantis-h2b-mock. +defaults: + - override /target: nucleus + - override /predict_set: a549_mantis_h2b_mock + +target_name: h2b +benchmark: + dataset_ref: + target: h2b + +io: + pred_path: /hpc/projects/virtual_staining/training/dynacell/a549/predictions/nucl_celldiff_denoise_mock.zarr + +# A549 manifests don't carry cell_segmentation paths (no segmentation +# pipeline yet). Skip feature metrics until segmentation lands. +compute_feature_metrics: false + +save: + save_dir: /hpc/projects/virtual_staining/training/dynacell/a549/predictions/eval_nucl_celldiff_denoise_mock diff --git a/applications/dynacell/configs/benchmarks/virtual_staining/nucleus/celldiff/ipsc_confocal/eval__a549_mantis_zikv.yaml b/applications/dynacell/configs/benchmarks/virtual_staining/nucleus/celldiff/ipsc_confocal/eval__a549_mantis_zikv.yaml new file mode 100644 index 000000000..ea91a20ee --- /dev/null +++ b/applications/dynacell/configs/benchmarks/virtual_staining/nucleus/celldiff/ipsc_confocal/eval__a549_mantis_zikv.yaml @@ -0,0 +1,22 @@ +# @package _global_ +# Benchmark eval leaf: Nucleus (H2B) predicted by CellDiff on a549-mantis-h2b-zikv. 
+# A549 manifest keys nucleus by gene (`h2b`); override the iPSC-side `nucleus` +# target_id from the target group so the resolver finds h2b on a549-mantis-h2b-zikv. +defaults: + - override /target: nucleus + - override /predict_set: a549_mantis_h2b_zikv + +target_name: h2b +benchmark: + dataset_ref: + target: h2b + +io: + pred_path: /hpc/projects/virtual_staining/training/dynacell/a549/predictions/nucl_celldiff_denoise_zikv.zarr + +# A549 manifests don't carry cell_segmentation paths (no segmentation +# pipeline yet). Skip feature metrics until segmentation lands. +compute_feature_metrics: false + +save: + save_dir: /hpc/projects/virtual_staining/training/dynacell/a549/predictions/eval_nucl_celldiff_denoise_zikv diff --git a/applications/dynacell/configs/benchmarks/virtual_staining/nucleus/celldiff/ipsc_confocal/eval__ipsc_confocal.yaml b/applications/dynacell/configs/benchmarks/virtual_staining/nucleus/celldiff/ipsc_confocal/eval__ipsc_confocal.yaml new file mode 100644 index 000000000..a013096af --- /dev/null +++ b/applications/dynacell/configs/benchmarks/virtual_staining/nucleus/celldiff/ipsc_confocal/eval__ipsc_confocal.yaml @@ -0,0 +1,13 @@ +# @package _global_ +# Benchmark eval leaf: Nucleus predicted by CellDiff on iPSC confocal. 
+defaults: + - override /target: nucleus + - override /predict_set: ipsc_confocal + +io: + pred_path: /hpc/projects/virtual_staining/training/dynacell/ipsc/predictions/nucl_celldiff_denoise.zarr + +compute_feature_metrics: true + +save: + save_dir: /hpc/projects/virtual_staining/training/dynacell/ipsc/predictions/eval_nucl_celldiff_denoise diff --git a/applications/dynacell/configs/benchmarks/virtual_staining/nucleus/celldiff/ipsc_confocal/predict__a549_mantis_denv.yml b/applications/dynacell/configs/benchmarks/virtual_staining/nucleus/celldiff/ipsc_confocal/predict__a549_mantis_denv.yml new file mode 100644 index 000000000..e3b9d20ff --- /dev/null +++ b/applications/dynacell/configs/benchmarks/virtual_staining/nucleus/celldiff/ipsc_confocal/predict__a549_mantis_denv.yml @@ -0,0 +1,50 @@ +# CellDiff predict: nucleus trained on iPSC, predicting against a549-mantis-h2b-denv test. +# A549 manifest keys nucleus by gene (`h2b`); override the iPSC-side `nucleus` +# target_id from targets/nucleus.yml so the resolver finds the h2b target on +# a549-mantis-h2b-denv. +base: + - ../../../_internal/shared/model/predict_sets/a549_mantis_h2b_denv.yml + - ../../../_internal/shared/model/targets/nucleus.yml + - ../../../_internal/shared/model/model_overlays/celldiff_predict.yml + - ../../../_internal/shared/model/launcher_profiles/mode_predict.yml + - ../../../_internal/shared/model/launcher_profiles/hardware_h200_single.yml + - ../../../_internal/shared/model/launcher_profiles/runtime_shared.yml + +benchmark: + task: virtual_staining + organelle: nucleus + trained_on: ipsc_confocal + predict_set: a549_mantis_h2b_denv + model_name: celldiff + experiment_id: nucleus__ipsc_confocal__celldiff__a549_mantis_h2b_denv + # Override the iPSC-side `nucleus` target to a549's gene-keyed `h2b`. 
+ dataset_ref: + target: h2b + +model: + init_args: + ckpt_path: /hpc/projects/comp.micro/virtual_staining/models/cell_diff_vs_viscy/ipsc/nucl/celldiff/checkpoints/last.ckpt + predict_method: iterative # denoise, generate, sliding_window, or iterative + predict_overlap: [4, 256, 256] + +data: + init_args: + normalizations: + - class_path: viscy_transforms.NormalizeSampled + init_args: + keys: [Phase3D] + level: fov_statistics + subtrahend: mean + divisor: std + augmentations: [] + z_window_size: 48 # 8 for denoise and generate, 40 for iterative and sliding_window. + +trainer: + callbacks: + - class_path: viscy_utils.callbacks.prediction_writer.HCSPredictionWriter + init_args: + output_store: /hpc/projects/virtual_staining/training/dynacell/a549/predictions/nucl_celldiff_iterative_denv.zarr + +launcher: + job_name: CELLDiff_PRED_NUCL_ON_A549_DENV + run_root: /hpc/projects/virtual_staining/training/dynacell/a549/predictions diff --git a/applications/dynacell/configs/benchmarks/virtual_staining/nucleus/celldiff/ipsc_confocal/predict__a549_mantis_mock.yml b/applications/dynacell/configs/benchmarks/virtual_staining/nucleus/celldiff/ipsc_confocal/predict__a549_mantis_mock.yml new file mode 100644 index 000000000..97cdbe915 --- /dev/null +++ b/applications/dynacell/configs/benchmarks/virtual_staining/nucleus/celldiff/ipsc_confocal/predict__a549_mantis_mock.yml @@ -0,0 +1,50 @@ +# CellDiff predict: nucleus trained on iPSC, predicting against a549-mantis-h2b-mock test. +# A549 manifest keys nucleus by gene (`h2b`); override the iPSC-side `nucleus` +# target_id from targets/nucleus.yml so the resolver finds the h2b target on +# a549-mantis-h2b-mock. 
+base: + - ../../../_internal/shared/model/predict_sets/a549_mantis_h2b_mock.yml + - ../../../_internal/shared/model/targets/nucleus.yml + - ../../../_internal/shared/model/model_overlays/celldiff_predict.yml + - ../../../_internal/shared/model/launcher_profiles/mode_predict.yml + - ../../../_internal/shared/model/launcher_profiles/hardware_h200_single.yml + - ../../../_internal/shared/model/launcher_profiles/runtime_shared.yml + +benchmark: + task: virtual_staining + organelle: nucleus + trained_on: ipsc_confocal + predict_set: a549_mantis_h2b_mock + model_name: celldiff + experiment_id: nucleus__ipsc_confocal__celldiff__a549_mantis_h2b_mock + # Override the iPSC-side `nucleus` target to a549's gene-keyed `h2b`. + dataset_ref: + target: h2b + +model: + init_args: + ckpt_path: /hpc/projects/comp.micro/virtual_staining/models/cell_diff_vs_viscy/ipsc/nucl/celldiff/checkpoints/last.ckpt + predict_method: iterative # denoise, generate, sliding_window, or iterative + predict_overlap: [4, 256, 256] + +data: + init_args: + normalizations: + - class_path: viscy_transforms.NormalizeSampled + init_args: + keys: [Phase3D] + level: fov_statistics + subtrahend: mean + divisor: std + augmentations: [] + z_window_size: 48 # 8 for denoise and generate, 40 for iterative and sliding_window. 
+ +trainer: + callbacks: + - class_path: viscy_utils.callbacks.prediction_writer.HCSPredictionWriter + init_args: + output_store: /hpc/projects/virtual_staining/training/dynacell/a549/predictions/nucl_celldiff_iterative_mock.zarr + +launcher: + job_name: CELLDiff_PRED_NUCL_ON_A549_MOCK + run_root: /hpc/projects/virtual_staining/training/dynacell/a549/predictions diff --git a/applications/dynacell/configs/benchmarks/virtual_staining/nucleus/celldiff/ipsc_confocal/predict__a549_mantis_zikv.yml b/applications/dynacell/configs/benchmarks/virtual_staining/nucleus/celldiff/ipsc_confocal/predict__a549_mantis_zikv.yml new file mode 100644 index 000000000..fe4612f9d --- /dev/null +++ b/applications/dynacell/configs/benchmarks/virtual_staining/nucleus/celldiff/ipsc_confocal/predict__a549_mantis_zikv.yml @@ -0,0 +1,50 @@ +# CellDiff predict: nucleus trained on iPSC, predicting against a549-mantis-h2b-zikv test. +# A549 manifest keys nucleus by gene (`h2b`); override the iPSC-side `nucleus` +# target_id from targets/nucleus.yml so the resolver finds the h2b target on +# a549-mantis-h2b-zikv. +base: + - ../../../_internal/shared/model/predict_sets/a549_mantis_h2b_zikv.yml + - ../../../_internal/shared/model/targets/nucleus.yml + - ../../../_internal/shared/model/model_overlays/celldiff_predict.yml + - ../../../_internal/shared/model/launcher_profiles/mode_predict.yml + - ../../../_internal/shared/model/launcher_profiles/hardware_h200_single.yml + - ../../../_internal/shared/model/launcher_profiles/runtime_shared.yml + +benchmark: + task: virtual_staining + organelle: nucleus + trained_on: ipsc_confocal + predict_set: a549_mantis_h2b_zikv + model_name: celldiff + experiment_id: nucleus__ipsc_confocal__celldiff__a549_mantis_h2b_zikv + # Override the iPSC-side `nucleus` target to a549's gene-keyed `h2b`. 
+ dataset_ref: + target: h2b + +model: + init_args: + ckpt_path: /hpc/projects/comp.micro/virtual_staining/models/cell_diff_vs_viscy/ipsc/nucl/celldiff/checkpoints/last.ckpt + predict_method: iterative # denoise, generate, sliding_window, or iterative + predict_overlap: [4, 256, 256] + +data: + init_args: + normalizations: + - class_path: viscy_transforms.NormalizeSampled + init_args: + keys: [Phase3D] + level: fov_statistics + subtrahend: mean + divisor: std + augmentations: [] + z_window_size: 48 # 8 for denoise and generate, 40 for iterative and sliding_window. + +trainer: + callbacks: + - class_path: viscy_utils.callbacks.prediction_writer.HCSPredictionWriter + init_args: + output_store: /hpc/projects/virtual_staining/training/dynacell/a549/predictions/nucl_celldiff_iterative_zikv.zarr + +launcher: + job_name: CELLDiff_PRED_NUCL_ON_A549_ZIKV + run_root: /hpc/projects/virtual_staining/training/dynacell/a549/predictions diff --git a/applications/dynacell/configs/benchmarks/virtual_staining/nucleus/celldiff/ipsc_confocal/predict__ipsc_confocal.yml b/applications/dynacell/configs/benchmarks/virtual_staining/nucleus/celldiff/ipsc_confocal/predict__ipsc_confocal.yml new file mode 100644 index 000000000..34a743428 --- /dev/null +++ b/applications/dynacell/configs/benchmarks/virtual_staining/nucleus/celldiff/ipsc_confocal/predict__ipsc_confocal.yml @@ -0,0 +1,44 @@ +# CellDiff predict: nucleus against ipsc_confocal test_cropped. 
+base: + - ../../../_internal/shared/model/predict_sets/ipsc_confocal.yml + - ../../../_internal/shared/model/targets/nucleus.yml + - ../../../_internal/shared/model/model_overlays/celldiff_predict.yml + - ../../../_internal/shared/model/launcher_profiles/mode_predict.yml + - ../../../_internal/shared/model/launcher_profiles/hardware_h200_single.yml + - ../../../_internal/shared/model/launcher_profiles/runtime_shared.yml + +benchmark: + task: virtual_staining + organelle: nucleus + trained_on: ipsc_confocal + predict_set: ipsc_confocal + model_name: celldiff + experiment_id: nucleus__ipsc_confocal__celldiff__ipsc_confocal + +model: + init_args: + ckpt_path: /hpc/projects/comp.micro/virtual_staining/models/cell_diff_vs_viscy/ipsc/nucl/celldiff/checkpoints/last.ckpt + predict_method: denoise # denoise, generate, sliding_window, or iterative + predict_overlap: [4, 256, 256] + +data: + init_args: + normalizations: + - class_path: viscy_transforms.NormalizeSampled + init_args: + keys: [Phase3D] + level: fov_statistics + subtrahend: mean + divisor: std + augmentations: [] + z_window_size: 8 # 8 for denoise and generate, 40 for iterative and sliding_window. + +trainer: + callbacks: + - class_path: viscy_utils.callbacks.prediction_writer.HCSPredictionWriter + init_args: + output_store: /hpc/projects/virtual_staining/training/dynacell/ipsc/predictions/nucl_celldiff_denoise.zarr + +launcher: + job_name: CELLDiff_PRED_NUCL + run_root: /hpc/projects/virtual_staining/training/dynacell/ipsc/predictions diff --git a/applications/dynacell/configs/benchmarks/virtual_staining/nucleus/celldiff/ipsc_confocal/train.yml b/applications/dynacell/configs/benchmarks/virtual_staining/nucleus/celldiff/ipsc_confocal/train.yml new file mode 100644 index 000000000..47d9e5bf3 --- /dev/null +++ b/applications/dynacell/configs/benchmarks/virtual_staining/nucleus/celldiff/ipsc_confocal/train.yml @@ -0,0 +1,36 @@ +# CellDiff fit on nucleus (Nuclei channel of cell.zarr) — AICS iPSC confocal. 
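The `base:` fragment lists used throughout these leaves (train sets, targets, overlays, launcher profiles) compose into one config by overlaying fragments in listed order, with the leaf's own keys applied last. A minimal sketch of that last-wins recursive merge, assuming plain-dict semantics; `deep_merge` and `compose` are hypothetical names, not the repo's actual loader:

```python
from copy import deepcopy


def deep_merge(base: dict, overlay: dict) -> dict:
    """Recursively merge `overlay` into `base`; overlay values win on conflict."""
    merged = deepcopy(base)
    for key, value in overlay.items():
        if isinstance(value, dict) and isinstance(merged.get(key), dict):
            merged[key] = deep_merge(merged[key], value)
        else:
            merged[key] = deepcopy(value)
    return merged


def compose(base_fragments: list[dict], leaf: dict) -> dict:
    """Overlay fragments in listed order, then apply the leaf on top."""
    cfg: dict = {}
    for fragment in base_fragments:
        cfg = deep_merge(cfg, fragment)
    return deep_merge(cfg, leaf)
```

Under these assumed semantics, a `launcher_profiles` fragment can set hardware defaults while the leaf overrides only `job_name` and `run_root`.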
+base: + - ../../../_internal/shared/model/train_sets/ipsc_confocal.yml + - ../../../_internal/shared/model/targets/nucleus.yml + - ../../../_internal/shared/model/data_overlays/celldiff_fit.yml + - ../../../_internal/shared/model/model_overlays/celldiff_fit.yml + - ../../../_internal/shared/model/launcher_profiles/mode_fit.yml + - ../../../_internal/shared/model/launcher_profiles/hardware_h200_single.yml + - ../../../_internal/shared/model/launcher_profiles/runtime_shared.yml + +benchmark: + task: virtual_staining + organelle: nucleus + train_set: ipsc_confocal + model_name: celldiff + experiment_id: nucleus__ipsc_confocal__celldiff + +trainer: + logger: + init_args: + name: CELLDiff_iPSC_NUCL + save_dir: /hpc/projects/comp.micro/virtual_staining/models/cell_diff_vs_viscy/ipsc/nucl/celldiff + callbacks: + - class_path: lightning.pytorch.callbacks.LearningRateMonitor + init_args: + logging_interval: step + - class_path: lightning.pytorch.callbacks.ModelCheckpoint + init_args: + every_n_epochs: 1 + save_top_k: -1 + save_last: true + dirpath: /hpc/projects/comp.micro/virtual_staining/models/cell_diff_vs_viscy/ipsc/nucl/celldiff/checkpoints + +launcher: + job_name: CELLDiff_NUCL + run_root: /hpc/projects/comp.micro/virtual_staining/models/cell_diff_vs_viscy/ipsc/nucl/celldiff diff --git a/applications/dynacell/configs/benchmarks/virtual_staining/nucleus/celldiff/joint_ipsc_confocal_a549_mantis/predict__a549_mantis_denv.yml b/applications/dynacell/configs/benchmarks/virtual_staining/nucleus/celldiff/joint_ipsc_confocal_a549_mantis/predict__a549_mantis_denv.yml new file mode 100644 index 000000000..f4e123c91 --- /dev/null +++ b/applications/dynacell/configs/benchmarks/virtual_staining/nucleus/celldiff/joint_ipsc_confocal_a549_mantis/predict__a549_mantis_denv.yml @@ -0,0 +1,50 @@ +# CellDiff predict: nucleus trained on joint iPSC+A549, predicting against a549-mantis-h2b-denv test. 
+# A549 manifest keys nucleus by gene (`h2b`); override the iPSC-side `nucleus` +# target_id from targets/nucleus.yml so the resolver finds the h2b target on +# a549-mantis-h2b-denv. +base: + - ../../../_internal/shared/model/predict_sets/a549_mantis_h2b_denv.yml + - ../../../_internal/shared/model/targets/nucleus.yml + - ../../../_internal/shared/model/model_overlays/celldiff_predict.yml + - ../../../_internal/shared/model/launcher_profiles/mode_predict.yml + - ../../../_internal/shared/model/launcher_profiles/hardware_h200_single.yml + - ../../../_internal/shared/model/launcher_profiles/runtime_shared.yml + +benchmark: + task: virtual_staining + organelle: nucleus + trained_on: joint_ipsc_confocal_a549_mantis + predict_set: a549_mantis_h2b_denv + model_name: celldiff + experiment_id: nucleus__joint_ipsc_confocal_a549_mantis__celldiff__a549_mantis_h2b_denv + # Override the iPSC-side `nucleus` target to a549's gene-keyed `h2b`. + dataset_ref: + target: h2b + +model: + init_args: + ckpt_path: /hpc/projects/comp.micro/virtual_staining/models/cell_diff_vs_viscy/joint_ipsc_confocal_a549_mantis/nucl/celldiff/checkpoints/last.ckpt + predict_method: iterative # denoise, generate, sliding_window, or iterative + predict_overlap: [4, 256, 256] + +data: + init_args: + normalizations: + - class_path: viscy_transforms.NormalizeSampled + init_args: + keys: [Phase3D] + level: fov_statistics + subtrahend: mean + divisor: std + augmentations: [] + z_window_size: 48 # 8 for denoise and generate, 40 for iterative and sliding_window. 
+ +trainer: + callbacks: + - class_path: viscy_utils.callbacks.prediction_writer.HCSPredictionWriter + init_args: + output_store: /hpc/projects/virtual_staining/training/dynacell/a549/joint_predictions/nucl_celldiff_denv.zarr + +launcher: + job_name: CELLDiff_JOINT_PRED_NUCL_ON_A549_DENV + run_root: /hpc/projects/virtual_staining/training/dynacell/a549/joint_predictions diff --git a/applications/dynacell/configs/benchmarks/virtual_staining/nucleus/celldiff/joint_ipsc_confocal_a549_mantis/predict__a549_mantis_mock.yml b/applications/dynacell/configs/benchmarks/virtual_staining/nucleus/celldiff/joint_ipsc_confocal_a549_mantis/predict__a549_mantis_mock.yml new file mode 100644 index 000000000..25051024a --- /dev/null +++ b/applications/dynacell/configs/benchmarks/virtual_staining/nucleus/celldiff/joint_ipsc_confocal_a549_mantis/predict__a549_mantis_mock.yml @@ -0,0 +1,50 @@ +# CellDiff predict: nucleus trained on joint iPSC+A549, predicting against a549-mantis-h2b-mock test. +# A549 manifest keys nucleus by gene (`h2b`); override the iPSC-side `nucleus` +# target_id from targets/nucleus.yml so the resolver finds the h2b target on +# a549-mantis-h2b-mock. +base: + - ../../../_internal/shared/model/predict_sets/a549_mantis_h2b_mock.yml + - ../../../_internal/shared/model/targets/nucleus.yml + - ../../../_internal/shared/model/model_overlays/celldiff_predict.yml + - ../../../_internal/shared/model/launcher_profiles/mode_predict.yml + - ../../../_internal/shared/model/launcher_profiles/hardware_h200_single.yml + - ../../../_internal/shared/model/launcher_profiles/runtime_shared.yml + +benchmark: + task: virtual_staining + organelle: nucleus + trained_on: joint_ipsc_confocal_a549_mantis + predict_set: a549_mantis_h2b_mock + model_name: celldiff + experiment_id: nucleus__joint_ipsc_confocal_a549_mantis__celldiff__a549_mantis_h2b_mock + # Override the iPSC-side `nucleus` target to a549's gene-keyed `h2b`. 
+ dataset_ref: + target: h2b + +model: + init_args: + ckpt_path: /hpc/projects/comp.micro/virtual_staining/models/cell_diff_vs_viscy/joint_ipsc_confocal_a549_mantis/nucl/celldiff/checkpoints/last.ckpt + predict_method: iterative # denoise, generate, sliding_window, or iterative + predict_overlap: [4, 256, 256] + +data: + init_args: + normalizations: + - class_path: viscy_transforms.NormalizeSampled + init_args: + keys: [Phase3D] + level: fov_statistics + subtrahend: mean + divisor: std + augmentations: [] + z_window_size: 48 # 8 for denoise and generate, 40 for iterative and sliding_window. + +trainer: + callbacks: + - class_path: viscy_utils.callbacks.prediction_writer.HCSPredictionWriter + init_args: + output_store: /hpc/projects/virtual_staining/training/dynacell/a549/joint_predictions/nucl_celldiff_joint_mock.zarr + +launcher: + job_name: CELLDiff_JOINT_PRED_NUCL_ON_A549_MOCK + run_root: /hpc/projects/virtual_staining/training/dynacell/a549/joint_predictions diff --git a/applications/dynacell/configs/benchmarks/virtual_staining/nucleus/celldiff/joint_ipsc_confocal_a549_mantis/predict__a549_mantis_zikv.yml b/applications/dynacell/configs/benchmarks/virtual_staining/nucleus/celldiff/joint_ipsc_confocal_a549_mantis/predict__a549_mantis_zikv.yml new file mode 100644 index 000000000..7c7ac0621 --- /dev/null +++ b/applications/dynacell/configs/benchmarks/virtual_staining/nucleus/celldiff/joint_ipsc_confocal_a549_mantis/predict__a549_mantis_zikv.yml @@ -0,0 +1,50 @@ +# CellDiff predict: nucleus trained on joint iPSC+A549, predicting against a549-mantis-h2b-zikv test. +# A549 manifest keys nucleus by gene (`h2b`); override the iPSC-side `nucleus` +# target_id from targets/nucleus.yml so the resolver finds the h2b target on +# a549-mantis-h2b-zikv. 
+base: + - ../../../_internal/shared/model/predict_sets/a549_mantis_h2b_zikv.yml + - ../../../_internal/shared/model/targets/nucleus.yml + - ../../../_internal/shared/model/model_overlays/celldiff_predict.yml + - ../../../_internal/shared/model/launcher_profiles/mode_predict.yml + - ../../../_internal/shared/model/launcher_profiles/hardware_h200_single.yml + - ../../../_internal/shared/model/launcher_profiles/runtime_shared.yml + +benchmark: + task: virtual_staining + organelle: nucleus + trained_on: joint_ipsc_confocal_a549_mantis + predict_set: a549_mantis_h2b_zikv + model_name: celldiff + experiment_id: nucleus__joint_ipsc_confocal_a549_mantis__celldiff__a549_mantis_h2b_zikv + # Override the iPSC-side `nucleus` target to a549's gene-keyed `h2b`. + dataset_ref: + target: h2b + +model: + init_args: + ckpt_path: /hpc/projects/comp.micro/virtual_staining/models/cell_diff_vs_viscy/joint_ipsc_confocal_a549_mantis/nucl/celldiff/checkpoints/last.ckpt + predict_method: iterative # denoise, generate, sliding_window, or iterative + predict_overlap: [4, 256, 256] + +data: + init_args: + normalizations: + - class_path: viscy_transforms.NormalizeSampled + init_args: + keys: [Phase3D] + level: fov_statistics + subtrahend: mean + divisor: std + augmentations: [] + z_window_size: 48 # 8 for denoise and generate, 40 for iterative and sliding_window. 
+ +trainer: + callbacks: + - class_path: viscy_utils.callbacks.prediction_writer.HCSPredictionWriter + init_args: + output_store: /hpc/projects/virtual_staining/training/dynacell/a549/joint_predictions/nucl_celldiff_joint_zikv.zarr + +launcher: + job_name: CELLDiff_JOINT_PRED_NUCL_ON_A549_ZIKV + run_root: /hpc/projects/virtual_staining/training/dynacell/a549/joint_predictions diff --git a/applications/dynacell/configs/benchmarks/virtual_staining/nucleus/celldiff/joint_ipsc_confocal_a549_mantis/predict__ipsc_confocal.yml b/applications/dynacell/configs/benchmarks/virtual_staining/nucleus/celldiff/joint_ipsc_confocal_a549_mantis/predict__ipsc_confocal.yml new file mode 100644 index 000000000..fba17ac3f --- /dev/null +++ b/applications/dynacell/configs/benchmarks/virtual_staining/nucleus/celldiff/joint_ipsc_confocal_a549_mantis/predict__ipsc_confocal.yml @@ -0,0 +1,44 @@ +# CellDiff predict: nucleus trained on joint iPSC+A549, predicting against ipsc_confocal test_cropped. +base: + - ../../../_internal/shared/model/predict_sets/ipsc_confocal.yml + - ../../../_internal/shared/model/targets/nucleus.yml + - ../../../_internal/shared/model/model_overlays/celldiff_predict.yml + - ../../../_internal/shared/model/launcher_profiles/mode_predict.yml + - ../../../_internal/shared/model/launcher_profiles/hardware_h200_single.yml + - ../../../_internal/shared/model/launcher_profiles/runtime_shared.yml + +benchmark: + task: virtual_staining + organelle: nucleus + trained_on: joint_ipsc_confocal_a549_mantis + predict_set: ipsc_confocal + model_name: celldiff + experiment_id: nucleus__joint_ipsc_confocal_a549_mantis__celldiff__ipsc_confocal + +model: + init_args: + ckpt_path: /hpc/projects/comp.micro/virtual_staining/models/cell_diff_vs_viscy/joint_ipsc_confocal_a549_mantis/nucl/celldiff/checkpoints/last.ckpt + predict_method: iterative # denoise, generate, sliding_window, or iterative + predict_overlap: [4, 256, 256] + +data: + init_args: + normalizations: + - class_path: 
viscy_transforms.NormalizeSampled + init_args: + keys: [Phase3D] + level: fov_statistics + subtrahend: mean + divisor: std + augmentations: [] + z_window_size: 40 # 8 for denoise and generate, 40 for iterative and sliding_window. + +trainer: + callbacks: + - class_path: viscy_utils.callbacks.prediction_writer.HCSPredictionWriter + init_args: + output_store: /hpc/projects/virtual_staining/training/dynacell/ipsc/joint_predictions/nucl_celldiff.zarr + +launcher: + job_name: CELLDiff_JOINT_PRED_NUCL_ON_IPSC + run_root: /hpc/projects/virtual_staining/training/dynacell/ipsc/joint_predictions diff --git a/applications/dynacell/configs/benchmarks/virtual_staining/nucleus/celldiff/joint_ipsc_confocal_a549_mantis/train.yml b/applications/dynacell/configs/benchmarks/virtual_staining/nucleus/celldiff/joint_ipsc_confocal_a549_mantis/train.yml new file mode 100644 index 000000000..debdd13a5 --- /dev/null +++ b/applications/dynacell/configs/benchmarks/virtual_staining/nucleus/celldiff/joint_ipsc_confocal_a549_mantis/train.yml @@ -0,0 +1,145 @@ +# CellDiff fit on nucleus (Nuclei) — joint ipsc_confocal + a549_mantis pooled. +# +# Joint leaf per Stage 7 of A549_EXPANSION_ROADMAP.md. Uses +# BatchedConcatDataModule with two explicit HCSDataModule children +# (no benchmark.dataset_ref — joint leaves bypass the single-dataset +# resolver). Only model_overlays/celldiff_fit.yml is composed; the data +# block is authored inline because joint hparams live on the children. +# +# iPSC source is the multi-marker cell.zarr (Brightfield, Nuclei, +# Membrane, Phase3D); A549 source is the H2B-marker pooled store +# H2B_all.zarr. The shared target_channel name is `Nuclei` in both. +# +# Topology: single H200, single GPU — same as celldiff/ipsc_confocal/train.yml. +# The paper baseline pattern is single-GPU and we keep that here so +# iPSC-only and joint runs are apples-to-apples. 
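The shared-hparams pattern this header describes, one anchored init-args block merged into both HCSDataModule children via `<<:`, is standard YAML 1.1 merge-key behavior. A minimal sketch with PyYAML (paths and keys illustrative, not the real stores; whether the repo's launcher preserves merge keys depends on its YAML loader):

```python
import yaml  # PyYAML's SafeLoader implements YAML 1.1 merge keys ("<<")

DOC = """
_hcs_init_args: &hcs_init_args
  source_channel: Phase3D
  target_channel: Nuclei
  batch_size: 2
children:
  - <<: *hcs_init_args
    data_path: /data/ipsc/cell.zarr
  - <<: *hcs_init_args
    data_path: /data/a549/H2B_all.zarr
"""

cfg = yaml.safe_load(DOC)
# Each child mapping receives the anchored keys, with its own data_path on top.
```

This keeps the two children's batch size, patch size, and normalizations in lockstep, so only `data_path` can drift between the iPSC and A549 halves of the joint run.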
+base: + - ../../../_internal/shared/model/model_overlays/celldiff_fit.yml + - ../../../_internal/shared/model/launcher_profiles/mode_fit.yml + - ../../../_internal/shared/model/launcher_profiles/hardware_h200_single.yml + - ../../../_internal/shared/model/launcher_profiles/runtime_shared.yml + +benchmark: + task: virtual_staining + organelle: nucleus + gene: Nuclei + target: nucleus + target_id: nucleus + train_set: joint_ipsc_confocal_a549_mantis + model_name: celldiff + experiment_id: nucleus__joint_ipsc_confocal_a549_mantis__celldiff + +trainer: + logger: + init_args: + name: CELLDiff_JOINT_NUCL + save_dir: /hpc/projects/comp.micro/virtual_staining/models/cell_diff_vs_viscy/joint_ipsc_confocal_a549_mantis/nucl/celldiff + callbacks: + - class_path: lightning.pytorch.callbacks.LearningRateMonitor + init_args: + logging_interval: step + - class_path: lightning.pytorch.callbacks.ModelCheckpoint + init_args: + every_n_epochs: 1 + save_top_k: -1 + save_last: true + dirpath: /hpc/projects/comp.micro/virtual_staining/models/cell_diff_vs_viscy/joint_ipsc_confocal_a549_mantis/nucl/celldiff/checkpoints + +_hcs_init_args: &hcs_init_args + source_channel: Phase3D + target_channel: Nuclei + z_window_size: 13 + batch_size: 2 + num_workers: 4 + yx_patch_size: [512, 512] + split_ratio: 0.8 + mmap_preload: true + scratch_dir: /dev/shm + persistent_workers: true + normalizations: + - class_path: viscy_transforms.NormalizeSampled + init_args: + keys: [Phase3D] + level: fov_statistics + subtrahend: mean + divisor: std + - class_path: viscy_transforms.NormalizeSampled + init_args: + keys: [Nuclei] + level: fov_statistics + subtrahend: median + divisor: iqr + augmentations: + - class_path: viscy_transforms.RandWeightedCropd + init_args: + keys: [Phase3D, Nuclei] + w_key: Nuclei + spatial_size: [13, 624, 624] + num_samples: 2 + gpu_augmentations: + - class_path: viscy_transforms.BatchedRandAffined + init_args: + keys: [source, target] + prob: 0.8 + rotate_range: [3.14, 0, 0] + 
shear_range: [0.0, 0.05, 0.05] + scale_range: [[0.7, 1.3], [0.5, 1.5], [0.5, 1.5]] + safe_crop_size: [8, 512, 512] + safe_crop_coverage: 0.9 + - class_path: viscy_transforms.BatchedCenterSpatialCropd + init_args: + keys: [source, target] + roi_size: [8, 512, 512] + - class_path: viscy_transforms.BatchedRandAdjustContrastd + init_args: + keys: [source] + prob: 0.5 + gamma: [0.8, 1.2] + - class_path: viscy_transforms.BatchedRandScaleIntensityd + init_args: + keys: [source] + prob: 0.5 + factors: 0.5 + - class_path: viscy_transforms.BatchedRandGaussianNoised + init_args: + keys: [source] + prob: 0.5 + mean: 0.0 + std: 0.3 + - class_path: viscy_transforms.BatchedRandGaussianSmoothd + init_args: + keys: [source] + prob: 0.5 + sigma_x: [0.25, 0.75] + sigma_y: [0.25, 0.75] + sigma_z: [0.25, 0.75] + val_gpu_augmentations: + - class_path: viscy_transforms.BatchedCenterSpatialCropd + init_args: + keys: [source, target] + roi_size: [8, 512, 512] + +data: + class_path: viscy_data.BatchedConcatDataModule + init_args: + data_modules: + # ipsc_confocal — aics-hipsc multi-marker cell.zarr; HCSDataModule + # picks up only the requested target_channel (Nuclei). + - class_path: viscy_data.hcs.HCSDataModule + init_args: + <<: *hcs_init_args + data_path: /hpc/projects/virtual_staining/training/dynacell/ipsc/dataset_v4/train/cell.zarr + # a549_mantis — pooled H2B all-conditions train store + - class_path: viscy_data.hcs.HCSDataModule + init_args: + <<: *hcs_init_args + data_path: /hpc/projects/virtual_staining/training/dynacell/a549/mantis_v1/train/H2B_all.zarr + +launcher: + job_name: CELLDiff_JOINT_NUCL + run_root: /hpc/projects/comp.micro/virtual_staining/models/cell_diff_vs_viscy/joint_ipsc_confocal_a549_mantis/nucl/celldiff + # Joint preloads two stores (iPSC + A549 pool) into /dev/shm; the default + # 256G cap is too tight (256G iPSC mem + ~50G A549 + worker peak OOMs). + # 512G is the smallest tier that fits joint preload + worker overhead. 
+ sbatch: + mem: "512G" diff --git a/applications/dynacell/configs/benchmarks/virtual_staining/nucleus/fcmae_vscyto3d_pretrained/a549_mantis/predict__a549_mantis_denv.yml b/applications/dynacell/configs/benchmarks/virtual_staining/nucleus/fcmae_vscyto3d_pretrained/a549_mantis/predict__a549_mantis_denv.yml new file mode 100644 index 000000000..e851b54b0 --- /dev/null +++ b/applications/dynacell/configs/benchmarks/virtual_staining/nucleus/fcmae_vscyto3d_pretrained/a549_mantis/predict__a549_mantis_denv.yml @@ -0,0 +1,49 @@ +# FCMAE_VSCyto3D_Pretrained (VSCyto3D) predict: nucleus trained on a549_mantis (h2b), +# predicting against a549-mantis-h2b-denv test. +# Best val-loss checkpoint from job 31822558 (epoch 134, loss/validate=0.8142). +# A549 manifest keys nucleus by gene (`h2b`); override the iPSC-side `nucleus` +# target_id from targets/nucleus.yml so the resolver finds the h2b target on +# a549-mantis-h2b-denv. +base: + - ../../../_internal/shared/model/predict_sets/a549_mantis_h2b_denv.yml + - ../../../_internal/shared/model/targets/nucleus.yml + - ../../../_internal/shared/model/model_overlays/fcmae_vscyto3d_predict.yml + - ../../../_internal/shared/model/launcher_profiles/mode_predict.yml + - ../../../_internal/shared/model/launcher_profiles/hardware_h200_single.yml + - ../../../_internal/shared/model/launcher_profiles/runtime_shared.yml + +benchmark: + task: virtual_staining + organelle: nucleus + trained_on: a549_mantis + predict_set: a549_mantis_h2b_denv + model_name: fcmae_vscyto3d_pretrained + experiment_id: nucleus__a549_mantis__fcmae_vscyto3d_pretrained__a549_mantis_h2b_denv + # Override the iPSC-side `nucleus` target to a549's gene-keyed `h2b`. 
+ dataset_ref: + target: h2b + +model: + init_args: + ckpt_path: /hpc/projects/comp.micro/virtual_staining/models/dynacell/a549_mantis/nucl/fcmae_vscyto3d_pretrained/checkpoints/epoch=134-step=29295.ckpt + +data: + init_args: + normalizations: + - class_path: viscy_transforms.NormalizeSampled + init_args: + keys: [Phase3D] + level: fov_statistics + subtrahend: mean + divisor: std + augmentations: [] + +trainer: + callbacks: + - class_path: viscy_utils.callbacks.prediction_writer.HCSPredictionWriter + init_args: + output_store: /hpc/projects/virtual_staining/training/dynacell/a549/predictions/nucl_fcmae_vscyto3d_pretrained_a549trained_denv.zarr + +launcher: + job_name: FCMAE_VSCyto3D_Pretrained_PRED_NUCL_A549TR_DENV + run_root: /hpc/projects/virtual_staining/training/dynacell/a549/predictions diff --git a/applications/dynacell/configs/benchmarks/virtual_staining/nucleus/fcmae_vscyto3d_pretrained/a549_mantis/predict__a549_mantis_mock.yml b/applications/dynacell/configs/benchmarks/virtual_staining/nucleus/fcmae_vscyto3d_pretrained/a549_mantis/predict__a549_mantis_mock.yml new file mode 100644 index 000000000..35a93b1a3 --- /dev/null +++ b/applications/dynacell/configs/benchmarks/virtual_staining/nucleus/fcmae_vscyto3d_pretrained/a549_mantis/predict__a549_mantis_mock.yml @@ -0,0 +1,49 @@ +# FCMAE_VSCyto3D_Pretrained (VSCyto3D) predict: nucleus trained on a549_mantis (h2b), +# predicting against a549-mantis-h2b-mock test. +# Best val-loss checkpoint from job 31822558 (epoch 134, loss/validate=0.8142). +# A549 manifest keys nucleus by gene (`h2b`); override the iPSC-side `nucleus` +# target_id from targets/nucleus.yml so the resolver finds the h2b target on +# a549-mantis-h2b-mock. 
+base: + - ../../../_internal/shared/model/predict_sets/a549_mantis_h2b_mock.yml + - ../../../_internal/shared/model/targets/nucleus.yml + - ../../../_internal/shared/model/model_overlays/fcmae_vscyto3d_predict.yml + - ../../../_internal/shared/model/launcher_profiles/mode_predict.yml + - ../../../_internal/shared/model/launcher_profiles/hardware_h200_single.yml + - ../../../_internal/shared/model/launcher_profiles/runtime_shared.yml + +benchmark: + task: virtual_staining + organelle: nucleus + trained_on: a549_mantis + predict_set: a549_mantis_h2b_mock + model_name: fcmae_vscyto3d_pretrained + experiment_id: nucleus__a549_mantis__fcmae_vscyto3d_pretrained__a549_mantis_h2b_mock + # Override the iPSC-side `nucleus` target to a549's gene-keyed `h2b`. + dataset_ref: + target: h2b + +model: + init_args: + ckpt_path: /hpc/projects/comp.micro/virtual_staining/models/dynacell/a549_mantis/nucl/fcmae_vscyto3d_pretrained/checkpoints/epoch=134-step=29295.ckpt + +data: + init_args: + normalizations: + - class_path: viscy_transforms.NormalizeSampled + init_args: + keys: [Phase3D] + level: fov_statistics + subtrahend: mean + divisor: std + augmentations: [] + +trainer: + callbacks: + - class_path: viscy_utils.callbacks.prediction_writer.HCSPredictionWriter + init_args: + output_store: /hpc/projects/virtual_staining/training/dynacell/a549/predictions/nucl_fcmae_vscyto3d_pretrained_a549trained_mock.zarr + +launcher: + job_name: FCMAE_VSCyto3D_Pretrained_PRED_NUCL_A549TR_MOCK + run_root: /hpc/projects/virtual_staining/training/dynacell/a549/predictions diff --git a/applications/dynacell/configs/benchmarks/virtual_staining/nucleus/fcmae_vscyto3d_pretrained/a549_mantis/predict__a549_mantis_zikv.yml b/applications/dynacell/configs/benchmarks/virtual_staining/nucleus/fcmae_vscyto3d_pretrained/a549_mantis/predict__a549_mantis_zikv.yml new file mode 100644 index 000000000..2b4ad0374 --- /dev/null +++ 
b/applications/dynacell/configs/benchmarks/virtual_staining/nucleus/fcmae_vscyto3d_pretrained/a549_mantis/predict__a549_mantis_zikv.yml @@ -0,0 +1,49 @@ +# FCMAE_VSCyto3D_Pretrained (VSCyto3D) predict: nucleus trained on a549_mantis (h2b), +# predicting against a549-mantis-h2b-zikv test. +# Best val-loss checkpoint from job 31822558 (epoch 134, loss/validate=0.8142). +# A549 manifest keys nucleus by gene (`h2b`); override the iPSC-side `nucleus` +# target_id from targets/nucleus.yml so the resolver finds the h2b target on +# a549-mantis-h2b-zikv. +base: + - ../../../_internal/shared/model/predict_sets/a549_mantis_h2b_zikv.yml + - ../../../_internal/shared/model/targets/nucleus.yml + - ../../../_internal/shared/model/model_overlays/fcmae_vscyto3d_predict.yml + - ../../../_internal/shared/model/launcher_profiles/mode_predict.yml + - ../../../_internal/shared/model/launcher_profiles/hardware_h200_single.yml + - ../../../_internal/shared/model/launcher_profiles/runtime_shared.yml + +benchmark: + task: virtual_staining + organelle: nucleus + trained_on: a549_mantis + predict_set: a549_mantis_h2b_zikv + model_name: fcmae_vscyto3d_pretrained + experiment_id: nucleus__a549_mantis__fcmae_vscyto3d_pretrained__a549_mantis_h2b_zikv + # Override the iPSC-side `nucleus` target to a549's gene-keyed `h2b`. 
+ dataset_ref: + target: h2b + +model: + init_args: + ckpt_path: /hpc/projects/comp.micro/virtual_staining/models/dynacell/a549_mantis/nucl/fcmae_vscyto3d_pretrained/checkpoints/epoch=134-step=29295.ckpt + +data: + init_args: + normalizations: + - class_path: viscy_transforms.NormalizeSampled + init_args: + keys: [Phase3D] + level: fov_statistics + subtrahend: mean + divisor: std + augmentations: [] + +trainer: + callbacks: + - class_path: viscy_utils.callbacks.prediction_writer.HCSPredictionWriter + init_args: + output_store: /hpc/projects/virtual_staining/training/dynacell/a549/predictions/nucl_fcmae_vscyto3d_pretrained_a549trained_zikv.zarr + +launcher: + job_name: FCMAE_VSCyto3D_Pretrained_PRED_NUCL_A549TR_ZIKV + run_root: /hpc/projects/virtual_staining/training/dynacell/a549/predictions diff --git a/applications/dynacell/configs/benchmarks/virtual_staining/nucleus/fcmae_vscyto3d_pretrained/a549_mantis/predict__ipsc_confocal.yml b/applications/dynacell/configs/benchmarks/virtual_staining/nucleus/fcmae_vscyto3d_pretrained/a549_mantis/predict__ipsc_confocal.yml new file mode 100644 index 000000000..7bbd70ff9 --- /dev/null +++ b/applications/dynacell/configs/benchmarks/virtual_staining/nucleus/fcmae_vscyto3d_pretrained/a549_mantis/predict__ipsc_confocal.yml @@ -0,0 +1,43 @@ +# FCMAE_VSCyto3D_Pretrained (VSCyto3D) predict: nucleus trained on a549_mantis (h2b), +# predicting against ipsc_confocal test_cropped. +# Best val-loss checkpoint from job 31822558 (epoch 134, loss/validate=0.8142). 
+base: + - ../../../_internal/shared/model/predict_sets/ipsc_confocal.yml + - ../../../_internal/shared/model/targets/nucleus.yml + - ../../../_internal/shared/model/model_overlays/fcmae_vscyto3d_predict.yml + - ../../../_internal/shared/model/launcher_profiles/mode_predict.yml + - ../../../_internal/shared/model/launcher_profiles/hardware_h200_single.yml + - ../../../_internal/shared/model/launcher_profiles/runtime_shared.yml + +benchmark: + task: virtual_staining + organelle: nucleus + trained_on: a549_mantis + predict_set: ipsc_confocal + model_name: fcmae_vscyto3d_pretrained + experiment_id: nucleus__a549_mantis__fcmae_vscyto3d_pretrained__ipsc_confocal + +model: + init_args: + ckpt_path: /hpc/projects/comp.micro/virtual_staining/models/dynacell/a549_mantis/nucl/fcmae_vscyto3d_pretrained/checkpoints/epoch=134-step=29295.ckpt + +data: + init_args: + normalizations: + - class_path: viscy_transforms.NormalizeSampled + init_args: + keys: [Phase3D] + level: fov_statistics + subtrahend: mean + divisor: std + augmentations: [] + +trainer: + callbacks: + - class_path: viscy_utils.callbacks.prediction_writer.HCSPredictionWriter + init_args: + output_store: /hpc/projects/virtual_staining/training/dynacell/ipsc/predictions/nucl_fcmae_vscyto3d_pretrained_a549trained.zarr + +launcher: + job_name: FCMAE_VSCyto3D_Pretrained_PRED_NUCL_A549TR_IPSC + run_root: /hpc/projects/virtual_staining/training/dynacell/ipsc/predictions diff --git a/applications/dynacell/configs/benchmarks/virtual_staining/nucleus/fcmae_vscyto3d_pretrained/a549_mantis/train.yml b/applications/dynacell/configs/benchmarks/virtual_staining/nucleus/fcmae_vscyto3d_pretrained/a549_mantis/train.yml new file mode 100644 index 000000000..9d32fbce2 --- /dev/null +++ b/applications/dynacell/configs/benchmarks/virtual_staining/nucleus/fcmae_vscyto3d_pretrained/a549_mantis/train.yml @@ -0,0 +1,67 @@ +# FCMAE-class (FullyConvolutionalMAE, pretraining=False) with FCMAE- +# pretrained encoder init on nucleus (Nuclei 
marker). Companion to +# fcmae_vscyto3d_scratch.yml — the two leaves are identical except this +# one loads encoder weights from the published VSCyto3D FCMAE ckpt +# (400 ep on HEK + A549 + iPSC phase data). See vs_test/finetune_3d.py +# for the canonical recipe. +base: + - ../../../_internal/shared/model/train_sets/a549_mantis.yml + - ../../../_internal/shared/model/targets/nucleus.yml + - ../../../_internal/shared/model/data_overlays/fcmae_vscyto3d_fit.yml + - ../../../_internal/shared/model/model_overlays/fcmae_vscyto3d_fit.yml + - ../../../_internal/shared/model/launcher_profiles/mode_fit.yml + - ../../../_internal/shared/model/launcher_profiles/hardware_4gpu.yml + - ../../../_internal/shared/model/launcher_profiles/runtime_shared.yml + +benchmark: + task: virtual_staining + organelle: nucleus + train_set: a549_mantis + model_name: fcmae_vscyto3d_pretrained + experiment_id: nucleus__a549_mantis__fcmae_vscyto3d_pretrained + +# Override the FCMAE data overlay's hardcoded `Structure` augmentation +# keys (the overlay was authored for ER/Mito where target_channel == +# "Structure"). RandWeightedCropd needs the actual nucleus channel name +# in keys/w_key. spatial_size + num_samples kept identical to the FCMAE +# overlay so the augmentation policy matches ER/Mito. +data: + init_args: + # A549 pooled store + target_channel — no resolver in this train_set. + target_channel: Nuclei + data_path: /hpc/projects/virtual_staining/training/dynacell/a549/mantis_v1/train/H2B_all.zarr + augmentations: + - class_path: viscy_transforms.RandWeightedCropd + init_args: + keys: [Phase3D, Nuclei] + w_key: Nuclei + spatial_size: [20, 600, 600] + num_samples: 4 + +model: + init_args: + # Load only the encoder from the canonical VSCyto3D FCMAE ckpt — + # decoder/head stay at fresh init. Matches vs_test/finetune_3d.py:247. 
+ encoder_only: true + ckpt_path: /hpc/projects/virtual_staining/models/mehta-lab/VSCyto3D/fcmae.ckpt + +trainer: + logger: + init_args: + name: FCMAE_VSCyto3D_Pretrained_A549_Nucleus + save_dir: /hpc/projects/comp.micro/virtual_staining/models/dynacell/a549_mantis/nucl/fcmae_vscyto3d_pretrained + callbacks: + - class_path: lightning.pytorch.callbacks.LearningRateMonitor + init_args: + logging_interval: step + - class_path: lightning.pytorch.callbacks.ModelCheckpoint + init_args: + monitor: loss/validate + every_n_epochs: 1 + save_top_k: 5 + save_last: true + dirpath: /hpc/projects/comp.micro/virtual_staining/models/dynacell/a549_mantis/nucl/fcmae_vscyto3d_pretrained/checkpoints + +launcher: + job_name: FCMAE_VSCyto3D_Pretrained_A549_Nucleus + run_root: /hpc/projects/comp.micro/virtual_staining/models/dynacell/a549_mantis/nucl/fcmae_vscyto3d_pretrained diff --git a/applications/dynacell/configs/benchmarks/virtual_staining/nucleus/fcmae_vscyto3d_pretrained/ipsc_confocal/eval__a549_mantis_denv.yaml b/applications/dynacell/configs/benchmarks/virtual_staining/nucleus/fcmae_vscyto3d_pretrained/ipsc_confocal/eval__a549_mantis_denv.yaml new file mode 100644 index 000000000..e4aaab901 --- /dev/null +++ b/applications/dynacell/configs/benchmarks/virtual_staining/nucleus/fcmae_vscyto3d_pretrained/ipsc_confocal/eval__a549_mantis_denv.yaml @@ -0,0 +1,22 @@ +# @package _global_ +# Benchmark eval leaf: Nucleus (H2B) predicted by FCMAE_VSCyto3D_Pretrained on a549-mantis-h2b-denv. +# A549 manifest keys nucleus by gene (`h2b`); override the iPSC-side `nucleus` +# target_id from the target group so the resolver finds h2b on a549-mantis-h2b-denv. 
+defaults: + - override /target: nucleus + - override /predict_set: a549_mantis_h2b_denv + +target_name: h2b +benchmark: + dataset_ref: + target: h2b + +io: + pred_path: /hpc/projects/virtual_staining/training/dynacell/a549/predictions/nucl_fcmae_vscyto3d_pretrained_denv.zarr + +# A549 manifests don't carry cell_segmentation paths (no segmentation +# pipeline yet). Skip feature metrics until segmentation lands. +compute_feature_metrics: false + +save: + save_dir: /hpc/projects/virtual_staining/training/dynacell/a549/predictions/eval_nucl_fcmae_vscyto3d_pretrained_denv diff --git a/applications/dynacell/configs/benchmarks/virtual_staining/nucleus/fcmae_vscyto3d_pretrained/ipsc_confocal/eval__a549_mantis_mock.yaml b/applications/dynacell/configs/benchmarks/virtual_staining/nucleus/fcmae_vscyto3d_pretrained/ipsc_confocal/eval__a549_mantis_mock.yaml new file mode 100644 index 000000000..51b350c41 --- /dev/null +++ b/applications/dynacell/configs/benchmarks/virtual_staining/nucleus/fcmae_vscyto3d_pretrained/ipsc_confocal/eval__a549_mantis_mock.yaml @@ -0,0 +1,22 @@ +# @package _global_ +# Benchmark eval leaf: Nucleus (H2B) predicted by FCMAE_VSCyto3D_Pretrained on a549-mantis-h2b-mock. +# A549 manifest keys nucleus by gene (`h2b`); override the iPSC-side `nucleus` +# target_id from the target group so the resolver finds h2b on a549-mantis-h2b-mock. +defaults: + - override /target: nucleus + - override /predict_set: a549_mantis_h2b_mock + +target_name: h2b +benchmark: + dataset_ref: + target: h2b + +io: + pred_path: /hpc/projects/virtual_staining/training/dynacell/a549/predictions/nucl_fcmae_vscyto3d_pretrained_mock.zarr + +# A549 manifests don't carry cell_segmentation paths (no segmentation +# pipeline yet). Skip feature metrics until segmentation lands. 
+compute_feature_metrics: false + +save: + save_dir: /hpc/projects/virtual_staining/training/dynacell/a549/predictions/eval_nucl_fcmae_vscyto3d_pretrained_mock diff --git a/applications/dynacell/configs/benchmarks/virtual_staining/nucleus/fcmae_vscyto3d_pretrained/ipsc_confocal/eval__a549_mantis_zikv.yaml b/applications/dynacell/configs/benchmarks/virtual_staining/nucleus/fcmae_vscyto3d_pretrained/ipsc_confocal/eval__a549_mantis_zikv.yaml new file mode 100644 index 000000000..2f4f95bf9 --- /dev/null +++ b/applications/dynacell/configs/benchmarks/virtual_staining/nucleus/fcmae_vscyto3d_pretrained/ipsc_confocal/eval__a549_mantis_zikv.yaml @@ -0,0 +1,22 @@ +# @package _global_ +# Benchmark eval leaf: Nucleus (H2B) predicted by FCMAE_VSCyto3D_Pretrained on a549-mantis-h2b-zikv. +# A549 manifest keys nucleus by gene (`h2b`); override the iPSC-side `nucleus` +# target_id from the target group so the resolver finds h2b on a549-mantis-h2b-zikv. +defaults: + - override /target: nucleus + - override /predict_set: a549_mantis_h2b_zikv + +target_name: h2b +benchmark: + dataset_ref: + target: h2b + +io: + pred_path: /hpc/projects/virtual_staining/training/dynacell/a549/predictions/nucl_fcmae_vscyto3d_pretrained_zikv.zarr + +# A549 manifests don't carry cell_segmentation paths (no segmentation +# pipeline yet). Skip feature metrics until segmentation lands. 
+compute_feature_metrics: false + +save: + save_dir: /hpc/projects/virtual_staining/training/dynacell/a549/predictions/eval_nucl_fcmae_vscyto3d_pretrained_zikv diff --git a/applications/dynacell/configs/benchmarks/virtual_staining/nucleus/fcmae_vscyto3d_pretrained/ipsc_confocal/predict__a549_mantis_denv.yml b/applications/dynacell/configs/benchmarks/virtual_staining/nucleus/fcmae_vscyto3d_pretrained/ipsc_confocal/predict__a549_mantis_denv.yml new file mode 100644 index 000000000..e7c770c1e --- /dev/null +++ b/applications/dynacell/configs/benchmarks/virtual_staining/nucleus/fcmae_vscyto3d_pretrained/ipsc_confocal/predict__a549_mantis_denv.yml @@ -0,0 +1,52 @@ +# FCMAE_VSCyto3D_Pretrained predict: nucleus trained on iPSC, +# predicting against a549-mantis-h2b-denv test. +# A549 manifest keys nucleus by gene (`h2b`); override the iPSC-side +# `nucleus` target_id from targets/nucleus.yml so the resolver finds +# the h2b target on a549-mantis-h2b-denv. +# +# Pinned to best-val checkpoint from training run J31475094 +# (val 0.3921, epoch 89). Run cancelled at epoch 172 — val plateaued +# at epoch 89 and never recovered (~83 epochs without improvement). +base: + - ../../../_internal/shared/model/predict_sets/a549_mantis_h2b_denv.yml + - ../../../_internal/shared/model/targets/nucleus.yml + - ../../../_internal/shared/model/model_overlays/fcmae_vscyto3d_predict.yml + - ../../../_internal/shared/model/launcher_profiles/mode_predict.yml + - ../../../_internal/shared/model/launcher_profiles/hardware_h200_single.yml + - ../../../_internal/shared/model/launcher_profiles/runtime_shared.yml + +benchmark: + task: virtual_staining + organelle: nucleus + trained_on: ipsc_confocal + predict_set: a549_mantis_h2b_denv + model_name: fcmae_vscyto3d_pretrained + experiment_id: nucleus__ipsc_confocal__fcmae_vscyto3d_pretrained__a549_mantis_h2b_denv + # Override the iPSC-side `nucleus` target to a549's gene-keyed `h2b`. 
+ dataset_ref: + target: h2b + +model: + init_args: + ckpt_path: /hpc/projects/comp.micro/virtual_staining/models/dynacell/ipsc/nucl/fcmae_vscyto3d_pretrained/checkpoints/epoch=89-step=28080.ckpt + +data: + init_args: + normalizations: + - class_path: viscy_transforms.NormalizeSampled + init_args: + keys: [Phase3D] + level: fov_statistics + subtrahend: mean + divisor: std + augmentations: [] + +trainer: + callbacks: + - class_path: viscy_utils.callbacks.prediction_writer.HCSPredictionWriter + init_args: + output_store: /hpc/projects/virtual_staining/training/dynacell/a549/predictions/nucl_fcmae_vscyto3d_pretrained_denv.zarr + +launcher: + job_name: FCMAE_VSCyto3D_Pretrained_PRED_NUCL_ON_A549_DENV + run_root: /hpc/projects/virtual_staining/training/dynacell/a549/predictions diff --git a/applications/dynacell/configs/benchmarks/virtual_staining/nucleus/fcmae_vscyto3d_pretrained/ipsc_confocal/predict__a549_mantis_mock.yml b/applications/dynacell/configs/benchmarks/virtual_staining/nucleus/fcmae_vscyto3d_pretrained/ipsc_confocal/predict__a549_mantis_mock.yml new file mode 100644 index 000000000..ad00f10dd --- /dev/null +++ b/applications/dynacell/configs/benchmarks/virtual_staining/nucleus/fcmae_vscyto3d_pretrained/ipsc_confocal/predict__a549_mantis_mock.yml @@ -0,0 +1,52 @@ +# FCMAE_VSCyto3D_Pretrained predict: nucleus trained on iPSC, +# predicting against a549-mantis-h2b-mock test. +# A549 manifest keys nucleus by gene (`h2b`); override the iPSC-side +# `nucleus` target_id from targets/nucleus.yml so the resolver finds +# the h2b target on a549-mantis-h2b-mock. +# +# Pinned to best-val checkpoint from training run J31475094 +# (val 0.3921, epoch 89). Run cancelled at epoch 172 — val plateaued +# at epoch 89 and never recovered (~83 epochs without improvement). 
+base: + - ../../../_internal/shared/model/predict_sets/a549_mantis_h2b_mock.yml + - ../../../_internal/shared/model/targets/nucleus.yml + - ../../../_internal/shared/model/model_overlays/fcmae_vscyto3d_predict.yml + - ../../../_internal/shared/model/launcher_profiles/mode_predict.yml + - ../../../_internal/shared/model/launcher_profiles/hardware_h200_single.yml + - ../../../_internal/shared/model/launcher_profiles/runtime_shared.yml + +benchmark: + task: virtual_staining + organelle: nucleus + trained_on: ipsc_confocal + predict_set: a549_mantis_h2b_mock + model_name: fcmae_vscyto3d_pretrained + experiment_id: nucleus__ipsc_confocal__fcmae_vscyto3d_pretrained__a549_mantis_h2b_mock + # Override the iPSC-side `nucleus` target to a549's gene-keyed `h2b`. + dataset_ref: + target: h2b + +model: + init_args: + ckpt_path: /hpc/projects/comp.micro/virtual_staining/models/dynacell/ipsc/nucl/fcmae_vscyto3d_pretrained/checkpoints/epoch=89-step=28080.ckpt + +data: + init_args: + normalizations: + - class_path: viscy_transforms.NormalizeSampled + init_args: + keys: [Phase3D] + level: fov_statistics + subtrahend: mean + divisor: std + augmentations: [] + +trainer: + callbacks: + - class_path: viscy_utils.callbacks.prediction_writer.HCSPredictionWriter + init_args: + output_store: /hpc/projects/virtual_staining/training/dynacell/a549/predictions/nucl_fcmae_vscyto3d_pretrained_mock.zarr + +launcher: + job_name: FCMAE_VSCyto3D_Pretrained_PRED_NUCL_ON_A549_MOCK + run_root: /hpc/projects/virtual_staining/training/dynacell/a549/predictions diff --git a/applications/dynacell/configs/benchmarks/virtual_staining/nucleus/fcmae_vscyto3d_pretrained/ipsc_confocal/predict__a549_mantis_zikv.yml b/applications/dynacell/configs/benchmarks/virtual_staining/nucleus/fcmae_vscyto3d_pretrained/ipsc_confocal/predict__a549_mantis_zikv.yml new file mode 100644 index 000000000..6496d99b1 --- /dev/null +++ 
b/applications/dynacell/configs/benchmarks/virtual_staining/nucleus/fcmae_vscyto3d_pretrained/ipsc_confocal/predict__a549_mantis_zikv.yml @@ -0,0 +1,52 @@ +# FCMAE_VSCyto3D_Pretrained predict: nucleus trained on iPSC, +# predicting against a549-mantis-h2b-zikv test. +# A549 manifest keys nucleus by gene (`h2b`); override the iPSC-side +# `nucleus` target_id from targets/nucleus.yml so the resolver finds +# the h2b target on a549-mantis-h2b-zikv. +# +# Pinned to best-val checkpoint from training run J31475094 +# (val 0.3921, epoch 89). Run cancelled at epoch 172 — val plateaued +# at epoch 89 and never recovered (~83 epochs without improvement). +base: + - ../../../_internal/shared/model/predict_sets/a549_mantis_h2b_zikv.yml + - ../../../_internal/shared/model/targets/nucleus.yml + - ../../../_internal/shared/model/model_overlays/fcmae_vscyto3d_predict.yml + - ../../../_internal/shared/model/launcher_profiles/mode_predict.yml + - ../../../_internal/shared/model/launcher_profiles/hardware_h200_single.yml + - ../../../_internal/shared/model/launcher_profiles/runtime_shared.yml + +benchmark: + task: virtual_staining + organelle: nucleus + trained_on: ipsc_confocal + predict_set: a549_mantis_h2b_zikv + model_name: fcmae_vscyto3d_pretrained + experiment_id: nucleus__ipsc_confocal__fcmae_vscyto3d_pretrained__a549_mantis_h2b_zikv + # Override the iPSC-side `nucleus` target to a549's gene-keyed `h2b`. 
+ dataset_ref: + target: h2b + +model: + init_args: + ckpt_path: /hpc/projects/comp.micro/virtual_staining/models/dynacell/ipsc/nucl/fcmae_vscyto3d_pretrained/checkpoints/epoch=89-step=28080.ckpt + +data: + init_args: + normalizations: + - class_path: viscy_transforms.NormalizeSampled + init_args: + keys: [Phase3D] + level: fov_statistics + subtrahend: mean + divisor: std + augmentations: [] + +trainer: + callbacks: + - class_path: viscy_utils.callbacks.prediction_writer.HCSPredictionWriter + init_args: + output_store: /hpc/projects/virtual_staining/training/dynacell/a549/predictions/nucl_fcmae_vscyto3d_pretrained_zikv.zarr + +launcher: + job_name: FCMAE_VSCyto3D_Pretrained_PRED_NUCL_ON_A549_ZIKV + run_root: /hpc/projects/virtual_staining/training/dynacell/a549/predictions diff --git a/applications/dynacell/configs/benchmarks/virtual_staining/nucleus/fcmae_vscyto3d_pretrained/ipsc_confocal/predict__ipsc_confocal.yml b/applications/dynacell/configs/benchmarks/virtual_staining/nucleus/fcmae_vscyto3d_pretrained/ipsc_confocal/predict__ipsc_confocal.yml new file mode 100644 index 000000000..b1587ef64 --- /dev/null +++ b/applications/dynacell/configs/benchmarks/virtual_staining/nucleus/fcmae_vscyto3d_pretrained/ipsc_confocal/predict__ipsc_confocal.yml @@ -0,0 +1,45 @@ +# FCMAE_VSCyto3D_Pretrained predict: nucleus (H2B) against ipsc_confocal test_cropped. +# +# Pinned to best-val checkpoint from training run J31475094 +# (val 0.3921, epoch 89). Run cancelled at epoch 172 — val plateaued +# at epoch 89 and never recovered (~83 epochs without improvement). 
+base: + - ../../../_internal/shared/model/predict_sets/ipsc_confocal.yml + - ../../../_internal/shared/model/targets/nucleus.yml + - ../../../_internal/shared/model/model_overlays/fcmae_vscyto3d_predict.yml + - ../../../_internal/shared/model/launcher_profiles/mode_predict.yml + - ../../../_internal/shared/model/launcher_profiles/hardware_h200_single.yml + - ../../../_internal/shared/model/launcher_profiles/runtime_shared.yml + +benchmark: + task: virtual_staining + organelle: nucleus + trained_on: ipsc_confocal + predict_set: ipsc_confocal + model_name: fcmae_vscyto3d_pretrained + experiment_id: nucleus__ipsc_confocal__fcmae_vscyto3d_pretrained__ipsc_confocal + +model: + init_args: + ckpt_path: /hpc/projects/comp.micro/virtual_staining/models/dynacell/ipsc/nucl/fcmae_vscyto3d_pretrained/checkpoints/epoch=89-step=28080.ckpt + +data: + init_args: + normalizations: + - class_path: viscy_transforms.NormalizeSampled + init_args: + keys: [Phase3D] + level: fov_statistics + subtrahend: mean + divisor: std + augmentations: [] + +trainer: + callbacks: + - class_path: viscy_utils.callbacks.prediction_writer.HCSPredictionWriter + init_args: + output_store: /hpc/projects/virtual_staining/training/dynacell/ipsc/predictions/nucl_fcmae_vscyto3d_pretrained.zarr + +launcher: + job_name: FCMAE_VSCyto3D_Pretrained_PRED_NUCL + run_root: /hpc/projects/virtual_staining/training/dynacell/ipsc/predictions diff --git a/applications/dynacell/configs/benchmarks/virtual_staining/nucleus/fcmae_vscyto3d_pretrained/ipsc_confocal/train.yml b/applications/dynacell/configs/benchmarks/virtual_staining/nucleus/fcmae_vscyto3d_pretrained/ipsc_confocal/train.yml new file mode 100644 index 000000000..fb8990970 --- /dev/null +++ b/applications/dynacell/configs/benchmarks/virtual_staining/nucleus/fcmae_vscyto3d_pretrained/ipsc_confocal/train.yml @@ -0,0 +1,64 @@ +# FCMAE-class (FullyConvolutionalMAE, pretraining=False) with FCMAE- +# pretrained encoder init on nucleus (Nuclei marker). 
Companion to +# fcmae_vscyto3d_scratch.yml — the two leaves are identical except this +# one loads encoder weights from the published VSCyto3D FCMAE ckpt +# (400 ep on HEK + A549 + iPSC phase data). See vs_test/finetune_3d.py +# for the canonical recipe. +base: + - ../../../_internal/shared/model/train_sets/ipsc_confocal.yml + - ../../../_internal/shared/model/targets/nucleus.yml + - ../../../_internal/shared/model/data_overlays/fcmae_vscyto3d_fit.yml + - ../../../_internal/shared/model/model_overlays/fcmae_vscyto3d_fit.yml + - ../../../_internal/shared/model/launcher_profiles/mode_fit.yml + - ../../../_internal/shared/model/launcher_profiles/hardware_4gpu.yml + - ../../../_internal/shared/model/launcher_profiles/runtime_shared.yml + +benchmark: + task: virtual_staining + organelle: nucleus + train_set: ipsc_confocal + model_name: fcmae_vscyto3d_pretrained + experiment_id: nucleus__ipsc_confocal__fcmae_vscyto3d_pretrained + +# Override the FCMAE data overlay's hardcoded `Structure` augmentation +# keys (the overlay was authored for ER/Mito where target_channel == +# "Structure"). RandWeightedCropd needs the actual nucleus channel name +# in keys/w_key. spatial_size + num_samples kept identical to the FCMAE +# overlay so the augmentation policy matches ER/Mito. +data: + init_args: + augmentations: + - class_path: viscy_transforms.RandWeightedCropd + init_args: + keys: [Phase3D, Nuclei] + w_key: Nuclei + spatial_size: [20, 600, 600] + num_samples: 4 + +model: + init_args: + # Load only the encoder from the canonical VSCyto3D FCMAE ckpt — + # decoder/head stay at fresh init. Matches vs_test/finetune_3d.py:247. 
+ encoder_only: true + ckpt_path: /hpc/projects/virtual_staining/models/mehta-lab/VSCyto3D/fcmae.ckpt + +trainer: + logger: + init_args: + name: FCMAE_VSCyto3D_Pretrained_iPSC_Nucleus + save_dir: /hpc/projects/comp.micro/virtual_staining/models/dynacell/ipsc/nucl/fcmae_vscyto3d_pretrained + callbacks: + - class_path: lightning.pytorch.callbacks.LearningRateMonitor + init_args: + logging_interval: step + - class_path: lightning.pytorch.callbacks.ModelCheckpoint + init_args: + monitor: loss/validate + every_n_epochs: 1 + save_top_k: 5 + save_last: true + dirpath: /hpc/projects/comp.micro/virtual_staining/models/dynacell/ipsc/nucl/fcmae_vscyto3d_pretrained/checkpoints + +launcher: + job_name: FCMAE_VSCyto3D_Pretrained_Nucleus + run_root: /hpc/projects/comp.micro/virtual_staining/models/dynacell/ipsc/nucl/fcmae_vscyto3d_pretrained diff --git a/applications/dynacell/configs/benchmarks/virtual_staining/nucleus/fcmae_vscyto3d_pretrained/joint_ipsc_confocal_a549_mantis/train.yml b/applications/dynacell/configs/benchmarks/virtual_staining/nucleus/fcmae_vscyto3d_pretrained/joint_ipsc_confocal_a549_mantis/train.yml new file mode 100644 index 000000000..32a7756ef --- /dev/null +++ b/applications/dynacell/configs/benchmarks/virtual_staining/nucleus/fcmae_vscyto3d_pretrained/joint_ipsc_confocal_a549_mantis/train.yml @@ -0,0 +1,155 @@ +# FCMAE-class (FullyConvolutionalMAE, pretraining=False) with FCMAE- +# pretrained encoder init on nucleus (NUCL) — joint +# ipsc_confocal + a549_mantis pooled. Companion to +# fcmae_vscyto3d_scratch joint leaf — the two are identical except +# this one loads encoder weights from the published VSCyto3D FCMAE +# ckpt (400 ep on HEK + A549 + iPSC phase data). Mirrors +# nucleus/fcmae_vscyto3d_pretrained/ipsc_confocal/train.yml on +# the joint train_set. +# +# Joint leaf per Stage 7 of A549_EXPANSION_ROADMAP.md. 
Uses +# BatchedConcatDataModule with two explicit HCSDataModule children +# (no benchmark.dataset_ref — joint leaves bypass the single-dataset +# resolver). Only model_overlays/fcmae_vscyto3d_fit.yml is composed; +# the data block is authored inline because joint hparams live on +# the children. +# +# Topology: 4-GPU DDP (inherited from +# model_overlays/fcmae_vscyto3d_fit.yml's ddp_4gpu base; the overlay +# also pins strategy=ddp_find_unused_parameters_true because +# FullyConvolutionalMAE has decoder/head params that only receive +# gradients on some forward paths). +base: + - ../../../_internal/shared/model/model_overlays/fcmae_vscyto3d_fit.yml + - ../../../_internal/shared/model/launcher_profiles/mode_fit.yml + - ../../../_internal/shared/model/launcher_profiles/hardware_4gpu.yml + - ../../../_internal/shared/model/launcher_profiles/runtime_shared.yml + +benchmark: + task: virtual_staining + organelle: nucleus + gene: Nuclei + target: nucleus + target_id: nucleus + train_set: joint_ipsc_confocal_a549_mantis + model_name: fcmae_vscyto3d_pretrained + experiment_id: nucleus__joint_ipsc_confocal_a549_mantis__fcmae_vscyto3d_pretrained + +model: + init_args: + # Load only the encoder from the canonical VSCyto3D FCMAE ckpt — + # decoder/head stay at fresh init. Matches vs_test/finetune_3d.py:247. 
+ encoder_only: true + ckpt_path: /hpc/projects/virtual_staining/models/mehta-lab/VSCyto3D/fcmae.ckpt + +trainer: + logger: + init_args: + name: FCMAE_VSCyto3D_Pretrained_JOINT_NUCL + save_dir: /hpc/projects/comp.micro/virtual_staining/models/dynacell/joint_ipsc_confocal_a549_mantis/nucl/fcmae_vscyto3d_pretrained + callbacks: + - class_path: lightning.pytorch.callbacks.LearningRateMonitor + init_args: + logging_interval: step + - class_path: lightning.pytorch.callbacks.ModelCheckpoint + init_args: + monitor: loss/validate + every_n_epochs: 1 + save_top_k: 5 + save_last: true + dirpath: /hpc/projects/comp.micro/virtual_staining/models/dynacell/joint_ipsc_confocal_a549_mantis/nucl/fcmae_vscyto3d_pretrained/checkpoints + +_hcs_init_args: &hcs_init_args + source_channel: Phase3D + target_channel: Nuclei + z_window_size: 20 + # batch_size is NOT divided by num_samples in joint mode (see + # nucleus/fnet3d_paper/joint_*/train.yml for the rationale): 8 + # indices * num_samples=4 = 32 GPU samples per DDP rank, matching + # single-set's effective batch. 
+ batch_size: 8 + num_workers: 4 + yx_patch_size: [384, 384] + split_ratio: 0.8 + mmap_preload: true + scratch_dir: /dev/shm + persistent_workers: true + normalizations: + - class_path: viscy_transforms.NormalizeSampled + init_args: + keys: [Phase3D] + level: fov_statistics + subtrahend: mean + divisor: std + - class_path: viscy_transforms.NormalizeSampled + init_args: + keys: [Nuclei] + level: fov_statistics + subtrahend: median + divisor: iqr + augmentations: + - class_path: viscy_transforms.RandWeightedCropd + init_args: + keys: [Phase3D, Nuclei] + w_key: Nuclei + spatial_size: [20, 600, 600] + num_samples: 4 + gpu_augmentations: + - class_path: viscy_transforms.BatchedRandAffined + init_args: + keys: [source, target] + prob: 0.8 + rotate_range: [3.14, 0, 0] + shear_range: [0.0, 0.05, 0.05] + scale_range: [[0.7, 1.3], [0.5, 1.5], [0.5, 1.5]] + - class_path: viscy_transforms.BatchedCenterSpatialCropd + init_args: + keys: [source, target] + roi_size: [15, 384, 384] + - class_path: viscy_transforms.BatchedRandAdjustContrastd + init_args: + keys: [source] + prob: 0.5 + gamma: [0.8, 1.2] + - class_path: viscy_transforms.BatchedRandScaleIntensityd + init_args: + keys: [source] + prob: 0.5 + factors: 0.5 + - class_path: viscy_transforms.BatchedRandGaussianNoised + init_args: + keys: [source] + prob: 0.5 + mean: 0.0 + std: 0.3 + - class_path: viscy_transforms.BatchedRandGaussianSmoothd + init_args: + keys: [source] + prob: 0.5 + sigma_x: [0.25, 0.75] + sigma_y: [0.25, 0.75] + sigma_z: [0.25, 0.75] + val_gpu_augmentations: + - class_path: viscy_transforms.BatchedCenterSpatialCropd + init_args: + keys: [source, target] + roi_size: [15, 384, 384] + +data: + class_path: viscy_data.BatchedConcatDataModule + init_args: + data_modules: + # ipsc_confocal — aics-hipsc multi-marker cell.zarr (Nuclei channel) + - class_path: viscy_data.hcs.HCSDataModule + init_args: + <<: *hcs_init_args + data_path: /hpc/projects/virtual_staining/training/dynacell/ipsc/dataset_v4/train/cell.zarr + 
# a549_mantis — pooled H2B all-conditions train store (Nuclei channel) + - class_path: viscy_data.hcs.HCSDataModule + init_args: + <<: *hcs_init_args + data_path: /hpc/projects/virtual_staining/training/dynacell/a549/mantis_v1/train/H2B_all.zarr + +launcher: + job_name: FCMAE_VSCyto3D_Pretrained_JOINT_NUCL + run_root: /hpc/projects/comp.micro/virtual_staining/models/dynacell/joint_ipsc_confocal_a549_mantis/nucl/fcmae_vscyto3d_pretrained diff --git a/applications/dynacell/configs/benchmarks/virtual_staining/nucleus/fcmae_vscyto3d_scratch/a549_mantis/predict__a549_mantis_denv.yml b/applications/dynacell/configs/benchmarks/virtual_staining/nucleus/fcmae_vscyto3d_scratch/a549_mantis/predict__a549_mantis_denv.yml new file mode 100644 index 000000000..71c75dd2a --- /dev/null +++ b/applications/dynacell/configs/benchmarks/virtual_staining/nucleus/fcmae_vscyto3d_scratch/a549_mantis/predict__a549_mantis_denv.yml @@ -0,0 +1,49 @@ +# FCMAE_VSCyto3D_Scratch (UNeXt2) predict: nucleus trained on a549_mantis (h2b), +# predicting against a549-mantis-h2b-denv test. +# Best val-loss checkpoint from job 31822562 (epoch 110, loss/validate=0.8345). +# A549 manifest keys nucleus by gene (`h2b`); override the iPSC-side `nucleus` +# target_id from targets/nucleus.yml so the resolver finds the h2b target on +# a549-mantis-h2b-denv. 
+base: + - ../../../_internal/shared/model/predict_sets/a549_mantis_h2b_denv.yml + - ../../../_internal/shared/model/targets/nucleus.yml + - ../../../_internal/shared/model/model_overlays/fcmae_vscyto3d_predict.yml + - ../../../_internal/shared/model/launcher_profiles/mode_predict.yml + - ../../../_internal/shared/model/launcher_profiles/hardware_h200_single.yml + - ../../../_internal/shared/model/launcher_profiles/runtime_shared.yml + +benchmark: + task: virtual_staining + organelle: nucleus + trained_on: a549_mantis + predict_set: a549_mantis_h2b_denv + model_name: fcmae_vscyto3d_scratch + experiment_id: nucleus__a549_mantis__fcmae_vscyto3d_scratch__a549_mantis_h2b_denv + # Override the iPSC-side `nucleus` target to a549's gene-keyed `h2b`. + dataset_ref: + target: h2b + +model: + init_args: + ckpt_path: /hpc/projects/comp.micro/virtual_staining/models/dynacell/a549_mantis/nucl/fcmae_vscyto3d_scratch/checkpoints/epoch=110-step=24087.ckpt + +data: + init_args: + normalizations: + - class_path: viscy_transforms.NormalizeSampled + init_args: + keys: [Phase3D] + level: fov_statistics + subtrahend: mean + divisor: std + augmentations: [] + +trainer: + callbacks: + - class_path: viscy_utils.callbacks.prediction_writer.HCSPredictionWriter + init_args: + output_store: /hpc/projects/virtual_staining/training/dynacell/a549/predictions/nucl_fcmae_vscyto3d_scratch_a549trained_denv.zarr + +launcher: + job_name: FCMAE_VSCyto3D_Scratch_PRED_NUCL_A549TR_DENV + run_root: /hpc/projects/virtual_staining/training/dynacell/a549/predictions diff --git a/applications/dynacell/configs/benchmarks/virtual_staining/nucleus/fcmae_vscyto3d_scratch/a549_mantis/predict__a549_mantis_mock.yml b/applications/dynacell/configs/benchmarks/virtual_staining/nucleus/fcmae_vscyto3d_scratch/a549_mantis/predict__a549_mantis_mock.yml new file mode 100644 index 000000000..39bdd04ea --- /dev/null +++ 
b/applications/dynacell/configs/benchmarks/virtual_staining/nucleus/fcmae_vscyto3d_scratch/a549_mantis/predict__a549_mantis_mock.yml @@ -0,0 +1,49 @@ +# FCMAE_VSCyto3D_Scratch (UNeXt2) predict: nucleus trained on a549_mantis (h2b), +# predicting against a549-mantis-h2b-mock test. +# Best val-loss checkpoint from job 31822562 (epoch 110, loss/validate=0.8345). +# A549 manifest keys nucleus by gene (`h2b`); override the iPSC-side `nucleus` +# target_id from targets/nucleus.yml so the resolver finds the h2b target on +# a549-mantis-h2b-mock. +base: + - ../../../_internal/shared/model/predict_sets/a549_mantis_h2b_mock.yml + - ../../../_internal/shared/model/targets/nucleus.yml + - ../../../_internal/shared/model/model_overlays/fcmae_vscyto3d_predict.yml + - ../../../_internal/shared/model/launcher_profiles/mode_predict.yml + - ../../../_internal/shared/model/launcher_profiles/hardware_h200_single.yml + - ../../../_internal/shared/model/launcher_profiles/runtime_shared.yml + +benchmark: + task: virtual_staining + organelle: nucleus + trained_on: a549_mantis + predict_set: a549_mantis_h2b_mock + model_name: fcmae_vscyto3d_scratch + experiment_id: nucleus__a549_mantis__fcmae_vscyto3d_scratch__a549_mantis_h2b_mock + # Override the iPSC-side `nucleus` target to a549's gene-keyed `h2b`. 
+ dataset_ref: + target: h2b + +model: + init_args: + ckpt_path: /hpc/projects/comp.micro/virtual_staining/models/dynacell/a549_mantis/nucl/fcmae_vscyto3d_scratch/checkpoints/epoch=110-step=24087.ckpt + +data: + init_args: + normalizations: + - class_path: viscy_transforms.NormalizeSampled + init_args: + keys: [Phase3D] + level: fov_statistics + subtrahend: mean + divisor: std + augmentations: [] + +trainer: + callbacks: + - class_path: viscy_utils.callbacks.prediction_writer.HCSPredictionWriter + init_args: + output_store: /hpc/projects/virtual_staining/training/dynacell/a549/predictions/nucl_fcmae_vscyto3d_scratch_a549trained_mock.zarr + +launcher: + job_name: FCMAE_VSCyto3D_Scratch_PRED_NUCL_A549TR_MOCK + run_root: /hpc/projects/virtual_staining/training/dynacell/a549/predictions diff --git a/applications/dynacell/configs/benchmarks/virtual_staining/nucleus/fcmae_vscyto3d_scratch/a549_mantis/predict__a549_mantis_zikv.yml b/applications/dynacell/configs/benchmarks/virtual_staining/nucleus/fcmae_vscyto3d_scratch/a549_mantis/predict__a549_mantis_zikv.yml new file mode 100644 index 000000000..c154797db --- /dev/null +++ b/applications/dynacell/configs/benchmarks/virtual_staining/nucleus/fcmae_vscyto3d_scratch/a549_mantis/predict__a549_mantis_zikv.yml @@ -0,0 +1,49 @@ +# FCMAE_VSCyto3D_Scratch (UNeXt2) predict: nucleus trained on a549_mantis (h2b), +# predicting against a549-mantis-h2b-zikv test. +# Best val-loss checkpoint from job 31822562 (epoch 110, loss/validate=0.8345). +# A549 manifest keys nucleus by gene (`h2b`); override the iPSC-side `nucleus` +# target_id from targets/nucleus.yml so the resolver finds the h2b target on +# a549-mantis-h2b-zikv. 
+base: + - ../../../_internal/shared/model/predict_sets/a549_mantis_h2b_zikv.yml + - ../../../_internal/shared/model/targets/nucleus.yml + - ../../../_internal/shared/model/model_overlays/fcmae_vscyto3d_predict.yml + - ../../../_internal/shared/model/launcher_profiles/mode_predict.yml + - ../../../_internal/shared/model/launcher_profiles/hardware_h200_single.yml + - ../../../_internal/shared/model/launcher_profiles/runtime_shared.yml + +benchmark: + task: virtual_staining + organelle: nucleus + trained_on: a549_mantis + predict_set: a549_mantis_h2b_zikv + model_name: fcmae_vscyto3d_scratch + experiment_id: nucleus__a549_mantis__fcmae_vscyto3d_scratch__a549_mantis_h2b_zikv + # Override the iPSC-side `nucleus` target to a549's gene-keyed `h2b`. + dataset_ref: + target: h2b + +model: + init_args: + ckpt_path: /hpc/projects/comp.micro/virtual_staining/models/dynacell/a549_mantis/nucl/fcmae_vscyto3d_scratch/checkpoints/epoch=110-step=24087.ckpt + +data: + init_args: + normalizations: + - class_path: viscy_transforms.NormalizeSampled + init_args: + keys: [Phase3D] + level: fov_statistics + subtrahend: mean + divisor: std + augmentations: [] + +trainer: + callbacks: + - class_path: viscy_utils.callbacks.prediction_writer.HCSPredictionWriter + init_args: + output_store: /hpc/projects/virtual_staining/training/dynacell/a549/predictions/nucl_fcmae_vscyto3d_scratch_a549trained_zikv.zarr + +launcher: + job_name: FCMAE_VSCyto3D_Scratch_PRED_NUCL_A549TR_ZIKV + run_root: /hpc/projects/virtual_staining/training/dynacell/a549/predictions diff --git a/applications/dynacell/configs/benchmarks/virtual_staining/nucleus/fcmae_vscyto3d_scratch/a549_mantis/predict__ipsc_confocal.yml b/applications/dynacell/configs/benchmarks/virtual_staining/nucleus/fcmae_vscyto3d_scratch/a549_mantis/predict__ipsc_confocal.yml new file mode 100644 index 000000000..18a6a46c3 --- /dev/null +++ 
b/applications/dynacell/configs/benchmarks/virtual_staining/nucleus/fcmae_vscyto3d_scratch/a549_mantis/predict__ipsc_confocal.yml @@ -0,0 +1,43 @@ +# FCMAE_VSCyto3D_Scratch (UNeXt2) predict: nucleus trained on a549_mantis (h2b), +# predicting against ipsc_confocal test_cropped. +# Best val-loss checkpoint from job 31822562 (epoch 110, loss/validate=0.8345). +base: + - ../../../_internal/shared/model/predict_sets/ipsc_confocal.yml + - ../../../_internal/shared/model/targets/nucleus.yml + - ../../../_internal/shared/model/model_overlays/fcmae_vscyto3d_predict.yml + - ../../../_internal/shared/model/launcher_profiles/mode_predict.yml + - ../../../_internal/shared/model/launcher_profiles/hardware_h200_single.yml + - ../../../_internal/shared/model/launcher_profiles/runtime_shared.yml + +benchmark: + task: virtual_staining + organelle: nucleus + trained_on: a549_mantis + predict_set: ipsc_confocal + model_name: fcmae_vscyto3d_scratch + experiment_id: nucleus__a549_mantis__fcmae_vscyto3d_scratch__ipsc_confocal + +model: + init_args: + ckpt_path: /hpc/projects/comp.micro/virtual_staining/models/dynacell/a549_mantis/nucl/fcmae_vscyto3d_scratch/checkpoints/epoch=110-step=24087.ckpt + +data: + init_args: + normalizations: + - class_path: viscy_transforms.NormalizeSampled + init_args: + keys: [Phase3D] + level: fov_statistics + subtrahend: mean + divisor: std + augmentations: [] + +trainer: + callbacks: + - class_path: viscy_utils.callbacks.prediction_writer.HCSPredictionWriter + init_args: + output_store: /hpc/projects/virtual_staining/training/dynacell/ipsc/predictions/nucl_fcmae_vscyto3d_scratch_a549trained.zarr + +launcher: + job_name: FCMAE_VSCyto3D_Scratch_PRED_NUCL_A549TR_IPSC + run_root: /hpc/projects/virtual_staining/training/dynacell/ipsc/predictions diff --git a/applications/dynacell/configs/benchmarks/virtual_staining/nucleus/fcmae_vscyto3d_scratch/a549_mantis/train.yml 
b/applications/dynacell/configs/benchmarks/virtual_staining/nucleus/fcmae_vscyto3d_scratch/a549_mantis/train.yml new file mode 100644 index 000000000..75cc6f2e3 --- /dev/null +++ b/applications/dynacell/configs/benchmarks/virtual_staining/nucleus/fcmae_vscyto3d_scratch/a549_mantis/train.yml @@ -0,0 +1,59 @@ +# FCMAE-class (FullyConvolutionalMAE, pretraining=False) random-init +# baseline on nucleus (Nuclei marker). Scratch control for the pretrained +# counterpart — the two leaves are identical except this one does NOT +# load pretrained encoder weights. See UNEXT2_VS_FCMAE_CLASSES.md for +# why this is the paper-adjacent scratch baseline (and not unext2.yml). +base: + - ../../../_internal/shared/model/train_sets/a549_mantis.yml + - ../../../_internal/shared/model/targets/nucleus.yml + - ../../../_internal/shared/model/data_overlays/fcmae_vscyto3d_fit.yml + - ../../../_internal/shared/model/model_overlays/fcmae_vscyto3d_fit.yml + - ../../../_internal/shared/model/launcher_profiles/mode_fit.yml + - ../../../_internal/shared/model/launcher_profiles/hardware_4gpu.yml + - ../../../_internal/shared/model/launcher_profiles/runtime_shared.yml + +benchmark: + task: virtual_staining + organelle: nucleus + train_set: a549_mantis + model_name: fcmae_vscyto3d_scratch + experiment_id: nucleus__a549_mantis__fcmae_vscyto3d_scratch + +# Override the FCMAE data overlay's hardcoded `Structure` augmentation +# keys (the overlay was authored for ER/Mito where target_channel == +# "Structure"). RandWeightedCropd needs the actual nucleus channel name +# in keys/w_key. spatial_size + num_samples kept identical to the FCMAE +# overlay so the augmentation policy matches ER/Mito. +data: + init_args: + # A549 pooled store + target_channel — no resolver in this train_set. 
+ target_channel: Nuclei + data_path: /hpc/projects/virtual_staining/training/dynacell/a549/mantis_v1/train/H2B_all.zarr + augmentations: + - class_path: viscy_transforms.RandWeightedCropd + init_args: + keys: [Phase3D, Nuclei] + w_key: Nuclei + spatial_size: [20, 600, 600] + num_samples: 4 + +trainer: + logger: + init_args: + name: FCMAE_VSCyto3D_Scratch_A549_Nucleus + save_dir: /hpc/projects/comp.micro/virtual_staining/models/dynacell/a549_mantis/nucl/fcmae_vscyto3d_scratch + callbacks: + - class_path: lightning.pytorch.callbacks.LearningRateMonitor + init_args: + logging_interval: step + - class_path: lightning.pytorch.callbacks.ModelCheckpoint + init_args: + monitor: loss/validate + every_n_epochs: 1 + save_top_k: 5 + save_last: true + dirpath: /hpc/projects/comp.micro/virtual_staining/models/dynacell/a549_mantis/nucl/fcmae_vscyto3d_scratch/checkpoints + +launcher: + job_name: FCMAE_VSCyto3D_Scratch_A549_Nucleus + run_root: /hpc/projects/comp.micro/virtual_staining/models/dynacell/a549_mantis/nucl/fcmae_vscyto3d_scratch diff --git a/applications/dynacell/configs/benchmarks/virtual_staining/nucleus/fcmae_vscyto3d_scratch/ipsc_confocal/eval__a549_mantis_denv.yaml b/applications/dynacell/configs/benchmarks/virtual_staining/nucleus/fcmae_vscyto3d_scratch/ipsc_confocal/eval__a549_mantis_denv.yaml new file mode 100644 index 000000000..a2eeed7a6 --- /dev/null +++ b/applications/dynacell/configs/benchmarks/virtual_staining/nucleus/fcmae_vscyto3d_scratch/ipsc_confocal/eval__a549_mantis_denv.yaml @@ -0,0 +1,22 @@ +# @package _global_ +# Benchmark eval leaf: Nucleus (H2B) predicted by FCMAE_VSCyto3D_Scratch on a549-mantis-h2b-denv. +# A549 manifest keys nucleus by gene (`h2b`); override the iPSC-side `nucleus` +# target_id from the target group so the resolver finds h2b on a549-mantis-h2b-denv. 
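The `h2b` override described in these comments can be pictured as a manifest lookup: iPSC manifests key the nucleus target as `nucleus`, A549 manifests key it by gene. This is a hypothetical sketch of that failure mode only; the dataset ids, the `manifests` table, and `resolve_target_channel` are all invented for illustration and are not the real dynacell resolver API:

```python
# Illustrative manifests: same organelle, different target ids per dataset.
manifests = {
    "ipsc-confocal": {"targets": {"nucleus": "Nuclei"}},
    "a549-mantis-h2b-denv": {"targets": {"h2b": "Nuclei"}},
}

def resolve_target_channel(dataset_id: str, target_id: str) -> str:
    """Look up the image channel for a target id in a dataset's manifest."""
    targets = manifests[dataset_id]["targets"]
    if target_id not in targets:
        raise KeyError(
            f"{target_id!r} not in manifest for {dataset_id!r}; "
            f"available: {sorted(targets)}"
        )
    return targets[target_id]

# The iPSC-side default ("nucleus") would miss on the A549 manifest,
# which is why the eval/predict leaves override dataset_ref.target to h2b.
print(resolve_target_channel("a549-mantis-h2b-denv", "h2b"))  # Nuclei
```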
+defaults: + - override /target: nucleus + - override /predict_set: a549_mantis_h2b_denv + +target_name: h2b +benchmark: + dataset_ref: + target: h2b + +io: + pred_path: /hpc/projects/virtual_staining/training/dynacell/a549/predictions/nucl_fcmae_vscyto3d_scratch_denv.zarr + +# A549 manifests don't carry cell_segmentation paths (no segmentation +# pipeline yet). Skip feature metrics until segmentation lands. +compute_feature_metrics: false + +save: + save_dir: /hpc/projects/virtual_staining/training/dynacell/a549/predictions/eval_nucl_fcmae_vscyto3d_scratch_denv diff --git a/applications/dynacell/configs/benchmarks/virtual_staining/nucleus/fcmae_vscyto3d_scratch/ipsc_confocal/eval__a549_mantis_mock.yaml b/applications/dynacell/configs/benchmarks/virtual_staining/nucleus/fcmae_vscyto3d_scratch/ipsc_confocal/eval__a549_mantis_mock.yaml new file mode 100644 index 000000000..9be297472 --- /dev/null +++ b/applications/dynacell/configs/benchmarks/virtual_staining/nucleus/fcmae_vscyto3d_scratch/ipsc_confocal/eval__a549_mantis_mock.yaml @@ -0,0 +1,22 @@ +# @package _global_ +# Benchmark eval leaf: Nucleus (H2B) predicted by FCMAE_VSCyto3D_Scratch on a549-mantis-h2b-mock. +# A549 manifest keys nucleus by gene (`h2b`); override the iPSC-side `nucleus` +# target_id from the target group so the resolver finds h2b on a549-mantis-h2b-mock. +defaults: + - override /target: nucleus + - override /predict_set: a549_mantis_h2b_mock + +target_name: h2b +benchmark: + dataset_ref: + target: h2b + +io: + pred_path: /hpc/projects/virtual_staining/training/dynacell/a549/predictions/nucl_fcmae_vscyto3d_scratch_mock.zarr + +# A549 manifests don't carry cell_segmentation paths (no segmentation +# pipeline yet). Skip feature metrics until segmentation lands. 
+compute_feature_metrics: false + +save: + save_dir: /hpc/projects/virtual_staining/training/dynacell/a549/predictions/eval_nucl_fcmae_vscyto3d_scratch_mock diff --git a/applications/dynacell/configs/benchmarks/virtual_staining/nucleus/fcmae_vscyto3d_scratch/ipsc_confocal/eval__a549_mantis_zikv.yaml b/applications/dynacell/configs/benchmarks/virtual_staining/nucleus/fcmae_vscyto3d_scratch/ipsc_confocal/eval__a549_mantis_zikv.yaml new file mode 100644 index 000000000..4b8c35af7 --- /dev/null +++ b/applications/dynacell/configs/benchmarks/virtual_staining/nucleus/fcmae_vscyto3d_scratch/ipsc_confocal/eval__a549_mantis_zikv.yaml @@ -0,0 +1,22 @@ +# @package _global_ +# Benchmark eval leaf: Nucleus (H2B) predicted by FCMAE_VSCyto3D_Scratch on a549-mantis-h2b-zikv. +# A549 manifest keys nucleus by gene (`h2b`); override the iPSC-side `nucleus` +# target_id from the target group so the resolver finds h2b on a549-mantis-h2b-zikv. +defaults: + - override /target: nucleus + - override /predict_set: a549_mantis_h2b_zikv + +target_name: h2b +benchmark: + dataset_ref: + target: h2b + +io: + pred_path: /hpc/projects/virtual_staining/training/dynacell/a549/predictions/nucl_fcmae_vscyto3d_scratch_zikv.zarr + +# A549 manifests don't carry cell_segmentation paths (no segmentation +# pipeline yet). Skip feature metrics until segmentation lands. 
+compute_feature_metrics: false + +save: + save_dir: /hpc/projects/virtual_staining/training/dynacell/a549/predictions/eval_nucl_fcmae_vscyto3d_scratch_zikv diff --git a/applications/dynacell/configs/benchmarks/virtual_staining/nucleus/fcmae_vscyto3d_scratch/ipsc_confocal/predict__a549_mantis_denv.yml b/applications/dynacell/configs/benchmarks/virtual_staining/nucleus/fcmae_vscyto3d_scratch/ipsc_confocal/predict__a549_mantis_denv.yml new file mode 100644 index 000000000..565688818 --- /dev/null +++ b/applications/dynacell/configs/benchmarks/virtual_staining/nucleus/fcmae_vscyto3d_scratch/ipsc_confocal/predict__a549_mantis_denv.yml @@ -0,0 +1,59 @@ +# FCMAE_VSCyto3D_Scratch predict: nucleus trained on iPSC, +# predicting against a549-mantis-h2b-denv test. +# A549 manifest keys nucleus by gene (`h2b`); override the iPSC-side +# `nucleus` target_id from targets/nucleus.yml so the resolver finds +# the h2b target on a549-mantis-h2b-denv. +# +# TODO: replace ckpt_path once iPSC FCMAE scratch nucleus training +# completes. Expected output (per fit leaf): +# /hpc/projects/comp.micro/virtual_staining/models/dynacell/ipsc/nucl/fcmae_vscyto3d_scratch/checkpoints/last.ckpt +base: + - ../../../_internal/shared/model/predict_sets/a549_mantis_h2b_denv.yml + - ../../../_internal/shared/model/targets/nucleus.yml + - ../../../_internal/shared/model/model_overlays/fcmae_vscyto3d_predict.yml + - ../../../_internal/shared/model/launcher_profiles/mode_predict.yml + - ../../../_internal/shared/model/launcher_profiles/hardware_h200_single.yml + - ../../../_internal/shared/model/launcher_profiles/runtime_shared.yml + +benchmark: + task: virtual_staining + organelle: nucleus + trained_on: ipsc_confocal + predict_set: a549_mantis_h2b_denv + model_name: fcmae_vscyto3d_scratch + experiment_id: nucleus__ipsc_confocal__fcmae_vscyto3d_scratch__a549_mantis_h2b_denv + # Override the iPSC-side `nucleus` target to a549's gene-keyed `h2b`. 
+ dataset_ref: + target: h2b + +model: + init_args: + # Best checkpoint from J31710710 (FCMAE_VSCyto3D_Scratch_iPSC_Nucleus): + # ep 80 / val_loss 0.39342 (49-epoch plateau, scancelled at 1d 8h elapsed). + # Note: pretrained variant (J31475094, ep 89 = 0.39215) edges scratch on the + # same data; downstream eval should prefer the pretrained predict configs + # unless explicitly ablating against the scratch baseline. + # Hardlink alias at run_root; underlying epoch=80-step=25272.ckpt also + # preserved in checkpoints_frozen_ep80_/. + ckpt_path: /hpc/projects/comp.micro/virtual_staining/models/dynacell/ipsc/nucl/fcmae_vscyto3d_scratch/best_ep80_val0.39342.ckpt + +data: + init_args: + normalizations: + - class_path: viscy_transforms.NormalizeSampled + init_args: + keys: [Phase3D] + level: fov_statistics + subtrahend: mean + divisor: std + augmentations: [] + +trainer: + callbacks: + - class_path: viscy_utils.callbacks.prediction_writer.HCSPredictionWriter + init_args: + output_store: /hpc/projects/virtual_staining/training/dynacell/a549/predictions/nucl_fcmae_vscyto3d_scratch_denv.zarr + +launcher: + job_name: FCMAE_VSCyto3D_Scratch_PRED_NUCL_ON_A549_DENV + run_root: /hpc/projects/virtual_staining/training/dynacell/a549/predictions diff --git a/applications/dynacell/configs/benchmarks/virtual_staining/nucleus/fcmae_vscyto3d_scratch/ipsc_confocal/predict__a549_mantis_mock.yml b/applications/dynacell/configs/benchmarks/virtual_staining/nucleus/fcmae_vscyto3d_scratch/ipsc_confocal/predict__a549_mantis_mock.yml new file mode 100644 index 000000000..a9a3ba697 --- /dev/null +++ b/applications/dynacell/configs/benchmarks/virtual_staining/nucleus/fcmae_vscyto3d_scratch/ipsc_confocal/predict__a549_mantis_mock.yml @@ -0,0 +1,59 @@ +# FCMAE_VSCyto3D_Scratch predict: nucleus trained on iPSC, +# predicting against a549-mantis-h2b-mock test. 
+# A549 manifest keys nucleus by gene (`h2b`); override the iPSC-side +# `nucleus` target_id from targets/nucleus.yml so the resolver finds +# the h2b target on a549-mantis-h2b-mock. +# +# TODO: replace ckpt_path once iPSC FCMAE scratch nucleus training +# completes. Expected output (per fit leaf): +# /hpc/projects/comp.micro/virtual_staining/models/dynacell/ipsc/nucl/fcmae_vscyto3d_scratch/checkpoints/last.ckpt +base: + - ../../../_internal/shared/model/predict_sets/a549_mantis_h2b_mock.yml + - ../../../_internal/shared/model/targets/nucleus.yml + - ../../../_internal/shared/model/model_overlays/fcmae_vscyto3d_predict.yml + - ../../../_internal/shared/model/launcher_profiles/mode_predict.yml + - ../../../_internal/shared/model/launcher_profiles/hardware_h200_single.yml + - ../../../_internal/shared/model/launcher_profiles/runtime_shared.yml + +benchmark: + task: virtual_staining + organelle: nucleus + trained_on: ipsc_confocal + predict_set: a549_mantis_h2b_mock + model_name: fcmae_vscyto3d_scratch + experiment_id: nucleus__ipsc_confocal__fcmae_vscyto3d_scratch__a549_mantis_h2b_mock + # Override the iPSC-side `nucleus` target to a549's gene-keyed `h2b`. + dataset_ref: + target: h2b + +model: + init_args: + # Best checkpoint from J31710710 (FCMAE_VSCyto3D_Scratch_iPSC_Nucleus): + # ep 80 / val_loss 0.39342 (49-epoch plateau, scancelled at 1d 8h elapsed). + # Note: pretrained variant (J31475094, ep 89 = 0.39215) edges scratch on the + # same data; downstream eval should prefer the pretrained predict configs + # unless explicitly ablating against the scratch baseline. + # Hardlink alias at run_root; underlying epoch=80-step=25272.ckpt also + # preserved in checkpoints_frozen_ep80_/. 
+ ckpt_path: /hpc/projects/comp.micro/virtual_staining/models/dynacell/ipsc/nucl/fcmae_vscyto3d_scratch/best_ep80_val0.39342.ckpt + +data: + init_args: + normalizations: + - class_path: viscy_transforms.NormalizeSampled + init_args: + keys: [Phase3D] + level: fov_statistics + subtrahend: mean + divisor: std + augmentations: [] + +trainer: + callbacks: + - class_path: viscy_utils.callbacks.prediction_writer.HCSPredictionWriter + init_args: + output_store: /hpc/projects/virtual_staining/training/dynacell/a549/predictions/nucl_fcmae_vscyto3d_scratch_mock.zarr + +launcher: + job_name: FCMAE_VSCyto3D_Scratch_PRED_NUCL_ON_A549_MOCK + run_root: /hpc/projects/virtual_staining/training/dynacell/a549/predictions diff --git a/applications/dynacell/configs/benchmarks/virtual_staining/nucleus/fcmae_vscyto3d_scratch/ipsc_confocal/predict__a549_mantis_zikv.yml b/applications/dynacell/configs/benchmarks/virtual_staining/nucleus/fcmae_vscyto3d_scratch/ipsc_confocal/predict__a549_mantis_zikv.yml new file mode 100644 index 000000000..a05034d68 --- /dev/null +++ b/applications/dynacell/configs/benchmarks/virtual_staining/nucleus/fcmae_vscyto3d_scratch/ipsc_confocal/predict__a549_mantis_zikv.yml @@ -0,0 +1,59 @@ +# FCMAE_VSCyto3D_Scratch predict: nucleus trained on iPSC, +# predicting against a549-mantis-h2b-zikv test. +# A549 manifest keys nucleus by gene (`h2b`); override the iPSC-side +# `nucleus` target_id from targets/nucleus.yml so the resolver finds +# the h2b target on a549-mantis-h2b-zikv. +# +# TODO: replace ckpt_path once iPSC FCMAE scratch nucleus training +# completes. 
Expected output (per fit leaf): +# /hpc/projects/comp.micro/virtual_staining/models/dynacell/ipsc/nucl/fcmae_vscyto3d_scratch/checkpoints/last.ckpt +base: + - ../../../_internal/shared/model/predict_sets/a549_mantis_h2b_zikv.yml + - ../../../_internal/shared/model/targets/nucleus.yml + - ../../../_internal/shared/model/model_overlays/fcmae_vscyto3d_predict.yml + - ../../../_internal/shared/model/launcher_profiles/mode_predict.yml + - ../../../_internal/shared/model/launcher_profiles/hardware_h200_single.yml + - ../../../_internal/shared/model/launcher_profiles/runtime_shared.yml + +benchmark: + task: virtual_staining + organelle: nucleus + trained_on: ipsc_confocal + predict_set: a549_mantis_h2b_zikv + model_name: fcmae_vscyto3d_scratch + experiment_id: nucleus__ipsc_confocal__fcmae_vscyto3d_scratch__a549_mantis_h2b_zikv + # Override the iPSC-side `nucleus` target to a549's gene-keyed `h2b`. + dataset_ref: + target: h2b + +model: + init_args: + # Best checkpoint from J31710710 (FCMAE_VSCyto3D_Scratch_iPSC_Nucleus): + # ep 80 / val_loss 0.39342 (49-epoch plateau, scancelled at 1d 8h elapsed). + # Note: pretrained variant (J31475094, ep 89 = 0.39215) edges scratch on the + # same data; downstream eval should prefer the pretrained predict configs + # unless explicitly ablating against the scratch baseline. + # Hardlink alias at run_root; underlying epoch=80-step=25272.ckpt also + # preserved in checkpoints_frozen_ep80_/. 
+ ckpt_path: /hpc/projects/comp.micro/virtual_staining/models/dynacell/ipsc/nucl/fcmae_vscyto3d_scratch/best_ep80_val0.39342.ckpt + +data: + init_args: + normalizations: + - class_path: viscy_transforms.NormalizeSampled + init_args: + keys: [Phase3D] + level: fov_statistics + subtrahend: mean + divisor: std + augmentations: [] + +trainer: + callbacks: + - class_path: viscy_utils.callbacks.prediction_writer.HCSPredictionWriter + init_args: + output_store: /hpc/projects/virtual_staining/training/dynacell/a549/predictions/nucl_fcmae_vscyto3d_scratch_zikv.zarr + +launcher: + job_name: FCMAE_VSCyto3D_Scratch_PRED_NUCL_ON_A549_ZIKV + run_root: /hpc/projects/virtual_staining/training/dynacell/a549/predictions diff --git a/applications/dynacell/configs/benchmarks/virtual_staining/nucleus/fcmae_vscyto3d_scratch/ipsc_confocal/predict__ipsc_confocal.yml b/applications/dynacell/configs/benchmarks/virtual_staining/nucleus/fcmae_vscyto3d_scratch/ipsc_confocal/predict__ipsc_confocal.yml new file mode 100644 index 000000000..7889bb8c9 --- /dev/null +++ b/applications/dynacell/configs/benchmarks/virtual_staining/nucleus/fcmae_vscyto3d_scratch/ipsc_confocal/predict__ipsc_confocal.yml @@ -0,0 +1,52 @@ +# FCMAE_VSCyto3D_Scratch predict: nucleus (H2B) against ipsc_confocal test_cropped. +# +# TODO: replace ckpt_path with best-val ckpt once iPSC FCMAE scratch +# nucleus training (J31710710, resumed from J31475096) completes. 
Expected dir: +# /hpc/projects/comp.micro/virtual_staining/models/dynacell/ipsc/nucl/fcmae_vscyto3d_scratch/checkpoints/ +base: + - ../../../_internal/shared/model/predict_sets/ipsc_confocal.yml + - ../../../_internal/shared/model/targets/nucleus.yml + - ../../../_internal/shared/model/model_overlays/fcmae_vscyto3d_predict.yml + - ../../../_internal/shared/model/launcher_profiles/mode_predict.yml + - ../../../_internal/shared/model/launcher_profiles/hardware_h200_single.yml + - ../../../_internal/shared/model/launcher_profiles/runtime_shared.yml + +benchmark: + task: virtual_staining + organelle: nucleus + trained_on: ipsc_confocal + predict_set: ipsc_confocal + model_name: fcmae_vscyto3d_scratch + experiment_id: nucleus__ipsc_confocal__fcmae_vscyto3d_scratch__ipsc_confocal + +model: + init_args: + # Best checkpoint from J31710710 (FCMAE_VSCyto3D_Scratch_iPSC_Nucleus): + # ep 80 / val_loss 0.39342 (49-epoch plateau, scancelled at 1d 8h elapsed). + # Note: pretrained variant (J31475094, ep 89 = 0.39215) edges scratch on the + # same data; downstream eval should prefer the pretrained predict configs + # unless explicitly ablating against the scratch baseline. + # Hardlink alias at run_root; underlying epoch=80-step=25272.ckpt also + # preserved in checkpoints_frozen_ep80_/. 
+ ckpt_path: /hpc/projects/comp.micro/virtual_staining/models/dynacell/ipsc/nucl/fcmae_vscyto3d_scratch/best_ep80_val0.39342.ckpt + +data: + init_args: + normalizations: + - class_path: viscy_transforms.NormalizeSampled + init_args: + keys: [Phase3D] + level: fov_statistics + subtrahend: mean + divisor: std + augmentations: [] + +trainer: + callbacks: + - class_path: viscy_utils.callbacks.prediction_writer.HCSPredictionWriter + init_args: + output_store: /hpc/projects/virtual_staining/training/dynacell/ipsc/predictions/nucl_fcmae_vscyto3d_scratch.zarr + +launcher: + job_name: FCMAE_VSCyto3D_Scratch_PRED_NUCL + run_root: /hpc/projects/virtual_staining/training/dynacell/ipsc/predictions diff --git a/applications/dynacell/configs/benchmarks/virtual_staining/nucleus/fcmae_vscyto3d_scratch/ipsc_confocal/train.yml b/applications/dynacell/configs/benchmarks/virtual_staining/nucleus/fcmae_vscyto3d_scratch/ipsc_confocal/train.yml new file mode 100644 index 000000000..37687d096 --- /dev/null +++ b/applications/dynacell/configs/benchmarks/virtual_staining/nucleus/fcmae_vscyto3d_scratch/ipsc_confocal/train.yml @@ -0,0 +1,56 @@ +# FCMAE-class (FullyConvolutionalMAE, pretraining=False) random-init +# baseline on nucleus (Nuclei marker). Scratch control for the pretrained +# counterpart — the two leaves are identical except this one does NOT +# load pretrained encoder weights. See UNEXT2_VS_FCMAE_CLASSES.md for +# why this is the paper-adjacent scratch baseline (and not unext2.yml). 
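Train leaves like the one below list their `base:` overlays and then set their own keys, with the leaf applied last. A minimal sketch of that composition under an assumed deep-merge semantics (nested dicts merge recursively; lists and scalars the leaf redefines, such as `augmentations`, replace the overlay value wholesale; all key names here are illustrative):

```python
from functools import reduce

def deep_merge(base: dict, overlay: dict) -> dict:
    """Recursively merge overlay into base; overlay wins on conflicts."""
    merged = dict(base)
    for key, value in overlay.items():
        if isinstance(merged.get(key), dict) and isinstance(value, dict):
            merged[key] = deep_merge(merged[key], value)
        else:
            merged[key] = value
    return merged

# Toy stand-ins for a train_set overlay, a launcher profile, and the leaf.
train_set = {"data": {"init_args": {"batch_size": 8,
                                    "augmentations": ["Structure-crop"]}}}
mode_fit = {"launcher": {"mode": "fit"}}
leaf = {"data": {"init_args": {"augmentations": ["Nuclei-crop"]}}}

cfg = reduce(deep_merge, [train_set, mode_fit, leaf])
print(cfg["data"]["init_args"])  # batch_size kept; augmentations replaced
```

This mirrors the comment above: the leaf only has to restate the `augmentations` list to swap `Structure` keys for `Nuclei`, while untouched overlay keys such as `batch_size` carry through.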
+base: + - ../../../_internal/shared/model/train_sets/ipsc_confocal.yml + - ../../../_internal/shared/model/targets/nucleus.yml + - ../../../_internal/shared/model/data_overlays/fcmae_vscyto3d_fit.yml + - ../../../_internal/shared/model/model_overlays/fcmae_vscyto3d_fit.yml + - ../../../_internal/shared/model/launcher_profiles/mode_fit.yml + - ../../../_internal/shared/model/launcher_profiles/hardware_4gpu.yml + - ../../../_internal/shared/model/launcher_profiles/runtime_shared.yml + +benchmark: + task: virtual_staining + organelle: nucleus + train_set: ipsc_confocal + model_name: fcmae_vscyto3d_scratch + experiment_id: nucleus__ipsc_confocal__fcmae_vscyto3d_scratch + +# Override the FCMAE data overlay's hardcoded `Structure` augmentation +# keys (the overlay was authored for ER/Mito where target_channel == +# "Structure"). RandWeightedCropd needs the actual nucleus channel name +# in keys/w_key. spatial_size + num_samples kept identical to the FCMAE +# overlay so the augmentation policy matches ER/Mito. 
+data: + init_args: + augmentations: + - class_path: viscy_transforms.RandWeightedCropd + init_args: + keys: [Phase3D, Nuclei] + w_key: Nuclei + spatial_size: [20, 600, 600] + num_samples: 4 + +trainer: + logger: + init_args: + name: FCMAE_VSCyto3D_Scratch_iPSC_Nucleus + save_dir: /hpc/projects/comp.micro/virtual_staining/models/dynacell/ipsc/nucl/fcmae_vscyto3d_scratch + callbacks: + - class_path: lightning.pytorch.callbacks.LearningRateMonitor + init_args: + logging_interval: step + - class_path: lightning.pytorch.callbacks.ModelCheckpoint + init_args: + monitor: loss/validate + every_n_epochs: 1 + save_top_k: 5 + save_last: true + dirpath: /hpc/projects/comp.micro/virtual_staining/models/dynacell/ipsc/nucl/fcmae_vscyto3d_scratch/checkpoints + +launcher: + job_name: FCMAE_VSCyto3D_Scratch_Nucleus + run_root: /hpc/projects/comp.micro/virtual_staining/models/dynacell/ipsc/nucl/fcmae_vscyto3d_scratch diff --git a/applications/dynacell/configs/benchmarks/virtual_staining/nucleus/fcmae_vscyto3d_scratch/joint_ipsc_confocal_a549_mantis/predict__a549_mantis_denv.yml b/applications/dynacell/configs/benchmarks/virtual_staining/nucleus/fcmae_vscyto3d_scratch/joint_ipsc_confocal_a549_mantis/predict__a549_mantis_denv.yml new file mode 100644 index 000000000..18f8c4af8 --- /dev/null +++ b/applications/dynacell/configs/benchmarks/virtual_staining/nucleus/fcmae_vscyto3d_scratch/joint_ipsc_confocal_a549_mantis/predict__a549_mantis_denv.yml @@ -0,0 +1,49 @@ +# FCMAE_VSCyto3D_Scratch (UNeXt2) predict: nucleus trained on joint iPSC+A549, +# predicting against a549-mantis-h2b-denv test. +# Best val-loss checkpoint from job 31822521 (epoch 92, loss/validate=0.6448). +# A549 manifest keys nucleus by gene (`h2b`); override the iPSC-side `nucleus` +# target_id from targets/nucleus.yml so the resolver finds the h2b target on +# a549-mantis-h2b-denv. 
+base: + - ../../../_internal/shared/model/predict_sets/a549_mantis_h2b_denv.yml + - ../../../_internal/shared/model/targets/nucleus.yml + - ../../../_internal/shared/model/model_overlays/fcmae_vscyto3d_predict.yml + - ../../../_internal/shared/model/launcher_profiles/mode_predict.yml + - ../../../_internal/shared/model/launcher_profiles/hardware_h200_single.yml + - ../../../_internal/shared/model/launcher_profiles/runtime_shared.yml + +benchmark: + task: virtual_staining + organelle: nucleus + trained_on: joint_ipsc_confocal_a549_mantis + predict_set: a549_mantis_h2b_denv + model_name: fcmae_vscyto3d_scratch + experiment_id: nucleus__joint_ipsc_confocal_a549_mantis__fcmae_vscyto3d_scratch__a549_mantis_h2b_denv + # Override the iPSC-side `nucleus` target to a549's gene-keyed `h2b`. + dataset_ref: + target: h2b + +model: + init_args: + ckpt_path: /hpc/projects/comp.micro/virtual_staining/models/dynacell/joint_ipsc_confocal_a549_mantis/nucl/fcmae_vscyto3d_scratch/checkpoints/epoch=92-step=49290.ckpt + +data: + init_args: + normalizations: + - class_path: viscy_transforms.NormalizeSampled + init_args: + keys: [Phase3D] + level: fov_statistics + subtrahend: mean + divisor: std + augmentations: [] + +trainer: + callbacks: + - class_path: viscy_utils.callbacks.prediction_writer.HCSPredictionWriter + init_args: + output_store: /hpc/projects/virtual_staining/training/dynacell/a549/predictions/nucl_fcmae_vscyto3d_scratch_jointtrained_denv.zarr + +launcher: + job_name: FCMAE_VSCyto3D_Scratch_PRED_NUCL_JOINTTR_DENV + run_root: /hpc/projects/virtual_staining/training/dynacell/a549/predictions diff --git a/applications/dynacell/configs/benchmarks/virtual_staining/nucleus/fcmae_vscyto3d_scratch/joint_ipsc_confocal_a549_mantis/predict__a549_mantis_mock.yml b/applications/dynacell/configs/benchmarks/virtual_staining/nucleus/fcmae_vscyto3d_scratch/joint_ipsc_confocal_a549_mantis/predict__a549_mantis_mock.yml new file mode 100644 index 000000000..426e8f0a8 --- /dev/null +++ 
b/applications/dynacell/configs/benchmarks/virtual_staining/nucleus/fcmae_vscyto3d_scratch/joint_ipsc_confocal_a549_mantis/predict__a549_mantis_mock.yml @@ -0,0 +1,49 @@ +# FCMAE_VSCyto3D_Scratch (UNeXt2) predict: nucleus trained on joint iPSC+A549, +# predicting against a549-mantis-h2b-mock test. +# Best val-loss checkpoint from job 31822521 (epoch 92, loss/validate=0.6448). +# A549 manifest keys nucleus by gene (`h2b`); override the iPSC-side `nucleus` +# target_id from targets/nucleus.yml so the resolver finds the h2b target on +# a549-mantis-h2b-mock. +base: + - ../../../_internal/shared/model/predict_sets/a549_mantis_h2b_mock.yml + - ../../../_internal/shared/model/targets/nucleus.yml + - ../../../_internal/shared/model/model_overlays/fcmae_vscyto3d_predict.yml + - ../../../_internal/shared/model/launcher_profiles/mode_predict.yml + - ../../../_internal/shared/model/launcher_profiles/hardware_h200_single.yml + - ../../../_internal/shared/model/launcher_profiles/runtime_shared.yml + +benchmark: + task: virtual_staining + organelle: nucleus + trained_on: joint_ipsc_confocal_a549_mantis + predict_set: a549_mantis_h2b_mock + model_name: fcmae_vscyto3d_scratch + experiment_id: nucleus__joint_ipsc_confocal_a549_mantis__fcmae_vscyto3d_scratch__a549_mantis_h2b_mock + # Override the iPSC-side `nucleus` target to a549's gene-keyed `h2b`. 
+ dataset_ref: + target: h2b + +model: + init_args: + ckpt_path: /hpc/projects/comp.micro/virtual_staining/models/dynacell/joint_ipsc_confocal_a549_mantis/nucl/fcmae_vscyto3d_scratch/checkpoints/epoch=92-step=49290.ckpt + +data: + init_args: + normalizations: + - class_path: viscy_transforms.NormalizeSampled + init_args: + keys: [Phase3D] + level: fov_statistics + subtrahend: mean + divisor: std + augmentations: [] + +trainer: + callbacks: + - class_path: viscy_utils.callbacks.prediction_writer.HCSPredictionWriter + init_args: + output_store: /hpc/projects/virtual_staining/training/dynacell/a549/predictions/nucl_fcmae_vscyto3d_scratch_jointtrained_mock.zarr + +launcher: + job_name: FCMAE_VSCyto3D_Scratch_PRED_NUCL_JOINTTR_MOCK + run_root: /hpc/projects/virtual_staining/training/dynacell/a549/predictions diff --git a/applications/dynacell/configs/benchmarks/virtual_staining/nucleus/fcmae_vscyto3d_scratch/joint_ipsc_confocal_a549_mantis/predict__a549_mantis_zikv.yml b/applications/dynacell/configs/benchmarks/virtual_staining/nucleus/fcmae_vscyto3d_scratch/joint_ipsc_confocal_a549_mantis/predict__a549_mantis_zikv.yml new file mode 100644 index 000000000..558c4006a --- /dev/null +++ b/applications/dynacell/configs/benchmarks/virtual_staining/nucleus/fcmae_vscyto3d_scratch/joint_ipsc_confocal_a549_mantis/predict__a549_mantis_zikv.yml @@ -0,0 +1,49 @@ +# FCMAE_VSCyto3D_Scratch (UNeXt2) predict: nucleus trained on joint iPSC+A549, +# predicting against a549-mantis-h2b-zikv test. +# Best val-loss checkpoint from job 31822521 (epoch 92, loss/validate=0.6448). +# A549 manifest keys nucleus by gene (`h2b`); override the iPSC-side `nucleus` +# target_id from targets/nucleus.yml so the resolver finds the h2b target on +# a549-mantis-h2b-zikv. 
+base: + - ../../../_internal/shared/model/predict_sets/a549_mantis_h2b_zikv.yml + - ../../../_internal/shared/model/targets/nucleus.yml + - ../../../_internal/shared/model/model_overlays/fcmae_vscyto3d_predict.yml + - ../../../_internal/shared/model/launcher_profiles/mode_predict.yml + - ../../../_internal/shared/model/launcher_profiles/hardware_h200_single.yml + - ../../../_internal/shared/model/launcher_profiles/runtime_shared.yml + +benchmark: + task: virtual_staining + organelle: nucleus + trained_on: joint_ipsc_confocal_a549_mantis + predict_set: a549_mantis_h2b_zikv + model_name: fcmae_vscyto3d_scratch + experiment_id: nucleus__joint_ipsc_confocal_a549_mantis__fcmae_vscyto3d_scratch__a549_mantis_h2b_zikv + # Override the iPSC-side `nucleus` target to a549's gene-keyed `h2b`. + dataset_ref: + target: h2b + +model: + init_args: + ckpt_path: /hpc/projects/comp.micro/virtual_staining/models/dynacell/joint_ipsc_confocal_a549_mantis/nucl/fcmae_vscyto3d_scratch/checkpoints/epoch=92-step=49290.ckpt + +data: + init_args: + normalizations: + - class_path: viscy_transforms.NormalizeSampled + init_args: + keys: [Phase3D] + level: fov_statistics + subtrahend: mean + divisor: std + augmentations: [] + +trainer: + callbacks: + - class_path: viscy_utils.callbacks.prediction_writer.HCSPredictionWriter + init_args: + output_store: /hpc/projects/virtual_staining/training/dynacell/a549/predictions/nucl_fcmae_vscyto3d_scratch_jointtrained_zikv.zarr + +launcher: + job_name: FCMAE_VSCyto3D_Scratch_PRED_NUCL_JOINTTR_ZIKV + run_root: /hpc/projects/virtual_staining/training/dynacell/a549/predictions diff --git a/applications/dynacell/configs/benchmarks/virtual_staining/nucleus/fcmae_vscyto3d_scratch/joint_ipsc_confocal_a549_mantis/predict__ipsc_confocal.yml b/applications/dynacell/configs/benchmarks/virtual_staining/nucleus/fcmae_vscyto3d_scratch/joint_ipsc_confocal_a549_mantis/predict__ipsc_confocal.yml new file mode 100644 index 000000000..8d0eb86da --- /dev/null +++ 
b/applications/dynacell/configs/benchmarks/virtual_staining/nucleus/fcmae_vscyto3d_scratch/joint_ipsc_confocal_a549_mantis/predict__ipsc_confocal.yml @@ -0,0 +1,45 @@ +# FCMAE_VSCyto3D_Scratch (UNeXt2) predict: nucleus trained on joint iPSC+A549, +# predicting against ipsc_confocal test_cropped. +# Best val-loss checkpoint from job 31822521 (epoch 92, loss/validate=0.6448). +# Job hit the 4-day SLURM wall at 2026-05-05 00:43 (TIMEOUT); 5 best-val ckpts +# saved by top-K — ep92 is the best of the 5. +base: + - ../../../_internal/shared/model/predict_sets/ipsc_confocal.yml + - ../../../_internal/shared/model/targets/nucleus.yml + - ../../../_internal/shared/model/model_overlays/fcmae_vscyto3d_predict.yml + - ../../../_internal/shared/model/launcher_profiles/mode_predict.yml + - ../../../_internal/shared/model/launcher_profiles/hardware_h200_single.yml + - ../../../_internal/shared/model/launcher_profiles/runtime_shared.yml + +benchmark: + task: virtual_staining + organelle: nucleus + trained_on: joint_ipsc_confocal_a549_mantis + predict_set: ipsc_confocal + model_name: fcmae_vscyto3d_scratch + experiment_id: nucleus__joint_ipsc_confocal_a549_mantis__fcmae_vscyto3d_scratch__ipsc_confocal + +model: + init_args: + ckpt_path: /hpc/projects/comp.micro/virtual_staining/models/dynacell/joint_ipsc_confocal_a549_mantis/nucl/fcmae_vscyto3d_scratch/checkpoints/epoch=92-step=49290.ckpt + +data: + init_args: + normalizations: + - class_path: viscy_transforms.NormalizeSampled + init_args: + keys: [Phase3D] + level: fov_statistics + subtrahend: mean + divisor: std + augmentations: [] + +trainer: + callbacks: + - class_path: viscy_utils.callbacks.prediction_writer.HCSPredictionWriter + init_args: + output_store: /hpc/projects/virtual_staining/training/dynacell/ipsc/predictions/nucl_fcmae_vscyto3d_scratch_jointtrained.zarr + +launcher: + job_name: FCMAE_VSCyto3D_Scratch_PRED_NUCL_JOINTTR_IPSC + run_root: /hpc/projects/virtual_staining/training/dynacell/ipsc/predictions diff --git 
a/applications/dynacell/configs/benchmarks/virtual_staining/nucleus/fcmae_vscyto3d_scratch/joint_ipsc_confocal_a549_mantis/train.yml b/applications/dynacell/configs/benchmarks/virtual_staining/nucleus/fcmae_vscyto3d_scratch/joint_ipsc_confocal_a549_mantis/train.yml new file mode 100644 index 000000000..5c78c67fa --- /dev/null +++ b/applications/dynacell/configs/benchmarks/virtual_staining/nucleus/fcmae_vscyto3d_scratch/joint_ipsc_confocal_a549_mantis/train.yml @@ -0,0 +1,142 @@ +# FCMAE-class (FullyConvolutionalMAE, pretraining=False) random-init +# baseline on nucleus (NUCL) — joint ipsc_confocal + +# a549_mantis pooled. Scratch control for the pretrained counterpart +# — the two leaves are identical except this one does NOT load +# pretrained encoder weights. Mirrors +# nucleus/fcmae_vscyto3d_scratch/ipsc_confocal/train.yml on the +# joint train_set. +# +# Joint leaf per Stage 7 of A549_EXPANSION_ROADMAP.md. +# BatchedConcatDataModule + two explicit HCSDataModule children; +# only model_overlays/fcmae_vscyto3d_fit.yml is composed; data +# block inline. +# +# Topology: 4-GPU DDP +# (strategy=ddp_find_unused_parameters_true inherited from +# model_overlays/fcmae_vscyto3d_fit.yml). 
+base: + - ../../../_internal/shared/model/model_overlays/fcmae_vscyto3d_fit.yml + - ../../../_internal/shared/model/launcher_profiles/mode_fit.yml + - ../../../_internal/shared/model/launcher_profiles/hardware_4gpu.yml + - ../../../_internal/shared/model/launcher_profiles/runtime_shared.yml + +benchmark: + task: virtual_staining + organelle: nucleus + gene: Nuclei + target: nucleus + target_id: nucleus + train_set: joint_ipsc_confocal_a549_mantis + model_name: fcmae_vscyto3d_scratch + experiment_id: nucleus__joint_ipsc_confocal_a549_mantis__fcmae_vscyto3d_scratch + +trainer: + logger: + init_args: + name: FCMAE_VSCyto3D_Scratch_JOINT_NUCL + save_dir: /hpc/projects/comp.micro/virtual_staining/models/dynacell/joint_ipsc_confocal_a549_mantis/nucl/fcmae_vscyto3d_scratch + callbacks: + - class_path: lightning.pytorch.callbacks.LearningRateMonitor + init_args: + logging_interval: step + - class_path: lightning.pytorch.callbacks.ModelCheckpoint + init_args: + monitor: loss/validate + every_n_epochs: 1 + save_top_k: 5 + save_last: true + dirpath: /hpc/projects/comp.micro/virtual_staining/models/dynacell/joint_ipsc_confocal_a549_mantis/nucl/fcmae_vscyto3d_scratch/checkpoints + +_hcs_init_args: &hcs_init_args + source_channel: Phase3D + target_channel: Nuclei + z_window_size: 20 + # See nucleus/fnet3d_paper/joint_*/train.yml for the rationale: joint + # mode does not divide batch_size by num_samples, so 8 * 4 = 32 GPU + # samples per DDP rank matches single-set effective batch. 
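+ # Hedged arithmetic sketch (assumes the 4 DDP ranks from the hardware_4gpu + # launcher profile composed above): per-rank crops per step = + # batch_size * num_samples = 8 * 4 = 32; global crops per step = + # 32 crops * 4 ranks = 128.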
+ batch_size: 8 + num_workers: 4 + yx_patch_size: [384, 384] + split_ratio: 0.8 + mmap_preload: true + scratch_dir: /dev/shm + persistent_workers: true + normalizations: + - class_path: viscy_transforms.NormalizeSampled + init_args: + keys: [Phase3D] + level: fov_statistics + subtrahend: mean + divisor: std + - class_path: viscy_transforms.NormalizeSampled + init_args: + keys: [Nuclei] + level: fov_statistics + subtrahend: median + divisor: iqr + augmentations: + - class_path: viscy_transforms.RandWeightedCropd + init_args: + keys: [Phase3D, Nuclei] + w_key: Nuclei + spatial_size: [20, 600, 600] + num_samples: 4 + gpu_augmentations: + - class_path: viscy_transforms.BatchedRandAffined + init_args: + keys: [source, target] + prob: 0.8 + rotate_range: [3.14, 0, 0] + shear_range: [0.0, 0.05, 0.05] + scale_range: [[0.7, 1.3], [0.5, 1.5], [0.5, 1.5]] + - class_path: viscy_transforms.BatchedCenterSpatialCropd + init_args: + keys: [source, target] + roi_size: [15, 384, 384] + - class_path: viscy_transforms.BatchedRandAdjustContrastd + init_args: + keys: [source] + prob: 0.5 + gamma: [0.8, 1.2] + - class_path: viscy_transforms.BatchedRandScaleIntensityd + init_args: + keys: [source] + prob: 0.5 + factors: 0.5 + - class_path: viscy_transforms.BatchedRandGaussianNoised + init_args: + keys: [source] + prob: 0.5 + mean: 0.0 + std: 0.3 + - class_path: viscy_transforms.BatchedRandGaussianSmoothd + init_args: + keys: [source] + prob: 0.5 + sigma_x: [0.25, 0.75] + sigma_y: [0.25, 0.75] + sigma_z: [0.25, 0.75] + val_gpu_augmentations: + - class_path: viscy_transforms.BatchedCenterSpatialCropd + init_args: + keys: [source, target] + roi_size: [15, 384, 384] + +data: + class_path: viscy_data.BatchedConcatDataModule + init_args: + data_modules: + # ipsc_confocal — aics-hipsc multi-marker cell.zarr (Nuclei channel) + - class_path: viscy_data.hcs.HCSDataModule + init_args: + <<: *hcs_init_args + data_path: /hpc/projects/virtual_staining/training/dynacell/ipsc/dataset_v4/train/cell.zarr + 
# a549_mantis — pooled H2B all-conditions train store (Nuclei channel) + - class_path: viscy_data.hcs.HCSDataModule + init_args: + <<: *hcs_init_args + data_path: /hpc/projects/virtual_staining/training/dynacell/a549/mantis_v1/train/H2B_all.zarr + +launcher: + job_name: FCMAE_VSCyto3D_Scratch_JOINT_NUCL + run_root: /hpc/projects/comp.micro/virtual_staining/models/dynacell/joint_ipsc_confocal_a549_mantis/nucl/fcmae_vscyto3d_scratch diff --git a/applications/dynacell/configs/benchmarks/virtual_staining/nucleus/fnet3d_paper/a549_mantis/predict__a549_mantis_denv.yml b/applications/dynacell/configs/benchmarks/virtual_staining/nucleus/fnet3d_paper/a549_mantis/predict__a549_mantis_denv.yml new file mode 100644 index 000000000..2a21beb99 --- /dev/null +++ b/applications/dynacell/configs/benchmarks/virtual_staining/nucleus/fnet3d_paper/a549_mantis/predict__a549_mantis_denv.yml @@ -0,0 +1,49 @@ +# FNet3D paper-baseline predict: nucleus trained on a549_mantis (h2b), +# predicting against a549-mantis-h2b-denv test. +# Best val-loss checkpoint from job 31858491 (epoch 293, loss/validate=0.2088). +# A549 manifest keys nucleus by gene (`h2b`); override the iPSC-side `nucleus` +# target_id from targets/nucleus.yml so the resolver finds the h2b target on +# a549-mantis-h2b-denv. 
+base: + - ../../../_internal/shared/model/predict_sets/a549_mantis_h2b_denv.yml + - ../../../_internal/shared/model/targets/nucleus.yml + - ../../../_internal/shared/model/model_overlays/fnet3d_paper_predict.yml + - ../../../_internal/shared/model/launcher_profiles/mode_predict.yml + - ../../../_internal/shared/model/launcher_profiles/hardware_h200_single.yml + - ../../../_internal/shared/model/launcher_profiles/runtime_shared.yml + +benchmark: + task: virtual_staining + organelle: nucleus + trained_on: a549_mantis + predict_set: a549_mantis_h2b_denv + model_name: fnet3d_paper + experiment_id: nucleus__a549_mantis__fnet3d_paper__a549_mantis_h2b_denv + # Override the iPSC-side `nucleus` target to a549's gene-keyed `h2b`. + dataset_ref: + target: h2b + +model: + init_args: + ckpt_path: /hpc/projects/comp.micro/virtual_staining/models/dynacell/a549_mantis/nucl/fnet3d_paper/checkpoints/epoch=293-step=199920.ckpt + +data: + init_args: + normalizations: + - class_path: viscy_transforms.NormalizeSampled + init_args: + keys: [Phase3D] + level: fov_statistics + subtrahend: mean + divisor: std + augmentations: [] + +trainer: + callbacks: + - class_path: viscy_utils.callbacks.prediction_writer.HCSPredictionWriter + init_args: + output_store: /hpc/projects/virtual_staining/training/dynacell/a549/predictions/nucl_fnet3d_paper_a549trained_denv.zarr + +launcher: + job_name: FNet3DPaper_PRED_NUCL_A549TR_DENV + run_root: /hpc/projects/virtual_staining/training/dynacell/a549/predictions diff --git a/applications/dynacell/configs/benchmarks/virtual_staining/nucleus/fnet3d_paper/a549_mantis/predict__a549_mantis_mock.yml b/applications/dynacell/configs/benchmarks/virtual_staining/nucleus/fnet3d_paper/a549_mantis/predict__a549_mantis_mock.yml new file mode 100644 index 000000000..8ae0aa93f --- /dev/null +++ b/applications/dynacell/configs/benchmarks/virtual_staining/nucleus/fnet3d_paper/a549_mantis/predict__a549_mantis_mock.yml @@ -0,0 +1,49 @@ +# FNet3D paper-baseline predict: nucleus 
trained on a549_mantis (h2b), +# predicting against a549-mantis-h2b-mock test. +# Best val-loss checkpoint from job 31858491 (epoch 293, loss/validate=0.2088). +# A549 manifest keys nucleus by gene (`h2b`); override the iPSC-side `nucleus` +# target_id from targets/nucleus.yml so the resolver finds the h2b target on +# a549-mantis-h2b-mock. +base: + - ../../../_internal/shared/model/predict_sets/a549_mantis_h2b_mock.yml + - ../../../_internal/shared/model/targets/nucleus.yml + - ../../../_internal/shared/model/model_overlays/fnet3d_paper_predict.yml + - ../../../_internal/shared/model/launcher_profiles/mode_predict.yml + - ../../../_internal/shared/model/launcher_profiles/hardware_h200_single.yml + - ../../../_internal/shared/model/launcher_profiles/runtime_shared.yml + +benchmark: + task: virtual_staining + organelle: nucleus + trained_on: a549_mantis + predict_set: a549_mantis_h2b_mock + model_name: fnet3d_paper + experiment_id: nucleus__a549_mantis__fnet3d_paper__a549_mantis_h2b_mock + # Override the iPSC-side `nucleus` target to a549's gene-keyed `h2b`. 
+ dataset_ref: + target: h2b + +model: + init_args: + ckpt_path: /hpc/projects/comp.micro/virtual_staining/models/dynacell/a549_mantis/nucl/fnet3d_paper/checkpoints/epoch=293-step=199920.ckpt + +data: + init_args: + normalizations: + - class_path: viscy_transforms.NormalizeSampled + init_args: + keys: [Phase3D] + level: fov_statistics + subtrahend: mean + divisor: std + augmentations: [] + +trainer: + callbacks: + - class_path: viscy_utils.callbacks.prediction_writer.HCSPredictionWriter + init_args: + output_store: /hpc/projects/virtual_staining/training/dynacell/a549/predictions/nucl_fnet3d_paper_a549trained_mock.zarr + +launcher: + job_name: FNet3DPaper_PRED_NUCL_A549TR_MOCK + run_root: /hpc/projects/virtual_staining/training/dynacell/a549/predictions diff --git a/applications/dynacell/configs/benchmarks/virtual_staining/nucleus/fnet3d_paper/a549_mantis/predict__a549_mantis_zikv.yml b/applications/dynacell/configs/benchmarks/virtual_staining/nucleus/fnet3d_paper/a549_mantis/predict__a549_mantis_zikv.yml new file mode 100644 index 000000000..758fbb8c1 --- /dev/null +++ b/applications/dynacell/configs/benchmarks/virtual_staining/nucleus/fnet3d_paper/a549_mantis/predict__a549_mantis_zikv.yml @@ -0,0 +1,49 @@ +# FNet3D paper-baseline predict: nucleus trained on a549_mantis (h2b), +# predicting against a549-mantis-h2b-zikv test. +# Best val-loss checkpoint from job 31858491 (epoch 293, loss/validate=0.2088). +# A549 manifest keys nucleus by gene (`h2b`); override the iPSC-side `nucleus` +# target_id from targets/nucleus.yml so the resolver finds the h2b target on +# a549-mantis-h2b-zikv. 
+base: + - ../../../_internal/shared/model/predict_sets/a549_mantis_h2b_zikv.yml + - ../../../_internal/shared/model/targets/nucleus.yml + - ../../../_internal/shared/model/model_overlays/fnet3d_paper_predict.yml + - ../../../_internal/shared/model/launcher_profiles/mode_predict.yml + - ../../../_internal/shared/model/launcher_profiles/hardware_h200_single.yml + - ../../../_internal/shared/model/launcher_profiles/runtime_shared.yml + +benchmark: + task: virtual_staining + organelle: nucleus + trained_on: a549_mantis + predict_set: a549_mantis_h2b_zikv + model_name: fnet3d_paper + experiment_id: nucleus__a549_mantis__fnet3d_paper__a549_mantis_h2b_zikv + # Override the iPSC-side `nucleus` target to a549's gene-keyed `h2b`. + dataset_ref: + target: h2b + +model: + init_args: + ckpt_path: /hpc/projects/comp.micro/virtual_staining/models/dynacell/a549_mantis/nucl/fnet3d_paper/checkpoints/epoch=293-step=199920.ckpt + +data: + init_args: + normalizations: + - class_path: viscy_transforms.NormalizeSampled + init_args: + keys: [Phase3D] + level: fov_statistics + subtrahend: mean + divisor: std + augmentations: [] + +trainer: + callbacks: + - class_path: viscy_utils.callbacks.prediction_writer.HCSPredictionWriter + init_args: + output_store: /hpc/projects/virtual_staining/training/dynacell/a549/predictions/nucl_fnet3d_paper_a549trained_zikv.zarr + +launcher: + job_name: FNet3DPaper_PRED_NUCL_A549TR_ZIKV + run_root: /hpc/projects/virtual_staining/training/dynacell/a549/predictions diff --git a/applications/dynacell/configs/benchmarks/virtual_staining/nucleus/fnet3d_paper/a549_mantis/predict__ipsc_confocal.yml b/applications/dynacell/configs/benchmarks/virtual_staining/nucleus/fnet3d_paper/a549_mantis/predict__ipsc_confocal.yml new file mode 100644 index 000000000..64003dc5f --- /dev/null +++ b/applications/dynacell/configs/benchmarks/virtual_staining/nucleus/fnet3d_paper/a549_mantis/predict__ipsc_confocal.yml @@ -0,0 +1,43 @@ +# FNet3D paper-baseline predict: nucleus trained 
on a549_mantis (h2b), +# predicting against ipsc_confocal test_cropped. +# Best val-loss checkpoint from job 31858491 (epoch 293, loss/validate=0.2088). +base: + - ../../../_internal/shared/model/predict_sets/ipsc_confocal.yml + - ../../../_internal/shared/model/targets/nucleus.yml + - ../../../_internal/shared/model/model_overlays/fnet3d_paper_predict.yml + - ../../../_internal/shared/model/launcher_profiles/mode_predict.yml + - ../../../_internal/shared/model/launcher_profiles/hardware_h200_single.yml + - ../../../_internal/shared/model/launcher_profiles/runtime_shared.yml + +benchmark: + task: virtual_staining + organelle: nucleus + trained_on: a549_mantis + predict_set: ipsc_confocal + model_name: fnet3d_paper + experiment_id: nucleus__a549_mantis__fnet3d_paper__ipsc_confocal + +model: + init_args: + ckpt_path: /hpc/projects/comp.micro/virtual_staining/models/dynacell/a549_mantis/nucl/fnet3d_paper/checkpoints/epoch=293-step=199920.ckpt + +data: + init_args: + normalizations: + - class_path: viscy_transforms.NormalizeSampled + init_args: + keys: [Phase3D] + level: fov_statistics + subtrahend: mean + divisor: std + augmentations: [] + +trainer: + callbacks: + - class_path: viscy_utils.callbacks.prediction_writer.HCSPredictionWriter + init_args: + output_store: /hpc/projects/virtual_staining/training/dynacell/ipsc/predictions/nucl_fnet3d_paper_a549trained.zarr + +launcher: + job_name: FNet3DPaper_PRED_NUCL_A549TR_IPSC + run_root: /hpc/projects/virtual_staining/training/dynacell/ipsc/predictions diff --git a/applications/dynacell/configs/benchmarks/virtual_staining/nucleus/fnet3d_paper/a549_mantis/train.yml b/applications/dynacell/configs/benchmarks/virtual_staining/nucleus/fnet3d_paper/a549_mantis/train.yml new file mode 100644 index 000000000..42097f045 --- /dev/null +++ b/applications/dynacell/configs/benchmarks/virtual_staining/nucleus/fnet3d_paper/a549_mantis/train.yml @@ -0,0 +1,77 @@ +# FNet3D paper-baseline fit on nucleus (Nuclei channel of H2B_all.zarr) — 
A549 mantis-lightsheet pooled (mock + DENV + ZIKV). +# The overlay's norm/aug/val_aug are keyed on Structure (the SEC61B/TOMM20 target +# channel). Nucleus target_channel is Nuclei, so we list-replace those three lists +# here to re-key them. +base: + - ../../../_internal/shared/model/train_sets/a549_mantis.yml + - ../../../_internal/shared/model/targets/nucleus.yml + - ../../../_internal/shared/model/data_overlays/fnet3d_paper_fit.yml + - ../../../_internal/shared/model/model_overlays/fnet3d_paper_fit.yml + - ../../../_internal/shared/model/launcher_profiles/mode_fit.yml + - ../../../_internal/shared/model/launcher_profiles/hardware_gpu_any_long.yml + - ../../../_internal/shared/model/launcher_profiles/runtime_shared.yml + +benchmark: + task: virtual_staining + organelle: nucleus + train_set: a549_mantis + model_name: fnet3d_paper + experiment_id: nucleus__a549_mantis__fnet3d_paper + +data: + init_args: + # A549 pooled store + target_channel — no resolver in this train_set. + target_channel: Nuclei + data_path: /hpc/projects/virtual_staining/training/dynacell/a549/mantis_v1/train/H2B_all.zarr + normalizations: + - class_path: viscy_transforms.NormalizeSampled + init_args: + keys: [Phase3D] + level: fov_statistics + subtrahend: mean + divisor: std + - class_path: viscy_transforms.NormalizeSampled + init_args: + keys: [Nuclei] + level: fov_statistics + subtrahend: mean + divisor: std + augmentations: + - class_path: viscy_transforms.RandWeightedCropd + init_args: + keys: [Phase3D, Nuclei] + w_key: Nuclei + spatial_size: [32, 64, 64] + num_samples: 8 + val_augmentations: + - class_path: viscy_transforms.CenterSpatialCropd + init_args: + keys: [Phase3D, Nuclei] + roi_size: [32, 64, 64] + +trainer: + logger: + init_args: + name: FNet3D_A549_NUCL_paper + save_dir: /hpc/projects/comp.micro/virtual_staining/models/dynacell/a549_mantis/nucl/fnet3d_paper + callbacks: + - class_path: lightning.pytorch.callbacks.LearningRateMonitor + init_args: + logging_interval: step + - 
class_path: lightning.pytorch.callbacks.ModelCheckpoint + init_args: + monitor: loss/validate + every_n_epochs: 1 + save_top_k: 4 + save_last: true + dirpath: /hpc/projects/comp.micro/virtual_staining/models/dynacell/a549_mantis/nucl/fnet3d_paper/checkpoints + +launcher: + job_name: FNet3DPaper_A549_NUCL + run_root: /hpc/projects/comp.micro/virtual_staining/models/dynacell/a549_mantis/nucl/fnet3d_paper + # 512G to match the shared headroom convention across the fnet3d + # leaves on a549/joint workloads. mmap_preload after the BasicIndexer + # fix peaks at ~75 GB for H2B_all alone (single-set) or ~185 GB for + # joint cell.zarr + H2B_all; 512G gives generous headroom. + sbatch: + mem: "512G" diff --git a/applications/dynacell/configs/benchmarks/virtual_staining/nucleus/fnet3d_paper/ipsc_confocal/eval__a549_mantis_denv.yaml b/applications/dynacell/configs/benchmarks/virtual_staining/nucleus/fnet3d_paper/ipsc_confocal/eval__a549_mantis_denv.yaml new file mode 100644 index 000000000..4121aefd5 --- /dev/null +++ b/applications/dynacell/configs/benchmarks/virtual_staining/nucleus/fnet3d_paper/ipsc_confocal/eval__a549_mantis_denv.yaml @@ -0,0 +1,22 @@ +# @package _global_ +# Benchmark eval leaf: Nucleus (H2B) predicted by FNet3DPaper on a549-mantis-h2b-denv. +# A549 manifest keys nucleus by gene (`h2b`); override the iPSC-side `nucleus` +# target_id from the target group so the resolver finds h2b on a549-mantis-h2b-denv. +defaults: + - override /target: nucleus + - override /predict_set: a549_mantis_h2b_denv + +target_name: h2b +benchmark: + dataset_ref: + target: h2b + +io: + pred_path: /hpc/projects/virtual_staining/training/dynacell/a549/predictions/nucl_fnet3d_paper_denv.zarr + +# A549 manifests don't carry cell_segmentation paths (no segmentation +# pipeline yet). Skip feature metrics until segmentation lands. 
+compute_feature_metrics: false + +save: + save_dir: /hpc/projects/virtual_staining/training/dynacell/a549/predictions/eval_nucl_fnet3d_paper_denv diff --git a/applications/dynacell/configs/benchmarks/virtual_staining/nucleus/fnet3d_paper/ipsc_confocal/eval__a549_mantis_mock.yaml b/applications/dynacell/configs/benchmarks/virtual_staining/nucleus/fnet3d_paper/ipsc_confocal/eval__a549_mantis_mock.yaml new file mode 100644 index 000000000..e0c5936db --- /dev/null +++ b/applications/dynacell/configs/benchmarks/virtual_staining/nucleus/fnet3d_paper/ipsc_confocal/eval__a549_mantis_mock.yaml @@ -0,0 +1,22 @@ +# @package _global_ +# Benchmark eval leaf: Nucleus (H2B) predicted by FNet3DPaper on a549-mantis-h2b-mock. +# A549 manifest keys nucleus by gene (`h2b`); override the iPSC-side `nucleus` +# target_id from the target group so the resolver finds h2b on a549-mantis-h2b-mock. +defaults: + - override /target: nucleus + - override /predict_set: a549_mantis_h2b_mock + +target_name: h2b +benchmark: + dataset_ref: + target: h2b + +io: + pred_path: /hpc/projects/virtual_staining/training/dynacell/a549/predictions/nucl_fnet3d_paper_mock.zarr + +# A549 manifests don't carry cell_segmentation paths (no segmentation +# pipeline yet). Skip feature metrics until segmentation lands. 
+compute_feature_metrics: false + +save: + save_dir: /hpc/projects/virtual_staining/training/dynacell/a549/predictions/eval_nucl_fnet3d_paper_mock diff --git a/applications/dynacell/configs/benchmarks/virtual_staining/nucleus/fnet3d_paper/ipsc_confocal/eval__a549_mantis_zikv.yaml b/applications/dynacell/configs/benchmarks/virtual_staining/nucleus/fnet3d_paper/ipsc_confocal/eval__a549_mantis_zikv.yaml new file mode 100644 index 000000000..2f0ab45f5 --- /dev/null +++ b/applications/dynacell/configs/benchmarks/virtual_staining/nucleus/fnet3d_paper/ipsc_confocal/eval__a549_mantis_zikv.yaml @@ -0,0 +1,22 @@ +# @package _global_ +# Benchmark eval leaf: Nucleus (H2B) predicted by FNet3DPaper on a549-mantis-h2b-zikv. +# A549 manifest keys nucleus by gene (`h2b`); override the iPSC-side `nucleus` +# target_id from the target group so the resolver finds h2b on a549-mantis-h2b-zikv. +defaults: + - override /target: nucleus + - override /predict_set: a549_mantis_h2b_zikv + +target_name: h2b +benchmark: + dataset_ref: + target: h2b + +io: + pred_path: /hpc/projects/virtual_staining/training/dynacell/a549/predictions/nucl_fnet3d_paper_zikv.zarr + +# A549 manifests don't carry cell_segmentation paths (no segmentation +# pipeline yet). Skip feature metrics until segmentation lands. 
+compute_feature_metrics: false + +save: + save_dir: /hpc/projects/virtual_staining/training/dynacell/a549/predictions/eval_nucl_fnet3d_paper_zikv diff --git a/applications/dynacell/configs/benchmarks/virtual_staining/nucleus/fnet3d_paper/ipsc_confocal/predict__a549_mantis_denv.yml b/applications/dynacell/configs/benchmarks/virtual_staining/nucleus/fnet3d_paper/ipsc_confocal/predict__a549_mantis_denv.yml new file mode 100644 index 000000000..7c7a9a61d --- /dev/null +++ b/applications/dynacell/configs/benchmarks/virtual_staining/nucleus/fnet3d_paper/ipsc_confocal/predict__a549_mantis_denv.yml @@ -0,0 +1,48 @@ +# FNet3D paper-baseline predict: nucleus trained on iPSC, predicting against a549-mantis-h2b-denv test. +# A549 manifest keys nucleus by gene (`h2b`); override the iPSC-side `nucleus` +# target_id from targets/nucleus.yml so the resolver finds the h2b target on +# a549-mantis-h2b-denv. +# Same iPSC best val-loss checkpoint as predict__ipsc_confocal.yml (epoch 226, loss/validate=0.7932). +base: + - ../../../_internal/shared/model/predict_sets/a549_mantis_h2b_denv.yml + - ../../../_internal/shared/model/targets/nucleus.yml + - ../../../_internal/shared/model/model_overlays/fnet3d_paper_predict.yml + - ../../../_internal/shared/model/launcher_profiles/mode_predict.yml + - ../../../_internal/shared/model/launcher_profiles/hardware_h200_single.yml + - ../../../_internal/shared/model/launcher_profiles/runtime_shared.yml + +benchmark: + task: virtual_staining + organelle: nucleus + trained_on: ipsc_confocal + predict_set: a549_mantis_h2b_denv + model_name: fnet3d_paper + experiment_id: nucleus__ipsc_confocal__fnet3d_paper__a549_mantis_h2b_denv + # Override the iPSC-side `nucleus` target to a549's gene-keyed `h2b`. 
+ dataset_ref: + target: h2b + +model: + init_args: + ckpt_path: /hpc/projects/comp.micro/virtual_staining/models/dynacell/ipsc/nucl/fnet3d_paper/checkpoints/epoch=226-step=196582.ckpt + +data: + init_args: + normalizations: + - class_path: viscy_transforms.NormalizeSampled + init_args: + keys: [Phase3D] + level: fov_statistics + subtrahend: mean + divisor: std + augmentations: [] + +trainer: + callbacks: + - class_path: viscy_utils.callbacks.prediction_writer.HCSPredictionWriter + init_args: + output_store: /hpc/projects/virtual_staining/training/dynacell/a549/predictions/nucl_fnet3d_paper_denv.zarr + +launcher: + job_name: FNet3DPaper_PRED_NUCL_ON_A549_DENV + run_root: /hpc/projects/virtual_staining/training/dynacell/a549/predictions diff --git a/applications/dynacell/configs/benchmarks/virtual_staining/nucleus/fnet3d_paper/ipsc_confocal/predict__a549_mantis_mock.yml b/applications/dynacell/configs/benchmarks/virtual_staining/nucleus/fnet3d_paper/ipsc_confocal/predict__a549_mantis_mock.yml new file mode 100644 index 000000000..702abe805 --- /dev/null +++ b/applications/dynacell/configs/benchmarks/virtual_staining/nucleus/fnet3d_paper/ipsc_confocal/predict__a549_mantis_mock.yml @@ -0,0 +1,48 @@ +# FNet3D paper-baseline predict: nucleus trained on iPSC, predicting against a549-mantis-h2b-mock test. +# A549 manifest keys nucleus by gene (`h2b`); override the iPSC-side `nucleus` +# target_id from targets/nucleus.yml so the resolver finds the h2b target on +# a549-mantis-h2b-mock. +# Same iPSC best val-loss checkpoint as predict__ipsc_confocal.yml (epoch 226, loss/validate=0.7932). 
+base: + - ../../../_internal/shared/model/predict_sets/a549_mantis_h2b_mock.yml + - ../../../_internal/shared/model/targets/nucleus.yml + - ../../../_internal/shared/model/model_overlays/fnet3d_paper_predict.yml + - ../../../_internal/shared/model/launcher_profiles/mode_predict.yml + - ../../../_internal/shared/model/launcher_profiles/hardware_h200_single.yml + - ../../../_internal/shared/model/launcher_profiles/runtime_shared.yml + +benchmark: + task: virtual_staining + organelle: nucleus + trained_on: ipsc_confocal + predict_set: a549_mantis_h2b_mock + model_name: fnet3d_paper + experiment_id: nucleus__ipsc_confocal__fnet3d_paper__a549_mantis_h2b_mock + # Override the iPSC-side `nucleus` target to a549's gene-keyed `h2b`. + dataset_ref: + target: h2b + +model: + init_args: + ckpt_path: /hpc/projects/comp.micro/virtual_staining/models/dynacell/ipsc/nucl/fnet3d_paper/checkpoints/epoch=226-step=196582.ckpt + +data: + init_args: + normalizations: + - class_path: viscy_transforms.NormalizeSampled + init_args: + keys: [Phase3D] + level: fov_statistics + subtrahend: mean + divisor: std + augmentations: [] + +trainer: + callbacks: + - class_path: viscy_utils.callbacks.prediction_writer.HCSPredictionWriter + init_args: + output_store: /hpc/projects/virtual_staining/training/dynacell/a549/predictions/nucl_fnet3d_paper_mock.zarr + +launcher: + job_name: FNet3DPaper_PRED_NUCL_ON_A549_MOCK + run_root: /hpc/projects/virtual_staining/training/dynacell/a549/predictions diff --git a/applications/dynacell/configs/benchmarks/virtual_staining/nucleus/fnet3d_paper/ipsc_confocal/predict__a549_mantis_zikv.yml b/applications/dynacell/configs/benchmarks/virtual_staining/nucleus/fnet3d_paper/ipsc_confocal/predict__a549_mantis_zikv.yml new file mode 100644 index 000000000..073339ae3 --- /dev/null +++ b/applications/dynacell/configs/benchmarks/virtual_staining/nucleus/fnet3d_paper/ipsc_confocal/predict__a549_mantis_zikv.yml @@ -0,0 +1,48 @@ +# FNet3D paper-baseline predict: nucleus trained 
on iPSC, predicting against a549-mantis-h2b-zikv test. +# A549 manifest keys nucleus by gene (`h2b`); override the iPSC-side `nucleus` +# target_id from targets/nucleus.yml so the resolver finds the h2b target on +# a549-mantis-h2b-zikv. +# Same iPSC best val-loss checkpoint as predict__ipsc_confocal.yml (epoch 226, loss/validate=0.7932). +base: + - ../../../_internal/shared/model/predict_sets/a549_mantis_h2b_zikv.yml + - ../../../_internal/shared/model/targets/nucleus.yml + - ../../../_internal/shared/model/model_overlays/fnet3d_paper_predict.yml + - ../../../_internal/shared/model/launcher_profiles/mode_predict.yml + - ../../../_internal/shared/model/launcher_profiles/hardware_h200_single.yml + - ../../../_internal/shared/model/launcher_profiles/runtime_shared.yml + +benchmark: + task: virtual_staining + organelle: nucleus + trained_on: ipsc_confocal + predict_set: a549_mantis_h2b_zikv + model_name: fnet3d_paper + experiment_id: nucleus__ipsc_confocal__fnet3d_paper__a549_mantis_h2b_zikv + # Override the iPSC-side `nucleus` target to a549's gene-keyed `h2b`. 
+ dataset_ref: + target: h2b + +model: + init_args: + ckpt_path: /hpc/projects/comp.micro/virtual_staining/models/dynacell/ipsc/nucl/fnet3d_paper/checkpoints/epoch=226-step=196582.ckpt + +data: + init_args: + normalizations: + - class_path: viscy_transforms.NormalizeSampled + init_args: + keys: [Phase3D] + level: fov_statistics + subtrahend: mean + divisor: std + augmentations: [] + +trainer: + callbacks: + - class_path: viscy_utils.callbacks.prediction_writer.HCSPredictionWriter + init_args: + output_store: /hpc/projects/virtual_staining/training/dynacell/a549/predictions/nucl_fnet3d_paper_zikv.zarr + +launcher: + job_name: FNet3DPaper_PRED_NUCL_ON_A549_ZIKV + run_root: /hpc/projects/virtual_staining/training/dynacell/a549/predictions diff --git a/applications/dynacell/configs/benchmarks/virtual_staining/nucleus/fnet3d_paper/ipsc_confocal/predict__ipsc_confocal.yml b/applications/dynacell/configs/benchmarks/virtual_staining/nucleus/fnet3d_paper/ipsc_confocal/predict__ipsc_confocal.yml new file mode 100644 index 000000000..2459e3042 --- /dev/null +++ b/applications/dynacell/configs/benchmarks/virtual_staining/nucleus/fnet3d_paper/ipsc_confocal/predict__ipsc_confocal.yml @@ -0,0 +1,42 @@ +# FNet3D paper-baseline predict: nucleus against ipsc_confocal test_cropped. +# Uses best val-loss checkpoint (epoch 226, loss/validate=0.7932). 
+base: + - ../../../_internal/shared/model/predict_sets/ipsc_confocal.yml + - ../../../_internal/shared/model/targets/nucleus.yml + - ../../../_internal/shared/model/model_overlays/fnet3d_paper_predict.yml + - ../../../_internal/shared/model/launcher_profiles/mode_predict.yml + - ../../../_internal/shared/model/launcher_profiles/hardware_h200_single.yml + - ../../../_internal/shared/model/launcher_profiles/runtime_shared.yml + +benchmark: + task: virtual_staining + organelle: nucleus + trained_on: ipsc_confocal + predict_set: ipsc_confocal + model_name: fnet3d_paper + experiment_id: nucleus__ipsc_confocal__fnet3d_paper__ipsc_confocal + +model: + init_args: + ckpt_path: /hpc/projects/comp.micro/virtual_staining/models/dynacell/ipsc/nucl/fnet3d_paper/checkpoints/epoch=226-step=196582.ckpt + +data: + init_args: + normalizations: + - class_path: viscy_transforms.NormalizeSampled + init_args: + keys: [Phase3D] + level: fov_statistics + subtrahend: mean + divisor: std + augmentations: [] + +trainer: + callbacks: + - class_path: viscy_utils.callbacks.prediction_writer.HCSPredictionWriter + init_args: + output_store: /hpc/projects/virtual_staining/training/dynacell/ipsc/predictions/nucl_fnet3d_paper.zarr + +launcher: + job_name: FNet3DPaper_PRED_NUCL + run_root: /hpc/projects/virtual_staining/training/dynacell/ipsc/predictions diff --git a/applications/dynacell/configs/benchmarks/virtual_staining/nucleus/fnet3d_paper/ipsc_confocal/train.yml b/applications/dynacell/configs/benchmarks/virtual_staining/nucleus/fnet3d_paper/ipsc_confocal/train.yml new file mode 100644 index 000000000..6978bc815 --- /dev/null +++ b/applications/dynacell/configs/benchmarks/virtual_staining/nucleus/fnet3d_paper/ipsc_confocal/train.yml @@ -0,0 +1,72 @@ +# FNet3D paper-baseline fit on nucleus (Nuclei channel of cell.zarr) — AICS iPSC confocal. +# The overlay's norm/aug/val_aug are keyed on Structure (the SEC61B/TOMM20 target +# channel). 
Nucleus target_channel is Nuclei, so we list-replace those three lists +# here to re-key them. +base: + - ../../../_internal/shared/model/train_sets/ipsc_confocal.yml + - ../../../_internal/shared/model/targets/nucleus.yml + - ../../../_internal/shared/model/data_overlays/fnet3d_paper_fit.yml + - ../../../_internal/shared/model/model_overlays/fnet3d_paper_fit.yml + - ../../../_internal/shared/model/launcher_profiles/mode_fit.yml + - ../../../_internal/shared/model/launcher_profiles/hardware_gpu_any_long.yml + - ../../../_internal/shared/model/launcher_profiles/runtime_shared.yml + +benchmark: + task: virtual_staining + organelle: nucleus + train_set: ipsc_confocal + model_name: fnet3d_paper + experiment_id: nucleus__ipsc_confocal__fnet3d_paper + +data: + init_args: + normalizations: + - class_path: viscy_transforms.NormalizeSampled + init_args: + keys: [Phase3D] + level: fov_statistics + subtrahend: mean + divisor: std + - class_path: viscy_transforms.NormalizeSampled + init_args: + keys: [Nuclei] + level: fov_statistics + subtrahend: mean + divisor: std + augmentations: + - class_path: viscy_transforms.RandWeightedCropd + init_args: + keys: [Phase3D, Nuclei] + w_key: Nuclei + spatial_size: [32, 64, 64] + num_samples: 8 + val_augmentations: + - class_path: viscy_transforms.CenterSpatialCropd + init_args: + keys: [Phase3D, Nuclei] + roi_size: [32, 64, 64] + +trainer: + logger: + init_args: + name: FNet3D_iPSC_NUCL_paper + save_dir: /hpc/projects/comp.micro/virtual_staining/models/dynacell/ipsc/nucl/fnet3d_paper + callbacks: + - class_path: lightning.pytorch.callbacks.LearningRateMonitor + init_args: + logging_interval: step + - class_path: lightning.pytorch.callbacks.ModelCheckpoint + init_args: + monitor: loss/validate + every_n_epochs: 1 + save_top_k: 4 + save_last: true + dirpath: /hpc/projects/comp.micro/virtual_staining/models/dynacell/ipsc/nucl/fnet3d_paper/checkpoints + +launcher: + job_name: FNet3DPaper_NUCL + run_root: 
/hpc/projects/comp.micro/virtual_staining/models/dynacell/ipsc/nucl/fnet3d_paper + # cell.zarr-backed preload pushes MaxVMSize past the shared 256G cap + # (observed 264G on the first launch; worker OOM-killed in validation). + sbatch: + mem: "512G" diff --git a/applications/dynacell/configs/benchmarks/virtual_staining/nucleus/fnet3d_paper/joint_ipsc_confocal_a549_mantis/predict__a549_mantis_denv.yml b/applications/dynacell/configs/benchmarks/virtual_staining/nucleus/fnet3d_paper/joint_ipsc_confocal_a549_mantis/predict__a549_mantis_denv.yml new file mode 100644 index 000000000..9afb866ad --- /dev/null +++ b/applications/dynacell/configs/benchmarks/virtual_staining/nucleus/fnet3d_paper/joint_ipsc_confocal_a549_mantis/predict__a549_mantis_denv.yml @@ -0,0 +1,50 @@ +# FNet3D paper-baseline predict: nucleus trained on joint iPSC+A549, +# predicting against a549-mantis-h2b-denv test. +# Best val-loss checkpoint from job 31962520 (epoch 126, val 0.9709). See +# predict__ipsc_confocal.yml in this dir for full provenance. +# A549 manifest keys nucleus by gene (`h2b`); override the iPSC-side `nucleus` +# target_id from targets/nucleus.yml so the resolver finds the h2b target on +# a549-mantis-h2b-denv. +base: + - ../../../_internal/shared/model/predict_sets/a549_mantis_h2b_denv.yml + - ../../../_internal/shared/model/targets/nucleus.yml + - ../../../_internal/shared/model/model_overlays/fnet3d_paper_predict.yml + - ../../../_internal/shared/model/launcher_profiles/mode_predict.yml + - ../../../_internal/shared/model/launcher_profiles/hardware_h200_single.yml + - ../../../_internal/shared/model/launcher_profiles/runtime_shared.yml + +benchmark: + task: virtual_staining + organelle: nucleus + trained_on: joint_ipsc_confocal_a549_mantis + predict_set: a549_mantis_h2b_denv + model_name: fnet3d_paper + experiment_id: nucleus__joint_ipsc_confocal_a549_mantis__fnet3d_paper__a549_mantis_h2b_denv + # Override the iPSC-side `nucleus` target to a549's gene-keyed `h2b`. 
+ dataset_ref: + target: h2b + +model: + init_args: + ckpt_path: /hpc/projects/comp.micro/virtual_staining/models/dynacell/joint_ipsc_confocal_a549_mantis/nucl/fnet3d_paper/checkpoints/epoch=126-step=196342.ckpt + +data: + init_args: + normalizations: + - class_path: viscy_transforms.NormalizeSampled + init_args: + keys: [Phase3D] + level: fov_statistics + subtrahend: mean + divisor: std + augmentations: [] + +trainer: + callbacks: + - class_path: viscy_utils.callbacks.prediction_writer.HCSPredictionWriter + init_args: + output_store: /hpc/projects/virtual_staining/training/dynacell/a549/predictions/nucl_fnet3d_paper_jointtrained_denv.zarr + +launcher: + job_name: FNet3DPaper_PRED_NUCL_JOINTTR_DENV + run_root: /hpc/projects/virtual_staining/training/dynacell/a549/predictions diff --git a/applications/dynacell/configs/benchmarks/virtual_staining/nucleus/fnet3d_paper/joint_ipsc_confocal_a549_mantis/predict__a549_mantis_mock.yml b/applications/dynacell/configs/benchmarks/virtual_staining/nucleus/fnet3d_paper/joint_ipsc_confocal_a549_mantis/predict__a549_mantis_mock.yml new file mode 100644 index 000000000..646ac95b0 --- /dev/null +++ b/applications/dynacell/configs/benchmarks/virtual_staining/nucleus/fnet3d_paper/joint_ipsc_confocal_a549_mantis/predict__a549_mantis_mock.yml @@ -0,0 +1,50 @@ +# FNet3D paper-baseline predict: nucleus trained on joint iPSC+A549, +# predicting against a549-mantis-h2b-mock test. +# Best val-loss checkpoint from job 31962520 (epoch 126, val 0.9709). See +# predict__ipsc_confocal.yml in this dir for full provenance. +# A549 manifest keys nucleus by gene (`h2b`); override the iPSC-side `nucleus` +# target_id from targets/nucleus.yml so the resolver finds the h2b target on +# a549-mantis-h2b-mock. 
+base: + - ../../../_internal/shared/model/predict_sets/a549_mantis_h2b_mock.yml + - ../../../_internal/shared/model/targets/nucleus.yml + - ../../../_internal/shared/model/model_overlays/fnet3d_paper_predict.yml + - ../../../_internal/shared/model/launcher_profiles/mode_predict.yml + - ../../../_internal/shared/model/launcher_profiles/hardware_h200_single.yml + - ../../../_internal/shared/model/launcher_profiles/runtime_shared.yml + +benchmark: + task: virtual_staining + organelle: nucleus + trained_on: joint_ipsc_confocal_a549_mantis + predict_set: a549_mantis_h2b_mock + model_name: fnet3d_paper + experiment_id: nucleus__joint_ipsc_confocal_a549_mantis__fnet3d_paper__a549_mantis_h2b_mock + # Override the iPSC-side `nucleus` target to a549's gene-keyed `h2b`. + dataset_ref: + target: h2b + +model: + init_args: + ckpt_path: /hpc/projects/comp.micro/virtual_staining/models/dynacell/joint_ipsc_confocal_a549_mantis/nucl/fnet3d_paper/checkpoints/epoch=126-step=196342.ckpt + +data: + init_args: + normalizations: + - class_path: viscy_transforms.NormalizeSampled + init_args: + keys: [Phase3D] + level: fov_statistics + subtrahend: mean + divisor: std + augmentations: [] + +trainer: + callbacks: + - class_path: viscy_utils.callbacks.prediction_writer.HCSPredictionWriter + init_args: + output_store: /hpc/projects/virtual_staining/training/dynacell/a549/predictions/nucl_fnet3d_paper_jointtrained_mock.zarr + +launcher: + job_name: FNet3DPaper_PRED_NUCL_JOINTTR_MOCK + run_root: /hpc/projects/virtual_staining/training/dynacell/a549/predictions diff --git a/applications/dynacell/configs/benchmarks/virtual_staining/nucleus/fnet3d_paper/joint_ipsc_confocal_a549_mantis/predict__a549_mantis_zikv.yml b/applications/dynacell/configs/benchmarks/virtual_staining/nucleus/fnet3d_paper/joint_ipsc_confocal_a549_mantis/predict__a549_mantis_zikv.yml new file mode 100644 index 000000000..879f1d60a --- /dev/null +++ 
b/applications/dynacell/configs/benchmarks/virtual_staining/nucleus/fnet3d_paper/joint_ipsc_confocal_a549_mantis/predict__a549_mantis_zikv.yml @@ -0,0 +1,50 @@ +# FNet3D paper-baseline predict: nucleus trained on joint iPSC+A549, +# predicting against a549-mantis-h2b-zikv test. +# Best val-loss checkpoint from job 31962520 (epoch 126, val 0.9709). See +# predict__ipsc_confocal.yml in this dir for full provenance. +# A549 manifest keys nucleus by gene (`h2b`); override the iPSC-side `nucleus` +# target_id from targets/nucleus.yml so the resolver finds the h2b target on +# a549-mantis-h2b-zikv. +base: + - ../../../_internal/shared/model/predict_sets/a549_mantis_h2b_zikv.yml + - ../../../_internal/shared/model/targets/nucleus.yml + - ../../../_internal/shared/model/model_overlays/fnet3d_paper_predict.yml + - ../../../_internal/shared/model/launcher_profiles/mode_predict.yml + - ../../../_internal/shared/model/launcher_profiles/hardware_h200_single.yml + - ../../../_internal/shared/model/launcher_profiles/runtime_shared.yml + +benchmark: + task: virtual_staining + organelle: nucleus + trained_on: joint_ipsc_confocal_a549_mantis + predict_set: a549_mantis_h2b_zikv + model_name: fnet3d_paper + experiment_id: nucleus__joint_ipsc_confocal_a549_mantis__fnet3d_paper__a549_mantis_h2b_zikv + # Override the iPSC-side `nucleus` target to a549's gene-keyed `h2b`. 
+ dataset_ref: + target: h2b + +model: + init_args: + ckpt_path: /hpc/projects/comp.micro/virtual_staining/models/dynacell/joint_ipsc_confocal_a549_mantis/nucl/fnet3d_paper/checkpoints/epoch=126-step=196342.ckpt + +data: + init_args: + normalizations: + - class_path: viscy_transforms.NormalizeSampled + init_args: + keys: [Phase3D] + level: fov_statistics + subtrahend: mean + divisor: std + augmentations: [] + +trainer: + callbacks: + - class_path: viscy_utils.callbacks.prediction_writer.HCSPredictionWriter + init_args: + output_store: /hpc/projects/virtual_staining/training/dynacell/a549/predictions/nucl_fnet3d_paper_jointtrained_zikv.zarr + +launcher: + job_name: FNet3DPaper_PRED_NUCL_JOINTTR_ZIKV + run_root: /hpc/projects/virtual_staining/training/dynacell/a549/predictions diff --git a/applications/dynacell/configs/benchmarks/virtual_staining/nucleus/fnet3d_paper/joint_ipsc_confocal_a549_mantis/predict__ipsc_confocal.yml b/applications/dynacell/configs/benchmarks/virtual_staining/nucleus/fnet3d_paper/joint_ipsc_confocal_a549_mantis/predict__ipsc_confocal.yml new file mode 100644 index 000000000..0906f9bc1 --- /dev/null +++ b/applications/dynacell/configs/benchmarks/virtual_staining/nucleus/fnet3d_paper/joint_ipsc_confocal_a549_mantis/predict__ipsc_confocal.yml @@ -0,0 +1,44 @@ +# FNet3D paper-baseline predict: nucleus trained on joint iPSC+A549, +# predicting against ipsc_confocal test_cropped. +# Best val-loss checkpoint from job 31962520 (epoch 126, val 0.9709). +# Job completed at 2026-05-05T15:23:37 (elapsed 1d 21h 12m). 
+base: + - ../../../_internal/shared/model/predict_sets/ipsc_confocal.yml + - ../../../_internal/shared/model/targets/nucleus.yml + - ../../../_internal/shared/model/model_overlays/fnet3d_paper_predict.yml + - ../../../_internal/shared/model/launcher_profiles/mode_predict.yml + - ../../../_internal/shared/model/launcher_profiles/hardware_h200_single.yml + - ../../../_internal/shared/model/launcher_profiles/runtime_shared.yml + +benchmark: + task: virtual_staining + organelle: nucleus + trained_on: joint_ipsc_confocal_a549_mantis + predict_set: ipsc_confocal + model_name: fnet3d_paper + experiment_id: nucleus__joint_ipsc_confocal_a549_mantis__fnet3d_paper__ipsc_confocal + +model: + init_args: + ckpt_path: /hpc/projects/comp.micro/virtual_staining/models/dynacell/joint_ipsc_confocal_a549_mantis/nucl/fnet3d_paper/checkpoints/epoch=126-step=196342.ckpt + +data: + init_args: + normalizations: + - class_path: viscy_transforms.NormalizeSampled + init_args: + keys: [Phase3D] + level: fov_statistics + subtrahend: mean + divisor: std + augmentations: [] + +trainer: + callbacks: + - class_path: viscy_utils.callbacks.prediction_writer.HCSPredictionWriter + init_args: + output_store: /hpc/projects/virtual_staining/training/dynacell/ipsc/predictions/nucl_fnet3d_paper_jointtrained.zarr + +launcher: + job_name: FNet3DPaper_PRED_NUCL_JOINTTR_IPSC + run_root: /hpc/projects/virtual_staining/training/dynacell/ipsc/predictions diff --git a/applications/dynacell/configs/benchmarks/virtual_staining/nucleus/fnet3d_paper/joint_ipsc_confocal_a549_mantis/train.yml b/applications/dynacell/configs/benchmarks/virtual_staining/nucleus/fnet3d_paper/joint_ipsc_confocal_a549_mantis/train.yml new file mode 100644 index 000000000..2ba6acbec --- /dev/null +++ b/applications/dynacell/configs/benchmarks/virtual_staining/nucleus/fnet3d_paper/joint_ipsc_confocal_a549_mantis/train.yml @@ -0,0 +1,127 @@ +# FNet3D paper-baseline fit on nucleus (NUCL) — joint +# ipsc_confocal + a549_mantis pooled. 
Mirrors +# nucleus/fnet3d_paper/ipsc_confocal/train.yml on the joint +# train_set. +# +# Joint leaf per Stage 7 of A549_EXPANSION_ROADMAP.md. +# BatchedConcatDataModule + two explicit HCSDataModule children; +# only model_overlays/fnet3d_paper_fit.yml is composed; data block +# inline. Norms + 8-crops-per-FOV diverge from the CellDiff/UNetViT +# conventions: target channel uses mean/std (not median/iqr) and +# val augmentations are CPU CenterSpatialCropd on the raw keys (the +# baseline's training pipeline doesn't go through GPU val transforms). +# +# Topology: single GPU, any model, long wall — same as +# fnet3d_paper/ipsc_confocal/train.yml. The paper baseline is single-GPU +# and we keep that here so iPSC-only and joint runs are apples-to-apples. +base: + - ../../../_internal/shared/model/model_overlays/fnet3d_paper_fit.yml + - ../../../_internal/shared/model/launcher_profiles/mode_fit.yml + - ../../../_internal/shared/model/launcher_profiles/hardware_gpu_any_long.yml + - ../../../_internal/shared/model/launcher_profiles/runtime_shared.yml + +benchmark: + task: virtual_staining + organelle: nucleus + gene: Nuclei + target: nucleus + target_id: nucleus + train_set: joint_ipsc_confocal_a549_mantis + model_name: fnet3d_paper + experiment_id: nucleus__joint_ipsc_confocal_a549_mantis__fnet3d_paper + +trainer: + logger: + init_args: + name: FNet3D_JOINT_NUCL_paper + save_dir: /hpc/projects/comp.micro/virtual_staining/models/dynacell/joint_ipsc_confocal_a549_mantis/nucl/fnet3d_paper + callbacks: + - class_path: lightning.pytorch.callbacks.LearningRateMonitor + init_args: + logging_interval: step + - class_path: lightning.pytorch.callbacks.ModelCheckpoint + init_args: + monitor: loss/validate + every_n_epochs: 1 + save_top_k: 4 + save_last: true + dirpath: /hpc/projects/comp.micro/virtual_staining/models/dynacell/joint_ipsc_confocal_a549_mantis/nucl/fnet3d_paper/checkpoints + +_hcs_init_args: &hcs_init_args + source_channel: Phase3D + target_channel: Nuclei + 
z_window_size: 32 + # batch_size in joint mode is NOT divided by RandWeightedCropd + # num_samples (BatchedConcatDataModule.train_dataloader uses + # batch_size as-is, unlike HCSDataModule.train_dataloader which + # divides by train_patches_per_stack). To match single-set's + # effective on-GPU batch of 48 (single-set's batch_size 48 / 8), + # use 6 here so 6 indices * 8 num_samples = 48 GPU samples. + batch_size: 6 + num_workers: 8 + yx_patch_size: [64, 64] + split_ratio: 0.8 + mmap_preload: true + scratch_dir: /dev/shm + persistent_workers: true + normalizations: + - class_path: viscy_transforms.NormalizeSampled + init_args: + keys: [Phase3D] + level: fov_statistics + subtrahend: mean + divisor: std + - class_path: viscy_transforms.NormalizeSampled + init_args: + keys: [Nuclei] + level: fov_statistics + subtrahend: mean + divisor: std + augmentations: + - class_path: viscy_transforms.RandWeightedCropd + init_args: + keys: [Phase3D, Nuclei] + w_key: Nuclei + spatial_size: [32, 64, 64] + num_samples: 8 + gpu_augmentations: + - class_path: viscy_transforms.BatchedRandFlipd + init_args: + keys: [source, target] + spatial_axes: [1] + prob: 0.5 + - class_path: viscy_transforms.BatchedRandFlipd + init_args: + keys: [source, target] + spatial_axes: [2] + prob: 0.5 + val_augmentations: + - class_path: viscy_transforms.CenterSpatialCropd + init_args: + keys: [Phase3D, Nuclei] + roi_size: [32, 64, 64] + +data: + class_path: viscy_data.BatchedConcatDataModule + init_args: + data_modules: + # ipsc_confocal — aics-hipsc multi-marker cell.zarr (Nuclei channel) + - class_path: viscy_data.hcs.HCSDataModule + init_args: + <<: *hcs_init_args + data_path: /hpc/projects/virtual_staining/training/dynacell/ipsc/dataset_v4/train/cell.zarr + # a549_mantis — pooled H2B all-conditions train store (Nuclei channel) + - class_path: viscy_data.hcs.HCSDataModule + init_args: + <<: *hcs_init_args + data_path: /hpc/projects/virtual_staining/training/dynacell/a549/mantis_v1/train/H2B_all.zarr + 
+launcher: + job_name: FNet3DPaper_JOINT_NUCL + run_root: /hpc/projects/comp.micro/virtual_staining/models/dynacell/joint_ipsc_confocal_a549_mantis/nucl/fnet3d_paper + # 512G to match the shared headroom convention across the fnet3d + # leaves on a549/joint workloads. mmap_preload after the BasicIndexer + # fix peaks at ~185 GB for joint cell.zarr + H2B_all; 512G gives + # generous headroom for worker buffers and validation transients. + sbatch: + mem: "512G" diff --git a/applications/dynacell/configs/benchmarks/virtual_staining/nucleus/unetvit3d/a549_mantis/train.yml b/applications/dynacell/configs/benchmarks/virtual_staining/nucleus/unetvit3d/a549_mantis/train.yml new file mode 100644 index 000000000..759e49978 --- /dev/null +++ b/applications/dynacell/configs/benchmarks/virtual_staining/nucleus/unetvit3d/a549_mantis/train.yml @@ -0,0 +1,43 @@ +# UNetViT3D fit on nucleus (Nuclei channel of cell.zarr) — A549 mantis-lightsheet pooled (mock + DENV + ZIKV). +base: + - ../../../_internal/shared/model/train_sets/a549_mantis.yml + - ../../../_internal/shared/model/targets/nucleus.yml + - ../../../_internal/shared/model/data_overlays/unetvit3d_fit.yml + - ../../../_internal/shared/model/model_overlays/unetvit3d_fit.yml + - ../../../_internal/shared/model/launcher_profiles/mode_fit.yml + - ../../../_internal/shared/model/launcher_profiles/hardware_h200_single.yml + - ../../../_internal/shared/model/launcher_profiles/runtime_shared.yml + +benchmark: + task: virtual_staining + organelle: nucleus + train_set: a549_mantis + model_name: unetvit3d + experiment_id: nucleus__a549_mantis__unetvit3d + +trainer: + logger: + init_args: + name: UNetViT3D_A549_NUCL + save_dir: /hpc/projects/comp.micro/virtual_staining/models/cell_diff_vs_viscy/a549_mantis/nucl/unetvit3d + callbacks: + - class_path: lightning.pytorch.callbacks.LearningRateMonitor + init_args: + logging_interval: step + - class_path: lightning.pytorch.callbacks.ModelCheckpoint + init_args: + monitor: loss/validate + 
every_n_epochs: 1 + save_top_k: 4 + save_last: true + dirpath: /hpc/projects/comp.micro/virtual_staining/models/cell_diff_vs_viscy/a549_mantis/nucl/unetvit3d/checkpoints + +data: + init_args: + # A549 pooled store + target_channel — no resolver in this train_set. + target_channel: Nuclei + data_path: /hpc/projects/virtual_staining/training/dynacell/a549/mantis_v1/train/H2B_all.zarr + +launcher: + job_name: UNetViT3D_A549_NUCL + run_root: /hpc/projects/comp.micro/virtual_staining/models/cell_diff_vs_viscy/a549_mantis/nucl/unetvit3d diff --git a/applications/dynacell/configs/benchmarks/virtual_staining/nucleus/unetvit3d/ipsc_confocal/eval__a549_mantis_denv.yaml b/applications/dynacell/configs/benchmarks/virtual_staining/nucleus/unetvit3d/ipsc_confocal/eval__a549_mantis_denv.yaml new file mode 100644 index 000000000..1ccfa09e3 --- /dev/null +++ b/applications/dynacell/configs/benchmarks/virtual_staining/nucleus/unetvit3d/ipsc_confocal/eval__a549_mantis_denv.yaml @@ -0,0 +1,22 @@ +# @package _global_ +# Benchmark eval leaf: Nucleus (H2B) predicted by UNetViT3D on a549-mantis-h2b-denv. +# A549 manifest keys nucleus by gene (`h2b`); override the iPSC-side `nucleus` +# target_id from the target group so the resolver finds h2b on a549-mantis-h2b-denv. +defaults: + - override /target: nucleus + - override /predict_set: a549_mantis_h2b_denv + +target_name: h2b +benchmark: + dataset_ref: + target: h2b + +io: + pred_path: /hpc/projects/virtual_staining/training/dynacell/a549/predictions/nucleus_unetvit3d_denv.zarr + +# A549 manifests don't carry cell_segmentation paths (no segmentation +# pipeline yet). Skip feature metrics until segmentation lands. 
+compute_feature_metrics: false + +save: + save_dir: /hpc/projects/virtual_staining/training/dynacell/a549/predictions/eval_nucleus_unetvit3d_denv diff --git a/applications/dynacell/configs/benchmarks/virtual_staining/nucleus/unetvit3d/ipsc_confocal/eval__a549_mantis_mock.yaml b/applications/dynacell/configs/benchmarks/virtual_staining/nucleus/unetvit3d/ipsc_confocal/eval__a549_mantis_mock.yaml new file mode 100644 index 000000000..fde231483 --- /dev/null +++ b/applications/dynacell/configs/benchmarks/virtual_staining/nucleus/unetvit3d/ipsc_confocal/eval__a549_mantis_mock.yaml @@ -0,0 +1,22 @@ +# @package _global_ +# Benchmark eval leaf: Nucleus (H2B) predicted by UNetViT3D on a549-mantis-h2b-mock. +# A549 manifest keys nucleus by gene (`h2b`); override the iPSC-side `nucleus` +# target_id from the target group so the resolver finds h2b on a549-mantis-h2b-mock. +defaults: + - override /target: nucleus + - override /predict_set: a549_mantis_h2b_mock + +target_name: h2b +benchmark: + dataset_ref: + target: h2b + +io: + pred_path: /hpc/projects/virtual_staining/training/dynacell/a549/predictions/nucleus_unetvit3d_mock.zarr + +# A549 manifests don't carry cell_segmentation paths (no segmentation +# pipeline yet). Skip feature metrics until segmentation lands. +compute_feature_metrics: false + +save: + save_dir: /hpc/projects/virtual_staining/training/dynacell/a549/predictions/eval_nucleus_unetvit3d_mock diff --git a/applications/dynacell/configs/benchmarks/virtual_staining/nucleus/unetvit3d/ipsc_confocal/eval__a549_mantis_zikv.yaml b/applications/dynacell/configs/benchmarks/virtual_staining/nucleus/unetvit3d/ipsc_confocal/eval__a549_mantis_zikv.yaml new file mode 100644 index 000000000..3490b9504 --- /dev/null +++ b/applications/dynacell/configs/benchmarks/virtual_staining/nucleus/unetvit3d/ipsc_confocal/eval__a549_mantis_zikv.yaml @@ -0,0 +1,22 @@ +# @package _global_ +# Benchmark eval leaf: Nucleus (H2B) predicted by UNetViT3D on a549-mantis-h2b-zikv. 
+# A549 manifest keys nucleus by gene (`h2b`); override the iPSC-side `nucleus` +# target_id from the target group so the resolver finds h2b on a549-mantis-h2b-zikv. +defaults: + - override /target: nucleus + - override /predict_set: a549_mantis_h2b_zikv + +target_name: h2b +benchmark: + dataset_ref: + target: h2b + +io: + pred_path: /hpc/projects/virtual_staining/training/dynacell/a549/predictions/nucleus_unetvit3d_zikv.zarr + +# A549 manifests don't carry cell_segmentation paths (no segmentation +# pipeline yet). Skip feature metrics until segmentation lands. +compute_feature_metrics: false + +save: + save_dir: /hpc/projects/virtual_staining/training/dynacell/a549/predictions/eval_nucleus_unetvit3d_zikv diff --git a/applications/dynacell/configs/benchmarks/virtual_staining/nucleus/unetvit3d/ipsc_confocal/eval__ipsc_confocal.yaml b/applications/dynacell/configs/benchmarks/virtual_staining/nucleus/unetvit3d/ipsc_confocal/eval__ipsc_confocal.yaml new file mode 100644 index 000000000..e22f22915 --- /dev/null +++ b/applications/dynacell/configs/benchmarks/virtual_staining/nucleus/unetvit3d/ipsc_confocal/eval__ipsc_confocal.yaml @@ -0,0 +1,13 @@ +# @package _global_ +# Benchmark eval leaf: Nucleus predicted by UNetViT3D on iPSC confocal. 
+defaults: + - override /target: nucleus + - override /predict_set: ipsc_confocal + +io: + pred_path: /hpc/projects/virtual_staining/training/dynacell/ipsc/predictions/nucleus_unetvit3d.zarr + +compute_feature_metrics: true + +save: + save_dir: /hpc/projects/virtual_staining/training/dynacell/ipsc/predictions/eval_nucleus_unetvit3d diff --git a/applications/dynacell/configs/benchmarks/virtual_staining/nucleus/unetvit3d/ipsc_confocal/predict__a549_mantis_denv.yml b/applications/dynacell/configs/benchmarks/virtual_staining/nucleus/unetvit3d/ipsc_confocal/predict__a549_mantis_denv.yml new file mode 100644 index 000000000..60e0189ba --- /dev/null +++ b/applications/dynacell/configs/benchmarks/virtual_staining/nucleus/unetvit3d/ipsc_confocal/predict__a549_mantis_denv.yml @@ -0,0 +1,49 @@ +# UNetViT3D predict: nucleus trained on iPSC, predicting against a549-mantis-h2b-denv test. +# A549 manifest keys nucleus by gene (`h2b`); override the iPSC-side `nucleus` +# target_id from targets/nucleus.yml so the resolver finds the h2b target on +# a549-mantis-h2b-denv. +base: + - ../../../_internal/shared/model/predict_sets/a549_mantis_h2b_denv.yml + - ../../../_internal/shared/model/targets/nucleus.yml + - ../../../_internal/shared/model/model_overlays/unetvit3d_predict.yml + - ../../../_internal/shared/model/launcher_profiles/mode_predict.yml + - ../../../_internal/shared/model/launcher_profiles/hardware_h200_single.yml + - ../../../_internal/shared/model/launcher_profiles/runtime_shared.yml + +benchmark: + task: virtual_staining + organelle: nucleus + trained_on: ipsc_confocal + predict_set: a549_mantis_h2b_denv + model_name: unetvit3d + experiment_id: nucleus__ipsc_confocal__unetvit3d__a549_mantis_h2b_denv + # Override the iPSC-side `nucleus` target to a549's gene-keyed `h2b`. 
+ dataset_ref: + target: h2b + +model: + init_args: + ckpt_path: /hpc/projects/comp.micro/virtual_staining/models/cell_diff_vs_viscy/ipsc/nucl/unetvit3d/checkpoints/last.ckpt + +data: + init_args: + # override target-inherited normalizations: predict only reads source + normalizations: + - class_path: viscy_transforms.NormalizeSampled + init_args: + keys: [Phase3D] + level: fov_statistics + subtrahend: mean + divisor: std + # clear target-inherited RandWeightedCropd; predict has no CPU augs + augmentations: [] + +trainer: + callbacks: + - class_path: viscy_utils.callbacks.prediction_writer.HCSPredictionWriter + init_args: + output_store: /hpc/projects/virtual_staining/training/dynacell/a549/predictions/nucleus_unetvit3d_denv.zarr + +launcher: + job_name: UNetViT3D_PRED_NUCLEUS_ON_A549_DENV + run_root: /hpc/projects/virtual_staining/training/dynacell/a549/predictions diff --git a/applications/dynacell/configs/benchmarks/virtual_staining/nucleus/unetvit3d/ipsc_confocal/predict__a549_mantis_mock.yml b/applications/dynacell/configs/benchmarks/virtual_staining/nucleus/unetvit3d/ipsc_confocal/predict__a549_mantis_mock.yml new file mode 100644 index 000000000..d0714c7f6 --- /dev/null +++ b/applications/dynacell/configs/benchmarks/virtual_staining/nucleus/unetvit3d/ipsc_confocal/predict__a549_mantis_mock.yml @@ -0,0 +1,49 @@ +# UNetViT3D predict: nucleus trained on iPSC, predicting against a549-mantis-h2b-mock test. +# A549 manifest keys nucleus by gene (`h2b`); override the iPSC-side `nucleus` +# target_id from targets/nucleus.yml so the resolver finds the h2b target on +# a549-mantis-h2b-mock. 
+base: + - ../../../_internal/shared/model/predict_sets/a549_mantis_h2b_mock.yml + - ../../../_internal/shared/model/targets/nucleus.yml + - ../../../_internal/shared/model/model_overlays/unetvit3d_predict.yml + - ../../../_internal/shared/model/launcher_profiles/mode_predict.yml + - ../../../_internal/shared/model/launcher_profiles/hardware_h200_single.yml + - ../../../_internal/shared/model/launcher_profiles/runtime_shared.yml + +benchmark: + task: virtual_staining + organelle: nucleus + trained_on: ipsc_confocal + predict_set: a549_mantis_h2b_mock + model_name: unetvit3d + experiment_id: nucleus__ipsc_confocal__unetvit3d__a549_mantis_h2b_mock + # Override the iPSC-side `nucleus` target to a549's gene-keyed `h2b`. + dataset_ref: + target: h2b + +model: + init_args: + ckpt_path: /hpc/projects/comp.micro/virtual_staining/models/cell_diff_vs_viscy/ipsc/nucl/unetvit3d/checkpoints/last.ckpt + +data: + init_args: + # override target-inherited normalizations: predict only reads source + normalizations: + - class_path: viscy_transforms.NormalizeSampled + init_args: + keys: [Phase3D] + level: fov_statistics + subtrahend: mean + divisor: std + # clear target-inherited RandWeightedCropd; predict has no CPU augs + augmentations: [] + +trainer: + callbacks: + - class_path: viscy_utils.callbacks.prediction_writer.HCSPredictionWriter + init_args: + output_store: /hpc/projects/virtual_staining/training/dynacell/a549/predictions/nucleus_unetvit3d_mock.zarr + +launcher: + job_name: UNetViT3D_PRED_NUCLEUS_ON_A549_MOCK + run_root: /hpc/projects/virtual_staining/training/dynacell/a549/predictions diff --git a/applications/dynacell/configs/benchmarks/virtual_staining/nucleus/unetvit3d/ipsc_confocal/predict__a549_mantis_zikv.yml b/applications/dynacell/configs/benchmarks/virtual_staining/nucleus/unetvit3d/ipsc_confocal/predict__a549_mantis_zikv.yml new file mode 100644 index 000000000..76be8fc28 --- /dev/null +++ 
b/applications/dynacell/configs/benchmarks/virtual_staining/nucleus/unetvit3d/ipsc_confocal/predict__a549_mantis_zikv.yml @@ -0,0 +1,49 @@ +# UNetViT3D predict: nucleus trained on iPSC, predicting against a549-mantis-h2b-zikv test. +# A549 manifest keys nucleus by gene (`h2b`); override the iPSC-side `nucleus` +# target_id from targets/nucleus.yml so the resolver finds the h2b target on +# a549-mantis-h2b-zikv. +base: + - ../../../_internal/shared/model/predict_sets/a549_mantis_h2b_zikv.yml + - ../../../_internal/shared/model/targets/nucleus.yml + - ../../../_internal/shared/model/model_overlays/unetvit3d_predict.yml + - ../../../_internal/shared/model/launcher_profiles/mode_predict.yml + - ../../../_internal/shared/model/launcher_profiles/hardware_h200_single.yml + - ../../../_internal/shared/model/launcher_profiles/runtime_shared.yml + +benchmark: + task: virtual_staining + organelle: nucleus + trained_on: ipsc_confocal + predict_set: a549_mantis_h2b_zikv + model_name: unetvit3d + experiment_id: nucleus__ipsc_confocal__unetvit3d__a549_mantis_h2b_zikv + # Override the iPSC-side `nucleus` target to a549's gene-keyed `h2b`. 
+ dataset_ref: + target: h2b + +model: + init_args: + ckpt_path: /hpc/projects/comp.micro/virtual_staining/models/cell_diff_vs_viscy/ipsc/nucl/unetvit3d/checkpoints/last.ckpt + +data: + init_args: + # override target-inherited normalizations: predict only reads source + normalizations: + - class_path: viscy_transforms.NormalizeSampled + init_args: + keys: [Phase3D] + level: fov_statistics + subtrahend: mean + divisor: std + # clear target-inherited RandWeightedCropd; predict has no CPU augs + augmentations: [] + +trainer: + callbacks: + - class_path: viscy_utils.callbacks.prediction_writer.HCSPredictionWriter + init_args: + output_store: /hpc/projects/virtual_staining/training/dynacell/a549/predictions/nucleus_unetvit3d_zikv.zarr + +launcher: + job_name: UNetViT3D_PRED_NUCLEUS_ON_A549_ZIKV + run_root: /hpc/projects/virtual_staining/training/dynacell/a549/predictions diff --git a/applications/dynacell/configs/benchmarks/virtual_staining/nucleus/unetvit3d/ipsc_confocal/predict__ipsc_confocal.yml b/applications/dynacell/configs/benchmarks/virtual_staining/nucleus/unetvit3d/ipsc_confocal/predict__ipsc_confocal.yml new file mode 100644 index 000000000..c29cba8b2 --- /dev/null +++ b/applications/dynacell/configs/benchmarks/virtual_staining/nucleus/unetvit3d/ipsc_confocal/predict__ipsc_confocal.yml @@ -0,0 +1,43 @@ +# UNetViT3D predict: Nucleus against ipsc_confocal test_cropped. 
+base: + - ../../../_internal/shared/model/predict_sets/ipsc_confocal.yml + - ../../../_internal/shared/model/targets/nucleus.yml + - ../../../_internal/shared/model/model_overlays/unetvit3d_predict.yml + - ../../../_internal/shared/model/launcher_profiles/mode_predict.yml + - ../../../_internal/shared/model/launcher_profiles/hardware_h200_single.yml + - ../../../_internal/shared/model/launcher_profiles/runtime_shared.yml + +benchmark: + task: virtual_staining + organelle: nucleus + trained_on: ipsc_confocal + predict_set: ipsc_confocal + model_name: unetvit3d + experiment_id: nucleus__ipsc_confocal__unetvit3d__ipsc_confocal + +model: + init_args: + ckpt_path: /hpc/projects/comp.micro/virtual_staining/models/cell_diff_vs_viscy/ipsc/nucl/unetvit3d/checkpoints/last.ckpt + +data: + init_args: + # override target-inherited normalizations: predict only reads source + normalizations: + - class_path: viscy_transforms.NormalizeSampled + init_args: + keys: [Phase3D] + level: fov_statistics + subtrahend: mean + divisor: std + # clear target-inherited RandWeightedCropd; predict has no CPU augs + augmentations: [] + +trainer: + callbacks: + - class_path: viscy_utils.callbacks.prediction_writer.HCSPredictionWriter + init_args: + output_store: /hpc/projects/virtual_staining/training/dynacell/ipsc/predictions/nucleus_unetvit3d.zarr + +launcher: + job_name: UNetViT3D_PRED_NUCLEUS + run_root: /hpc/projects/virtual_staining/training/dynacell/ipsc/predictions diff --git a/applications/dynacell/configs/benchmarks/virtual_staining/nucleus/unetvit3d/ipsc_confocal/train.yml b/applications/dynacell/configs/benchmarks/virtual_staining/nucleus/unetvit3d/ipsc_confocal/train.yml new file mode 100644 index 000000000..ef6f7334a --- /dev/null +++ b/applications/dynacell/configs/benchmarks/virtual_staining/nucleus/unetvit3d/ipsc_confocal/train.yml @@ -0,0 +1,37 @@ +# UNetViT3D fit on nucleus (Nuclei channel of cell.zarr) — AICS iPSC confocal. 
+base: + - ../../../_internal/shared/model/train_sets/ipsc_confocal.yml + - ../../../_internal/shared/model/targets/nucleus.yml + - ../../../_internal/shared/model/data_overlays/unetvit3d_fit.yml + - ../../../_internal/shared/model/model_overlays/unetvit3d_fit.yml + - ../../../_internal/shared/model/launcher_profiles/mode_fit.yml + - ../../../_internal/shared/model/launcher_profiles/hardware_h200_single.yml + - ../../../_internal/shared/model/launcher_profiles/runtime_shared.yml + +benchmark: + task: virtual_staining + organelle: nucleus + train_set: ipsc_confocal + model_name: unetvit3d + experiment_id: nucleus__ipsc_confocal__unetvit3d + +trainer: + logger: + init_args: + name: UNetViT3D_iPSC_NUCL + save_dir: /hpc/projects/comp.micro/virtual_staining/models/cell_diff_vs_viscy/ipsc/nucl/unetvit3d + callbacks: + - class_path: lightning.pytorch.callbacks.LearningRateMonitor + init_args: + logging_interval: step + - class_path: lightning.pytorch.callbacks.ModelCheckpoint + init_args: + monitor: loss/validate + every_n_epochs: 1 + save_top_k: 4 + save_last: true + dirpath: /hpc/projects/comp.micro/virtual_staining/models/cell_diff_vs_viscy/ipsc/nucl/unetvit3d/checkpoints + +launcher: + job_name: UNetViT3D_NUCL + run_root: /hpc/projects/comp.micro/virtual_staining/models/cell_diff_vs_viscy/ipsc/nucl/unetvit3d diff --git a/applications/dynacell/configs/benchmarks/virtual_staining/nucleus/unetvit3d/joint_ipsc_confocal_a549_mantis/train.yml b/applications/dynacell/configs/benchmarks/virtual_staining/nucleus/unetvit3d/joint_ipsc_confocal_a549_mantis/train.yml new file mode 100644 index 000000000..875860261 --- /dev/null +++ b/applications/dynacell/configs/benchmarks/virtual_staining/nucleus/unetvit3d/joint_ipsc_confocal_a549_mantis/train.yml @@ -0,0 +1,143 @@ +# UNetViT3D fit on nucleus (Nuclei) — joint ipsc_confocal + a549_mantis pooled. +# +# Joint leaf per Stage 7 of A549_EXPANSION_ROADMAP.md. 
Uses +# BatchedConcatDataModule with two explicit HCSDataModule children +# (no benchmark.dataset_ref — joint leaves bypass the single-dataset +# resolver). Only model_overlays/unetvit3d_fit.yml is composed; the data +# block is authored inline because joint hparams live on the children. +# +# iPSC source is the multi-marker cell.zarr (Brightfield, Nuclei, +# Membrane, Phase3D); A549 source is the H2B-marker pooled store +# H2B_all.zarr. The shared target_channel name is `Nuclei` in both. +# +# Topology: single H200, single GPU — same as unetvit3d/ipsc_confocal/train.yml. +# The paper baseline pattern is single-GPU and we keep that here so +# iPSC-only and joint runs are apples-to-apples. +base: + - ../../../_internal/shared/model/model_overlays/unetvit3d_fit.yml + - ../../../_internal/shared/model/launcher_profiles/mode_fit.yml + - ../../../_internal/shared/model/launcher_profiles/hardware_h200_single.yml + - ../../../_internal/shared/model/launcher_profiles/runtime_shared.yml + +benchmark: + task: virtual_staining + organelle: nucleus + gene: Nuclei + target: nucleus + target_id: nucleus + train_set: joint_ipsc_confocal_a549_mantis + model_name: unetvit3d + experiment_id: nucleus__joint_ipsc_confocal_a549_mantis__unetvit3d + +trainer: + logger: + init_args: + name: UNetViT3D_JOINT_NUCL + save_dir: /hpc/projects/comp.micro/virtual_staining/models/cell_diff_vs_viscy/joint_ipsc_confocal_a549_mantis/nucl/unetvit3d + callbacks: + - class_path: lightning.pytorch.callbacks.LearningRateMonitor + init_args: + logging_interval: step + - class_path: lightning.pytorch.callbacks.ModelCheckpoint + init_args: + monitor: loss/validate + every_n_epochs: 1 + save_top_k: 4 + save_last: true + dirpath: /hpc/projects/comp.micro/virtual_staining/models/cell_diff_vs_viscy/joint_ipsc_confocal_a549_mantis/nucl/unetvit3d/checkpoints + +_hcs_init_args: &hcs_init_args + source_channel: Phase3D + target_channel: Nuclei + z_window_size: 13 + batch_size: 4 + num_workers: 4 + yx_patch_size: 
[512, 512] + split_ratio: 0.8 + mmap_preload: true + scratch_dir: /dev/shm + persistent_workers: true + normalizations: + - class_path: viscy_transforms.NormalizeSampled + init_args: + keys: [Phase3D] + level: fov_statistics + subtrahend: mean + divisor: std + - class_path: viscy_transforms.NormalizeSampled + init_args: + keys: [Nuclei] + level: fov_statistics + subtrahend: median + divisor: iqr + augmentations: + - class_path: viscy_transforms.RandWeightedCropd + init_args: + keys: [Phase3D, Nuclei] + w_key: Nuclei + spatial_size: [13, 624, 624] + num_samples: 2 + gpu_augmentations: + - class_path: viscy_transforms.BatchedRandAffined + init_args: + keys: [source, target] + prob: 0.8 + rotate_range: [3.14, 0, 0] + shear_range: [0.0, 0.05, 0.05] + scale_range: [[0.7, 1.3], [0.5, 1.5], [0.5, 1.5]] + safe_crop_size: [8, 512, 512] + safe_crop_coverage: 0.9 + - class_path: viscy_transforms.BatchedCenterSpatialCropd + init_args: + keys: [source, target] + roi_size: [8, 512, 512] + - class_path: viscy_transforms.BatchedRandAdjustContrastd + init_args: + keys: [source] + prob: 0.5 + gamma: [0.8, 1.2] + - class_path: viscy_transforms.BatchedRandScaleIntensityd + init_args: + keys: [source] + prob: 0.5 + factors: 0.5 + - class_path: viscy_transforms.BatchedRandGaussianNoised + init_args: + keys: [source] + prob: 0.5 + mean: 0.0 + std: 0.3 + - class_path: viscy_transforms.BatchedRandGaussianSmoothd + init_args: + keys: [source] + prob: 0.5 + sigma_x: [0.25, 0.75] + sigma_y: [0.25, 0.75] + sigma_z: [0.25, 0.75] + val_gpu_augmentations: + - class_path: viscy_transforms.BatchedCenterSpatialCropd + init_args: + keys: [source, target] + roi_size: [8, 512, 512] + +data: + class_path: viscy_data.BatchedConcatDataModule + init_args: + data_modules: + - class_path: viscy_data.hcs.HCSDataModule + init_args: + <<: *hcs_init_args + data_path: /hpc/projects/virtual_staining/training/dynacell/ipsc/dataset_v4/train/cell.zarr + - class_path: viscy_data.hcs.HCSDataModule + init_args: + <<: 
*hcs_init_args + data_path: /hpc/projects/virtual_staining/training/dynacell/a549/mantis_v1/train/H2B_all.zarr + +launcher: + job_name: UNetViT3D_JOINT_NUCL + run_root: /hpc/projects/comp.micro/virtual_staining/models/cell_diff_vs_viscy/joint_ipsc_confocal_a549_mantis/nucl/unetvit3d + # Joint preloads two stores (iPSC + A549 pool) into /dev/shm; the default + # 256G cap is too tight (256G iPSC mem + ~50G A549 + worker peak OOMs). + # 512G is the smallest tier that fits joint preload + worker overhead. + sbatch: + mem: "512G" diff --git a/applications/dynacell/configs/evaluations/celldiff/run_eval_celldiff_a549.sh b/applications/dynacell/configs/evaluations/celldiff/run_eval_celldiff_a549.sh new file mode 100644 index 000000000..781945e1a --- /dev/null +++ b/applications/dynacell/configs/evaluations/celldiff/run_eval_celldiff_a549.sh @@ -0,0 +1,52 @@ +#!/usr/bin/env bash +# A549 CellDiff (iterative) evaluation — 4 organelles × 3 infection conditions. + +set -euo pipefail +ml uv +source ".envrc" + +PRED_ROOT=/hpc/projects/virtual_staining/training/dynacell/a549/predictions +GT_ROOT=/hpc/projects/virtual_staining/training/dynacell/a549/mantis_v1 +OUT_ROOT=/hpc/projects/virtual_staining/training/dynacell/a549/evaluations_with_embeddings + +V1_SPACING="[0.174,0.1494,0.1494]" +DYNACLR_CKPT='/hpc/projects/organelle_phenotyping/models/SEC61_TOMM20_G3BP1_Sensor/time_interval/dynaclr_gfp_rfp_Ph/organelle_sensor_phase_maxproj_ver3_150epochs/saved_checkpoints/epoch=104-step=53760.ckpt' + +run_eval () { + local target=$1 infection=$2 gt_basename=$3 \ + pred_zarr=$4 pred_chan=$5 gt_chan=$6 spacing=$7 + local save_dir="${OUT_ROOT}/eval_celldiff_iterative_${target}_${infection}" + echo ">>> celldiff_iterative ${target} ${infection}" + uv run dynacell evaluate \ + target_name="${target}" \ + io.pred_path="${PRED_ROOT}/${pred_zarr}" \ + io.pred_channel_name="${pred_chan}" \ + io.gt_path="${GT_ROOT}/test/${gt_basename}.ozx" \ + io.gt_channel_name="${gt_chan}" \ + 
io.cell_segmentation_path="${GT_ROOT}/test/${gt_basename}_seg_cleaned.zarr" \ + pixel_metrics.spacing="${spacing}" \ + save.save_dir="${save_dir}" \ + compute_feature_metrics=true \ + "feature_extractor.dynaclr.checkpoint='${DYNACLR_CKPT}'" \ + force_recompute.all=true +} + +# SEC61B (ER) +run_eval er mock SEC61B_mock sec61b_celldiff_iterative__sec61b_mock.zarr Structure_prediction Structure "${V1_SPACING}" +run_eval er denv SEC61B_DENV sec61b_celldiff_iterative__sec61b_denv.zarr Structure_prediction Structure "${V1_SPACING}" +run_eval er zikv SEC61B_ZIKV sec61b_celldiff_iterative__sec61b_zikv.zarr Structure_prediction Structure "${V1_SPACING}" + +# # CAAX (membrane) +# run_eval membrane mock CAAX_mock memb_celldiff_iterative_mock.zarr Membrane_prediction Membrane "${V1_SPACING}" +# run_eval membrane denv CAAX_DENV memb_celldiff_iterative_denv.zarr Membrane_prediction Membrane "${V1_SPACING}" +# run_eval membrane zikv CAAX_ZIKV memb_celldiff_iterative_zikv.zarr Membrane_prediction Membrane "${V1_SPACING}" + +# # H2B (nucleus) +# run_eval nucleus mock H2B_mock nucl_celldiff_iterative_mock.zarr Nuclei_prediction Nuclei "${V1_SPACING}" +# run_eval nucleus denv H2B_DENV nucl_celldiff_iterative_denv.zarr Nuclei_prediction Nuclei "${V1_SPACING}" +# run_eval nucleus zikv H2B_ZIKV nucl_celldiff_iterative_zikv.zarr Nuclei_prediction Nuclei "${V1_SPACING}" + +# TOMM20 (mitochondria) +run_eval mitochondria mock TOMM20_mock tomm20_celldiff_iterative__tomm20_mock.zarr Structure_prediction Structure "${V1_SPACING}" +run_eval mitochondria denv TOMM20_DENV tomm20_celldiff_iterative__tomm20_denv.zarr Structure_prediction Structure "${V1_SPACING}" +run_eval mitochondria zikv TOMM20_ZIKV tomm20_celldiff_iterative__tomm20_zikv.zarr Structure_prediction Structure "${V1_SPACING}" diff --git a/applications/dynacell/configs/evaluations/celldiff/run_eval_denoise.sh b/applications/dynacell/configs/evaluations/celldiff/run_eval_denoise.sh new file mode 100644 index 000000000..17839df8d --- 
/dev/null +++ b/applications/dynacell/configs/evaluations/celldiff/run_eval_denoise.sh @@ -0,0 +1,63 @@ +#!/usr/bin/env bash +# CELL-Diff denoise evaluation — 4 organelles on the iPSC test_cropped set. + +set -euo pipefail +ml uv + +source ".envrc" + +# CELL-Diff denoise — ER (SEC61B) +uv run dynacell evaluate \ + target_name=er \ + io.pred_path=/hpc/projects/virtual_staining/training/dynacell/ipsc/predictions/sec61b_celldiff_denoise.zarr \ + io.pred_channel_name=Structure_prediction \ + io.gt_path=/hpc/projects/virtual_staining/training/dynacell/ipsc/dataset_v4/test_cropped/SEC61B.zarr \ + io.gt_channel_name=Structure \ + io.cell_segmentation_path=/hpc/projects/virtual_staining/training/dynacell/ipsc/dataset_v4/test_cropped/SEC61B_segmented_cleaned.zarr \ + pixel_metrics.spacing=[0.29,0.108,0.108] \ + save.save_dir=/hpc/projects/virtual_staining/training/dynacell/ipsc/evaluations/eval_celldiff_denoise_sec61b \ + compute_feature_metrics=true \ + "feature_extractor.dynaclr.checkpoint='/hpc/projects/organelle_phenotyping/models/SEC61_TOMM20_G3BP1_Sensor/time_interval/dynaclr_gfp_rfp_Ph/organelle_sensor_phase_maxproj_ver3_150epochs/saved_checkpoints/epoch=104-step=53760.ckpt'" \ + force_recompute.all=true + +# CELL-Diff denoise — Membrane +uv run dynacell evaluate \ + target_name=membrane \ + io.pred_path=/hpc/projects/virtual_staining/training/dynacell/ipsc/predictions/memb_celldiff_denoise.zarr \ + io.pred_channel_name=Membrane_prediction \ + io.gt_path=/hpc/projects/virtual_staining/training/dynacell/ipsc/dataset_v4/test_cropped/cell.zarr \ + io.gt_channel_name=Membrane \ + io.cell_segmentation_path=/hpc/projects/virtual_staining/training/dynacell/ipsc/dataset_v4/test_cropped/cell_segmented_cleaned.zarr \ + pixel_metrics.spacing=[0.29,0.108,0.108] \ + save.save_dir=/hpc/projects/virtual_staining/training/dynacell/ipsc/evaluations/eval_celldiff_denoise_membrane \ + compute_feature_metrics=true \ +
"feature_extractor.dynaclr.checkpoint='/hpc/projects/organelle_phenotyping/models/SEC61_TOMM20_G3BP1_Sensor/time_interval/dynaclr_gfp_rfp_Ph/organelle_sensor_phase_maxproj_ver3_150epochs/saved_checkpoints/epoch=104-step=53760.ckpt'" \ + force_recompute.all=true + +# CELL-Diff denoise — Mitochondria (TOMM20) +uv run dynacell evaluate \ + target_name=mitochondria \ + io.pred_path=/hpc/projects/virtual_staining/training/dynacell/ipsc/predictions/tomm20_celldiff_denoise.zarr \ + io.pred_channel_name=Structure_prediction \ + io.gt_path=/hpc/projects/virtual_staining/training/dynacell/ipsc/dataset_v4/test_cropped/TOMM20.zarr \ + io.gt_channel_name=Structure \ + io.cell_segmentation_path=/hpc/projects/virtual_staining/training/dynacell/ipsc/dataset_v4/test_cropped/TOMM20_segmented_cleaned.zarr \ + pixel_metrics.spacing=[0.29,0.108,0.108] \ + save.save_dir=/hpc/projects/virtual_staining/training/dynacell/ipsc/evaluations/eval_celldiff_denoise_tomm20 \ + compute_feature_metrics=true \ + "feature_extractor.dynaclr.checkpoint='/hpc/projects/organelle_phenotyping/models/SEC61_TOMM20_G3BP1_Sensor/time_interval/dynaclr_gfp_rfp_Ph/organelle_sensor_phase_maxproj_ver3_150epochs/saved_checkpoints/epoch=104-step=53760.ckpt'" \ + force_recompute.all=true + +# CELL-Diff denoise — Nucleus +uv run dynacell evaluate \ + target_name=nucleus \ + io.pred_path=/hpc/projects/virtual_staining/training/dynacell/ipsc/predictions/nucl_celldiff_denoise.zarr \ + io.pred_channel_name=Nuclei_prediction \ + io.gt_path=/hpc/projects/virtual_staining/training/dynacell/ipsc/dataset_v4/test_cropped/cell.zarr \ + io.gt_channel_name=Nuclei \ + io.cell_segmentation_path=/hpc/projects/virtual_staining/training/dynacell/ipsc/dataset_v4/test_cropped/cell_segmented_cleaned.zarr \ + pixel_metrics.spacing=[0.29,0.108,0.108] \ + save.save_dir=/hpc/projects/virtual_staining/training/dynacell/ipsc/evaluations/eval_celldiff_denoise_nucleus \ + compute_feature_metrics=true \ + 
"feature_extractor.dynaclr.checkpoint='/hpc/projects/organelle_phenotyping/models/SEC61_TOMM20_G3BP1_Sensor/time_interval/dynaclr_gfp_rfp_Ph/organelle_sensor_phase_maxproj_ver3_150epochs/saved_checkpoints/epoch=104-step=53760.ckpt'" \ + force_recompute.all=true diff --git a/applications/dynacell/configs/evaluations/celldiff/run_eval_iterative.sh b/applications/dynacell/configs/evaluations/celldiff/run_eval_iterative.sh new file mode 100644 index 000000000..ce5c7645f --- /dev/null +++ b/applications/dynacell/configs/evaluations/celldiff/run_eval_iterative.sh @@ -0,0 +1,63 @@ +#!/usr/bin/env bash +# CELL-Diff iterative evaluation — 4 organelles on the iPSC test_cropped set. + +set -euo pipefail +ml uv + +source ".envrc" + +# CELL-Diff iterative — ER (SEC61B) +uv run dynacell evaluate \ + target_name=er \ + io.pred_path=/hpc/projects/virtual_staining/training/dynacell/ipsc/predictions/sec61b_celldiff_iterative.zarr \ + io.pred_channel_name=Structure_prediction \ + io.gt_path=/hpc/projects/virtual_staining/training/dynacell/ipsc/dataset_v4/test_cropped/SEC61B.zarr \ + io.gt_channel_name=Structure \ + io.cell_segmentation_path=/hpc/projects/virtual_staining/training/dynacell/ipsc/dataset_v4/test_cropped/SEC61B_segmented_cleaned.zarr \ + pixel_metrics.spacing=[0.29,0.108,0.108] \ + save.save_dir=/hpc/projects/virtual_staining/training/dynacell/ipsc/evaluations/eval_celldiff_iterative_sec61b \ + compute_feature_metrics=true \ + "feature_extractor.dynaclr.checkpoint='/hpc/projects/organelle_phenotyping/models/SEC61_TOMM20_G3BP1_Sensor/time_interval/dynaclr_gfp_rfp_Ph/organelle_sensor_phase_maxproj_ver3_150epochs/saved_checkpoints/epoch=104-step=53760.ckpt'" \ + force_recompute.all=true + +# CELL-Diff iterative — Membrane +uv run dynacell evaluate \ + target_name=membrane \ + io.pred_path=/hpc/projects/virtual_staining/training/dynacell/ipsc/predictions/memb_celldiff_iterative.zarr \ + io.pred_channel_name=Membrane_prediction \ + io.gt_path=/hpc/projects/virtual_staining/training/dynacell/ipsc/dataset_v4/test_cropped/cell.zarr \ + io.gt_channel_name=Membrane \ +
io.cell_segmentation_path=/hpc/projects/virtual_staining/training/dynacell/ipsc/dataset_v4/test_cropped/cell_segmented_cleaned.zarr \ + pixel_metrics.spacing=[0.29,0.108,0.108] \ + save.save_dir=/hpc/projects/virtual_staining/training/dynacell/ipsc/evaluations/eval_celldiff_iterative_membrane \ + compute_feature_metrics=true \ + "feature_extractor.dynaclr.checkpoint='/hpc/projects/organelle_phenotyping/models/SEC61_TOMM20_G3BP1_Sensor/time_interval/dynaclr_gfp_rfp_Ph/organelle_sensor_phase_maxproj_ver3_150epochs/saved_checkpoints/epoch=104-step=53760.ckpt'" \ + force_recompute.all=true + +# CELL-Diff iterative — Mitochondria (TOMM20) +uv run dynacell evaluate \ + target_name=mitochondria \ + io.pred_path=/hpc/projects/virtual_staining/training/dynacell/ipsc/predictions/tomm20_celldiff_iterative.zarr \ + io.pred_channel_name=Structure_prediction \ + io.gt_path=/hpc/projects/virtual_staining/training/dynacell/ipsc/dataset_v4/test_cropped/TOMM20.zarr \ + io.gt_channel_name=Structure \ + io.cell_segmentation_path=/hpc/projects/virtual_staining/training/dynacell/ipsc/dataset_v4/test_cropped/TOMM20_segmented_cleaned.zarr \ + pixel_metrics.spacing=[0.29,0.108,0.108] \ + save.save_dir=/hpc/projects/virtual_staining/training/dynacell/ipsc/evaluations/eval_celldiff_iterative_tomm20 \ + compute_feature_metrics=true \ + "feature_extractor.dynaclr.checkpoint='/hpc/projects/organelle_phenotyping/models/SEC61_TOMM20_G3BP1_Sensor/time_interval/dynaclr_gfp_rfp_Ph/organelle_sensor_phase_maxproj_ver3_150epochs/saved_checkpoints/epoch=104-step=53760.ckpt'" \ + force_recompute.all=true + +# CELL-Diff iterative — Nucleus +uv run dynacell evaluate \ + target_name=nucleus \ + io.pred_path=/hpc/projects/virtual_staining/training/dynacell/ipsc/predictions/nucl_celldiff_iterative.zarr \ + io.pred_channel_name=Nuclei_prediction \ + io.gt_path=/hpc/projects/virtual_staining/training/dynacell/ipsc/dataset_v4/test_cropped/cell.zarr \ + io.gt_channel_name=Nuclei \ + 
io.cell_segmentation_path=/hpc/projects/virtual_staining/training/dynacell/ipsc/dataset_v4/test_cropped/cell_segmented_cleaned.zarr \ + pixel_metrics.spacing=[0.29,0.108,0.108] \ + save.save_dir=/hpc/projects/virtual_staining/training/dynacell/ipsc/evaluations/eval_celldiff_iterative_nucleus \ + compute_feature_metrics=true \ + "feature_extractor.dynaclr.checkpoint='/hpc/projects/organelle_phenotyping/models/SEC61_TOMM20_G3BP1_Sensor/time_interval/dynaclr_gfp_rfp_Ph/organelle_sensor_phase_maxproj_ver3_150epochs/saved_checkpoints/epoch=104-step=53760.ckpt'" \ + force_recompute.all=true diff --git a/applications/dynacell/configs/evaluations/celldiff/run_eval_mix_trained_a549_pred_denv.sh b/applications/dynacell/configs/evaluations/celldiff/run_eval_mix_trained_a549_pred_denv.sh new file mode 100644 index 000000000..c7cfcc9de --- /dev/null +++ b/applications/dynacell/configs/evaluations/celldiff/run_eval_mix_trained_a549_pred_denv.sh @@ -0,0 +1,21 @@ +#!/usr/bin/env bash +# CellDiff joint (iPSC+A549) model — membrane prediction on A549 DENV test set. 
+ +set -euo pipefail +ml uv +source ".envrc" + +DYNACLR_CKPT='/hpc/projects/organelle_phenotyping/models/SEC61_TOMM20_G3BP1_Sensor/time_interval/dynaclr_gfp_rfp_Ph/organelle_sensor_phase_maxproj_ver3_150epochs/saved_checkpoints/epoch=104-step=53760.ckpt' + +uv run dynacell evaluate \ + target_name=membrane \ + io.pred_path=/hpc/projects/virtual_staining/training/dynacell/a549/joint_predictions/memb_celldiff_denv.zarr \ + io.pred_channel_name=Membrane_prediction \ + io.gt_path=/hpc/projects/virtual_staining/training/dynacell/a549/mantis_v1/test/CAAX_DENV.ozx \ + io.gt_channel_name=Membrane \ + io.cell_segmentation_path=/hpc/projects/virtual_staining/training/dynacell/a549/mantis_v1/test/CAAX_DENV_seg_cleaned.zarr \ + pixel_metrics.spacing=[0.174,0.1494,0.1494] \ + save.save_dir=/hpc/projects/virtual_staining/training/dynacell/a549/joint_evaluations/eval_celldiff_joint_membrane_denv \ + compute_feature_metrics=true \ + "feature_extractor.dynaclr.checkpoint='${DYNACLR_CKPT}'" \ + force_recompute.all=true diff --git a/applications/dynacell/configs/evaluations/celldiff/run_eval_mix_trained_a549_pred_mock.sh b/applications/dynacell/configs/evaluations/celldiff/run_eval_mix_trained_a549_pred_mock.sh new file mode 100644 index 000000000..6eb6ded19 --- /dev/null +++ b/applications/dynacell/configs/evaluations/celldiff/run_eval_mix_trained_a549_pred_mock.sh @@ -0,0 +1,21 @@ +#!/usr/bin/env bash +# CellDiff joint (iPSC+A549) model — membrane prediction on A549 mock test set. 
+ +set -euo pipefail +ml uv +source ".envrc" + +DYNACLR_CKPT='/hpc/projects/organelle_phenotyping/models/SEC61_TOMM20_G3BP1_Sensor/time_interval/dynaclr_gfp_rfp_Ph/organelle_sensor_phase_maxproj_ver3_150epochs/saved_checkpoints/epoch=104-step=53760.ckpt' + +uv run dynacell evaluate \ + target_name=membrane \ + io.pred_path=/hpc/projects/virtual_staining/training/dynacell/a549/joint_predictions/memb_celldiff_mock.zarr \ + io.pred_channel_name=Membrane_prediction \ + io.gt_path=/hpc/projects/virtual_staining/training/dynacell/a549/mantis_v1/test/CAAX_mock.ozx \ + io.gt_channel_name=Membrane \ + io.cell_segmentation_path=/hpc/projects/virtual_staining/training/dynacell/a549/mantis_v1/test/CAAX_mock_seg_cleaned.zarr \ + pixel_metrics.spacing=[0.174,0.1494,0.1494] \ + save.save_dir=/hpc/projects/virtual_staining/training/dynacell/a549/joint_evaluations/eval_celldiff_joint_membrane_mock \ + compute_feature_metrics=true \ + "feature_extractor.dynaclr.checkpoint='${DYNACLR_CKPT}'" \ + force_recompute.all=true diff --git a/applications/dynacell/configs/evaluations/celldiff/run_eval_mix_trained_a549_pred_zikv.sh b/applications/dynacell/configs/evaluations/celldiff/run_eval_mix_trained_a549_pred_zikv.sh new file mode 100644 index 000000000..5f9376c36 --- /dev/null +++ b/applications/dynacell/configs/evaluations/celldiff/run_eval_mix_trained_a549_pred_zikv.sh @@ -0,0 +1,21 @@ +#!/usr/bin/env bash +# CellDiff joint (iPSC+A549) model — membrane prediction on A549 ZIKV test set. 
+ +set -euo pipefail +ml uv +source ".envrc" + +DYNACLR_CKPT='/hpc/projects/organelle_phenotyping/models/SEC61_TOMM20_G3BP1_Sensor/time_interval/dynaclr_gfp_rfp_Ph/organelle_sensor_phase_maxproj_ver3_150epochs/saved_checkpoints/epoch=104-step=53760.ckpt' + +uv run dynacell evaluate \ + target_name=membrane \ + io.pred_path=/hpc/projects/virtual_staining/training/dynacell/a549/joint_predictions/memb_celldiff_zikv.zarr \ + io.pred_channel_name=Membrane_prediction \ + io.gt_path=/hpc/projects/virtual_staining/training/dynacell/a549/mantis_v1/test/CAAX_ZIKV.ozx \ + io.gt_channel_name=Membrane \ + io.cell_segmentation_path=/hpc/projects/virtual_staining/training/dynacell/a549/mantis_v1/test/CAAX_ZIKV_seg_cleaned.zarr \ + pixel_metrics.spacing=[0.174,0.1494,0.1494] \ + save.save_dir=/hpc/projects/virtual_staining/training/dynacell/a549/joint_evaluations/eval_celldiff_joint_membrane_zikv \ + compute_feature_metrics=true \ + "feature_extractor.dynaclr.checkpoint='${DYNACLR_CKPT}'" \ + force_recompute.all=true diff --git a/applications/dynacell/configs/evaluations/celldiff/run_eval_mix_trained_ipsc_pred.sh b/applications/dynacell/configs/evaluations/celldiff/run_eval_mix_trained_ipsc_pred.sh new file mode 100644 index 000000000..73da2af79 --- /dev/null +++ b/applications/dynacell/configs/evaluations/celldiff/run_eval_mix_trained_ipsc_pred.sh @@ -0,0 +1,22 @@ +#!/usr/bin/env bash +# CellDiff joint (iPSC+A549) model — membrane prediction on iPSC test set. 
+ +set -euo pipefail +ml uv +source ".envrc" + +DYNACLR_CKPT='/hpc/projects/organelle_phenotyping/models/SEC61_TOMM20_G3BP1_Sensor/time_interval/dynaclr_gfp_rfp_Ph/organelle_sensor_phase_maxproj_ver3_150epochs/saved_checkpoints/epoch=104-step=53760.ckpt' + +# Membrane +uv run dynacell evaluate \ + target_name=membrane \ + io.pred_path=/hpc/projects/virtual_staining/training/dynacell/ipsc/joint_predictions/memb_celldiff.zarr \ + io.pred_channel_name=Membrane_prediction \ + io.gt_path=/hpc/projects/virtual_staining/training/dynacell/ipsc/dataset_v4/test_cropped/cell.zarr \ + io.gt_channel_name=Membrane \ + io.cell_segmentation_path=/hpc/projects/virtual_staining/training/dynacell/ipsc/dataset_v4/test_cropped/cell_segmented_cleaned.zarr \ + pixel_metrics.spacing=[0.29,0.108,0.108] \ + save.save_dir=/hpc/projects/virtual_staining/training/dynacell/ipsc/joint_evaluations/eval_celldiff_joint_membrane \ + compute_feature_metrics=true \ + "feature_extractor.dynaclr.checkpoint='${DYNACLR_CKPT}'" \ + force_recompute.all=true diff --git a/applications/dynacell/configs/evaluations/celldiff/run_eval_sliding_window.sh b/applications/dynacell/configs/evaluations/celldiff/run_eval_sliding_window.sh new file mode 100644 index 000000000..f5fcf8141 --- /dev/null +++ b/applications/dynacell/configs/evaluations/celldiff/run_eval_sliding_window.sh @@ -0,0 +1,63 @@ +#!/usr/bin/env bash +# CELL-Diff sliding-window evaluation — 4 organelles on the iPSC test_cropped set. + +set -euo pipefail +ml uv + +source ".envrc" + +# CELL-Diff sliding window — ER (SEC61B) +uv run dynacell evaluate \ + target_name=er \ + io.pred_path=/hpc/projects/virtual_staining/training/dynacell/ipsc/predictions/sec61b_celldiff_sliding_window.zarr \ + io.pred_channel_name=Structure_prediction \ + io.gt_path=/hpc/projects/virtual_staining/training/dynacell/ipsc/dataset_v4/test_cropped/SEC61B.zarr \ + io.gt_channel_name=Structure \ + io.cell_segmentation_path=/hpc/projects/virtual_staining/training/dynacell/ipsc/dataset_v4/test_cropped/SEC61B_segmented_cleaned.zarr \ + pixel_metrics.spacing=[0.29,0.108,0.108] \ + 
save.save_dir=/hpc/projects/virtual_staining/training/dynacell/ipsc/evaluations/eval_celldiff_sliding_window_sec61b \ + compute_feature_metrics=true \ + "feature_extractor.dynaclr.checkpoint='/hpc/projects/organelle_phenotyping/models/SEC61_TOMM20_G3BP1_Sensor/time_interval/dynaclr_gfp_rfp_Ph/organelle_sensor_phase_maxproj_ver3_150epochs/saved_checkpoints/epoch=104-step=53760.ckpt'" \ + force_recompute.all=true + +# CELL-Diff sliding window — Membrane +uv run dynacell evaluate \ + target_name=membrane \ + io.pred_path=/hpc/projects/virtual_staining/training/dynacell/ipsc/predictions/memb_celldiff_sliding_window.zarr \ + io.pred_channel_name=Membrane_prediction \ + io.gt_path=/hpc/projects/virtual_staining/training/dynacell/ipsc/dataset_v4/test_cropped/cell.zarr \ + io.gt_channel_name=Membrane \ + io.cell_segmentation_path=/hpc/projects/virtual_staining/training/dynacell/ipsc/dataset_v4/test_cropped/cell_segmented_cleaned.zarr \ + pixel_metrics.spacing=[0.29,0.108,0.108] \ + save.save_dir=/hpc/projects/virtual_staining/training/dynacell/ipsc/evaluations/eval_celldiff_sliding_window_membrane \ + compute_feature_metrics=true \ + "feature_extractor.dynaclr.checkpoint='/hpc/projects/organelle_phenotyping/models/SEC61_TOMM20_G3BP1_Sensor/time_interval/dynaclr_gfp_rfp_Ph/organelle_sensor_phase_maxproj_ver3_150epochs/saved_checkpoints/epoch=104-step=53760.ckpt'" \ + force_recompute.all=true + +# CELL-Diff sliding window — Mitochondria (TOMM20) +uv run dynacell evaluate \ + target_name=mitochondria \ + io.pred_path=/hpc/projects/virtual_staining/training/dynacell/ipsc/predictions/tomm20_celldiff_sliding_window.zarr \ + io.pred_channel_name=Structure_prediction \ + io.gt_path=/hpc/projects/virtual_staining/training/dynacell/ipsc/dataset_v4/test_cropped/TOMM20.zarr \ + io.gt_channel_name=Structure \ + io.cell_segmentation_path=/hpc/projects/virtual_staining/training/dynacell/ipsc/dataset_v4/test_cropped/TOMM20_segmented_cleaned.zarr \ + 
pixel_metrics.spacing=[0.29,0.108,0.108] \ + save.save_dir=/hpc/projects/virtual_staining/training/dynacell/ipsc/evaluations/eval_celldiff_sliding_window_tomm20 \ + compute_feature_metrics=true \ + "feature_extractor.dynaclr.checkpoint='/hpc/projects/organelle_phenotyping/models/SEC61_TOMM20_G3BP1_Sensor/time_interval/dynaclr_gfp_rfp_Ph/organelle_sensor_phase_maxproj_ver3_150epochs/saved_checkpoints/epoch=104-step=53760.ckpt'" \ + force_recompute.all=true + +# CELL-Diff sliding window — Nucleus +uv run dynacell evaluate \ + target_name=nucleus \ + io.pred_path=/hpc/projects/virtual_staining/training/dynacell/ipsc/predictions/nucl_celldiff_sliding_window.zarr \ + io.pred_channel_name=Nuclei_prediction \ + io.gt_path=/hpc/projects/virtual_staining/training/dynacell/ipsc/dataset_v4/test_cropped/cell.zarr \ + io.gt_channel_name=Nuclei \ + io.cell_segmentation_path=/hpc/projects/virtual_staining/training/dynacell/ipsc/dataset_v4/test_cropped/cell_segmented_cleaned.zarr \ + pixel_metrics.spacing=[0.29,0.108,0.108] \ + save.save_dir=/hpc/projects/virtual_staining/training/dynacell/ipsc/evaluations/eval_celldiff_sliding_window_nucleus \ + compute_feature_metrics=true \ + "feature_extractor.dynaclr.checkpoint='/hpc/projects/organelle_phenotyping/models/SEC61_TOMM20_G3BP1_Sensor/time_interval/dynaclr_gfp_rfp_Ph/organelle_sensor_phase_maxproj_ver3_150epochs/saved_checkpoints/epoch=104-step=53760.ckpt'" \ + force_recompute.all=true diff --git a/applications/dynacell/configs/evaluations/fnet3d/run_a549_trained_a549.sh b/applications/dynacell/configs/evaluations/fnet3d/run_a549_trained_a549.sh new file mode 100644 index 000000000..de6181aeb --- /dev/null +++ b/applications/dynacell/configs/evaluations/fnet3d/run_a549_trained_a549.sh @@ -0,0 +1,42 @@ +#!/usr/bin/env bash +# FNet3D A549-trained — evaluate on A549 test set (nucleus + membrane × 3 infections). 
+ +set -euo pipefail +ml uv +source ".envrc" + +PRED_ROOT=/hpc/projects/virtual_staining/training/dynacell/a549/predictions +GT_ROOT=/hpc/projects/virtual_staining/training/dynacell/a549/mantis_v1 +OUT_ROOT=/hpc/projects/virtual_staining/training/dynacell/a549/evaluations_a549trained + +V1_SPACING="[0.174,0.1494,0.1494]" +DYNACLR_CKPT='/hpc/projects/organelle_phenotyping/models/SEC61_TOMM20_G3BP1_Sensor/time_interval/dynaclr_gfp_rfp_Ph/organelle_sensor_phase_maxproj_ver3_150epochs/saved_checkpoints/epoch=104-step=53760.ckpt' + +run_eval () { + local target=$1 infection=$2 gt_basename=$3 \ + pred_zarr=$4 pred_chan=$5 gt_chan=$6 spacing=$7 + local save_dir="${OUT_ROOT}/eval_fnet3d_a549trained_${target}_${infection}" + echo ">>> fnet3d a549trained ${target} ${infection}" + uv run dynacell evaluate \ + target_name="${target}" \ + io.pred_path="${PRED_ROOT}/${pred_zarr}" \ + io.pred_channel_name="${pred_chan}" \ + io.gt_path="${GT_ROOT}/test/${gt_basename}.ozx" \ + io.gt_channel_name="${gt_chan}" \ + io.cell_segmentation_path="${GT_ROOT}/test/${gt_basename}_seg_cleaned.zarr" \ + pixel_metrics.spacing="${spacing}" \ + save.save_dir="${save_dir}" \ + compute_feature_metrics=true \ + "feature_extractor.dynaclr.checkpoint='${DYNACLR_CKPT}'" \ + force_recompute.all=true +} + +# H2B (nucleus) +run_eval nucleus mock H2B_mock nucl_fnet3d_paper_a549trained_mock.zarr Nuclei_prediction Nuclei "${V1_SPACING}" +run_eval nucleus denv H2B_DENV nucl_fnet3d_paper_a549trained_denv.zarr Nuclei_prediction Nuclei "${V1_SPACING}" +run_eval nucleus zikv H2B_ZIKV nucl_fnet3d_paper_a549trained_zikv.zarr Nuclei_prediction Nuclei "${V1_SPACING}" + +# CAAX (membrane) +run_eval membrane mock CAAX_mock memb_fnet3d_paper_a549trained_mock.zarr Membrane_prediction Membrane "${V1_SPACING}" +run_eval membrane denv CAAX_DENV memb_fnet3d_paper_a549trained_denv.zarr Membrane_prediction Membrane "${V1_SPACING}" +run_eval membrane zikv CAAX_ZIKV memb_fnet3d_paper_a549trained_zikv.zarr Membrane_prediction 
Membrane "${V1_SPACING}" diff --git a/applications/dynacell/configs/evaluations/fnet3d/run_a549_trained_ipsc.sh b/applications/dynacell/configs/evaluations/fnet3d/run_a549_trained_ipsc.sh new file mode 100644 index 000000000..03996ecac --- /dev/null +++ b/applications/dynacell/configs/evaluations/fnet3d/run_a549_trained_ipsc.sh @@ -0,0 +1,43 @@ +#!/usr/bin/env bash +# FNet3D A549-trained — evaluate on iPSC test set (nucleus + membrane). + +set -euo pipefail +ml uv +source ".envrc" + +PRED_ROOT=/hpc/projects/virtual_staining/training/dynacell/ipsc/predictions +GT_ROOT=/hpc/projects/virtual_staining/training/dynacell/ipsc/dataset_v4/test_cropped +OUT_ROOT=/hpc/projects/virtual_staining/training/dynacell/ipsc/evaluations_a549trained + +IPSC_SPACING="[0.29,0.108,0.108]" +DYNACLR_CKPT='/hpc/projects/organelle_phenotyping/models/SEC61_TOMM20_G3BP1_Sensor/time_interval/dynaclr_gfp_rfp_Ph/organelle_sensor_phase_maxproj_ver3_150epochs/saved_checkpoints/epoch=104-step=53760.ckpt' + +# Nucleus (H2B) +echo ">>> fnet3d a549trained nucleus (iPSC)" +uv run dynacell evaluate \ + target_name=nucleus \ + io.pred_path="${PRED_ROOT}/nucl_fnet3d_paper_a549trained.zarr" \ + io.pred_channel_name=Nuclei_prediction \ + io.gt_path="${GT_ROOT}/cell.zarr" \ + io.gt_channel_name=Nuclei \ + io.cell_segmentation_path="${GT_ROOT}/cell_segmented_cleaned.zarr" \ + pixel_metrics.spacing="${IPSC_SPACING}" \ + save.save_dir="${OUT_ROOT}/eval_fnet3d_a549trained_nucleus" \ + compute_feature_metrics=true \ + "feature_extractor.dynaclr.checkpoint='${DYNACLR_CKPT}'" \ + force_recompute.all=true + +# Membrane (CAAX) +echo ">>> fnet3d a549trained membrane (iPSC)" +uv run dynacell evaluate \ + target_name=membrane \ + io.pred_path="${PRED_ROOT}/memb_fnet3d_paper_a549trained.zarr" \ + io.pred_channel_name=Membrane_prediction \ + io.gt_path="${GT_ROOT}/cell.zarr" \ + io.gt_channel_name=Membrane \ + io.cell_segmentation_path="${GT_ROOT}/cell_segmented_cleaned.zarr" \ + pixel_metrics.spacing="${IPSC_SPACING}" \ + 
save.save_dir="${OUT_ROOT}/eval_fnet3d_a549trained_membrane" \ + compute_feature_metrics=true \ + "feature_extractor.dynaclr.checkpoint='${DYNACLR_CKPT}'" \ + force_recompute.all=true diff --git a/applications/dynacell/configs/evaluations/fnet3d/run_eval_fnet3d.sh b/applications/dynacell/configs/evaluations/fnet3d/run_eval_fnet3d.sh new file mode 100644 index 000000000..fa59cab8c --- /dev/null +++ b/applications/dynacell/configs/evaluations/fnet3d/run_eval_fnet3d.sh @@ -0,0 +1,61 @@ +#!/usr/bin/env bash +set -euo pipefail +ml uv + +source ".envrc" + +# FNet3D — ER (SEC61B) +uv run dynacell evaluate \ + target_name=er \ + io.pred_path=/hpc/projects/virtual_staining/training/dynacell/ipsc/predictions/sec61b_fnet3d_paper.zarr \ + io.pred_channel_name=Structure_prediction \ + io.gt_path=/hpc/projects/virtual_staining/training/dynacell/ipsc/dataset_v4/test_cropped/SEC61B.zarr \ + io.gt_channel_name=Structure \ + io.cell_segmentation_path=/hpc/projects/virtual_staining/training/dynacell/ipsc/dataset_v4/test_cropped/SEC61B_segmented_cleaned.zarr \ + pixel_metrics.spacing=[0.29,0.108,0.108] \ + save.save_dir=/hpc/projects/virtual_staining/training/dynacell/ipsc/evaluations/eval_fnet3d_sec61b \ + compute_feature_metrics=true \ + "feature_extractor.dynaclr.checkpoint='/hpc/projects/organelle_phenotyping/models/SEC61_TOMM20_G3BP1_Sensor/time_interval/dynaclr_gfp_rfp_Ph/organelle_sensor_phase_maxproj_ver3_150epochs/saved_checkpoints/epoch=104-step=53760.ckpt'" \ + force_recompute.all=true + +# FNet3D — Membrane +uv run dynacell evaluate \ + target_name=membrane \ + io.pred_path=/hpc/projects/virtual_staining/training/dynacell/ipsc/predictions/memb_fnet3d_paper.zarr \ + io.pred_channel_name=Membrane_prediction \ + io.gt_path=/hpc/projects/virtual_staining/training/dynacell/ipsc/dataset_v4/test_cropped/cell.zarr \ + io.gt_channel_name=Membrane \ + io.cell_segmentation_path=/hpc/projects/virtual_staining/training/dynacell/ipsc/dataset_v4/test_cropped/cell_segmented_cleaned.zarr \ + 
pixel_metrics.spacing=[0.29,0.108,0.108] \ + save.save_dir=/hpc/projects/virtual_staining/training/dynacell/ipsc/evaluations/eval_fnet3d_membrane \ + compute_feature_metrics=true \ + "feature_extractor.dynaclr.checkpoint='/hpc/projects/organelle_phenotyping/models/SEC61_TOMM20_G3BP1_Sensor/time_interval/dynaclr_gfp_rfp_Ph/organelle_sensor_phase_maxproj_ver3_150epochs/saved_checkpoints/epoch=104-step=53760.ckpt'" \ + force_recompute.all=true + +# FNet3D — Mitochondria (TOMM20) +uv run dynacell evaluate \ + target_name=mitochondria \ + io.pred_path=/hpc/projects/virtual_staining/training/dynacell/ipsc/predictions/tomm20_fnet3d_paper.zarr \ + io.pred_channel_name=Structure_prediction \ + io.gt_path=/hpc/projects/virtual_staining/training/dynacell/ipsc/dataset_v4/test_cropped/TOMM20.zarr \ + io.gt_channel_name=Structure \ + io.cell_segmentation_path=/hpc/projects/virtual_staining/training/dynacell/ipsc/dataset_v4/test_cropped/TOMM20_segmented_cleaned.zarr \ + pixel_metrics.spacing=[0.29,0.108,0.108] \ + save.save_dir=/hpc/projects/virtual_staining/training/dynacell/ipsc/evaluations/eval_fnet3d_tomm20 \ + compute_feature_metrics=true \ + "feature_extractor.dynaclr.checkpoint='/hpc/projects/organelle_phenotyping/models/SEC61_TOMM20_G3BP1_Sensor/time_interval/dynaclr_gfp_rfp_Ph/organelle_sensor_phase_maxproj_ver3_150epochs/saved_checkpoints/epoch=104-step=53760.ckpt'" \ + force_recompute.all=true + +# FNet3D — Nucleus +uv run dynacell evaluate \ + target_name=nucleus \ + io.pred_path=/hpc/projects/virtual_staining/training/dynacell/ipsc/predictions/nucl_fnet3d_paper.zarr \ + io.pred_channel_name=Nuclei_prediction \ + io.gt_path=/hpc/projects/virtual_staining/training/dynacell/ipsc/dataset_v4/test_cropped/cell.zarr \ + io.gt_channel_name=Nuclei \ + io.cell_segmentation_path=/hpc/projects/virtual_staining/training/dynacell/ipsc/dataset_v4/test_cropped/cell_segmented_cleaned.zarr \ + pixel_metrics.spacing=[0.29,0.108,0.108] \ + 
save.save_dir=/hpc/projects/virtual_staining/training/dynacell/ipsc/evaluations/eval_fnet3d_nucleus \ + compute_feature_metrics=true \ + "feature_extractor.dynaclr.checkpoint='/hpc/projects/organelle_phenotyping/models/SEC61_TOMM20_G3BP1_Sensor/time_interval/dynaclr_gfp_rfp_Ph/organelle_sensor_phase_maxproj_ver3_150epochs/saved_checkpoints/epoch=104-step=53760.ckpt'" \ + force_recompute.all=true diff --git a/applications/dynacell/configs/evaluations/fnet3d/run_eval_fnet3d_a549.sh b/applications/dynacell/configs/evaluations/fnet3d/run_eval_fnet3d_a549.sh new file mode 100755 index 000000000..38bb128bc --- /dev/null +++ b/applications/dynacell/configs/evaluations/fnet3d/run_eval_fnet3d_a549.sh @@ -0,0 +1,52 @@ +#!/usr/bin/env bash +# A549 FNet3D evaluation — 4 organelles × 3 infection conditions. + +set -euo pipefail +ml uv +source ".envrc" + +PRED_ROOT=/hpc/projects/virtual_staining/training/dynacell/a549/predictions +GT_ROOT=/hpc/projects/virtual_staining/training/dynacell/a549/mantis_v1 +OUT_ROOT=/hpc/projects/virtual_staining/training/dynacell/a549/evaluations_with_embeddings + +V1_SPACING="[0.174,0.1494,0.1494]" +DYNACLR_CKPT='/hpc/projects/organelle_phenotyping/models/SEC61_TOMM20_G3BP1_Sensor/time_interval/dynaclr_gfp_rfp_Ph/organelle_sensor_phase_maxproj_ver3_150epochs/saved_checkpoints/epoch=104-step=53760.ckpt' + +run_eval () { + local target=$1 infection=$2 gt_basename=$3 \ + pred_zarr=$4 pred_chan=$5 gt_chan=$6 spacing=$7 + local save_dir="${OUT_ROOT}/eval_fnet3d_${target}_${infection}" + echo ">>> fnet3d ${target} ${infection}" + uv run dynacell evaluate \ + target_name="${target}" \ + io.pred_path="${PRED_ROOT}/${pred_zarr}" \ + io.pred_channel_name="${pred_chan}" \ + io.gt_path="${GT_ROOT}/test/${gt_basename}.ozx" \ + io.gt_channel_name="${gt_chan}" \ + io.cell_segmentation_path="${GT_ROOT}/test/${gt_basename}_seg_cleaned.zarr" \ + pixel_metrics.spacing="${spacing}" \ + save.save_dir="${save_dir}" \ + compute_feature_metrics=true \ + 
"feature_extractor.dynaclr.checkpoint='${DYNACLR_CKPT}'" \ + force_recompute.all=true +} + +# SEC61B (ER) +run_eval er mock SEC61B_mock sec61b_fnet3d_paper__sec61b_mock.zarr Structure_prediction Structure "${V1_SPACING}" +run_eval er denv SEC61B_DENV sec61b_fnet3d_paper__sec61b_denv.zarr Structure_prediction Structure "${V1_SPACING}" +run_eval er zikv SEC61B_ZIKV sec61b_fnet3d_paper__sec61b_zikv.zarr Structure_prediction Structure "${V1_SPACING}" + +# CAAX (membrane) +# run_eval membrane mock CAAX_mock memb_fnet3d_paper_mock.zarr Membrane_prediction Membrane "${V1_SPACING}" +# run_eval membrane denv CAAX_DENV memb_fnet3d_paper_denv.zarr Membrane_prediction Membrane "${V1_SPACING}" +# run_eval membrane zikv CAAX_ZIKV memb_fnet3d_paper_zikv.zarr Membrane_prediction Membrane "${V1_SPACING}" + +# H2B (nucleus) +# run_eval nucleus mock H2B_mock nucl_fnet3d_paper_mock.zarr Nuclei_prediction Nuclei "${V1_SPACING}" +# run_eval nucleus denv H2B_DENV nucl_fnet3d_paper_denv.zarr Nuclei_prediction Nuclei "${V1_SPACING}" +# run_eval nucleus zikv H2B_ZIKV nucl_fnet3d_paper_zikv.zarr Nuclei_prediction Nuclei "${V1_SPACING}" + +# TOMM20 (mitochondria) +run_eval mitochondria mock TOMM20_mock tomm20_fnet3d_paper__tomm20_mock.zarr Structure_prediction Structure "${V1_SPACING}" +run_eval mitochondria denv TOMM20_DENV tomm20_fnet3d_paper__tomm20_denv.zarr Structure_prediction Structure "${V1_SPACING}" +run_eval mitochondria zikv TOMM20_ZIKV tomm20_fnet3d_paper__tomm20_zikv.zarr Structure_prediction Structure "${V1_SPACING}" diff --git a/applications/dynacell/configs/evaluations/fnet3d/run_eval_fnet3d_jointtrained_a549.sh b/applications/dynacell/configs/evaluations/fnet3d/run_eval_fnet3d_jointtrained_a549.sh new file mode 100755 index 000000000..6e357c9d3 --- /dev/null +++ b/applications/dynacell/configs/evaluations/fnet3d/run_eval_fnet3d_jointtrained_a549.sh @@ -0,0 +1,38 @@ +#!/usr/bin/env bash +# FNet3D joint-trained (iPSC + A549 mantis) — evaluate on A549 test set (nucleus × 3 
infections). +# Membrane already evaluated under joint_evaluations/eval_fnet3d_joint_membrane_<infection>. + +set -euo pipefail +ml uv +source ".envrc" + +PRED_ROOT=/hpc/projects/virtual_staining/training/dynacell/a549/predictions +GT_ROOT=/hpc/projects/virtual_staining/training/dynacell/a549/mantis_v1 +OUT_ROOT=/hpc/projects/virtual_staining/training/dynacell/a549/joint_evaluations + +V1_SPACING="[0.174,0.1494,0.1494]" +DYNACLR_CKPT='/hpc/projects/organelle_phenotyping/models/SEC61_TOMM20_G3BP1_Sensor/time_interval/dynaclr_gfp_rfp_Ph/organelle_sensor_phase_maxproj_ver3_150epochs/saved_checkpoints/epoch=104-step=53760.ckpt' + +mkdir -p "${OUT_ROOT}" + +run_eval () { + local infection=$1 gt_basename=$2 pred_zarr=$3 + local save_dir="${OUT_ROOT}/eval_fnet3d_joint_nucleus_${infection}" + echo ">>> fnet3d joint nucleus ${infection}" + uv run dynacell evaluate \ + target_name=nucleus \ + io.pred_path="${PRED_ROOT}/${pred_zarr}" \ + io.pred_channel_name=Nuclei_prediction \ + io.gt_path="${GT_ROOT}/test/${gt_basename}.ozx" \ + io.gt_channel_name=Nuclei \ + io.cell_segmentation_path="${GT_ROOT}/test/${gt_basename}_seg_cleaned.zarr" \ + pixel_metrics.spacing="${V1_SPACING}" \ + save.save_dir="${save_dir}" \ + compute_feature_metrics=true \ + "feature_extractor.dynaclr.checkpoint='${DYNACLR_CKPT}'" \ + force_recompute.all=true +} + +run_eval mock H2B_mock nucl_fnet3d_paper_jointtrained_mock.zarr +run_eval denv H2B_DENV nucl_fnet3d_paper_jointtrained_denv.zarr +run_eval zikv H2B_ZIKV nucl_fnet3d_paper_jointtrained_zikv.zarr diff --git a/applications/dynacell/configs/evaluations/fnet3d/run_eval_fnet3d_jointtrained_ipsc.sh b/applications/dynacell/configs/evaluations/fnet3d/run_eval_fnet3d_jointtrained_ipsc.sh new file mode 100755 index 000000000..d48e232e3 --- /dev/null +++ b/applications/dynacell/configs/evaluations/fnet3d/run_eval_fnet3d_jointtrained_ipsc.sh @@ -0,0 +1,30 @@ +#!/usr/bin/env bash +# FNet3D joint-trained (iPSC + A549 mantis) — evaluate on iPSC test set (nucleus).
+# Membrane already evaluated under joint_evaluations/eval_fnet3d_joint_membrane. + +set -euo pipefail +ml uv +source ".envrc" + +PRED_ROOT=/hpc/projects/virtual_staining/training/dynacell/ipsc/predictions +GT_ROOT=/hpc/projects/virtual_staining/training/dynacell/ipsc/dataset_v4/test_cropped +OUT_ROOT=/hpc/projects/virtual_staining/training/dynacell/ipsc/joint_evaluations + +IPSC_SPACING="[0.29,0.108,0.108]" +DYNACLR_CKPT='/hpc/projects/organelle_phenotyping/models/SEC61_TOMM20_G3BP1_Sensor/time_interval/dynaclr_gfp_rfp_Ph/organelle_sensor_phase_maxproj_ver3_150epochs/saved_checkpoints/epoch=104-step=53760.ckpt' + +mkdir -p "${OUT_ROOT}" + +echo ">>> fnet3d joint nucleus (iPSC)" +uv run dynacell evaluate \ + target_name=nucleus \ + io.pred_path="${PRED_ROOT}/nucl_fnet3d_paper_jointtrained.zarr" \ + io.pred_channel_name=Nuclei_prediction \ + io.gt_path="${GT_ROOT}/cell.zarr" \ + io.gt_channel_name=Nuclei \ + io.cell_segmentation_path="${GT_ROOT}/cell_segmented_cleaned.zarr" \ + pixel_metrics.spacing="${IPSC_SPACING}" \ + save.save_dir="${OUT_ROOT}/eval_fnet3d_joint_nucleus" \ + compute_feature_metrics=true \ + "feature_extractor.dynaclr.checkpoint='${DYNACLR_CKPT}'" \ + force_recompute.all=true diff --git a/applications/dynacell/configs/evaluations/fnet3d/run_eval_mix_trained_a549_pred_membrane_denv.sh b/applications/dynacell/configs/evaluations/fnet3d/run_eval_mix_trained_a549_pred_membrane_denv.sh new file mode 100644 index 000000000..7b614d282 --- /dev/null +++ b/applications/dynacell/configs/evaluations/fnet3d/run_eval_mix_trained_a549_pred_membrane_denv.sh @@ -0,0 +1,21 @@ +#!/usr/bin/env bash +# FNet3D joint (iPSC+A549) model — membrane prediction on A549 DENV test set. 
+ +set -euo pipefail +ml uv +source ".envrc" + +DYNACLR_CKPT='/hpc/projects/organelle_phenotyping/models/SEC61_TOMM20_G3BP1_Sensor/time_interval/dynaclr_gfp_rfp_Ph/organelle_sensor_phase_maxproj_ver3_150epochs/saved_checkpoints/epoch=104-step=53760.ckpt' + +uv run dynacell evaluate \ + target_name=membrane \ + io.pred_path=/hpc/projects/virtual_staining/training/dynacell/a549/predictions/memb_fnet3d_paper_jointtrained_denv.zarr \ + io.pred_channel_name=Membrane_prediction \ + io.gt_path=/hpc/projects/virtual_staining/training/dynacell/a549/mantis_v1/test/CAAX_DENV.ozx \ + io.gt_channel_name=Membrane \ + io.cell_segmentation_path=/hpc/projects/virtual_staining/training/dynacell/a549/mantis_v1/test/CAAX_DENV_seg_cleaned.zarr \ + pixel_metrics.spacing=[0.174,0.1494,0.1494] \ + save.save_dir=/hpc/projects/virtual_staining/training/dynacell/a549/joint_evaluations/eval_fnet3d_joint_membrane_denv \ + compute_feature_metrics=true \ + "feature_extractor.dynaclr.checkpoint='${DYNACLR_CKPT}'" \ + force_recompute.all=true diff --git a/applications/dynacell/configs/evaluations/fnet3d/run_eval_mix_trained_a549_pred_membrane_mock.sh b/applications/dynacell/configs/evaluations/fnet3d/run_eval_mix_trained_a549_pred_membrane_mock.sh new file mode 100644 index 000000000..ee3ab8fd8 --- /dev/null +++ b/applications/dynacell/configs/evaluations/fnet3d/run_eval_mix_trained_a549_pred_membrane_mock.sh @@ -0,0 +1,21 @@ +#!/usr/bin/env bash +# FNet3D joint (iPSC+A549) model — membrane prediction on A549 mock test set. 
+ +set -euo pipefail +ml uv +source ".envrc" + +DYNACLR_CKPT='/hpc/projects/organelle_phenotyping/models/SEC61_TOMM20_G3BP1_Sensor/time_interval/dynaclr_gfp_rfp_Ph/organelle_sensor_phase_maxproj_ver3_150epochs/saved_checkpoints/epoch=104-step=53760.ckpt' + +uv run dynacell evaluate \ + target_name=membrane \ + io.pred_path=/hpc/projects/virtual_staining/training/dynacell/a549/predictions/memb_fnet3d_paper_jointtrained_mock.zarr \ + io.pred_channel_name=Membrane_prediction \ + io.gt_path=/hpc/projects/virtual_staining/training/dynacell/a549/mantis_v1/test/CAAX_mock.ozx \ + io.gt_channel_name=Membrane \ + io.cell_segmentation_path=/hpc/projects/virtual_staining/training/dynacell/a549/mantis_v1/test/CAAX_mock_seg_cleaned.zarr \ + pixel_metrics.spacing=[0.174,0.1494,0.1494] \ + save.save_dir=/hpc/projects/virtual_staining/training/dynacell/a549/joint_evaluations/eval_fnet3d_joint_membrane_mock \ + compute_feature_metrics=true \ + "feature_extractor.dynaclr.checkpoint='${DYNACLR_CKPT}'" \ + force_recompute.all=true diff --git a/applications/dynacell/configs/evaluations/fnet3d/run_eval_mix_trained_a549_pred_membrane_zikv.sh b/applications/dynacell/configs/evaluations/fnet3d/run_eval_mix_trained_a549_pred_membrane_zikv.sh new file mode 100644 index 000000000..d98172220 --- /dev/null +++ b/applications/dynacell/configs/evaluations/fnet3d/run_eval_mix_trained_a549_pred_membrane_zikv.sh @@ -0,0 +1,21 @@ +#!/usr/bin/env bash +# FNet3D joint (iPSC+A549) model — membrane prediction on A549 ZIKV test set. 
+ +set -euo pipefail +ml uv +source ".envrc" + +DYNACLR_CKPT='/hpc/projects/organelle_phenotyping/models/SEC61_TOMM20_G3BP1_Sensor/time_interval/dynaclr_gfp_rfp_Ph/organelle_sensor_phase_maxproj_ver3_150epochs/saved_checkpoints/epoch=104-step=53760.ckpt' + +uv run dynacell evaluate \ + target_name=membrane \ + io.pred_path=/hpc/projects/virtual_staining/training/dynacell/a549/predictions/memb_fnet3d_paper_jointtrained_zikv.zarr \ + io.pred_channel_name=Membrane_prediction \ + io.gt_path=/hpc/projects/virtual_staining/training/dynacell/a549/mantis_v1/test/CAAX_ZIKV.ozx \ + io.gt_channel_name=Membrane \ + io.cell_segmentation_path=/hpc/projects/virtual_staining/training/dynacell/a549/mantis_v1/test/CAAX_ZIKV_seg_cleaned.zarr \ + pixel_metrics.spacing=[0.174,0.1494,0.1494] \ + save.save_dir=/hpc/projects/virtual_staining/training/dynacell/a549/joint_evaluations/eval_fnet3d_joint_membrane_zikv \ + compute_feature_metrics=true \ + "feature_extractor.dynaclr.checkpoint='${DYNACLR_CKPT}'" \ + force_recompute.all=true diff --git a/applications/dynacell/configs/evaluations/fnet3d/run_eval_mix_trained_a549_pred_nucleus_denv.sh b/applications/dynacell/configs/evaluations/fnet3d/run_eval_mix_trained_a549_pred_nucleus_denv.sh new file mode 100644 index 000000000..9c023d9f1 --- /dev/null +++ b/applications/dynacell/configs/evaluations/fnet3d/run_eval_mix_trained_a549_pred_nucleus_denv.sh @@ -0,0 +1,17 @@ +#!/usr/bin/env bash +# FNet3D joint (iPSC+A549) model — nucleus prediction on A549 DENV test set. 
+ +set -euo pipefail +ml uv +source ".envrc" + +uv run dynacell evaluate \ + target_name=nucleus \ + io.pred_path=/hpc/projects/virtual_staining/training/dynacell/a549/predictions/nucl_fnet3d_paper_jointtrained_denv.zarr \ + io.pred_channel_name=Nuclei_prediction \ + io.gt_path=/hpc/projects/virtual_staining/training/dynacell/a549/mantis_v1/test/H2B_DENV.ozx \ + io.gt_channel_name=Nuclei \ + pixel_metrics.spacing=[0.174,0.1494,0.1494] \ + save.save_dir=/hpc/projects/virtual_staining/training/dynacell/a549/joint_evaluations/eval_fnet3d_joint_nucleus_denv \ + compute_feature_metrics=false \ + force_recompute.all=true diff --git a/applications/dynacell/configs/evaluations/fnet3d/run_eval_mix_trained_a549_pred_nucleus_mock.sh b/applications/dynacell/configs/evaluations/fnet3d/run_eval_mix_trained_a549_pred_nucleus_mock.sh new file mode 100644 index 000000000..6d5d4b74e --- /dev/null +++ b/applications/dynacell/configs/evaluations/fnet3d/run_eval_mix_trained_a549_pred_nucleus_mock.sh @@ -0,0 +1,17 @@ +#!/usr/bin/env bash +# FNet3D joint (iPSC+A549) model — nucleus prediction on A549 mock test set. 
+ +set -euo pipefail +ml uv +source ".envrc" + +uv run dynacell evaluate \ + target_name=nucleus \ + io.pred_path=/hpc/projects/virtual_staining/training/dynacell/a549/predictions/nucl_fnet3d_paper_jointtrained_mock.zarr \ + io.pred_channel_name=Nuclei_prediction \ + io.gt_path=/hpc/projects/virtual_staining/training/dynacell/a549/mantis_v1/test/H2B_mock.ozx \ + io.gt_channel_name=Nuclei \ + pixel_metrics.spacing=[0.174,0.1494,0.1494] \ + save.save_dir=/hpc/projects/virtual_staining/training/dynacell/a549/joint_evaluations/eval_fnet3d_joint_nucleus_mock \ + compute_feature_metrics=false \ + force_recompute.all=true diff --git a/applications/dynacell/configs/evaluations/fnet3d/run_eval_mix_trained_a549_pred_nucleus_zikv.sh b/applications/dynacell/configs/evaluations/fnet3d/run_eval_mix_trained_a549_pred_nucleus_zikv.sh new file mode 100644 index 000000000..bccd05a36 --- /dev/null +++ b/applications/dynacell/configs/evaluations/fnet3d/run_eval_mix_trained_a549_pred_nucleus_zikv.sh @@ -0,0 +1,17 @@ +#!/usr/bin/env bash +# FNet3D joint (iPSC+A549) model — nucleus prediction on A549 ZIKV test set. 
+ +set -euo pipefail +ml uv +source ".envrc" + +uv run dynacell evaluate \ + target_name=nucleus \ + io.pred_path=/hpc/projects/virtual_staining/training/dynacell/a549/predictions/nucl_fnet3d_paper_jointtrained_zikv.zarr \ + io.pred_channel_name=Nuclei_prediction \ + io.gt_path=/hpc/projects/virtual_staining/training/dynacell/a549/mantis_v1/test/H2B_ZIKV.ozx \ + io.gt_channel_name=Nuclei \ + pixel_metrics.spacing=[0.174,0.1494,0.1494] \ + save.save_dir=/hpc/projects/virtual_staining/training/dynacell/a549/joint_evaluations/eval_fnet3d_joint_nucleus_zikv \ + compute_feature_metrics=false \ + force_recompute.all=true diff --git a/applications/dynacell/configs/evaluations/fnet3d/run_eval_mix_trained_ipsc_pred.sh b/applications/dynacell/configs/evaluations/fnet3d/run_eval_mix_trained_ipsc_pred.sh new file mode 100644 index 000000000..539cfa239 --- /dev/null +++ b/applications/dynacell/configs/evaluations/fnet3d/run_eval_mix_trained_ipsc_pred.sh @@ -0,0 +1,21 @@ +#!/usr/bin/env bash +# FNet3D joint (iPSC+A549) model — membrane prediction on iPSC test set. 
+ +set -euo pipefail +ml uv +source ".envrc" + +DYNACLR_CKPT='/hpc/projects/organelle_phenotyping/models/SEC61_TOMM20_G3BP1_Sensor/time_interval/dynaclr_gfp_rfp_Ph/organelle_sensor_phase_maxproj_ver3_150epochs/saved_checkpoints/epoch=104-step=53760.ckpt' + +uv run dynacell evaluate \ + target_name=membrane \ + io.pred_path=/hpc/projects/virtual_staining/training/dynacell/ipsc/joint_predictions/memb_fnet3d_paper_jointtrained.zarr \ + io.pred_channel_name=Membrane_prediction \ + io.gt_path=/hpc/projects/virtual_staining/training/dynacell/ipsc/dataset_v4/test_cropped/cell.zarr \ + io.gt_channel_name=Membrane \ + io.cell_segmentation_path=/hpc/projects/virtual_staining/training/dynacell/ipsc/dataset_v4/test_cropped/cell_segmented_cleaned.zarr \ + pixel_metrics.spacing=[0.29,0.108,0.108] \ + save.save_dir=/hpc/projects/virtual_staining/training/dynacell/ipsc/joint_evaluations/eval_fnet3d_joint_membrane \ + compute_feature_metrics=true \ + "feature_extractor.dynaclr.checkpoint='${DYNACLR_CKPT}'" \ + force_recompute.all=true diff --git a/applications/dynacell/configs/evaluations/unetvit3d/run_eval_unetvit3d.sh b/applications/dynacell/configs/evaluations/unetvit3d/run_eval_unetvit3d.sh new file mode 100644 index 000000000..b06680bd7 --- /dev/null +++ b/applications/dynacell/configs/evaluations/unetvit3d/run_eval_unetvit3d.sh @@ -0,0 +1,61 @@ +#!/usr/bin/env bash +set -euo pipefail +ml uv + +source ".envrc" + +# UNetViT3D — ER (SEC61B) +uv run dynacell evaluate \ + target_name=er \ + io.pred_path=/hpc/projects/virtual_staining/training/dynacell/ipsc/predictions/sec61b_unetvit3d.zarr \ + io.pred_channel_name=Structure_prediction \ + io.gt_path=/hpc/projects/virtual_staining/training/dynacell/ipsc/dataset_v4/test_cropped/SEC61B.zarr \ + io.gt_channel_name=Structure \ + io.cell_segmentation_path=/hpc/projects/virtual_staining/training/dynacell/ipsc/dataset_v4/test_cropped/SEC61B_segmented_cleaned.zarr \ + pixel_metrics.spacing=[0.29,0.108,0.108] \ + 
save.save_dir=/hpc/projects/virtual_staining/training/dynacell/ipsc/evaluations/eval_unetvit3d_sec61b \ + compute_feature_metrics=true \ + "feature_extractor.dynaclr.checkpoint='/hpc/projects/organelle_phenotyping/models/SEC61_TOMM20_G3BP1_Sensor/time_interval/dynaclr_gfp_rfp_Ph/organelle_sensor_phase_maxproj_ver3_150epochs/saved_checkpoints/epoch=104-step=53760.ckpt'" \ + force_recompute.all=true + +# UNetViT3D — Membrane +uv run dynacell evaluate \ + target_name=membrane \ + io.pred_path=/hpc/projects/virtual_staining/training/dynacell/ipsc/predictions/memb_unetvit3d.zarr \ + io.pred_channel_name=Membrane_prediction \ + io.gt_path=/hpc/projects/virtual_staining/training/dynacell/ipsc/dataset_v4/test_cropped/cell.zarr \ + io.gt_channel_name=Membrane \ + io.cell_segmentation_path=/hpc/projects/virtual_staining/training/dynacell/ipsc/dataset_v4/test_cropped/cell_segmented_cleaned.zarr \ + pixel_metrics.spacing=[0.29,0.108,0.108] \ + save.save_dir=/hpc/projects/virtual_staining/training/dynacell/ipsc/evaluations/eval_unetvit3d_membrane \ + compute_feature_metrics=true \ + "feature_extractor.dynaclr.checkpoint='/hpc/projects/organelle_phenotyping/models/SEC61_TOMM20_G3BP1_Sensor/time_interval/dynaclr_gfp_rfp_Ph/organelle_sensor_phase_maxproj_ver3_150epochs/saved_checkpoints/epoch=104-step=53760.ckpt'" \ + force_recompute.all=true + +# UNetViT3D — Mitochondria (TOMM20) +uv run dynacell evaluate \ + target_name=mitochondria \ + io.pred_path=/hpc/projects/virtual_staining/training/dynacell/ipsc/predictions/tomm20_unetvit3d.zarr \ + io.pred_channel_name=Structure_prediction \ + io.gt_path=/hpc/projects/virtual_staining/training/dynacell/ipsc/dataset_v4/test_cropped/TOMM20.zarr \ + io.gt_channel_name=Structure \ + io.cell_segmentation_path=/hpc/projects/virtual_staining/training/dynacell/ipsc/dataset_v4/test_cropped/TOMM20_segmented_cleaned.zarr \ + pixel_metrics.spacing=[0.29,0.108,0.108] \ + 
save.save_dir=/hpc/projects/virtual_staining/training/dynacell/ipsc/evaluations/eval_unetvit3d_tomm20 \ + compute_feature_metrics=true \ + "feature_extractor.dynaclr.checkpoint='/hpc/projects/organelle_phenotyping/models/SEC61_TOMM20_G3BP1_Sensor/time_interval/dynaclr_gfp_rfp_Ph/organelle_sensor_phase_maxproj_ver3_150epochs/saved_checkpoints/epoch=104-step=53760.ckpt'" \ + force_recompute.all=true + +# UNetViT3D — Nucleus +uv run dynacell evaluate \ + target_name=nucleus \ + io.pred_path=/hpc/projects/virtual_staining/training/dynacell/ipsc/predictions/nucl_unetvit3d.zarr \ + io.pred_channel_name=Nuclei_prediction \ + io.gt_path=/hpc/projects/virtual_staining/training/dynacell/ipsc/dataset_v4/test_cropped/cell.zarr \ + io.gt_channel_name=Nuclei \ + io.cell_segmentation_path=/hpc/projects/virtual_staining/training/dynacell/ipsc/dataset_v4/test_cropped/cell_segmented_cleaned.zarr \ + pixel_metrics.spacing=[0.29,0.108,0.108] \ + save.save_dir=/hpc/projects/virtual_staining/training/dynacell/ipsc/evaluations/eval_unetvit3d_nucleus \ + compute_feature_metrics=true \ + "feature_extractor.dynaclr.checkpoint='/hpc/projects/organelle_phenotyping/models/SEC61_TOMM20_G3BP1_Sensor/time_interval/dynaclr_gfp_rfp_Ph/organelle_sensor_phase_maxproj_ver3_150epochs/saved_checkpoints/epoch=104-step=53760.ckpt'" \ + force_recompute.all=true diff --git a/applications/dynacell/configs/evaluations/unetvit3d/run_eval_unetvit3d_a549.sh b/applications/dynacell/configs/evaluations/unetvit3d/run_eval_unetvit3d_a549.sh new file mode 100755 index 000000000..847f82ab5 --- /dev/null +++ b/applications/dynacell/configs/evaluations/unetvit3d/run_eval_unetvit3d_a549.sh @@ -0,0 +1,52 @@ +#!/usr/bin/env bash +# A549 UNetViT3D evaluation — 4 organelles × 3 infection conditions. 
+ +set -euo pipefail +ml uv +source ".envrc" + +PRED_ROOT=/hpc/projects/virtual_staining/training/dynacell/a549/predictions +GT_ROOT=/hpc/projects/virtual_staining/training/dynacell/a549/mantis_v1 +OUT_ROOT=/hpc/projects/virtual_staining/training/dynacell/a549/evaluations_with_embeddings + +V1_SPACING="[0.174,0.1494,0.1494]" +DYNACLR_CKPT='/hpc/projects/organelle_phenotyping/models/SEC61_TOMM20_G3BP1_Sensor/time_interval/dynaclr_gfp_rfp_Ph/organelle_sensor_phase_maxproj_ver3_150epochs/saved_checkpoints/epoch=104-step=53760.ckpt' + +run_eval () { + local target=$1 infection=$2 gt_basename=$3 \ + pred_zarr=$4 pred_chan=$5 gt_chan=$6 spacing=$7 + local save_dir="${OUT_ROOT}/eval_unetvit3d_${target}_${infection}" + echo ">>> unetvit3d ${target} ${infection}" + uv run dynacell evaluate \ + target_name="${target}" \ + io.pred_path="${PRED_ROOT}/${pred_zarr}" \ + io.pred_channel_name="${pred_chan}" \ + io.gt_path="${GT_ROOT}/test/${gt_basename}.ozx" \ + io.gt_channel_name="${gt_chan}" \ + io.cell_segmentation_path="${GT_ROOT}/test/${gt_basename}_seg_cleaned.zarr" \ + pixel_metrics.spacing="${spacing}" \ + save.save_dir="${save_dir}" \ + compute_feature_metrics=true \ + "feature_extractor.dynaclr.checkpoint='${DYNACLR_CKPT}'" \ + force_recompute.all=true +} + +# SEC61B (ER) +run_eval er mock SEC61B_mock sec61b_unetvit3d__sec61b_mock.zarr Structure_prediction Structure "${V1_SPACING}" +run_eval er denv SEC61B_DENV sec61b_unetvit3d__sec61b_denv.zarr Structure_prediction Structure "${V1_SPACING}" +run_eval er zikv SEC61B_ZIKV sec61b_unetvit3d__sec61b_zikv.zarr Structure_prediction Structure "${V1_SPACING}" + +# CAAX (membrane) +# run_eval membrane mock CAAX_mock memb_unetvit3d_mock.zarr Membrane_prediction Membrane "${V1_SPACING}" +# run_eval membrane denv CAAX_DENV memb_unetvit3d_denv.zarr Membrane_prediction Membrane "${V1_SPACING}" +# run_eval membrane zikv CAAX_ZIKV memb_unetvit3d_zikv.zarr Membrane_prediction Membrane "${V1_SPACING}" + +# H2B (nucleus) +# run_eval 
nucleus mock H2B_mock nucleus_unetvit3d_mock.zarr Nuclei_prediction Nuclei "${V1_SPACING}" +# run_eval nucleus denv H2B_DENV nucleus_unetvit3d_denv.zarr Nuclei_prediction Nuclei "${V1_SPACING}" +# run_eval nucleus zikv H2B_ZIKV nucleus_unetvit3d_zikv.zarr Nuclei_prediction Nuclei "${V1_SPACING}" + +# TOMM20 (mitochondria) +run_eval mitochondria mock TOMM20_mock tomm20_unetvit3d__tomm20_mock.zarr Structure_prediction Structure "${V1_SPACING}" +run_eval mitochondria denv TOMM20_DENV tomm20_unetvit3d__tomm20_denv.zarr Structure_prediction Structure "${V1_SPACING}" +run_eval mitochondria zikv TOMM20_ZIKV tomm20_unetvit3d__tomm20_zikv.zarr Structure_prediction Structure "${V1_SPACING}" diff --git a/applications/dynacell/configs/evaluations/unext2/run_a549_trained_a549.sh b/applications/dynacell/configs/evaluations/unext2/run_a549_trained_a549.sh new file mode 100644 index 000000000..35e68123d --- /dev/null +++ b/applications/dynacell/configs/evaluations/unext2/run_a549_trained_a549.sh @@ -0,0 +1,42 @@ +#!/usr/bin/env bash +# UNeXt2 (fcmae_vscyto3d_scratch) A549-trained — evaluate on A549 test set (nucleus + membrane × 3 infections). 
+ +set -euo pipefail +ml uv +source ".envrc" + +PRED_ROOT=/hpc/projects/virtual_staining/training/dynacell/a549/predictions +GT_ROOT=/hpc/projects/virtual_staining/training/dynacell/a549/mantis_v1 +OUT_ROOT=/hpc/projects/virtual_staining/training/dynacell/a549/evaluations_a549trained + +V1_SPACING="[0.174,0.1494,0.1494]" +DYNACLR_CKPT='/hpc/projects/organelle_phenotyping/models/SEC61_TOMM20_G3BP1_Sensor/time_interval/dynaclr_gfp_rfp_Ph/organelle_sensor_phase_maxproj_ver3_150epochs/saved_checkpoints/epoch=104-step=53760.ckpt' + +run_eval () { + local target=$1 infection=$2 gt_basename=$3 \ + pred_zarr=$4 pred_chan=$5 gt_chan=$6 spacing=$7 + local save_dir="${OUT_ROOT}/eval_unext2_a549trained_${target}_${infection}" + echo ">>> unext2 a549trained ${target} ${infection}" + uv run dynacell evaluate \ + target_name="${target}" \ + io.pred_path="${PRED_ROOT}/${pred_zarr}" \ + io.pred_channel_name="${pred_chan}" \ + io.gt_path="${GT_ROOT}/test/${gt_basename}.ozx" \ + io.gt_channel_name="${gt_chan}" \ + io.cell_segmentation_path="${GT_ROOT}/test/${gt_basename}_seg_cleaned.zarr" \ + pixel_metrics.spacing="${spacing}" \ + save.save_dir="${save_dir}" \ + compute_feature_metrics=true \ + "feature_extractor.dynaclr.checkpoint='${DYNACLR_CKPT}'" \ + force_recompute.all=true +} + +# H2B (nucleus) +run_eval nucleus mock H2B_mock nucl_fcmae_vscyto3d_scratch_a549trained_mock.zarr Nuclei_prediction Nuclei "${V1_SPACING}" +run_eval nucleus denv H2B_DENV nucl_fcmae_vscyto3d_scratch_a549trained_denv.zarr Nuclei_prediction Nuclei "${V1_SPACING}" +run_eval nucleus zikv H2B_ZIKV nucl_fcmae_vscyto3d_scratch_a549trained_zikv.zarr Nuclei_prediction Nuclei "${V1_SPACING}" + +# CAAX (membrane) +run_eval membrane mock CAAX_mock memb_fcmae_vscyto3d_scratch_a549trained_mock.zarr Membrane_prediction Membrane "${V1_SPACING}" +run_eval membrane denv CAAX_DENV memb_fcmae_vscyto3d_scratch_a549trained_denv.zarr Membrane_prediction Membrane "${V1_SPACING}" +run_eval membrane zikv CAAX_ZIKV 
memb_fcmae_vscyto3d_scratch_a549trained_zikv.zarr Membrane_prediction Membrane "${V1_SPACING}" diff --git a/applications/dynacell/configs/evaluations/unext2/run_a549_trained_ipsc.sh b/applications/dynacell/configs/evaluations/unext2/run_a549_trained_ipsc.sh new file mode 100644 index 000000000..e97280bc8 --- /dev/null +++ b/applications/dynacell/configs/evaluations/unext2/run_a549_trained_ipsc.sh @@ -0,0 +1,43 @@ +#!/usr/bin/env bash +# UNeXt2 (fcmae_vscyto3d_scratch) A549-trained — evaluate on iPSC test set (nucleus + membrane). + +set -euo pipefail +ml uv +source ".envrc" + +PRED_ROOT=/hpc/projects/virtual_staining/training/dynacell/ipsc/predictions +GT_ROOT=/hpc/projects/virtual_staining/training/dynacell/ipsc/dataset_v4/test_cropped +OUT_ROOT=/hpc/projects/virtual_staining/training/dynacell/ipsc/evaluations_a549trained + +IPSC_SPACING="[0.29,0.108,0.108]" +DYNACLR_CKPT='/hpc/projects/organelle_phenotyping/models/SEC61_TOMM20_G3BP1_Sensor/time_interval/dynaclr_gfp_rfp_Ph/organelle_sensor_phase_maxproj_ver3_150epochs/saved_checkpoints/epoch=104-step=53760.ckpt' + +# Nucleus (H2B) +echo ">>> unext2 a549trained nucleus (iPSC)" +uv run dynacell evaluate \ + target_name=nucleus \ + io.pred_path="${PRED_ROOT}/nucl_fcmae_vscyto3d_scratch_a549trained.zarr" \ + io.pred_channel_name=Nuclei_prediction \ + io.gt_path="${GT_ROOT}/cell.zarr" \ + io.gt_channel_name=Nuclei \ + io.cell_segmentation_path="${GT_ROOT}/cell_segmented_cleaned.zarr" \ + pixel_metrics.spacing="${IPSC_SPACING}" \ + save.save_dir="${OUT_ROOT}/eval_unext2_a549trained_nucleus" \ + compute_feature_metrics=true \ + "feature_extractor.dynaclr.checkpoint='${DYNACLR_CKPT}'" \ + force_recompute.all=true + +# Membrane (CAAX) +echo ">>> unext2 a549trained membrane (iPSC)" +uv run dynacell evaluate \ + target_name=membrane \ + io.pred_path="${PRED_ROOT}/memb_fcmae_vscyto3d_scratch_a549trained.zarr" \ + io.pred_channel_name=Membrane_prediction \ + io.gt_path="${GT_ROOT}/cell.zarr" \ + io.gt_channel_name=Membrane \ + 
io.cell_segmentation_path="${GT_ROOT}/cell_segmented_cleaned.zarr" \ + pixel_metrics.spacing="${IPSC_SPACING}" \ + save.save_dir="${OUT_ROOT}/eval_unext2_a549trained_membrane" \ + compute_feature_metrics=true \ + "feature_extractor.dynaclr.checkpoint='${DYNACLR_CKPT}'" \ + force_recompute.all=true diff --git a/applications/dynacell/configs/evaluations/unext2/run_eval_mix_trained_a549_pred_denv.sh b/applications/dynacell/configs/evaluations/unext2/run_eval_mix_trained_a549_pred_denv.sh new file mode 100644 index 000000000..892fa6be7 --- /dev/null +++ b/applications/dynacell/configs/evaluations/unext2/run_eval_mix_trained_a549_pred_denv.sh @@ -0,0 +1,21 @@ +#!/usr/bin/env bash +# UNeXt2 joint (iPSC+A549) model — membrane prediction on A549 DENV test set. + +set -euo pipefail +ml uv +source ".envrc" + +DYNACLR_CKPT='/hpc/projects/organelle_phenotyping/models/SEC61_TOMM20_G3BP1_Sensor/time_interval/dynaclr_gfp_rfp_Ph/organelle_sensor_phase_maxproj_ver3_150epochs/saved_checkpoints/epoch=104-step=53760.ckpt' + +uv run dynacell evaluate \ + target_name=membrane \ + io.pred_path=/hpc/projects/virtual_staining/training/dynacell/a549/joint_predictions/memb_fcmae_vscyto3d_scratch_jointtrained_denv.zarr \ + io.pred_channel_name=Membrane_prediction \ + io.gt_path=/hpc/projects/virtual_staining/training/dynacell/a549/mantis_v1/test/CAAX_DENV.ozx \ + io.gt_channel_name=Membrane \ + io.cell_segmentation_path=/hpc/projects/virtual_staining/training/dynacell/a549/mantis_v1/test/CAAX_DENV_seg_cleaned.zarr \ + pixel_metrics.spacing=[0.174,0.1494,0.1494] \ + save.save_dir=/hpc/projects/virtual_staining/training/dynacell/a549/joint_evaluations/eval_unext2_joint_membrane_denv \ + compute_feature_metrics=true \ + "feature_extractor.dynaclr.checkpoint='${DYNACLR_CKPT}'" \ + force_recompute.all=true diff --git a/applications/dynacell/configs/evaluations/unext2/run_eval_mix_trained_a549_pred_mock.sh b/applications/dynacell/configs/evaluations/unext2/run_eval_mix_trained_a549_pred_mock.sh new 
file mode 100644 index 000000000..af4d918fc --- /dev/null +++ b/applications/dynacell/configs/evaluations/unext2/run_eval_mix_trained_a549_pred_mock.sh @@ -0,0 +1,21 @@ +#!/usr/bin/env bash +# UNeXt2 joint (iPSC+A549) model — membrane prediction on A549 mock test set. + +set -euo pipefail +ml uv +source ".envrc" + +DYNACLR_CKPT='/hpc/projects/organelle_phenotyping/models/SEC61_TOMM20_G3BP1_Sensor/time_interval/dynaclr_gfp_rfp_Ph/organelle_sensor_phase_maxproj_ver3_150epochs/saved_checkpoints/epoch=104-step=53760.ckpt' + +uv run dynacell evaluate \ + target_name=membrane \ + io.pred_path=/hpc/projects/virtual_staining/training/dynacell/a549/joint_predictions/memb_fcmae_vscyto3d_scratch_jointtrained_mock.zarr \ + io.pred_channel_name=Membrane_prediction \ + io.gt_path=/hpc/projects/virtual_staining/training/dynacell/a549/mantis_v1/test/CAAX_mock.ozx \ + io.gt_channel_name=Membrane \ + io.cell_segmentation_path=/hpc/projects/virtual_staining/training/dynacell/a549/mantis_v1/test/CAAX_mock_seg_cleaned.zarr \ + pixel_metrics.spacing=[0.174,0.1494,0.1494] \ + save.save_dir=/hpc/projects/virtual_staining/training/dynacell/a549/joint_evaluations/eval_unext2_joint_membrane_mock \ + compute_feature_metrics=true \ + "feature_extractor.dynaclr.checkpoint='${DYNACLR_CKPT}'" \ + force_recompute.all=true diff --git a/applications/dynacell/configs/evaluations/unext2/run_eval_mix_trained_a549_pred_zikv.sh b/applications/dynacell/configs/evaluations/unext2/run_eval_mix_trained_a549_pred_zikv.sh new file mode 100644 index 000000000..fecd31cef --- /dev/null +++ b/applications/dynacell/configs/evaluations/unext2/run_eval_mix_trained_a549_pred_zikv.sh @@ -0,0 +1,21 @@ +#!/usr/bin/env bash +# UNeXt2 joint (iPSC+A549) model — membrane prediction on A549 ZIKV test set. 
+ +set -euo pipefail +ml uv +source ".envrc" + +DYNACLR_CKPT='/hpc/projects/organelle_phenotyping/models/SEC61_TOMM20_G3BP1_Sensor/time_interval/dynaclr_gfp_rfp_Ph/organelle_sensor_phase_maxproj_ver3_150epochs/saved_checkpoints/epoch=104-step=53760.ckpt' + +uv run dynacell evaluate \ + target_name=membrane \ + io.pred_path=/hpc/projects/virtual_staining/training/dynacell/a549/joint_predictions/memb_fcmae_vscyto3d_scratch_jointtrained_zikv.zarr \ + io.pred_channel_name=Membrane_prediction \ + io.gt_path=/hpc/projects/virtual_staining/training/dynacell/a549/mantis_v1/test/CAAX_ZIKV.ozx \ + io.gt_channel_name=Membrane \ + io.cell_segmentation_path=/hpc/projects/virtual_staining/training/dynacell/a549/mantis_v1/test/CAAX_ZIKV_seg_cleaned.zarr \ + pixel_metrics.spacing=[0.174,0.1494,0.1494] \ + save.save_dir=/hpc/projects/virtual_staining/training/dynacell/a549/joint_evaluations/eval_unext2_joint_membrane_zikv \ + compute_feature_metrics=true \ + "feature_extractor.dynaclr.checkpoint='${DYNACLR_CKPT}'" \ + force_recompute.all=true diff --git a/applications/dynacell/configs/evaluations/unext2/run_eval_mix_trained_ipsc_pred.sh b/applications/dynacell/configs/evaluations/unext2/run_eval_mix_trained_ipsc_pred.sh new file mode 100644 index 000000000..8a4d35dc6 --- /dev/null +++ b/applications/dynacell/configs/evaluations/unext2/run_eval_mix_trained_ipsc_pred.sh @@ -0,0 +1,21 @@ +#!/usr/bin/env bash +# UNeXt2 joint (iPSC+A549) model — membrane prediction on iPSC test set. 
+ +set -euo pipefail +ml uv +source ".envrc" + +DYNACLR_CKPT='/hpc/projects/organelle_phenotyping/models/SEC61_TOMM20_G3BP1_Sensor/time_interval/dynaclr_gfp_rfp_Ph/organelle_sensor_phase_maxproj_ver3_150epochs/saved_checkpoints/epoch=104-step=53760.ckpt' + +uv run dynacell evaluate \ + target_name=membrane \ + io.pred_path=/hpc/projects/virtual_staining/training/dynacell/ipsc/joint_predictions/memb_fcmae_vscyto3d_scratch_jointtrained.zarr \ + io.pred_channel_name=Membrane_prediction \ + io.gt_path=/hpc/projects/virtual_staining/training/dynacell/ipsc/dataset_v4/test_cropped/cell.zarr \ + io.gt_channel_name=Membrane \ + io.cell_segmentation_path=/hpc/projects/virtual_staining/training/dynacell/ipsc/dataset_v4/test_cropped/cell_segmented_cleaned.zarr \ + pixel_metrics.spacing=[0.29,0.108,0.108] \ + save.save_dir=/hpc/projects/virtual_staining/training/dynacell/ipsc/joint_evaluations/eval_unext2_joint_membrane \ + compute_feature_metrics=true \ + "feature_extractor.dynaclr.checkpoint='${DYNACLR_CKPT}'" \ + force_recompute.all=true diff --git a/applications/dynacell/configs/evaluations/unext2/run_eval_unext2.sh b/applications/dynacell/configs/evaluations/unext2/run_eval_unext2.sh new file mode 100644 index 000000000..75f353462 --- /dev/null +++ b/applications/dynacell/configs/evaluations/unext2/run_eval_unext2.sh @@ -0,0 +1,63 @@ +#!/usr/bin/env bash +# UNeXt2 (fcmae_vscyto3d_scratch) — evaluate on iPSC test set (ER, membrane, mitochondria, nucleus). + +set -euo pipefail +ml uv + +source ".envrc" + +# UNeXt2 — ER (SEC61B) +uv run dynacell evaluate \ + target_name=er \ + io.pred_path=/hpc/projects/virtual_staining/training/dynacell/ipsc/predictions/sec61b_fcmae_vscyto3d_scratch.zarr \ + io.pred_channel_name=Structure_prediction \ + io.gt_path=/hpc/projects/virtual_staining/training/dynacell/ipsc/dataset_v4/test_cropped/SEC61B.zarr \ + io.gt_channel_name=Structure \ + io.cell_segmentation_path=/hpc/projects/virtual_staining/training/dynacell/ipsc/dataset_v4/test_cropped/SEC61B_segmented_cleaned.zarr \ + pixel_metrics.spacing=[0.29,0.108,0.108] \ + 
save.save_dir=/hpc/projects/virtual_staining/training/dynacell/ipsc/evaluations/eval_unext2_sec61b \ + compute_feature_metrics=true \ + "feature_extractor.dynaclr.checkpoint='/hpc/projects/organelle_phenotyping/models/SEC61_TOMM20_G3BP1_Sensor/time_interval/dynaclr_gfp_rfp_Ph/organelle_sensor_phase_maxproj_ver3_150epochs/saved_checkpoints/epoch=104-step=53760.ckpt'" \ + force_recompute.all=true + +# UNeXt2 — Membrane +uv run dynacell evaluate \ + target_name=membrane \ + io.pred_path=/hpc/projects/virtual_staining/training/dynacell/ipsc/predictions/memb_fcmae_vscyto3d_scratch.zarr \ + io.pred_channel_name=Membrane_prediction \ + io.gt_path=/hpc/projects/virtual_staining/training/dynacell/ipsc/dataset_v4/test_cropped/cell.zarr \ + io.gt_channel_name=Membrane \ + io.cell_segmentation_path=/hpc/projects/virtual_staining/training/dynacell/ipsc/dataset_v4/test_cropped/cell_segmented_cleaned.zarr \ + pixel_metrics.spacing=[0.29,0.108,0.108] \ + save.save_dir=/hpc/projects/virtual_staining/training/dynacell/ipsc/evaluations/eval_unext2_membrane \ + compute_feature_metrics=true \ + "feature_extractor.dynaclr.checkpoint='/hpc/projects/organelle_phenotyping/models/SEC61_TOMM20_G3BP1_Sensor/time_interval/dynaclr_gfp_rfp_Ph/organelle_sensor_phase_maxproj_ver3_150epochs/saved_checkpoints/epoch=104-step=53760.ckpt'" \ + force_recompute.all=true + +# UNeXt2 — Mitochondria (TOMM20) +uv run dynacell evaluate \ + target_name=mitochondria \ + io.pred_path=/hpc/projects/virtual_staining/training/dynacell/ipsc/predictions/tomm20_fcmae_vscyto3d_scratch.zarr \ + io.pred_channel_name=Structure_prediction \ + io.gt_path=/hpc/projects/virtual_staining/training/dynacell/ipsc/dataset_v4/test_cropped/TOMM20.zarr \ + io.gt_channel_name=Structure \ + io.cell_segmentation_path=/hpc/projects/virtual_staining/training/dynacell/ipsc/dataset_v4/test_cropped/TOMM20_segmented_cleaned.zarr \ + pixel_metrics.spacing=[0.29,0.108,0.108] \ + 
save.save_dir=/hpc/projects/virtual_staining/training/dynacell/ipsc/evaluations/eval_unext2_tomm20 \ + compute_feature_metrics=true \ + "feature_extractor.dynaclr.checkpoint='/hpc/projects/organelle_phenotyping/models/SEC61_TOMM20_G3BP1_Sensor/time_interval/dynaclr_gfp_rfp_Ph/organelle_sensor_phase_maxproj_ver3_150epochs/saved_checkpoints/epoch=104-step=53760.ckpt'" \ + force_recompute.all=true + +# UNeXt2 — Nucleus +uv run dynacell evaluate \ + target_name=nucleus \ + io.pred_path=/hpc/projects/virtual_staining/training/dynacell/ipsc/predictions/nucl_fcmae_vscyto3d_scratch.zarr \ + io.pred_channel_name=Nuclei_prediction \ + io.gt_path=/hpc/projects/virtual_staining/training/dynacell/ipsc/dataset_v4/test_cropped/cell.zarr \ + io.gt_channel_name=Nuclei \ + io.cell_segmentation_path=/hpc/projects/virtual_staining/training/dynacell/ipsc/dataset_v4/test_cropped/cell_segmented_cleaned.zarr \ + pixel_metrics.spacing=[0.29,0.108,0.108] \ + save.save_dir=/hpc/projects/virtual_staining/training/dynacell/ipsc/evaluations/eval_unext2_nucleus \ + compute_feature_metrics=true \ + "feature_extractor.dynaclr.checkpoint='/hpc/projects/organelle_phenotyping/models/SEC61_TOMM20_G3BP1_Sensor/time_interval/dynaclr_gfp_rfp_Ph/organelle_sensor_phase_maxproj_ver3_150epochs/saved_checkpoints/epoch=104-step=53760.ckpt'" \ + force_recompute.all=true diff --git a/applications/dynacell/configs/evaluations/unext2/run_eval_unext2_a549.sh b/applications/dynacell/configs/evaluations/unext2/run_eval_unext2_a549.sh new file mode 100644 index 000000000..e3f20b11b --- /dev/null +++ b/applications/dynacell/configs/evaluations/unext2/run_eval_unext2_a549.sh @@ -0,0 +1,52 @@ +#!/usr/bin/env bash +# A549 UNeXt2 (fcmae_vscyto3d_scratch) evaluation — 4 organelles × 3 infection conditions.
+ +set -euo pipefail +ml uv +source ".envrc" + +PRED_ROOT=/hpc/projects/virtual_staining/training/dynacell/a549/predictions +GT_ROOT=/hpc/projects/virtual_staining/training/dynacell/a549/mantis_v1 +OUT_ROOT=/hpc/projects/virtual_staining/training/dynacell/a549/evaluations_with_embeddings + +V1_SPACING="[0.174,0.1494,0.1494]" +DYNACLR_CKPT='/hpc/projects/organelle_phenotyping/models/SEC61_TOMM20_G3BP1_Sensor/time_interval/dynaclr_gfp_rfp_Ph/organelle_sensor_phase_maxproj_ver3_150epochs/saved_checkpoints/epoch=104-step=53760.ckpt' + +run_eval () { + local target=$1 infection=$2 gt_basename=$3 \ + pred_zarr=$4 pred_chan=$5 gt_chan=$6 spacing=$7 + local save_dir="${OUT_ROOT}/eval_unext2_${target}_${infection}" + echo ">>> unext2 ${target} ${infection}" + uv run dynacell evaluate \ + target_name="${target}" \ + io.pred_path="${PRED_ROOT}/${pred_zarr}" \ + io.pred_channel_name="${pred_chan}" \ + io.gt_path="${GT_ROOT}/test/${gt_basename}.ozx" \ + io.gt_channel_name="${gt_chan}" \ + io.cell_segmentation_path="${GT_ROOT}/test/${gt_basename}_seg_cleaned.zarr" \ + pixel_metrics.spacing="${spacing}" \ + save.save_dir="${save_dir}" \ + compute_feature_metrics=true \ + "feature_extractor.dynaclr.checkpoint='${DYNACLR_CKPT}'" \ + force_recompute.all=true +} + +# SEC61B (ER) +run_eval er mock SEC61B_mock sec61b_fcmae_vscyto3d_scratch__sec61b_mock.zarr Structure_prediction Structure "${V1_SPACING}" +run_eval er denv SEC61B_DENV sec61b_fcmae_vscyto3d_scratch__sec61b_denv.zarr Structure_prediction Structure "${V1_SPACING}" +run_eval er zikv SEC61B_ZIKV sec61b_fcmae_vscyto3d_scratch__sec61b_zikv.zarr Structure_prediction Structure "${V1_SPACING}" + +# CAAX (membrane) +# run_eval membrane mock CAAX_mock memb_fcmae_vscyto3d_scratch_mock.zarr Membrane_prediction Membrane "${V1_SPACING}" +# run_eval membrane denv CAAX_DENV memb_fcmae_vscyto3d_scratch_denv.zarr Membrane_prediction Membrane "${V1_SPACING}" +# run_eval membrane zikv CAAX_ZIKV memb_fcmae_vscyto3d_scratch_zikv.zarr 
Membrane_prediction Membrane "${V1_SPACING}" + +# H2B (nucleus) +# run_eval nucleus mock H2B_mock nucl_fcmae_vscyto3d_scratch_mock.zarr Nuclei_prediction Nuclei "${V1_SPACING}" +run_eval nucleus denv H2B_DENV nucl_fcmae_vscyto3d_scratch_denv.zarr Nuclei_prediction Nuclei "${V1_SPACING}" +run_eval nucleus zikv H2B_ZIKV nucl_fcmae_vscyto3d_scratch_zikv.zarr Nuclei_prediction Nuclei "${V1_SPACING}" + +# TOMM20 (mitochondria) +run_eval mitochondria mock TOMM20_mock tomm20_fcmae_vscyto3d_scratch__tomm20_mock.zarr Structure_prediction Structure "${V1_SPACING}" +run_eval mitochondria denv TOMM20_DENV tomm20_fcmae_vscyto3d_scratch__tomm20_denv.zarr Structure_prediction Structure "${V1_SPACING}" +run_eval mitochondria zikv TOMM20_ZIKV tomm20_fcmae_vscyto3d_scratch__tomm20_zikv.zarr Structure_prediction Structure "${V1_SPACING}" diff --git a/applications/dynacell/configs/evaluations/unext2/run_eval_unext2_jointtrained_a549.sh b/applications/dynacell/configs/evaluations/unext2/run_eval_unext2_jointtrained_a549.sh new file mode 100755 index 000000000..d1269f6fc --- /dev/null +++ b/applications/dynacell/configs/evaluations/unext2/run_eval_unext2_jointtrained_a549.sh @@ -0,0 +1,39 @@ +#!/usr/bin/env bash +# UNeXt2 (fcmae_vscyto3d_scratch) joint-trained (iPSC + A549 mantis) — +# evaluate on A549 test set (membrane × 3 infections). Companion to the +# existing joint membrane evals (eval_{fnet3d,vscyto3d,celldiff}_joint_membrane_). 
+ +set -euo pipefail +ml uv +source ".envrc" + +PRED_ROOT=/hpc/projects/virtual_staining/training/dynacell/a549/predictions +GT_ROOT=/hpc/projects/virtual_staining/training/dynacell/a549/mantis_v1 +OUT_ROOT=/hpc/projects/virtual_staining/training/dynacell/a549/joint_evaluations + +V1_SPACING="[0.174,0.1494,0.1494]" +DYNACLR_CKPT='/hpc/projects/organelle_phenotyping/models/SEC61_TOMM20_G3BP1_Sensor/time_interval/dynaclr_gfp_rfp_Ph/organelle_sensor_phase_maxproj_ver3_150epochs/saved_checkpoints/epoch=104-step=53760.ckpt' + +mkdir -p "${OUT_ROOT}" + +run_eval () { + local infection=$1 gt_basename=$2 pred_zarr=$3 + local save_dir="${OUT_ROOT}/eval_unext2_joint_membrane_${infection}" + echo ">>> unext2 joint membrane ${infection}" + uv run dynacell evaluate \ + target_name=membrane \ + io.pred_path="${PRED_ROOT}/${pred_zarr}" \ + io.pred_channel_name=Membrane_prediction \ + io.gt_path="${GT_ROOT}/test/${gt_basename}.ozx" \ + io.gt_channel_name=Membrane \ + io.cell_segmentation_path="${GT_ROOT}/test/${gt_basename}_seg_cleaned.zarr" \ + pixel_metrics.spacing="${V1_SPACING}" \ + save.save_dir="${save_dir}" \ + compute_feature_metrics=true \ + "feature_extractor.dynaclr.checkpoint='${DYNACLR_CKPT}'" \ + force_recompute.all=true +} + +run_eval mock CAAX_mock memb_fcmae_vscyto3d_scratch_jointtrained_mock.zarr +run_eval denv CAAX_DENV memb_fcmae_vscyto3d_scratch_jointtrained_denv.zarr +run_eval zikv CAAX_ZIKV memb_fcmae_vscyto3d_scratch_jointtrained_zikv.zarr diff --git a/applications/dynacell/configs/evaluations/unext2/run_eval_unext2_jointtrained_ipsc.sh b/applications/dynacell/configs/evaluations/unext2/run_eval_unext2_jointtrained_ipsc.sh new file mode 100755 index 000000000..4bed1e2b1 --- /dev/null +++ b/applications/dynacell/configs/evaluations/unext2/run_eval_unext2_jointtrained_ipsc.sh @@ -0,0 +1,31 @@ +#!/usr/bin/env bash +# UNeXt2 (fcmae_vscyto3d_scratch) joint-trained (iPSC + A549 mantis) — +# evaluate on iPSC test set (membrane). 
Companion to the existing +# joint membrane evals (eval_{fnet3d,vscyto3d,celldiff}_joint_membrane). + +set -euo pipefail +ml uv +source ".envrc" + +PRED_ROOT=/hpc/projects/virtual_staining/training/dynacell/ipsc/predictions +GT_ROOT=/hpc/projects/virtual_staining/training/dynacell/ipsc/dataset_v4/test_cropped +OUT_ROOT=/hpc/projects/virtual_staining/training/dynacell/ipsc/joint_evaluations + +IPSC_SPACING="[0.29,0.108,0.108]" +DYNACLR_CKPT='/hpc/projects/organelle_phenotyping/models/SEC61_TOMM20_G3BP1_Sensor/time_interval/dynaclr_gfp_rfp_Ph/organelle_sensor_phase_maxproj_ver3_150epochs/saved_checkpoints/epoch=104-step=53760.ckpt' + +mkdir -p "${OUT_ROOT}" + +echo ">>> unext2 joint membrane (iPSC)" +uv run dynacell evaluate \ + target_name=membrane \ + io.pred_path="${PRED_ROOT}/memb_fcmae_vscyto3d_scratch_jointtrained.zarr" \ + io.pred_channel_name=Membrane_prediction \ + io.gt_path="${GT_ROOT}/cell.zarr" \ + io.gt_channel_name=Membrane \ + io.cell_segmentation_path="${GT_ROOT}/cell_segmented_cleaned.zarr" \ + pixel_metrics.spacing="${IPSC_SPACING}" \ + save.save_dir="${OUT_ROOT}/eval_unext2_joint_membrane" \ + compute_feature_metrics=true \ + "feature_extractor.dynaclr.checkpoint='${DYNACLR_CKPT}'" \ + force_recompute.all=true diff --git a/applications/dynacell/configs/evaluations/vscyto3d/run_eval_mix_trained_a549_pred_denv.sh b/applications/dynacell/configs/evaluations/vscyto3d/run_eval_mix_trained_a549_pred_denv.sh new file mode 100644 index 000000000..97b849ec5 --- /dev/null +++ b/applications/dynacell/configs/evaluations/vscyto3d/run_eval_mix_trained_a549_pred_denv.sh @@ -0,0 +1,21 @@ +#!/usr/bin/env bash +# VSCyto3D joint (iPSC+A549) model — membrane prediction on A549 DENV test set. 
+ +set -euo pipefail +ml uv +source ".envrc" + +DYNACLR_CKPT='/hpc/projects/organelle_phenotyping/models/SEC61_TOMM20_G3BP1_Sensor/time_interval/dynaclr_gfp_rfp_Ph/organelle_sensor_phase_maxproj_ver3_150epochs/saved_checkpoints/epoch=104-step=53760.ckpt' + +uv run dynacell evaluate \ + target_name=membrane \ + io.pred_path=/hpc/projects/virtual_staining/training/dynacell/a549/joint_predictions/memb_fcmae_vscyto3d_pretrained_jointtrained_denv.zarr \ + io.pred_channel_name=Membrane_prediction \ + io.gt_path=/hpc/projects/virtual_staining/training/dynacell/a549/mantis_v1/test/CAAX_DENV.ozx \ + io.gt_channel_name=Membrane \ + io.cell_segmentation_path=/hpc/projects/virtual_staining/training/dynacell/a549/mantis_v1/test/CAAX_DENV_seg_cleaned.zarr \ + pixel_metrics.spacing=[0.174,0.1494,0.1494] \ + save.save_dir=/hpc/projects/virtual_staining/training/dynacell/a549/joint_evaluations/eval_vscyto3d_joint_membrane_denv \ + compute_feature_metrics=true \ + "feature_extractor.dynaclr.checkpoint='${DYNACLR_CKPT}'" \ + force_recompute.all=true diff --git a/applications/dynacell/configs/evaluations/vscyto3d/run_eval_mix_trained_a549_pred_mock.sh b/applications/dynacell/configs/evaluations/vscyto3d/run_eval_mix_trained_a549_pred_mock.sh new file mode 100644 index 000000000..440f6389f --- /dev/null +++ b/applications/dynacell/configs/evaluations/vscyto3d/run_eval_mix_trained_a549_pred_mock.sh @@ -0,0 +1,21 @@ +#!/usr/bin/env bash +# VSCyto3D joint (iPSC+A549) model — membrane prediction on A549 mock test set. 
+ +set -euo pipefail +ml uv +source ".envrc" + +DYNACLR_CKPT='/hpc/projects/organelle_phenotyping/models/SEC61_TOMM20_G3BP1_Sensor/time_interval/dynaclr_gfp_rfp_Ph/organelle_sensor_phase_maxproj_ver3_150epochs/saved_checkpoints/epoch=104-step=53760.ckpt' + +uv run dynacell evaluate \ + target_name=membrane \ + io.pred_path=/hpc/projects/virtual_staining/training/dynacell/a549/joint_predictions/memb_fcmae_vscyto3d_pretrained_jointtrained_mock.zarr \ + io.pred_channel_name=Membrane_prediction \ + io.gt_path=/hpc/projects/virtual_staining/training/dynacell/a549/mantis_v1/test/CAAX_mock.ozx \ + io.gt_channel_name=Membrane \ + io.cell_segmentation_path=/hpc/projects/virtual_staining/training/dynacell/a549/mantis_v1/test/CAAX_mock_seg_cleaned.zarr \ + pixel_metrics.spacing=[0.174,0.1494,0.1494] \ + save.save_dir=/hpc/projects/virtual_staining/training/dynacell/a549/joint_evaluations/eval_vscyto3d_joint_membrane_mock \ + compute_feature_metrics=true \ + "feature_extractor.dynaclr.checkpoint='${DYNACLR_CKPT}'" \ + force_recompute.all=true diff --git a/applications/dynacell/configs/evaluations/vscyto3d/run_eval_mix_trained_a549_pred_zikv.sh b/applications/dynacell/configs/evaluations/vscyto3d/run_eval_mix_trained_a549_pred_zikv.sh new file mode 100644 index 000000000..974159ef8 --- /dev/null +++ b/applications/dynacell/configs/evaluations/vscyto3d/run_eval_mix_trained_a549_pred_zikv.sh @@ -0,0 +1,21 @@ +#!/usr/bin/env bash +# VSCyto3D joint (iPSC+A549) model — membrane prediction on A549 ZIKV test set. 
+ +set -euo pipefail +ml uv +source ".envrc" + +DYNACLR_CKPT='/hpc/projects/organelle_phenotyping/models/SEC61_TOMM20_G3BP1_Sensor/time_interval/dynaclr_gfp_rfp_Ph/organelle_sensor_phase_maxproj_ver3_150epochs/saved_checkpoints/epoch=104-step=53760.ckpt' + +uv run dynacell evaluate \ + target_name=membrane \ + io.pred_path=/hpc/projects/virtual_staining/training/dynacell/a549/joint_predictions/memb_fcmae_vscyto3d_pretrained_jointtrained_zikv.zarr \ + io.pred_channel_name=Membrane_prediction \ + io.gt_path=/hpc/projects/virtual_staining/training/dynacell/a549/mantis_v1/test/CAAX_ZIKV.ozx \ + io.gt_channel_name=Membrane \ + io.cell_segmentation_path=/hpc/projects/virtual_staining/training/dynacell/a549/mantis_v1/test/CAAX_ZIKV_seg_cleaned.zarr \ + pixel_metrics.spacing=[0.174,0.1494,0.1494] \ + save.save_dir=/hpc/projects/virtual_staining/training/dynacell/a549/joint_evaluations/eval_vscyto3d_joint_membrane_zikv \ + compute_feature_metrics=true \ + "feature_extractor.dynaclr.checkpoint='${DYNACLR_CKPT}'" \ + force_recompute.all=true diff --git a/applications/dynacell/configs/evaluations/vscyto3d/run_eval_mix_trained_ipsc_pred.sh b/applications/dynacell/configs/evaluations/vscyto3d/run_eval_mix_trained_ipsc_pred.sh new file mode 100644 index 000000000..02adfee78 --- /dev/null +++ b/applications/dynacell/configs/evaluations/vscyto3d/run_eval_mix_trained_ipsc_pred.sh @@ -0,0 +1,21 @@ +#!/usr/bin/env bash +# VSCyto3D joint (iPSC+A549) model — membrane prediction on iPSC test set. 
+ +set -euo pipefail +ml uv +source ".envrc" + +DYNACLR_CKPT='/hpc/projects/organelle_phenotyping/models/SEC61_TOMM20_G3BP1_Sensor/time_interval/dynaclr_gfp_rfp_Ph/organelle_sensor_phase_maxproj_ver3_150epochs/saved_checkpoints/epoch=104-step=53760.ckpt' + +uv run dynacell evaluate \ + target_name=membrane \ + io.pred_path=/hpc/projects/virtual_staining/training/dynacell/ipsc/joint_predictions/memb_fcmae_vscyto3d_pretrained_jointtrained.zarr \ + io.pred_channel_name=Membrane_prediction \ + io.gt_path=/hpc/projects/virtual_staining/training/dynacell/ipsc/dataset_v4/test_cropped/cell.zarr \ + io.gt_channel_name=Membrane \ + io.cell_segmentation_path=/hpc/projects/virtual_staining/training/dynacell/ipsc/dataset_v4/test_cropped/cell_segmented_cleaned.zarr \ + pixel_metrics.spacing=[0.29,0.108,0.108] \ + save.save_dir=/hpc/projects/virtual_staining/training/dynacell/ipsc/joint_evaluations/eval_vscyto3d_joint_membrane \ + compute_feature_metrics=true \ + "feature_extractor.dynaclr.checkpoint='${DYNACLR_CKPT}'" \ + force_recompute.all=true diff --git a/applications/dynacell/configs/evaluations/vscyto3d/run_eval_vscyto3d.sh b/applications/dynacell/configs/evaluations/vscyto3d/run_eval_vscyto3d.sh new file mode 100644 index 000000000..afd40f3b9 --- /dev/null +++ b/applications/dynacell/configs/evaluations/vscyto3d/run_eval_vscyto3d.sh @@ -0,0 +1,63 @@ +#!/usr/bin/env bash +# VSCyto3D (fcmae_vscyto3d_pretrained) — evaluate on iPSC test set (ER, membrane, mitochondria, nucleus). + +set -euo pipefail +ml uv + +source ".envrc" + +# VSCyto3D — ER (SEC61B) +uv run dynacell evaluate \ + target_name=er \ + io.pred_path=/hpc/projects/virtual_staining/training/dynacell/ipsc/predictions/sec61b_fcmae_vscyto3d_pretrained.zarr \ + io.pred_channel_name=Structure_prediction \ + io.gt_path=/hpc/projects/virtual_staining/training/dynacell/ipsc/dataset_v4/test_cropped/SEC61B.zarr \ + io.gt_channel_name=Structure \ + io.cell_segmentation_path=/hpc/projects/virtual_staining/training/dynacell/ipsc/dataset_v4/test_cropped/SEC61B_segmented_cleaned.zarr \ + pixel_metrics.spacing=[0.29,0.108,0.108] \ + 
save.save_dir=/hpc/projects/virtual_staining/training/dynacell/ipsc/evaluations/eval_vscyto3d_sec61b \ + compute_feature_metrics=true \ + "feature_extractor.dynaclr.checkpoint='/hpc/projects/organelle_phenotyping/models/SEC61_TOMM20_G3BP1_Sensor/time_interval/dynaclr_gfp_rfp_Ph/organelle_sensor_phase_maxproj_ver3_150epochs/saved_checkpoints/epoch=104-step=53760.ckpt'" \ + force_recompute.all=true + +# VSCyto3D — Membrane +uv run dynacell evaluate \ + target_name=membrane \ + io.pred_path=/hpc/projects/virtual_staining/training/dynacell/ipsc/predictions/memb_fcmae_vscyto3d_pretrained.zarr \ + io.pred_channel_name=Membrane_prediction \ + io.gt_path=/hpc/projects/virtual_staining/training/dynacell/ipsc/dataset_v4/test_cropped/cell.zarr \ + io.gt_channel_name=Membrane \ + io.cell_segmentation_path=/hpc/projects/virtual_staining/training/dynacell/ipsc/dataset_v4/test_cropped/cell_segmented_cleaned.zarr \ + pixel_metrics.spacing=[0.29,0.108,0.108] \ + save.save_dir=/hpc/projects/virtual_staining/training/dynacell/ipsc/evaluations/eval_vscyto3d_membrane \ + compute_feature_metrics=true \ + "feature_extractor.dynaclr.checkpoint='/hpc/projects/organelle_phenotyping/models/SEC61_TOMM20_G3BP1_Sensor/time_interval/dynaclr_gfp_rfp_Ph/organelle_sensor_phase_maxproj_ver3_150epochs/saved_checkpoints/epoch=104-step=53760.ckpt'" \ + force_recompute.all=true + +# VSCyto3D — Mitochondria (TOMM20) +uv run dynacell evaluate \ + target_name=mitochondria \ + io.pred_path=/hpc/projects/virtual_staining/training/dynacell/ipsc/predictions/tomm20_fcmae_vscyto3d_pretrained.zarr \ + io.pred_channel_name=Structure_prediction \ + io.gt_path=/hpc/projects/virtual_staining/training/dynacell/ipsc/dataset_v4/test_cropped/TOMM20.zarr \ + io.gt_channel_name=Structure \ + io.cell_segmentation_path=/hpc/projects/virtual_staining/training/dynacell/ipsc/dataset_v4/test_cropped/TOMM20_segmented_cleaned.zarr \ + pixel_metrics.spacing=[0.29,0.108,0.108] \ + 
save.save_dir=/hpc/projects/virtual_staining/training/dynacell/ipsc/evaluations/eval_vscyto3d_tomm20 \ + compute_feature_metrics=true \ + "feature_extractor.dynaclr.checkpoint='/hpc/projects/organelle_phenotyping/models/SEC61_TOMM20_G3BP1_Sensor/time_interval/dynaclr_gfp_rfp_Ph/organelle_sensor_phase_maxproj_ver3_150epochs/saved_checkpoints/epoch=104-step=53760.ckpt'" \ + force_recompute.all=true + +# VSCyto3D — Nucleus +uv run dynacell evaluate \ + target_name=nucleus \ + io.pred_path=/hpc/projects/virtual_staining/training/dynacell/ipsc/predictions/nucl_fcmae_vscyto3d_pretrained.zarr \ + io.pred_channel_name=Nuclei_prediction \ + io.gt_path=/hpc/projects/virtual_staining/training/dynacell/ipsc/dataset_v4/test_cropped/cell.zarr \ + io.gt_channel_name=Nuclei \ + io.cell_segmentation_path=/hpc/projects/virtual_staining/training/dynacell/ipsc/dataset_v4/test_cropped/cell_segmented_cleaned.zarr \ + pixel_metrics.spacing=[0.29,0.108,0.108] \ + save.save_dir=/hpc/projects/virtual_staining/training/dynacell/ipsc/evaluations/eval_vscyto3d_nucleus \ + compute_feature_metrics=true \ + "feature_extractor.dynaclr.checkpoint='/hpc/projects/organelle_phenotyping/models/SEC61_TOMM20_G3BP1_Sensor/time_interval/dynaclr_gfp_rfp_Ph/organelle_sensor_phase_maxproj_ver3_150epochs/saved_checkpoints/epoch=104-step=53760.ckpt'" \ + force_recompute.all=true diff --git a/applications/dynacell/configs/evaluations/vscyto3d/run_eval_vscyto3d_a549.sh b/applications/dynacell/configs/evaluations/vscyto3d/run_eval_vscyto3d_a549.sh new file mode 100644 index 000000000..7edbdcce2 --- /dev/null +++ b/applications/dynacell/configs/evaluations/vscyto3d/run_eval_vscyto3d_a549.sh @@ -0,0 +1,52 @@ +#!/usr/bin/env bash +# A549 VSCyto3D (fcmae_vscyto3d_pretrained) evaluation — 4 organelles × 3 infection conditions. 
+ +set -euo pipefail +ml uv +source ".envrc" + +PRED_ROOT=/hpc/projects/virtual_staining/training/dynacell/a549/predictions +GT_ROOT=/hpc/projects/virtual_staining/training/dynacell/a549/mantis_v1 +OUT_ROOT=/hpc/projects/virtual_staining/training/dynacell/a549/evaluations_with_embeddings + +V1_SPACING="[0.174,0.1494,0.1494]" +DYNACLR_CKPT='/hpc/projects/organelle_phenotyping/models/SEC61_TOMM20_G3BP1_Sensor/time_interval/dynaclr_gfp_rfp_Ph/organelle_sensor_phase_maxproj_ver3_150epochs/saved_checkpoints/epoch=104-step=53760.ckpt' + +run_eval () { + local target=$1 infection=$2 gt_basename=$3 \ + pred_zarr=$4 pred_chan=$5 gt_chan=$6 spacing=$7 + local save_dir="${OUT_ROOT}/eval_vscyto3d_${target}_${infection}" + echo ">>> vscyto3d ${target} ${infection}" + uv run dynacell evaluate \ + target_name="${target}" \ + io.pred_path="${PRED_ROOT}/${pred_zarr}" \ + io.pred_channel_name="${pred_chan}" \ + io.gt_path="${GT_ROOT}/test/${gt_basename}.ozx" \ + io.gt_channel_name="${gt_chan}" \ + io.cell_segmentation_path="${GT_ROOT}/test/${gt_basename}_seg_cleaned.zarr" \ + pixel_metrics.spacing="${spacing}" \ + save.save_dir="${save_dir}" \ + compute_feature_metrics=true \ + "feature_extractor.dynaclr.checkpoint='${DYNACLR_CKPT}'" \ + force_recompute.all=true +} + +# SEC61B (ER) +run_eval er mock SEC61B_mock sec61b_fcmae_vscyto3d_pretrained__sec61b_mock.zarr Structure_prediction Structure "${V1_SPACING}" +run_eval er denv SEC61B_DENV sec61b_fcmae_vscyto3d_pretrained__sec61b_denv.zarr Structure_prediction Structure "${V1_SPACING}" +run_eval er zikv SEC61B_ZIKV sec61b_fcmae_vscyto3d_pretrained__sec61b_zikv.zarr Structure_prediction Structure "${V1_SPACING}" + +# CAAX (membrane) +# run_eval membrane mock CAAX_mock memb_fcmae_vscyto3d_pretrained_mock.zarr Membrane_prediction Membrane "${V1_SPACING}" +# run_eval membrane denv CAAX_DENV memb_fcmae_vscyto3d_pretrained_denv.zarr Membrane_prediction Membrane "${V1_SPACING}" +# run_eval membrane zikv CAAX_ZIKV 
memb_fcmae_vscyto3d_pretrained_zikv.zarr Membrane_prediction Membrane "${V1_SPACING}" + +# H2B (nucleus) +# run_eval nucleus mock H2B_mock nucl_fcmae_vscyto3d_pretrained_mock.zarr Nuclei_prediction Nuclei "${V1_SPACING}" +# run_eval nucleus denv H2B_DENV nucl_fcmae_vscyto3d_pretrained_denv.zarr Nuclei_prediction Nuclei "${V1_SPACING}" +# run_eval nucleus zikv H2B_ZIKV nucl_fcmae_vscyto3d_pretrained_zikv.zarr Nuclei_prediction Nuclei "${V1_SPACING}" + +# TOMM20 (mitochondria) +run_eval mitochondria mock TOMM20_mock tomm20_fcmae_vscyto3d_pretrained__tomm20_mock.zarr Structure_prediction Structure "${V1_SPACING}" +run_eval mitochondria denv TOMM20_DENV tomm20_fcmae_vscyto3d_pretrained__tomm20_denv.zarr Structure_prediction Structure "${V1_SPACING}" +run_eval mitochondria zikv TOMM20_ZIKV tomm20_fcmae_vscyto3d_pretrained__tomm20_zikv.zarr Structure_prediction Structure "${V1_SPACING}" diff --git a/applications/dynacell/configs/examples/celldiff/fit.yml b/applications/dynacell/configs/examples/celldiff/fit.yml new file mode 100644 index 000000000..a4ce46588 --- /dev/null +++ b/applications/dynacell/configs/examples/celldiff/fit.yml @@ -0,0 +1,34 @@ +# CellDiff flow-matching: fit from scratch. +# Usage: cd applications/dynacell/configs/examples && uv run dynacell fit -c celldiff/fit.yml +base: + - ../../recipes/trainer/fit.yml + - ../../recipes/topology/ddp_4gpu.yml + - ../../recipes/data/hcs_phase_fluor_3d.yml + - ../../recipes/models/celldiff_fm.yml + +model: + init_args: + lr: 0.0002 + schedule: WarmupCosine + num_log_steps: 10 + +trainer: + precision: bf16-mixed + max_epochs: 200 + # Flow-matching has no validation loss, so checkpoint by epoch count.
+ callbacks: + - class_path: lightning.pytorch.callbacks.LearningRateMonitor + init_args: + logging_interval: step + - class_path: lightning.pytorch.callbacks.ModelCheckpoint + init_args: + every_n_epochs: 10 + save_top_k: -1 + save_last: true + +data: + init_args: + data_path: #TODO + z_window_size: 8 + batch_size: 4 + yx_patch_size: [512, 512] diff --git a/applications/dynacell/configs/examples/celldiff/predict.yml b/applications/dynacell/configs/examples/celldiff/predict.yml new file mode 100644 index 000000000..53c16a583 --- /dev/null +++ b/applications/dynacell/configs/examples/celldiff/predict.yml @@ -0,0 +1,22 @@ +# CellDiff flow-matching: predict from checkpoint. +# Usage: cd applications/dynacell/configs/examples && uv run dynacell predict -c celldiff/predict.yml +base: + - ../../recipes/trainer/predict.yml + - ../../recipes/topology/single_gpu.yml + - ../../recipes/data/hcs_phase_fluor_3d.yml + - ../../recipes/models/celldiff_fm.yml + +model: + init_args: + num_generate_steps: 100 +# predict_method: generate + predict_method: iterative # denoise, generate, sliding_window (non-overlapping), or iterative (overlapping) + predict_overlap: [4, 256, 256] + ckpt_path: #TODO checkpoint path + +data: + init_args: + data_path: #TODO HCS OME-Zarr test data + z_window_size: 40 + batch_size: 1 + yx_patch_size: [512, 512] diff --git a/applications/dynacell/examples/configs/fnet3d/fit.yml b/applications/dynacell/configs/examples/fnet3d/fit.yml similarity index 53% rename from applications/dynacell/examples/configs/fnet3d/fit.yml rename to applications/dynacell/configs/examples/fnet3d/fit.yml index 3a74fea38..74e536750 100644 --- a/applications/dynacell/examples/configs/fnet3d/fit.yml +++ b/applications/dynacell/configs/examples/fnet3d/fit.yml @@ -1,9 +1,10 @@ # FNet3D: supervised training (benchmark baseline). 
-# Usage: cd applications/dynacell/examples/configs && uv run dynacell fit -c fnet3d/fit.yml +# Usage: cd applications/dynacell/configs/examples && uv run dynacell fit -c fnet3d/fit.yml base: - - ../recipes/trainer/fit_4gpu.yml - - ../recipes/data/hcs_phase_fluor_3d.yml - - ../recipes/models/fnet3d.yml + - ../../recipes/trainer/fit.yml + - ../../recipes/topology/ddp_4gpu.yml + - ../../recipes/data/hcs_phase_fluor_3d.yml + - ../../recipes/models/fnet3d.yml model: init_args: @@ -11,6 +12,8 @@ model: schedule: Constant trainer: + precision: 16-mixed + max_epochs: 200 max_steps: 50000 data: diff --git a/applications/dynacell/configs/examples/fnet3d/predict.yml b/applications/dynacell/configs/examples/fnet3d/predict.yml new file mode 100644 index 000000000..7b90b1f1c --- /dev/null +++ b/applications/dynacell/configs/examples/fnet3d/predict.yml @@ -0,0 +1,18 @@ +# FNet3D: predict from checkpoint. +# Usage: cd applications/dynacell/configs/examples && uv run dynacell predict -c fnet3d/predict.yml +base: + - ../../recipes/trainer/predict.yml + - ../../recipes/topology/single_gpu.yml + - ../../recipes/data/hcs_phase_fluor_3d.yml + - ../../recipes/models/fnet3d.yml + +model: + init_args: + ckpt_path: #TODO checkpoint path + +data: + init_args: + data_path: #TODO HCS OME-Zarr data + z_window_size: 32 + batch_size: 4 + yx_patch_size: [64, 64] diff --git a/applications/dynacell/examples/configs/unetvit3d/fit.yml b/applications/dynacell/configs/examples/unetvit3d/fit.yml similarity index 54% rename from applications/dynacell/examples/configs/unetvit3d/fit.yml rename to applications/dynacell/configs/examples/unetvit3d/fit.yml index cd2eb6d61..742606466 100644 --- a/applications/dynacell/examples/configs/unetvit3d/fit.yml +++ b/applications/dynacell/configs/examples/unetvit3d/fit.yml @@ -1,9 +1,10 @@ # UNetViT3D: supervised training. 
-# Usage: cd applications/dynacell/examples/configs && uv run dynacell fit -c unetvit3d/fit.yml +# Usage: cd applications/dynacell/configs/examples && uv run dynacell fit -c unetvit3d/fit.yml base: - - ../recipes/trainer/fit_4gpu.yml - - ../recipes/data/hcs_phase_fluor_3d.yml - - ../recipes/models/unetvit3d.yml + - ../../recipes/trainer/fit.yml + - ../../recipes/topology/ddp_4gpu.yml + - ../../recipes/data/hcs_phase_fluor_3d.yml + - ../../recipes/models/unetvit3d.yml model: init_args: @@ -11,6 +12,7 @@ model: schedule: WarmupCosine trainer: + precision: 16-mixed max_epochs: 200 data: diff --git a/applications/dynacell/examples/configs/unetvit3d/predict.yml b/applications/dynacell/configs/examples/unetvit3d/predict.yml similarity index 50% rename from applications/dynacell/examples/configs/unetvit3d/predict.yml rename to applications/dynacell/configs/examples/unetvit3d/predict.yml index 9f6c7aac6..9e0c179f9 100644 --- a/applications/dynacell/examples/configs/unetvit3d/predict.yml +++ b/applications/dynacell/configs/examples/unetvit3d/predict.yml @@ -1,10 +1,15 @@ # UNetViT3D: predict from checkpoint. # yx_patch_size and z_window_size must match the model's input_spatial_size. 
-# Usage: cd applications/dynacell/examples/configs && uv run dynacell predict -c unetvit3d/predict.yml +# Usage: cd applications/dynacell/configs/examples && uv run dynacell predict -c unetvit3d/predict.yml base: - - ../recipes/trainer/predict_gpu.yml - - ../recipes/data/hcs_phase_fluor_3d.yml - - ../recipes/models/unetvit3d.yml + - ../../recipes/trainer/predict.yml + - ../../recipes/topology/single_gpu.yml + - ../../recipes/data/hcs_phase_fluor_3d.yml + - ../../recipes/models/unetvit3d.yml + +model: + init_args: + ckpt_path: #TODO checkpoint path data: init_args: diff --git a/applications/dynacell/configs/examples/unext2/fit.yml b/applications/dynacell/configs/examples/unext2/fit.yml new file mode 100644 index 000000000..d066abd6c --- /dev/null +++ b/applications/dynacell/configs/examples/unext2/fit.yml @@ -0,0 +1,23 @@ +# UNeXt2 (VSCyto3D): supervised training. +# Usage: cd applications/dynacell/configs/examples && uv run dynacell fit -c unext2/fit.yml +base: + - ../../recipes/trainer/fit.yml + - ../../recipes/topology/ddp_4gpu.yml + - ../../recipes/data/hcs_phase_fluor_3d.yml + - ../../recipes/models/unext2_3d.yml + +model: + init_args: + lr: 0.0002 + schedule: WarmupCosine + +trainer: + precision: 16-mixed + max_epochs: 200 + +data: + init_args: + data_path: #TODO HCS OME-Zarr data + z_window_size: 15 + batch_size: 8 + yx_patch_size: [256, 256] diff --git a/applications/dynacell/configs/examples/unext2/predict.yml b/applications/dynacell/configs/examples/unext2/predict.yml new file mode 100644 index 000000000..c2a7d38c1 --- /dev/null +++ b/applications/dynacell/configs/examples/unext2/predict.yml @@ -0,0 +1,18 @@ +# UNeXt2 (VSCyto3D): predict from checkpoint. 
+# Usage: cd applications/dynacell/configs/examples && uv run dynacell predict -c unext2/predict.yml +base: + - ../../recipes/trainer/predict.yml + - ../../recipes/topology/single_gpu.yml + - ../../recipes/data/hcs_phase_fluor_3d.yml + - ../../recipes/models/unext2_3d.yml + +model: + init_args: + ckpt_path: #TODO checkpoint path + +data: + init_args: + data_path: #TODO HCS OME-Zarr test data + z_window_size: 15 + batch_size: 1 + yx_patch_size: [256, 256] diff --git a/applications/dynacell/examples/configs/recipes/data/hcs_phase_fluor_3d.yml b/applications/dynacell/configs/recipes/data/hcs_phase_fluor_3d.yml similarity index 95% rename from applications/dynacell/examples/configs/recipes/data/hcs_phase_fluor_3d.yml rename to applications/dynacell/configs/recipes/data/hcs_phase_fluor_3d.yml index 1adfddfa5..45f16c829 100644 --- a/applications/dynacell/examples/configs/recipes/data/hcs_phase_fluor_3d.yml +++ b/applications/dynacell/configs/recipes/data/hcs_phase_fluor_3d.yml @@ -10,7 +10,7 @@ data: batch_size: 16 num_workers: 8 yx_patch_size: [512, 512] - preload: true + mmap_preload: false normalizations: - class_path: viscy_transforms.NormalizeSampled init_args: diff --git a/applications/dynacell/examples/configs/recipes/models/celldiff_fm.yml b/applications/dynacell/configs/recipes/models/celldiff_fm.yml similarity index 100% rename from applications/dynacell/examples/configs/recipes/models/celldiff_fm.yml rename to applications/dynacell/configs/recipes/models/celldiff_fm.yml diff --git a/applications/dynacell/examples/configs/recipes/models/fnet3d.yml b/applications/dynacell/configs/recipes/models/fnet3d.yml similarity index 100% rename from applications/dynacell/examples/configs/recipes/models/fnet3d.yml rename to applications/dynacell/configs/recipes/models/fnet3d.yml diff --git a/applications/dynacell/examples/configs/recipes/models/fnet3d_z8.yml b/applications/dynacell/configs/recipes/models/fnet3d_z8.yml similarity index 100% rename from 
applications/dynacell/examples/configs/recipes/models/fnet3d_z8.yml rename to applications/dynacell/configs/recipes/models/fnet3d_z8.yml diff --git a/applications/dynacell/examples/configs/recipes/models/unetvit3d.yml b/applications/dynacell/configs/recipes/models/unetvit3d.yml similarity index 72% rename from applications/dynacell/examples/configs/recipes/models/unetvit3d.yml rename to applications/dynacell/configs/recipes/models/unetvit3d.yml index 18b01a23b..bf0242c21 100644 --- a/applications/dynacell/examples/configs/recipes/models/unetvit3d.yml +++ b/applications/dynacell/configs/recipes/models/unetvit3d.yml @@ -7,12 +7,10 @@ model: input_spatial_size: [8, 512, 512] in_channels: 1 out_channels: 1 - dims: [32, 64, 128] - num_res_block: [2, 2] + dims: [64, 128, 256, 256] + num_res_block: [2, 2, 2] hidden_size: 512 num_heads: 8 dim_head: 64 - dropout: 0.0 - final_dropout: 0.0 - num_hidden_layers: 2 + num_hidden_layers: 8 patch_size: 4 diff --git a/applications/dynacell/examples/configs/recipes/models/unext2_3d.yml b/applications/dynacell/configs/recipes/models/unext2_3d.yml similarity index 100% rename from applications/dynacell/examples/configs/recipes/models/unext2_3d.yml rename to applications/dynacell/configs/recipes/models/unext2_3d.yml diff --git a/applications/dynacell/examples/configs/recipes/models/unext2_3d_z8.yml b/applications/dynacell/configs/recipes/models/unext2_3d_z8.yml similarity index 100% rename from applications/dynacell/examples/configs/recipes/models/unext2_3d_z8.yml rename to applications/dynacell/configs/recipes/models/unext2_3d_z8.yml diff --git a/applications/dynacell/examples/configs/recipes/modes/spotlight.yml b/applications/dynacell/configs/recipes/modes/spotlight.yml similarity index 100% rename from applications/dynacell/examples/configs/recipes/modes/spotlight.yml rename to applications/dynacell/configs/recipes/modes/spotlight.yml diff --git a/applications/dynacell/configs/recipes/topology/ddp_4gpu.yml 
b/applications/dynacell/configs/recipes/topology/ddp_4gpu.yml new file mode 100644 index 000000000..6ecdb4ad8 --- /dev/null +++ b/applications/dynacell/configs/recipes/topology/ddp_4gpu.yml @@ -0,0 +1,6 @@ +# Topology recipe: 4-GPU DDP training on a single node. +trainer: + accelerator: gpu + strategy: ddp + devices: 4 + num_nodes: 1 diff --git a/applications/dynacell/configs/recipes/topology/single_gpu.yml b/applications/dynacell/configs/recipes/topology/single_gpu.yml new file mode 100644 index 000000000..a05fa451a --- /dev/null +++ b/applications/dynacell/configs/recipes/topology/single_gpu.yml @@ -0,0 +1,7 @@ +# Single-GPU training. strategy=auto lets Lightning pick single_device; +# plain ddp at devices=1 would add pointless process-group overhead. +trainer: + accelerator: gpu + strategy: auto + devices: 1 + num_nodes: 1 diff --git a/applications/dynacell/examples/configs/recipes/trainer/fit_1gpu.yml b/applications/dynacell/configs/recipes/trainer/fit.yml similarity index 59% rename from applications/dynacell/examples/configs/recipes/trainer/fit_1gpu.yml rename to applications/dynacell/configs/recipes/trainer/fit.yml index c1bd01a47..25c4fa085 100644 --- a/applications/dynacell/examples/configs/recipes/trainer/fit_1gpu.yml +++ b/applications/dynacell/configs/recipes/trainer/fit.yml @@ -1,15 +1,11 @@ -# Trainer recipe: 1-GPU training with WandB logging and checkpointing. -# W&B convention: -# - run name: YYYYMMDD-HHMMSS_ -# - group: VISCY_WANDB_GROUP, else VISCY_WANDB_LAUNCH, else the base name +# Topology (accelerator / devices / strategy / num_nodes) lives in +# recipes/topology/*.yml. Precision lives in model overlays. +# max_epochs and max_steps also live in model overlays or leaves. 
seed_everything: 42 trainer: - accelerator: gpu - strategy: ddp - devices: 1 - num_nodes: 1 - precision: bf16-mixed log_every_n_steps: 10 + enable_checkpointing: true + inference_mode: true logger: class_path: lightning.pytorch.loggers.WandbLogger init_args: @@ -22,7 +18,5 @@ trainer: init_args: monitor: loss/validate every_n_epochs: 1 - save_top_k: 4 + save_top_k: 5 save_last: true - enable_checkpointing: true - inference_mode: true diff --git a/applications/cytoland/examples/configs/recipes/trainer/predict_gpu.yml b/applications/dynacell/configs/recipes/trainer/predict.yml similarity index 62% rename from applications/cytoland/examples/configs/recipes/trainer/predict_gpu.yml rename to applications/dynacell/configs/recipes/trainer/predict.yml index a8baf2f63..d6a6bd349 100644 --- a/applications/cytoland/examples/configs/recipes/trainer/predict_gpu.yml +++ b/applications/dynacell/configs/recipes/trainer/predict.yml @@ -1,11 +1,10 @@ -# Trainer recipe: single-GPU prediction. +# Unified predict trainer recipe. +# Topology lives in recipes/topology/single_gpu.yml; prediction is always +# single-GPU here. trainer: - accelerator: gpu - devices: 1 precision: 32-true callbacks: - class_path: viscy_utils.callbacks.prediction_writer.HCSPredictionWriter init_args: output_store: #TODO output zarr path return_predictions: false -ckpt_path: #TODO checkpoint path diff --git a/applications/dynacell/examples/configs/celldiff/fit.yml b/applications/dynacell/examples/configs/celldiff/fit.yml deleted file mode 100644 index a82977835..000000000 --- a/applications/dynacell/examples/configs/celldiff/fit.yml +++ /dev/null @@ -1,22 +0,0 @@ -# CellDiff flow-matching: fit from scratch. 
-# Usage: cd applications/dynacell/examples/configs && uv run dynacell fit -c celldiff/fit.yml -base: - - ../recipes/trainer/fit_fm_4gpu.yml - - ../recipes/data/hcs_phase_fluor_3d.yml - - ../recipes/models/celldiff_fm.yml - -model: - init_args: - lr: 0.0002 - schedule: WarmupCosine - num_log_steps: 10 - -trainer: - max_epochs: 200 - -data: - init_args: - data_path: #TODO - z_window_size: 8 - batch_size: 4 - yx_patch_size: [512, 512] diff --git a/applications/dynacell/examples/configs/celldiff/predict.yml b/applications/dynacell/examples/configs/celldiff/predict.yml deleted file mode 100644 index 7a5e94335..000000000 --- a/applications/dynacell/examples/configs/celldiff/predict.yml +++ /dev/null @@ -1,18 +0,0 @@ -# CellDiff flow-matching: predict from checkpoint. -# Usage: cd applications/dynacell/examples/configs && uv run dynacell predict -c celldiff/predict.yml --ckpt_path=/path/to/checkpoint.ckpt -base: - - ../recipes/trainer/predict_gpu.yml - - ../recipes/data/hcs_phase_fluor_3d.yml - - ../recipes/models/celldiff_fm.yml - -model: - init_args: - num_generate_steps: 100 - predict_method: generate - -data: - init_args: - data_path: #TODO - z_window_size: 8 - batch_size: 1 - yx_patch_size: [512, 512] diff --git a/applications/dynacell/examples/configs/fnet3d/predict.yml b/applications/dynacell/examples/configs/fnet3d/predict.yml deleted file mode 100644 index 31974c5af..000000000 --- a/applications/dynacell/examples/configs/fnet3d/predict.yml +++ /dev/null @@ -1,13 +0,0 @@ -# FNet3D: predict from checkpoint. 
-# Usage: cd applications/dynacell/examples/configs && uv run dynacell predict -c fnet3d/predict.yml -base: - - ../recipes/trainer/predict_gpu.yml - - ../recipes/data/hcs_phase_fluor_3d.yml - - ../recipes/models/fnet3d.yml - -data: - init_args: - data_path: #TODO HCS OME-Zarr data - z_window_size: 32 - batch_size: 4 - yx_patch_size: [64, 64] diff --git a/applications/dynacell/examples/configs/recipes/trainer/fit_fm_4gpu.yml b/applications/dynacell/examples/configs/recipes/trainer/fit_fm_4gpu.yml deleted file mode 100644 index ce5da0068..000000000 --- a/applications/dynacell/examples/configs/recipes/trainer/fit_fm_4gpu.yml +++ /dev/null @@ -1,23 +0,0 @@ -# Trainer recipe: 4-GPU DDP training for flow-matching models. -# Flow-matching has no validation loss — checkpoint by epoch count. -seed_everything: 42 -trainer: - accelerator: gpu - strategy: ddp - devices: 4 - num_nodes: 1 - precision: bf16-mixed - callbacks: - - class_path: lightning.pytorch.callbacks.LearningRateMonitor - init_args: - logging_interval: step - - class_path: lightning.pytorch.callbacks.ModelCheckpoint - init_args: - every_n_epochs: 10 - save_top_k: -1 - save_last: true - fast_dev_run: false - max_epochs: 200 - log_every_n_steps: 10 - enable_checkpointing: true - inference_mode: true diff --git a/applications/dynacell/examples/configs/sec61b/fit_celldiff.yml b/applications/dynacell/examples/configs/sec61b/fit_celldiff.yml deleted file mode 100644 index 242d54b1c..000000000 --- a/applications/dynacell/examples/configs/sec61b/fit_celldiff.yml +++ /dev/null @@ -1,110 +0,0 @@ -# CellDiff flow-matching on AICS iPSC SEC61B (ER). -# Data pipeline aligned with VSCyto3D SEC61B config (same dataset, same -# augmentation strategy). Architecture: CELLDiffNet with ViT bottleneck, -# z=8, yx=512, Linear transport, velocity prediction. 
-# Usage: uv run python -m dynacell fit --config applications/dynacell/examples/configs/sec61b/fit_celldiff.yml -base: - - ../recipes/trainer/fit_fm_4gpu.yml - - ../recipes/models/celldiff_fm.yml - -model: - init_args: - lr: 0.0001 - schedule: WarmupCosine - num_log_steps: 10 - -trainer: - precision: bf16-mixed - max_epochs: 200 - logger: - init_args: - name: CELLDiff_iPSC_SEC61B - save_dir: /hpc/projects/comp.micro/virtual_staining/models/dynacell/ipsc/sec61b/celldiff - callbacks: - - class_path: lightning.pytorch.callbacks.LearningRateMonitor - init_args: - logging_interval: step - - class_path: lightning.pytorch.callbacks.ModelCheckpoint - init_args: - every_n_epochs: 1 - save_top_k: -1 - save_last: true - dirpath: /hpc/projects/comp.micro/virtual_staining/models/dynacell/ipsc/sec61b/celldiff/checkpoints - -data: - class_path: viscy_data.hcs.HCSDataModule - init_args: - data_path: /hpc/projects/virtual_staining/training/dynacell/ipsc/dataset_v4/train/SEC61B.zarr - source_channel: Phase3D - target_channel: Structure - split_ratio: 0.8 - z_window_size: 8 - batch_size: 4 - num_workers: 8 - yx_patch_size: [512, 512] - preload: true - persistent_workers: true - normalizations: - - class_path: viscy_transforms.NormalizeSampled - init_args: - keys: [Phase3D] - level: fov_statistics - subtrahend: median - divisor: iqr - - class_path: viscy_transforms.NormalizeSampled - init_args: - keys: [Structure] - level: fov_statistics - subtrahend: median - divisor: iqr - augmentations: - # CPU: 2 foreground-weighted patches per FOV (amortizes zarr read). - # batch_size=4 → DataLoader loads 2 FOVs, each yields 2 patches = 4 effective. - # Oversized crop in YX (768) leaves border for affine rotation artifacts. - - class_path: viscy_transforms.RandWeightedCropd - init_args: - keys: [Phase3D, Structure] - w_key: Structure - spatial_size: [8, 768, 768] - num_samples: 2 - gpu_augmentations: - # GPU: affine on oversized patch → center crop to final 8×512×512. 
- - class_path: viscy_transforms.BatchedRandAffined - init_args: - keys: [source, target] - prob: 0.8 - rotate_range: [3.14, 0, 0] - shear_range: [0.0, 0.05, 0.05] - scale_range: [[0.7, 1.3], [0.5, 1.5], [0.5, 1.5]] - - class_path: viscy_transforms.BatchedCenterSpatialCropd - init_args: - keys: [source, target] - roi_size: [8, 512, 512] - - class_path: viscy_transforms.BatchedRandAdjustContrastd - init_args: - keys: [source] - prob: 0.5 - gamma: [0.8, 1.2] - - class_path: viscy_transforms.BatchedRandScaleIntensityd - init_args: - keys: [source] - prob: 0.5 - factors: 0.5 - - class_path: viscy_transforms.BatchedRandGaussianNoised - init_args: - keys: [source] - prob: 0.5 - mean: 0.0 - std: 0.3 - - class_path: viscy_transforms.BatchedRandGaussianSmoothd - init_args: - keys: [source] - prob: 0.5 - sigma_x: [0.25, 0.75] - sigma_y: [0.25, 0.75] - sigma_z: [0.25, 0.75] - val_gpu_augmentations: - - class_path: viscy_transforms.BatchedDivisibleCropd - init_args: - keys: [source, target] - k: [1, 64, 64] diff --git a/applications/dynacell/examples/configs/sec61b/fit_fnet3d.yml b/applications/dynacell/examples/configs/sec61b/fit_fnet3d.yml deleted file mode 100644 index 0e103e64e..000000000 --- a/applications/dynacell/examples/configs/sec61b/fit_fnet3d.yml +++ /dev/null @@ -1,40 +0,0 @@ -# FNet3D on AICS iPSC SEC61B (ER) — dynacell benchmark. 
-# Usage: uv run python -m dynacell fit --config applications/dynacell/examples/configs/sec61b/fit_fnet3d.yml -# Batch related launches with: -# export VISCY_WANDB_LAUNCH=20260401-augfix-r1 -base: - - ../recipes/trainer/fit_1gpu.yml - - ../recipes/data/hcs_sec61b_3d.yml - - ../recipes/models/fnet3d_z8.yml - -model: - init_args: - loss_function: - class_path: viscy_utils.losses.MixedLoss - init_args: - l1_alpha: 0.5 - ms_dssim_alpha: 0.5 - lr: 0.001 - schedule: WarmupCosine - -trainer: - max_epochs: 100 - logger: - init_args: - name: FNet3D_iPSC_SEC61B - save_dir: /hpc/projects/comp.micro/virtual_staining/models/dynacell/ipsc/sec61b/fnet3d - callbacks: - - class_path: lightning.pytorch.callbacks.LearningRateMonitor - init_args: - logging_interval: step - - class_path: lightning.pytorch.callbacks.ModelCheckpoint - init_args: - monitor: loss/validate - every_n_epochs: 1 - save_top_k: 4 - save_last: true - dirpath: /hpc/projects/comp.micro/virtual_staining/models/dynacell/ipsc/sec61b/fnet3d/checkpoints - -data: - init_args: - batch_size: 64 diff --git a/applications/dynacell/examples/configs/sec61b/fit_fnet3d_paper.yml b/applications/dynacell/examples/configs/sec61b/fit_fnet3d_paper.yml deleted file mode 100644 index ab3d65c21..000000000 --- a/applications/dynacell/examples/configs/sec61b/fit_fnet3d_paper.yml +++ /dev/null @@ -1,89 +0,0 @@ -# FNet3D on AICS iPSC SEC61B (ER) using paper-native baseline settings on Dynacell data. -# Matches the pytorch_fnet baseline architecture and core training hyperparameters: -# depth=4, mult_chan=32, z_window_size=32, yx_patch_size=64, batch_size=48 -# (6 FOVs × 8 patches via num_samples=8), lr=1e-3, no scheduler, 50k steps, -# seed=0, single-GPU execution, plus the baseline's basic paired Y/X flip augmentation. 
-# Usage: uv run python -m dynacell fit --config applications/dynacell/examples/configs/sec61b/fit_fnet3d_paper.yml -seed_everything: 0 - -base: - - ../recipes/trainer/fit_1gpu.yml - - ../recipes/models/fnet3d.yml - -model: - init_args: - loss_function: - class_path: torch.nn.MSELoss - lr: 0.001 - schedule: Constant - -trainer: - precision: 32-true - max_steps: 50000 - logger: - init_args: - name: FNet3D_iPSC_SEC61B_paper - save_dir: /hpc/projects/comp.micro/virtual_staining/models/dynacell/ipsc/sec61b/fnet3d_paper - callbacks: - - class_path: lightning.pytorch.callbacks.LearningRateMonitor - init_args: - logging_interval: step - - class_path: lightning.pytorch.callbacks.ModelCheckpoint - init_args: - monitor: loss/validate - every_n_epochs: 1 - save_top_k: 4 - save_last: true - dirpath: /hpc/projects/comp.micro/virtual_staining/models/dynacell/ipsc/sec61b/fnet3d_paper/checkpoints - -data: - class_path: viscy_data.hcs.HCSDataModule - init_args: - data_path: /hpc/projects/virtual_staining/training/dynacell/ipsc/dataset_v4/train/SEC61B.zarr - source_channel: Phase3D - target_channel: Structure - split_ratio: 0.8 - z_window_size: 32 - batch_size: 48 - num_workers: 8 - yx_patch_size: [64, 64] - preload: true - persistent_workers: true - normalizations: - - class_path: viscy_transforms.NormalizeSampled - init_args: - keys: [Phase3D] - level: fov_statistics - subtrahend: mean - divisor: std - - class_path: viscy_transforms.NormalizeSampled - init_args: - keys: [Structure] - level: fov_statistics - subtrahend: mean - divisor: std - augmentations: - # CPU: 8 patches per FOV (amortizes zarr decompression). - # batch_size=48 → DataLoader loads 6 FOVs, each yields 8 patches = 48 effective. 
- - class_path: viscy_transforms.RandWeightedCropd - init_args: - keys: [Phase3D, Structure] - w_key: Structure - spatial_size: [32, 64, 64] - num_samples: 8 - gpu_augmentations: - - class_path: viscy_transforms.BatchedRandFlipd - init_args: - keys: [source, target] - spatial_axes: [1] - prob: 0.5 - - class_path: viscy_transforms.BatchedRandFlipd - init_args: - keys: [source, target] - spatial_axes: [2] - prob: 0.5 - val_augmentations: - - class_path: viscy_transforms.CenterSpatialCropd - init_args: - keys: [Phase3D, Structure] - roi_size: [32, 64, 64] diff --git a/applications/dynacell/examples/configs/sec61b/fit_unext2.yml b/applications/dynacell/examples/configs/sec61b/fit_unext2.yml deleted file mode 100644 index e2d3b71d9..000000000 --- a/applications/dynacell/examples/configs/sec61b/fit_unext2.yml +++ /dev/null @@ -1,118 +0,0 @@ -# UNeXt2 (VSCyto3D) on SEC61B — matches published VSCyto3D training settings. -# Augmentation parameters from vs_test/finetune_3d.py (actual training script). -# Architecture: convnextv2_tiny, z=15, MixedLoss(L1+DSSIM). -# Adapted for single-channel ER target on single GPU. 
-# Usage: uv run python -m dynacell fit --config applications/dynacell/examples/configs/sec61b/fit_unext2.yml -base: - - ../recipes/trainer/fit_1gpu.yml - - ../recipes/models/unext2_3d.yml - -model: - init_args: - loss_function: - class_path: viscy_utils.losses.MixedLoss - init_args: - l1_alpha: 0.5 - l2_alpha: 0.0 - ms_dssim_alpha: 0.5 - lr: 0.0002 - schedule: WarmupCosine - -trainer: - devices: 4 - precision: 16-mixed - max_epochs: 200 - logger: - init_args: - name: UNeXt2_iPSC_SEC61B - save_dir: /hpc/projects/comp.micro/virtual_staining/models/dynacell/ipsc/sec61b/unext2 - callbacks: - - class_path: lightning.pytorch.callbacks.LearningRateMonitor - init_args: - logging_interval: step - - class_path: lightning.pytorch.callbacks.ModelCheckpoint - init_args: - monitor: loss/validate - every_n_epochs: 1 - save_top_k: 5 - save_last: true - dirpath: /hpc/projects/comp.micro/virtual_staining/models/dynacell/ipsc/sec61b/unext2/checkpoints - -data: - class_path: viscy_data.hcs.HCSDataModule - init_args: - data_path: /hpc/projects/virtual_staining/training/dynacell/ipsc/dataset_v4/train/SEC61B.zarr - source_channel: Phase3D - target_channel: Structure - split_ratio: 0.8 - z_window_size: 15 - batch_size: 8 - num_workers: 8 - yx_patch_size: [384, 384] - preload: true - scratch_dir: /dev/shm - persistent_workers: true - normalizations: - - class_path: viscy_transforms.NormalizeSampled - init_args: - keys: [Phase3D] - level: fov_statistics - subtrahend: mean - divisor: std - - class_path: viscy_transforms.NormalizeSampled - init_args: - keys: [Structure] - level: fov_statistics - subtrahend: median - divisor: iqr - augmentations: - # CPU: 2 foreground-weighted patches per FOV (amortizes zarr read). - # batch_size=8 → DataLoader loads 4 FOVs, each yields 2 patches = 8 effective. 
- - class_path: viscy_transforms.RandWeightedCropd - init_args: - keys: [Phase3D, Structure] - w_key: Structure - spatial_size: [20, 600, 600] - num_samples: 2 - gpu_augmentations: - # GPU: affine on oversized patch → center crop to final size. - # Border pixels prevent zero-padded rotation artifacts. - - class_path: viscy_transforms.BatchedRandAffined - init_args: - keys: [source, target] - prob: 0.8 - rotate_range: [3.14, 0, 0] - shear_range: [0.0, 0.05, 0.05] - scale_range: [[0.7, 1.3], [0.5, 1.5], [0.5, 1.5]] - - class_path: viscy_transforms.BatchedCenterSpatialCropd - init_args: - keys: [source, target] - roi_size: [15, 384, 384] - - class_path: viscy_transforms.BatchedRandAdjustContrastd - init_args: - keys: [source] - prob: 0.5 - gamma: [0.8, 1.2] - - class_path: viscy_transforms.BatchedRandScaleIntensityd - init_args: - keys: [source] - prob: 0.5 - factors: 0.5 - - class_path: viscy_transforms.BatchedRandGaussianNoised - init_args: - keys: [source] - prob: 0.5 - mean: 0.0 - std: 0.3 - - class_path: viscy_transforms.BatchedRandGaussianSmoothd - init_args: - keys: [source] - prob: 0.5 - sigma_x: [0.25, 0.75] - sigma_y: [0.25, 0.75] - sigma_z: [0.25, 0.75] - val_gpu_augmentations: - - class_path: viscy_transforms.BatchedDivisibleCropd - init_args: - keys: [source, target] - k: [1, 64, 64] diff --git a/applications/dynacell/examples/configs/sec61b/run_fnet3d.slurm b/applications/dynacell/examples/configs/sec61b/run_fnet3d.slurm deleted file mode 100644 index f8eac33a5..000000000 --- a/applications/dynacell/examples/configs/sec61b/run_fnet3d.slurm +++ /dev/null @@ -1,22 +0,0 @@ -#!/bin/bash - -#SBATCH --job-name=FNet3D_SEC61B -#SBATCH --time=20-00:00:00 -#SBATCH --nodes=1 -#SBATCH --ntasks=1 -#SBATCH --partition=gpu -#SBATCH --cpus-per-task=32 -#SBATCH --gpus=1 -#SBATCH --mem=256G -#SBATCH --output=/hpc/projects/comp.micro/virtual_staining/models/dynacell/ipsc/sec61b/fnet3d/slurm/%j.out -#SBATCH 
--error=/hpc/projects/comp.micro/virtual_staining/models/dynacell/ipsc/sec61b/fnet3d/slurm/%j.err - -mkdir -p -m 775 /hpc/projects/comp.micro/virtual_staining/models/dynacell/ipsc/sec61b/fnet3d/slurm -mkdir -p -m 775 /hpc/projects/comp.micro/virtual_staining/models/dynacell/ipsc/sec61b/fnet3d/checkpoints - -ml uv - -export PYTHONUNBUFFERED=1 - -nvidia-smi -uv run python -m dynacell fit --config applications/dynacell/examples/configs/sec61b/fit_fnet3d.yml diff --git a/applications/dynacell/examples/configs/sec61b/run_fnet3d_paper.slurm b/applications/dynacell/examples/configs/sec61b/run_fnet3d_paper.slurm deleted file mode 100644 index 4879fe93d..000000000 --- a/applications/dynacell/examples/configs/sec61b/run_fnet3d_paper.slurm +++ /dev/null @@ -1,22 +0,0 @@ -#!/bin/bash - -#SBATCH --job-name=FNet3DPaper_SEC61B -#SBATCH --time=20-00:00:00 -#SBATCH --nodes=1 -#SBATCH --ntasks=1 -#SBATCH --partition=gpu -#SBATCH --cpus-per-task=32 -#SBATCH --gpus=1 -#SBATCH --mem=256G -#SBATCH --output=/hpc/projects/comp.micro/virtual_staining/models/dynacell/ipsc/sec61b/fnet3d_paper/slurm/%j.out -#SBATCH --error=/hpc/projects/comp.micro/virtual_staining/models/dynacell/ipsc/sec61b/fnet3d_paper/slurm/%j.err - -mkdir -p -m 775 /hpc/projects/comp.micro/virtual_staining/models/dynacell/ipsc/sec61b/fnet3d_paper/slurm -mkdir -p -m 775 /hpc/projects/comp.micro/virtual_staining/models/dynacell/ipsc/sec61b/fnet3d_paper/checkpoints - -ml uv - -export PYTHONUNBUFFERED=1 - -nvidia-smi -uv run python -m dynacell fit --config applications/dynacell/examples/configs/sec61b/fit_fnet3d_paper.yml diff --git a/applications/dynacell/examples/configs/sec61b/run_unext2.slurm b/applications/dynacell/examples/configs/sec61b/run_unext2.slurm deleted file mode 100644 index 5ac743e98..000000000 --- a/applications/dynacell/examples/configs/sec61b/run_unext2.slurm +++ /dev/null @@ -1,32 +0,0 @@ -#!/bin/bash - -#SBATCH --job-name=UNeXt2_SEC61B -#SBATCH --time=20:00:00 -#SBATCH --nodes=1 -#SBATCH 
--ntasks-per-node=4 -#SBATCH --partition=gpu -#SBATCH --cpus-per-task=12 -#SBATCH --gres=gpu:4 -#SBATCH --mem-per-cpu=30G -#SBATCH --constraint="a100_80|h100|h200" -#SBATCH --output=/hpc/projects/comp.micro/virtual_staining/models/dynacell/ipsc/sec61b/unext2/slurm/%j.out -#SBATCH --error=/hpc/projects/comp.micro/virtual_staining/models/dynacell/ipsc/sec61b/unext2/slurm/%j.err - -mkdir -p -m 775 /hpc/projects/comp.micro/virtual_staining/models/dynacell/ipsc/sec61b/unext2/slurm -mkdir -p -m 775 /hpc/projects/comp.micro/virtual_staining/models/dynacell/ipsc/sec61b/unext2/checkpoints - -function cleanup() { - rm -rf /tmp/$SLURM_JOB_ID/*.zarr - echo "Cleanup Completed." -} -trap cleanup EXIT - -ml uv - -export PYTHONUNBUFFERED=1 -export NCCL_DEBUG=INFO -export PYTHONFAULTHANDLER=1 - -scontrol show job $SLURM_JOB_ID -nvidia-smi -srun uv run python -m dynacell fit --config applications/dynacell/examples/configs/sec61b/fit_unext2.yml diff --git a/applications/dynacell/examples/configs/sec61b/run_unext2_continue.slurm b/applications/dynacell/examples/configs/sec61b/run_unext2_continue.slurm deleted file mode 100644 index 7811df29e..000000000 --- a/applications/dynacell/examples/configs/sec61b/run_unext2_continue.slurm +++ /dev/null @@ -1,34 +0,0 @@ -#!/bin/bash - -#SBATCH --job-name=UNeXt2_SEC61B_cont -#SBATCH --time=20:00:00 -#SBATCH --nodes=1 -#SBATCH --ntasks-per-node=4 -#SBATCH --partition=gpu -#SBATCH --cpus-per-task=12 -#SBATCH --gres=gpu:4 -#SBATCH --mem-per-cpu=12G -#SBATCH --constraint="a100_80|h100|h200" -#SBATCH --output=/hpc/projects/comp.micro/virtual_staining/models/dynacell/ipsc/sec61b/unext2/slurm/%j.out -#SBATCH --error=/hpc/projects/comp.micro/virtual_staining/models/dynacell/ipsc/sec61b/unext2/slurm/%j.err - -mkdir -p -m 775 /hpc/projects/comp.micro/virtual_staining/models/dynacell/ipsc/sec61b/unext2/slurm -mkdir -p -m 775 /hpc/projects/comp.micro/virtual_staining/models/dynacell/ipsc/sec61b/unext2/checkpoints - -function cleanup() { - rm -rf 
/tmp/$SLURM_JOB_ID /dev/shm/$SLURM_JOB_ID - echo "Cleanup Completed." -} -trap cleanup EXIT - -ml uv - -export PYTHONUNBUFFERED=1 -export NCCL_DEBUG=INFO -export PYTHONFAULTHANDLER=1 - -scontrol show job $SLURM_JOB_ID -nvidia-smi -srun uv run python -m dynacell fit \ - --config applications/dynacell/examples/configs/sec61b/fit_unext2_continue.yml \ - --ckpt_path /hpc/projects/comp.micro/virtual_staining/models/dynacell/ipsc/sec61b/unext2/checkpoints/last-v1.ckpt diff --git a/applications/dynacell/pyproject.toml b/applications/dynacell/pyproject.toml index e9444aef9..10f76f535 100644 --- a/applications/dynacell/pyproject.toml +++ b/applications/dynacell/pyproject.toml @@ -14,14 +14,13 @@ keywords = [ ] license = "BSD-3-Clause" authors = [ { name = "Biohub", email = "compmicro@czbiohub.org" } ] -requires-python = ">=3.11" +requires-python = ">=3.12" classifiers = [ "Development Status :: 3 - Alpha", "Intended Audience :: Science/Research", "License :: OSI Approved :: BSD License", "Operating System :: OS Independent", "Programming Language :: Python :: 3 :: Only", - "Programming Language :: Python :: 3.11", "Programming Language :: Python :: 3.12", "Programming Language :: Python :: 3.13", "Programming Language :: Python :: 3.14", @@ -32,17 +31,64 @@ dynamic = [ "version" ] dependencies = [ "lightning>=2.3", "monai", - "viscy-data", + "omegaconf", + "pydantic>=2", + "viscy-data[mmap]", "viscy-models[celldiff]", "viscy-transforms", "viscy-utils", ] - +optional-dependencies.eval = [ + "accelerate>=1.13", + "aicsmlsegment", + "aicssegmentation", + "cellpose", + "cubic==0.7.0a2", + "dynaclr", + "hydra-core>=1.2", + "iohub", + "itk", + "matplotlib", + "microssim @ git+https://github.com/juglab/microssim.git@8bccb17d", + "pandas", + "scikit-image", + "scipy", + "segmenter-model-zoo", + "tqdm", + "transformers", +] +optional-dependencies.eval_gpu = [ + "cucim-cu12", + "cupy-cuda12x", +] +optional-dependencies.preprocess = [ + "iohub", + "tqdm", +] 
+optional-dependencies.report = [ + "hydra-core>=1.2", + "matplotlib", + "pandas", +] +# Default fit/predict trainer logger in configs/recipes/trainer/fit.yml is +# WandbLogger. Install with `pip install dynacell[wandb]` to satisfy that +# default, or override `trainer.logger=null` (or supply your own logger +# block) in the leaf / via `--override` to opt out of W&B entirely. +optional-dependencies.wandb = [ + "wandb", +] urls.Homepage = "https://github.com/mehta-lab/VisCy" urls.Issues = "https://github.com/mehta-lab/VisCy/issues" urls.Repository = "https://github.com/mehta-lab/VisCy" + scripts.dynacell = "dynacell.__main__:main_cli" +# Default manifest registry. Auto-discovered by +# ``dynacell.data.resolver.discover_manifest_roots`` so the resolver +# works without ``DYNACELL_MANIFEST_ROOTS`` on a fresh clone. Override +# the env var (or pass cli_roots) to point at a different registry. +entry-points."dynacell.manifest_roots".dynacell_default = "dynacell._manifests" + [dependency-groups] dev = [ { include-group = "test" } ] test = [ @@ -51,6 +97,9 @@ test = [ "tensorboard", ] +[tool.hatch.metadata] +allow-direct-references = true + [tool.hatch.version] source = "uv-dynamic-versioning" diff --git a/applications/dynacell/src/dynacell/__init__.py b/applications/dynacell/src/dynacell/__init__.py index 82b3fbec4..5214f837e 100644 --- a/applications/dynacell/src/dynacell/__init__.py +++ b/applications/dynacell/src/dynacell/__init__.py @@ -1,5 +1,16 @@ """Dynacell: benchmark virtual staining application.""" -from dynacell.engine import DynacellFlowMatching, DynacellUNet - __all__ = ["DynacellFlowMatching", "DynacellUNet"] + + +def __getattr__(name: str): + # Lazy imports to avoid pulling in heavy training deps on every import. 
+ if name == "DynacellFlowMatching": + from dynacell.engine import DynacellFlowMatching + + return DynacellFlowMatching + if name == "DynacellUNet": + from dynacell.engine import DynacellUNet + + return DynacellUNet + raise AttributeError(f"module {__name__!r} has no attribute {name!r}") diff --git a/applications/dynacell/src/dynacell/__main__.py b/applications/dynacell/src/dynacell/__main__.py index 912631c92..86eadeac5 100644 --- a/applications/dynacell/src/dynacell/__main__.py +++ b/applications/dynacell/src/dynacell/__main__.py @@ -1,19 +1,154 @@ -"""Lightning CLI entry point for the Dynacell application. +"""CLI entry point for the Dynacell application. + +Routes Lightning subcommands (fit, predict, test, validate) to +``viscy_utils.cli.main()`` and Hydra subcommands (evaluate, report) +to their respective entry points. Usage ----- -cd applications/dynacell/examples/configs +cd applications/dynacell/configs/examples uv run dynacell fit -c unetvit3d/fit.yml -uv run python -m dynacell fit --config unetvit3d/fit.yml +uv run dynacell evaluate io.pred_path=... target_name=sec61b +uv run dynacell report results_dirs.ModelA=/path/to/results """ -from viscy_utils.cli import main +import importlib +import os +import sys +from pathlib import Path + +_HYDRA_COMMANDS: dict[str, tuple[str, str, str]] = { + "evaluate": ("dynacell.evaluation.pipeline", "evaluate_model", "eval"), + "precompute-gt": ("dynacell.evaluation.precompute_cli", "precompute_gt", "eval"), + "report": ("dynacell.reporting.cli", "generate_report", "report"), +} + +# HPC-specific config groups (target, feature_extractor/dynaclr, benchmark eval +# leaves) live outside the Python package so the wheel ships only schema + path- +# free references. Editable installs / repo checkouts expose these through +# hydra.searchpath; wheel installs without the repo simply don't see them, and +# external users provide their own groups via --config-dir. +_EXTERNAL_SEARCHPATHS: tuple[str, ...] 
= ( + "configs/benchmarks/virtual_staining/_internal", + "configs/benchmarks/virtual_staining/_internal/shared/eval", +) + +# Team-shared Hugging Face hub cache on project storage. CZ Biohub-specific +# default path; other sites override via the ``DYNACELL_SHARED_HF_CACHE`` +# environment variable. Repo-checkout invocations of the Hydra subcommands +# default ``HF_HUB_CACHE`` here so gated models (e.g. DINOv3) download once +# per team instead of once per user. +# +# We set ``HF_HUB_CACHE`` rather than ``HF_HOME``: ``HF_HOME`` relocates +# the entire HF directory including the auth token file, so a shared +# ``HF_HOME`` blocks HF from finding each user's personal ``~/.cache/ +# huggingface/token``. That breaks per-user gated-repo ACLs (HF returns +# 401 because the request goes out unauthenticated). ``HF_HUB_CACHE`` +# only relocates weights/datasets; tokens stay at the per-user default. +_DEFAULT_SHARED_HF_CACHE = "/hpc/projects/comp.micro/virtual_staining/models/dynacell/evaluation/hf_cache" +_SHARED_HF_CACHE = Path(os.environ.get("DYNACELL_SHARED_HF_CACHE", _DEFAULT_SHARED_HF_CACHE)) + + +def _external_configs_dirs() -> list[Path]: + """Return existing repo-checkout searchpath roots for Hydra eval groups. + + Walks up from this module until it finds the nearest ``pyproject.toml`` + (the application root in editable installs), then returns every + configured subpath that exists on disk. Missing paths are silently + skipped so wheel installs behave the same as repo checkouts where the + dirs were removed. + """ + for parent in Path(__file__).resolve().parents: + if (parent / "pyproject.toml").exists(): + return [p for sub in _EXTERNAL_SEARCHPATHS if (p := parent / sub).is_dir()] + return [] + + +def _maybe_set_shared_hf_cache() -> None: + """Point HF_HUB_CACHE at the team-shared cache on a repo checkout. 
+ + Only fires when (a) ``HF_HUB_CACHE`` is not already set by the + caller, (b) we're running from a repo checkout (external Hydra + searchpaths resolve), and (c) the shared cache dir exists on this + machine. Wheel installs and non-HPC environments fall through to + the normal per-user ``~/.cache/huggingface/hub`` default. + """ + if "HF_HUB_CACHE" in os.environ: + return + if not _external_configs_dirs(): + return + if not _SHARED_HF_CACHE.is_dir(): + return + os.environ["HF_HUB_CACHE"] = str(_SHARED_HF_CACHE) + + +def _inject_external_configs(argv: list[str]) -> list[str]: + """Inject a hydra.searchpath override so external configs are discoverable. + + Hydra's argparse uses a single ``overrides`` positional with + ``nargs="*"``, which means the first contiguous run of positional args + is greedily consumed and any later positional (after a flag like + ``-c job``) is reported as an unrecognized argument. To keep both + ``dynacell evaluate -c job leaf=x`` and + ``dynacell evaluate leaf=x -c job`` working, insert the token + adjacent to an existing positional override when one is present; + otherwise append. 
+ """ + dirs = _external_configs_dirs() + if not dirs: + return argv + token = f"hydra.searchpath=[{','.join(f'file://{d}' for d in dirs)}]" + for i, arg in enumerate(argv[1:], start=1): + if not arg.startswith("-") and "=" in arg: + return argv[:i] + [token] + argv[i:] + return argv + [token] def main_cli(): """Console script entry point for ``dynacell`` command.""" - main() + if len(sys.argv) >= 2 and sys.argv[1] in _HYDRA_COMMANDS: + command = sys.argv[1] + module_path, func_name, extra = _HYDRA_COMMANDS[command] + sys.argv = [sys.argv[0]] + sys.argv[2:] # strip subcommand for Hydra + sys.argv = _inject_external_configs(sys.argv) + _maybe_set_shared_hf_cache() + try: + module = importlib.import_module(module_path) + except ModuleNotFoundError as e: + print(f"Missing dependencies for 'dynacell {command}': {e}\nInstall with: pip install 'dynacell[{extra}]'") + raise SystemExit(1) from e + from dynacell.data.resolver import ( + ManifestNotFoundError, + NoManifestRootsError, + TargetNotFoundError, + ) + + # Hydra's @hydra.main decorator wraps exceptions in a generic + # "Error executing job" banner and calls sys.exit(1) unless + # HYDRA_FULL_ERROR=1 is set. Force the full-error path so our + # dataset-resolver errors propagate here and we can print a + # clean message + SystemExit(2) instead of a cryptic banner. 
+ os.environ.setdefault("HYDRA_FULL_ERROR", "1") + try: + getattr(module, func_name)() + except (NoManifestRootsError, ManifestNotFoundError, TargetNotFoundError) as e: + print(str(e), file=sys.stderr) + raise SystemExit(2) from e + else: + from dynacell._compose_hook import _dynacell_ref_resolver + from dynacell.data.resolver import ( + ManifestNotFoundError, + NoManifestRootsError, + TargetNotFoundError, + ) + from viscy_utils.cli import main + + try: + main(resolver=_dynacell_ref_resolver) + except (NoManifestRootsError, ManifestNotFoundError, TargetNotFoundError) as e: + print(str(e), file=sys.stderr) + raise SystemExit(2) from e if __name__ == "__main__": - main() + main_cli() diff --git a/applications/dynacell/src/dynacell/_compose_hook.py b/applications/dynacell/src/dynacell/_compose_hook.py new file mode 100644 index 000000000..53a6cc402 --- /dev/null +++ b/applications/dynacell/src/dynacell/_compose_hook.py @@ -0,0 +1,83 @@ +"""Composition-time resolver hook for DynaCell benchmark leaves. + +Threaded into :func:`viscy_utils.compose.load_composed_config` via the +``resolver`` keyword argument; run once after the final deep-merge. +Reads ``benchmark.dataset_ref: {dataset, target}`` from the composed dict +and splices concrete ``data_path``, ``source_channel``, ``target_channel`` +into ``data.init_args`` from the resolved :class:`DatasetManifest`. + +Partial references (only ``dataset`` or only ``target``) are a strict +no-op, so shared train/predict-set fragments can declare one half of +``dataset_ref`` without breaking leaves whose target fragment has not +yet been migrated. 
+""" + +from __future__ import annotations + +import copy +import sys + +from dynacell.data import ( + DatasetRef, + ResolvedDataset, + dataset_ref_from_dict, + resolve_dataset_ref, +) + + +def _infer_mode(composed: dict) -> str: + """Return the Lightning subcommand ("fit", "predict", or "validate").""" + launcher_mode = composed.get("launcher", {}).get("mode") + if launcher_mode in {"fit", "predict", "validate"}: + return launcher_mode + for arg in sys.argv[1:]: + if arg in {"fit", "predict", "validate"}: + return arg + raise ValueError("Cannot infer Lightning mode for dataset_ref resolution; set launcher.mode in the leaf config.") + + +def _splice_resolved(composed: dict, resolved: ResolvedDataset, mode: str, ref: DatasetRef) -> dict: + """Return a deep-copied composed dict with resolved fields spliced in. + + Raises ``ValueError`` if the composed dict already declares any of + the resolved data fields. A full ``dataset_ref`` is the single + source of truth — composed fragments must not co-declare + ``data_path``, ``source_channel``, or ``target_channel``. + """ + out = copy.deepcopy(composed) + data = out.setdefault("data", {}) + init_args = data.setdefault("init_args", {}) + resolved_values = { + "data_path": str(resolved.data_path_test if mode == "predict" else resolved.data_path_train), + "source_channel": resolved.source_channel, + "target_channel": resolved.target_channel, + } + conflicts = {field: (init_args[field], value) for field, value in resolved_values.items() if field in init_args} + if conflicts: + details = "; ".join( + f"{k}: composed={composed_value!r} vs manifest={manifest_value!r}" + for k, (composed_value, manifest_value) in conflicts.items() + ) + raise ValueError( + f"benchmark.dataset_ref={{dataset: {ref.dataset}, target: {ref.target}}} " + f"conflicts with explicit data.init_args fields: {details}. " + "Remove one side — either drop the conflicting explicit fields " + "or remove dataset_ref." 
+ ) + init_args.update(resolved_values) + out.setdefault("benchmark", {})["spacing"] = resolved.spacing.as_list() + return out + + +def _dynacell_ref_resolver(composed: dict) -> dict: + """Resolve ``benchmark.dataset_ref`` against the manifest registry. + + Strict partial-ref no-op: returns the input dict unchanged unless + both ``dataset`` and ``target`` keys are present under + ``benchmark.dataset_ref``. + """ + ref = dataset_ref_from_dict(composed.get("benchmark", {}).get("dataset_ref")) + if ref is None: + return composed + resolved = resolve_dataset_ref(ref) + return _splice_resolved(composed, resolved, _infer_mode(composed), ref) diff --git a/applications/dynacell/src/dynacell/_manifests/__init__.py b/applications/dynacell/src/dynacell/_manifests/__init__.py new file mode 100644 index 000000000..dc4a24739 --- /dev/null +++ b/applications/dynacell/src/dynacell/_manifests/__init__.py @@ -0,0 +1,15 @@ +"""Bundled dataset manifests — the default registry for the DynaCell resolver. + +This package ships canonical manifest YAMLs (mirrored from +``dynacell-paper/_configs/datasets/``) so the resolver works out-of-the-box +on any clone. Auto-discovered via the ``dynacell.manifest_roots`` entry +point declared in ``applications/dynacell/pyproject.toml``. + +VisCy is the source of truth for manifest *content* (this directory). +``dynacell-paper`` is the source of truth for manifest *authoring* — when +a new dataset is preprocessed there, the change is mirrored back here and +``tests/test_manifest_sync.py`` enforces the parity. + +Override at runtime with ``DYNACELL_MANIFEST_ROOTS=/path/to/other/registry`` +(env var) or by passing ``cli_roots=`` to ``discover_manifest_roots``. 
+""" diff --git a/applications/dynacell/src/dynacell/_manifests/a549-mantis-caax-denv/manifest.yaml b/applications/dynacell/src/dynacell/_manifests/a549-mantis-caax-denv/manifest.yaml new file mode 100644 index 000000000..14e30c828 --- /dev/null +++ b/applications/dynacell/src/dynacell/_manifests/a549-mantis-caax-denv/manifest.yaml @@ -0,0 +1,25 @@ +name: a549-mantis-caax-denv +version: '1' +description: "A549 mantis condition-pooled \u2014 caax on DENV (pool-internal 0/0/fov\ + \ naming, plate provenance in per-position zattrs and the colocated provenance.json\ + \ sidecar)." +cell_type: A549 +imaging_modality: mantis-lightsheet +spacing: + z: 0.174 + y: 0.116 + x: 0.116 +channels: + source: Phase3D + auxiliary: + - Brightfield +targets: + caax: + gene: CAAX + organelle: membrane + display_name: Membrane (CAAX) + target_channel: Membrane + stores: + train: /hpc/projects/virtual_staining/training/dynacell/a549/mantis_v1/train/CAAX_DENV.ozx + test: /hpc/projects/virtual_staining/training/dynacell/a549/mantis_v1/test/CAAX_DENV.ozx + splits: splits/caax_train_test.yaml diff --git a/applications/dynacell/src/dynacell/_manifests/a549-mantis-caax-denv/splits/caax_train_test.yaml b/applications/dynacell/src/dynacell/_manifests/a549-mantis-caax-denv/splits/caax_train_test.yaml new file mode 100644 index 000000000..7eef68a46 --- /dev/null +++ b/applications/dynacell/src/dynacell/_manifests/a549-mantis-caax-denv/splits/caax_train_test.yaml @@ -0,0 +1,31 @@ +split_version: '1.0' +random_seed: 0 +selection_criteria: + source: a549-mantis condition-pooled assembly + target: caax + condition: DENV + pool_naming: 0/0/fov sequential across contributing plates +train: + count: 6 + fovs: + - 0/0/fov0000 + - 0/0/fov0001 + - 0/0/fov0002 + - 0/0/fov0003 + - 0/0/fov0004 + - 0/0/fov0005 +test: + count: 12 + fovs: + - 0/0/fov0000 + - 0/0/fov0001 + - 0/0/fov0002 + - 0/0/fov0003 + - 0/0/fov0004 + - 0/0/fov0005 + - 0/0/fov0006 + - 0/0/fov0007 + - 0/0/fov0008 + - 0/0/fov0009 + - 0/0/fov0010 + 
- 0/0/fov0011 diff --git a/applications/dynacell/src/dynacell/_manifests/a549-mantis-caax-mock/manifest.yaml b/applications/dynacell/src/dynacell/_manifests/a549-mantis-caax-mock/manifest.yaml new file mode 100644 index 000000000..aebb9a01a --- /dev/null +++ b/applications/dynacell/src/dynacell/_manifests/a549-mantis-caax-mock/manifest.yaml @@ -0,0 +1,25 @@ +name: a549-mantis-caax-mock +version: '1' +description: "A549 mantis condition-pooled \u2014 caax on mock (pool-internal 0/0/fov\ + \ naming, plate provenance in per-position zattrs and the colocated provenance.json\ + \ sidecar)." +cell_type: A549 +imaging_modality: mantis-lightsheet +spacing: + z: 0.174 + y: 0.116 + x: 0.116 +channels: + source: Phase3D + auxiliary: + - Brightfield +targets: + caax: + gene: CAAX + organelle: membrane + display_name: Membrane (CAAX) + target_channel: Membrane + stores: + train: /hpc/projects/virtual_staining/training/dynacell/a549/mantis_v1/train/CAAX_mock.ozx + test: /hpc/projects/virtual_staining/training/dynacell/a549/mantis_v1/test/CAAX_mock.ozx + splits: splits/caax_train_test.yaml diff --git a/applications/dynacell/src/dynacell/_manifests/a549-mantis-caax-mock/splits/caax_train_test.yaml b/applications/dynacell/src/dynacell/_manifests/a549-mantis-caax-mock/splits/caax_train_test.yaml new file mode 100644 index 000000000..f5e315358 --- /dev/null +++ b/applications/dynacell/src/dynacell/_manifests/a549-mantis-caax-mock/splits/caax_train_test.yaml @@ -0,0 +1,37 @@ +split_version: '1.0' +random_seed: 0 +selection_criteria: + source: a549-mantis condition-pooled assembly + target: caax + condition: mock + pool_naming: 0/0/fov sequential across contributing plates +train: + count: 12 + fovs: + - 0/0/fov0000 + - 0/0/fov0001 + - 0/0/fov0002 + - 0/0/fov0003 + - 0/0/fov0004 + - 0/0/fov0005 + - 0/0/fov0006 + - 0/0/fov0007 + - 0/0/fov0008 + - 0/0/fov0009 + - 0/0/fov0010 + - 0/0/fov0011 +test: + count: 12 + fovs: + - 0/0/fov0000 + - 0/0/fov0001 + - 0/0/fov0002 + - 0/0/fov0003 + - 
0/0/fov0004 + - 0/0/fov0005 + - 0/0/fov0006 + - 0/0/fov0007 + - 0/0/fov0008 + - 0/0/fov0009 + - 0/0/fov0010 + - 0/0/fov0011 diff --git a/applications/dynacell/src/dynacell/_manifests/a549-mantis-caax-zikv/manifest.yaml b/applications/dynacell/src/dynacell/_manifests/a549-mantis-caax-zikv/manifest.yaml new file mode 100644 index 000000000..671f834dc --- /dev/null +++ b/applications/dynacell/src/dynacell/_manifests/a549-mantis-caax-zikv/manifest.yaml @@ -0,0 +1,25 @@ +name: a549-mantis-caax-zikv +version: '1' +description: "A549 mantis condition-pooled \u2014 caax on ZIKV (pool-internal 0/0/fov\ + \ naming, plate provenance in per-position zattrs and the colocated provenance.json\ + \ sidecar)." +cell_type: A549 +imaging_modality: mantis-lightsheet +spacing: + z: 0.174 + y: 0.116 + x: 0.116 +channels: + source: Phase3D + auxiliary: + - Brightfield +targets: + caax: + gene: CAAX + organelle: membrane + display_name: Membrane (CAAX) + target_channel: Membrane + stores: + train: /hpc/projects/virtual_staining/training/dynacell/a549/mantis_v1/train/CAAX_ZIKV.ozx + test: /hpc/projects/virtual_staining/training/dynacell/a549/mantis_v1/test/CAAX_ZIKV.ozx + splits: splits/caax_train_test.yaml diff --git a/applications/dynacell/src/dynacell/_manifests/a549-mantis-caax-zikv/splits/caax_train_test.yaml b/applications/dynacell/src/dynacell/_manifests/a549-mantis-caax-zikv/splits/caax_train_test.yaml new file mode 100644 index 000000000..68c02e1cc --- /dev/null +++ b/applications/dynacell/src/dynacell/_manifests/a549-mantis-caax-zikv/splits/caax_train_test.yaml @@ -0,0 +1,37 @@ +split_version: '1.0' +random_seed: 0 +selection_criteria: + source: a549-mantis condition-pooled assembly + target: caax + condition: ZIKV + pool_naming: 0/0/fov sequential across contributing plates +train: + count: 12 + fovs: + - 0/0/fov0000 + - 0/0/fov0001 + - 0/0/fov0002 + - 0/0/fov0003 + - 0/0/fov0004 + - 0/0/fov0005 + - 0/0/fov0006 + - 0/0/fov0007 + - 0/0/fov0008 + - 0/0/fov0009 + - 0/0/fov0010 + - 
0/0/fov0011 +test: + count: 12 + fovs: + - 0/0/fov0000 + - 0/0/fov0001 + - 0/0/fov0002 + - 0/0/fov0003 + - 0/0/fov0004 + - 0/0/fov0005 + - 0/0/fov0006 + - 0/0/fov0007 + - 0/0/fov0008 + - 0/0/fov0009 + - 0/0/fov0010 + - 0/0/fov0011 diff --git a/applications/dynacell/src/dynacell/_manifests/a549-mantis-h2b-denv/manifest.yaml b/applications/dynacell/src/dynacell/_manifests/a549-mantis-h2b-denv/manifest.yaml new file mode 100644 index 000000000..0ad284742 --- /dev/null +++ b/applications/dynacell/src/dynacell/_manifests/a549-mantis-h2b-denv/manifest.yaml @@ -0,0 +1,25 @@ +name: a549-mantis-h2b-denv +version: '1' +description: "A549 mantis condition-pooled \u2014 h2b on DENV (pool-internal 0/0/fov\ + \ naming, plate provenance in per-position zattrs and the colocated provenance.json\ + \ sidecar)." +cell_type: A549 +imaging_modality: mantis-lightsheet +spacing: + z: 0.174 + y: 0.116 + x: 0.116 +channels: + source: Phase3D + auxiliary: + - Brightfield +targets: + h2b: + gene: H2B + organelle: nuclei + display_name: Nuclei (H2B) + target_channel: Nuclei + stores: + train: /hpc/projects/virtual_staining/training/dynacell/a549/mantis_v1/train/H2B_DENV.ozx + test: /hpc/projects/virtual_staining/training/dynacell/a549/mantis_v1/test/H2B_DENV.ozx + splits: splits/h2b_train_test.yaml diff --git a/applications/dynacell/src/dynacell/_manifests/a549-mantis-h2b-denv/splits/h2b_train_test.yaml b/applications/dynacell/src/dynacell/_manifests/a549-mantis-h2b-denv/splits/h2b_train_test.yaml new file mode 100644 index 000000000..6ecba4b06 --- /dev/null +++ b/applications/dynacell/src/dynacell/_manifests/a549-mantis-h2b-denv/splits/h2b_train_test.yaml @@ -0,0 +1,31 @@ +split_version: '1.0' +random_seed: 0 +selection_criteria: + source: a549-mantis condition-pooled assembly + target: h2b + condition: DENV + pool_naming: 0/0/fov sequential across contributing plates +train: + count: 6 + fovs: + - 0/0/fov0000 + - 0/0/fov0001 + - 0/0/fov0002 + - 0/0/fov0003 + - 0/0/fov0004 + - 0/0/fov0005 
+test: + count: 12 + fovs: + - 0/0/fov0000 + - 0/0/fov0001 + - 0/0/fov0002 + - 0/0/fov0003 + - 0/0/fov0004 + - 0/0/fov0005 + - 0/0/fov0006 + - 0/0/fov0007 + - 0/0/fov0008 + - 0/0/fov0009 + - 0/0/fov0010 + - 0/0/fov0011 diff --git a/applications/dynacell/src/dynacell/_manifests/a549-mantis-h2b-mock/manifest.yaml b/applications/dynacell/src/dynacell/_manifests/a549-mantis-h2b-mock/manifest.yaml new file mode 100644 index 000000000..eabc02f5d --- /dev/null +++ b/applications/dynacell/src/dynacell/_manifests/a549-mantis-h2b-mock/manifest.yaml @@ -0,0 +1,25 @@ +name: a549-mantis-h2b-mock +version: '1' +description: "A549 mantis condition-pooled \u2014 h2b on mock (pool-internal 0/0/fov\ + \ naming, plate provenance in per-position zattrs and the colocated provenance.json\ + \ sidecar)." +cell_type: A549 +imaging_modality: mantis-lightsheet +spacing: + z: 0.174 + y: 0.116 + x: 0.116 +channels: + source: Phase3D + auxiliary: + - Brightfield +targets: + h2b: + gene: H2B + organelle: nuclei + display_name: Nuclei (H2B) + target_channel: Nuclei + stores: + train: /hpc/projects/virtual_staining/training/dynacell/a549/mantis_v1/train/H2B_mock.ozx + test: /hpc/projects/virtual_staining/training/dynacell/a549/mantis_v1/test/H2B_mock.ozx + splits: splits/h2b_train_test.yaml diff --git a/applications/dynacell/src/dynacell/_manifests/a549-mantis-h2b-mock/splits/h2b_train_test.yaml b/applications/dynacell/src/dynacell/_manifests/a549-mantis-h2b-mock/splits/h2b_train_test.yaml new file mode 100644 index 000000000..6fa7400e6 --- /dev/null +++ b/applications/dynacell/src/dynacell/_manifests/a549-mantis-h2b-mock/splits/h2b_train_test.yaml @@ -0,0 +1,37 @@ +split_version: '1.0' +random_seed: 0 +selection_criteria: + source: a549-mantis condition-pooled assembly + target: h2b + condition: mock + pool_naming: 0/0/fov sequential across contributing plates +train: + count: 12 + fovs: + - 0/0/fov0000 + - 0/0/fov0001 + - 0/0/fov0002 + - 0/0/fov0003 + - 0/0/fov0004 + - 0/0/fov0005 + - 
0/0/fov0006 + - 0/0/fov0007 + - 0/0/fov0008 + - 0/0/fov0009 + - 0/0/fov0010 + - 0/0/fov0011 +test: + count: 12 + fovs: + - 0/0/fov0000 + - 0/0/fov0001 + - 0/0/fov0002 + - 0/0/fov0003 + - 0/0/fov0004 + - 0/0/fov0005 + - 0/0/fov0006 + - 0/0/fov0007 + - 0/0/fov0008 + - 0/0/fov0009 + - 0/0/fov0010 + - 0/0/fov0011 diff --git a/applications/dynacell/src/dynacell/_manifests/a549-mantis-h2b-zikv/manifest.yaml b/applications/dynacell/src/dynacell/_manifests/a549-mantis-h2b-zikv/manifest.yaml new file mode 100644 index 000000000..8cb123e19 --- /dev/null +++ b/applications/dynacell/src/dynacell/_manifests/a549-mantis-h2b-zikv/manifest.yaml @@ -0,0 +1,25 @@ +name: a549-mantis-h2b-zikv +version: '1' +description: "A549 mantis condition-pooled \u2014 h2b on ZIKV (pool-internal 0/0/fov\ + \ naming, plate provenance in per-position zattrs and the colocated provenance.json\ + \ sidecar)." +cell_type: A549 +imaging_modality: mantis-lightsheet +spacing: + z: 0.174 + y: 0.116 + x: 0.116 +channels: + source: Phase3D + auxiliary: + - Brightfield +targets: + h2b: + gene: H2B + organelle: nuclei + display_name: Nuclei (H2B) + target_channel: Nuclei + stores: + train: /hpc/projects/virtual_staining/training/dynacell/a549/mantis_v1/train/H2B_ZIKV.ozx + test: /hpc/projects/virtual_staining/training/dynacell/a549/mantis_v1/test/H2B_ZIKV.ozx + splits: splits/h2b_train_test.yaml diff --git a/applications/dynacell/src/dynacell/_manifests/a549-mantis-h2b-zikv/splits/h2b_train_test.yaml b/applications/dynacell/src/dynacell/_manifests/a549-mantis-h2b-zikv/splits/h2b_train_test.yaml new file mode 100644 index 000000000..99292f5dd --- /dev/null +++ b/applications/dynacell/src/dynacell/_manifests/a549-mantis-h2b-zikv/splits/h2b_train_test.yaml @@ -0,0 +1,37 @@ +split_version: '1.0' +random_seed: 0 +selection_criteria: + source: a549-mantis condition-pooled assembly + target: h2b + condition: ZIKV + pool_naming: 0/0/fov sequential across contributing plates +train: + count: 12 + fovs: + - 0/0/fov0000 + 
- 0/0/fov0001 + - 0/0/fov0002 + - 0/0/fov0003 + - 0/0/fov0004 + - 0/0/fov0005 + - 0/0/fov0006 + - 0/0/fov0007 + - 0/0/fov0008 + - 0/0/fov0009 + - 0/0/fov0010 + - 0/0/fov0011 +test: + count: 12 + fovs: + - 0/0/fov0000 + - 0/0/fov0001 + - 0/0/fov0002 + - 0/0/fov0003 + - 0/0/fov0004 + - 0/0/fov0005 + - 0/0/fov0006 + - 0/0/fov0007 + - 0/0/fov0008 + - 0/0/fov0009 + - 0/0/fov0010 + - 0/0/fov0011 diff --git a/applications/dynacell/src/dynacell/_manifests/a549-mantis-sec61b-denv/manifest.yaml b/applications/dynacell/src/dynacell/_manifests/a549-mantis-sec61b-denv/manifest.yaml new file mode 100644 index 000000000..4a7254444 --- /dev/null +++ b/applications/dynacell/src/dynacell/_manifests/a549-mantis-sec61b-denv/manifest.yaml @@ -0,0 +1,25 @@ +name: a549-mantis-sec61b-denv +version: '1' +description: "A549 mantis condition-pooled \u2014 sec61b on DENV (pool-internal 0/0/fov\ + \ naming, plate provenance in per-position zattrs and the colocated provenance.json\ + \ sidecar)." +cell_type: A549 +imaging_modality: mantis-lightsheet +spacing: + z: 0.174 + y: 0.1494 + x: 0.1494 +channels: + source: Phase3D + auxiliary: + - Brightfield +targets: + sec61b: + gene: SEC61B + organelle: er + display_name: ER (Sec61b) + target_channel: Structure + stores: + train: /hpc/projects/virtual_staining/training/dynacell/a549/mantis_v1/train/SEC61B_DENV.ozx + test: /hpc/projects/virtual_staining/training/dynacell/a549/mantis_v1/test/SEC61B_DENV.ozx + splits: splits/sec61b_train_test.yaml diff --git a/applications/dynacell/src/dynacell/_manifests/a549-mantis-sec61b-denv/splits/sec61b_train_test.yaml b/applications/dynacell/src/dynacell/_manifests/a549-mantis-sec61b-denv/splits/sec61b_train_test.yaml new file mode 100644 index 000000000..e001c1ff6 --- /dev/null +++ b/applications/dynacell/src/dynacell/_manifests/a549-mantis-sec61b-denv/splits/sec61b_train_test.yaml @@ -0,0 +1,27 @@ +split_version: '1.0' +random_seed: 0 +selection_criteria: + source: a549-mantis condition-pooled assembly + 
target: sec61b + condition: DENV + pool_naming: 0/0/fov sequential across contributing plates +train: + count: 2 + fovs: + - 0/0/fov0000 + - 0/0/fov0001 +test: + count: 12 + fovs: + - 0/0/fov0000 + - 0/0/fov0001 + - 0/0/fov0002 + - 0/0/fov0003 + - 0/0/fov0004 + - 0/0/fov0005 + - 0/0/fov0006 + - 0/0/fov0007 + - 0/0/fov0008 + - 0/0/fov0009 + - 0/0/fov0010 + - 0/0/fov0011 diff --git a/applications/dynacell/src/dynacell/_manifests/a549-mantis-sec61b-mock/manifest.yaml b/applications/dynacell/src/dynacell/_manifests/a549-mantis-sec61b-mock/manifest.yaml new file mode 100644 index 000000000..28276a07f --- /dev/null +++ b/applications/dynacell/src/dynacell/_manifests/a549-mantis-sec61b-mock/manifest.yaml @@ -0,0 +1,25 @@ +name: a549-mantis-sec61b-mock +version: '1' +description: "A549 mantis condition-pooled \u2014 sec61b on mock (pool-internal 0/0/fov\ + \ naming, plate provenance in per-position zattrs and the colocated provenance.json\ + \ sidecar)." +cell_type: A549 +imaging_modality: mantis-lightsheet +spacing: + z: 0.174 + y: 0.1494 + x: 0.1494 +channels: + source: Phase3D + auxiliary: + - Brightfield +targets: + sec61b: + gene: SEC61B + organelle: er + display_name: ER (Sec61b) + target_channel: Structure + stores: + train: /hpc/projects/virtual_staining/training/dynacell/a549/mantis_v1/train/SEC61B_mock.ozx + test: /hpc/projects/virtual_staining/training/dynacell/a549/mantis_v1/test/SEC61B_mock.ozx + splits: splits/sec61b_train_test.yaml diff --git a/applications/dynacell/src/dynacell/_manifests/a549-mantis-sec61b-mock/splits/sec61b_train_test.yaml b/applications/dynacell/src/dynacell/_manifests/a549-mantis-sec61b-mock/splits/sec61b_train_test.yaml new file mode 100644 index 000000000..18141424a --- /dev/null +++ b/applications/dynacell/src/dynacell/_manifests/a549-mantis-sec61b-mock/splits/sec61b_train_test.yaml @@ -0,0 +1,36 @@ +split_version: '1.0' +random_seed: 0 +selection_criteria: + source: a549-mantis condition-pooled assembly + target: sec61b + condition: 
mock + pool_naming: 0/0/fov sequential across contributing plates +train: + count: 11 + fovs: + - 0/0/fov0000 + - 0/0/fov0001 + - 0/0/fov0002 + - 0/0/fov0003 + - 0/0/fov0004 + - 0/0/fov0005 + - 0/0/fov0006 + - 0/0/fov0007 + - 0/0/fov0008 + - 0/0/fov0009 + - 0/0/fov0010 +test: + count: 12 + fovs: + - 0/0/fov0000 + - 0/0/fov0001 + - 0/0/fov0002 + - 0/0/fov0003 + - 0/0/fov0004 + - 0/0/fov0005 + - 0/0/fov0006 + - 0/0/fov0007 + - 0/0/fov0008 + - 0/0/fov0009 + - 0/0/fov0010 + - 0/0/fov0011 diff --git a/applications/dynacell/src/dynacell/_manifests/a549-mantis-sec61b-zikv/manifest.yaml b/applications/dynacell/src/dynacell/_manifests/a549-mantis-sec61b-zikv/manifest.yaml new file mode 100644 index 000000000..1513e78ad --- /dev/null +++ b/applications/dynacell/src/dynacell/_manifests/a549-mantis-sec61b-zikv/manifest.yaml @@ -0,0 +1,25 @@ +name: a549-mantis-sec61b-zikv +version: '1' +description: "A549 mantis condition-pooled \u2014 sec61b on ZIKV (pool-internal 0/0/fov\ + \ naming, plate provenance in per-position zattrs and the colocated provenance.json\ + \ sidecar)." 
+cell_type: A549 +imaging_modality: mantis-lightsheet +spacing: + z: 0.174 + y: 0.1494 + x: 0.1494 +channels: + source: Phase3D + auxiliary: + - Brightfield +targets: + sec61b: + gene: SEC61B + organelle: er + display_name: ER (Sec61b) + target_channel: Structure + stores: + train: /hpc/projects/virtual_staining/training/dynacell/a549/mantis_v1/train/SEC61B_ZIKV.ozx + test: /hpc/projects/virtual_staining/training/dynacell/a549/mantis_v1/test/SEC61B_ZIKV.ozx + splits: splits/sec61b_train_test.yaml diff --git a/applications/dynacell/src/dynacell/_manifests/a549-mantis-sec61b-zikv/splits/sec61b_train_test.yaml b/applications/dynacell/src/dynacell/_manifests/a549-mantis-sec61b-zikv/splits/sec61b_train_test.yaml new file mode 100644 index 000000000..fb2693ada --- /dev/null +++ b/applications/dynacell/src/dynacell/_manifests/a549-mantis-sec61b-zikv/splits/sec61b_train_test.yaml @@ -0,0 +1,40 @@ +split_version: '1.0' +random_seed: 0 +selection_criteria: + source: a549-mantis condition-pooled assembly + target: sec61b + condition: ZIKV + pool_naming: 0/0/fov sequential across contributing plates +train: + count: 15 + fovs: + - 0/0/fov0000 + - 0/0/fov0001 + - 0/0/fov0002 + - 0/0/fov0003 + - 0/0/fov0004 + - 0/0/fov0005 + - 0/0/fov0006 + - 0/0/fov0007 + - 0/0/fov0008 + - 0/0/fov0009 + - 0/0/fov0010 + - 0/0/fov0011 + - 0/0/fov0012 + - 0/0/fov0013 + - 0/0/fov0014 +test: + count: 12 + fovs: + - 0/0/fov0000 + - 0/0/fov0001 + - 0/0/fov0002 + - 0/0/fov0003 + - 0/0/fov0004 + - 0/0/fov0005 + - 0/0/fov0006 + - 0/0/fov0007 + - 0/0/fov0008 + - 0/0/fov0009 + - 0/0/fov0010 + - 0/0/fov0011 diff --git a/applications/dynacell/src/dynacell/_manifests/a549-mantis-tomm20-denv/manifest.yaml b/applications/dynacell/src/dynacell/_manifests/a549-mantis-tomm20-denv/manifest.yaml new file mode 100644 index 000000000..87c4da64e --- /dev/null +++ b/applications/dynacell/src/dynacell/_manifests/a549-mantis-tomm20-denv/manifest.yaml @@ -0,0 +1,25 @@ +name: a549-mantis-tomm20-denv +version: '1' 
+description: "A549 mantis condition-pooled \u2014 tomm20 on DENV (pool-internal 0/0/fov\ + \ naming, plate provenance in per-position zattrs and the colocated provenance.json\ + \ sidecar)." +cell_type: A549 +imaging_modality: mantis-lightsheet +spacing: + z: 0.174 + y: 0.1494 + x: 0.1494 +channels: + source: Phase3D + auxiliary: + - Brightfield +targets: + tomm20: + gene: TOMM20 + organelle: mitochondria + display_name: Mitochondria (TOMM20) + target_channel: Structure + stores: + train: /hpc/projects/virtual_staining/training/dynacell/a549/mantis_v1/train/TOMM20_DENV.ozx + test: /hpc/projects/virtual_staining/training/dynacell/a549/mantis_v1/test/TOMM20_DENV.ozx + splits: splits/tomm20_train_test.yaml diff --git a/applications/dynacell/src/dynacell/_manifests/a549-mantis-tomm20-denv/splits/tomm20_train_test.yaml b/applications/dynacell/src/dynacell/_manifests/a549-mantis-tomm20-denv/splits/tomm20_train_test.yaml new file mode 100644 index 000000000..76678fa21 --- /dev/null +++ b/applications/dynacell/src/dynacell/_manifests/a549-mantis-tomm20-denv/splits/tomm20_train_test.yaml @@ -0,0 +1,30 @@ +split_version: '1.0' +random_seed: 0 +selection_criteria: + source: a549-mantis condition-pooled assembly + target: tomm20 + condition: DENV + pool_naming: 0/0/fov sequential across contributing plates +train: + count: 5 + fovs: + - 0/0/fov0000 + - 0/0/fov0001 + - 0/0/fov0002 + - 0/0/fov0003 + - 0/0/fov0004 +test: + count: 12 + fovs: + - 0/0/fov0000 + - 0/0/fov0001 + - 0/0/fov0002 + - 0/0/fov0003 + - 0/0/fov0004 + - 0/0/fov0005 + - 0/0/fov0006 + - 0/0/fov0007 + - 0/0/fov0008 + - 0/0/fov0009 + - 0/0/fov0010 + - 0/0/fov0011 diff --git a/applications/dynacell/src/dynacell/_manifests/a549-mantis-tomm20-mock/manifest.yaml b/applications/dynacell/src/dynacell/_manifests/a549-mantis-tomm20-mock/manifest.yaml new file mode 100644 index 000000000..b2b640068 --- /dev/null +++ b/applications/dynacell/src/dynacell/_manifests/a549-mantis-tomm20-mock/manifest.yaml @@ -0,0 +1,25 @@ 
+name: a549-mantis-tomm20-mock +version: '1' +description: "A549 mantis condition-pooled \u2014 tomm20 on mock (pool-internal 0/0/fov\ + \ naming, plate provenance in per-position zattrs and the colocated provenance.json\ + \ sidecar)." +cell_type: A549 +imaging_modality: mantis-lightsheet +spacing: + z: 0.174 + y: 0.1494 + x: 0.1494 +channels: + source: Phase3D + auxiliary: + - Brightfield +targets: + tomm20: + gene: TOMM20 + organelle: mitochondria + display_name: Mitochondria (TOMM20) + target_channel: Structure + stores: + train: /hpc/projects/virtual_staining/training/dynacell/a549/mantis_v1/train/TOMM20_mock.ozx + test: /hpc/projects/virtual_staining/training/dynacell/a549/mantis_v1/test/TOMM20_mock.ozx + splits: splits/tomm20_train_test.yaml diff --git a/applications/dynacell/src/dynacell/_manifests/a549-mantis-tomm20-mock/splits/tomm20_train_test.yaml b/applications/dynacell/src/dynacell/_manifests/a549-mantis-tomm20-mock/splits/tomm20_train_test.yaml new file mode 100644 index 000000000..9009cf38c --- /dev/null +++ b/applications/dynacell/src/dynacell/_manifests/a549-mantis-tomm20-mock/splits/tomm20_train_test.yaml @@ -0,0 +1,38 @@ +split_version: '1.0' +random_seed: 0 +selection_criteria: + source: a549-mantis condition-pooled assembly + target: tomm20 + condition: mock + pool_naming: 0/0/fov sequential across contributing plates +train: + count: 13 + fovs: + - 0/0/fov0000 + - 0/0/fov0001 + - 0/0/fov0002 + - 0/0/fov0003 + - 0/0/fov0004 + - 0/0/fov0005 + - 0/0/fov0006 + - 0/0/fov0007 + - 0/0/fov0008 + - 0/0/fov0009 + - 0/0/fov0010 + - 0/0/fov0011 + - 0/0/fov0012 +test: + count: 12 + fovs: + - 0/0/fov0000 + - 0/0/fov0001 + - 0/0/fov0002 + - 0/0/fov0003 + - 0/0/fov0004 + - 0/0/fov0005 + - 0/0/fov0006 + - 0/0/fov0007 + - 0/0/fov0008 + - 0/0/fov0009 + - 0/0/fov0010 + - 0/0/fov0011 diff --git a/applications/dynacell/src/dynacell/_manifests/a549-mantis-tomm20-zikv/manifest.yaml b/applications/dynacell/src/dynacell/_manifests/a549-mantis-tomm20-zikv/manifest.yaml 
new file mode 100644 index 000000000..53b04d4ff --- /dev/null +++ b/applications/dynacell/src/dynacell/_manifests/a549-mantis-tomm20-zikv/manifest.yaml @@ -0,0 +1,25 @@ +name: a549-mantis-tomm20-zikv +version: '1' +description: "A549 mantis condition-pooled \u2014 tomm20 on ZIKV (pool-internal 0/0/fov\ + \ naming, plate provenance in per-position zattrs and the colocated provenance.json\ + \ sidecar)." +cell_type: A549 +imaging_modality: mantis-lightsheet +spacing: + z: 0.174 + y: 0.1494 + x: 0.1494 +channels: + source: Phase3D + auxiliary: + - Brightfield +targets: + tomm20: + gene: TOMM20 + organelle: mitochondria + display_name: Mitochondria (TOMM20) + target_channel: Structure + stores: + train: /hpc/projects/virtual_staining/training/dynacell/a549/mantis_v1/train/TOMM20_ZIKV.ozx + test: /hpc/projects/virtual_staining/training/dynacell/a549/mantis_v1/test/TOMM20_ZIKV.ozx + splits: splits/tomm20_train_test.yaml diff --git a/applications/dynacell/src/dynacell/_manifests/a549-mantis-tomm20-zikv/splits/tomm20_train_test.yaml b/applications/dynacell/src/dynacell/_manifests/a549-mantis-tomm20-zikv/splits/tomm20_train_test.yaml new file mode 100644 index 000000000..909097382 --- /dev/null +++ b/applications/dynacell/src/dynacell/_manifests/a549-mantis-tomm20-zikv/splits/tomm20_train_test.yaml @@ -0,0 +1,37 @@ +split_version: '1.0' +random_seed: 0 +selection_criteria: + source: a549-mantis condition-pooled assembly + target: tomm20 + condition: ZIKV + pool_naming: 0/0/fov sequential across contributing plates +train: + count: 12 + fovs: + - 0/0/fov0000 + - 0/0/fov0001 + - 0/0/fov0002 + - 0/0/fov0003 + - 0/0/fov0004 + - 0/0/fov0005 + - 0/0/fov0006 + - 0/0/fov0007 + - 0/0/fov0008 + - 0/0/fov0009 + - 0/0/fov0010 + - 0/0/fov0011 +test: + count: 12 + fovs: + - 0/0/fov0000 + - 0/0/fov0001 + - 0/0/fov0002 + - 0/0/fov0003 + - 0/0/fov0004 + - 0/0/fov0005 + - 0/0/fov0006 + - 0/0/fov0007 + - 0/0/fov0008 + - 0/0/fov0009 + - 0/0/fov0010 + - 0/0/fov0011 diff --git 
a/applications/dynacell/src/dynacell/_manifests/aics-hipsc/manifest.yaml b/applications/dynacell/src/dynacell/_manifests/aics-hipsc/manifest.yaml new file mode 100644 index 000000000..7043e8761 --- /dev/null +++ b/applications/dynacell/src/dynacell/_manifests/aics-hipsc/manifest.yaml @@ -0,0 +1,66 @@ +name: aics-hipsc +version: "4" +description: "WTC-11 hiPSC confocal dataset from Allen Institute for Cell Science" +cell_type: WTC-11 hiPSC +imaging_modality: confocal + +spacing: + z: 0.290 + y: 0.108 + x: 0.108 + +channels: + source: Phase3D + auxiliary: + - Brightfield + - Nuclei + - Membrane + +targets: + sec61b: + gene: SEC61B + organelle: er + display_name: "ER (Sec61b)" + target_channel: Structure + stores: + train: /hpc/projects/virtual_staining/training/dynacell/ipsc/dataset_v4/train/SEC61B.zarr + test: /hpc/projects/virtual_staining/training/dynacell/ipsc/dataset_v4/test_cropped/SEC61B.zarr + cell_segmentation: /hpc/projects/virtual_staining/training/dynacell/ipsc/dataset_v4/test_cropped/SEC61B_segmented_cleaned.zarr + gt_cache_dir: /hpc/projects/virtual_staining/training/dynacell/ipsc/eval_cache/SEC61B + splits: splits/sec61b_train_val_test.yaml + + tomm20: + gene: TOMM20 + organelle: mitochondria + display_name: "Mitochondria (TOMM20)" + target_channel: Structure + stores: + train: /hpc/projects/virtual_staining/training/dynacell/ipsc/dataset_v4/train/TOMM20.zarr + test: /hpc/projects/virtual_staining/training/dynacell/ipsc/dataset_v4/test_cropped/TOMM20.zarr + cell_segmentation: /hpc/projects/virtual_staining/training/dynacell/ipsc/dataset_v4/test_cropped/TOMM20_segmented_cleaned.zarr + gt_cache_dir: /hpc/projects/virtual_staining/training/dynacell/ipsc/eval_cache/TOMM20 + splits: splits/tomm20_train_val_test.yaml + + nucleus: + gene: Nuclei + organelle: nucleus + display_name: "Nucleus" + target_channel: Nuclei + stores: + train: /hpc/projects/virtual_staining/training/dynacell/ipsc/dataset_v4/train/cell.zarr + test: 
/hpc/projects/virtual_staining/training/dynacell/ipsc/dataset_v4/test_cropped/cell.zarr + cell_segmentation: /hpc/projects/virtual_staining/training/dynacell/ipsc/dataset_v4/test_cropped/cell_segmented_cleaned.zarr + gt_cache_dir: /hpc/projects/virtual_staining/training/dynacell/ipsc/eval_cache/nucleus + splits: splits/nucleus_train_val_test.yaml + + membrane: + gene: Membrane + organelle: membrane + display_name: "Membrane" + target_channel: Membrane + stores: + train: /hpc/projects/virtual_staining/training/dynacell/ipsc/dataset_v4/train/cell.zarr + test: /hpc/projects/virtual_staining/training/dynacell/ipsc/dataset_v4/test_cropped/cell.zarr + cell_segmentation: /hpc/projects/virtual_staining/training/dynacell/ipsc/dataset_v4/test_cropped/cell_segmented_cleaned.zarr + gt_cache_dir: /hpc/projects/virtual_staining/training/dynacell/ipsc/eval_cache/membrane + splits: splits/membrane_train_val_test.yaml diff --git a/applications/dynacell/src/dynacell/_manifests/aics-hipsc/splits/membrane_train_val_test.yaml b/applications/dynacell/src/dynacell/_manifests/aics-hipsc/splits/membrane_train_val_test.yaml new file mode 100644 index 000000000..575c3929d --- /dev/null +++ b/applications/dynacell/src/dynacell/_manifests/aics-hipsc/splits/membrane_train_val_test.yaml @@ -0,0 +1,12 @@ +split_version: "1.0" +random_seed: 42 +selection_criteria: + organelle: Membrane + source_store: cell.zarr + notes: "Membrane channel selected from shared cell.zarr (also serves nucleus target)." 
+train: + count: 500 + fovs: [] +test: + count: 100 + fovs: [] diff --git a/applications/dynacell/src/dynacell/_manifests/aics-hipsc/splits/nucleus_train_val_test.yaml b/applications/dynacell/src/dynacell/_manifests/aics-hipsc/splits/nucleus_train_val_test.yaml new file mode 100644 index 000000000..34606e731 --- /dev/null +++ b/applications/dynacell/src/dynacell/_manifests/aics-hipsc/splits/nucleus_train_val_test.yaml @@ -0,0 +1,12 @@ +split_version: "1.0" +random_seed: 42 +selection_criteria: + organelle: Nuclei + source_store: cell.zarr + notes: "Nucleus channel selected from shared cell.zarr (also serves membrane target)." +train: + count: 500 + fovs: [] +test: + count: 100 + fovs: [] diff --git a/applications/dynacell/src/dynacell/_manifests/aics-hipsc/splits/sec61b_train_val_test.yaml b/applications/dynacell/src/dynacell/_manifests/aics-hipsc/splits/sec61b_train_val_test.yaml new file mode 100644 index 000000000..d27c7b7c5 --- /dev/null +++ b/applications/dynacell/src/dynacell/_manifests/aics-hipsc/splits/sec61b_train_val_test.yaml @@ -0,0 +1,11 @@ +split_version: "1.0" +random_seed: 42 +selection_criteria: + organelle: SEC61B + min_depth: 44 +train: + count: 500 + fovs: [] +test: + count: 100 + fovs: [] diff --git a/applications/dynacell/src/dynacell/_manifests/aics-hipsc/splits/tomm20_train_val_test.yaml b/applications/dynacell/src/dynacell/_manifests/aics-hipsc/splits/tomm20_train_val_test.yaml new file mode 100644 index 000000000..f3aecdaba --- /dev/null +++ b/applications/dynacell/src/dynacell/_manifests/aics-hipsc/splits/tomm20_train_val_test.yaml @@ -0,0 +1,11 @@ +split_version: "1.0" +random_seed: 42 +selection_criteria: + organelle: TOMM20 + min_depth: 44 +train: + count: 500 + fovs: [] +test: + count: 100 + fovs: [] diff --git a/applications/dynacell/src/dynacell/celldiff_wrapper.py b/applications/dynacell/src/dynacell/celldiff_wrapper.py index 1085cf0e0..0bc7659e5 100644 --- a/applications/dynacell/src/dynacell/celldiff_wrapper.py +++ 
b/applications/dynacell/src/dynacell/celldiff_wrapper.py @@ -120,12 +120,15 @@ def fn(xt: Tensor, t: Tensor) -> Tensor: return target - def generate_non_overlapping(self, phase: Tensor, num_steps: int = 100) -> Tensor: - """Generate virtual staining via non-overlapping tiling. + def generate_sliding_window(self, phase: Tensor, num_steps: int = 100) -> Tensor: + """Generate virtual staining via tiled sliding window (stride == patch size). - Tiles the full input into non-overlapping patches matching - ``net.input_spatial_size``, generates each patch independently, - and assembles the results. + Partitions the input into non-overlapping patches of size + ``net.input_spatial_size``. Each patch is generated independently + with fresh Gaussian noise and the results are written back into the + corresponding region of the output tensor. The last tile along each + axis is snapped to the image edge, so it may overlap its predecessor + when the image size is not an exact multiple of the patch size. Parameters ---------- @@ -179,16 +182,26 @@ def fn( return out - def generate_sliding_window( + def generate_iterative( self, phase: Tensor, num_steps: int = 100, overlap_size: int | tuple[int, ...] = 256, ) -> Tensor: - """Generate virtual staining via overlapping sliding window. + """Generate virtual staining via overlapping sliding window with velocity anchoring. - Uses overlapping patches for generation, anchoring already-computed - values in the overlap region to guide subsequent patches. + Slides overlapping patches across the input. For each patch the + overlap region (already generated by an earlier patch) is used to + steer the ODE trajectory toward the previously computed output values + rather than letting the solver integrate freely. + + **Anchoring mechanism** (requires Linear path + velocity prediction): + At every ODE step the network predicts a velocity ``v``. Under the + Linear flow the starting point is ``x0 = xt - t * v``. 
For pixels in + the overlap region we override the velocity with + ``v_anchored = out_known - x0``, which is the exact velocity that + would integrate ``x0`` to the already-computed target ``out_known``. + Outside the overlap the free velocity ``v`` is used unchanged. Parameters ---------- @@ -204,6 +217,12 @@ def generate_sliding_window( ------- Tensor Predicted fluorescence of shape ``(..., D, H, W)``. + + Raises + ------ + NotImplementedError + If ``path_type`` is not ``"Linear"`` or ``prediction`` is not + ``"velocity"``, since the anchoring formula is path-specific. """ spatial = tuple(phase.shape[-3:]) patch_spatial = tuple(self.net.input_spatial_size) @@ -223,10 +242,9 @@ def generate_sliding_window( if not (0 <= ov < p_i): raise ValueError(f"overlap at dim {i} must satisfy 0 <= overlap < patch (got {ov} vs patch {p_i})") - # Overlap anchoring uses x0 = xt - t*v which assumes Linear path + velocity prediction. if self.path_type != "Linear" or self.prediction != "velocity": raise NotImplementedError( - "generate_sliding_window only supports Linear path with velocity prediction, " + "generate_iterative only supports Linear path with velocity prediction, " f"got path_type={self.path_type!r}, prediction={self.prediction!r}" ) @@ -269,15 +287,99 @@ def fn( _mask: Tensor = known_mask, ) -> Tensor: v = self.net(xt_, _p, t_) - # Reshape t from (B,) to (B, 1, 1, 1, 1) for broadcasting. + # Infer x0 from the Linear-path formula: x0 = xt - t*v. t_exp = t_.reshape(t_.shape[0], *([1] * (xt_.dim() - 1))) x0_ = xt_ - t_exp * v + # Velocity that integrates x0 exactly to the known target: v = x1 - x0. v_out = _out - x0_ + # Use the anchored velocity in the overlap region, free velocity elsewhere. return torch.where(_mask, v_out, v) patch_out = sample_fn(xt, fn)[-1] - # Preserve already-computed values in the overlap region. 
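The anchoring formula documented above is worth sanity-checking with plain numbers. A minimal, hypothetical 1-D sketch (not part of this diff), assuming the Linear path ``x_t = x0 + t*v`` with velocity prediction:

```python
# Hypothetical 1-D check of the velocity-anchoring identity described above.
# Assumed: Linear path (x_t = x0 + t * v) with velocity prediction.

def anchor(xt: float, t: float, v_pred: float, out_known: float) -> float:
    """Return the velocity that integrates the inferred x0 exactly to out_known."""
    x0 = xt - t * v_pred   # invert the Linear-path relation x_t = x0 + t * v
    return out_known - x0  # x0 + 1 * v_anchored == out_known by construction

x0_true, v_free, t = 0.2, 1.0, 0.4
xt = x0_true + t * v_free                     # state reached with the free velocity
v_anc = anchor(xt, t, v_free, out_known=0.9)  # v_anc = 0.9 - 0.2 = 0.7

# The inferred start point plus one unit of anchored velocity is the known target.
x0_inferred = xt - t * v_free
assert abs(x0_inferred + v_anc - 0.9) < 1e-12
```

In the sampler this substitution is re-evaluated at every ODE step, so overlap voxels are continually steered toward the previously generated values while free voxels follow the network's velocity.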
- patch_out = torch.where(known_mask, out_patch, patch_out) out[tuple(slicer)] = patch_out return out + + def denoise_sliding_window( + self, + phase: Tensor, + overlap_size: int | tuple[int, ...] = 0, + ) -> Tensor: + """Estimate the conditional mean via overlapping tiled single-step Euler updates. + + Slides overlapping patches across the input. Each patch is denoised + independently with fresh Gaussian noise and the results are accumulated + with a count tensor; overlapping regions are averaged, which reduces + variance and approximates the conditional mean. + + Parameters + ---------- + phase : Tensor + Phase contrast input of shape ``(..., D, H, W)``. + overlap_size : int or tuple of int + Overlap in each spatial dimension ``(od, oh, ow)``. + A single int applies the same overlap to all three dimensions. + + Returns + ------- + Tensor + Predicted fluorescence of shape ``(..., D, H, W)``. + """ + + if self.path_type != "Linear" or self.prediction != "velocity": + raise NotImplementedError( + "denoise_sliding_window only supports Linear path with velocity prediction, " + f"got path_type={self.path_type!r}, prediction={self.prediction!r}" + ) + + spatial = tuple(phase.shape[-3:]) + patch_spatial = tuple(self.net.input_spatial_size) + n_spatial = 3 + + if isinstance(overlap_size, int): + overlap = (overlap_size,) * n_spatial + else: + overlap = tuple(overlap_size) + if len(overlap) != n_spatial: + raise ValueError("overlap_size must be int or a 3-tuple") + + for i in range(n_spatial): + S, P, O = spatial[i], patch_spatial[i], overlap[i] + if S < P: + raise ValueError(f"spatial dim {i} ({S}) must be >= patch dim ({P})") + if not (0 <= O < P): + raise ValueError(f"overlap at dim {i} must satisfy 0 <= overlap < patch (got {O} vs {P})") + + in_ch = self.net.inconv.in_channels + out_shape = (*phase.shape[:-4], in_ch, *phase.shape[-3:]) + prediction_sum = torch.zeros(out_shape, device=phase.device, dtype=phase.dtype) + prediction_count = torch.zeros(out_shape, 
device=phase.device, dtype=phase.dtype) + + start_lists: list[list[int]] = [] + for i in range(n_spatial): + S, P, O = spatial[i], patch_spatial[i], overlap[i] + stride = P - O + last = S - P + starts = [0] + while starts[-1] + stride < last: + starts.append(starts[-1] + stride) + if starts[-1] != last: + starts.append(last) + start_lists.append(starts) + + with torch.no_grad(): + for starts in itertools.product(*start_lists): + slicer = [slice(None)] * phase.dim() + for i, st in enumerate(starts): + slicer[-(n_spatial - i)] = slice(st, st + patch_spatial[i]) + phase_patch = phase[tuple(slicer)] + xt = self._noise_like_target(phase_patch) + t = torch.zeros(xt.shape[0], device=xt.device, dtype=xt.dtype) + pred = self.net(xt, phase_patch, t) + patch_out = pred + xt + prediction_sum[tuple(slicer)] += patch_out + prediction_count[tuple(slicer)] += 1 + + if not torch.all(prediction_count > 0): + raise RuntimeError("sliding window left uncovered voxels") + return prediction_sum / prediction_count diff --git a/applications/dynacell/src/dynacell/data/__init__.py b/applications/dynacell/src/dynacell/data/__init__.py new file mode 100644 index 000000000..9843e505a --- /dev/null +++ b/applications/dynacell/src/dynacell/data/__init__.py @@ -0,0 +1,56 @@ +"""Dataset schemas and path-based loaders for the DynaCell benchmark.""" + +from dynacell.data.collections import ( + BenchmarkCollection, + ChannelEntry, + CollectionExperiment, + Provenance, + load_collection, +) +from dynacell.data.manifests import ( + DatasetManifest, + DatasetRef, + SplitDefinition, + StoreLocations, + TargetConfig, + VoxelSpacing, + get_target, + load_manifest, + load_splits, +) +from dynacell.data.resolver import ( + ManifestNotFoundError, + NoManifestRootsError, + ResolvedDataset, + TargetNotFoundError, + dataset_ref_from_dict, + discover_manifest_roots, + resolve_dataset_ref, +) +from dynacell.data.specs import BenchmarkSpec, load_benchmark_spec + +__all__ = [ + "BenchmarkCollection", + 
"BenchmarkSpec", + "ChannelEntry", + "CollectionExperiment", + "DatasetManifest", + "DatasetRef", + "ManifestNotFoundError", + "NoManifestRootsError", + "Provenance", + "ResolvedDataset", + "SplitDefinition", + "StoreLocations", + "TargetConfig", + "TargetNotFoundError", + "VoxelSpacing", + "dataset_ref_from_dict", + "discover_manifest_roots", + "get_target", + "load_benchmark_spec", + "load_collection", + "load_manifest", + "load_splits", + "resolve_dataset_ref", +] diff --git a/applications/dynacell/src/dynacell/data/_yaml.py b/applications/dynacell/src/dynacell/data/_yaml.py new file mode 100644 index 000000000..0da122a48 --- /dev/null +++ b/applications/dynacell/src/dynacell/data/_yaml.py @@ -0,0 +1,30 @@ +"""Shared OmegaConf + Pydantic YAML loading.""" + +from __future__ import annotations + +from pathlib import Path +from typing import TypeVar + +from omegaconf import OmegaConf +from pydantic import BaseModel + +T = TypeVar("T", bound=BaseModel) + + +def load_yaml(path: Path, model_class: type[T]) -> T: + """Load a YAML file and validate it against a Pydantic model. + + Parameters + ---------- + path : Path + Path to a YAML file. + model_class : type[T] + Pydantic model class to validate against. + + Returns + ------- + T + Validated model instance. 
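The window-start computation shared by the tiled generators above (stride = patch - overlap, with the last window snapped to the image edge) can be exercised in isolation. A hypothetical standalone mirror of that loop, not the code from this diff:

```python
def window_starts(size: int, patch: int, overlap: int) -> list[int]:
    # Mirrors the tiling loop in denoise_sliding_window: march by
    # stride = patch - overlap, then snap the final window so it ends
    # exactly at `size` (it may overlap its predecessor).
    stride = patch - overlap
    last = size - patch
    starts = [0]
    while starts[-1] + stride < last:
        starts.append(starts[-1] + stride)
    if starts[-1] != last:
        starts.append(last)
    return starts

# Every voxel is covered; the final window is snapped from 8 to 6.
starts = window_starts(10, 4, 0)
covered: set[int] = set()
for s in starts:
    covered.update(range(s, s + 4))
assert covered == set(range(10))
```

This is why the count-tensor check at the end of `denoise_sliding_window` should never fire: the snapped last window guarantees full coverage along each axis.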
+ """ + raw = OmegaConf.to_container(OmegaConf.load(path), resolve=True) + return model_class.model_validate(raw) diff --git a/applications/dynacell/src/dynacell/data/collections.py b/applications/dynacell/src/dynacell/data/collections.py new file mode 100644 index 000000000..9102fcd6c --- /dev/null +++ b/applications/dynacell/src/dynacell/data/collections.py @@ -0,0 +1,67 @@ +"""Frozen collection schemas for benchmark data curation.""" + +from __future__ import annotations + +from pathlib import Path + +from pydantic import BaseModel, Field + +from dynacell.data._yaml import load_yaml +from viscy_data.collection import ChannelEntry + + +class Provenance(BaseModel): + """Airtable-derived provenance for a frozen collection. + + Stricter than ``viscy_data.collection.Provenance`` — requires + ``created_at`` and ``created_by`` for benchmark traceability. + """ + + airtable_base_id: str | None = None + airtable_query: str | None = None + record_ids: list[str] = Field(default_factory=list) + created_at: str + created_by: str + + +class CollectionExperiment(BaseModel): + """One experiment within a benchmark collection.""" + + name: str + data_path: Path + channels: list[ChannelEntry] + perturbation_wells: dict[str, list[str]] | None = None + interval_minutes: float | None = None + start_hpi: float | None = None + marker: str | None = None + organelle: str | None = None + pixel_size_xy_um: float + pixel_size_z_um: float | None = None + exclude_fovs: list[str] = Field(default_factory=list) + + +class BenchmarkCollection(BaseModel): + """Frozen collection tying experiments to train/test FOV membership.""" + + name: str + description: str + provenance: Provenance + experiments: list[CollectionExperiment] + train_fovs: list[str] | None = None + test_fovs: list[str] | None = None + + +def load_collection(collection_path: Path) -> BenchmarkCollection: + """Load and validate a frozen benchmark collection. 
+ + Parameters + ---------- + collection_path : Path + Path to a collection YAML file. + + Returns + ------- + BenchmarkCollection + Validated collection. + """ + return load_yaml(collection_path, BenchmarkCollection) diff --git a/applications/dynacell/src/dynacell/data/manifests.py b/applications/dynacell/src/dynacell/data/manifests.py new file mode 100644 index 000000000..5189d88ee --- /dev/null +++ b/applications/dynacell/src/dynacell/data/manifests.py @@ -0,0 +1,179 @@ +"""Dataset manifest schemas and loaders for the DynaCell benchmark. + +Pydantic models that parse and validate YAML manifests. Loaders accept +explicit file paths — no import-time registry or hardcoded config roots. +""" + +from __future__ import annotations + +from functools import lru_cache +from pathlib import Path + +from pydantic import BaseModel, field_validator, model_validator + +from dynacell.data._yaml import load_yaml + + +class DatasetRef(BaseModel): + """Reference to a dataset target, resolved against a manifest registry. + + Carried under ``benchmark.dataset_ref`` in benchmark leaf configs. + The composition-time resolver reads this reference and splices + ``data_path``, ``source_channel``, and ``target_channel`` into the + composed Lightning config. 
+ """ + + dataset: str + target: str + + +class VoxelSpacing(BaseModel): + """Physical voxel spacing in micrometers.""" + + z: float + y: float + x: float + + def as_list(self) -> list[float]: + """Return spacing as ``[z, y, x]`` list for metric functions.""" + return [self.z, self.y, self.x] + + +class StoreLocations(BaseModel): + """Zarr store paths for a single organelle target.""" + + train: Path + test: Path + cell_segmentation: Path | None = None + gt_cache_dir: Path | None = None + + +class TargetConfig(BaseModel): + """Configuration for a single organelle prediction target.""" + + gene: str + organelle: str + display_name: str + target_channel: str + stores: StoreLocations + splits: str + + +class DatasetManifest(BaseModel): + """Top-level dataset manifest.""" + + name: str + version: str + description: str + cell_type: str + imaging_modality: str + spacing: VoxelSpacing + channels: dict[str, str | list[str]] + targets: dict[str, TargetConfig] + + @field_validator("targets") + @classmethod + def _targets_not_empty(cls, v: dict) -> dict: + """Validate that at least one target is defined.""" + if not v: + raise ValueError("Manifest must define at least one target.") + return v + + @property + def source_channel(self) -> str: + """Return the single source channel name for source-target datasets. + + ``channels["source"]`` may be a string or a single-element list; a + multi-element list is rejected since downstream ``HCSDataModule`` + takes one channel name. 
+ """ + source = self.channels["source"] + if isinstance(source, str): + return source + if isinstance(source, list) and len(source) == 1: + return source[0] + raise ValueError(f"Manifest source channel must be a string or single-element list, got {source!r}.") + + +class SplitDefinition(BaseModel): + """Train/val/test FOV split for one organelle.""" + + split_version: str + random_seed: int + source_stores: list[Path] | None = None + selection_criteria: dict | None = None + train: dict + test: dict + val: dict | None = None + + @model_validator(mode="after") + def _check_counts(self) -> SplitDefinition: + """Validate count matches len(fovs) when fovs is non-empty.""" + for split_name in ("train", "val", "test"): + split = getattr(self, split_name) + if split is None: + continue + fovs = split.get("fovs", []) + if fovs and "count" in split: + if len(fovs) != split["count"]: + raise ValueError(f"{split_name} declares count={split['count']} but has {len(fovs)} FOVs.") + return self + + +@lru_cache(maxsize=64) +def load_manifest(manifest_path: Path) -> DatasetManifest: + """Load and validate a dataset manifest from a YAML file. + + Cached by resolved path; manifests are treated as immutable within a + process (same policy as :func:`viscy_utils.compose._load_yaml_cached`). + + Parameters + ---------- + manifest_path : Path + Path to a dataset manifest YAML file. + + Returns + ------- + DatasetManifest + Validated manifest. + """ + return load_yaml(manifest_path, DatasetManifest) + + +def load_splits(split_path: Path) -> SplitDefinition: + """Load and validate a split definition from a YAML file. + + Parameters + ---------- + split_path : Path + Path to a split definition YAML file. + + Returns + ------- + SplitDefinition + Validated split definition. + """ + return load_yaml(split_path, SplitDefinition) + + +def get_target(manifest: DatasetManifest, target_name: str) -> TargetConfig: + """Get a specific target from a loaded manifest. 
+
+    Parameters
+    ----------
+    manifest : DatasetManifest
+        A loaded dataset manifest.
+    target_name : str
+        Name of the target (e.g., ``"sec61b"``).
+
+    Returns
+    -------
+    TargetConfig
+        Target configuration.
+
+    Raises
+    ------
+    KeyError
+        If ``target_name`` is not in the manifest.
+    """
+    return manifest.targets[target_name]
diff --git a/applications/dynacell/src/dynacell/data/resolver.py b/applications/dynacell/src/dynacell/data/resolver.py
new file mode 100644
index 000000000..29a26ad7a
--- /dev/null
+++ b/applications/dynacell/src/dynacell/data/resolver.py
@@ -0,0 +1,196 @@
+"""Manifest-driven dataset reference resolution for the DynaCell benchmark.
+
+Turns a :class:`DatasetRef` (``{dataset, target}``) into concrete paths and
+channel names by reading a Pydantic :class:`DatasetManifest` YAML discovered
+via manifest roots. Callers compose this with the config pipeline via
+:mod:`dynacell._compose_hook`.
+
+Manifest root precedence (highest wins):
+
+1. ``cli_roots`` argument.
+2. ``DYNACELL_MANIFEST_ROOTS`` env var (``os.pathsep``-separated paths).
+3. Python entry points under group ``dynacell.manifest_roots``.
+
+For each root (in order), the resolver looks for
+``<root>/<dataset>/manifest.yaml``. First hit wins. No recursion, no
+globbing.
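+
+Illustrative registry layout under a single root (dataset slugs are
+hypothetical)::
+
+    /path/to/datasets/          <- one manifest root
+        ipsc_confocal/
+            manifest.yaml
+        another_dataset/
+            manifest.yaml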
+""" + +from __future__ import annotations + +import os +from importlib import resources +from importlib.metadata import entry_points +from pathlib import Path + +from pydantic import BaseModel + +from dynacell.data.manifests import ( + DatasetRef, + VoxelSpacing, + load_manifest, +) + + +class NoManifestRootsError(RuntimeError): + """No manifest roots could be discovered from CLI, env, or entry points.""" + + +class ManifestNotFoundError(LookupError): + """Dataset slug not found under any configured manifest root.""" + + +class TargetNotFoundError(LookupError): + """Target slug not present in the located dataset manifest.""" + + +class ResolvedDataset(BaseModel): + """Flat view of the manifest fields a composed config needs.""" + + manifest_path: Path + data_path_train: Path + data_path_test: Path + source_channel: str + target_channel: str + spacing: VoxelSpacing + cell_segmentation_path: Path | None = None + gt_cache_dir: Path | None = None + + +_ENV_VAR = "DYNACELL_MANIFEST_ROOTS" +_ENTRY_POINT_GROUP = "dynacell.manifest_roots" + +REQUIRED_REF_KEYS: tuple[str, ...] = ("dataset", "target") + + +def dataset_ref_from_dict(ref_dict: object) -> DatasetRef | None: + """Validate a ``benchmark.dataset_ref`` dict, returning ``None`` for partial refs. + + Shared between the Lightning-side compose hook and the Hydra-side + eval hook so the "full ref vs partial ref vs no ref" policy stays + identical across surfaces. A missing dict, non-dict value, or + partial dict (either ``dataset`` or ``target`` missing) is treated + as a no-op signal (returns ``None``). A dict with both keys present + is validated via Pydantic — malformed values surface as the usual + :class:`pydantic.ValidationError`. 
+ """ + if not isinstance(ref_dict, dict): + return None + if not all(k in ref_dict for k in REQUIRED_REF_KEYS): + return None + return DatasetRef.model_validate(ref_dict) + + +def _entry_point_roots() -> list[Path]: + """Resolve entry-point-registered manifest roots to package resource dirs.""" + roots: list[Path] = [] + for ep in entry_points(group=_ENTRY_POINT_GROUP): + module = ep.load() + resource_dir = resources.files(module) + roots.append(Path(str(resource_dir))) + return roots + + +def discover_manifest_roots(cli_roots: list[Path] | None = None) -> list[Path]: + """Return manifest roots in precedence order (CLI → env var → entry points). + + Parameters + ---------- + cli_roots : list[Path] or None + Explicit roots provided by the caller. If given, they take + precedence over environment and entry points but do not replace + them — lower-precedence roots still contribute. + + Returns + ------- + list[Path] + Non-empty list of roots to scan. + + Raises + ------ + NoManifestRootsError + If no roots are configured at any precedence level. + """ + roots: list[Path] = [] + if cli_roots: + roots.extend(Path(p) for p in cli_roots) + env_value = os.environ.get(_ENV_VAR) + if env_value: + roots.extend(Path(p) for p in env_value.split(os.pathsep) if p) + roots.extend(_entry_point_roots()) + if not roots: + raise NoManifestRootsError( + "No dynacell manifest roots configured.\n\n" + "VisCy ships its own bundled registry at " + "``dynacell._manifests``; this error means the entry-point " + "provider declared in applications/dynacell/pyproject.toml " + "didn't load.\n\n" + "Confirm dynacell was installed cleanly (``uv sync`` from the " + "VisCy worktree). 
To override with a different registry, set "
+            f"``{_ENV_VAR}=/path/to/datasets`` (env var) or pass "
+            "``cli_roots=`` to ``discover_manifest_roots``.\n"
+        )
+    return roots
+
+
+def _find_manifest(dataset: str, roots: list[Path]) -> Path:
+    """Return the first ``<root>/<dataset>/manifest.yaml`` that exists."""
+    searched: list[Path] = []
+    for root in roots:
+        candidate = root / dataset / "manifest.yaml"
+        searched.append(candidate)
+        if candidate.is_file():
+            return candidate
+    lines = "\n".join(f" - {p}" for p in searched)
+    raise ManifestNotFoundError(f"dataset {dataset!r} not found.\n\nSearched:\n{lines}\n")
+
+
+def resolve_dataset_ref(
+    ref: DatasetRef,
+    roots: list[Path] | None = None,
+) -> ResolvedDataset:
+    """Resolve a :class:`DatasetRef` against the manifest registry.
+
+    Parameters
+    ----------
+    ref : DatasetRef
+        The reference to resolve.
+    roots : list[Path] or None
+        Optional explicit roots (CLI-provided). Falls back to env var and
+        entry points per :func:`discover_manifest_roots`.
+
+    Returns
+    -------
+    ResolvedDataset
+        Flat view of the fields the composed config needs.
+
+    Raises
+    ------
+    NoManifestRootsError
+        If no manifest roots are configured.
+    ManifestNotFoundError
+        If the dataset slug is not found under any root.
+    TargetNotFoundError
+        If the target slug is not defined in the located manifest.
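+
+    Example (root path and slugs are illustrative)::
+
+        >>> ref = DatasetRef(dataset="ipsc_confocal", target="sec61b")
+        >>> resolved = resolve_dataset_ref(ref, roots=[Path("/path/to/datasets")])
+        >>> resolved.source_channel, resolved.target_channel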
+ """ + all_roots = discover_manifest_roots(roots) + manifest_path = _find_manifest(ref.dataset, all_roots) + manifest = load_manifest(manifest_path) + if ref.target not in manifest.targets: + available = ", ".join(sorted(manifest.targets)) or "(none)" + raise TargetNotFoundError( + f"target {ref.target!r} not found in dataset {ref.dataset!r}.\n\n" + f"Manifest: {manifest_path}\n" + f"Available targets: {available}\n" + ) + target = manifest.targets[ref.target] + return ResolvedDataset( + manifest_path=manifest_path, + data_path_train=target.stores.train, + data_path_test=target.stores.test, + source_channel=manifest.source_channel, + target_channel=target.target_channel, + spacing=manifest.spacing, + cell_segmentation_path=target.stores.cell_segmentation, + gt_cache_dir=target.stores.gt_cache_dir, + ) diff --git a/applications/dynacell/src/dynacell/data/specs.py b/applications/dynacell/src/dynacell/data/specs.py new file mode 100644 index 000000000..f72b694c1 --- /dev/null +++ b/applications/dynacell/src/dynacell/data/specs.py @@ -0,0 +1,41 @@ +"""Benchmark spec schemas for reproducible benchmark runs.""" + +from __future__ import annotations + +from pathlib import Path + +from pydantic import BaseModel, Field + +from dynacell.data._yaml import load_yaml + + +class BenchmarkSpec(BaseModel): + """Executable benchmark recipe tying together pipeline stages.""" + + name: str + version: str + description: str + collection_path: Path + preprocess_configs: list[Path] = Field(default_factory=list) + train_preset: str | None = None + predict_preset: str | None = None + evaluate_config: Path | None = None + report_config: Path | None = None + output_root: Path + checkpoint_path: Path | None = None + + +def load_benchmark_spec(spec_path: Path) -> BenchmarkSpec: + """Load and validate a benchmark spec. + + Parameters + ---------- + spec_path : Path + Path to a benchmark spec YAML file. + + Returns + ------- + BenchmarkSpec + Validated benchmark spec. 
+ """ + return load_yaml(spec_path, BenchmarkSpec) diff --git a/applications/dynacell/src/dynacell/engine.py b/applications/dynacell/src/dynacell/engine.py index 60dba120b..e2768d3d0 100644 --- a/applications/dynacell/src/dynacell/engine.py +++ b/applications/dynacell/src/dynacell/engine.py @@ -5,54 +5,61 @@ """ import inspect +import itertools +import logging from typing import Literal, Sequence import numpy as np import torch import torch.nn.functional as F from lightning.pytorch import LightningModule -from monai.optimizers import WarmupCosineSchedule from monai.transforms import DivisiblePad from torch import Tensor, nn -from torch.optim.lr_scheduler import ConstantLR from dynacell.celldiff_wrapper import CELLDiff3DVS from viscy_data import Sample from viscy_models import Unet3d, UNeXt2 from viscy_models.celldiff import CELLDiffNet, UNetViT3D +from viscy_models.unet.fcmae import FullyConvolutionalMAE from viscy_utils.log_images import detach_sample, log_image_grid +from viscy_utils.optimizers import configure_adamw_scheduler + +_logger = logging.getLogger("lightning.pytorch") _ARCHITECTURE: dict[str, type[nn.Module]] = { "UNetViT3D": UNetViT3D, "FNet3D": Unet3d, "UNeXt2": UNeXt2, + "fcmae": FullyConvolutionalMAE, } -def _configure_adamw_scheduler( - module: LightningModule, - model: nn.Module, - lr: float, - schedule: str, -) -> tuple[list, list]: - """Build AdamW optimizer with WarmupCosine or Constant LR schedule. +def _aggregate_validation_losses( + validation_losses: list[list[tuple[Tensor, int]]], +) -> Tensor: + """Compute sample-weighted mean loss across dataloaders. + + Parameters + ---------- + validation_losses : list of list of (Tensor, int) + Per-dataloader list of ``(scalar_loss, batch_size)`` tuples + accumulated during validation. - Shared by :class:`DynacellUNet` and :class:`DynacellFlowMatching`. + Returns + ------- + Tensor + Scalar weighted mean loss. 
""" - optimizer = torch.optim.AdamW(model.parameters(), lr=lr) - if schedule == "WarmupCosine": - scheduler = WarmupCosineSchedule( - optimizer, - warmup_steps=3, - t_total=module.trainer.estimated_stepping_batches, - warmup_multiplier=1e-3, - ) - return [optimizer], [{"scheduler": scheduler, "interval": "step"}] - elif schedule == "Constant": - scheduler = ConstantLR(optimizer, factor=1, total_iters=module.trainer.max_epochs) - else: - raise ValueError(f"Unknown schedule {schedule!r}, expected 'WarmupCosine' or 'Constant'") - return [optimizer], [scheduler] + dl_means: list[Tensor] = [] + dl_totals: list[Tensor] = [] + for dl_batches in validation_losses: + losses, sizes = zip(*dl_batches) + sizes_t = torch.tensor(sizes, dtype=torch.float, device=losses[0].device) + dl_means.append((torch.stack(losses) * sizes_t).sum() / sizes_t.sum()) + dl_totals.append(sizes_t.sum()) + total_n = torch.stack(dl_totals).sum() + weighted = torch.stack([m * n for m, n in zip(dl_means, dl_totals)]).sum() + return weighted / total_n def _make_divisible_pad(model: nn.Module) -> DivisiblePad: @@ -93,7 +100,7 @@ class DynacellUNet(LightningModule): Parameters ---------- - architecture : {"UNetViT3D", "FNet3D", "UNeXt2"} + architecture : {"UNetViT3D", "FNet3D", "UNeXt2", "fcmae"} Architecture key selecting the backbone. model_config : dict | None Keyword arguments forwarded to the backbone constructor. @@ -111,23 +118,42 @@ class DynacellUNet(LightningModule): YX shape for example input (used by FNet3D for graph logging). Ignored when the model provides ``input_spatial_size``. ckpt_path : str | None - Checkpoint path to load model weights. + Path to a checkpoint to load **weights only** at construction time. + Intended for inference (predict/test), not training resumption — + optimizer state, epoch counters, and scheduler state are not + restored. 
+ encoder_only : bool, default False + When True, ``ckpt_path`` must be set, and only the + ``model.encoder.*`` weights are loaded (decoder/head stay at fresh + init). Intended for finetuning from an FCMAE-pretrained encoder. + Only supported for ``architecture='fcmae'``. + + Note: on resumed runs (via trainer-level ``--ckpt_path``), this + pre-load still fires in ``__init__`` before Lightning restores + the resume checkpoint, and the resume state overwrites it. The + file at ``ckpt_path`` must therefore remain accessible for the + lifetime of any run based on a pretrained leaf. """ def __init__( self, - architecture: Literal["UNetViT3D", "FNet3D", "UNeXt2"] = "UNetViT3D", + architecture: Literal["UNetViT3D", "FNet3D", "UNeXt2", "fcmae"] = "UNetViT3D", model_config: dict | None = None, loss_function: nn.Module | None = None, lr: float = 1e-3, schedule: Literal["WarmupCosine", "Constant"] = "Constant", + warmup_steps: int = 3, + warmup_multiplier: float = 1e-3, log_batches_per_epoch: int = 8, log_samples_per_batch: int = 1, example_input_yx_shape: Sequence[int] = (256, 256), + predict_method: Literal["full_image", "sliding_window"] = "full_image", + predict_overlap: tuple[int, int, int] = (4, 256, 256), ckpt_path: str | None = None, + encoder_only: bool = False, ) -> None: super().__init__() - self.save_hyperparameters(ignore=["loss_function", "ckpt_path"]) + self.save_hyperparameters(ignore=["loss_function", "ckpt_path", "encoder_only"]) if model_config is None: model_config = {} net_class = _ARCHITECTURE.get(architecture) @@ -137,8 +163,13 @@ def __init__( self.loss_function = loss_function if loss_function is not None else nn.MSELoss() self.lr = lr self.schedule = schedule + self.warmup_steps = warmup_steps + self.warmup_multiplier = warmup_multiplier self.log_batches_per_epoch = log_batches_per_epoch self.log_samples_per_batch = log_samples_per_batch + self.predict_method = predict_method + self.predict_overlap = predict_overlap + self.training_step_outputs: 
list = [] # Each entry is a list of (loss, batch_size) tuples for weighted aggregation. self.validation_losses: list[list[tuple[Tensor, int]]] = [] @@ -161,7 +192,17 @@ def __init__( h, w = example_input_yx_shape self.example_input_array = torch.rand(1, in_channels, d, h, w) - if ckpt_path is not None: + if encoder_only: + if ckpt_path is None: + raise ValueError("DynacellUNet(encoder_only=True) requires ckpt_path to be set") + if not isinstance(self.model, FullyConvolutionalMAE): + raise ValueError(f"encoder_only is only supported for architecture='fcmae', got {architecture!r}") + state_dict = torch.load(ckpt_path, weights_only=True, map_location="cpu")["state_dict"] + prefix = "model.encoder." + encoder_weights = {k.removeprefix(prefix): v for k, v in state_dict.items() if k.startswith(prefix)} + self.model.encoder.load_state_dict(encoder_weights, strict=True) + _logger.info(f"Loaded {len(encoder_weights)} encoder parameters from {ckpt_path}") + elif ckpt_path is not None: self.load_state_dict(torch.load(ckpt_path, weights_only=True, map_location="cpu")["state_dict"]) def forward(self, x: Tensor) -> Tensor: @@ -278,7 +319,14 @@ def predict_step(self, batch: Sample, batch_idx: int, dataloader_idx: int = 0) - source = batch["source"] original_shape = source.shape[2:] source = self._predict_pad(source) - prediction = self.forward(source) + if self.predict_method == "full_image": + prediction = self.forward(source) + elif self.predict_method == "sliding_window": + prediction = self.predict_sliding_window(source, overlap_size=self.predict_overlap) + else: + raise ValueError( + f"Unknown predict_method: {self.predict_method!r}. Choose 'full_image' or 'sliding_window'." 
+ ) return _center_crop_to_shape(prediction, original_shape) def on_train_epoch_end(self): @@ -291,24 +339,20 @@ def on_validation_epoch_end(self): super().on_validation_epoch_end() self._log_samples("val_samples", self.validation_step_outputs) if self.validation_losses: - # Compute per-dataloader weighted mean, then weight dataloaders by sample count. - dl_means, dl_totals = [], [] - for dl_batches in self.validation_losses: - losses, sizes = zip(*dl_batches) - # Create sizes on the same device as the losses to avoid device - # mismatch on GPU/DDP where losses are on the model device. - sizes_t = torch.tensor(sizes, dtype=torch.float, device=losses[0].device) - dl_means.append((torch.stack(losses) * sizes_t).sum() / sizes_t.sum()) - dl_totals.append(sizes_t.sum()) - total_n = torch.stack(dl_totals).sum() - weighted = sum(m * n for m, n in zip(dl_means, dl_totals)) - self.log("loss/validate", weighted / total_n, sync_dist=True) + self.log("loss/validate", _aggregate_validation_losses(self.validation_losses), sync_dist=True) self.validation_step_outputs.clear() self.validation_losses.clear() def configure_optimizers(self): """Configure AdamW optimizer with LR scheduler.""" - return _configure_adamw_scheduler(self, self.model, self.lr, self.schedule) + return configure_adamw_scheduler( + self, + self.model, + self.lr, + self.schedule, + warmup_steps=self.warmup_steps, + warmup_multiplier=self.warmup_multiplier, + ) def _log_samples(self, key: str, imgs: Sequence[Sequence[np.ndarray]]): """Log image grid to the active logger.""" @@ -316,6 +360,73 @@ def _log_samples(self, key: str, imgs: Sequence[Sequence[np.ndarray]]): return log_image_grid(self.logger, key, imgs, self.current_epoch) + def predict_sliding_window(self, source: Tensor, overlap_size: tuple[int, int, int] = (4, 256, 256)) -> Tensor: + """Run sliding-window inference over a large input volume. + + Overlapping regions are averaged across all covering patches. 
+ + Parameters + ---------- + source : Tensor + Input tensor of shape ``(B, C, D, H, W)``. + overlap_size : tuple of int + Overlap in ``(D, H, W)`` between adjacent patches. + + Returns + ------- + Tensor + Prediction with the same spatial shape as ``source``. + """ + spatial = source.shape[-3:] + patch_spatial = tuple(self.model.input_spatial_size) + n_spatial = 3 + overlap = tuple(overlap_size) + + for i in range(n_spatial): + S, P, ov = spatial[i], patch_spatial[i], overlap[i] + if S < P: + raise ValueError(f"spatial dim {i} size {S} must be >= patch size {P}") + if not (0 <= ov < P): + raise ValueError(f"overlap at dim {i} must satisfy 0 <= overlap < patch (got {ov} vs {P})") + + # Accumulators are allocated lazily from the first patch output so + # their channel dimension matches the model's out_channels (which can + # differ from source's in_channels, e.g. 1 phase in -> 2 target out). + prediction_sum: Tensor | None = None + prediction_count: Tensor | None = None + + start_lists = [] + for i in range(n_spatial): + S, P, ov = spatial[i], patch_spatial[i], overlap[i] + stride = P - ov + last = S - P + starts = [0] + while starts[-1] + stride < last: + starts.append(starts[-1] + stride) + if starts[-1] != last: + starts.append(last) + start_lists.append(starts) + + with torch.no_grad(): + for starts in itertools.product(*start_lists): + slicer: list = [slice(None)] * source.ndim + for i, st in enumerate(starts): + slicer[-(n_spatial - i)] = slice(st, st + patch_spatial[i]) + patch_out = self.forward(source[tuple(slicer)]) + if prediction_sum is None: + out_shape = list(source.shape) + out_shape[1] = patch_out.shape[1] + prediction_sum = torch.zeros(out_shape, device=source.device, dtype=patch_out.dtype) + prediction_count = torch.zeros(out_shape, device=source.device, dtype=patch_out.dtype) + prediction_sum[tuple(slicer)] += patch_out + prediction_count[tuple(slicer)] += 1 + + if prediction_sum is None: + raise RuntimeError("sliding window produced no patches") 
+ if not torch.all(prediction_count > 0): + raise RuntimeError("sliding window left uncovered voxels") + return prediction_sum / prediction_count + class DynacellFlowMatching(LightningModule): """Flow-matching LightningModule for generative virtual staining. @@ -346,11 +457,30 @@ class DynacellFlowMatching(LightningModule): num_log_steps : int Number of ODE steps for validation image generation (cheaper than ``num_generate_steps``). - predict_method : {"generate", "non_overlapping", "sliding_window"} - Prediction generation method. ``"generate"`` runs single-patch ODE - (default, matches standard HCS tile workflow). + compute_validation_loss : bool + Whether to compute and log flow-matching validation loss on the + validation loader. Disabled by default to preserve the previous + cheaper validation behavior. + predict_method : {"denoise", "generate", "sliding_window", "iterative"} + Prediction generation method. ``"generate"`` runs single-patch ODE + (default, matches standard HCS tile workflow). ``"sliding_window"`` + partitions the volume into **non-overlapping** tiles (ignores + ``predict_overlap``; passing a non-zero overlap raises so users + aren't silently misled). ``"iterative"`` slides overlapping tiles + with velocity anchoring — use this when you want + ``predict_overlap`` to apply. ``"denoise"`` uses the noise-space + overlap tiler. predict_overlap : int or tuple of int - Overlap for sliding-window prediction. + Overlap for ``denoise`` and ``iterative``. Ignored by + ``sliding_window``; must be ``0`` or ``[0, 0, 0]`` when + ``predict_method='sliding_window'``. + ckpt_path : str | None + Path to a checkpoint to load **weights only** at construction time. + Intended for inference (predict/test), not training resumption — + optimizer state, epoch counters, and scheduler state are not + restored. Bypasses LightningCLI's checkpoint hparam merging, so + predict-time settings (``predict_method``, ``predict_overlap``, + etc.) 
are taken from the config rather than the checkpoint. """ def __init__( @@ -359,27 +489,39 @@ def __init__( transport_config: dict | None = None, lr: float = 1e-4, schedule: Literal["WarmupCosine", "Constant"] = "WarmupCosine", + warmup_steps: int = 3, + warmup_multiplier: float = 1e-3, log_batches_per_epoch: int = 8, log_samples_per_batch: int = 1, num_generate_steps: int = 100, num_log_steps: int = 10, - predict_method: Literal["generate", "non_overlapping", "sliding_window"] = "generate", + compute_validation_loss: bool = False, + predict_method: Literal["denoise", "generate", "sliding_window", "iterative"] = "generate", predict_overlap: int | tuple[int, int, int] = 256, + ckpt_path: str | None = None, ) -> None: super().__init__() - self.save_hyperparameters() + self.save_hyperparameters( + ignore=["predict_method", "predict_overlap", "num_generate_steps", "num_log_steps", "ckpt_path"] + ) net = CELLDiffNet(**(net_config or {})) self.model = CELLDiff3DVS(net, **(transport_config or {})) self.lr = lr self.schedule = schedule + self.warmup_steps = warmup_steps + self.warmup_multiplier = warmup_multiplier self.log_batches_per_epoch = log_batches_per_epoch self.log_samples_per_batch = log_samples_per_batch self.num_generate_steps = num_generate_steps self.num_log_steps = num_log_steps + self.compute_validation_loss = compute_validation_loss self.predict_method = predict_method self.predict_overlap = predict_overlap self._training_step_outputs: list = [] + self._validation_losses: list[list[tuple[Tensor, int]]] = [] self._val_log_batch: tuple[Tensor, Tensor] | None = None + if ckpt_path is not None: + self.load_state_dict(torch.load(ckpt_path, weights_only=True, map_location="cpu")["state_dict"]) def training_step(self, batch: dict, batch_idx: int) -> Tensor: """Compute flow-matching training loss for one batch. 
@@ -414,16 +556,27 @@ def training_step(self, batch: dict, batch_idx: int) -> Tensor: return loss def validation_step(self, batch: dict, batch_idx: int, dataloader_idx: int = 0) -> None: - """Capture one validation batch for epoch-end generation logging. - - Flow-matching does not compute a validation loss. - """ + """Capture validation samples and optionally compute loss.""" if batch_idx == 0 and self._val_log_batch is None: n = self.log_samples_per_batch self._val_log_batch = ( batch["source"][:n].clone(), batch["target"][:n].clone(), ) + if not self.compute_validation_loss: + return + phase: Tensor = batch["source"] + target: Tensor = batch["target"] + loss = self.model(phase, target) + if dataloader_idx + 1 > len(self._validation_losses): + self._validation_losses.append([]) + self._validation_losses[dataloader_idx].append((loss.detach(), phase.shape[0])) + self.log( + f"loss/val/{dataloader_idx}", + loss, + sync_dist=True, + batch_size=phase.shape[0], + ) def on_train_epoch_end(self) -> None: """Log training image samples at end of epoch.""" @@ -433,13 +586,17 @@ def on_train_epoch_end(self) -> None: def on_validation_epoch_end(self) -> None: """Generate ODE samples from captured validation batch and log.""" super().on_validation_epoch_end() - if self._val_log_batch is not None and self.logger is not None: - phase_log, target_log = self._val_log_batch - n = min(self.log_samples_per_batch, phase_log.shape[0]) - generated = self.model.generate(phase_log[:n], num_steps=self.num_log_steps) - gen_samples = detach_sample((phase_log[:n], target_log[:n], generated), n) - self._log_samples("val_generated_samples", gen_samples) + if self._val_log_batch is not None: + if self.logger is not None: + phase_log, target_log = self._val_log_batch + n = min(self.log_samples_per_batch, phase_log.shape[0]) + generated = self.model.generate(phase_log[:n], num_steps=self.num_log_steps) + gen_samples = detach_sample((phase_log[:n], target_log[:n], generated), n) + 
self._log_samples("val_generated_samples", gen_samples) self._val_log_batch = None + if self._validation_losses: + self.log("loss/validate", _aggregate_validation_losses(self._validation_losses), sync_dist=True) + self._validation_losses.clear() def predict_step(self, batch: dict, batch_idx: int, dataloader_idx: int = 0) -> Tensor: """Generate virtual staining for one batch via ODE sampling. @@ -473,12 +630,27 @@ def predict_step(self, batch: dict, batch_idx: int, dataloader_idx: int = 0) -> pad.extend([0, max(0, p - s)]) source = F.pad(source, pad, mode="replicate") - if self.predict_method == "generate": + if self.predict_method == "denoise": + prediction = self.model.denoise_sliding_window(source, overlap_size=self.predict_overlap) + elif self.predict_method == "generate": prediction = self.model.generate(source, num_steps=self.num_generate_steps) - elif self.predict_method == "non_overlapping": - prediction = self.model.generate_non_overlapping(source, num_steps=self.num_generate_steps) elif self.predict_method == "sliding_window": - prediction = self.model.generate_sliding_window( + # generate_sliding_window partitions into non-overlapping tiles + # and does NOT consume predict_overlap. A non-zero overlap means + # the user wants overlapping tiled inference — route them to + # `iterative`, which anchors overlapping regions via velocity. + overlap = self.predict_overlap + overlap_values = (overlap,) * 3 if isinstance(overlap, int) else tuple(overlap) + if any(o > 0 for o in overlap_values): + raise ValueError( + "predict_method='sliding_window' uses non-overlapping tiles and " + f"ignores predict_overlap (got {overlap_values}). " + "Use predict_method='iterative' for overlap-anchored tiled inference, " + "or set predict_overlap=[0, 0, 0] to acknowledge the non-overlapping behavior." 
+ ) + prediction = self.model.generate_sliding_window(source, num_steps=self.num_generate_steps) + elif self.predict_method == "iterative": + prediction = self.model.generate_iterative( source, num_steps=self.num_generate_steps, overlap_size=self.predict_overlap, @@ -486,14 +658,21 @@ def predict_step(self, batch: dict, batch_idx: int, dataloader_idx: int = 0) -> else: raise ValueError( f"Unknown predict_method: {self.predict_method!r}. " - "Choose 'generate', 'non_overlapping', or 'sliding_window'." + "Choose 'denoise', 'generate', 'sliding_window', or 'iterative'." ) return prediction[:, :, : original_shape[0], : original_shape[1], : original_shape[2]] def configure_optimizers(self): """Configure AdamW optimizer with LR scheduler.""" - return _configure_adamw_scheduler(self, self.model, self.lr, self.schedule) + return configure_adamw_scheduler( + self, + self.model, + self.lr, + self.schedule, + warmup_steps=self.warmup_steps, + warmup_multiplier=self.warmup_multiplier, + ) def _log_samples(self, key: str, imgs: Sequence[Sequence[np.ndarray]]) -> None: """Log image grid to the active logger.""" diff --git a/applications/dynacell/src/dynacell/evaluation/README.md b/applications/dynacell/src/dynacell/evaluation/README.md new file mode 100644 index 000000000..8e502eaf8 --- /dev/null +++ b/applications/dynacell/src/dynacell/evaluation/README.md @@ -0,0 +1,370 @@ +# dynacell.evaluation + +End-to-end evaluation pipeline for virtual staining predictions against fluorescence ground truth. + +## Components + +| Module | Purpose | +|---|---| +| `pipeline.py` | Hydra-driven orchestrator. Loads prediction/GT OME-Zarr plates, computes per-FOV per-timepoint metrics, saves CSVs + NPYs + plots. CLI entrypoint: `dynacell evaluate`. 
| +| `metrics.py` | Pixel metrics (PCC, SSIM, NRMSE, PSNR, FSC resolution, spectral PCC, MicroMS3IM), mask metrics (Dice, IoU, precision, recall, accuracy, TP/FP/FN/TN), feature metrics split into `*_target_*` / `*_pred_*` / `*_pairwise` so GT-side work can be cached separately from predictions. | +| `segmentation.py` | Organelle-specific classical-CV segmentation via `aicssegmentation` workflows (`nucleus`, `membrane`, `nucleoli`, `lysosomes`, `er`, `mitochondria`). Used for mask metrics. | +| `cache.py` | GT artifact cache: on-disk layout, manifest I/O, read/write helpers, staleness check. Keyed by `(cache_schema_version, gt_path, gt_channel_name, cell_segmentation_path)`. | +| `pipeline_cache.py` | Per-FOV load-or-compute wrappers (`fov_gt_masks`, `fov_gt_cp_features`, `fov_gt_deep_features`). Honor `force_recompute.*` flags and the `io.require_complete_cache` contract. | +| `precompute_cli.py` | Hydra entrypoint for `dynacell precompute-gt`. Iterates GT positions and fills the cache; no eval loop. | +| `utils.py` | `DinoV3FeatureExtractor`, `DynaCLRFeatureExtractor`, pairwise feature-similarity helpers, `plot_metrics()` bar/violin plots. | +| `io.py` | OME-Zarr / tiff readers and writers, prediction preprocessing transforms. | +| `torch_ssim.py` | GPU-friendly PyTorch SSIM. | +| `formatting.py` | Metric table formatting helpers. | +| `spectral_pcc/` | Bandlimited spectral PCC diagnostics and bead simulations. | +| `_configs/eval.yaml` | Hydra config for `dynacell evaluate`, with `???` MISSING markers for dataset-specific fields. | +| `_configs/precompute.yaml` | Hydra config for `dynacell precompute-gt`; inherits eval, requires `io.gt_cache_dir`. | + +## Inputs + +- `io.pred_path` — model predictions, HCS OME-Zarr (channel: `io.pred_channel_name`) +- `io.gt_path` — fluorescence ground truth, HCS OME-Zarr (channel: `io.gt_channel_name`) +- `io.cell_segmentation_path` — *optional* precomputed cell segmentation, HCS OME-Zarr. 
Required only when `compute_feature_metrics=true` or when building CP/DINOv3/DynaCLR cache entries. Position layout must match GT/pred 1:1. +- `io.gt_cache_dir` — *optional* directory for the GT artifact cache. `null` (default) disables caching; set to a writable path to opt in. Required for `dynacell precompute-gt` and for `io.require_complete_cache=true`. + +## Running an evaluation + +`dynacell evaluate` is a Hydra entrypoint. Override any field on the CLI with `key=value`. + +Paths and settings that belong to a (target, marker, dataset) combination +live in named Hydra config groups, so most invocations only need to select +the right group and point at the prediction / output paths. Groups come from +two sources: the packaged schema under `src/dynacell/evaluation/_configs/` +and — on a repo checkout — HPC-bound groups under +`configs/benchmarks/virtual_staining/_internal/` that `dynacell.__main__` +exposes through two injected `hydra.searchpath` roots. See the table below. + +### Config groups + +| Group | Options | What it sets | Source | +|---|---|---|---| +| `target` | `er_sec61b`, `mito_tomm20`, `membrane`, `nucleus` | `target_name`, `benchmark.dataset_ref.target`. | `configs/benchmarks/virtual_staining/_internal/shared/eval/target/` | +| `predict_set` | `ipsc_confocal` | `benchmark.dataset_ref.dataset`. | in-package (`_configs/predict_set/`) | +| `feature_extractor/dinov3` | `lvd1689m` | `feature_extractor.dinov3.pretrained_model_name`. | in-package (`_configs/feature_extractor/dinov3/`) | +| `feature_extractor/dynaclr` | `default` | `feature_extractor.dynaclr.checkpoint` and 8-field `encoder` dict. | `configs/benchmarks/virtual_staining/_internal/shared/eval/feature_extractor/dynaclr/` | +| `leaf` | `///eval__` (8 canonical leaves) | Composes all of the above for a canonical benchmark run; see "Benchmark eval leaves" below. 
| `configs/benchmarks/virtual_staining/_internal/leaf/` (symlink tree) | + +`io.*` fields (`gt_path`, `cell_segmentation_path`, `gt_channel_name`, +`pred_channel_name`, `gt_cache_dir`) and `pixel_metrics.spacing` are now +owned by the dataset manifest (`dynacell/data/manifests.py`) and +spliced into the composed config by a post-compose hook in +`_ref_hook.py`. `pred_channel_name` is derived as +`{target_channel}_prediction`. + +- **In-package** groups (`predict_set`, `feature_extractor/dinov3`, + `spectral_pcc/*`) ship in the wheel: schema and path-free reference + values only. +- **Repo-checkout** groups (`target`, `feature_extractor/dynaclr`, `leaf`) + all live under `configs/benchmarks/virtual_staining/_internal/` — the + hidden support tree that keeps the top level of `virtual_staining/` + biology-only. They contain HPC paths, our DynaCLR checkpoint, and + benchmark-instance values — useless to external users. + `dynacell.__main__` injects two `hydra.searchpath` roots + (`_internal/` for the `leaf/` tree and `_internal/shared/eval/` for + `target/` and `feature_extractor/dynaclr/`) when running from a repo + checkout. Wheel installs without the repo silently omit these, and + external users supply their own via `--config-dir`. +- **Hydra only discovers `.yaml` files for group resolution**, so eval + group files under `_internal/shared/eval/`, the canonical eval leaves + at `///eval__.yaml`, and the + `_internal/leaf/` symlinks all use `.yaml`. + Lightning-side train and predict leaves stay `.yml` (they compose + through `viscy_utils.compose`, which is extension-agnostic). + +Selecting a group on the CLI: `=