
Port virtual_staining examples into applications/cytoland#409

Open
edyoshikun wants to merge 19 commits into modular-viscy-staging from cytoland-virtual-staining-examples

Conversation

@edyoshikun
Member

Summary

- Ports the examples/virtual_staining/ tree from main into applications/cytoland/examples/ on top of modular-viscy-staging, covering four tutorial groups: VS_model_inference, dlmbl_exercise, vcp_tutorials, phase_contrast.
- Rewrites every viscy.* import to the modular package layout (viscy_data, viscy_transforms, viscy_utils, cytoland.engine, viscy_utils.callbacks.HCSPredictionWriter, viscy_utils.losses.MixedLoss, viscy_utils.evaluation.metrics.mean_average_precision).
- Updates setup.sh in dlmbl_exercise and phase_contrast to pip install -e "applications/cytoland[metrics]" instead of the retired top-level viscy[metrics,visual,examples]>=0.2 wheel.
- Adds **/examples/** to [tool.ruff] per-file-ignores for D, E402, E501, and F821 so jupytext percent-cell scripts lint cleanly without forcing docstrings, top-of-file imports, short lines, or imports of notebook builtins.
- Adds a Tutorials and demos section to applications/cytoland/README.md pointing at the new tree.

Notebooks (.ipynb) are intentionally not ported — regenerate from solution.py / *.py with jupytext --to ipynb when needed.

Import migration

| Old | New |
| --- | --- |
| `viscy.data.hcs` | `viscy_data.hcs` |
| `viscy.transforms` | `viscy_transforms` |
| `viscy.trainer` | `viscy_utils.trainer` |
| `viscy.translation.engine.{VSUNet, FcmaeUNet, AugmentedPredictionVSUNet}` | `cytoland.engine.{...}` |
| `viscy.translation.engine.MixedLoss` | `viscy_utils.losses.MixedLoss` |
| `viscy.translation.predict_writer.HCSPredictionWriter` | `viscy_utils.callbacks.HCSPredictionWriter` |
| `viscy.translation.evaluation_metrics.mean_average_precision` | `viscy_utils.evaluation.metrics.mean_average_precision` |
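Because the migration is a straight dotted-path rewrite, it can be applied mechanically; here is a sketch of such a rewriter (a hypothetical helper for illustration, not part of this PR):

```python
import re

# Old-module dotted path -> new-module dotted path, from the migration
# table above. Longer paths come first so that e.g. the MixedLoss rule
# wins over the plain viscy.translation.engine rule.
MIGRATIONS = [
    ("viscy.translation.evaluation_metrics.mean_average_precision",
     "viscy_utils.evaluation.metrics.mean_average_precision"),
    ("viscy.translation.predict_writer.HCSPredictionWriter",
     "viscy_utils.callbacks.HCSPredictionWriter"),
    ("viscy.translation.engine.MixedLoss", "viscy_utils.losses.MixedLoss"),
    ("viscy.translation.engine", "cytoland.engine"),
    ("viscy.trainer", "viscy_utils.trainer"),
    ("viscy.transforms", "viscy_transforms"),
    ("viscy.data.hcs", "viscy_data.hcs"),
]


def migrate(source: str) -> str:
    """Rewrite old viscy.* dotted paths in source code to the modular layout."""
    for old, new in MIGRATIONS:
        source = re.sub(rf"\b{re.escape(old)}\b", new, source)
    return source
```

Note this handles dotted-path occurrences only; a `from viscy.translation.engine import MixedLoss` form would still need the module part redirected by hand, since the symbol moved to a different package.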

Test plan

- python -c "import ast; ast.parse(open(f).read())" passes for all ported .py files
- ruff check + ruff format pass via pre-commit on every commit
- Import smoke test — uv run --package cytoland python -c "from cytoland.engine import ...; from viscy_data.hcs import ...; ..." resolves every ported symbol
- End-to-end run of any demo against real data + checkpoints (requires download from public.czbiohub.org/comp.micro/viscy and a GPU; not part of this PR)

Tutorial-style scripts under applications/*/examples use jupytext-style
percent cells, markdown docstrings, and long URL references that trip
D205/D400/D100/D103 and E501. Treat them like notebooks and tests.

Bring back the four VSCyto inference demos (VSCyto2D, VSCyto3D,
VSNeuromast, and TTA-augmented) plus plot.py helper from
examples/virtual_staining/VS_model_inference on main. Imports are
updated to the modular package layout:

- viscy.data.hcs -> viscy_data.hcs
- viscy.transforms -> viscy_transforms
- viscy.trainer -> viscy_utils.trainer
- viscy.translation.engine -> cytoland.engine
- viscy.translation.predict_writer -> viscy_utils.callbacks

Bring back the DL@MBL 2024 image-translation exercise (solution.py,
README, setup.sh, prepare-exercise.sh) from
examples/virtual_staining/dlmbl_exercise on main. Notebooks are not
ported; regenerate from solution.py via jupytext if needed.

Imports are updated to the modular package layout:

- viscy.data.hcs -> viscy_data.hcs
- viscy.transforms -> viscy_transforms
- viscy.trainer -> viscy_utils.trainer
- viscy.translation.engine.VSUNet -> cytoland.engine.VSUNet
- viscy.translation.engine.MixedLoss -> viscy_utils.losses.MixedLoss
- viscy.translation.evaluation_metrics.mean_average_precision ->
  viscy_utils.evaluation.metrics.mean_average_precision

setup.sh now installs applications/cytoland[metrics] editable
instead of the legacy top-level viscy[metrics,visual,examples]>=0.2
wheel, plus cellpose and torchview as extra tutorial dependencies.

Tutorial-style scripts routinely import late inside % cells (E402)
and reference notebook builtins like get_ipython without importing
them (F821). Extend the per-file-ignores so the vcp tutorial
scripts lint cleanly alongside the existing D and E501 exemptions.

Bring back the Virtual Cell Platform tutorials (quick_start.py,
hek293t.py, neuromast.py, README.md) from
examples/virtual_staining/vcp_tutorials on main. Notebooks are not
ported; regenerate from .py via jupytext if needed.

quick_start.py imports are updated to the modular package layout:

- viscy.data.hcs -> viscy_data.hcs
- viscy.transforms -> viscy_transforms
- viscy.trainer -> viscy_utils.trainer
- viscy.translation.engine.FcmaeUNet -> cytoland.engine.FcmaeUNet
- viscy.translation.predict_writer -> viscy_utils.callbacks

hek293t.py and neuromast.py have no viscy Python imports (they
demonstrate the viscy preprocess / viscy predict CLIs), so no code
changes are required. The commented-out pip install "viscy[...]"
hints are left as-is for historical reference.

Add a Tutorials and demos section listing VS_model_inference,
vcp_tutorials, dlmbl_exercise, and configs, so the newly ported
examples are discoverable from the cytoland landing page.

Bring back the phase-contrast virtual staining tutorial (solution.py,
README, setup.sh, prepare-exercise.sh) from
examples/virtual_staining/phase_contrast on main. Notebooks are not
ported; regenerate from solution.py via jupytext if needed.

Imports are updated to the modular package layout:

- viscy.data.hcs -> viscy_data.hcs
- viscy.transforms -> viscy_transforms
- viscy.trainer -> viscy_utils.trainer
- viscy.translation.engine.VSUNet -> cytoland.engine.VSUNet

setup.sh now installs applications/cytoland[metrics] editable
instead of the legacy top-level viscy[metrics,visual,examples]>=0.2
wheel.

Add a concise Lightning primer before Part 1 (three-object mental
model: LightningDataModule, LightningModule, Trainer; what
trainer.fit replaces) so learners unfamiliar with Lightning can
follow the rest of the exercise.

Expand inline explanations at the three points where Lightning
concepts first appear in code:

- HCSDataModule section: describe the DataModule role and the exact
  dict shape (source, target, index) yielded to training_step.
- VSUNet instantiation: frame the class as the LightningModule that
  bundles network, loss, and per-batch logic; gloss lr, schedule,
  freeze_encoder, log_batches_per_epoch.
- Trainer constructors: annotate fast_dev_run, accelerator, devices,
  precision="16-mixed", max_epochs, log_every_n_steps, and
  TensorBoardLogger; spell out what trainer.fit does internally so
  learners see it replacing a hand-written training loop.

Markdown only, no code changes.
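The "hand-written training loop" that trainer.fit replaces can be illustrated framework-free (a toy pure-Python sketch, no Lightning or torch; all names and numbers here are illustrative, not from the exercise):

```python
# A hand-written loop fitting a 1-D linear model y = w * x by gradient
# descent. Lightning's trainer.fit(model, datamodule) wraps exactly this
# structure -- the epoch loop, the batch loop, the loss, the optimizer
# step -- and adds logging, checkpointing, and device placement on top.
def fit(batches, lr=0.01, max_epochs=50):
    w = 0.0
    for epoch in range(max_epochs):      # Trainer: max_epochs
        for x, y in batches:             # DataModule: train_dataloader()
            pred = w * x                 # LightningModule: forward()
            grad = 2 * (pred - y) * x    # training_step() + backward()
            w -= lr * grad               # optimizer.step()
    return w


# Data drawn from y = 3x; fit() should recover w close to 3.
batches = [(x, 3 * x) for x in (1.0, 2.0, 3.0)]
```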

Six concise additions for learners new to deep learning and Lightning:

1. Typos: fluoresecence -> fluorescence, componets -> components,
   Person Correlation -> Pearson Correlation.
2. Tighten the OME-Zarr / HCS section: one-paragraph primer on the
   row/col/field/level/T/C/Z/Y/X hierarchy before open_ome_zarr is
   called.
3. Add an augmentation motivation table mapping each MONAI transform
   to the real-world microscope variation it simulates.
4. Add a pre-model markdown cell explaining the UNeXt2 config
   (encoder_blocks, dims, decoder_conv_blocks, stem_kernel_size,
   in_stack_depth) in U-Net terms.
5. Same cell explains MixedLoss (L1 + MS-SSIM tradeoff) and the
   WarmupCosine LR schedule.
6. Part 2 opener distinguishes regression vs segmentation metric
   families; fill in the Task 2.1 TODO with real Pearson/SSIM
   definitions tied to the image-translation setting.

Markdown only, no code changes.
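The Pearson definition added in item 6 reduces to the usual normalized covariance over pixel intensities; a plain-Python sketch (hypothetical helper, for reference only):

```python
import math


def pearson(x, y):
    """Pearson correlation between two equal-length intensity sequences:
    covariance divided by the product of the standard deviations."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)
```

A perfect linear relationship between prediction and target gives +1, a perfect inverse relationship gives -1, regardless of absolute intensity scale, which is why it pairs well with SSIM in the image-translation setting.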

Issue dl-janelia/image_translation#16:

- Fix the nested duplicate for-loop after Task 1.1 that shadows the
  loop variable (single loop now).
- Replace the broken viscy.transforms source link with authoritative
  MONAI docs links for RandAffined and RandGaussianNoised.
- Add explicit path + public download URL for the VSCyto2D pretrained
  checkpoint next to Task 2.2 so students don't guess where it went.
- Switch spatial dimension labels from (D, H, W) to (Z, Y, X) in
  Part 3 Gaussian-blur tasks to match the convention used everywhere
  else in the exercise.
- Note MicroSSIM as a microscopy-appropriate SSIM variant in the
  metrics primer.
- Warn students to restart the kernel before re-running training to
  release GPU memory (CUDA OOM prevention).

Rewrite setup.sh to use uv instead of conda so students can provision
the exercise without a working conda install. The script now:

- Installs uv via the official installer if missing.
- Creates a Python 3.11 venv at .venv/ inside the exercise folder.
- Installs cytoland editable from the monorepo plus the tutorial
  extras (cellpose, torchview, jupyter, ipykernel, ipywidgets,
  jupytext, nbformat, nbconvert) into that venv.
- Registers the venv as a Jupyter kernel named 06_image_translation
  so VSCode and JupyterLab both surface it by default.
- Downloads the training / test OME-Zarr stores and the VSCyto2D
  pretrained checkpoint into ~/data/06_image_translation/.

Script is bash strict-mode (set -euo pipefail) and resolves the
monorepo root relative to the script path so it works regardless of
the caller's cwd. README updated to match the new flow and document
the VSCode / Jupyter kernel selection.

Classic skimage SSIM assumes natural-image dynamic range and scores
collapse into a narrow band on sparse, dim, noisy fluorescence
microscopy predictions — so the metric can barely rank good vs bad
outputs. microSSIM (Ashesh et al. 2024, arXiv:2408.08747) subtracts
background and fits a per-image rescale before running SSIM, which
restores sensitivity over the intensity range microscopy predictions
actually live in.
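The background-subtraction and per-image rescale described above amounts to a least-squares intensity fit before SSIM is run; a schematic sketch (not microssim's actual implementation; names are hypothetical):

```python
def rescale_for_ssim(pred, target, background=0.0):
    """Subtract a background estimate from the prediction, then fit one
    scale factor alpha minimizing sum((alpha * p - t)^2), so both images
    occupy comparable intensity ranges before SSIM is computed."""
    p = [v - background for v in pred]
    denom = sum(v * v for v in p)
    # Closed-form least-squares scale: alpha = <p, t> / <p, p>.
    alpha = sum(v * t for v, t in zip(p, target)) / denom if denom else 1.0
    return [alpha * v for v in p]
```

With dim, sparse predictions this pulls the compared signals onto the same scale, which is the sensitivity the paragraph says plain SSIM loses on fluorescence microscopy.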

- Replace skimage.metrics.structural_similarity with
  microssim.micro_structural_similarity at all 6 call sites (two
  metrics blocks: single-model and phase2fluor vs pretrained).
- Drop the now-unused from skimage import metrics import.
- Rename DataFrame columns SSIM_nuc / SSIM_mem to microSSIM_nuc /
  microSSIM_mem so plots and saved CSVs name the metric correctly.
- Rewrite the Part 2 Task 2.1 metric definition to explain WHY
  microSSIM instead of SSIM for microscopy.
- Add microssim to setup.sh install list and to the README install
  summary.

setup_TA.sh stages data + checkpoint to a shared DATA_ROOT (no env).
setup_student.sh creates the per-user venv + kernel and skips the
download when DATA_ROOT already has the data.

setup_student.sh now defaults to python 3.13 and installs
cytoland + viscy from PyPI when run outside a VisCy monorepo
clone. When the monorepo is detected (and root pyproject is the
viscy umbrella), it falls back to the existing editable workspace
install. README updated to match.

- Replace stale torch.no_grad() with torch.inference_mode() in the
  prediction-visualization block to match the rest of the file.
- Renumber the first 'Task 1.5' (model instantiation) to 'Task 1.4'.
  The exercise was missing a Task 1.4, leaving two Task 1.5s.
- Update the source-code reference under that task to point at
  VSUNet and the fcmae network (the Unet2D link was obsolete).

Three small wording fixes ported from the DL@Janelia version of this
exercise where it had clearer phrasing:

- Task 1.1 hint: 'what are your options' -> 'what your options are'.
- Task 2.4: add the missing alert heading and move the section header
  outside the alert box (was structurally mismatched).
- Task 2.4 question: typo fix 'How do yout model' -> 'How does your model'.

Adds the Task 2.5 section that walks students through the inverse
translation: predicting QPI from nuclei + membrane fluorescence using
a pretrained model and Test-Time Augmentation. Inserted between Task 2.4
and Part 3.

setup_TA.sh and setup_student.sh now also stage the
fluor2phase_step668.ckpt checkpoint (previously commented out in TA,
absent in student) since Task 2.5 loads it. The hardcoded /mnt/efs/...
paths from the original Janelia version are replaced with the
top_dir-based DATA_ROOT layout used elsewhere in this exercise.