feat: tiling by Adames4 · Pull Request #5 · RationAI/ulcerative-colitis

Adames4 · 2026-02-12T19:00:31Z

Tiling script.

Blocked by #3 , #4

[IGNORE QC FILES]

Dependency graph:

                         +--------------+
                  -------| tissue-masks |<------+           +------------+      +----------------------+
                 /       +--------------+       |       +---| tile-masks |<-----| preprocessing-report |
                /                               |       |   +------------+      +----------------------+
+---------+    /                            +--------+  |
| dataset | <-+                             | tiling |<-+
+---------+    \                            +--------+  |
                \                               |       |   +------------+
                 \       +-----------------+    |       +---| embeddings |
                  -------| quality-control |<---+           +------------+
                         +-----------------+

Summary by CodeRabbit

New Features
- Added a configurable tiling preprocessing pipeline for slide image extraction, QC-aware filtering, mask integration, and dataset split handling.
- Added new preprocessing configuration and mask/dataset URI mappings for multi-split workflows.
Chores
- Updated dataset artifact configuration structure to a nested MLflow URI mapping.
- Added a runtime dependency and a job submission script to run the tiling workflow.

Copilot

Pull request overview

Adds a preprocessing tiling stage (plus QC masks generation) to the ulcerative-colitis pipeline, wiring it into the existing Hydra/MLflow/Ray-based preprocessing structure and updating project dependencies accordingly.

Changes:

Introduces preprocessing/tiling.py to build slide/tile datasets using tissue + QC masks and save results to MLflow.
Adds preprocessing/quality_control.py to generate QC masks/metrics via rationai.AsyncClient, organize outputs, and log artifacts to MLflow.
Extends configs (preprocessing + experiments + datasets-with-masks) and updates dependencies (pyproject.toml, uv.lock) for tiling/QC tooling.

Reviewed changes

Copilot reviewed 23 out of 25 changed files in this pull request and generated 9 comments.

Show a summary per file

File	Description
uv.lock	Locks new dependencies required by tiling/QC (incl. `rationai-tiling`, `rationai-sdk`, `ratiopath`, and geo/image stack).
pyproject.toml	Adds tiling/QC-related runtime deps and Git sources for `rationai-tiling` + `rationai-sdk`.
preprocessing/tissue_masks.py	Small variable rename (`df` → `dataset`) for clarity.
preprocessing/tiling.py	New tiling pipeline: dataset split, slide reading via Ray, mask overlap computation, MLflow dataset export.
preprocessing/quality_control.py	New QC pipeline: async slide checks, mask organization, metrics aggregation, MLflow artifact logging.
scripts/preprocessing/tiling.py	New kube-jobs submission script for tiling.
scripts/preprocessing/quality_control.py	New kube-jobs submission script for QC.
configs/preprocessing/tiling.yaml	Adds global tiling config schema (mpp/tile_extent/stride/splits/metadata).
configs/preprocessing/quality_control.yaml	Adds global QC config schema (output_dir, timeouts, qc_parameters, metadata).
configs/experiment/preprocessing/tiling/ftn_*.yaml	Adds FTN tiling experiment presets for multiple mpp/tile sizes.
configs/experiment/preprocessing/tiling/ikem_*.yaml	Adds IKEM tiling experiment presets for multiple mpp/tile sizes.
configs/experiment/preprocessing/tiling/knl_patos_*.yaml	Adds KNL Patos tiling experiment presets for multiple mpp/tile sizes.
configs/dataset/processed_w_masks/ftn.yaml	Adds dataset variant with tissue/QC mask MLflow URIs (currently marked TODO).
configs/dataset/processed_w_masks/ikem.yaml	Adds dataset variant with tissue/QC mask MLflow URIs (currently marked TODO).
configs/dataset/processed_w_masks/knl_patos.yaml	Adds dataset variant with tissue/QC mask MLflow URIs (currently marked TODO).

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

preprocessing/tiling.py

scripts/preprocessing/quality_control.py

configs/dataset/processed_w_masks/knl_patos.yaml

configs/dataset/processed_w_masks/ftn.yaml

preprocessing/tiling.py

scripts/preprocessing/tiling.py

configs/dataset/processed_w_masks/ikem.yaml

preprocessing/tiling.py

coderabbitai

Actionable comments posted: 2

🧹 Nitpick comments (2)

pyproject.toml (1)

19-19: Consider pinning the Git source for rationai-tiling explicitly in pyproject.toml.

While uv.lock already locks this dependency to commit 475e7938e9a7230b378fca1c67106eda84f69c85, adding a rev to the source declaration makes the intent explicit and reduces reliance on the lockfile alone.

Proposed change

-rationai-tiling = { git = "https://gitlab.ics.muni.cz/rationai/digital-pathology/libraries/tiling.git" }
+rationai-tiling = { git = "https://gitlab.ics.muni.cz/rationai/digital-pathology/libraries/tiling.git", rev = "475e7938e9a7230b378fca1c67106eda84f69c85" }

🤖 Prompt for AI Agents

Verify each finding against the current code and only fix it if needed.

In `@pyproject.toml` at line 19, The pyproject.toml currently references the Git
dependency "rationai-tiling" without an explicit rev; update its package source
entry to include the exact commit SHA (e.g.,
475e7938e9a7230b378fca1c67106eda84f69c85) via a rev field so the source
declaration is pinned (locate the rationai-tiling dependency block in
pyproject.toml and add rev = "<commit-sha>" to that source entry).

scripts/preprocessing/tiling.py (1)

4-17: Clarify or parameterize placeholder values.

The script contains Python Ellipsis (...) placeholders on lines 5, 6, and 15. While syntactically valid, these will cause runtime errors when submit_job expects actual string values. Consider either:

Using descriptive placeholder strings (e.g., "<your-username>")
Reading values from environment variables or config
Adding a comment indicating this is a template file

♻️ Suggested improvement using environment variables

+import os
 from kube_jobs import storage, submit_job
 
 
 submit_job(
-    job_name="ulcerative-colitis-tiling-...",
-    username=...,
+    job_name=os.environ.get("JOB_NAME", "ulcerative-colitis-tiling"),
+    username=os.environ["KUBE_USERNAME"],
     public=False,
     cpu=64,
     memory="128Gi",
     shm="48Gi",
     script=[
         "git clone https://github.com/RationAI/ulcerative-colitis.git workdir",
         "cd workdir",
         "uv sync --frozen",
-        "uv run --active -m preprocessing.tiling +dataset=processed_w_masks/...",
+        f"uv run --active -m preprocessing.tiling +dataset=processed_w_masks/{os.environ['DATASET']}",
     ],
     storage=[storage.secure.DATA],
 )

🤖 Prompt for AI Agents

Verify each finding against the current code and only fix it if needed.

In `@scripts/preprocessing/tiling.py` around lines 4 - 17, The submit_job call
contains Python Ellipsis (...) in the job_name, username and dataset script
argument which will raise runtime errors; update the submit_job invocation
(referencing submit_job, job_name, username and the script list entry with "uv
run ... +dataset=...") to use real values or parameterize them: replace ... with
descriptive placeholder strings like "<your-job-name>" and "<your-username>" or
read from environment/config (e.g., os.environ["JOB_NAME"],
os.environ["USERNAME"], os.environ["DATASET"]) and document the required env
vars at the top of the file so the script is runnable as a template.

🤖 Prompt for all review comments with AI agents

Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@preprocessing/tiling.py`:
- Around line 102-110: The function extract_coverages is missing a type
annotation for the varargs parameter *cols; update its signature to annotate the
varargs as strings (e.g., *cols: str or *cols: tuple[str, ...]) so the linter no
longer flags no-untyped-def, keeping the existing return type dict[str, Any] and
behavior in extract_coverages.
- Around line 140-205: Remove the unused pyright-style ignore comments (e.g., "#
type: ignore[reportArgumentType]" and "# type: ignore[reportCallIssue]") in this
tiling pipeline: specifically on the map(add_nancy_index, ...), map(qc_agg,
...), slides.map(add_fold, ...), slides.map(add_mask_paths, ...), the
tile_overlay_overlap(...) with_column calls, map(extract_coverages, ...),
filter(filter_tissue, ...), and final map(select, ...) occurrences; either
delete the bracketed ignores or replace them with a generic "# type: ignore" if
your typechecker requires a suppression, ensuring no remaining
"[reportArgumentType]" or "[reportCallIssue]" annotations remain in functions
row_hash, add_nancy_index, qc_agg, add_fold, add_mask_paths,
tile_overlay_overlap, extract_coverages, filter_tissue, and select.

---

Nitpick comments:
In `@pyproject.toml`:
- Line 19: The pyproject.toml currently references the Git dependency
"rationai-tiling" without an explicit rev; update its package source entry to
include the exact commit SHA (e.g., 475e7938e9a7230b378fca1c67106eda84f69c85)
via a rev field so the source declaration is pinned (locate the rationai-tiling
dependency block in pyproject.toml and add rev = "<commit-sha>" to that source
entry).

In `@scripts/preprocessing/tiling.py`:
- Around line 4-17: The submit_job call contains Python Ellipsis (...) in the
job_name, username and dataset script argument which will raise runtime errors;
update the submit_job invocation (referencing submit_job, job_name, username and
the script list entry with "uv run ... +dataset=...") to use real values or
parameterize them: replace ... with descriptive placeholder strings like
"<your-job-name>" and "<your-username>" or read from environment/config (e.g.,
os.environ["JOB_NAME"], os.environ["USERNAME"], os.environ["DATASET"]) and
document the required env vars at the top of the file so the script is runnable
as a template.

ℹ️ Review info

⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: f86d163b-66ee-4493-a5d1-4ec51d6cfcc4

📥 Commits

Reviewing files that changed from the base of the PR and between b2d547e and caaa6aa.

⛔ Files ignored due to path filters (1)

uv.lock is excluded by !**/*.lock

📒 Files selected for processing (13)

configs/dataset/processed/ftn.yaml
configs/dataset/processed/ikem.yaml
configs/dataset/processed/knl_patos.yaml
configs/dataset/processed_w_masks/ftn.yaml
configs/dataset/processed_w_masks/ikem.yaml
configs/dataset/processed_w_masks/knl_patos.yaml
configs/preprocessing/tiling.yaml
preprocessing/quality_control.py
preprocessing/split_dataset.py
preprocessing/tiling.py
preprocessing/tissue_masks.py
pyproject.toml
scripts/preprocessing/tiling.py

preprocessing/tiling.py

Adames4 added 30 commits January 25, 2026 15:46

chore: replace pdm with uv

6e1d4b9

feat: configs

d7a1ef3

feat: dataset creation

34abac4

fix: configs

80eca85

fix: configs

97335ef

feat: add scripts

727c729

fix: invalid job name

fa9f581

fix: set slide_id as index and ensure nancy is an integer

fd978b3

chore: dependencies

2869916

feat: processed data

725e17d

feat: tissue masks

f7026b3

feat: script

ed49c42

feat: script

87113e1

fix: job name

de98fef

fix: exclude .git folder from ray env

929d298

fix: exclude .venv folder from ray env

81d4df3

fix: typo

3f0321f

feat: job script

2ecde13

feat: config

6cae5d0

feat: job script

616ed48

chore: dependencies

700f916

feat: quality control

65e86db

feat: add dataset configuration files for ftn, ikem, and knl_patos

59965a4

fix: output dir

234d611

fix: typo

4930434

chore: Merge branch 'feature/tissue-masks' into feature/tiling

180afe7

chore: add tiling libs

456d4ae

fix: naming

e88bfd8

feat: confs

b91d36d

feat: tiling

413db4f

Copilot AI reviewed Feb 12, 2026

View reviewed changes

fix: PR

4269a3a

Adames4 requested a review from vejtek February 12, 2026 20:23

Adames4 added 3 commits February 12, 2026 20:30

fix: repo

a9615ed

chore: Merge branch 'feature/quality-control' into feature/tiling

3b7f95c

fix: repo

659910e

This was referenced Feb 12, 2026

feat: tile masks #6

Open

feat: embeddings #8

Open

Adames4 removed the request for review from vejtek February 28, 2026 20:51

Adames4 added 5 commits March 7, 2026 17:28

chore: Merge branch 'master' into feature/tiling

cd09f80

feat: update tiling

2007e83

fix: typo

6c5ec71

fix: typo

4c01ac2

fix: imports

31e78c2

Adames4 changed the base branch from feature/tissue-masks to master March 8, 2026 19:56

Adames4 requested review from a team, JakubPekar, matejpekar and vejtek March 8, 2026 19:56

matejpekar added the jules label Mar 10, 2026

matejpekar requested changes Mar 10, 2026

View reviewed changes

preprocessing/tiling.py Show resolved Hide resolved

preprocessing/tiling.py Show resolved Hide resolved

fix: add ray to with

caaa6aa

Adames4 requested a review from matejpekar March 10, 2026 18:28

coderabbitai bot reviewed Mar 10, 2026

View reviewed changes

preprocessing/tiling.py Outdated Show resolved Hide resolved

preprocessing/tiling.py Show resolved Hide resolved

fix: mypy

486ca49

matejpekar removed the jules label Mar 10, 2026

matejpekar approved these changes Mar 10, 2026

View reviewed changes

vejtek approved these changes Mar 12, 2026

View reviewed changes

Adames4 merged commit 235b919 into master Mar 13, 2026
3 checks passed

Adames4 deleted the feature/tiling branch March 13, 2026 09:32

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

feat: tiling#5

feat: tiling#5
Adames4 merged 79 commits intomasterfrom
feature/tiling

Adames4 commented Feb 12, 2026 •

edited by coderabbitai bot

Loading

Uh oh!

Copilot AI left a comment

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

coderabbitai bot left a comment

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

4 participants

Conversation

Adames4 commented Feb 12, 2026 • edited by coderabbitai bot Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Summary by CodeRabbit

Uh oh!

Copilot AI left a comment

Choose a reason for hiding this comment

Pull request overview

Reviewed changes

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

coderabbitai bot left a comment

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

4 participants

Adames4 commented Feb 12, 2026 •

edited by coderabbitai bot

Loading