feat: quality control #4
Merged
Commits
28 commits
6e1d4b9 chore: replace pdm with uv (Adames4)
d7a1ef3 feat: configs (Adames4)
34abac4 feat: dataset creation (Adames4)
80eca85 fix: configs (Adames4)
97335ef fix: configs (Adames4)
727c729 feat: add scripts (Adames4)
fa9f581 fix: invalid job name (Adames4)
fd978b3 fix: set slide_id as index and ensure nancy is an integer (Adames4)
87113e1 feat: script (Adames4)
700f916 chore: dependencies (Adames4)
65e86db feat: quality control (Adames4)
59965a4 feat: add dataset configuration files for ftn, ikem, and knl_patos (Adames4)
234d611 fix: output dir (Adames4)
4930434 fix: typo (Adames4)
e4d2499 fix: typo (Adames4)
acfbf19 fix: glob over changing dir (Adames4)
659f505 fix: finish the run (Adames4)
45ac3b3 fix: rever last commit (Adames4)
bd0c6ac fix: PR comments (Adames4)
9ffea96 chore: dependencies (Adames4)
c98608e chore: Merge branch 'feature/dataset' into feature/quality-control (Adames4)
bc71668 feat: configs (Adames4)
a9615ed fix: repo (Adames4)
0a05af5 chore: Merge branch 'master' into feature/quality-control (Adames4)
b0e8b96 fix: PR (Adames4)
5b1285a chore: new lines (Adames4)
3c14873 fix: PR (Adames4)
5544b91 fix: PR (Adames4)
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,20 @@ | ||
| # @package _global_ | ||
|
|
||
| output_dir: ${project_dir}/quality_control/${dataset.institution} | ||
|
|
||
| request_timeout: 18000 | ||
| max_concurrent: 5 | ||
|
|
||
| qc_parameters: | ||
| mask_level: 3 | ||
| sample_level: 1 | ||
| check_residual: True | ||
| check_folding: False | ||
| check_focus: True | ||
| wb_correction: True | ||
|
|
||
|
|
||
| metadata: | ||
| run_name: "🎭 QC Masks: ${dataset.institution}" | ||
| description: Quality control masks for ${dataset.institution} institution | ||
| hyperparams: ${qc_parameters} |
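
The `check_focus`, `check_residual`, and `check_folding` flags in `qc_parameters` decide which mask families the `preprocessing.quality_control` script in this PR later sorts into subdirectories. The sketch below copies that script's `get_qc_masks` generator and runs it on the values from this config. `DEFAULT_QC` is a hypothetical name introduced only for illustration, and the return annotation uses `Iterator` where the PR uses `Generator`.

```python
from collections.abc import Iterator
from typing import TypedDict


class QCParameters(TypedDict):
    mask_level: int
    sample_level: int
    check_residual: bool
    check_folding: bool
    check_focus: bool
    wb_correction: bool


# Hypothetical constant: values copied from the qc_parameters block of this config.
DEFAULT_QC: QCParameters = {
    "mask_level": 3,
    "sample_level": 1,
    "check_residual": True,
    "check_folding": False,
    "check_focus": True,
    "wb_correction": True,
}


def get_qc_masks(params: QCParameters) -> Iterator[tuple[str, str]]:
    """Yield (mask file prefix, target subdirectory) for each enabled check."""
    if params["check_focus"]:
        yield ("Piqe_focus_score_piqe_median", "blur_per_tile")
        yield ("Piqe_piqe_median_activity_mask", "blur_per_pixel")
    if params["check_residual"]:
        yield ("ResidualArtifactsAndCoverage_cov_percent_heatmap", "artifacts_per_tile")
        yield ("ResidualArtifactsAndCoverage_coverage_mask", "artifacts_per_pixel")
    if params["check_folding"]:
        yield ("FoldingFunction_folding_test", "folds_per_pixel")


subdirs = [subdir for _, subdir in get_qc_masks(DEFAULT_QC)]
print(subdirs)
# ['blur_per_tile', 'blur_per_pixel', 'artifacts_per_tile', 'artifacts_per_pixel']
```

With `check_folding: False`, as configured here, no `folds_per_pixel` directory is populated.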
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,122 @@ | ||
| # credits: https://gitlab.ics.muni.cz/rationai/digital-pathology/pathology/lymph-nodes/-/blob/develop/preprocessing/qc.py?ref_type=heads | ||
|
|
||
| import asyncio | ||
| from collections.abc import Generator | ||
| from pathlib import Path | ||
| from typing import TypedDict | ||
|
|
||
| import hydra | ||
| import mlflow.artifacts | ||
| import pandas as pd | ||
| import rationai | ||
| from omegaconf import DictConfig | ||
| from rationai.mlkit import autolog, with_cli_args | ||
| from rationai.mlkit.lightning.loggers import MLFlowLogger | ||
| from rationai.types import SlideCheckConfig | ||
| from tqdm.asyncio import tqdm | ||
|
|
||
|
|
||
| class QCParameters(TypedDict): | ||
| mask_level: int | ||
| sample_level: int | ||
| check_residual: bool | ||
| check_folding: bool | ||
| check_focus: bool | ||
| wb_correction: bool | ||
|
|
||
|
|
||
| def get_qc_masks(qc_parameters: QCParameters) -> Generator[tuple[str, str], None, None]: | ||
| if qc_parameters["check_focus"]: | ||
| yield ("Piqe_focus_score_piqe_median", "blur_per_tile") | ||
| yield ("Piqe_piqe_median_activity_mask", "blur_per_pixel") | ||
|
|
||
| if qc_parameters["check_residual"]: | ||
| yield ("ResidualArtifactsAndCoverage_cov_percent_heatmap", "artifacts_per_tile") | ||
| yield ("ResidualArtifactsAndCoverage_coverage_mask", "artifacts_per_pixel") | ||
|
|
||
| if qc_parameters["check_folding"]: | ||
| yield ("FoldingFunction_folding_test", "folds_per_pixel") | ||
|
|
||
|
|
||
| def organize_masks(output_path: Path, subdir: str, mask_prefix: str) -> None: | ||
| prefix_dir = output_path / subdir | ||
| prefix_dir.mkdir(parents=True, exist_ok=True) | ||
|
|
||
| # Glob has to be wrapped in list, because we're modifying the directory!!! | ||
| for file in list(output_path.glob(f"{mask_prefix}_*.tiff")): | ||
| slide_name = file.name.replace(f"{mask_prefix}_", "") | ||
|
||
| destination = prefix_dir / slide_name | ||
| file.rename(destination) | ||
|
|
||
|
|
||
| async def qc_main( | ||
| output_path: Path, | ||
| slides: list[str], | ||
| logger: MLFlowLogger, | ||
| request_timeout: int, | ||
| max_concurrent: int, | ||
| qc_parameters: QCParameters, | ||
| ) -> None: | ||
|
||
| async with rationai.AsyncClient() as client: # type: ignore[attr-defined] | ||
| async for result in tqdm( | ||
| client.qc.check_slides( | ||
| slides, | ||
| output_path, | ||
| config=SlideCheckConfig(**qc_parameters), | ||
| timeout=request_timeout, | ||
| max_concurrent=max_concurrent, | ||
| ), | ||
| total=len(slides), | ||
| ): | ||
| if not result.success: | ||
| with open(output_path / "qc_errors.log", "a") as log_file: | ||
| log_file.write( | ||
| f"Failed to process {result.wsi_path}: {result.error}\n" | ||
| ) | ||
|
|
||
| # Organize generated masks into subdirectories | ||
| for prefix, artifact_name in get_qc_masks(qc_parameters): | ||
| organize_masks(Path(output_path), artifact_name, prefix) | ||
|
|
||
| # Merge generated csv files | ||
| csvs = list(Path(output_path).glob("*.csv")) | ||
| pd.concat([pd.read_csv(f) for f in csvs]).to_csv( | ||
| Path(output_path, "qc_metrics.csv"), index=False | ||
| ) | ||
|
|
||
| # Remove individual csv files | ||
| for f in csvs: | ||
| f.unlink() | ||
|
||
|
|
||
|
||
| logger.log_artifacts(local_dir=str(output_path)) | ||
|
|
||
|
|
||
| def download_dataset(uri: str) -> pd.DataFrame: | ||
| path = mlflow.artifacts.download_artifacts(artifact_uri=uri) | ||
| df = pd.read_csv(path) | ||
| return df | ||
|
|
||
|
|
||
| @with_cli_args(["+preprocessing=quality_control"]) | ||
| @hydra.main(config_path="../configs", config_name="preprocessing", version_base=None) | ||
| @autolog | ||
| def main(config: DictConfig, logger: MLFlowLogger) -> None: | ||
| df = download_dataset(config.dataset.uri) | ||
|
|
||
| output_path = Path(config.output_dir) | ||
| output_path.mkdir(parents=True, exist_ok=True) | ||
|
|
||
| asyncio.run( | ||
| qc_main( | ||
| output_path=output_path, | ||
| slides=df["path"].to_list(), | ||
| logger=logger, | ||
| request_timeout=config.request_timeout, | ||
| max_concurrent=config.max_concurrent, | ||
| qc_parameters=config.qc_parameters, | ||
| ) | ||
| ) | ||
|
||
|
|
||
|
|
||
| if __name__ == "__main__": | ||
| main() | ||
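
The post-processing steps in `qc_main` (sorting masks, then merging the per-slide CSV files) can be sketched with the standard library alone. This is an illustration under stated assumptions, not the PR's implementation: it swaps `pandas` for the `csv` module, uses `str.removeprefix` (Python 3.9+) instead of `str.replace`, and introduces a hypothetical `merge_csvs` helper that skips an existing `qc_metrics.csv`. In the script as shown, the `*.csv` glob would also match a `qc_metrics.csv` left in the same directory by an earlier run, so a rerun would appear to merge that file and then unlink it. The slide and file names in the demo are made up.

```python
import csv
import tempfile
from pathlib import Path


def organize_masks(output_path: Path, subdir: str, mask_prefix: str) -> None:
    """Move `<mask_prefix>_<slide>.tiff` files to `output_path/subdir/<slide>.tiff`."""
    target_dir = output_path / subdir
    target_dir.mkdir(parents=True, exist_ok=True)
    # Snapshot the glob first: files are renamed while we iterate.
    for file in list(output_path.glob(f"{mask_prefix}_*.tiff")):
        # removeprefix strips only the leading prefix; str.replace would also
        # drop a repeat of the prefix inside the slide name.
        file.rename(target_dir / file.name.removeprefix(f"{mask_prefix}_"))


def merge_csvs(output_path: Path, merged_name: str = "qc_metrics.csv") -> int:
    """Hypothetical helper: merge per-slide CSVs and return the data row count."""
    # Skip a merged file from an earlier run so it is neither re-merged nor deleted.
    parts = sorted(p for p in output_path.glob("*.csv") if p.name != merged_name)
    if not parts:
        return 0  # nothing new to merge; leave any existing merged file alone
    fieldnames: list[str] = []
    rows: list[dict[str, str]] = []
    for part in parts:
        with part.open(newline="") as fh:
            reader = csv.DictReader(fh)
            # Assumption: every per-slide file shares the same header.
            fieldnames = fieldnames or list(reader.fieldnames or [])
            rows.extend(reader)
    with (output_path / merged_name).open("w", newline="") as fh:
        writer = csv.DictWriter(fh, fieldnames=fieldnames)
        writer.writeheader()
        writer.writerows(rows)
    for part in parts:
        part.unlink()
    return len(rows)


with tempfile.TemporaryDirectory() as tmp:
    out = Path(tmp)
    (out / "Piqe_focus_score_piqe_median_slide_a.tiff").touch()
    (out / "slide_a.csv").write_text("slide,score\nslide_a,0.1\n")
    (out / "slide_b.csv").write_text("slide,score\nslide_b,0.2\n")

    organize_masks(out, "blur_per_tile", "Piqe_focus_score_piqe_median")
    first = merge_csvs(out)
    second = merge_csvs(out)  # simulated rerun with no new per-slide files
    moved = (out / "blur_per_tile" / "slide_a.tiff").exists()
    merged_kept = (out / "qc_metrics.csv").exists()
    remaining = sorted(p.name for p in out.glob("*.csv"))
```

After the simulated rerun, `qc_metrics.csv` is still present and is the only CSV left in the output directory.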
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,17 @@ | ||
| from kube_jobs import storage, submit_job | ||
|
|
||
|
|
||
| submit_job( | ||
| job_name="ulcerative-colitis-quality-control-...", | ||
| username=..., | ||
| public=False, | ||
| cpu=2, | ||
| memory="4Gi", | ||
| script=[ | ||
| "git clone https://github.com/RationAI/ulcerative-colitis.git workdir", | ||
| "cd workdir", | ||
| "uv sync --frozen", | ||
| "uv run -m preprocessing.quality_control +dataset=processed/...", | ||
| ], | ||
|
||
| storage=[storage.secure.DATA], | ||
| ) | ||
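
The commit list includes "fix: invalid job name", so it may help to check a `job_name` before submitting. The sketch below assumes the name must satisfy the RFC 1123 DNS label rules that Kubernetes recommends for Job names (lowercase alphanumerics and `-`, alphanumeric at both ends, at most 63 characters). Whether `kube_jobs.submit_job` enforces the same rules is not shown in this PR. `is_valid_job_name` and `make_job_name` are hypothetical helpers, and the `knl_patos` suffix is only an example taken from the dataset names in the commit list, not the elided suffix of the real job name.

```python
import re

# RFC 1123 DNS label: lowercase alphanumerics and '-', alphanumeric at both ends.
_DNS_LABEL = re.compile(r"[a-z0-9]([-a-z0-9]*[a-z0-9])?")
_MAX_LABEL_LENGTH = 63


def is_valid_job_name(name: str) -> bool:
    """Return True if `name` is a valid RFC 1123 DNS label."""
    return len(name) <= _MAX_LABEL_LENGTH and _DNS_LABEL.fullmatch(name) is not None


def make_job_name(base: str, suffix: str) -> str:
    """Hypothetical helper: join base and suffix into a DNS-label-safe name."""
    raw = f"{base}-{suffix}".lower()
    # Replace every run of disallowed characters (such as '_') with one dash.
    cleaned = re.sub(r"[^a-z0-9-]+", "-", raw)
    # Truncate to the length limit, then trim dashes left at either end.
    cleaned = cleaned[:_MAX_LABEL_LENGTH].strip("-")
    if not is_valid_job_name(cleaned):
        raise ValueError(f"cannot derive a valid job name from {raw!r}")
    return cleaned


# Example suffix only; the real suffix is elided in the job script.
name = make_job_name("ulcerative-colitis-quality-control", "knl_patos")
print(name)  # ulcerative-colitis-quality-control-knl-patos
```

Validating the name locally fails fast, before the job is sent to the cluster.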
|
||