Skip to content

Address #15; Lay Validation Foundation; Light Cleanup Work#17

Open
olivia-banks wants to merge 5 commits intoEpiForeSITE:mainfrom
olivia-banks:ob-organize
Open

Address #15; Lay Validation Foundation; Light Cleanup Work#17
olivia-banks wants to merge 5 commits intoEpiForeSITE:mainfrom
olivia-banks:ob-organize

Conversation

@olivia-banks
Copy link
Member

@olivia-banks olivia-banks commented Mar 18, 2026

Address #15 and lay foundation for validation; light cleanup work. Running list of changes:

  • Move code into the structure recommended by various pieces of tooling (e.g., ./app.py./src/epicc/).
  • Move file access to using importlib.resources for increased flexibility under differing deployment conditions (necessary for when we package with micropip for WASM/stlite).
  • Fixing typing errors, make sure we cleanly lint under standard type checking.
  • Implement format read/write (FRW) interface for Define YAML schema and write a validation tool validate_yaml #15 for dealing with arbitrary opaque dictionaries in various formats (YAML/XLSX) while preserving non-semantic (trivia) elements.
  • Create a standard validation layer for unstructured data through Pydantic (for both config and model parameters).
  • Wire FRW interface into the configuration layer (remove one-off logic).
  • Wire FRW interface into the model parameter layer (remove one-off logic).
  • Allow generation of "template" files for each Pydantic data model (BaseModel).
  • Write unit tests for the FRW interface with pytest.
  • Rename from epicc (Epi Cost Calculator) to the new project name, once decided upon.

We're introducing some dependencies here, but they are already baked into pyodide, which is what stlite uses under the hood, so there's no extra overhead!

Discussion for the FRW may be found here.

@olivia-banks olivia-banks self-assigned this Mar 18, 2026
@olivia-banks olivia-banks added this to the MVP milestone Mar 18, 2026
@olivia-banks olivia-banks linked an issue Mar 18, 2026 that may be closed by this pull request
@olivia-banks olivia-banks added the enhancement New feature or request label Mar 18, 2026
@olivia-banks
Copy link
Member Author

There's still some more cleanup I have to do on the YAML side.

@olivia-banks olivia-banks marked this pull request as ready for review March 24, 2026 04:57
Copilot AI review requested due to automatic review settings March 24, 2026 04:57
Copy link
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

This PR restructures the Streamlit app into a src/epicc/ package layout and introduces a format read/write + validation foundation (Pydantic-backed) for loading parameters and model/config YAML/XLSX in a uniform way, with accompanying unit tests.

Changes:

  • Migrates the app from top-level app.py + utils/ + models/ into src/epicc/ (new __main__.py, config/, formats/, model/, models/, utils/).
  • Adds a new FRW (format read/write) layer for YAML/XLSX plus Pydantic validation and template generation.
  • Adds pytest-based tests for the formats and model loader, and updates dependencies (PyYAML → ruamel-yaml, adds Pydantic).

Reviewed changes

Copilot reviewed 35 out of 43 changed files in this pull request and generated 8 comments.

Show a summary per file
File Description
uv.lock Updates locked dependencies (adds Pydantic + ruamel-yaml; removes PyYAML).
pyproject.toml Updates runtime deps (adds pydantic[email], ruamel-yaml; removes PyYAML).
.gitignore Adds ignores for common dev/build artifacts.
app.py Removes old top-level Streamlit entrypoint.
config/app.yaml Removes old app config file (migrated).
config/global_defaults.yaml Removes old global defaults file (migrated).
config/paths.yaml Removes old model path config (no longer used).
models/tb_isolation.py Removes old dynamic python-file model module.
models/measles_outbreak.py Removes old dynamic python-file model module.
utils/model_loader.py Removes old model discovery/dynamic import loader.
utils/parameter_loader.py Removes old YAML/XLSX parameter loader utilities.
utils/parameter_ui.py Removes old Streamlit parameter UI renderer.
utils/section_renderer.py Removes old section renderer.
src/epicc/main.py New Streamlit app entrypoint using the new package structure.
src/epicc/web/sidebar.css Adds CSS for sidebar styling.
src/epicc/config/init.py Adds config loader (resource-based) and exports CONFIG.
src/epicc/config/schema.py Adds Pydantic config schema models.
src/epicc/config/default.yaml Adds default app config YAML.
src/epicc/formats/init.py Adds format registry, validation wrapper, and read_from_format.
src/epicc/formats/base.py Introduces the BaseFormat FRW interface.
src/epicc/formats/yaml.py Adds YAML FRW backend (ruamel-yaml).
src/epicc/formats/xlsx.py Adds XLSX FRW backend (openpyxl).
src/epicc/formats/template.py Adds template generation from Pydantic models.
src/epicc/model/init.py Adds model loader for schema-based YAML models.
src/epicc/model/base.py Adds BaseSimulationModel ABC for Python-based models.
src/epicc/model/schema.py Adds Pydantic schema for YAML-defined models.
src/epicc/models/tb_isolation.py Ports TB Isolation model to BaseSimulationModel.
src/epicc/models/tb_isolation.yaml Adds default parameters for TB Isolation model.
src/epicc/models/measles_outbreak.py Ports Measles Outbreak model to BaseSimulationModel.
src/epicc/models/measles_outbreak.yaml Adds default parameters for Measles Outbreak model.
src/epicc/utils/model_loader.py Defines built-in model registry (returns model instances).
src/epicc/utils/parameter_loader.py Adds typed parameter loading + flattening via FRW.
src/epicc/utils/parameter_ui.py Adds updated Streamlit parameter UI rendering helpers.
src/epicc/utils/section_renderer.py Adds updated section rendering utilities.
src/epicc/utils/excel_model_runner.py Refactors existing Excel runner (style cleanup; reuses shared flatten_dict).
tests/epicc/test_formats.py Adds tests for format selection + read_from_format.
tests/epicc/test_formats_yaml.py Adds YAMLFormat tests (read/write, error cases).
tests/epicc/test_formats_xlsx.py Adds XLSXFormat tests (read/write, error cases).
tests/epicc/test_formats_template.py Adds template generation integration tests (YAML + XLSX).
tests/epicc/test_model_loader.py Adds tests for schema-based model loader.
sample/TB Isolation.xlsx Adds sample Excel file for TB Isolation.
sample/Measles Outbreak.xlsx Adds sample Excel file for Measles Outbreak.
sample/~$TB Isolation.xlsx Adds an Excel temp/lock file (artifact).

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Comment on lines +10 to +16
# We're okay to cast here because we know the file exists and Traversable is a supertype of
# Path, but mypy doesn't know that.
config_path = cast(
Path, importlib.resources.files("epicc").joinpath(f"models/{name}.yaml")
)

return read_from_format(config_path, config_path.open("rb"), Model)
Copy link

Copilot AI Mar 24, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

importlib.resources.files(...) returns a Traversable, not a pathlib.Path. Casting it to Path doesn’t change the runtime type, and read_from_format() ultimately calls Path(path).suffix, which will raise TypeError for a Traversable in zipped/pyodide-style deployments. Prefer passing a real str/Path for suffix detection (e.g., f"{name}.yaml") while using the Traversable only to open the stream, or extend read_from_format/get_format to accept Traversable safely.

Suggested change
# We're okay to cast here because we know the file exists and Traversable is a supertype of
# Path, but mypy doesn't know that.
config_path = cast(
Path, importlib.resources.files("epicc").joinpath(f"models/{name}.yaml")
)
return read_from_format(config_path, config_path.open("rb"), Model)
# Use a Traversable for opening the resource, and a real Path/str for suffix detection.
config_resource = importlib.resources.files("epicc").joinpath(f"models/{name}.yaml")
config_name = Path(f"{name}.yaml")
return read_from_format(config_name, config_resource.open("rb"), Model)

Copilot uses AI. Check for mistakes.
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Comment on lines +16 to +21
# We're okay to cast here because we know the file exists and Traversable is a supertype of
# Path, but mypy doesn't know that.
config_path = cast(
Path, importlib.resources.files("epicc").joinpath(f"config/{name}.yaml")
)
return read_from_format(config_path, config_path.open("rb"), Config)
Copy link

Copilot AI Mar 24, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Same issue as epicc.model.load_model: importlib.resources.files(...) returns a Traversable, and casting to Path can lead to runtime errors when read_from_format() tries to wrap it with Path(...). To keep this compatible with packaging/pyodide, pass a str/Path filename for format detection and use the Traversable only for .open() (or update read_from_format/get_format to support Traversable).

Suggested change
# We're okay to cast here because we know the file exists and Traversable is a supertype of
# Path, but mypy doesn't know that.
config_path = cast(
Path, importlib.resources.files("epicc").joinpath(f"config/{name}.yaml")
)
return read_from_format(config_path, config_path.open("rb"), Config)
resource = importlib.resources.files("epicc").joinpath(f"config/{name}.yaml")
filename = f"config/{name}.yaml"
return read_from_format(filename, resource.open("rb"), Config)

Copilot uses AI. Check for mistakes.
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Comment on lines +88 to +111
def write(self, data: dict[str, Any], template: Workbook | None = None) -> bytes:
"""
Write a dictionary to an XLSX file.

Args:
data: Dictionary to write.
template: Optional Workbook object to use as a template for writing. If provided, the
structure and formatting of the template will be preserved as much as possible.

Returns:
Byte array containing the XLSX data.
"""

wb = template or Workbook()
ws = wb.active
assert ws is not None, "Workbook must have an active worksheet (bug)."

# Populate with the provided data piecewise.
for row in ws.iter_rows():
key_cell = row[_COL_PARAMETER]
val_cell = row[_COL_VALUE]
if key_cell.value in data:
val_cell.value = data[key_cell.value] # type: ignore

Copy link

Copilot AI Mar 24, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

XLSXFormat.read() builds a nested dict from dot-notation keys, but write() only updates cells when key_cell.value in data, which only works for a flat dict keyed by the worksheet strings. As-is, a read→modify→write flow will silently fail for nested values (e.g., updating {"costs": {"latent": 1}} won’t match a costs.latent row). Consider flattening nested input data to dot-notation before the loop (or change read() to keep flat keys).

Copilot uses AI. Check for mistakes.
Comment on lines +35 to +47
yaml = YAML(typ="safe")

try:
data = yaml.load(data)
except Exception as e:
raise ValueError(f"Failed to parse YAML data at {self.path}") from e

if not isinstance(data, dict):
raise ValueError(
f"Expected a YAML mapping at the top level in {self.path}, got {type(data).__name__}"
)

return data, yaml
Copy link

Copilot AI Mar 24, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The module docstring says this reader preserves YAML structure/trivia, but the implementation uses YAML(typ="safe") and returns a plain dict, which discards comments/formatting. If trivia preservation is a goal, use ruamel’s round-trip loader (typ="rt") and keep the loaded CommentedMap (or similar) as the template for write().

Copilot uses AI. Check for mistakes.
@@ -0,0 +1,76 @@
"""
Generic reader for YAML parameter files. Excepts a YAML file with a mapping at the top level, which
Copy link

Copilot AI Mar 24, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Spelling in the module docstring: “Excepts a YAML file …” should be “Expects a YAML file …”.

Suggested change
Generic reader for YAML parameter files. Excepts a YAML file with a mapping at the top level, which
Generic reader for YAML parameter files. Expects a YAML file with a mapping at the top level, which

Copilot uses AI. Check for mistakes.
Comment on lines +3 to +8
import importlib.resources
import tempfile
import textwrap
from pathlib import Path
from unittest.mock import MagicMock, patch

Copy link

Copilot AI Mar 24, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

These imports appear unused and will be flagged by Ruff’s default rules (F401): importlib.resources, MagicMock. Remove them or use them in the tests.

Copilot uses AI. Check for mistakes.
Comment on lines 14 to +17
[dependency-groups]
dev = [
"ruff>=0.15.6",
"pytest>=8.0.0",
Copy link

Copilot AI Mar 24, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pytest is used by the newly added test suite (tests/epicc/*) but isn’t declared in any dependency group, and it also doesn’t appear in uv.lock. Add pytest (and any other test runner deps) to an appropriate dependency group (e.g. dev or a new test group) and regenerate the lockfile so a fresh env/container can run the tests.

Copilot uses AI. Check for mistakes.
Comment on lines +1 to +9
import importlib.resources
from typing import Any

import streamlit as st

from epicc.config import CONFIG
from epicc.formats import VALID_PARAMETER_SUFFIXES
from epicc.model.base import BaseSimulationModel
from epicc.utils.excel_model_runner import (
Copy link

Copilot AI Mar 24, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This module uses absolute imports like from epicc.config import CONFIG, which requires src/ to be on PYTHONPATH or the project to be installed as a package. If users run streamlit run src/epicc/__main__.py (a common pattern after moving to src/ layout), sys.path will include src/epicc (not src) and these imports will fail. Consider adding a documented/automated entrypoint (e.g., a small top-level app.py shim, or ensuring uv sync installs the project and updating run instructions) so the app is runnable without manual PYTHONPATH setup.

Copilot uses AI. Check for mistakes.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

enhancement New feature or request

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Define YAML schema and write a validation tool validate_yaml

2 participants