Skip to content

Addressing main PR comments#15

Merged
caruizelembioinfo merged 3 commits intomainfrom
review_round_20260328
Mar 28, 2026
Merged

Addressing main PR comments#15
caruizelembioinfo merged 3 commits intomainfrom
review_round_20260328

Conversation

@caruizelembioinfo
Copy link
Copy Markdown

@caruizelembioinfo caruizelembioinfo commented Mar 28, 2026

Addressing Main Fork PR Comments:

Code Quality & Style

  • Modern Python typing: The project's pyproject.toml specifies requires-python = ">=3.8" and CI runs on Python 3.8. The built-in generic syntax (dict[str, Any]) requires 3.9+ and the union syntax (X | None) requires 3.10+ at runtime. Since the project rules also forbid from future import annotations, we cannot use these modern forms without breaking compatibility. We've kept the typing module generics (Dict, List, Optional, Tuple) consistent with the rest of the codebase (e.g. base_module.py).
  • Import ordering: Reorganized bases2fastq.py imports into proper PEP 8 blocks (stdlib / third-party / local), with alphabetical sorting within each block.
  • Unused import: Removed Union (and Dict, List, Optional, Tuple) from the typing import.

Potential Bugs

  • Fragile split(""): Added maxsplit=1 to both occurrences -- sample.split("", maxsplit=1) in _setup_colors and sample_data["SampleID"].split("__", maxsplit=1)[1] in tabulate_index_assignment_stats.
  • combined_level comment: Added an inline comment explaining the intentional mix of run-level stats for the run table and project-level samples for per-sample plots.
  • add_run_plots / add_sample_plots empty data: Both methods now return early when data is empty, and skip adding sections when plot_html is None.

Performance

  • find_log_files reuse: Pre-collected log file lists are now stored as instance variables (self._run_level_log_files, self._project_level_log_files) and passed through to all downstream parsers. Manifest log files are also collected once and shared between _parse_run_manifest and _parse_index_assignment. All six parser methods now accept an optional log_files parameter.

Security

  • Path validation comment: Added a comment in _parse_run_manifest_in_project making it explicit that base_directory is resolved to the run output root (not the project subdirectory), so the manifest is validated against the correct ancestor directory.

Minor / Nits

  • Hardcoded docs URL: Extracted to a module-level constant ELEMBIO_DOCS_URL, referenced in all 5 occurrences (4 error messages + the href in init).
  • DEFAULT_MIN_POLONIES: The change from 10,000 to 1,000 is intentional. The previous threshold was too aggressive and excluded samples with very low read counts, including some valid test cases. The lower threshold retains these samples while still filtering out near-empty entries.

@caruizelembioinfo caruizelembioinfo merged commit 38b4694 into main Mar 28, 2026
8 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant