Releases: ghruproject/bactscout
BactScout v1.2.0
Release date: 2025-11-05
This is a breaking release that introduces canonical coverage field names across BactScout's outputs, rounds memory metrics, and updates documentation and the CLI. Please read the upgrade notes carefully before updating production pipelines.
Quick summary
- Version: v1.2.0
- Nature: Breaking change (intentional)
- Primary goals:
- Rename and standardize coverage-related output fields
- Round resource memory metrics to integers for cleaner output
- Add bactscout version CLI command
- Update QC and scaling documentation to reflect current behavior
- Remove generated test output from repository tracking
- Default thresholds in the config have changed; you can override them in your own config
- Many documentation improvements; see https://ghruproject.github.io/bactscout/
Highlights
- Canonical coverage fields
  - Canonicalized coverage field names are now used everywhere in the codebase, outputs, and docs. Notably:
    - coverage_estimate_sylph: Sylph-derived coverage estimate (previously part of the coverage_estimate family)
    - coverage_estimate_qualibact: calculated coverage (reads / expected genome size), with its status in coverage_estimate_qualibact_status
  - These canonical names are the single source of truth for outputs (CSV/JSON) and header ordering utilities.
- Memory metric formatting
  - resource_memory_avg_mb and resource_memory_peak_mb are rounded to integers before being written to summary CSV/JSON files.
- CLI
  - bactscout version added: prints the string from bactscout/__version__.py (now 1.2.0).
- Documentation and guides
  - docs/guide/quality-control.md rewritten to reflect the two-tier WARN/FAIL QC logic implemented in code (see bactscout/thread.py). It includes an explicit note about what "x-fold" coverage means and how both Sylph and qualibact estimates are combined to derive final pass/warning/fail results.
  - docs/guide/scaling.md added: practical guidance for running BactScout at scale and a Nextflow example walkthrough (processes collect_sample and final_summary).
  - Per-sample README content consolidated into the canonical docs/usage/output-format.md.
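As a rough illustration of the coverage and two-tier QC ideas above, the sketch below computes an x-fold coverage value and classifies it against a WARN and a FAIL threshold. It assumes x-fold coverage means total sequenced bases divided by expected genome size (the notes only say "reads / expected genome size"). The function names, status strings, and threshold values are hypothetical and are not taken from bactscout/thread.py.

```python
def fold_coverage(total_bases: int, expected_genome_size: int) -> float:
    """x-fold coverage: sequenced bases divided by expected genome size (bp)."""
    if expected_genome_size <= 0:
        raise ValueError("expected_genome_size must be positive")
    return total_bases / expected_genome_size


def two_tier_status(value: float, warn_below: float, fail_below: float) -> str:
    """Classify a 'higher is better' metric against WARN and FAIL thresholds."""
    if value < fail_below:
        return "FAILED"
    if value < warn_below:
        return "WARNING"
    return "PASSED"


# Hypothetical inputs and thresholds, not BactScout's defaults.
coverage = fold_coverage(total_bases=150_000_000, expected_genome_size=5_000_000)
status = two_tier_status(coverage, warn_below=30.0, fail_below=20.0)
print(coverage, status)
```

How the Sylph and qualibact estimates are combined into a final result is described in docs/guide/quality-control.md; this sketch deliberately handles only a single metric.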
Detailed changelog
The following files were updated or added as part of this release (representative list):
- bactscout/thread.py
  - All coverage-related keys replaced with canonical names.
  - Memory metrics rounded before writeout.
  - Final PASS/WARNING/FAIL logic aligned with documentation (two-tier thresholds; critical vs non-critical metrics).
- bactscout/util.py
  - CSV header ordering and formatting helpers updated to include new canonical coverage keys.
- bactscout.py
  - New version subcommand.
- bactscout/__version__.py
  - Updated to __version__ = "1.2.0".
- docs/guide/quality-control.md
  - Rewritten to match logic in bactscout/thread.py (WARN/FAIL thresholds, coverage handling, metric definitions and guidance).
- docs/guide/scaling.md (new)
  - Guidance for multi-sample and HPC/Nextflow deployments; notes about I/O, resource monitoring, and process-level responsibilities.
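For readers adapting downstream parsers, here is a minimal sketch of what rounding memory metrics and ordering a summary CSV header around the canonical coverage keys could look like. The field names come from these notes; the helper names, the sample_id column, and the ordering rule (preferred keys first, the rest alphabetically) are assumptions for illustration and do not reproduce bactscout/util.py. Note that Python's round() rounds halves to the nearest even integer, which may differ from the rounding BactScout applies.

```python
import csv
import io

# Field names taken from the release notes.
CANONICAL_COVERAGE_KEYS = [
    "coverage_estimate_sylph",
    "coverage_estimate_qualibact",
    "coverage_estimate_qualibact_status",
]
MEMORY_KEYS = ["resource_memory_avg_mb", "resource_memory_peak_mb"]

# "sample_id" is a hypothetical column name used only in this sketch.
PREFERRED_ORDER = ["sample_id"] + CANONICAL_COVERAGE_KEYS


def prepare_row(row: dict) -> dict:
    """Return a copy of the row with memory metrics rounded to integers."""
    out = dict(row)
    for key in MEMORY_KEYS:
        if out.get(key) is not None:
            out[key] = int(round(float(out[key])))
    return out


def ordered_header(row: dict) -> list:
    """Preferred keys first (if present), then remaining keys alphabetically."""
    first = [k for k in PREFERRED_ORDER if k in row]
    rest = sorted(k for k in row if k not in first)
    return first + rest


def write_summary(rows: list, handle) -> None:
    """Write prepared rows as CSV, using the first row to derive the header."""
    prepared = [prepare_row(r) for r in rows]
    writer = csv.DictWriter(handle, fieldnames=ordered_header(prepared[0]))
    writer.writeheader()
    writer.writerows(prepared)


buf = io.StringIO()
write_summary(
    [{
        "resource_memory_peak_mb": 1023.6,
        "coverage_estimate_sylph": 41.2,
        "sample_id": "s1",
        "resource_memory_avg_mb": 512.4,
    }],
    buf,
)
print(buf.getvalue())
```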
Full Changelog: v1.1.2...v1.2.0
BactScout v1.1.2
🎉 What's Changed
🐛 Bug Fixes
- Fix MLST ST detection failing for empty/invalid values - Resolved issue where MLST sequence typing would fail when encountering empty or invalid ST values in stringMLST output, ensuring robust handling of edge cases (#9)
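The kind of defensive handling described in this fix can be sketched as follows. This is not the code from the fix itself: parse_sequence_type is a hypothetical helper, and treating a non-positive or non-numeric value as "no valid ST" is an assumption about how such values should be interpreted.

```python
from typing import Optional


def parse_sequence_type(raw: Optional[str]) -> Optional[int]:
    """Return the ST as a positive integer, or None for empty/invalid input."""
    if raw is None:
        return None
    text = str(raw).strip()
    # Accept plain ASCII digits only; anything else is treated as invalid.
    if not (text.isascii() and text.isdigit()):
        return None
    value = int(text)
    # Assumption: a value of 0 means no sequence type was assigned.
    return value if value > 0 else None


for candidate in ["131", " 11 ", "", "NA", "0", None]:
    print(repr(candidate), "->", parse_sequence_type(candidate))
```

Returning None rather than raising keeps a single bad sample from aborting a batch; callers can then decide how to report the missing ST.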
📚 Documentation Improvements
- Comprehensive docstring updates - Updated all function docstrings in thread.py to accurately reflect current implementation
  - Removed references to deprecated QC metrics (insert size, filtering status, quality trends)
  - Added detailed parameter descriptions with types and defaults
  - Documented status logic (PASSED/WARNING/FAILED) for all QC handlers
- Enhanced workflow documentation with step-by-step processing details
- Improved error handling and edge case documentation
🔧 Configuration Updates
- Adapter detection enhancement - Added adapter_overrep_threshold configuration parameter for more granular control over adapter contamination detection
- GC content evaluation - Added gc_fail_percentage parameter for improved GC content range validation
- Removed deprecated quality_end_drop_threshold configuration
🧹 Code Quality
- Improved function documentation consistency across all QC evaluation handlers
- Enhanced status message clarity for better debugging and reporting
- Better backward compatibility notes for legacy configuration parameters
🧪 Testing
- Added comprehensive test coverage for genome download functionality
- Enhanced integration tests for sample data collection pipeline
- Optimized CI workflows by removing memory-intensive tests
Full Changelog: v1.0.0...v1.1.2
BactScout v1.0.0
Release Highlights
✨ Features
- Complete MLST analysis pipeline for bacterial genomic classification
- Multi-threaded processing for improved performance
- Sylph-based species identification
- FastP quality control integration
- Comprehensive configuration system
🧪 Testing & Quality
- 98 comprehensive pytest tests (100% passing)
- 16 CLI integration tests
- 56 fastp data extraction tests
- 13 stringmlst module tests
- Full code coverage reporting with Codecov integration
🚀 DevOps & Automation
- GitHub Actions CI/CD pipeline with Pixi
- Automated linting and code quality checks
📦 Deployment
- Pre-configured settings for common bacterial species
- Database support for: Acinetobacter baumannii, Escherichia coli, Klebsiella pneumoniae, Pseudomonas aeruginosa, Salmonella enterica
🐛 Bug Fixes
- Fixed fastp metrics extraction for read length calculations
- Fixed field name validation in fastp result handling
This is a stable, production-ready release suitable for genomic analysis workflows.