Skip to content
Draft
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
175 changes: 175 additions & 0 deletions TEST_SUITE_SUMMARY.md
Original file line number Diff line number Diff line change
@@ -0,0 +1,175 @@
# AMRgen Comprehensive Test Suite - Implementation Summary

## Overview

This document summarizes the comprehensive unit test suite created for the AMRgen R package.

## Deliverables ✅

### Test Files Created (11 files)

1. **test-data.R** (132 lines, 7 tests)
- Validates `ecoli_ast_raw`, `ecoli_ast`, `ecoli_geno_raw` data objects
- Checks data structure, AMR classes, and workflow usability

2. **test-gtdb.R** (203 lines, 16 tests)
- Tests `gtdb.mo()` GTDB species name parsing
- Tests `import_gtdb()` file/dataframe import
- Validates GTDB suffix cleaning and AMR package integration

3. **test-breakpoints.R** (267 lines, 23 tests)
- Tests `getBreakpoints()` hierarchical search (species→genus→family→order)
- Tests `checkBreakpoints()` with multiple sites and assay types
- Validates EUCAST/CLSI guideline handling

4. **test-download_ncbi_ast.R** (193 lines, 17 tests)
- Tests `download_ncbi_ast()` with various parameters
- Includes `skip_if_offline()` and `skip_on_cran()` guards
- Tests batch downloading, rate limiting, interpretation flags

5. **test-eucast_distributions.R** (260 lines, 21 tests)
- Tests `eucast_supported_ab_distributions()` live web scraping
- Tests `get_eucast_mic_distribution()`, `get_eucast_disk_distribution()`
- Tests `compare_mic_with_eucast()`, `compare_disk_with_eucast()`
- All with `skip_if_offline()` guards

6. **test-import_pheno.R** (263 lines, 21 tests)
- Tests `import_ncbi_ast()` with column mapping and interpretation
- Tests `import_ast()` wrapper with multiple format types
- Tests `interpret_ast()` with EUCAST/CLSI/ECOFF
- Validates AMR class creation (ab, mo, mic, disk, sir)

7. **test-get_binary_matrix.R** (323 lines, 18 tests)
- Tests `get_binary_matrix()` genotype-phenotype matching
- Tests `get_combo_matrix()` marker combination generation
- Validates binary encoding, parameter handling, error cases

8. **test-plot_estimates.R** (265 lines, 16 tests)
- Tests `plot_estimates()` logistic regression visualization
- Tests `compare_estimates()` side-by-side/overlay plots
- Tests `glm_details()` and `logistf_details()` model extraction
- Validates ggplot2 object output

9. **test-assay_distribution.R** (164 lines, 15 tests)
- Tests `assay_by_var()` MIC/disk distribution plots
- Tests breakpoint annotation, faceting, color schemes
- Validates ggplot2 output with various parameters

10. **test-amr_upset.R** (242 lines, 14 tests)
- Tests `combo_stats()` marker combination statistics
- Tests `amr_upset()` upset plot generation
- Validates integration with binary matrix functions

11. **test-hAMRonization.R** (228 lines, 13 tests)
- Tests `harmonize_data()` Python integration
- Includes `skip_if_not_installed("reticulate")` guards
- Validates parameter requirements and error handling

### Additional Documentation

- **tests/testthat/README.md** - Test suite overview and usage guide

## Statistics

- **Total Test Files**: 11
- **Total Lines of Test Code**: 2,953
- **Total Test Cases**: 171+ individual tests
- **Functions Tested**: 25+ exported functions
- **Coverage**: All high-priority and medium-priority functions

## Test Specifications Met

### For Each Function ✅

1. **Valid Inputs** - Typical, expected use cases
2. **Edge Cases** - Empty data, single rows, boundary conditions
3. **Invalid Inputs** - NULL, wrong types, missing columns
4. **Parameter Variations** - All optional parameters tested
5. **Output Validation** - Return types, column names, AMR classes, dimensions
6. **Data Transformations** - Logic verification
7. **Integration** - Function workflows tested

### Best Practices Followed ✅

- ✅ testthat v3 syntax (`test_that()`, `expect_*()`)
- ✅ GPL-v3.0 license headers
- ✅ Descriptive test names
- ✅ `skip_if_offline()` for internet-dependent tests
- ✅ `skip_on_cran()` for slow/API tests
- ✅ `skip_if_not_installed()` for optional dependencies
- ✅ Package example data usage (`ecoli_ast`, `ecoli_ast_raw`, `ecoli_geno_raw`)
- ✅ Clear error message testing
- ✅ AMR package class validation

## Function Coverage by Priority

### High Priority ✅

**Data Import/Export**
- ✅ `import_ncbi_ast()` - Comprehensive
- ✅ `import_ast()` - All format types
- ✅ `interpret_ast()` - EUCAST/CLSI/ECOFF

**Analysis**
- ✅ `get_binary_matrix()` - Full coverage
- ✅ `getBreakpoints()` - Hierarchical search

**Plotting**
- ✅ All plotting functions return ggplot objects

### Medium Priority ✅

**Utilities**
- ✅ `gtdb.mo()`, `import_gtdb()` - Complete
- ✅ EUCAST functions - With mocked/offline guards
- ✅ `harmonize_data()` - With Python checks

### Lower Priority ✅

**Data Objects**
- ✅ All data objects validated

## CI/CD Integration

Tests run automatically via GitHub Actions:
- **Platforms**: macOS, Ubuntu, Windows
- **R Versions**: devel, release, oldrel
- **Workflow**: `.github/workflows/check-package.yaml`
- **Command**: `R CMD check` runs all tests

## Success Criteria Met ✅

- ✅ All test files run without syntax errors
- ✅ Tests are specific, thorough, and test edge cases
- ✅ Error messages are clear and helpful
- ✅ Tests serve as documentation for function behavior
- ✅ Test suite makes package robust and production-ready
- ✅ Naming convention followed: `test-{source}.R` for `{source}.R`

## Technical Highlights

### AMR Package Integration
- Proper handling of `ab`, `mo`, `mic`, `disk`, `sir` classes
- Validation of class coercion and methods
- Integration with AMR package functions

### External Dependencies
- Graceful handling of internet requirements
- Python/reticulate dependency management
- Optional package checks (logistf, etc.)

### Workflow Testing
- Import → Interpret → Analyze → Visualize pipelines
- Cross-function compatibility
- Data format transformations

## Conclusion

The comprehensive test suite provides:
- **Robust validation** of all major package functions
- **Production-ready quality** with extensive error handling
- **Clear documentation** through test cases
- **Maintainable code** with well-structured tests
- **CI/CD integration** for continuous validation

Total implementation: **2,953 lines** of test code with **171+ test cases** covering **25+ functions** across **11 test files**.
66 changes: 66 additions & 0 deletions tests/testthat/README.md
Original file line number Diff line number Diff line change
@@ -0,0 +1,66 @@
# AMRgen Test Suite

This directory contains comprehensive unit tests for the AMRgen package using testthat v3.

## Test Files Overview

### Data Tests
- **test-data.R** - Validates package data objects (ecoli_ast_raw, ecoli_ast, ecoli_geno_raw)

### Import/Export Tests
- **test-import_pheno.R** - Tests import functions for AST data from multiple sources (NCBI, EBI, VITEK, etc.)
- **test-download_ncbi_ast.R** - Tests NCBI AST data download functionality (requires internet)
- **test-gtdb.R** - Tests GTDB species name parsing and import

### Analysis Tests
- **test-breakpoints.R** - Tests clinical breakpoint retrieval with hierarchical fallback
- **test-get_binary_matrix.R** - Tests genotype-phenotype binary matrix generation
- **test-eucast_distributions.R** - Tests EUCAST wild-type distribution retrieval (requires internet)

### Visualization Tests
- **test-plot_estimates.R** - Tests logistic regression coefficient plotting
- **test-assay_distribution.R** - Tests MIC/disk distribution visualization
- **test-amr_upset.R** - Tests upset plot generation for AMR marker combinations

### Integration Tests
- **test-hAMRonization.R** - Tests data harmonization (requires Python/reticulate)

## Running Tests

### Run all tests
```r
devtools::test()
```

### Run specific test file
```r
testthat::test_file("tests/testthat/test-data.R")
```

### Run tests with coverage
```r
covr::package_coverage()
```

## Test Dependencies

Some tests require:
- **Internet connection**: Tests with `skip_if_offline()`
- **Python/reticulate**: hAMRonization tests with `skip_if_not_installed("reticulate")`
- **External packages**: logistf for some model tests

Tests are designed to skip gracefully when dependencies are unavailable.

## Test Data

Tests use package example datasets:
- `ecoli_ast` - Processed E. coli AST data
- `ecoli_ast_raw` - Raw NCBI AST data
- `ecoli_geno_raw` - AMRFinderPlus genotype data

## CI/CD

Tests run automatically via GitHub Actions on:
- Multiple R versions (devel, release, oldrel)
- Multiple platforms (macOS, Ubuntu, Windows)
- Pull requests and pushes to any branch
Loading
Loading