
feat: Input validation with Dutch error messages and CI/CD #9

Merged
aslamht merged 4 commits into main from 4-data-validation on Mar 11, 2026


aslamht commented Mar 10, 2026

Summary

  • Implements comprehensive input validation for enrollment and RIO data
  • Adds Dutch error messages for better user experience
  • Sets up GitHub Actions CI/CD workflows for package checks and test coverage
  • Brings the test suite to 145 passing tests (1 skipped, 0 failures)

Changes

Validation (R/validation.R)

  • validate_enrollments_input() - validates enrollment data structure, required columns, year ranges
  • validate_rio_input() - validates RIO data structure and required columns
  • validate_data_types() - validates column data types
  • All validation uses structured rlang::abort() with informative Dutch error messages
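
For readers who have not opened R/validation.R, the pattern described above can be sketched roughly as follows. This is not the package's code: the function name `validate_enrollments_sketch`, the column `INS_Inschrijvingsjaar`, the year bounds, the condition classes, and the exact Dutch wording are all assumptions made for illustration. Only `rlang::abort()` and the column name `INS_Studentnummer` come from this PR.

```r
library(rlang)

# Hypothetical required columns and year bounds; the real package may differ.
required_cols <- c("INS_Studentnummer", "INS_Inschrijvingsjaar")
year_range <- c(1990L, 2100L)

validate_enrollments_sketch <- function(data) {
  # Structure check: the input must be a data frame.
  if (!is.data.frame(data)) {
    abort(
      "Invoer moet een data frame zijn.", # "Input must be a data frame."
      class = "validation_error_type"
    )
  }

  # Required-column check; the missing names travel with the condition object.
  missing_cols <- setdiff(required_cols, names(data))
  if (length(missing_cols) > 0) {
    abort(
      # "Required columns are missing: ..."
      paste0("Verplichte kolommen ontbreken: ", paste(missing_cols, collapse = ", ")),
      class = "validation_error_missing_columns",
      missing_columns = missing_cols
    )
  }

  # Year-range check (assumes a numeric year column); NA values are ignored here.
  years <- data[["INS_Inschrijvingsjaar"]]
  out_of_range <- !is.na(years) & (years < year_range[1] | years > year_range[2])
  if (any(out_of_range)) {
    abort(
      # "Enrollment year outside range ..."
      sprintf("Inschrijvingsjaar buiten bereik %d-%d.", year_range[1], year_range[2]),
      class = "validation_error_year_range"
    )
  }

  invisible(data)
}
```

Giving each failure its own condition class lets callers and tests match on the class rather than on the Dutch message text, which keeps the tests stable if the wording changes.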

Integration (R/audit.R)

  • Added validation calls in audit_enrollments() and audit_rio()
  • Validation runs before any processing, so invalid input fails early
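
The integration can be pictured as a validate-then-process wrapper. The sketch below uses hypothetical stand-ins (`validate_input_sketch`, `audit_sketch`, and a placeholder summary as the "audit" result); the real `audit_enrollments()` and `audit_rio()` presumably do far more after the validation call.

```r
library(rlang)

# Hypothetical stand-in for the package's validators.
validate_input_sketch <- function(data) {
  if (!is.data.frame(data) || nrow(data) == 0) {
    abort(
      "Invoer moet een niet-leeg data frame zijn.", # "Input must be a non-empty data frame."
      class = "validation_error"
    )
  }
  invisible(data)
}

# Hypothetical audit function: validation is the first statement,
# so nothing downstream ever sees invalid input.
audit_sketch <- function(data) {
  validate_input_sketch(data)

  # Placeholder for the actual audit logic.
  list(n_rows = nrow(data), n_cols = ncol(data))
}
```

Calling `audit_sketch(NULL)` raises the classed validation error before any audit logic runs, while a valid data frame passes straight through to the summary.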

Tests

  • 17 tests for validation functions (test-validation.R)
  • 2 tests for audit integration (test-audit.R)
  • All test expectations updated to match Dutch error messages
  • Result: 145 tests passing, 1 skipped, 0 failures

CI/CD Workflows

  • .github/workflows/R-CMD-check.yaml - Package checks on push/PR
  • .github/workflows/test-coverage.yaml - Test coverage reporting

Test plan

  • All tests passing (145 passing, 1 skipped, 0 failures)
  • Validation rejects invalid input with helpful Dutch messages
  • CI/CD workflows configured
  • No fluff, clean code

Closes #4

aslamht added 4 commits March 10, 2026 19:33
- Add R/validation.R with validate_enrollments_input(), validate_rio_input(), and validate_data_types()
- Integrate validation into audit_enrollments() and audit_rio()
- Create comprehensive test suite (17 tests for validation, 2 for audit integration)
- Add GitHub Actions workflows for R-CMD-check and test-coverage
- Use structured rlang::abort() with informative Dutch error messages
- All 145 tests passing

Closes #4

- Add raw data validation before column translation
- Add strict NA check for INS_Studentnummer (max 5% NA allowed)
- Remove test output files and add to .gitignore
- Shorten mapping table filenames to fix non-portable path errors
- Use utils::head() to avoid NAMESPACE warnings
- All 145 tests passing
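
The strict NA check on INS_Studentnummer mentioned in this commit could look something like the sketch below. The function name, the condition class, the Dutch wording, and the choice to treat exactly 5% as still acceptable are assumptions; the commit message only states that at most 5% NA is allowed.

```r
library(rlang)

# Hypothetical helper: abort when the share of NA values in a column exceeds max_na.
check_na_share_sketch <- function(data, column = "INS_Studentnummer", max_na = 0.05) {
  stopifnot(is.data.frame(data), column %in% names(data), nrow(data) > 0)

  # mean() of a logical vector gives the fraction of TRUE values, i.e. the NA share.
  na_share <- mean(is.na(data[[column]]))

  if (na_share > max_na) {
    abort(
      # "Column ... contains x% missing values (at most y% allowed)."
      sprintf(
        "Kolom %s bevat %.1f%% ontbrekende waarden (maximaal %.0f%% toegestaan).",
        column, 100 * na_share, 100 * max_na
      ),
      class = "validation_error_na_share",
      na_share = na_share
    )
  }

  invisible(na_share)
}
```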

Renamed mapping files (keeping Mapping_ prefix):
- Mapping_INS_Indicatie_eerste_jaars_opleiding_en_instelling... → Mapping_INS_ind_ej_opl_inst_naam.csv
- Mapping_INS_Indicatie_eerste_jaars_instelling...naam → Mapping_INS_ind_ej_inst_naam.csv
- Mapping_INS_Indicatie_eerste_jaars_instelling...Cat → Mapping_INS_ind_ej_inst_cat.csv
- Mapping_INS_Indicatie_actief_op_peildatum... → Mapping_INS_actief_peil_omschr.csv
- Mapping_INS_Opleidingsfase_actueel... → Mapping_INS_opleidingsfase_naam.csv
- Mapping_INS_Examenresultaat... → Mapping_INS_examenres_omschr.csv
- Mapping_INS_Soort_inschrijving... → Mapping_INS_soort_inschr_cat.csv
- Mapping_INS_Hoogste_vooropleiding... → Mapping_INS_hoogste_vooropl_cat.csv
- Mapping_INS_BaMa...gelijke_fase → Mapping_BaMa_Examen_gelijk_cohort.csv
- Mapping_INS_BaMa...ongelijke_fase → Mapping_BaMa_Examen_ongelijk_cohort.csv

- Check all expected columns from Documentatie_ev.csv (In_gebruik=TRUE)
  are present in raw enrollments data (for datasets >= 10 columns)
- For minimal test data (< 10 columns), only check critical columns
- Provides detailed error message with missing column count and list
- All 145 tests passing

- Error message now points users to cedanl/1cijferho tool
- Explains data must be converted from ASCII to CSV with decoding
- Includes GitHub link for the tool
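
Taken together, the last two commits describe an expected-column check whose error points users to cedanl/1cijferho. A rough sketch follows, under several assumptions: the documentation table is passed in as a data frame with a hypothetical name column `kolom` next to `In_gebruik`, the set of critical columns is reduced to `INS_Studentnummer`, and the Dutch wording and condition class are invented for illustration.

```r
library(rlang)

# Hypothetical helper: compare the raw data against the documented columns.
check_expected_columns_sketch <- function(data, documentation,
                                          critical = "INS_Studentnummer",
                                          min_cols_full_check = 10L) {
  # Full check only for datasets with at least 10 columns;
  # minimal test data is checked against the critical columns only.
  expected <- if (ncol(data) >= min_cols_full_check) {
    as.character(documentation$kolom[documentation$In_gebruik])
  } else {
    critical
  }

  missing_cols <- setdiff(expected, names(data))
  if (length(missing_cols) > 0) {
    abort(
      c(
        # "%d expected columns are missing: ..."
        sprintf(
          "Er ontbreken %d verwachte kolommen: %s",
          length(missing_cols), paste(missing_cols, collapse = ", ")
        ),
        # "First convert the ASCII files to CSV with cedanl/1cijferho."
        i = "Converteer de ASCII-bestanden eerst naar CSV met cedanl/1cijferho."
      ),
      class = "validation_error_expected_columns",
      missing_columns = missing_cols
    )
  }

  invisible(TRUE)
}
```

The named `i` element makes rlang render the hint as an informational bullet under the main message, which keeps the count and list of missing columns separate from the advice on how to fix the input.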
aslamht merged commit 9415a25 into main on Mar 11, 2026
2 checks passed
aslamht deleted the 4-data-validation branch on March 11, 2026 10:57


Development

Successfully merging this pull request may close these issues.

Data Validatie voor prep1cho (Data validation for prep1cho)
