BDMS 412/529/530: water level csv implementation#466
BDMS 412/529/530: water level csv implementation#466jacob-a-brown wants to merge 36 commits intowater-level-csvfrom
Conversation
This was deprecated awhile ago but still existed in the lexicon and parameter files. The only valid sample matrix for water level observations is "water"
The bulk upload is used for both the API and CLI, so it should be tested by itself. Then the API and CLI tests can just call the bulk upload function and verify that they run as expected
All water levels must be associated with an existing contact in the DB, so the contact fixture is used instead of hardcoding a name.
hanging data prevents tests from running cleanly
This large refactor updates the water level CSV bulk upload process in the following ways: - It uses the updated schema/headers for the CSV - It uses Pydantic to do all of the data validations that don't pertain to the DB - It updates tests to reflect these changes - It tests the services function since that logic is used by both the API and CLI
The logic to test if the bulk upload function works correctly is already covered in its own unit tests. The CLI test was redundant and has been removed to streamline the test suite.
This commit adds a unit test to verify that the water level csv upload service correctly handles the scenario where the specified source file does not exist. The test checks that the appropriate error message is returned in the payload when a FileNotFoundError is raised.
instead of just the first one. This allows users to see all issues with a row at once instead of having to fix them one by one.
the model needs to be validated and created before any comparisons can be made against the database
this is required in the feature file
…n value This reflects the updates to the feature test
It is now called water_level_date_time instead of measurement_date_time
The required/optional givens are common between well inventory and water level CSVs, so this commit moves them to common.py to avoid duplication and make them reusable across different feature files.
The previous implementation was for an outdated version of the water level csv handling. This commit updates the test to align with the new implementation, ensuring that the CLI commands are tested correctly with the current water level csv processing logic.
Both water-level-csv and well-inventory-csv features have the same step to convert datetime fields from naive to aware and confirm the correct timezone offset. This commit moves that step implementation to common.py to avoid code duplication and ensure consistent testing of datetime handling across both features.
This commit adds a feature test for invalid "measuring_person" values in the water level csv upload. It also uses standard verbiage for the feature test so that the same step can be reused.
There was a problem hiding this comment.
Pull request overview
This PR implements the water level CSV upload functionality with proper Pydantic-based validation and database persistence, while refactoring common test steps to enable code reuse between water-level-csv and well-inventory-csv feature files.
Changes:
- Refactored water level CSV upload to use Pydantic schemas for validation and structured data handling
- Moved common test steps to
tests/features/common.pyfor reuse across feature files - Updated field mappings to align with database schema (e.g.,
measurement_date_time→water_level_date_time, field staff handling)
Reviewed changes
Copilot reviewed 16 out of 16 changed files in this pull request and generated 3 comments.
Show a summary per file
| File | Description |
|---|---|
| services/water_level_csv.py | Core implementation refactored to use Pydantic schemas and proper database field mappings |
| schemas/water_level_csv.py | New Pydantic schema definitions for CSV rows and bulk upload responses |
| tests/services/test_water_level_csv.py | New comprehensive unit tests for bulk upload function |
| tests/test_observation.py | Updated API test to use fixture-based test data and verify all created database entities |
| tests/test_cli_commands.py | Updated CLI tests with detailed verification of database insertions |
| tests/features/steps/common.py | New common test steps for CSV field validation and datetime handling |
| tests/features/steps/water-levels-csv.py | Updated to use Pydantic schema and shared common steps |
| tests/features/steps/well-inventory-csv.py | Removed duplicate steps now available in common.py |
| tests/conftest.py | New fixture providing test data for water level bulk uploads |
| tests/features/environment.py | Updated contact creation to support multiple named contacts |
| api/observation.py | Updated to return proper Pydantic payload and 201 status code |
| core/parameter.json | Corrected matrix value from "groundwater" to "water" |
| @pytest.fixture() | ||
| def water_level_bulk_upload_data(water_well_thing, contact, second_contact): | ||
| data = { | ||
| "well_name_point_id": water_well_thing.name, | ||
| "field_event_date_time": "2025-02-15T08:00:00", | ||
| "field_staff": contact.name, | ||
| "field_staff_2": second_contact.name, | ||
| "field_staff_3": "", | ||
| "water_level_date_time": "2025-02-15T10:30:00", | ||
| "measuring_person": contact.name, | ||
| "sample_method": "Electric tape measurement (E-probe)", | ||
| "mp_height": "1.5", | ||
| "level_status": "Water level not affected", | ||
| "depth_to_water_ft": "7", | ||
| "data_quality": "Water level accurate to within two hundreths of a foot", | ||
| "water_level_notes": "Initial measurement", | ||
| } | ||
| yield data | ||
| del data |
There was a problem hiding this comment.
Corrected spelling of 'hundreths' to 'hundredths' in data_quality field.
There was a problem hiding this comment.
This typo exists in LU_DataQuality. Should it be fixed here or at a late time?
|
@codex review |
|
To use Codex here, create a Codex account and connect to github. |
|
@codex review |
|
To use Codex here, create a Codex account and connect to github. |
| response_model=WaterLevelBulkUploadResponse, | ||
| status_code=HTTP_200_OK, | ||
| response_model=WaterLevelBulkUploadPayload, | ||
| status_code=HTTP_201_CREATED, |
There was a problem hiding this comment.
The naming can change as we see fit 😅 . I tried to mimic the nomenclature from when I refactored/updated the water level csv implementation. I think originally we were planning to have it work as an API endpoint but now it's being implemented by the CLI, whose "response" we need to define manually.
| context.datetime_fields.append("date_time") | ||
|
|
||
|
|
||
| # TODO: implement when optional water levels are added to the well inventory csv testing data |
There was a problem hiding this comment.
Why is this test here before it's implemented? It's kinda silly to have a test with just pass.
There was a problem hiding this comment.
Good question! Since it is in the feature file if I don't implement the step then a warning/error will be flagged and the tests will not pass.
As I wrote at the end of the PR description this is the next and final step in getting the csv uploads to be fully implemented
| field_staff,well_name_point_id,field_event_date_time,measurement_date_time,sampler,sample_method,mp_height,level_status,depth_to_water_ft,data_quality,water_level_notes | ||
| Alice Lopez,AR0001,2025-02-15T08:00:00-07:00,2025-02-15T10:30:00-07:00,Groundwater Team,electric tape,1.5,stable,45.2,approved,Initial measurement after irrigation shutdown | ||
| Bernardo Chen,AR0002,2025-03-05T09:15:00-07:00,2025-03-05T11:10:00-07:00,Consultant,steel tape,1.8,rising,47.0,provisional,Follow-up visit; pump was off for 24h | ||
| Alice Lopez,AR0001,2025-02-15T08:00:00,2025-02-15T10:30:00,Groundwater Team,electric tape,1.5,stable,45.2,approved,Initial measurement after irrigation shutdown |
There was a problem hiding this comment.
So, the CSVs that we use cannot have the UTC offset, or this feature will not work?
There was a problem hiding this comment.
The intension is for users to submit timezone-naive and then we add the offset manually, either MST or MDT (assuming that the data is submitted in Mountain Time)
Why
This PR addresses the following problem / context:
water level csvupload needs to be implemented correctly per the feature tests and DB designwater-level-csv.featureandwell-inventory-csv.featureneed to be resolved so that the same step functions can be reusedHow
Implementation summary - the following was changed / added / removed:
Implementation
water level csvupload implementation was refactored to leverage Pydantic for data validations and code reusewater level csvfile were mapped to the correct tables and fields in OcotilloTests
water-level-csv.featureandwell-inventory-csv.featurewere moved totests/features/common.pyso that they can be reused for both feature files.context.csv_rowstocontext.rowsinwater-level-csv.featurefor the style and implementation to be the same between the two feature filestests/services/water_level_csv.pyto ensure that the function used to add bulk data to the DB works and persists data as expectedNotes
Any special considerations, workarounds, or follow-up work to note?
well inventory csvupload. This will be achievable by reusing the functionservices/water_level_csv.py::bulk_upload_water_levelsin thewell inventory csvupload. That function requires the wells to already exist in the DB, so it should be called after the wells have been persisted. Some things to consider as this final functionality is implemented/developed:bulk_upload_water_levelsfunction cannot be used forwell inventory csvthe Pydantic schemas can and should be used.the optional "water_level_date_time" values are valid ISO 8601 timezone-naive datetime strings (e.g. "2025-02-15T10:30:00") when providedintests/features/well-inventory-csv.pyneeds to be uncommented when this is implemented.well-inventory-csvthat has been vetted and approved. PR 🚨 ⚠️ 💀 [NO TICKET]: Well inventory csv Feature Tests #247 should get merged intostagingbeforewater-level-csvgets merged intostaging(note: this is to merge intowater-level-csv, which also has the latest work fromwell-inventory-csv)