
BDMS 412/529/530: water level csv implementation#466

Open
jacob-a-brown wants to merge 36 commits into water-level-csv from water-level-csv-refactor

Conversation

@jacob-a-brown
Contributor

@jacob-a-brown jacob-a-brown commented Feb 5, 2026

Why

This PR addresses the following problem / context:

  • The water level csv upload needs to be implemented correctly per the feature tests and DB design
  • Common steps between water-level-csv.feature and well-inventory-csv.feature need to be resolved so that the same step functions can be reused

How

Implementation summary - the following was changed / added / removed:

Implementation

  • The water level csv upload implementation was refactored to leverage Pydantic for data validations and code reuse
  • The different fields in the water level csv file were mapped to the correct tables and fields in Ocotillo

Tests

  • The common steps between water-level-csv.feature and well-inventory-csv.feature were moved to tests/features/common.py so that they can be reused for both feature files.
    • This also meant renaming context.csv_rows to context.rows in water-level-csv.feature for the style and implementation to be the same between the two feature files
  • A unit test was added to tests/services/water_level_csv.py to ensure that the function used to add bulk data to the DB works and persists data as expected
  • The cli tests were updated for the updated implementation

Notes

Any special considerations, workarounds, or follow-up work to note?

  • Once this PR has been accepted, the final thing to do is implement optional water levels for the well inventory csv upload. This will be achievable by reusing the function services/water_level_csv.py::bulk_upload_water_levels in the well inventory csv upload. That function requires the wells to already exist in the DB, so it should be called after the wells have been persisted. Some things to consider as this final functionality is implemented/developed:
    • Even if the bulk_upload_water_levels function cannot be used for the well inventory csv, the Pydantic schemas can and should be used.
    • If a well is uploaded correctly, but there is an error in its associated (optional) water level, should that well still be uploaded? What will the error message(s) look like?
    • If everything is successfully uploaded, what will the stdout look like?
    • The step testing that the optional "water_level_date_time" values are valid ISO 8601 timezone-naive datetime strings (e.g. "2025-02-15T10:30:00") when provided, in tests/features/well-inventory-csv.py, needs to be uncommented when this is implemented.
    • The well inventory unit tests need to be updated to ensure that the optional water level upload is working as expected. Correct data validation, correct data persistence, etc...
  • This uses some functionality from well-inventory-csv that has been vetted and approved. PR 🚨 ⚠️ 💀 [NO TICKET]: Well inventory csv Feature Tests #247 should get merged into staging before water-level-csv gets merged into staging (note: this is to merge into water-level-csv, which also has the latest work from well-inventory-csv)
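The planned call order for the well inventory follow-up can be sketched as below. Only the name bulk_upload_water_levels comes from this PR; every other function and field here is a hypothetical placeholder meant to show the sequencing constraint (wells persisted first, then water levels):

```python
# Sketch of the planned call order for the well inventory upload.
# Function bodies are placeholders; only the ordering is the point.

def persist_wells(rows: list[dict]) -> dict:
    # Hypothetical: insert well records and return them keyed by name.
    return {r["well_name_point_id"]: r for r in rows}


def bulk_upload_water_levels(rows: list[dict], wells: dict) -> int:
    # Stand-in for services/water_level_csv.py::bulk_upload_water_levels,
    # which requires the referenced wells to already exist in the DB.
    missing = [r for r in rows if r["well_name_point_id"] not in wells]
    if missing:
        raise ValueError("water level references a well that was not persisted")
    return len(rows)


def upload_well_inventory(rows: list[dict]) -> int:
    wells = persist_wells(rows)  # wells must be persisted first...
    levels = [r for r in rows if r.get("water_level_date_time")]
    return bulk_upload_water_levels(levels, wells)  # ...then water levels
```

This ordering also frames the open question above: if a well persists but its optional water level fails validation, the upload has to decide whether to roll back the well or keep it and report the error.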

jacob-a-brown and others added 30 commits December 22, 2025 11:38
This was deprecated a while ago but still existed in the lexicon and parameter files.
The only valid sample matrix for water level observations is "water"
The bulk upload is used for both the API and CLI, so it should
be tested by itself. Then the API and CLI tests can just call the bulk upload
function and verify that they run as expected
All water levels must be associated with an existing contact in the DB,
so the contact fixture is used instead of hardcoding a name.
Hanging data prevents tests from running cleanly.
This large refactor updates the water level CSV bulk upload process in
the following ways:

- It uses the updated schema/headers for the CSV
- It uses Pydantic to do all of the data validations that don't pertain to the DB
- It updates tests to reflect these changes
- It tests the services function since that logic is used by both the API and CLI
The logic to test if the bulk upload function works correctly is already covered in its own unit tests. The CLI test was redundant and has been removed to streamline the test suite.
This commit adds a unit test to verify that the water level csv upload
service correctly handles the scenario where the specified source file
does not exist. The test checks that the appropriate error message is
returned in the payload when a FileNotFoundError is raised.
instead of just the first one. This allows users to see all issues with a row at once instead of having to fix them one by one.
the model needs to be validated and created before any comparisons
can be made against the database
…n value

This reflects the updates to the feature test
It is now called water_level_date_time instead of measurement_date_time
The required/optional givens are common between well inventory
and water level CSVs, so this commit moves them to common.py to avoid
duplication and make them reusable across different feature files.
The previous implementation was for an outdated version of the water level
csv handling. This commit updates the test to align with the new
implementation, ensuring that the CLI commands are tested correctly with
the current water level csv processing logic.
Both water-level-csv and well-inventory-csv features have the same step
to convert datetime fields from naive to aware and confirm the correct
timezone offset. This commit moves that step implementation to common.py
to avoid code duplication and ensure consistent testing of datetime
handling across both features.
This commit adds a feature test for invalid "measuring_person" values in
the water level csv upload. It also uses standard verbiage for the feature
test so that the same step can be reused.
Contributor

Copilot AI left a comment


Pull request overview

This PR implements the water level CSV upload functionality with proper Pydantic-based validation and database persistence, while refactoring common test steps to enable code reuse between water-level-csv and well-inventory-csv feature files.

Changes:

  • Refactored water level CSV upload to use Pydantic schemas for validation and structured data handling
  • Moved common test steps to tests/features/common.py for reuse across feature files
  • Updated field mappings to align with database schema (e.g., measurement_date_time renamed to water_level_date_time, field staff handling)

Reviewed changes

Copilot reviewed 16 out of 16 changed files in this pull request and generated 3 comments.

Show a summary per file
File Description
services/water_level_csv.py Core implementation refactored to use Pydantic schemas and proper database field mappings
schemas/water_level_csv.py New Pydantic schema definitions for CSV rows and bulk upload responses
tests/services/test_water_level_csv.py New comprehensive unit tests for bulk upload function
tests/test_observation.py Updated API test to use fixture-based test data and verify all created database entities
tests/test_cli_commands.py Updated CLI tests with detailed verification of database insertions
tests/features/steps/common.py New common test steps for CSV field validation and datetime handling
tests/features/steps/water-levels-csv.py Updated to use Pydantic schema and shared common steps
tests/features/steps/well-inventory-csv.py Removed duplicate steps now available in common.py
tests/conftest.py New fixture providing test data for water level bulk uploads
tests/features/environment.py Updated contact creation to support multiple named contacts
api/observation.py Updated to return proper Pydantic payload and 201 status code
core/parameter.json Corrected matrix value from "groundwater" to "water"

Comment on lines +1183 to +1201
@pytest.fixture()
def water_level_bulk_upload_data(water_well_thing, contact, second_contact):
data = {
"well_name_point_id": water_well_thing.name,
"field_event_date_time": "2025-02-15T08:00:00",
"field_staff": contact.name,
"field_staff_2": second_contact.name,
"field_staff_3": "",
"water_level_date_time": "2025-02-15T10:30:00",
"measuring_person": contact.name,
"sample_method": "Electric tape measurement (E-probe)",
"mp_height": "1.5",
"level_status": "Water level not affected",
"depth_to_water_ft": "7",
"data_quality": "Water level accurate to within two hundreths of a foot",
"water_level_notes": "Initial measurement",
}
yield data
del data

Copilot AI Feb 5, 2026


Corrected spelling of 'hundreths' to 'hundredths' in data_quality field.

Contributor Author


@jirhiker and @ksmuczynski

This typo exists in LU_DataQuality. Should it be fixed here or at a later time?

@jacob-a-brown
Contributor Author

@codex review

@chatgpt-codex-connector

To use Codex here, create a Codex account and connect to github.

@jacob-a-brown
Contributor Author

@codex review

@chatgpt-codex-connector

To use Codex here, create a Codex account and connect to github.

response_model=WaterLevelBulkUploadResponse,
status_code=HTTP_200_OK,
response_model=WaterLevelBulkUploadPayload,
status_code=HTTP_201_CREATED,
Contributor


😄

Contributor Author


The naming can change as we see fit 😅. I tried to mimic the nomenclature from when I refactored/updated the water level csv implementation. I think originally we were planning to have it work as an API endpoint, but now it's being implemented by the CLI, whose "response" we need to define manually.

context.datetime_fields.append("date_time")


# TODO: implement when optional water levels are added to the well inventory csv testing data
Contributor


Why is this test here before it's implemented? It's kinda silly to have a test with just pass.

Contributor Author


Good question! Since it is in the feature file, if I don't implement the step, a warning/error will be flagged and the tests will not pass.

As I wrote at the end of the PR description, this is the next and final step in getting the csv uploads fully implemented.

field_staff,well_name_point_id,field_event_date_time,measurement_date_time,sampler,sample_method,mp_height,level_status,depth_to_water_ft,data_quality,water_level_notes
Alice Lopez,AR0001,2025-02-15T08:00:00-07:00,2025-02-15T10:30:00-07:00,Groundwater Team,electric tape,1.5,stable,45.2,approved,Initial measurement after irrigation shutdown
Bernardo Chen,AR0002,2025-03-05T09:15:00-07:00,2025-03-05T11:10:00-07:00,Consultant,steel tape,1.8,rising,47.0,provisional,Follow-up visit; pump was off for 24h
Alice Lopez,AR0001,2025-02-15T08:00:00,2025-02-15T10:30:00,Groundwater Team,electric tape,1.5,stable,45.2,approved,Initial measurement after irrigation shutdown
Contributor


So, the CSVs that we use cannot have the UTC offset, or this feature will not work?

Contributor Author


The intention is for users to submit timezone-naive datetimes, and then we add the offset manually, either MST or MDT (assuming the data is submitted in Mountain Time):

  1. And "water_level_date_time" values are valid ISO 8601 timezone-naive datetime strings (e.g. "2025-02-15T10:30:00")
  2. Then all datetime objects are assigned the correct Mountain Time timezone offset based on the date value.
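A minimal sketch of that offset assignment, assuming the zoneinfo America/Denver zone (which picks MST or MDT from the date automatically); the helper name is hypothetical:

```python
# Hypothetical helper: attach the Mountain Time offset (MST or MDT,
# depending on the date) to a timezone-naive datetime.
from datetime import datetime
from zoneinfo import ZoneInfo

MOUNTAIN = ZoneInfo("America/Denver")


def localize(naive: datetime) -> datetime:
    if naive.tzinfo is not None:
        raise ValueError("expected a timezone-naive datetime")
    return naive.replace(tzinfo=MOUNTAIN)
```

So "2025-02-15T10:30:00" becomes -07:00 (MST), while a July date would get -06:00 (MDT).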

