Parse multi cuvette data by danolson1 · Pull Request #9 · dd-hebert/uv_pro

danolson1 · 2025-12-10T16:12:35Z

I made some edits to the import_kd.py file to enable it to parse data about the cuvette ID (called SAMPLES_CELL_1, 2, 3, etc. in the .KD file). I also included an example .KD file with 3 cuvettes.

I tried to be as conservative as possible to avoid disrupting any downstream components. I added samples_cell as a property of the KDFile object and didn't change the exported spectra dataframe, although in the future it might be useful to include the samples_cell info in that dataframe.

This is a .KD file with data for 3 different cuvettes

Added samples_cell_header to identify SAMPLES_CELL_x text for decoding data from multi-cuvette samples.

Added handle_samples and parse_samples functions

Initial implementation of samples_cell. Before debugging.

I tested it on both multi-cuvette and single-cuvette files and I don't get any errors.

Summary

Added multi-cuvette support by creating a new samples_cell attribute:

1. Added samples_cell_header class attribute (lines 58-61):
   - Header: RegName in UTF-16-LE
   - Spacing: 18 bytes from header to first cell name
2. Added _handle_samples_cell method (lines 162-183):
   - Finds the RegName header once
   - Iterates through sequential 30-byte entries (2-byte prefix + 28-byte cell name)
   - Stops when it encounters a non-SAMPLES_CELL string
   - Returns a pd.Series with cell identifiers
3. Added _parse_samples_cell method (lines 217-223):
   - Reads a fixed 28-byte UTF-16-LE encoded cell name
   - Returns None on decode errors
4. Updated parse_kd to return samples_cell and updated the __init__ assignment

Results:
- Multi-cuvette file: Returns 357 entries with SAMPLES_CELL_1, SAMPLES_CELL_2, SAMPLES_CELL_3 (119 each)
- Single-cuvette files: Returns all entries as SAMPLES_CELL_1
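The byte layout described above can be sketched roughly as follows. This is not the code from import_kd.py: the names read_samples_cell and parse_cell_name are hypothetical, the sketch returns a plain list rather than a pd.Series to stay dependency-free, and it assumes the 18-byte spacing is measured from the start of the RegName header, which the summary does not specify.

```python
HEADER = "RegName".encode("utf-16-le")  # 14 bytes
SPACING = 18      # ASSUMPTION: counted from the start of the header match
NAME_BYTES = 28   # "SAMPLES_CELL_1" is 14 characters at 2 bytes each
ENTRY_BYTES = 30  # 2-byte prefix + 28-byte cell name


def parse_cell_name(chunk):
    """Decode one fixed-width cell name; return None if it cannot be decoded."""
    if len(chunk) < NAME_BYTES:
        return None
    try:
        return chunk.decode("utf-16-le")
    except UnicodeDecodeError:
        return None


def read_samples_cell(file_bytes):
    """Collect consecutive SAMPLES_CELL_x names that follow the header."""
    header_idx = file_bytes.find(HEADER)  # the header is located only once
    if header_idx == -1:
        return []

    pos = header_idx + SPACING
    names = []
    while True:
        name = parse_cell_name(file_bytes[pos:pos + NAME_BYTES])
        # Stop at the first entry that is not a SAMPLES_CELL string.
        if name is None or not name.startswith("SAMPLES_CELL"):
            break
        names.append(name)
        pos += ENTRY_BYTES  # step over this name and the next 2-byte prefix
    return names
```

Wrapping the returned list in pd.Series(names) would give the same kind of object the PR describes.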

I'm adding a file called "1229 PDC PYRUVATE 100MM-8KD", which I renamed to "multi_cuvette_test_data_corrupted.KD". It's an example of a file corruption where the final data point from the previously saved file gets prepended to this file. I think this kind of corruption can be detected and fixed relatively easily.

Setting up tests for fixing a bug with a corrupted .KD file

I fixed the bug by adding validation for time values in the KD file parser. The changes:

1. Added warnings import (import_kd.py:11) to issue warnings about corrupted files.
2. Added _validate_and_fix_data() method (import_kd.py:161-246) that:
   - Builds a working DataFrame by transposing the spectra and adding the sample cell column
   - Uses pandas groupby to process each cuvette's data separately
   - Detects non-increasing time values by finding "reset points" where time decreases
   - Marks all preceding timepoints with values >= the reset time as invalid
   - Issues two warnings: one about potential corruption and one about removed timepoints
   - Returns cleaned spectra, spectra_times, and samples_cell with corrupt data removed
3. Integrated validation into parse_kd() (import_kd.py:132-135) so it runs automatically when parsing any .KD file.
4. Created tests (tests/test_import_kd.py) to verify:
   - Valid files produce no corruption warnings
   - Corrupted files produce warnings and have bad data removed

The fix correctly identifies and removes the corrupt timepoint (730.3 seconds) from each of the 3 cuvettes in the corrupted test file.
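The reset-point check can be illustrated with a small standalone sketch. It is not the _validate_and_fix_data() method itself: the PR uses a pandas DataFrame and groupby, while this version swaps in plain lists and a dict to stay dependency-free, and the function names and warning texts are hypothetical. A reset point is taken here to be a strict decrease in time.

```python
import warnings
from collections import defaultdict


def find_invalid_indices(times):
    """Return the positions to drop from one cuvette's time series."""
    invalid = set()
    for i in range(1, len(times)):
        if times[i] < times[i - 1]:  # reset point: time went backwards
            reset_time = times[i]
            # Every earlier timepoint at or above the reset time is suspect.
            invalid.update(j for j in range(i) if times[j] >= reset_time)
    return invalid


def drop_corrupt_timepoints(times, cells):
    """Filter parallel lists of time values and cell labels, per cuvette."""
    rows_by_cell = defaultdict(list)
    for row, cell in enumerate(cells):
        rows_by_cell[cell].append(row)

    bad_rows = set()
    for rows in rows_by_cell.values():
        local = find_invalid_indices([times[r] for r in rows])
        bad_rows.update(rows[k] for k in local)

    if bad_rows:
        # Two warnings, mirroring the behaviour described in the PR.
        warnings.warn("Non-increasing time values found; file may be corrupted.")
        warnings.warn(f"Removed {len(bad_rows)} timepoint(s).")

    keep = [r for r in range(len(times)) if r not in bad_rows]
    return [times[r] for r in keep], [cells[r] for r in keep], sorted(bad_rows)


# Made-up data shaped like the corrupted file: one stale reading per cuvette.
times = [730.3, 730.3, 730.3, 0.0, 0.1, 0.2, 6.0, 6.1, 6.2]
cells = ["SAMPLES_CELL_1", "SAMPLES_CELL_2", "SAMPLES_CELL_3"] * 3
clean_times, clean_cells, removed = drop_corrupt_timepoints(times, cells)
```

With this input, the three leading 730.3 entries are the ones removed, one per cuvette.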

danolson1 · 2026-01-15T16:34:49Z

I found that some of my .KD files were corrupted, and made some additional edits to this branch to identify and fix problems associated with this.

danolson1 added 5 commits December 9, 2025 22:47

Create multi_cuvette_test_data.KD

d23d25a

This is a .KD file with data for 3 different cuvettes

Update import_kd.py

81eac13

Added samples_cell_header to identify SAMPLES_CELL_x text for decoding data from multi-cuvette samples.

Update import_kd.py

9e316c3

Added handle_samples and parse_samples functions

Initial implementation of samples_cell

6d4bca8

Initial implementation of samples_cell. Before debugging.

dd-hebert self-requested a review December 11, 2025 02:08

dd-hebert self-assigned this Dec 11, 2025

dd-hebert added the enhancement New feature or request label Dec 11, 2025

danolson1 added 3 commits January 11, 2026 22:19

Setting up tests

a9813eb

Setting up tests for fixing a bug with a corrupted .KD file

danolson1 wants to merge 8 commits into dd-hebert:main from danolson1:parse-multi-cuvette-data