Custom CSV handling, small improvements to types and enrichment example by DanDits · Pull Request #16 · DisyInformationssysteme/cadenza-analytics-python

DanDits · 2026-01-26T22:01:34Z

No description provided.

DanDits · 2026-01-26T22:24:27Z

Should something be mentioned in the CHANGELOG? If we merge this, the only user visible changes will be the slightly adjusted example, the "support" for python 3.11 and some type annotation improvements. The CSV changes are in that sense no new features or behavior changes, more fixes to achieve the expected behavior in various 'edge' cases.

buddemat · 2026-01-30T08:59:48Z

src/cadenzaanalytics/util/csv.py

+    lines.append(_format_row(columns_list, columns_list, None, None, None, None))
+
+    # Write data rows
+    for _, row in df.iterrows():


iterrows is very slow. It would probably be 5-10x faster to use itertuples or transform into a numpy array like so:

    values = df.to_numpy(dtype=object, na_value=None)
    for row in values:
        lines.append(_format_row(list(row), ...))
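For comparison, the itertuples variant could look roughly like the sketch below. `rows_as_lists` is a hypothetical helper, not a function from this PR, and the sample dataframe is made up. With `index=False, name=None`, `itertuples` yields plain tuples without building a Series per row, which is where `iterrows` spends most of its time.

```python
import pandas as pd


def rows_as_lists(df: pd.DataFrame) -> list[list[object]]:
    """Return the dataframe rows as plain Python lists, one list per row.

    Hypothetical helper for illustration only. itertuples keeps each
    column's own value type, whereas iterrows builds one Series per row
    and may upcast values to a common dtype.
    """
    return [list(row) for row in df.itertuples(index=False, name=None)]


# Made-up sample data, just to show the shape of the result.
df = pd.DataFrame({"id": [1, 2], "name": ["a", "b"]})
rows = rows_as_lists(df)
```

Each resulting list could then be passed to a row formatter such as `_format_row` in place of the `row` object produced by `iterrows`.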

buddemat · 2026-02-02T09:36:21Z

src/cadenzaanalytics/util/csv.py

+                # Quoted value - extract content (can contain newlines)
+                pos += 1
+                value = []
+                while pos < len(csv_data):


This looks at the whole payload character by character. For large data, this will be very slow.
We could use str.find() instead to find the next quote (should be implemented in C).

Something along the lines of

    while pos < len(csv_data):
        next_quote = csv_data.find('"', pos)
        value_parts.append(csv_data[pos:next_quote])
        pos = next_quote + 1
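A slightly fuller sketch of the str.find() idea, handling the unterminated case and embedded quotes. It assumes that quotes inside a value are escaped by doubling them (`""`), which this thread does not confirm for the Cadenza format; `read_quoted_value` is a hypothetical name, not the PR's function.

```python
def read_quoted_value(csv_data: str, pos: int) -> tuple[str, int]:
    """Read a quoted value whose opening quote is at csv_data[pos].

    Returns the unescaped content and the index just past the closing quote.
    Assumption: embedded quotes are escaped by doubling them ("").
    """
    if csv_data[pos] != '"':
        raise ValueError("expected an opening quote")
    pos += 1
    parts = []
    while True:
        # Jump straight to the next quote instead of stepping char by char.
        next_quote = csv_data.find('"', pos)
        if next_quote == -1:
            raise ValueError("unterminated quoted value")
        parts.append(csv_data[pos:next_quote])
        if csv_data.startswith('""', next_quote):
            # Doubled quote: keep one literal quote and continue scanning.
            parts.append('"')
            pos = next_quote + 2
        else:
            # Single quote: this closes the value.
            return "".join(parts), next_quote + 1


# Value with embedded quotes and a newline, followed by another field.
value, end = read_quoted_value('"a ""b""\nc";x', 0)
```

The loop only runs once per quote character, so the cost per value is dominated by the C-level `find` and slicing rather than by Python-level iteration.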

buddemat

I have 2 comments concerning performance, which I guess should be addressed. I have not looked at the tests.

From what I read in the channels, both @julianjanssen and @ArneBab see some issues with the "full custom csv import" approach. We might want to discuss this once more?

ArneBab · 2026-02-11T16:48:48Z

src/cadenzaanalytics/util/csv.py

+
+    all_rows =  _parse_csv_with_default_reader(csv_data) \
+        if sys.version_info >= (3, 13) \
+        else _parse_csv(csv_data)


Looks good.

ArneBab · 2026-02-11T16:49:14Z

src/cadenzaanalytics/util/csv.py

+
+    for row in reader:
+        all_rows.append(row)
+    return all_rows
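For context, a minimal sketch of what a reader-based parser along these lines might look like. The `;` delimiter and the name `parse_with_stdlib_reader` are assumptions for illustration; the actual dialect settings used in the PR are not shown in this hunk.

```python
import csv
import io


def parse_with_stdlib_reader(csv_data: str, delimiter: str = ";") -> list[list[str]]:
    """Parse CSV text into a list of rows using the standard csv.reader.

    Hypothetical helper. newline="" keeps line endings untouched so that
    newlines inside quoted values are preserved by the reader.
    """
    reader = csv.reader(
        io.StringIO(csv_data, newline=""),
        delimiter=delimiter,
        quotechar='"',
    )
    all_rows = []
    for row in reader:
        all_rows.append(row)
    return all_rows


# A header row plus one data row containing a quoted newline and a doubled quote.
parsed = parse_with_stdlib_reader('a;b\r\n"x\ny";"q""r"\r\n')
```

Note that with default settings the reader returns every field as a string and does not distinguish an empty quoted value from a missing one, so any such distinction would have to come from dialect options or post-processing.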


feat: improve type annotations for parameter values and mappings

b21062f

DanDits requested a review from buddemat January 26, 2026 22:01

DanDits self-assigned this Jan 26, 2026

DanDits force-pushed the slb/dd branch from b2de9e0 to ef4871d Compare January 26, 2026 22:14

dittmar added 3 commits January 26, 2026 23:18

fix example enrichment extension to actually enrich some values, not …

d2eff39

…only empty strings, by working on the input dataframe and not an empty dataframe

CADENZA-42792 feat: add custom csv parser handling all special defini…

1a08ee0

…tions and to allow working around issues with pandas csv parsing and writing

decrease required python version to 3.11

6e64ade

DanDits force-pushed the slb/dd branch from ef4871d to 6e64ade Compare January 26, 2026 22:19

add changelog entries

4e405cf

buddemat reviewed Jan 30, 2026

buddemat reviewed Feb 2, 2026

buddemat requested changes Feb 2, 2026

dittmar added 7 commits February 5, 2026 09:38

adapt csv parsing to work with pandas 3.0.0

71b694c

add test case with confirmed output from cadenza concerning newline h…

b3c0dd3

…andling

for python versions after 3.12 prefer to use standard csv.reader inst…

397bbfb

…ead of custom reader

extract method for parsing CSV with two different implementations dep…

9fb7c82

…ending on version

change behavior to normalize ZONED_DATA_TIME values read from cadenza…

af0f916

… to utc to have a stable output, always have pandas Timestamps and prevent issues with mixed timezone offsets

add cadenzaAnalyticsVersion to capabilities and discovery responses

06465c1

remove obsolete comment that python versions below 3.12 are not suppo…

aaec5b1

…rted and attempt to lower required version to 3.10

DanDits force-pushed the slb/dd branch from e5c90cd to aaec5b1 Compare February 5, 2026 10:13

dittmar added 2 commits February 5, 2026 13:26

add timezone information to the AnalyticsRequest to get some context …

80d6d09

…about the cadenza server timezone, use the python server timezone as fallback

improve performance writing cadenza csv by using itertuples

55f4fef

ArneBab reviewed Feb 11, 2026

DanDits wants to merge 14 commits into main from slb/dd

3 participants
