[PLU-121]: fix Teradata connector case sensitivity for Enterprise databases by simoncoombes · Pull Request #651 · Unstructured-IO/unstructured-ingest

simoncoombes · 2026-02-25T21:16:18Z

Summary

Enterprise Teradata uppercases all column names in cursor.description, while ClearScape Analytics preserves the original case
This causes a KeyError in the downloader when it tries to look up the user-configured id_column (lowercase) against uppercase DataFrame columns
Normalize column names to lowercase in TeradataDownloader.query_db (line 157) and TeradataUploader.get_table_columns (line 205)

Root Cause

Standard Enterprise Teradata folds unquoted identifiers to uppercase per the ANSI SQL spec. The existing connector was only tested against ClearScape Analytics (Teradata's developer edition), which preserves case as-typed. When a customer pointed the connector at Enterprise Teradata, cursor.description returned ID instead of id, causing the downstream result.iloc[0][id_column] lookup to fail with KeyError: 'id'.
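The failure mode can be reproduced without a database. The sketch below is illustrative only: a plain dict stands in for the first DataFrame row, and the column names and values are made up rather than taken from the customer's table.

```python
# Column names as Enterprise Teradata is described as reporting them.
enterprise_columns = ["ID", "TEXT"]
row_values = (1, "hello")

# Hypothetical user configuration: id_column is written in lowercase.
id_column = "id"

# A dict stands in for the first row of the DataFrame built from the cursor.
row = dict(zip(enterprise_columns, row_values))

try:
    row[id_column]
    lookup_error = None
except KeyError as exc:
    # The lowercase key is missing because only "ID" exists.
    lookup_error = str(exc)

print(lookup_error)
```

The same lookup succeeds once the column names are lowercased before the row is built.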

Changes

TeradataDownloader.query_db: columns = [col[0].lower() for col in cursor.description]
TeradataUploader.get_table_columns: self._columns = [desc[0].lower() for desc in cursor.description]
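The normalization can be tried against any DB-API 2.0 cursor. This is a minimal sketch, assuming sqlite3 as a stand-in for the Teradata driver; the helper name lowercase_columns and the query are hypothetical and not part of the connector.

```python
import sqlite3


def lowercase_columns(cursor):
    """Return the column names of the last executed query, lowercased."""
    # DB-API 2.0: each entry of cursor.description is a sequence whose
    # first item is the column name.
    return [col[0].lower() for col in cursor.description]


# sqlite3 stands in for the Teradata driver; the uppercase aliases imitate
# the names Enterprise Teradata is described as returning.
conn = sqlite3.connect(":memory:")
cursor = conn.cursor()
cursor.execute("SELECT 1 AS ID, 'hello' AS TEXT_COL")

columns = lowercase_columns(cursor)
rows = [dict(zip(columns, values)) for values in cursor.fetchall()]
conn.close()

# A lowercase id_column now resolves regardless of the reported case.
print(rows[0]["id"])
```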

Test plan

Verified against patched image (teradata-case-fix) on SND with ClearScape source — no regression
Customer to verify against Enterprise Teradata once deployed
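Until the Enterprise verification is done, both behaviours could be covered locally with a fake cursor. The following is a sketch under stated assumptions: fake_cursor and normalized_columns are hypothetical helpers, and the column names are invented; it does not exercise the real TeradataDownloader or TeradataUploader.

```python
from types import SimpleNamespace


def normalized_columns(cursor):
    # Same expression as the fix: lowercase the name in each description entry.
    return [col[0].lower() for col in cursor.description]


def fake_cursor(*names):
    # DB-API descriptions are 7-item sequences; only the name matters here.
    description = [(name, None, None, None, None, None, None) for name in names]
    return SimpleNamespace(description=description)


# Uppercase names, as reported for Enterprise Teradata.
enterprise = fake_cursor("ID", "TEXT")
# Names preserved as typed, as reported for ClearScape Analytics.
clearscape = fake_cursor("id", "text")

assert normalized_columns(enterprise) == ["id", "text"]
assert normalized_columns(clearscape) == ["id", "text"]
```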

…abases Enterprise Teradata uppercases column names in cursor.description, causing KeyError when the downloader and uploader try to match them against user-provided lowercase config values. Normalize column names to lowercase in TeradataDownloader.query_db and TeradataUploader.get_table_columns.

potter-potter · 2026-02-25T23:34:56Z

@claude please review

claude · 2026-02-25T23:35:15Z

Claude finished @potter-potter's task

Code Review Complete ✅

Reviewing PR for Teradata connector case sensitivity fix.

Todo List:

Read repository setup instructions (CLAUDE.md) - No CLAUDE.md file found
Examine the changed Teradata connector file
Analyze the specific code changes
Review for correctness and potential issues
Provide comprehensive review feedback

Review Summary

Overall Assessment: ✅ APPROVED

This is a well-targeted fix for a real compatibility issue between Enterprise Teradata and ClearScape Analytics environments. The changes are minimal, focused, and address the root cause effectively.

Detailed Review

✅ Root Cause Analysis

The PR correctly identifies the issue: Enterprise Teradata follows ANSI SQL spec by folding unquoted identifiers to uppercase, while ClearScape Analytics preserves case. This causes KeyError when looking up lowercase id_column against uppercase DataFrame columns.

✅ Solution Approach

Normalizing column names to lowercase in both the downloader and uploader is the right approach:

Line 157 (TeradataDownloader.query_db): columns = [col[0].lower() for col in cursor.description]
Line 205 (TeradataUploader.get_table_columns): self._columns = [desc[0].lower() for desc in cursor.description]

✅ Code Quality

Changes are minimal and surgical
Consistent approach across both downloader and uploader
Maintains existing code structure and patterns
No performance impact (simple string operation)

✅ Backward Compatibility

Safe change: converting already-lowercase column names to lowercase is idempotent
Works for both Enterprise Teradata (uppercase) and ClearScape Analytics (case-preserved)
No breaking changes to existing functionality

✅ Error Handling

The fix prevents the KeyError: 'id' issue described in the PR by ensuring consistent lowercase column naming throughout the pipeline.

Minor Observations

Consistency: Both changes apply the same normalization, lowercasing the first element of each cursor.description entry (col[0].lower() in the downloader, desc[0].lower() in the uploader) - good consistency.
Edge Cases: The solution handles the common case well. If there were ever case-sensitive column requirements in the future, this could be made configurable, but that's not needed for this fix.
Testing: The PR mentions testing against ClearScape (no regression) with customer verification pending for Enterprise Teradata - appropriate testing approach.
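If case-sensitive column names were ever required, as the Edge Cases note suggests, the normalization could sit behind a switch. This is only a sketch: the lowercase parameter and the column_names helper are hypothetical and are not options the connector exposes.

```python
def column_names(description, lowercase=True):
    """Extract column names from a DB-API cursor.description.

    `lowercase` is a hypothetical switch, not an option of the connector.
    """
    names = [col[0] for col in description]
    return [name.lower() for name in names] if lowercase else names


# Made-up description entries; only the first item (the name) is used.
description = [("ID", None), ("Text", None)]

default_names = column_names(description)
preserved_names = column_names(description, lowercase=False)
```

Defaulting to lowercase keeps the behaviour of this fix while leaving room for an opt-out.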

Recommendations

No changes required - this is a solid, minimal fix that directly addresses the issue without unnecessary complexity.

simoncoombes requested a review from a team as a code owner February 25, 2026 21:16

simoncoombes temporarily deployed to ci February 25, 2026 21:16 — with GitHub Actions Inactive

bump version to 1.4.6 and update changelog

74afc25

simoncoombes temporarily deployed to ci February 25, 2026 23:40 — with GitHub Actions Inactive

simoncoombes temporarily deployed to ci February 25, 2026 23:41 — with GitHub Actions Inactive

simoncoombes wants to merge 2 commits into main from scoombes/PLU-121-teradata-case-sensitivity
