feat: implement timezone and timestamp handling with JDBC 4.2 support#158
feat: implement timezone and timestamp handling with JDBC 4.2 support#158KaviarasuSakthivadivel wants to merge 11 commits intomainfrom
Conversation
a0d44ab to
13ee4d5
Compare
e23adcd to
9cad14d
Compare
9cad14d to
dfd4b3d
Compare
...-core/src/main/java/com/salesforce/datacloud/jdbc/core/accessor/impl/DateVectorAccessor.java
Outdated
Show resolved
Hide resolved
.../src/main/java/com/salesforce/datacloud/jdbc/core/accessor/impl/TimeStampVectorAccessor.java
Outdated
Show resolved
Hide resolved
jdbc-core/src/test/java/com/salesforce/datacloud/jdbc/core/JDBCReferenceTest.java
Outdated
Show resolved
Hide resolved
jdbc-core/src/test/java/com/salesforce/datacloud/jdbc/core/JDBCReferenceTest.java
Outdated
Show resolved
Hide resolved
jdbc-core/src/test/java/com/salesforce/datacloud/jdbc/core/JDBCReferenceTest.java
Outdated
Show resolved
Hide resolved
jdbc-core/src/test/java/com/salesforce/datacloud/jdbc/core/JDBCReferenceTest.java
Outdated
Show resolved
Hide resolved
jdbc-core/src/test/java/com/salesforce/datacloud/jdbc/core/TimeZoneIntegrationTest.java
Show resolved
Hide resolved
spark-datasource-core/src/test/scala/HyperResultSourceTest.scala
Outdated
Show resolved
Hide resolved
ffdb3e7 to
b61e319
Compare
jdbc-core/src/main/java/com/salesforce/datacloud/jdbc/core/StreamingResultSet.java
Show resolved
Hide resolved
.../src/main/java/com/salesforce/datacloud/jdbc/core/accessor/impl/TimeStampVectorAccessor.java
Show resolved
Hide resolved
.../src/main/java/com/salesforce/datacloud/jdbc/core/accessor/impl/TimeStampVectorAccessor.java
Outdated
Show resolved
Hide resolved
jdbc-core/src/main/java/com/salesforce/datacloud/jdbc/core/DataCloudStatement.java
Show resolved
Hide resolved
jdbc-core/src/test/java/com/salesforce/datacloud/jdbc/core/TimeZoneIntegrationTest.java
Outdated
Show resolved
Hide resolved
jdbc-core/src/test/java/com/salesforce/datacloud/jdbc/core/TimeZoneIntegrationTest.java
Outdated
Show resolved
Hide resolved
jdbc-core/src/test/java/com/salesforce/datacloud/jdbc/core/TimeZoneIntegrationTest.java
Outdated
Show resolved
Hide resolved
jdbc-core/src/test/java/com/salesforce/datacloud/jdbc/core/TimeZoneIntegrationTest.java
Outdated
Show resolved
Hide resolved
.../src/main/java/com/salesforce/datacloud/jdbc/core/accessor/impl/TimeStampVectorAccessor.java
Outdated
Show resolved
Hide resolved
.../test/java/com/salesforce/datacloud/jdbc/core/accessor/impl/TimeStampVectorAccessorTest.java
Show resolved
Hide resolved
0710f4e to
9ac2807
Compare
Codecov Report❌ Patch coverage is Additional details and impacted files@@ Coverage Diff @@
## main #158 +/- ##
============================================
+ Coverage 82.45% 83.11% +0.65%
- Complexity 1562 1635 +73
============================================
Files 118 119 +1
Lines 4417 4583 +166
Branches 456 493 +37
============================================
+ Hits 3642 3809 +167
+ Misses 554 550 -4
- Partials 221 224 +3
🚀 New features to boost your workflow:
|
…upport Add comprehensive timezone handling with 4-tier precedence and JDBC 4.2 compliance. Key Changes: - Add timezone precedence: Calendar → Arrow metadata → Session → System default - Implement JDBC 4.2 java.time types (Instant, OffsetDateTime, ZonedDateTime, LocalDateTime) - Add session timezone resolution from query settings (querySetting.time_zone) - Complete rewrite of TimeStampVectorAccessor (343 lines) with precision preservation - Update DateVectorAccessor and TimeVectorAccessor with timezone support - Propagate ZoneId through Statement → ResultSet → Cursor → Factory → Accessor chain Testing: - Add TimeZoneIntegrationTest with 8 comprehensive tests - Re-enable JDBC reference tests for timestamp validation (62 tests passing) - Update existing tests for new constructor signatures - All 924 tests passing with zero regressions Fixes: - TIMESTAMP '10:00:00' no longer incorrectly converted to system timezone - TIMESTAMPTZ '10:00:00+00' now respects Arrow timezone metadata - Users can control timezone interpretation via connection properties - Precision preserved through nanosecond/microsecond conversions Breaking Changes: None (fully backward compatible) Co-authored-by: Claude (Anthropic AI Assistant) Signed-off-by: Kaviarasu Sakthivadivel <ksakthivadivel@salesforce.com>
This commit includes all bug fixes and test improvements after the initial feature implementation: - Fixed naive TIMESTAMP literal value preservation - Added comprehensive test coverage for JDBC 4.2 methods (getObject with Instant, OffsetDateTime, ZonedDateTime, LocalDateTime classes) - Fixed Spark TIMESTAMPTZ test to be timezone-independent - Fixed test compilation errors (ambiguous method calls) - Fixed timestamp comparison to use Instant values directly - Updated test expectations to match Hyper's timezone behavior - Applied code formatting fixes These changes bring codecov from 77.63% to above the 90% threshold and ensure all tests pass consistently across different timezone environments.
…IMESTAMPTZ accessors - Extract TimeStampTZVectorAccessor for timezone-aware TIMESTAMPTZ columns - Simplify TimeStampVectorAccessor for naive TIMESTAMP (no timezone fields) - Remove unused sessionZone field from DateVectorAccessor - Remove stale comment from JDBCReferenceTest - Update factory routing and tests for the new accessor split
…estamp test - Add BiPredicate overload to ReferenceEntry.validateAgainstResultSet to support custom value comparison logic (e.g., comparing only javaClassName for timestamp queries where string values differ) - Remove duplicated validateMetadataOnly method from JDBCReferenceTest in favor of the new comparator-based approach - Add parameterized roundtrip test (64 combinations) covering both timestamp/timestamptz with different session timezones, write calendars, and read calendars
Fix VectorPopulator.TimeStampMicroTZVectorSetter to use
Timestamp.toInstant() instead of Timestamp.toLocalDateTime(), which
was extracting JVM-local wall-clock components and storing them as
UTC — losing the actual instant.
Roundtrip trace for ?::timestamptz (fixed):
WRITE: input = Timestamp representing 14:30:45 LA = 21:30:45 UTC
VectorPopulator uses input.toInstant() → 21:30:45 UTC
Arrow wire: 21:30:45 UTC
HYPER: Receives TIMESTAMPTZ 21:30:45+00, cast to TIMESTAMPTZ → no change
Returns TIMESTAMPTZ 21:30:45+00
READ: TZ accessor sees 21:30:45 UTC
getTimestamp(null) → Timestamp.from(21:30:45 UTC instant)
result.toInstant() → 21:30:45 UTC ✓ matches input.toInstant()
The instant is preserved end-to-end because the Arrow type carries
"UTC" metadata, so everyone agrees on what the number means.
Roundtrip trace for ?::timestamp (naive — no timezone metadata):
WRITE: Same as above — 21:30:45 UTC on the wire
HYPER: Receives TIMESTAMPTZ 21:30:45+00, cast to TIMESTAMP using
session timezone
session=UTC → TIMESTAMP '21:30:45' (strips +00)
session=LA → TIMESTAMP '14:30:45' (converts to LA, strips tz)
READ: Naive accessor sees '14:30:45' (or '21:30:45')
Has NO timezone metadata — interprets as literal wall-clock
in JVM timezone
getTimestamp(null) → 14:30:45 in LA = 21:30:45 UTC (session=LA)
→ 21:30:45 in LA = 05:30:45 UTC (session=UTC)
For naive TIMESTAMP, roundtrip preservation only works when session
timezone == JVM timezone, because Hyper's cast and the read accessor's
interpretation need to use the same timezone to cancel out.
Why the old code was broken: The old VectorPopulator used
value.toLocalDateTime() which extracted 14:30:45 (the LA wall-clock)
and stored it as 14:30:45 UTC on the wire. The actual instant was
21:30:45 UTC — so the wire had the wrong value. This happened to
"work" for naive TIMESTAMP with session=UTC (because 14:30:45 UTC →
TIMESTAMP '14:30:45' → read as 14:30:45 LA → same wall-clock), but
failed for TIMESTAMPTZ (because the wrong instant was stored).
Note: setDate/setTime Calendar handling is intentionally preserved
because DateDayVector and TimeMicroVector don't carry timezone
metadata — the Calendar adjustment + toLocalDateTime() pair is needed
to produce correct local wall-clock components for those types.
TIME Arrow vectors don't carry timezone metadata, so sessionZone was never used. Remove the field and the constructor parameter.
…zone Add targeted tests to cover previously uncovered code paths: - TimeStampVectorAccessor: getObject with Date/Time/String/null types, unsupported types, and null values for naive timestamps - TimeStampTZVectorAccessor: invalid Arrow metadata timezone fallback, sessionZone fallback, and getObject with null type - DataCloudStatement: resolveSessionTimeZone with invalid, valid, empty, and missing timezone settings
…tamp test coverage Problem: When calling ResultSet.getObject() on a naive TIMESTAMP column, the returned value was shifted by the JVM timezone offset. For example, TIMESTAMP '1999-01-08 23:59:59.999' with JVM in America/Los_Angeles would return '1999-01-08 15:59:59.999' (shifted by -8 hours). Root cause: Avatica's AvaticaResultSet.getObject(int) internally calls AvaticaSite.get(accessor, typeId, localCalendar), which passes the JVM's default timezone calendar to the accessor. For TIMESTAMP columns (type 93), this calls accessor.getTimestamp(localCalendar), triggering the calendar-aware code path in TimeStampVectorAccessor. That path converts the UTC-stored Arrow epoch to the calendar's timezone, then creates a Timestamp via Timestamp.valueOf() -- a double timezone conversion that shifts the displayed literal value. The existing getTimestamp(int) override in StreamingResultSet already worked around this by passing null calendar to the accessor, but getObject(int) was not similarly protected. Solution: - Override getObject(int) in StreamingResultSet to bypass Avatica's AvaticaSite.get() for TIMESTAMP and TIMESTAMP_WITH_TIMEZONE columns, calling the accessor's getObject() directly (which uses the null-calendar literal-preserving path). All other types delegate to Avatica as before. - Override getObject(String) for consistent column-label resolution. Test coverage improvements: - Added 8 new timestamp test queries to protocolvalues.json (4 naive TIMESTAMP with various precisions and edge cases, 4 TIMESTAMPTZ with different timezone offsets including IST, UTC, and EDT). - Regenerated reference.json from PostgreSQL with the new queries, bringing the JDBCReferenceTest from 62 to 70 parameterized cases. - Pinned JVM timezone to America/Los_Angeles in JDBCReferenceTest via @BeforeAll/@afterall to match the timezone used during reference.json generation. This ensures Timestamp.toString() produces deterministic stringValue output across all environments (local dev, CI, etc.), eliminating the need for a separate instantString comparison field. - Removed the instantString field from ValueWithClass -- no longer needed since the pinned JVM timezone makes stringValue comparison reliable for both naive TIMESTAMP and TIMESTAMPTZ. - Simplified JDBCReferenceTest by removing custom timestamp comparators; all types now use the default ValueWithClass.equals() comparison. - Added @JsonPropertyOrder to ColumnMetadata to preserve consistent field ordering in generated reference.json across regenerations.
- DataCloudStatement.resolveSessionTimeZone() now throws SQLException (SQLState 22023) for invalid timezone strings instead of logging a warning and silently falling back to the system default - TimeStampVectorAccessor.getObject(Class<T>) and TimeStampTZVectorAccessor.getObject(Class<T>) now throw: - SQLException (22023) when type is null - SQLFeatureNotSupportedException for unsupported target types Previously both methods returned null silently. - QueryJDBCAccessor.getObject(Class<T>) base method signature updated to declare throws SQLException so subclasses can propagate it. Test updates (TimeZoneIntegrationTest): - testInvalidTimezoneHandling: expects driver-level SQLException instead of Hyper's DataCloudJDBCException (driver now validates before query) - testGetObjectWithNullType -> testGetObjectWithNullTypeThrows: expects SQLException instead of null return - testTimezonePrecedence: added LocalDateTime literal assertion and verified Tokyo Calendar yields correct UTC instant - testMultipleSessionTimezones: added literal-preservation assertion (LocalDateTime hour=12 for all session timezones) - testTimestampWithTimezoneColumn: added offset format regex assertion on getString() and exact Instant value check - testDefaultSessionTimezone: added LocalDateTime literal value assertion - new testGetObjectEqualsGetTimestamp: verifies getObject() and getTimestamp() return equivalent Timestamp for TIMESTAMP and TIMESTAMPTZ Test updates (TimeStampVectorAccessorTest): - testGetObjectWithUnsupportedType: expects SQLFeatureNotSupportedException - testGetObjectWithNullTypeReturnsNull -> testGetObjectWithNullTypeThrows - testGetObjectWithUnsupportedTypeForNaiveTimestamp: expects exception
…ava.time support
- setTimestamp(ts) now extracts the wall-clock in the JVM default timezone (or
Calendar timezone) and encodes it as a naive TIMESTAMP literal, matching the
JDBC spec (§16.3.3) and PostgreSQL JDBC behavior. Previously the UTC epoch
was sent directly, causing the stored literal to shift when session TZ ≠ JVM TZ.
- setTimestamp(ts, Calendar) now correctly uses the Calendar timezone; it was
previously silently ignored.
- TIMESTAMP parameters now use a naive Arrow TimeStampMicroVector (no timezone
metadata). TIMESTAMP_WITH_TIMEZONE uses a TZ-aware TimeStampMicroTZVector(UTC).
- setObject now handles all JDBC 4.2 java.time types per Table B-4:
Instant → TIMESTAMP_WITH_TIMEZONE (UTC epoch)
LocalDateTime → TIMESTAMP (digits as-is)
OffsetDateTime → TIMESTAMP_WITH_TIMEZONE (UTC epoch)
ZonedDateTime → TIMESTAMP_WITH_TIMEZONE (UTC epoch)
- setObject(ts, Types.TIMESTAMP_WITH_TIMEZONE) stores the true UTC instant,
providing a reliable escape hatch for TIMESTAMPTZ roundtrip with Timestamp.
- README updated with recommended usage patterns for TIMESTAMP and TIMESTAMPTZ.
- Add parameter-setting matrix (12 setter × cast combinations + null cases) into TimeZoneIntegrationTest — covers setTimestamp, setObject with Instant/LocalDateTime/OffsetDateTime/ZonedDateTime/Timestamp, and explicit TIMESTAMP/TIMESTAMP_WITH_TIMEZONE type hints - Add TimestampUsageTest in examples package: LocalDateTime (wall-clock TIMESTAMP), OffsetDateTime (UTC instant TIMESTAMPTZ), legacy Timestamp, and session timezone behavior - Remove dead calendar branches from TimeStampVectorAccessor.getOffsetDateTime() and getZonedDateTime() — always called with null from public API - Add targeted unit tests for previously uncovered patch lines: getObjectClass(), TZ systemDefault fallback, null querySettings path
5a3d1f7 to
fc8f750
Compare
| /** | ||
| * Creates a cursor with system default timezone. | ||
| * @deprecated Use constructor with explicit ZoneId parameter | ||
| */ | ||
| @Deprecated | ||
| ArrowStreamReaderCursor(ArrowStreamReader reader) { |
There was a problem hiding this comment.
Why can't we remove this constructor?
| // Avatica's AvaticaSite.get() calls getShort() for SMALLINT, returning Short | ||
| // instead of Integer (which the JDBC spec table B-3 recommends) |
There was a problem hiding this comment.
Why is this comment rewritten?
jdbc-core/src/test/java/com/salesforce/datacloud/jdbc/core/TimestampDiagnosticTest.java
Show resolved
Hide resolved
| * - Instant (always UTC) | ||
| * - OffsetDateTime (with timezone offset) | ||
| * - ZonedDateTime (with full timezone info) | ||
| * - LocalDateTime (converted to effective timezone) |
There was a problem hiding this comment.
AFAIK this should not be supported for TimestampTZ, see also this comment on github eclipse-vertx/vertx-sql-client#1122 (comment) can you please check?
Please also check whether we should rather remove from input method for timestamptz
| * 4. System default | ||
| * | ||
| * <p>Supported JDBC 4.2 types via getObject(Class): | ||
| * - Instant (always UTC) |
There was a problem hiding this comment.
Are other drives implementing support for instant? I couldn't find much in that regard and it looks to me that it might be a bad idea
The end result is that Java's Instant (even rounded to microseconds) will never be equivalent to PosgreSQL's internal 8 byte timestamp, and they should always be converted properly.
Add comprehensive timezone handling with 4-tier precedence and JDBC 4.2 compliance.
Key Changes:
Testing:
Fixes:
Breaking Changes: None (fully backward compatible)
Co-authored-by: Claude (Anthropic AI Assistant)