[SNOW-3249917] JDBC removal Step 8d: Replicate storage client implementations #1124

Open
sfc-gh-ggeng wants to merge 12 commits into jdbc-removal-step8c-swap-all-imports from jdbc-removal-step8d-storage-clients
sfc-gh-ggeng (Contributor) commented Mar 26, 2026

Summary

Verbatim replication of the full JDBC storage client stack (~7,000 lines total).

Storage client implementations

  • SnowflakeS3Client (1043 lines) — S3 upload/download with encryption
  • SnowflakeAzureClient (1058 lines) — Azure Blob upload/download
  • SnowflakeGCSClient (1283 lines) — GCS upload/download with presigned URLs

Factory and helpers

  • StorageClientFactory (241 lines) — creates cloud-specific clients
  • S3HttpUtil (132 lines) — S3 proxy configuration
  • S3ObjectMetadata (67 lines), S3StorageObjectMetadata (80 lines) — S3 metadata
  • CommonObjectMetadata (84 lines) — Azure/GCS metadata
  • QueryIdHelper (11 lines) — encryption query ID helper
  • FileCompressionType (47 lines) — compression type enum
  • HttpHeadersCustomizer (47 lines) — HTTP header customization interface
  • HeaderCustomizerHttpRequestInterceptor (159 lines) — HTTP request interceptor
  • GCSAccessStrategy (50 lines), GCSDefaultAccessStrategy (250 lines), GCSAccessStrategyAwsSdk (346 lines)

uploadWithoutConnection in SnowflakeFileTransferAgent

  • uploadWithoutConnection(SnowflakeFileTransferConfig) — main upload method
  • pushFileToRemoteStore, pushFileToRemoteStoreWithPresignedUrl
  • computeDigest, compressStreamWithGZIP, compressStreamWithGZIPNoDigest
  • InputStreamWithMetadata, remoteLocation, extractLocationAndPath, MAX_BUFFER_SIZE
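
For reviewers unfamiliar with the helper names above, here is a minimal sketch of a compress-then-digest pass over an upload stream. It is not the replicated code: the class name `GzipDigestSketch`, the `Result` holder, the in-memory sink, and the choice of SHA-256 with Base64 encoding are all assumptions made for illustration.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;
import java.security.DigestOutputStream;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.util.Base64;
import java.util.zip.GZIPOutputStream;

// Hypothetical sketch; names and algorithm choices are assumptions, not the PR's code.
final class GzipDigestSketch {

  /** Hypothetical holder: the compressed bytes plus a Base64 digest of those bytes. */
  static final class Result {
    final byte[] compressed;
    final String digest;

    Result(byte[] compressed, String digest) {
      this.compressed = compressed;
      this.digest = digest;
    }
  }

  /** GZIP-compresses the input and digests the compressed bytes in the same pass. */
  static Result compressWithDigest(InputStream in)
      throws IOException, NoSuchAlgorithmException {
    MessageDigest md = MessageDigest.getInstance("SHA-256");
    ByteArrayOutputStream sink = new ByteArrayOutputStream();
    // The digest stream sits below the GZIP stream, so it sees compressed bytes.
    try (DigestOutputStream digestOut = new DigestOutputStream(sink, md);
        GZIPOutputStream gzip = new GZIPOutputStream(digestOut)) {
      byte[] buf = new byte[8192];
      int n;
      while ((n = in.read(buf)) != -1) {
        gzip.write(buf, 0, n);
      }
    }
    // Closing the GZIP stream has flushed the trailer, so the digest is complete here.
    return new Result(sink.toByteArray(), Base64.getEncoder().encodeToString(md.digest()));
  }

  public static void main(String[] args) throws Exception {
    byte[] data = "hello".getBytes(StandardCharsets.UTF_8);
    Result r = compressWithDigest(new ByteArrayInputStream(data));
    System.out.println(r.compressed.length + " compressed bytes, digest " + r.digest);
  }
}
```

Digesting the compressed bytes rather than the raw input means the digest describes what is actually uploaded; whether the replicated helpers make the same choice should be checked against the source.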

StorageClientUtil additions

  • createCaseInsensitiveMap(Header[]) — overload for Apache HTTP headers
  • convertSystemPropertyToBooleanValue — boolean system property reader
  • throwJCEMissingError — fixed to match JDBC's exact signatures (2-arg + 3-arg with queryId)
  • throwNoSpaceLeftError — fixed to match JDBC's exact signatures (3-arg + 4-arg with session + queryId)
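
As a rough illustration of the first two additions, the sketch below shows one way a case-insensitive header map and a boolean system property reader can be written. It is an assumption-laden sketch, not the replicated code: `HeaderStub` is a hypothetical stand-in for Apache's `Header` so the example needs no external dependency, and the `defaultValue` parameter is assumed.

```java
import java.util.Map;
import java.util.TreeMap;

// Hypothetical sketch; not the replicated StorageClientUtil.
final class StorageClientUtilSketch {

  /** Hypothetical stand-in for org.apache.http.Header (name/value pair). */
  static final class HeaderStub {
    final String name;
    final String value;

    HeaderStub(String name, String value) {
      this.name = name;
      this.value = value;
    }
  }

  /** Builds a map whose key lookups ignore case, as HTTP header names require. */
  static Map<String, String> createCaseInsensitiveMap(HeaderStub[] headers) {
    Map<String, String> map = new TreeMap<>(String.CASE_INSENSITIVE_ORDER);
    if (headers != null) {
      for (HeaderStub h : headers) {
        map.put(h.name, h.value);
      }
    }
    return map;
  }

  /** Reads a system property as a boolean, falling back to the default when unset. */
  static boolean convertSystemPropertyToBooleanValue(String property, boolean defaultValue) {
    String value = System.getProperty(property);
    return value != null ? Boolean.parseBoolean(value) : defaultValue;
  }

  public static void main(String[] args) {
    HeaderStub[] headers = {new HeaderStub("Content-Type", "application/json")};
    System.out.println(createCaseInsensitiveMap(headers).get("content-type"));
    System.out.println(convertSystemPropertyToBooleanValue("sketch.unset.property", true));
  }
}
```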

Stacked on #1123.

Replication Verification Diff Report

All classes are line-by-line replications of JDBC v3.25.1 sources. Common permitted mechanical differences across all files:

  • Package declaration changed to net.snowflake.ingest.streaming.internal.fileTransferAgent
  • SFLogger/SFLoggerFactory → ingest's replicated versions
  • @SnowflakeJdbcInternalApi / @SnowflakeOrgInternalApi removed
  • Same-package types (ErrorCode, SqlState, SnowflakeSQLException, SnowflakeSQLLoggedException, MatDesc, FileBackedOutputStream, StorageObjectMetadata, StageInfo, RemoteStoreFileEncryptionMaterial, EncryptionProvider, GcmEncryptionProvider, etc.) — imports removed (same package)
  • SnowflakeUtil static methods → StorageClientUtil equivalents
  • SFPair/Stopwatch → net.snowflake.ingest.utils versions
  • SFSession/SFBaseSession/SFSessionProperty kept from JDBC temporarily (always null from callers)

SnowflakeS3Client

JDBC source : net/snowflake/client/jdbc/cloud/storage/SnowflakeS3Client.java @ v3.25.1

Permitted differences (mechanical):

  • SFSSLConnectionSocketFactory → IngestSSLConnectionSocketFactory
  • Constants.CLOUD_STORAGE_CREDENTIALS_EXPIRED → ErrorCode.CLOUD_STORAGE_CREDENTIALS_EXPIRED
  • SnowflakeFileTransferAgent.throwJCEMissingError → StorageClientUtil.throwJCEMissingError
  • SnowflakeFileTransferAgent.throwNoSpaceLeftError → StorageClientUtil.throwNoSpaceLeftError
  • HttpUtil.isSocksProxyDisabled() → ingest's HttpUtil.isSocksProxyDisabled()
  • net.snowflake.client.jdbc.SnowflakeSQLException added to throws clauses (session.getHttpClientKey() throws it)

Unexpected differences:

  • SnowflakeUtil.assureOnlyUserAccessibleFilePermissions kept from JDBC (download path only, no ingest equivalent)

SnowflakeAzureClient

JDBC source : net/snowflake/client/jdbc/cloud/storage/SnowflakeAzureClient.java @ v3.25.1

Permitted differences (mechanical):

  • Same as S3 (SnowflakeUtil → StorageClientUtil, SFLogger swap, etc.)
  • HttpUtil.setProxyForAzure/setSessionlessProxyForAzure kept from JDBC
  • net.snowflake.client.jdbc.SnowflakeSQLException added to throws clauses

Unexpected differences:

  • SnowflakeUtil.assureOnlyUserAccessibleFilePermissions kept from JDBC (download path only)

SnowflakeGCSClient

JDBC source : net/snowflake/client/jdbc/cloud/storage/SnowflakeGCSClient.java @ v3.25.1

Permitted differences (mechanical):

  • Same as S3/Azure
  • Uses JDBC's HttpClientSettingsKey throughout (matching JDBC verbatim)
  • Uses JDBC's HttpUtil, RestRequest, ExecTimeTelemetryData, HttpResponseContextDto for presigned URL upload
  • net.snowflake.client.jdbc.SnowflakeSQLException added to throws clauses

Unexpected differences:

  • SnowflakeUtil.assureOnlyUserAccessibleFilePermissions kept from JDBC (download path only)

StorageClientFactory

JDBC source : net/snowflake/client/jdbc/cloud/storage/StorageClientFactory.java @ v3.25.1

Permitted differences (mechanical):

  • HttpUtil.isSocksProxyDisabled() → ingest's HttpUtil
  • net.snowflake.client.jdbc.SnowflakeSQLException added to throws clauses

Unexpected differences: NONE

S3HttpUtil

JDBC source : net/snowflake/client/jdbc/cloud/storage/S3HttpUtil.java @ v3.25.1

Permitted differences (mechanical):

  • setProxyForS3 takes JDBC's net.snowflake.client.core.HttpClientSettingsKey (matching JDBC — session.getHttpClientKey() returns this type)
  • SFLoggerUtil.isVariableProvided kept from JDBC

Unexpected differences: NONE

GCSAccessStrategy / GCSDefaultAccessStrategy / GCSAccessStrategyAwsSdk

JDBC sources : net/snowflake/client/jdbc/cloud/storage/GCSAccessStrategy*.java @ v3.25.1

Permitted differences (mechanical):

  • Same as S3/Azure
  • net.snowflake.client.jdbc.SnowflakeSQLException added to throws clauses

Unexpected differences: NONE

S3ObjectMetadata / S3StorageObjectMetadata / CommonObjectMetadata

JDBC sources : S3ObjectMetadata.java, S3StorageObjectMetadata.java, CommonObjectMetadata.java @ v3.25.1

Unexpected differences: NONE

HttpHeadersCustomizer / HeaderCustomizerHttpRequestInterceptor

JDBC sources : HttpHeadersCustomizer.java, HeaderCustomizerHttpRequestInterceptor.java @ v3.25.1

Unexpected differences:

  • AttributeEnhancingHttpRequestRetryHandler.EXECUTION_COUNT_ATTRIBUTE inlined as constant string

FileCompressionType / QueryIdHelper

JDBC sources : FileCompressionType.java (from snowflake-common), QueryIdHelper.java @ v3.25.1

Unexpected differences:

SnowflakeFileTransferAgent (modifications)

Added methods replicated from JDBC's SnowflakeFileTransferAgent.java @ v3.25.1:

Permitted differences (mechanical):

  • OCSPMode → ingest's version in uploadWithoutConnection
  • Uses JDBC's SnowflakeUtil.convertProxyPropertiesToHttpClientKey (returns JDBC's HttpClientSettingsKey)
  • Uses JDBC's HttpClientSettingsKey in pushFileToRemoteStoreWithPresignedUrl
  • getFileTransferMetadatas return type uses ingest's SnowflakeFileTransferMetadata
  • CommandType uses ingest's version (same package)

Unexpected differences:

StorageClientUtil (modifications)

Added methods replicated from JDBC's SnowflakeUtil:

  • createCaseInsensitiveMap(Header[]) — verbatim
  • convertSystemPropertyToBooleanValue — verbatim
  • throwJCEMissingError — now matches JDBC's exact signatures (2-arg deprecated + 3-arg with queryId)
  • throwNoSpaceLeftError — now matches JDBC's exact signatures (3-arg deprecated + 4-arg with session + queryId)
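
The deprecated-plus-main overload pairing described in the last two bullets can be sketched as follows. This is illustrative only: `SketchException` is a hypothetical unchecked stand-in for the exception the real helpers throw, the message text is invented, and forwarding `null` as the query ID from the deprecated overload is an assumption.

```java
// Hypothetical sketch of a deprecated overload delegating to the full-arity one.
final class ThrowOverloadSketch {

  /** Hypothetical stand-in for the exception type thrown by the real helpers. */
  static final class SketchException extends RuntimeException {
    final String queryId;

    SketchException(String message, Throwable cause, String queryId) {
      super(message, cause);
      this.queryId = queryId;
    }
  }

  /** Deprecated 2-arg form kept for older callers; forwards with no query ID. */
  @Deprecated
  static void throwJCEMissingError(String operation, Exception ex) {
    throwJCEMissingError(operation, ex, null);
  }

  /** Main 3-arg form: always throws, carrying the query ID for diagnostics. */
  static void throwJCEMissingError(String operation, Exception ex, String queryId) {
    throw new SketchException("JCE policy files missing during " + operation, ex, queryId);
  }

  public static void main(String[] args) {
    try {
      throwJCEMissingError("upload", new IllegalStateException("bad key length"), "query-123");
    } catch (SketchException e) {
      System.out.println(e.getMessage() + " (queryId=" + e.queryId + ")");
    }
  }
}
```

Keeping both arities means callers copied verbatim from JDBC compile unchanged, whichever overload they use.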

Test plan

  • mvn compiler:compile passes (0 errors)
  • mvn test-compile passes (0 errors)
  • ./format.sh passes
  • Full test suite

🤖 Generated with Claude Code

sfc-gh-ggeng requested review from a team as code owners March 26, 2026 21:04
sfc-gh-ggeng force-pushed the jdbc-removal-step8c-swap-all-imports branch from 9b988de to 9bed824 March 27, 2026 21:42
sfc-gh-ggeng force-pushed the jdbc-removal-step8d-storage-clients branch from 6e81a56 to 38b099c March 27, 2026 21:43
sfc-gh-ggeng and others added 12 commits March 28, 2026 20:41
Restructured remaining steps:
- Step 8c: helper classes + interface (PR #1123, done)
- Step 8d: storage client implementations (S3/Azure/GCS + factory +
  uploadWithoutConnection) — next PR
- Step 8e: swap ALL imports at once (after full stack replicated)
- Steps 9a-9c and 10 unchanged

Added progress summary table with all PRs.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
…metadata impls, helpers

Verbatim replication of remaining JDBC storage infrastructure:
- StorageClientFactory (241 lines) — creates cloud-specific clients
- S3ObjectMetadata (67 lines) — S3 StorageObjectMetadata impl
- CommonObjectMetadata (84 lines) — Azure/GCS StorageObjectMetadata impl
- FileCompressionType enum (47 lines) — from snowflake-common
- HttpHeadersCustomizer interface (47 lines) — HTTP header customization
- HeaderCustomizerHttpRequestInterceptor (159 lines) — HTTP interceptor

Also fixed SnowflakeFileTransferConfig: restored SFSession type
(was incorrectly replaced with Object).

StorageClientFactory references SnowflakeS3Client/SnowflakeAzureClient/
SnowflakeGCSClient which are not yet replicated — will compile once
the three client implementations are added.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
…clients + helpers

Verbatim replication of the three JDBC cloud storage client implementations:
- SnowflakeS3Client (1043 lines) — S3 upload/download with encryption
- SnowflakeAzureClient (1058 lines) — Azure Blob upload/download
- SnowflakeGCSClient (1283 lines) — GCS upload/download with presigned URLs

Plus additional helper classes discovered during replication:
- S3HttpUtil (132 lines) — S3 proxy configuration
- S3StorageObjectMetadata (80 lines) — S3 metadata wrapper
- QueryIdHelper (11 lines) — encryption material query ID helper

Does not yet compile — session-dependent code paths reference JDBC types
(SFSession.getHttpClientKey(), session.getHttpHeaderCustomizers(),
renewExpiredToken). These paths are never executed (session always null)
and will be resolved in a follow-up commit.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
- Add GCSAccessStrategy interface + GCSDefaultAccessStrategy +
  GCSAccessStrategyAwsSdk implementations (646 lines total)
- Add renewExpiredToken stub to SnowflakeFileTransferAgent (session-only,
  never called from streaming ingest)
- Add createCaseInsensitiveMap(Header[]) and
  convertSystemPropertyToBooleanValue to StorageClientUtil
- Fix S3HttpUtil.setProxyForS3 to accept JDBC's HttpClientSettingsKey
  (session.getHttpClientKey() returns JDBC type)
- Wrap session.getHttpClientKey() calls in try-catch for JDBC's
  SnowflakeSQLException (different class from ingest's)
- Fix HttpClientSettingsKey type bridge in GCS presigned URL upload
- Fix SqlState imports in GCS access strategies

All 30 compile errors resolved. Production and test code compile cleanly.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
…ated SnowflakeFileTransferAgent

Verbatim replication of upload methods from JDBC's SnowflakeFileTransferAgent:
- uploadWithoutConnection(SnowflakeFileTransferConfig) — main upload entry point
- pushFileToRemoteStore — S3/Azure upload via SnowflakeStorageClient
- pushFileToRemoteStoreWithPresignedUrl — GCS upload with presigned URLs
- computeDigest, compressStreamWithGZIP, compressStreamWithGZIPNoDigest
- InputStreamWithMetadata inner class
- remoteLocation inner class + extractLocationAndPath
- MAX_BUFFER_SIZE constant

Also fixed: removed stale JDBC CommandType bridge (now uses ingest's
CommandType directly since SnowflakeFileTransferMetadataV1 is ingest's),
fixed return type of getFileTransferMetadatas to use ingest's
SnowflakeFileTransferMetadata interface.

Full JDBC storage stack now compiles. Step 8d replication complete.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
…es exactly

The Step 7a replication incorrectly simplified these method signatures
by dropping parameters (queryId, session). This caused cascading
unexpected differences in all storage client replications.

StorageClientUtil now matches JDBC's SnowflakeFileTransferAgent exactly:
- throwJCEMissingError(String, Exception) — deprecated, delegates to 3-arg
- throwJCEMissingError(String, Exception, String queryId) — main impl
- throwNoSpaceLeftError(SFSession, String, Exception) — deprecated, delegates
- throwNoSpaceLeftError(SFSession, String, Exception, String queryId) — main

Updated all callers:
- Snowflake clients (S3/Azure/GCS): now pass (operation, ex, queryId) and
  (session, operation, ex, queryId) — matching JDBC verbatim
- Iceberg clients: pass null for session (no session available)

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
…auses

Instead of catching and wrapping JDBC's SnowflakeSQLException in
session-dependent code paths, add it to the throws clause — matching
JDBC's original behavior where session.getHttpClientKey() throws it.

Updated throws clauses in:
- SnowflakeS3Client: constructor + setupSnowflakeS3Client + renew
- SnowflakeAzureClient: createSnowflakeAzureClient + setupAzureClient + renew
- SnowflakeGCSClient: createSnowflakeGCSClient + setupGCSClient + renew
- GCSAccessStrategyAwsSdk: constructor
- StorageClientFactory: createClient + createS3Client + createAzureClient + createGCSClient
- SnowflakeStorageClient: renew interface method

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
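
The difference between the two approaches in the commit above can be sketched like this. All names are hypothetical: `JdbcSqlException` and `IngestSqlException` stand in for the two distinct `SnowflakeSQLException` classes, and `getHttpClientKey` stands in for the session call that throws the JDBC one.

```java
// Hypothetical sketch contrasting catch-and-wrap with declaring the exception.
final class ThrowsClauseSketch {

  /** Stand-in for JDBC's SnowflakeSQLException. */
  static final class JdbcSqlException extends Exception {
    JdbcSqlException(String message) {
      super(message);
    }
  }

  /** Stand-in for ingest's SnowflakeSQLException (a different class). */
  static final class IngestSqlException extends Exception {
    IngestSqlException(Throwable cause) {
      super(cause);
    }
  }

  /** Stand-in for a session call that throws the JDBC exception type. */
  static String getHttpClientKey(boolean fail) throws JdbcSqlException {
    if (fail) {
      throw new JdbcSqlException("no HTTP client key");
    }
    return "key";
  }

  /** Earlier approach: catch the JDBC exception and wrap it in the ingest type. */
  static String setupWrapped(boolean fail) throws IngestSqlException {
    try {
      return getHttpClientKey(fail);
    } catch (JdbcSqlException e) {
      throw new IngestSqlException(e);
    }
  }

  /** Approach in this commit: declare the JDBC exception and let it propagate. */
  static String setupPropagating(boolean fail) throws JdbcSqlException {
    return getHttpClientKey(fail);
  }

  public static void main(String[] args) throws Exception {
    System.out.println(setupWrapped(false));
    System.out.println(setupPropagating(false));
  }
}
```

Propagating keeps the method body identical to the original source, at the cost of a wider throws clause on every caller up the chain.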
The replicated Snowflake storage clients should use JDBC's
HttpClientSettingsKey (net.snowflake.client.core.HttpClientSettingsKey)
everywhere — matching JDBC verbatim. Ingest's HttpClientSettingsKey
is only for the Iceberg clients.

Changes:
- SnowflakeFileTransferAgent.uploadWithoutConnection: use JDBC's
  SnowflakeUtil.convertProxyPropertiesToHttpClientKey (returns JDBC type)
- SnowflakeStorageClient interface: uploadWithPresignedUrlWithoutConnection
  and upload take JDBC's HttpClientSettingsKey
- SnowflakeGCSClient: uploadWithPresignedUrl takes JDBC's type,
  removed null placeholder and HttpUtil.getHttpClient bridge —
  now matches JDBC verbatim: session.getHttpClientKey() passed directly
- pushFileToRemoteStoreWithPresignedUrl: takes JDBC's HttpClientSettingsKey

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
…ceptor directly

The replicated Snowflake stack should use JDBC's types for session-
dependent code, not ingest's. session.getHttpHeadersCustomizers()
returns JDBC's List<HttpHeadersCustomizer> — pass it directly to
JDBC's HeaderCustomizerHttpRequestInterceptor.

Removed the unchecked cast workaround in SnowflakeS3Client and
GCSAccessStrategyAwsSdk.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Now includes mimeSubTypes list, mimeSubTypeToCompressionMap, and
lookupByMimeSubType/lookupByFileExtension methods — matching the
full snowflake-common decompiled source.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
…PathFromCommand

Replaced stubs with verbatim copies from JDBC, using JDBC imports for
SFStatement, ExecTimeTelemetryData, SFException, SecretDetector, and
SnowflakeUtil.checkErrorAndThrowException.

Also restored pushFileToRemoteStore's requirePresignedUrl() block
and ArgSupplier lambda in debug log — both were previously removed.

No more stubs or omissions in the replicated SnowflakeFileTransferAgent.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
renewExpiredToken throws both ingest and JDBC SnowflakeSQLException.
Propagate the JDBC exception through all callers in the storage client
interface and implementations (S3, Azure, GCS).

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
sfc-gh-ggeng force-pushed the jdbc-removal-step8d-storage-clients branch from 4582b92 to b124d44 March 28, 2026 20:47
sfc-gh-ggeng force-pushed the jdbc-removal-step8c-swap-all-imports branch from 9bed824 to b7666c2 March 28, 2026 20:47