Initial connection strategy plugin does not handle green instances and times out #1781

@ehardy

Describe the bug

After a blue/green deployment, the primary host as returned by the information_schema.replica_host_status table contains a -green- suffix in the hostname. The initial connection strategy plugin, in its default configuration, attempts to connect to this host, but the actual DNS entry points to a host without this suffix. The plugin thus times out and throws the following exception:

Caused by: java.sql.SQLException: The Aurora Initial Connection Strategy Plugin attempted to connect but timed out after 5,000ms. Please ensure that your URL is correct, there are no network issues, and you are connecting to the correct role if 'verifyOpenedConnectionType' was set.
	at software.amazon.jdbc.plugin.AuroraInitialConnectionStrategyPlugin.connect(AuroraInitialConnectionStrategyPlugin.java:282)
	at software.amazon.jdbc.ConnectionPluginManager.lambda$connect$6(ConnectionPluginManager.java:356)
	at software.amazon.jdbc.ConnectionPluginManager.lambda$null$3(ConnectionPluginManager.java:251)
	at software.amazon.jdbc.ConnectionPluginManager.executeWithTelemetry(ConnectionPluginManager.java:215)
	at software.amazon.jdbc.ConnectionPluginManager.lambda$makePluginChainFunc$4(ConnectionPluginManager.java:250)
	at software.amazon.jdbc.ConnectionPluginManager.executeWithSubscribedPlugins(ConnectionPluginManager.java:200)
	at software.amazon.jdbc.ConnectionPluginManager.connect(ConnectionPluginManager.java:353)
	at software.amazon.jdbc.wrapper.ConnectionWrapper.init(ConnectionWrapper.java:130)
	at software.amazon.jdbc.wrapper.ConnectionWrapper.<init>(ConnectionWrapper.java:88)
	at software.amazon.jdbc.Driver.connect(Driver.java:266)
	at com.zaxxer.hikari.util.DriverDataSource.getConnection(DriverDataSource.java:144)
	at com.zaxxer.hikari.pool.PoolBase.newConnection(PoolBase.java:373)
	at com.zaxxer.hikari.pool.PoolBase.newPoolEntry(PoolBase.java:210)
	at com.zaxxer.hikari.pool.HikariPool.createPoolEntry(HikariPool.java:488)
	at com.zaxxer.hikari.pool.HikariPool$PoolEntryCreator.call(HikariPool.java:752)
	at com.zaxxer.hikari.pool.HikariPool$PoolEntryCreator.call(HikariPool.java:731)
	at java.base/java.util.concurrent.FutureTask.run(Unknown Source)
	at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(Unknown Source)
	at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(Unknown Source)

I have documented the issue in this discussion.

This is the host list we see in the information_schema.replica_host_status table after a blue/green deployment:

mysql> SELECT server_id, session_id FROM information_schema.replica_host_status;
+---------------------------------------------------+--------------------------------------+
| server_id                                         | session_id                           |
+---------------------------------------------------+--------------------------------------+
| aws-prod-shared-use1-aurora-rds-db01-green-tcivlm | MASTER_SESSION_ID                    |
| aws-prod-shared-use1-aurora-rds-db03              | 5fc61249-de64-46e2-ae8e-dc8a43771736 |
+---------------------------------------------------+--------------------------------------+
2 rows in set (0.05 sec)

RdsUtils already has methods for dealing with green instance names, so it looks to me like the initial connection strategy plugin should remove the green suffix before attempting to resolve the actual host name.

To work around the issue, we have configured endpointSubstitutionRole=none so that the plugin connects to the cluster endpoint.
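For reference, a minimal sketch of how that workaround could be passed as a connection property. The property name and value are the ones quoted above. The class name, the URL placeholders, and the use of plain java.util.Properties are assumptions for illustration, and a pooled setup such as HikariCP would set the same property through its own data source configuration.

```java
import java.util.Properties;

// Hypothetical helper for illustration only; not part of the wrapper.
public final class WorkaroundConfig {
  private WorkaroundConfig() {}

  // Builds connection properties carrying the workaround setting from this issue.
  public static Properties buildProperties() {
    Properties props = new Properties();
    // Property name and value as quoted in this issue.
    props.setProperty("endpointSubstitutionRole", "none");
    return props;
  }

  public static void main(String[] args) {
    // Placeholder URL: substitute the real cluster endpoint and database name.
    String url = "jdbc:aws-wrapper:mysql://<cluster-endpoint>:3306/<database>";
    Properties props = buildProperties();
    // These two values would then be handed to DriverManager.getConnection(url, props).
    System.out.println(url);
    System.out.println(props.getProperty("endpointSubstitutionRole"));
  }
}
```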

Expected Behavior

The initial connection strategy plugin handles blue/green deployments and connects to the actual Aurora host without timing out.

What plugins are used? What other connection properties were set?

The default plugins, which include the initial connection strategy plugin. No other connection properties were set.

Current Behavior

The initial connection strategy plugin does not remove the -green- suffix from the primary host found in the information_schema.replica_host_status table and thus attempts to connect to an invalid host. It times out after a few retry attempts.

Reproduction Steps

I believe the steps would look something like the following:

  1. Configure an AWS Aurora MySQL cluster
  2. Have an application that connects to it through the AWS Advanced JDBC Wrapper and the MySQL JDBC driver
  3. Go through a blue/green deployment
  4. The application should then experience the exceptions mentioned in the description

Possible Solution

Maybe the initial connection strategy plugin should strip the green suffix from the hostname before attempting to connect to it.
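To illustrate the idea, here is a minimal sketch of the suffix stripping. It is a standalone, hypothetical helper and not the wrapper's RdsUtils code. The pattern assumes the suffix is always -green- followed by six lowercase alphanumeric characters at the end of the name, which is inferred only from the single server_id shown above.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper for illustration only; not the wrapper's RdsUtils.
public final class GreenHostNames {
  // Assumption: the green suffix is "-green-" plus six lowercase
  // alphanumeric characters at the very end of the instance name.
  private static final Pattern GREEN_SUFFIX =
      Pattern.compile("^(.+)-green-[0-9a-z]{6}$");

  private GreenHostNames() {}

  // Returns the name without the green suffix, or the input unchanged
  // when it does not match the assumed pattern.
  public static String stripGreenSuffix(String instanceId) {
    Matcher m = GREEN_SUFFIX.matcher(instanceId);
    return m.matches() ? m.group(1) : instanceId;
  }

  public static void main(String[] args) {
    // prints "aws-prod-shared-use1-aurora-rds-db01"
    System.out.println(
        stripGreenSuffix("aws-prod-shared-use1-aurora-rds-db01-green-tcivlm"));
    // prints "aws-prod-shared-use1-aurora-rds-db03" (no suffix, unchanged)
    System.out.println(
        stripGreenSuffix("aws-prod-shared-use1-aurora-rds-db03"));
  }
}
```

The plugin would apply something like this to the server_id read from the topology table before building the instance endpoint it connects to.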

Additional Information/Context

No response

The AWS Advanced JDBC Wrapper version used

3.2.0

JDK version used

25

Operating System and version

kubernetes

Metadata

Labels: bug