This document provides detailed information about the PostgreSQL implementation of the upsert functionality in this library.
The PostgreSQL implementation uses the INSERT ... ON CONFLICT ... DO UPDATE syntax to perform upsert operations. This syntax, introduced in PostgreSQL 9.5, allows for inserting a new row if it doesn't exist, or updating an existing row if it does.
The PostgreSQL dialect is implemented in the PostgreSqlUpsertDialect class, which extends the UpsertDialect interface. The key components of this implementation include:
- SQL Generation: The
generateBatchUpsertSqlmethod generates the SQL statement for batch upsert operations. - Data Execution: The
upsertDatamethod executes the upsert operation and handles generated keys. - Generated Keys Handling: The
updateGeneratedKeysmethod updates entity objects with generated keys. - Type Conversion: The
convertToFieldTypemethod handles type conversions for generated keys. - Optimized Batch Support: The
supportsOptimizedBatchmethod indicates that PostgreSQL supports optimized batch operations.
The PostgreSQL upsert syntax follows this pattern:
INSERT INTO table_name (column1, column2, ...)
VALUES (value1, value2, ...)
ON CONFLICT (key_column1, key_column2, ...)
DO UPDATE SET
column1 = EXCLUDED.column1,
column2 = EXCLUDED.column2,
...This syntax allows for specifying exactly which columns should be used for conflict detection (the ON CONFLICT clause) and which columns should be updated when a conflict occurs (the DO UPDATE SET clause).
PostgreSQL allows for explicit conflict detection using the ON CONFLICT clause. This can be based on:
- Primary Key: The primary key of the table
- Unique Constraint: Any unique constraint on the table
- Unique Index: Any unique index on the table
This flexibility allows for more control over when an update should occur.
PostgreSQL provides the EXCLUDED table reference, which contains the values that would have been inserted if the conflict hadn't occurred. This makes it easy to reference the new values in the update clause:
DO UPDATE SET
column1 = EXCLUDED.column1,
column2 = EXCLUDED.column2The PostgreSQL implementation has sophisticated handling for entities with and without IDs:
- Entities with IDs: These are processed using the primary key for conflict detection.
- Entities without IDs: These are processed using unique constraints for conflict detection.
This approach ensures that entities are correctly upserted regardless of whether they have an ID.
PostgreSQL supports optimized batch operations, as indicated by the supportsOptimizedBatch method. This allows for more efficient processing of large batches of entities.
// Define your entity
data class User(
val id: Long? = null,
val username: String,
val email: String
)
// Define your repository
interface UserRepository : UpsertRepository<User, Long> {
// Standard Spring Data JPA methods
}
// Use the repository
val user = User(username = "john", email = "john@example.com")
userRepository.upsert(user)// Define your entities
val users = listOf(
User(username = "john", email = "john@example.com"),
User(username = "jane", email = "jane@example.com")
)
// Use the repository
userRepository.upsertAll(users)interface UserRepository : UpsertRepository<User, Long> {
// Upsert using username as the ON clause
fun upsertOnUsername(user: User): Int
}
// Use the repository
val user = User(username = "john", email = "updated-email@example.com")
userRepository.upsertOnUsername(user)interface UserRepository : UpsertRepository<User, Long> {
// Upsert using username as the ON clause and ignoring updatedAt field
fun upsertOnUsernameIgnoringUpdatedAt(user: User): Int
}
// Use the repository
val user = User(username = "john", email = "john@example.com", updatedAt = LocalDateTime.now())
userRepository.upsertOnUsernameIgnoringUpdatedAt(user)When you want to only insert new records and not update existing ones, you can use the IgnoringAllFields suffix in your method name:
interface UserRepository : UpsertRepository<User, Long> {
// Upsert using id as the ON clause and ignoring all fields
// This will only insert new rows and not update existing ones
fun upsertOnIdIgnoringAllFields(user: User): Int
}
// Use the repository
val user = User(id = 1, username = "john", email = "john@example.com")
userRepository.upsertOnIdIgnoringAllFields(user)In PostgreSQL, this is implemented using the DO NOTHING option, which simply ignores conflicts without performing any updates. This ensures that existing records are not modified while still allowing new records to be inserted.
interface UserRepository : UpsertRepository<User, Long> {
// Upsert using username and email as the ON clause
fun upsertOnUsernameAndEmail(user: User): Int
}
// Use the repository
val user = User(username = "john", email = "john@example.com")
userRepository.upsertOnUsernameAndEmail(user)PostgreSQL allows for conditional updates using the WHERE clause in the DO UPDATE SET part:
INSERT INTO users (username, email, last_login)
VALUES (:username, :email, :last_login)
ON CONFLICT (username) DO UPDATE SET
email = EXCLUDED.email,
last_login = EXCLUDED.last_login
WHERE users.last_login < EXCLUDED.last_loginThis example only updates the last_login field if the new value is more recent than the existing value.
PostgreSQL also supports the DO NOTHING option, which simply ignores conflicts without performing an update:
INSERT INTO users (username, email)
VALUES (:username, :email)
ON CONFLICT (username) DO NOTHINGThis can be useful when you want to insert records only if they don't already exist. This is how the IgnoringAllFields feature is implemented in PostgreSQL - when all update columns are ignored, the dialect automatically uses DO NOTHING instead of DO UPDATE SET.
-
PostgreSQL Version Requirement: The upsert functionality requires PostgreSQL 9.5 or later, as this is when the
ON CONFLICTsyntax was introduced. -
Unique Constraint Requirement: The PostgreSQL implementation relies on the presence of a unique or primary key constraint on the table for conflict detection. If no such constraint exists, the upsert operation will not work as expected.
-
Define Appropriate Constraints: Ensure that your tables have appropriate unique or primary key constraints for the columns you want to use in the ON CONFLICT clause.
-
Use Batch Operations: When upserting multiple entities, use the
upsertAllmethod or a custom batch upsert method to take advantage of the optimized batch operation support. -
Handle Generated Keys: If your entities have generated fields (such as auto-increment IDs), be aware that the implementation will attempt to update these fields after insertion.
-
Consider Conditional Updates: For more complex scenarios, consider using conditional updates to control when and how updates occur.