10 changes: 10 additions & 0 deletions CHANGELOG.md
@@ -5,6 +5,15 @@ All notable changes to this project will be documented in this file.
The format is based on [Keep a Changelog](https://keepachangelog.com/en/1.1.0/),
and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0.html).

## [1.1.0] - 2026-06-18

### Added

- `PgSqlCaller::BulkUpdate` now accepts an optional `returning:` keyword — pass one or more
column names to read them back from each updated row via SQL `RETURNING`. The result is one
`Symbol`-keyed, type-cast hash per updated row (`[]` when `attrs_list` is empty). Omitting
`returning:` keeps the existing behavior of returning the affected-row count.

## [1.0.0] - 2026-06-08

### Added
@@ -84,6 +93,7 @@ and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0
`transaction_open?`, `explain_analyze`, `typecast_array`, `sanitize_sql_array`, and
`current_database_name`.

[1.1.0]: https://github.com/didww/pg_sql_caller/compare/v1.0.0...v1.1.0
[1.0.0]: https://github.com/didww/pg_sql_caller/compare/v0.2.3...v1.0.0
[0.2.3]: https://github.com/didww/pg_sql_caller/compare/v0.2.2...v0.2.3
[0.2.2]: https://github.com/didww/pg_sql_caller/compare/v0.2.1...v0.2.2
18 changes: 16 additions & 2 deletions README.md
@@ -328,13 +328,27 @@ PgSqlCaller::BulkUpdate.call(Employee, attrs_list, unique_by: :employee_number)
PgSqlCaller::BulkUpdate.call(Employee, attrs_list, unique_by: %i[department_id name])
```

### Reading back updated rows

Pass `returning` to read columns back from each updated row (SQL `RETURNING`) instead of a row count. The result is one `Symbol`-keyed hash per updated row, with values cast to their Ruby types (the same casting as [`select_all_serialized`](#serialized-reads-ruby-type-casting)):

```ruby
PgSqlCaller::BulkUpdate.call(Employee, [
{ id: 1, name: 'John', department_id: 10 },
{ id: 2, name: 'Jane', department_id: 20 }
], returning: %i[id name])
# => [{ id: 1, name: 'John' }, { id: 2, name: 'Jane' }]
```

A single column may be passed as a `Symbol` (`returning: :id`). Without `returning` (the default), the call returns the affected-row **count**, exactly as before.
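
To make the effect of `returning` concrete, here is a minimal, database-free sketch of how the statement could grow a `RETURNING` clause. It is not the gem's implementation: `quote_ident` and `sketch_sql` are hypothetical helpers, and the exact `SET`, `unnest` and `WHERE` fragments are assumptions (the real code may, for example, cast each `?` placeholder to a typed array). Only the overall `UPDATE ... AS t ... FROM unnest(...) AS v(...) ... RETURNING t.col` shape follows the diff.

```ruby
# Hypothetical stand-in for the connection's identifier quoting.
def quote_ident(name)
  %("#{name.to_s.gsub('"', '""')}")
end

# Builds a statement with the same general shape as BulkUpdate#sql.
# The SET, unnest and WHERE fragments are assumptions for illustration.
def sketch_sql(table, columns, unique_by:, returning: nil)
  unique_by = Array(unique_by)
  value_cols = columns - unique_by

  set_clause  = value_cols.map { |c| "#{quote_ident(c)} = v.#{quote_ident(c)}" }.join(', ')
  unnest_args = columns.map { '?' }.join(', ') # one placeholder per column
  aliases     = columns.map { |c| quote_ident(c) }.join(', ')
  match       = unique_by.map { |c| "t.#{quote_ident(c)} = v.#{quote_ident(c)}" }.join(' AND ')

  statement = "UPDATE #{quote_ident(table)} AS t SET #{set_clause} " \
              "FROM unnest(#{unnest_args}) AS v(#{aliases}) WHERE #{match}"
  return statement if returning.nil?

  # Qualify with the target alias `t`: the source alias `v` exposes the same
  # column names, so an unqualified column would be ambiguous.
  projection = Array(returning).map { |c| "t.#{quote_ident(c)}" }.join(', ')
  "#{statement} RETURNING #{projection}"
end

puts sketch_sql(:employees, %i[id name department_id], unique_by: :id, returning: %i[id name])
```

Without `returning:` the sketch produces the plain `UPDATE`, which matches the row-count mode described above.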

### Rules and behavior

- **Every row must include each `unique_by` column**, and all hashes must share the same set of keys.
- Only the columns you list are written; `unique_by` columns are used for matching, the rest are updated. Columns you omit (e.g. `created_at`) are left untouched.
- Rows that don't match an existing row are simply not updated — this **never inserts**.
- Returns the number of rows affected (`0` when `attrs_list` is empty — a no-op).
- Raises `ArgumentError` (before touching the database) if a row omits a `unique_by` column or names a column that doesn't exist on the model.
- Returns the number of rows affected (`0` when `attrs_list` is empty — a no-op). With `returning`, it instead returns the updated rows as `Symbol`-keyed hashes (`[]` when `attrs_list` is empty).
- Raises `ArgumentError` (before touching the database) if a row omits a `unique_by` column, names a column that doesn't exist on the model, or `returning` is empty or names an unknown column.
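
The `returning` rules above can be sketched in isolation. This is a stand-alone approximation of the coercion and validation shown in the diff, not the gem's code: `normalize_returning` is a hypothetical name, and `column_names` stands in for `model_class.column_names` (assumed to be an `Array` of `String`s).

```ruby
# Mirrors the constructor's coercion: nil stays nil (row-count mode), anything
# else becomes an Array. Array(nil) would be [], so the nil check is what keeps
# "not given" distinct from "given but empty".
def normalize_returning(returning)
  returning.nil? ? nil : Array(returning)
end

# Stand-alone version of the validation: at least one column must be named,
# and every named column must be known.
def validate_returning!(returning, column_names)
  raise ArgumentError, 'returning must name at least one column' if returning.empty?

  unknown = returning.map(&:to_s) - column_names
  raise ArgumentError, "unknown returning columns: #{unknown.join(', ')}" if unknown.any?
end

columns = %w[id name department_id]
validate_returning!(normalize_returning(:id), columns) # passes silently
```

Because the check runs before any SQL is built, an invalid `returning` fails fast without touching the database.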

### Why not `upsert_all` or a loop of `update_all`?

70 changes: 58 additions & 12 deletions lib/pg_sql_caller/bulk_update.rb
@@ -42,37 +42,71 @@ class BulkUpdate
# `unique_by` column, and all hashes MUST share the same keys
# @param unique_by [Symbol, Array<Symbol>] the match column(s) — a single column,
# or all parts of a composite key (default +:id+)
# @return [Integer] the number of rows affected
def self.call(model_class, attrs_list, unique_by: :id)
new(model_class, attrs_list, unique_by: unique_by).call
# @param returning [Symbol, Array<Symbol>, nil] column(s) to read back from each
# updated row via SQL `RETURNING`; +nil+ (default) keeps the row-count behavior
# @return [Integer, Array<Hash{Symbol => Object}>] the number of rows affected, or —
# when +returning+ is given — the updated rows as type-cast, Symbol-keyed hashes
def self.call(model_class, attrs_list, unique_by: :id, returning: nil)
new(model_class, attrs_list, unique_by: unique_by, returning: returning).call
end

attr_reader :model_class, :unique_by, :attrs_list
attr_reader :model_class, :unique_by, :attrs_list, :returning

# @param model_class [Class<ActiveRecord::Base>] the model whose table is updated
# @param attrs_list [Array<Hash>] one hash per row; each MUST include every
# `unique_by` column, and all hashes MUST share the same keys
# @param unique_by [Symbol, Array<Symbol>] the match column(s) — a single column,
# or all parts of a composite key (default +:id+)
def initialize(model_class, attrs_list, unique_by: :id)
# @param returning [Symbol, Array<Symbol>, nil] column(s) to read back from each
# updated row via SQL `RETURNING`; +nil+ (default) keeps the row-count behavior
def initialize(model_class, attrs_list, unique_by: :id, returning: nil)
@model_class = model_class
@attrs_list = attrs_list
@unique_by = Array(unique_by)
@returning = returning.nil? ? nil : Array(returning)
end

# Execute the bulk update as a single `UPDATE ... FROM unnest(...)` statement.
#
# @return [Integer] the number of rows affected (0 when +attrs_list+ is empty)
# @raise [ArgumentError] if a row omits a `unique_by` column, or names a column
# that does not exist on the model
# @return [Integer, Array<Hash{Symbol => Object}>] without +returning+, the number of
# rows affected (0 when +attrs_list+ is empty); with +returning+, the updated rows as
# type-cast, Symbol-keyed hashes (+[]+ when +attrs_list+ is empty)
# @raise [ArgumentError] if a row omits a `unique_by` column, names a column that does
# not exist on the model, or +returning+ is empty or names an unknown column
def call
return 0 if attrs_list.empty?
validate_returning! unless returning.nil?
return empty_result if attrs_list.empty?

sql_caller.execute(sql, *bindings).cmd_tuples
if returning.nil?
sql_caller.execute(sql, *bindings).cmd_tuples
else
sql_caller.select_all_serialized(sql, *bindings)
end
end

private

# The value returned for an empty +attrs_list+: a zero row count, or an empty row set
# when +returning+ was requested — mirroring the shape {#call} returns when it runs.
#
# @return [Integer, Array]
def empty_result
returning.nil? ? 0 : []
end

# Validate the requested `RETURNING` columns before any SQL runs: at least one column
# must be named, and every column must exist on the model (each is qualified with the
# target alias `t`, so it must be a real column, never an expression).
#
# @return [void]
# @raise [ArgumentError] if +returning+ is empty or names a column unknown to the model
def validate_returning!
raise ArgumentError, 'returning must name at least one column' if returning.empty?

unknown = returning.map(&:to_s) - model_class.column_names
raise ArgumentError, "unknown #{model_class} returning columns: #{unknown.join(', ')}" if unknown.any?
end

# The SQL executor, built from the model's own connection: it sanitizes the bound
# values, runs the statement and encodes the typed PostgreSQL arrays.
#
@@ -123,16 +157,28 @@ def validate_columns!(cols)
end

# The full `UPDATE ... FROM unnest(...)` statement, with one `?` placeholder per
# column for the value arrays.
# column for the value arrays, plus a `RETURNING` clause when +returning+ was given.
#
# @return [String]
def sql
<<~SQL.squish
statement = <<~SQL.squish
UPDATE #{model_class.quoted_table_name} AS t
SET #{set_clause}
FROM unnest(#{unnest_args}) AS v(#{column_aliases})
WHERE #{match_clause}
SQL
return statement if returning.nil?

"#{statement} RETURNING #{returning_clause}"
end

# The `RETURNING t.col, ...` projection. Each column is qualified with the target
# alias `t` because the `unnest` source alias `v` shares the same column names, so an
# unqualified `RETURNING` would be ambiguous.
#
# @return [String]
def returning_clause
returning.map { |col| "t.#{quoted(col)}" }.join(', ')
end

# The `SET col = v.col, ...` assignments for the value columns.
2 changes: 1 addition & 1 deletion lib/pg_sql_caller/version.rb
@@ -1,5 +1,5 @@
# frozen_string_literal: true

module PgSqlCaller
VERSION = '1.0.0'
VERSION = '1.1.0'
end
86 changes: 86 additions & 0 deletions spec/pg_sql_caller/bulk_update_spec.rb
@@ -125,6 +125,92 @@
end
end

context 'with returning:' do
subject { described_class.call(Employee, attrs_list, returning: returning) }

let(:returning) { %i[id name department_id] }

it 'returns the updated rows as Symbol-keyed hashes of the listed columns', :aggregate_failures do
result = subject
expect(result).to contain_exactly(
{ id: first.id, name: 'John Updated', department_id: other_dep.id },
{ id: second.id, name: 'Jane Updated', department_id: other_dep.id }
)
end

it 'returns the new values, not the pre-update ones' do
expect(subject.map { |row| row[:name] }).to contain_exactly('John Updated', 'Jane Updated')
end

it 'returns only the listed columns' do
expect(subject.map(&:keys)).to all(eq(%i[id name department_id]))
end

context 'with a single column passed as a Symbol' do
let(:returning) { :id }

it 'coerces it to an Array and returns that one column' do
expect(subject).to contain_exactly({ id: first.id }, { id: second.id })
end
end

context 'with a datetime column' do
let(:created_at) { Time.now - 60 }
let(:attrs_list) { [{ id: first.id, created_at: created_at }] }
let(:returning) { %i[id created_at] }

it 'type-casts each returned value to its Ruby type', :aggregate_failures do
row = subject.first
expect(row[:created_at]).to be_a(Time)
expect(row[:created_at]).to be_within(1).of(created_at)
end
end

context 'with a composite unique_by' do
subject do
described_class.call(Employee, attrs_list, unique_by: %i[department_id name], returning: %i[id name])
end

let(:new_created_at) { Time.now - 100 }
let(:attrs_list) do
[
{ department_id: dep.id, name: 'John', created_at: new_created_at },
{ department_id: dep.id, name: 'Jane', created_at: new_created_at }
]
end

it 'returns a row per matched key, skipping non-matches' do
expect(subject).to contain_exactly({ id: first.id, name: 'John' }, { id: second.id, name: 'Jane' })
end
end

context 'when attrs_list is empty' do
let(:attrs_list) { [] }

it 'is a no-op returning an empty array' do
expect { expect(subject).to eq([]) }.not_to(change { first.reload.attributes })
end
end

context 'when returning names an unknown column' do
let(:returning) { %i[id bogus_column] }

it 'raises ArgumentError before touching the database', :aggregate_failures do
expect { subject }.to raise_error(ArgumentError, /unknown.*bogus_column/)
expect(first.reload.name).to eq('John')
end
end

context 'when returning is empty' do
let(:returning) { [] }

it 'raises ArgumentError', :aggregate_failures do
expect { subject }.to raise_error(ArgumentError, /at least one column/)
expect(first.reload.name).to eq('John')
end
end
end

# Excluded from the default suite (see filter_run_excluding :benchmark).
# Run with: bundle exec rspec spec/pg_sql_caller/bulk_update_spec.rb --tag benchmark
describe 'performance vs N update_all calls in a transaction', :benchmark do