From 397b1dd20c6c06f64e22504ed0d8defd387f6517 Mon Sep 17 00:00:00 2001 From: Florin Irion Date: Thu, 28 May 2026 09:19:41 +0200 Subject: [PATCH 1/2] docs(pgd): require RI FULL on TOAST-bearing tables Document the divergence risk when concurrent updates touch different TOAST columns on tables with REPLICA IDENTITY DEFAULT / USING INDEX, and recommend REPLICA IDENTITY FULL as the mitigation. Add the 1 GB total-TOAST limitation that RI FULL does not solve, so operators don't expect it to enable replication of arbitrarily large rows. Covers PGD 4.4, 5.6, 5.7, 5.8, 5.9, 6, 6.1, 6.2, 6.3, 6.4 with version-appropriate wording: PGD 6.x notes BDR_AUTO default with an override warning; PGD 5.x requires explicit opt-in; PGD 4.4 uses the legacy appusage.mdx structure; PGD 6.3 and 6.4 add a bullet to planning.mdx. BDR-6868 --- product_docs/docs/pgd/4.4/bdr/appusage.mdx | 30 ++++++++++++++ .../docs/pgd/5.6/appusage/behavior.mdx | 37 +++++++++++++++++ .../docs/pgd/5.7/appusage/behavior.mdx | 37 +++++++++++++++++ .../docs/pgd/5.8/appusage/behavior.mdx | 37 +++++++++++++++++ .../docs/pgd/5.9/appusage/behavior.mdx | 37 +++++++++++++++++ .../docs/pgd/6.1/appusage/behavior.mdx | 41 +++++++++++++++++++ .../docs/pgd/6.2/appusage/behavior.mdx | 41 +++++++++++++++++++ .../6.3/developing-applications/planning.mdx | 2 + .../6.4/developing-applications/planning.mdx | 2 + product_docs/docs/pgd/6/appusage/behavior.mdx | 41 +++++++++++++++++++ 10 files changed, 305 insertions(+) diff --git a/product_docs/docs/pgd/4.4/bdr/appusage.mdx b/product_docs/docs/pgd/4.4/bdr/appusage.mdx index c17e6f754e..8527821d7a 100644 --- a/product_docs/docs/pgd/4.4/bdr/appusage.mdx +++ b/product_docs/docs/pgd/4.4/bdr/appusage.mdx @@ -110,6 +110,36 @@ BDR handling of very long "toasted" data in PostgreSQL is transparent to the user. The TOAST "chunkid" values likely differ between the same row on different nodes, but that doesn't cause any problems. +!!! 
Warning REPLICA IDENTITY FULL required for tables with TOAST-able columns + Tables that hold TOAST-able column types (`text`, `bytea`, `jsonb`, large + `varchar`, and so on) must use `REPLICA IDENTITY FULL` to be replicated + safely under concurrent updates from multiple nodes. BDR 4 does not set + this by default — you need to enable it explicitly on every table that + can store TOAST'd values: + + ```sql + ALTER TABLE your_table REPLICA IDENTITY FULL; + ``` + + When an `UPDATE` doesn't change a TOAST column, the column is sent in the + replication stream as an *unchanged* marker rather than as full data. + Under `REPLICA IDENTITY FULL`, PostgreSQL flattens the entire pre-update + row (including its TOAST data) into WAL, so the receiver always has an + authoritative source for unchanged columns. Under `REPLICA IDENTITY + DEFAULT` (or `USING INDEX`), the receiver fills unchanged TOAST columns + from its local copy of the row, which may be stale if a concurrent + update on another node modified a different column of the same row — + resulting in data divergence between nodes. + + **Limitation:** `REPLICA IDENTITY FULL` does not enable replicating + rows whose total TOAST data exceeds approximately 1 GB. Updates on + such rows fail at the source with `invalid memory alloc request + size` because PostgreSQL must flatten the entire pre-update row in + memory before WAL logging, and the flattened tuple exceeds + `MaxAllocSize`. There is currently no replication-safe path for + rows of that size; avoid creating them. +!!! + BDR can't work correctly if Replica Identity columns are marked as external. PostgreSQL allows CHECK() constraints that contain volatile functions. 
Since diff --git a/product_docs/docs/pgd/5.6/appusage/behavior.mdx b/product_docs/docs/pgd/5.6/appusage/behavior.mdx index 451df72c85..85380e7a60 100644 --- a/product_docs/docs/pgd/5.6/appusage/behavior.mdx +++ b/product_docs/docs/pgd/5.6/appusage/behavior.mdx @@ -129,6 +129,43 @@ PGD handling of very long "toasted" data in PostgreSQL is transparent to the user. The TOAST "chunkid" values likely differ between the same row on different nodes, but that doesn't cause any problems. +!!! Warning REPLICA IDENTITY FULL required for tables with TOAST-able columns + Tables that hold TOAST-able column types (`text`, `bytea`, `jsonb`, large + `varchar`, and so on) must use `REPLICA IDENTITY FULL` to be replicated + safely under concurrent updates from multiple nodes. PGD 5 does not set + this by default — you need to enable it explicitly on every table that + can store TOAST'd values: + + ```sql + ALTER TABLE your_table REPLICA IDENTITY FULL; + ``` + + The reason: when an `UPDATE` doesn't change a TOAST column, the column is + sent in the replication stream as an *unchanged* marker rather than as + full data. Under `REPLICA IDENTITY FULL`, PostgreSQL flattens the entire + pre-update row (including its TOAST data) into WAL, so the receiver + always has an authoritative source for unchanged columns. Under + `REPLICA IDENTITY DEFAULT` (or `USING INDEX`), the receiver fills + unchanged TOAST columns from its local copy of the row, which may be + stale if a concurrent update on another node modified a different column + of the same row — resulting in data divergence between nodes. + + To check the current replica identity: + + ```sql + SELECT relname, relreplident FROM pg_class WHERE relname = 'your_table'; + -- 'd' = DEFAULT, 'n' = NOTHING, 'f' = FULL, 'i' = USING INDEX + ``` + + **Limitation:** `REPLICA IDENTITY FULL` does not enable replicating + rows whose total TOAST data exceeds approximately 1 GB. 
Updates on + such rows fail at the source with `invalid memory alloc request + size` because PostgreSQL must flatten the entire pre-update row in + memory before WAL logging, and the flattened tuple exceeds + `MaxAllocSize`. There is currently no replication-safe path for + rows of that size; avoid creating them. +!!! + ### Other restrictions PGD can't work correctly if Replica Identity columns are marked as external. diff --git a/product_docs/docs/pgd/5.7/appusage/behavior.mdx b/product_docs/docs/pgd/5.7/appusage/behavior.mdx index 451df72c85..85380e7a60 100644 --- a/product_docs/docs/pgd/5.7/appusage/behavior.mdx +++ b/product_docs/docs/pgd/5.7/appusage/behavior.mdx @@ -129,6 +129,43 @@ PGD handling of very long "toasted" data in PostgreSQL is transparent to the user. The TOAST "chunkid" values likely differ between the same row on different nodes, but that doesn't cause any problems. +!!! Warning REPLICA IDENTITY FULL required for tables with TOAST-able columns + Tables that hold TOAST-able column types (`text`, `bytea`, `jsonb`, large + `varchar`, and so on) must use `REPLICA IDENTITY FULL` to be replicated + safely under concurrent updates from multiple nodes. PGD 5 does not set + this by default — you need to enable it explicitly on every table that + can store TOAST'd values: + + ```sql + ALTER TABLE your_table REPLICA IDENTITY FULL; + ``` + + The reason: when an `UPDATE` doesn't change a TOAST column, the column is + sent in the replication stream as an *unchanged* marker rather than as + full data. Under `REPLICA IDENTITY FULL`, PostgreSQL flattens the entire + pre-update row (including its TOAST data) into WAL, so the receiver + always has an authoritative source for unchanged columns. 
Under + `REPLICA IDENTITY DEFAULT` (or `USING INDEX`), the receiver fills + unchanged TOAST columns from its local copy of the row, which may be + stale if a concurrent update on another node modified a different column + of the same row — resulting in data divergence between nodes. + + To check the current replica identity: + + ```sql + SELECT relname, relreplident FROM pg_class WHERE relname = 'your_table'; + -- 'd' = DEFAULT, 'n' = NOTHING, 'f' = FULL, 'i' = USING INDEX + ``` + + **Limitation:** `REPLICA IDENTITY FULL` does not enable replicating + rows whose total TOAST data exceeds approximately 1 GB. Updates on + such rows fail at the source with `invalid memory alloc request + size` because PostgreSQL must flatten the entire pre-update row in + memory before WAL logging, and the flattened tuple exceeds + `MaxAllocSize`. There is currently no replication-safe path for + rows of that size; avoid creating them. +!!! + ### Other restrictions PGD can't work correctly if Replica Identity columns are marked as external. diff --git a/product_docs/docs/pgd/5.8/appusage/behavior.mdx b/product_docs/docs/pgd/5.8/appusage/behavior.mdx index 451df72c85..85380e7a60 100644 --- a/product_docs/docs/pgd/5.8/appusage/behavior.mdx +++ b/product_docs/docs/pgd/5.8/appusage/behavior.mdx @@ -129,6 +129,43 @@ PGD handling of very long "toasted" data in PostgreSQL is transparent to the user. The TOAST "chunkid" values likely differ between the same row on different nodes, but that doesn't cause any problems. +!!! Warning REPLICA IDENTITY FULL required for tables with TOAST-able columns + Tables that hold TOAST-able column types (`text`, `bytea`, `jsonb`, large + `varchar`, and so on) must use `REPLICA IDENTITY FULL` to be replicated + safely under concurrent updates from multiple nodes. 
PGD 5 does not set + this by default — you need to enable it explicitly on every table that + can store TOAST'd values: + + ```sql + ALTER TABLE your_table REPLICA IDENTITY FULL; + ``` + + The reason: when an `UPDATE` doesn't change a TOAST column, the column is + sent in the replication stream as an *unchanged* marker rather than as + full data. Under `REPLICA IDENTITY FULL`, PostgreSQL flattens the entire + pre-update row (including its TOAST data) into WAL, so the receiver + always has an authoritative source for unchanged columns. Under + `REPLICA IDENTITY DEFAULT` (or `USING INDEX`), the receiver fills + unchanged TOAST columns from its local copy of the row, which may be + stale if a concurrent update on another node modified a different column + of the same row — resulting in data divergence between nodes. + + To check the current replica identity: + + ```sql + SELECT relname, relreplident FROM pg_class WHERE relname = 'your_table'; + -- 'd' = DEFAULT, 'n' = NOTHING, 'f' = FULL, 'i' = USING INDEX + ``` + + **Limitation:** `REPLICA IDENTITY FULL` does not enable replicating + rows whose total TOAST data exceeds approximately 1 GB. Updates on + such rows fail at the source with `invalid memory alloc request + size` because PostgreSQL must flatten the entire pre-update row in + memory before WAL logging, and the flattened tuple exceeds + `MaxAllocSize`. There is currently no replication-safe path for + rows of that size; avoid creating them. +!!! + ### Other restrictions PGD can't work correctly if Replica Identity columns are marked as external. diff --git a/product_docs/docs/pgd/5.9/appusage/behavior.mdx b/product_docs/docs/pgd/5.9/appusage/behavior.mdx index 451df72c85..85380e7a60 100644 --- a/product_docs/docs/pgd/5.9/appusage/behavior.mdx +++ b/product_docs/docs/pgd/5.9/appusage/behavior.mdx @@ -129,6 +129,43 @@ PGD handling of very long "toasted" data in PostgreSQL is transparent to the user. 
The TOAST "chunkid" values likely differ between the same row on different nodes, but that doesn't cause any problems. +!!! Warning REPLICA IDENTITY FULL required for tables with TOAST-able columns + Tables that hold TOAST-able column types (`text`, `bytea`, `jsonb`, large + `varchar`, and so on) must use `REPLICA IDENTITY FULL` to be replicated + safely under concurrent updates from multiple nodes. PGD 5 does not set + this by default — you need to enable it explicitly on every table that + can store TOAST'd values: + + ```sql + ALTER TABLE your_table REPLICA IDENTITY FULL; + ``` + + The reason: when an `UPDATE` doesn't change a TOAST column, the column is + sent in the replication stream as an *unchanged* marker rather than as + full data. Under `REPLICA IDENTITY FULL`, PostgreSQL flattens the entire + pre-update row (including its TOAST data) into WAL, so the receiver + always has an authoritative source for unchanged columns. Under + `REPLICA IDENTITY DEFAULT` (or `USING INDEX`), the receiver fills + unchanged TOAST columns from its local copy of the row, which may be + stale if a concurrent update on another node modified a different column + of the same row — resulting in data divergence between nodes. + + To check the current replica identity: + + ```sql + SELECT relname, relreplident FROM pg_class WHERE relname = 'your_table'; + -- 'd' = DEFAULT, 'n' = NOTHING, 'f' = FULL, 'i' = USING INDEX + ``` + + **Limitation:** `REPLICA IDENTITY FULL` does not enable replicating + rows whose total TOAST data exceeds approximately 1 GB. Updates on + such rows fail at the source with `invalid memory alloc request + size` because PostgreSQL must flatten the entire pre-update row in + memory before WAL logging, and the flattened tuple exceeds + `MaxAllocSize`. There is currently no replication-safe path for + rows of that size; avoid creating them. +!!! + ### Other restrictions PGD can't work correctly if Replica Identity columns are marked as external. 
diff --git a/product_docs/docs/pgd/6.1/appusage/behavior.mdx b/product_docs/docs/pgd/6.1/appusage/behavior.mdx index 5bd29c9d03..a9d3ca245b 100644 --- a/product_docs/docs/pgd/6.1/appusage/behavior.mdx +++ b/product_docs/docs/pgd/6.1/appusage/behavior.mdx @@ -132,6 +132,47 @@ PGD handling of very long "toasted" data in PostgreSQL is transparent to the user. The TOAST "chunkid" values likely differ between the same row on different nodes, but that doesn't cause any problems. +!!! Warning REPLICA IDENTITY FULL required for tables with TOAST-able columns + Tables that hold TOAST-able column types (`text`, `bytea`, `jsonb`, large + `varchar`, and so on) must use `REPLICA IDENTITY FULL` to be replicated + safely under concurrent updates from multiple nodes. PGD 6 sets this by + default for new tables via the `BDR_AUTO` replica identity. If you + override this — for example with `ALTER TABLE ... REPLICA IDENTITY + DEFAULT` or `... USING INDEX` — concurrent updates that change different + TOAST columns of the same row on different nodes can produce stale data + on the receiver and cause divergence between nodes. + + The reason: when an `UPDATE` doesn't change a TOAST column, the column is + sent in the replication stream as an *unchanged* marker rather than as + full data. Under `REPLICA IDENTITY FULL`, PostgreSQL flattens the entire + pre-update row (including its TOAST data) into WAL, so the receiver + always has an authoritative source for unchanged columns. Under + `REPLICA IDENTITY DEFAULT` (or `USING INDEX`), the receiver fills + unchanged TOAST columns from its local copy of the row, which may be + stale if a concurrent update on another node modified a different column + of the same row. 
+ + To check or set: + + ```sql + -- Check current replica identity + SELECT relname, relreplident FROM pg_class + WHERE relname = 'your_table'; + -- 'd' = DEFAULT, 'n' = NOTHING, 'f' = FULL, 'i' = USING INDEX + + -- Set REPLICA IDENTITY FULL + ALTER TABLE your_table REPLICA IDENTITY FULL; + ``` + + **Limitation:** `REPLICA IDENTITY FULL` does not enable replicating + rows whose total TOAST data exceeds approximately 1 GB. Updates on + such rows fail at the source with `invalid memory alloc request + size` because PostgreSQL must flatten the entire pre-update row in + memory before WAL logging, and the flattened tuple exceeds + `MaxAllocSize`. There is currently no replication-safe path for + rows of that size; avoid creating them. +!!! + ### Other restrictions PGD can't work correctly if Replica Identity columns are marked as external. diff --git a/product_docs/docs/pgd/6.2/appusage/behavior.mdx b/product_docs/docs/pgd/6.2/appusage/behavior.mdx index 5bd29c9d03..a9d3ca245b 100644 --- a/product_docs/docs/pgd/6.2/appusage/behavior.mdx +++ b/product_docs/docs/pgd/6.2/appusage/behavior.mdx @@ -132,6 +132,47 @@ PGD handling of very long "toasted" data in PostgreSQL is transparent to the user. The TOAST "chunkid" values likely differ between the same row on different nodes, but that doesn't cause any problems. +!!! Warning REPLICA IDENTITY FULL required for tables with TOAST-able columns + Tables that hold TOAST-able column types (`text`, `bytea`, `jsonb`, large + `varchar`, and so on) must use `REPLICA IDENTITY FULL` to be replicated + safely under concurrent updates from multiple nodes. PGD 6 sets this by + default for new tables via the `BDR_AUTO` replica identity. If you + override this — for example with `ALTER TABLE ... REPLICA IDENTITY + DEFAULT` or `... USING INDEX` — concurrent updates that change different + TOAST columns of the same row on different nodes can produce stale data + on the receiver and cause divergence between nodes. 
+ + The reason: when an `UPDATE` doesn't change a TOAST column, the column is + sent in the replication stream as an *unchanged* marker rather than as + full data. Under `REPLICA IDENTITY FULL`, PostgreSQL flattens the entire + pre-update row (including its TOAST data) into WAL, so the receiver + always has an authoritative source for unchanged columns. Under + `REPLICA IDENTITY DEFAULT` (or `USING INDEX`), the receiver fills + unchanged TOAST columns from its local copy of the row, which may be + stale if a concurrent update on another node modified a different column + of the same row. + + To check or set: + + ```sql + -- Check current replica identity + SELECT relname, relreplident FROM pg_class + WHERE relname = 'your_table'; + -- 'd' = DEFAULT, 'n' = NOTHING, 'f' = FULL, 'i' = USING INDEX + + -- Set REPLICA IDENTITY FULL + ALTER TABLE your_table REPLICA IDENTITY FULL; + ``` + + **Limitation:** `REPLICA IDENTITY FULL` does not enable replicating + rows whose total TOAST data exceeds approximately 1 GB. Updates on + such rows fail at the source with `invalid memory alloc request + size` because PostgreSQL must flatten the entire pre-update row in + memory before WAL logging, and the flattened tuple exceeds + `MaxAllocSize`. There is currently no replication-safe path for + rows of that size; avoid creating them. +!!! + ### Other restrictions PGD can't work correctly if Replica Identity columns are marked as external. diff --git a/product_docs/docs/pgd/6.3/developing-applications/planning.mdx b/product_docs/docs/pgd/6.3/developing-applications/planning.mdx index bcf32d2ce6..78651bc3ad 100644 --- a/product_docs/docs/pgd/6.3/developing-applications/planning.mdx +++ b/product_docs/docs/pgd/6.3/developing-applications/planning.mdx @@ -62,6 +62,8 @@ The following behaviors differ from a single-node database and require attention - Use `BYTEA` for binary data. `BYTEA` columns replicate fully, including the underlying `TOAST` storage, up to 1 GB. 
The [PostgreSQL large object](https://www.postgresql.org/docs/current/largeobjects.html) facility isn't supported in PGD. +- Use `REPLICA IDENTITY FULL` on tables with TOAST-able columns (`text`, `bytea`, `jsonb`, large `varchar`, and so on). PGD 6 sets this by default for new tables via the `BDR_AUTO` replica identity, so most users don't need to do anything. If you override it — for example with `ALTER TABLE ... REPLICA IDENTITY DEFAULT` or `... USING INDEX` — concurrent updates that change different TOAST columns of the same row on different nodes can produce stale data on the receiver and cause divergence between nodes. The reason is that PostgreSQL sends unchanged TOAST columns in the replication stream as an *unchanged* marker rather than as full data; under `REPLICA IDENTITY FULL`, PostgreSQL flattens the entire pre-update row (including its TOAST data) into WAL so the receiver always has an authoritative source for those columns, while under `REPLICA IDENTITY DEFAULT` or `USING INDEX` the receiver fills them from its (possibly stale) local copy. To check the current setting: `SELECT relname, relreplident FROM pg_class WHERE relname = 'your_table';` (`f` = FULL). **Limitation:** `REPLICA IDENTITY FULL` does not enable replicating rows whose total TOAST data exceeds approximately 1 GB. Updates on such rows fail at the source with `invalid memory alloc request size` because PostgreSQL must flatten the entire pre-update row in memory before WAL logging, and the flattened tuple exceeds `MaxAllocSize`. There is currently no replication-safe path for rows of that size; avoid creating them. + - Avoid volatile functions in `CHECK` constraints. PGD reexecutes `CHECK` constraints on apply by default, and if a volatile function returns a different result than it did on the origin, replication can break. This behavior is controlled by the [`check_constraints`](/pgd/current/reference/cli/command_ref/group/set-option/) group option. 
- Don't mark `REPLICA IDENTITY` columns as external. diff --git a/product_docs/docs/pgd/6.4/developing-applications/planning.mdx b/product_docs/docs/pgd/6.4/developing-applications/planning.mdx index e9311e2c3b..feca3ed061 100644 --- a/product_docs/docs/pgd/6.4/developing-applications/planning.mdx +++ b/product_docs/docs/pgd/6.4/developing-applications/planning.mdx @@ -62,6 +62,8 @@ The following behaviors differ from a single-node database and require attention - Use `BYTEA` columns for binary data. They replicate fully, including the underlying `TOAST` storage, up to 1 GB. If your application requires the stream-oriented large object interface, PGD also supports [replicated large objects](/pgd/current/large-objects/). +- Use `REPLICA IDENTITY FULL` on tables with TOAST-able columns (`text`, `bytea`, `jsonb`, large `varchar`, and so on). PGD 6 sets this by default for new tables via the `BDR_AUTO` replica identity, so most users don't need to do anything. If you override it — for example with `ALTER TABLE ... REPLICA IDENTITY DEFAULT` or `... USING INDEX` — concurrent updates that change different TOAST columns of the same row on different nodes can produce stale data on the receiver and cause divergence between nodes. The reason is that PostgreSQL sends unchanged TOAST columns in the replication stream as an *unchanged* marker rather than as full data; under `REPLICA IDENTITY FULL`, PostgreSQL flattens the entire pre-update row (including its TOAST data) into WAL so the receiver always has an authoritative source for those columns, while under `REPLICA IDENTITY DEFAULT` or `USING INDEX` the receiver fills them from its (possibly stale) local copy. To check the current setting: `SELECT relname, relreplident FROM pg_class WHERE relname = 'your_table';` (`f` = FULL). **Limitation:** `REPLICA IDENTITY FULL` does not enable replicating rows whose total TOAST data exceeds approximately 1 GB. 
Updates on such rows fail at the source with `invalid memory alloc request size` because PostgreSQL must flatten the entire pre-update row in memory before WAL logging, and the flattened tuple exceeds `MaxAllocSize`. There is currently no replication-safe path for rows of that size; avoid creating them. + - Avoid volatile functions in `CHECK` constraints. PGD reexecutes `CHECK` constraints on apply by default, and if a volatile function returns a different result than it did on the origin, replication can break. This behavior is controlled by the [`check_constraints`](/pgd/current/reference/cli/command_ref/group/set-option/) group option. - Don't mark `REPLICA IDENTITY` columns as external. diff --git a/product_docs/docs/pgd/6/appusage/behavior.mdx b/product_docs/docs/pgd/6/appusage/behavior.mdx index 5bd29c9d03..a9d3ca245b 100644 --- a/product_docs/docs/pgd/6/appusage/behavior.mdx +++ b/product_docs/docs/pgd/6/appusage/behavior.mdx @@ -132,6 +132,47 @@ PGD handling of very long "toasted" data in PostgreSQL is transparent to the user. The TOAST "chunkid" values likely differ between the same row on different nodes, but that doesn't cause any problems. +!!! Warning REPLICA IDENTITY FULL required for tables with TOAST-able columns + Tables that hold TOAST-able column types (`text`, `bytea`, `jsonb`, large + `varchar`, and so on) must use `REPLICA IDENTITY FULL` to be replicated + safely under concurrent updates from multiple nodes. PGD 6 sets this by + default for new tables via the `BDR_AUTO` replica identity. If you + override this — for example with `ALTER TABLE ... REPLICA IDENTITY + DEFAULT` or `... USING INDEX` — concurrent updates that change different + TOAST columns of the same row on different nodes can produce stale data + on the receiver and cause divergence between nodes. + + The reason: when an `UPDATE` doesn't change a TOAST column, the column is + sent in the replication stream as an *unchanged* marker rather than as + full data. 
Under `REPLICA IDENTITY FULL`, PostgreSQL flattens the entire + pre-update row (including its TOAST data) into WAL, so the receiver + always has an authoritative source for unchanged columns. Under + `REPLICA IDENTITY DEFAULT` (or `USING INDEX`), the receiver fills + unchanged TOAST columns from its local copy of the row, which may be + stale if a concurrent update on another node modified a different column + of the same row. + + To check or set: + + ```sql + -- Check current replica identity + SELECT relname, relreplident FROM pg_class + WHERE relname = 'your_table'; + -- 'd' = DEFAULT, 'n' = NOTHING, 'f' = FULL, 'i' = USING INDEX + + -- Set REPLICA IDENTITY FULL + ALTER TABLE your_table REPLICA IDENTITY FULL; + ``` + + **Limitation:** `REPLICA IDENTITY FULL` does not enable replicating + rows whose total TOAST data exceeds approximately 1 GB. Updates on + such rows fail at the source with `invalid memory alloc request + size` because PostgreSQL must flatten the entire pre-update row in + memory before WAL logging, and the flattened tuple exceeds + `MaxAllocSize`. There is currently no replication-safe path for + rows of that size; avoid creating them. +!!! + ### Other restrictions PGD can't work correctly if Replica Identity columns are marked as external. 
From e36b1324646abc2724f46ad0b4044fa991dc5ee6 Mon Sep 17 00:00:00 2001 From: Mireia Perez Fuster Date: Thu, 28 May 2026 17:25:53 +0100 Subject: [PATCH 2/2] Rephrased and formatted Florin's contribution --- product_docs/docs/pgd/4.4/bdr/appusage.mdx | 40 +++++---------- .../docs/pgd/5.6/appusage/behavior.mdx | 45 +++++------------ .../docs/pgd/5.7/appusage/behavior.mdx | 45 +++++------------ .../docs/pgd/5.8/appusage/behavior.mdx | 45 +++++------------ .../docs/pgd/5.9/appusage/behavior.mdx | 45 +++++------------ .../docs/pgd/6.1/appusage/behavior.mdx | 49 +++++-------------- .../docs/pgd/6.2/appusage/behavior.mdx | 49 +++++-------------- .../6.3/developing-applications/planning.mdx | 2 +- .../6.4/developing-applications/planning.mdx | 2 +- product_docs/docs/pgd/6/appusage/behavior.mdx | 48 ++++-------------- 10 files changed, 90 insertions(+), 280 deletions(-) diff --git a/product_docs/docs/pgd/4.4/bdr/appusage.mdx b/product_docs/docs/pgd/4.4/bdr/appusage.mdx index 8527821d7a..5339e2e2b8 100644 --- a/product_docs/docs/pgd/4.4/bdr/appusage.mdx +++ b/product_docs/docs/pgd/4.4/bdr/appusage.mdx @@ -107,37 +107,21 @@ with nonmatching collation qualifiers. Row filters might be affected by differences in collations if collatable expressions were used. BDR handling of very long "toasted" data in PostgreSQL is transparent to -the user. The TOAST "chunkid" values likely differ between +the user. The `TOAST` "chunkid" values likely differ between the same row on different nodes, but that doesn't cause any problems. !!! Warning REPLICA IDENTITY FULL required for tables with TOAST-able columns - Tables that hold TOAST-able column types (`text`, `bytea`, `jsonb`, large - `varchar`, and so on) must use `REPLICA IDENTITY FULL` to be replicated - safely under concurrent updates from multiple nodes. 
BDR 4 does not set - this by default — you need to enable it explicitly on every table that - can store TOAST'd values: - - ```sql - ALTER TABLE your_table REPLICA IDENTITY FULL; - ``` - - When an `UPDATE` doesn't change a TOAST column, the column is sent in the - replication stream as an *unchanged* marker rather than as full data. - Under `REPLICA IDENTITY FULL`, PostgreSQL flattens the entire pre-update - row (including its TOAST data) into WAL, so the receiver always has an - authoritative source for unchanged columns. Under `REPLICA IDENTITY - DEFAULT` (or `USING INDEX`), the receiver fills unchanged TOAST columns - from its local copy of the row, which may be stale if a concurrent - update on another node modified a different column of the same row — - resulting in data divergence between nodes. - - **Limitation:** `REPLICA IDENTITY FULL` does not enable replicating - rows whose total TOAST data exceeds approximately 1 GB. Updates on - such rows fail at the source with `invalid memory alloc request - size` because PostgreSQL must flatten the entire pre-update row in - memory before WAL logging, and the flattened tuple exceeds - `MaxAllocSize`. There is currently no replication-safe path for - rows of that size; avoid creating them. +Tables that hold `TOAST`-able column types (`text`, `bytea`, `jsonb`, large `varchar`, and so on) must use `REPLICA IDENTITY FULL`. BDR 4 doesn't set this behavior by default. Enable it explicitly on every table that can store `TOAST` values: + +```sql +ALTER TABLE your_table REPLICA IDENTITY FULL; +``` +!!! + +When an `UPDATE` doesn't change a `TOAST` column, the column is sent in the replication stream as an unchanged marker rather than as full data. Under `REPLICA IDENTITY FULL`, Postgres flattens the entire pre-update row (including its `TOAST` data) into WAL, so the receiver always has an authoritative source for unchanged columns. 
Under `REPLICA IDENTITY DEFAULT` (or `USING INDEX`), the receiver fills unchanged `TOAST` columns from its local copy of the row, which may be stale if a concurrent update on another node modified a different column of the same row, resulting in data divergence between nodes. + +!!! Note +`REPLICA IDENTITY FULL` doesn't enable replicating rows whose total `TOAST` data exceeds approximately 1 GB. Updates on such rows fail at the source with `invalid memory alloc request size` because Postgres must flatten the entire pre-update row in memory before WAL logging, and the flattened tuple exceeds `MaxAllocSize`. There's currently no replication-safe path for rows of that size. Avoid creating them. !!! BDR can't work correctly if Replica Identity columns are marked as external. diff --git a/product_docs/docs/pgd/5.6/appusage/behavior.mdx b/product_docs/docs/pgd/5.6/appusage/behavior.mdx index 85380e7a60..018359f7c7 100644 --- a/product_docs/docs/pgd/5.6/appusage/behavior.mdx +++ b/product_docs/docs/pgd/5.6/appusage/behavior.mdx @@ -130,40 +130,17 @@ user. The TOAST "chunkid" values likely differ between the same row on different nodes, but that doesn't cause any problems. !!! Warning REPLICA IDENTITY FULL required for tables with TOAST-able columns - Tables that hold TOAST-able column types (`text`, `bytea`, `jsonb`, large - `varchar`, and so on) must use `REPLICA IDENTITY FULL` to be replicated - safely under concurrent updates from multiple nodes. PGD 5 does not set - this by default — you need to enable it explicitly on every table that - can store TOAST'd values: - - ```sql - ALTER TABLE your_table REPLICA IDENTITY FULL; - ``` - - The reason: when an `UPDATE` doesn't change a TOAST column, the column is - sent in the replication stream as an *unchanged* marker rather than as - full data. 
Under `REPLICA IDENTITY FULL`, PostgreSQL flattens the entire - pre-update row (including its TOAST data) into WAL, so the receiver - always has an authoritative source for unchanged columns. Under - `REPLICA IDENTITY DEFAULT` (or `USING INDEX`), the receiver fills - unchanged TOAST columns from its local copy of the row, which may be - stale if a concurrent update on another node modified a different column - of the same row — resulting in data divergence between nodes. - - To check the current replica identity: - - ```sql - SELECT relname, relreplident FROM pg_class WHERE relname = 'your_table'; - -- 'd' = DEFAULT, 'n' = NOTHING, 'f' = FULL, 'i' = USING INDEX - ``` - - **Limitation:** `REPLICA IDENTITY FULL` does not enable replicating - rows whose total TOAST data exceeds approximately 1 GB. Updates on - such rows fail at the source with `invalid memory alloc request - size` because PostgreSQL must flatten the entire pre-update row in - memory before WAL logging, and the flattened tuple exceeds - `MaxAllocSize`. There is currently no replication-safe path for - rows of that size; avoid creating them. +Tables that hold `TOAST`-able column types (`text`, `bytea`, `jsonb`, large `varchar`, and so on) must use `REPLICA IDENTITY FULL`. PGD 5 doesn't set this behavior by default. Enable it explicitly on every table that can store `TOAST` values: + +```sql +ALTER TABLE your_table REPLICA IDENTITY FULL; +``` +!!! + +When an `UPDATE` doesn't change a `TOAST` column, the column is sent in the replication stream as an unchanged marker rather than as full data. Under `REPLICA IDENTITY FULL`, Postgres flattens the entire pre-update row (including its `TOAST` data) into WAL, so the receiver always has an authoritative source for unchanged columns. 
Under `REPLICA IDENTITY DEFAULT` (or `USING INDEX`), the receiver fills unchanged `TOAST` columns from its local copy of the row, which may be stale if a concurrent update on another node modified a different column of the same row, resulting in data divergence between nodes. + +!!! Note +`REPLICA IDENTITY FULL` doesn't enable replicating rows whose total `TOAST` data exceeds approximately 1 GB. Updates on such rows fail at the source with `invalid memory alloc request size` because Postgres must flatten the entire pre-update row in memory before WAL logging, and the flattened tuple exceeds `MaxAllocSize`. There's currently no replication-safe path for rows of that size. Avoid creating them. !!! ### Other restrictions diff --git a/product_docs/docs/pgd/5.7/appusage/behavior.mdx b/product_docs/docs/pgd/5.7/appusage/behavior.mdx index 85380e7a60..018359f7c7 100644 --- a/product_docs/docs/pgd/5.7/appusage/behavior.mdx +++ b/product_docs/docs/pgd/5.7/appusage/behavior.mdx @@ -130,40 +130,17 @@ user. The TOAST "chunkid" values likely differ between the same row on different nodes, but that doesn't cause any problems. !!! Warning REPLICA IDENTITY FULL required for tables with TOAST-able columns - Tables that hold TOAST-able column types (`text`, `bytea`, `jsonb`, large - `varchar`, and so on) must use `REPLICA IDENTITY FULL` to be replicated - safely under concurrent updates from multiple nodes. PGD 5 does not set - this by default — you need to enable it explicitly on every table that - can store TOAST'd values: - - ```sql - ALTER TABLE your_table REPLICA IDENTITY FULL; - ``` - - The reason: when an `UPDATE` doesn't change a TOAST column, the column is - sent in the replication stream as an *unchanged* marker rather than as - full data. Under `REPLICA IDENTITY FULL`, PostgreSQL flattens the entire - pre-update row (including its TOAST data) into WAL, so the receiver - always has an authoritative source for unchanged columns. 
Under - `REPLICA IDENTITY DEFAULT` (or `USING INDEX`), the receiver fills - unchanged TOAST columns from its local copy of the row, which may be - stale if a concurrent update on another node modified a different column - of the same row — resulting in data divergence between nodes. - - To check the current replica identity: - - ```sql - SELECT relname, relreplident FROM pg_class WHERE relname = 'your_table'; - -- 'd' = DEFAULT, 'n' = NOTHING, 'f' = FULL, 'i' = USING INDEX - ``` - - **Limitation:** `REPLICA IDENTITY FULL` does not enable replicating - rows whose total TOAST data exceeds approximately 1 GB. Updates on - such rows fail at the source with `invalid memory alloc request - size` because PostgreSQL must flatten the entire pre-update row in - memory before WAL logging, and the flattened tuple exceeds - `MaxAllocSize`. There is currently no replication-safe path for - rows of that size; avoid creating them. +Tables that hold `TOAST`-able column types (`text`, `bytea`, `jsonb`, large `varchar`, and so on) must use `REPLICA IDENTITY FULL`. PGD 5 doesn't set this behavior by default. Enable it explicitly on every table that can store `TOAST` values: + +```sql +ALTER TABLE your_table REPLICA IDENTITY FULL; +``` +!!! + +When an `UPDATE` doesn't change a `TOAST` column, the column is sent in the replication stream as an unchanged marker rather than as full data. Under `REPLICA IDENTITY FULL`, Postgres flattens the entire pre-update row (including its `TOAST` data) into WAL, so the receiver always has an authoritative source for unchanged columns. Under `REPLICA IDENTITY DEFAULT` (or `USING INDEX`), the receiver fills unchanged `TOAST` columns from its local copy of the row, which may be stale if a concurrent update on another node modified a different column of the same row, resulting in data divergence between nodes. + +!!! Note +`REPLICA IDENTITY FULL` doesn't enable replicating rows whose total `TOAST` data exceeds approximately 1 GB.
Updates on such rows fail at the source with `invalid memory alloc request size` because Postgres must flatten the entire pre-update row in memory before WAL logging, and the flattened tuple exceeds `MaxAllocSize`. There's currently no replication-safe path for rows of that size. Avoid creating them. !!! ### Other restrictions diff --git a/product_docs/docs/pgd/5.8/appusage/behavior.mdx b/product_docs/docs/pgd/5.8/appusage/behavior.mdx index 85380e7a60..018359f7c7 100644 --- a/product_docs/docs/pgd/5.8/appusage/behavior.mdx +++ b/product_docs/docs/pgd/5.8/appusage/behavior.mdx @@ -130,40 +130,17 @@ user. The TOAST "chunkid" values likely differ between the same row on different nodes, but that doesn't cause any problems. !!! Warning REPLICA IDENTITY FULL required for tables with TOAST-able columns - Tables that hold TOAST-able column types (`text`, `bytea`, `jsonb`, large - `varchar`, and so on) must use `REPLICA IDENTITY FULL` to be replicated - safely under concurrent updates from multiple nodes. PGD 5 does not set - this by default — you need to enable it explicitly on every table that - can store TOAST'd values: - - ```sql - ALTER TABLE your_table REPLICA IDENTITY FULL; - ``` - - The reason: when an `UPDATE` doesn't change a TOAST column, the column is - sent in the replication stream as an *unchanged* marker rather than as - full data. Under `REPLICA IDENTITY FULL`, PostgreSQL flattens the entire - pre-update row (including its TOAST data) into WAL, so the receiver - always has an authoritative source for unchanged columns. Under - `REPLICA IDENTITY DEFAULT` (or `USING INDEX`), the receiver fills - unchanged TOAST columns from its local copy of the row, which may be - stale if a concurrent update on another node modified a different column - of the same row — resulting in data divergence between nodes. 
- - To check the current replica identity: - - ```sql - SELECT relname, relreplident FROM pg_class WHERE relname = 'your_table'; - -- 'd' = DEFAULT, 'n' = NOTHING, 'f' = FULL, 'i' = USING INDEX - ``` - - **Limitation:** `REPLICA IDENTITY FULL` does not enable replicating - rows whose total TOAST data exceeds approximately 1 GB. Updates on - such rows fail at the source with `invalid memory alloc request - size` because PostgreSQL must flatten the entire pre-update row in - memory before WAL logging, and the flattened tuple exceeds - `MaxAllocSize`. There is currently no replication-safe path for - rows of that size; avoid creating them. +Tables that hold `TOAST`-able column types (`text`, `bytea`, `jsonb`, large `varchar`, and so on) must use `REPLICA IDENTITY FULL`. PGD 5 doesn't set this behavior by default. Enable it explicitly on every table that can store `TOAST` values: + +```sql +ALTER TABLE your_table REPLICA IDENTITY FULL; +``` +!!! + +When an `UPDATE` doesn't change a `TOAST` column, the column is sent in the replication stream as an unchanged marker rather than as full data. Under `REPLICA IDENTITY FULL`, Postgres flattens the entire pre-update row (including its `TOAST` data) into WAL, so the receiver always has an authoritative source for unchanged columns. Under `REPLICA IDENTITY DEFAULT` (or `USING INDEX`), the receiver fills unchanged `TOAST` columns from its local copy of the row, which may be stale if a concurrent update on another node modified a different column of the same row, resulting in data divergence between nodes. + +!!! Note +`REPLICA IDENTITY FULL` doesn't enable replicating rows whose total `TOAST` data exceeds approximately 1 GB. Updates on such rows fail at the source with `invalid memory alloc request size` because Postgres must flatten the entire pre-update row in memory before WAL logging, and the flattened tuple exceeds `MaxAllocSize`. There's currently no replication-safe path for rows of that size. Avoid creating them. !!!
### Other restrictions diff --git a/product_docs/docs/pgd/5.9/appusage/behavior.mdx b/product_docs/docs/pgd/5.9/appusage/behavior.mdx index 85380e7a60..018359f7c7 100644 --- a/product_docs/docs/pgd/5.9/appusage/behavior.mdx +++ b/product_docs/docs/pgd/5.9/appusage/behavior.mdx @@ -130,40 +130,17 @@ user. The TOAST "chunkid" values likely differ between the same row on different nodes, but that doesn't cause any problems. !!! Warning REPLICA IDENTITY FULL required for tables with TOAST-able columns - Tables that hold TOAST-able column types (`text`, `bytea`, `jsonb`, large - `varchar`, and so on) must use `REPLICA IDENTITY FULL` to be replicated - safely under concurrent updates from multiple nodes. PGD 5 does not set - this by default — you need to enable it explicitly on every table that - can store TOAST'd values: - - ```sql - ALTER TABLE your_table REPLICA IDENTITY FULL; - ``` - - The reason: when an `UPDATE` doesn't change a TOAST column, the column is - sent in the replication stream as an *unchanged* marker rather than as - full data. Under `REPLICA IDENTITY FULL`, PostgreSQL flattens the entire - pre-update row (including its TOAST data) into WAL, so the receiver - always has an authoritative source for unchanged columns. Under - `REPLICA IDENTITY DEFAULT` (or `USING INDEX`), the receiver fills - unchanged TOAST columns from its local copy of the row, which may be - stale if a concurrent update on another node modified a different column - of the same row — resulting in data divergence between nodes. - - To check the current replica identity: - - ```sql - SELECT relname, relreplident FROM pg_class WHERE relname = 'your_table'; - -- 'd' = DEFAULT, 'n' = NOTHING, 'f' = FULL, 'i' = USING INDEX - ``` - - **Limitation:** `REPLICA IDENTITY FULL` does not enable replicating - rows whose total TOAST data exceeds approximately 1 GB. 
Updates on - such rows fail at the source with `invalid memory alloc request - size` because PostgreSQL must flatten the entire pre-update row in - memory before WAL logging, and the flattened tuple exceeds - `MaxAllocSize`. There is currently no replication-safe path for - rows of that size; avoid creating them. +Tables that hold `TOAST`-able column types (`text`, `bytea`, `jsonb`, large `varchar`, and so on) must use `REPLICA IDENTITY FULL`. PGD 5 doesn't set this behavior by default. Enable it explicitly on every table that can store `TOAST` values: + +```sql +ALTER TABLE your_table REPLICA IDENTITY FULL; +``` +!!! + +When an `UPDATE` doesn't change a `TOAST` column, the column is sent in the replication stream as an unchanged marker rather than as full data. Under `REPLICA IDENTITY FULL`, Postgres flattens the entire pre-update row (including its `TOAST` data) into WAL, so the receiver always has an authoritative source for unchanged columns. Under `REPLICA IDENTITY DEFAULT` (or `USING INDEX`), the receiver fills unchanged `TOAST` columns from its local copy of the row, which may be stale if a concurrent update on another node modified a different column of the same row, resulting in data divergence between nodes. + +!!! Note +`REPLICA IDENTITY FULL` doesn't enable replicating rows whose total `TOAST` data exceeds approximately 1 GB. Updates on such rows fail at the source with `invalid memory alloc request size` because Postgres must flatten the entire pre-update row in memory before WAL logging, and the flattened tuple exceeds `MaxAllocSize`. There's currently no replication-safe path for rows of that size. Avoid creating them. !!! ### Other restrictions diff --git a/product_docs/docs/pgd/6.1/appusage/behavior.mdx b/product_docs/docs/pgd/6.1/appusage/behavior.mdx index a9d3ca245b..39d3b0a210 100644 --- a/product_docs/docs/pgd/6.1/appusage/behavior.mdx +++ b/product_docs/docs/pgd/6.1/appusage/behavior.mdx @@ -133,44 +133,17 @@ user.
The TOAST "chunkid" values likely differ between the same row on different nodes, but that doesn't cause any problems. !!! Warning REPLICA IDENTITY FULL required for tables with TOAST-able columns - Tables that hold TOAST-able column types (`text`, `bytea`, `jsonb`, large - `varchar`, and so on) must use `REPLICA IDENTITY FULL` to be replicated - safely under concurrent updates from multiple nodes. PGD 6 sets this by - default for new tables via the `BDR_AUTO` replica identity. If you - override this — for example with `ALTER TABLE ... REPLICA IDENTITY - DEFAULT` or `... USING INDEX` — concurrent updates that change different - TOAST columns of the same row on different nodes can produce stale data - on the receiver and cause divergence between nodes. - - The reason: when an `UPDATE` doesn't change a TOAST column, the column is - sent in the replication stream as an *unchanged* marker rather than as - full data. Under `REPLICA IDENTITY FULL`, PostgreSQL flattens the entire - pre-update row (including its TOAST data) into WAL, so the receiver - always has an authoritative source for unchanged columns. Under - `REPLICA IDENTITY DEFAULT` (or `USING INDEX`), the receiver fills - unchanged TOAST columns from its local copy of the row, which may be - stale if a concurrent update on another node modified a different column - of the same row. - - To check or set: - - ```sql - -- Check current replica identity - SELECT relname, relreplident FROM pg_class - WHERE relname = 'your_table'; - -- 'd' = DEFAULT, 'n' = NOTHING, 'f' = FULL, 'i' = USING INDEX - - -- Set REPLICA IDENTITY FULL - ALTER TABLE your_table REPLICA IDENTITY FULL; - ``` - - **Limitation:** `REPLICA IDENTITY FULL` does not enable replicating - rows whose total TOAST data exceeds approximately 1 GB. 
Updates on - such rows fail at the source with `invalid memory alloc request - size` because PostgreSQL must flatten the entire pre-update row in - memory before WAL logging, and the flattened tuple exceeds - `MaxAllocSize`. There is currently no replication-safe path for - rows of that size; avoid creating them. +Tables that hold `TOAST`-able column types (`text`, `bytea`, `jsonb`, large `varchar`, and so on) must use `REPLICA IDENTITY FULL`. PGD 6 sets this by default for new tables via the `BDR_AUTO` replica identity. If you override that default, restore it explicitly on every table that can store `TOAST` values: + +```sql +ALTER TABLE your_table REPLICA IDENTITY FULL; +``` +!!! + +When an `UPDATE` doesn't change a `TOAST` column, the column is sent in the replication stream as an unchanged marker rather than as full data. Under `REPLICA IDENTITY FULL`, Postgres flattens the entire pre-update row (including its `TOAST` data) into WAL, so the receiver always has an authoritative source for unchanged columns. Under `REPLICA IDENTITY DEFAULT` (or `USING INDEX`), the receiver fills unchanged `TOAST` columns from its local copy of the row, which may be stale if a concurrent update on another node modified a different column of the same row, resulting in data divergence between nodes. + +!!! Note +`REPLICA IDENTITY FULL` doesn't enable replicating rows whose total `TOAST` data exceeds approximately 1 GB. Updates on such rows fail at the source with `invalid memory alloc request size` because Postgres must flatten the entire pre-update row in memory before WAL logging, and the flattened tuple exceeds `MaxAllocSize`. There's currently no replication-safe path for rows of that size. Avoid creating them. !!! ### Other restrictions diff --git a/product_docs/docs/pgd/6.2/appusage/behavior.mdx b/product_docs/docs/pgd/6.2/appusage/behavior.mdx index a9d3ca245b..39d3b0a210 100644 --- a/product_docs/docs/pgd/6.2/appusage/behavior.mdx +++ b/product_docs/docs/pgd/6.2/appusage/behavior.mdx @@ -133,44 +133,17 @@ user.
The TOAST "chunkid" values likely differ between the same row on different nodes, but that doesn't cause any problems. !!! Warning REPLICA IDENTITY FULL required for tables with TOAST-able columns - Tables that hold TOAST-able column types (`text`, `bytea`, `jsonb`, large - `varchar`, and so on) must use `REPLICA IDENTITY FULL` to be replicated - safely under concurrent updates from multiple nodes. PGD 6 sets this by - default for new tables via the `BDR_AUTO` replica identity. If you - override this — for example with `ALTER TABLE ... REPLICA IDENTITY - DEFAULT` or `... USING INDEX` — concurrent updates that change different - TOAST columns of the same row on different nodes can produce stale data - on the receiver and cause divergence between nodes. - - The reason: when an `UPDATE` doesn't change a TOAST column, the column is - sent in the replication stream as an *unchanged* marker rather than as - full data. Under `REPLICA IDENTITY FULL`, PostgreSQL flattens the entire - pre-update row (including its TOAST data) into WAL, so the receiver - always has an authoritative source for unchanged columns. Under - `REPLICA IDENTITY DEFAULT` (or `USING INDEX`), the receiver fills - unchanged TOAST columns from its local copy of the row, which may be - stale if a concurrent update on another node modified a different column - of the same row. - - To check or set: - - ```sql - -- Check current replica identity - SELECT relname, relreplident FROM pg_class - WHERE relname = 'your_table'; - -- 'd' = DEFAULT, 'n' = NOTHING, 'f' = FULL, 'i' = USING INDEX - - -- Set REPLICA IDENTITY FULL - ALTER TABLE your_table REPLICA IDENTITY FULL; - ``` - - **Limitation:** `REPLICA IDENTITY FULL` does not enable replicating - rows whose total TOAST data exceeds approximately 1 GB. 
Updates on - such rows fail at the source with `invalid memory alloc request - size` because PostgreSQL must flatten the entire pre-update row in - memory before WAL logging, and the flattened tuple exceeds - `MaxAllocSize`. There is currently no replication-safe path for - rows of that size; avoid creating them. +Tables that hold `TOAST`-able column types (`text`, `bytea`, `jsonb`, large `varchar`, and so on) must use `REPLICA IDENTITY FULL`. PGD 6 sets this by default for new tables via the `BDR_AUTO` replica identity. If you override that default, restore it explicitly on every table that can store `TOAST` values: + +```sql +ALTER TABLE your_table REPLICA IDENTITY FULL; +``` +!!! + +When an `UPDATE` doesn't change a `TOAST` column, the column is sent in the replication stream as an unchanged marker rather than as full data. Under `REPLICA IDENTITY FULL`, Postgres flattens the entire pre-update row (including its `TOAST` data) into WAL, so the receiver always has an authoritative source for unchanged columns. Under `REPLICA IDENTITY DEFAULT` (or `USING INDEX`), the receiver fills unchanged `TOAST` columns from its local copy of the row, which may be stale if a concurrent update on another node modified a different column of the same row, resulting in data divergence between nodes. + +!!! Note +`REPLICA IDENTITY FULL` doesn't enable replicating rows whose total `TOAST` data exceeds approximately 1 GB. Updates on such rows fail at the source with `invalid memory alloc request size` because Postgres must flatten the entire pre-update row in memory before WAL logging, and the flattened tuple exceeds `MaxAllocSize`. There's currently no replication-safe path for rows of that size. Avoid creating them. !!!
### Other restrictions diff --git a/product_docs/docs/pgd/6.3/developing-applications/planning.mdx b/product_docs/docs/pgd/6.3/developing-applications/planning.mdx index 78651bc3ad..be1db7a8bb 100644 --- a/product_docs/docs/pgd/6.3/developing-applications/planning.mdx +++ b/product_docs/docs/pgd/6.3/developing-applications/planning.mdx @@ -62,7 +62,7 @@ The following behaviors differ from a single-node database and require attention - Use `BYTEA` for binary data. `BYTEA` columns replicate fully, including the underlying `TOAST` storage, up to 1 GB. The [PostgreSQL large object](https://www.postgresql.org/docs/current/largeobjects.html) facility isn't supported in PGD. -- Use `REPLICA IDENTITY FULL` on tables with TOAST-able columns (`text`, `bytea`, `jsonb`, large `varchar`, and so on). PGD 6 sets this by default for new tables via the `BDR_AUTO` replica identity, so most users don't need to do anything. If you override it — for example with `ALTER TABLE ... REPLICA IDENTITY DEFAULT` or `... USING INDEX` — concurrent updates that change different TOAST columns of the same row on different nodes can produce stale data on the receiver and cause divergence between nodes. The reason is that PostgreSQL sends unchanged TOAST columns in the replication stream as an *unchanged* marker rather than as full data; under `REPLICA IDENTITY FULL`, PostgreSQL flattens the entire pre-update row (including its TOAST data) into WAL so the receiver always has an authoritative source for those columns, while under `REPLICA IDENTITY DEFAULT` or `USING INDEX` the receiver fills them from its (possibly stale) local copy. To check the current setting: `SELECT relname, relreplident FROM pg_class WHERE relname = 'your_table';` (`f` = FULL). **Limitation:** `REPLICA IDENTITY FULL` does not enable replicating rows whose total TOAST data exceeds approximately 1 GB. 
Updates on such rows fail at the source with `invalid memory alloc request size` because PostgreSQL must flatten the entire pre-update row in memory before WAL logging, and the flattened tuple exceeds `MaxAllocSize`. There is currently no replication-safe path for rows of that size; avoid creating them. +- Use `REPLICA IDENTITY FULL` on tables with `TOAST`-able columns (`text`, `bytea`, `jsonb`, large `varchar`, and so on). PGD 6 sets this by default for new tables via the `BDR_AUTO` replica identity. Overriding it with `REPLICA IDENTITY DEFAULT` or `USING INDEX` risks data divergence under concurrent updates. Rows whose total `TOAST` data exceeds approximately 1 GB can't be replicated regardless of this setting. - Avoid volatile functions in `CHECK` constraints. PGD reexecutes `CHECK` constraints on apply by default, and if a volatile function returns a different result than it did on the origin, replication can break. This behavior is controlled by the [`check_constraints`](/pgd/current/reference/cli/command_ref/group/set-option/) group option. diff --git a/product_docs/docs/pgd/6.4/developing-applications/planning.mdx b/product_docs/docs/pgd/6.4/developing-applications/planning.mdx index feca3ed061..c144e1c23b 100644 --- a/product_docs/docs/pgd/6.4/developing-applications/planning.mdx +++ b/product_docs/docs/pgd/6.4/developing-applications/planning.mdx @@ -62,7 +62,7 @@ The following behaviors differ from a single-node database and require attention - Use `BYTEA` columns for binary data. They replicate fully, including the underlying `TOAST` storage, up to 1 GB. If your application requires the stream-oriented large object interface, PGD also supports [replicated large objects](/pgd/current/large-objects/). -- Use `REPLICA IDENTITY FULL` on tables with TOAST-able columns (`text`, `bytea`, `jsonb`, large `varchar`, and so on). PGD 6 sets this by default for new tables via the `BDR_AUTO` replica identity, so most users don't need to do anything. 
If you override it — for example with `ALTER TABLE ... REPLICA IDENTITY DEFAULT` or `... USING INDEX` — concurrent updates that change different TOAST columns of the same row on different nodes can produce stale data on the receiver and cause divergence between nodes. The reason is that PostgreSQL sends unchanged TOAST columns in the replication stream as an *unchanged* marker rather than as full data; under `REPLICA IDENTITY FULL`, PostgreSQL flattens the entire pre-update row (including its TOAST data) into WAL so the receiver always has an authoritative source for those columns, while under `REPLICA IDENTITY DEFAULT` or `USING INDEX` the receiver fills them from its (possibly stale) local copy. To check the current setting: `SELECT relname, relreplident FROM pg_class WHERE relname = 'your_table';` (`f` = FULL). **Limitation:** `REPLICA IDENTITY FULL` does not enable replicating rows whose total TOAST data exceeds approximately 1 GB. Updates on such rows fail at the source with `invalid memory alloc request size` because PostgreSQL must flatten the entire pre-update row in memory before WAL logging, and the flattened tuple exceeds `MaxAllocSize`. There is currently no replication-safe path for rows of that size; avoid creating them. +- Use `REPLICA IDENTITY FULL` on tables with `TOAST`-able columns (`text`, `bytea`, `jsonb`, large `varchar`, and so on). PGD 6 sets this by default for new tables via the `BDR_AUTO` replica identity. Overriding it with `REPLICA IDENTITY DEFAULT` or `USING INDEX` risks data divergence under concurrent updates. Rows whose total `TOAST` data exceeds approximately 1 GB can't be replicated regardless of this setting. - Avoid volatile functions in `CHECK` constraints. PGD reexecutes `CHECK` constraints on apply by default, and if a volatile function returns a different result than it did on the origin, replication can break. 
This behavior is controlled by the [`check_constraints`](/pgd/current/reference/cli/command_ref/group/set-option/) group option. diff --git a/product_docs/docs/pgd/6/appusage/behavior.mdx b/product_docs/docs/pgd/6/appusage/behavior.mdx index a9d3ca245b..1889548425 100644 --- a/product_docs/docs/pgd/6/appusage/behavior.mdx +++ b/product_docs/docs/pgd/6/appusage/behavior.mdx @@ -133,46 +133,18 @@ user. The TOAST "chunkid" values likely differ between the same row on different nodes, but that doesn't cause any problems. !!! Warning REPLICA IDENTITY FULL required for tables with TOAST-able columns - Tables that hold TOAST-able column types (`text`, `bytea`, `jsonb`, large - `varchar`, and so on) must use `REPLICA IDENTITY FULL` to be replicated - safely under concurrent updates from multiple nodes. PGD 6 sets this by - default for new tables via the `BDR_AUTO` replica identity. If you - override this — for example with `ALTER TABLE ... REPLICA IDENTITY - DEFAULT` or `... USING INDEX` — concurrent updates that change different - TOAST columns of the same row on different nodes can produce stale data - on the receiver and cause divergence between nodes. - - The reason: when an `UPDATE` doesn't change a TOAST column, the column is - sent in the replication stream as an *unchanged* marker rather than as - full data. Under `REPLICA IDENTITY FULL`, PostgreSQL flattens the entire - pre-update row (including its TOAST data) into WAL, so the receiver - always has an authoritative source for unchanged columns. Under - `REPLICA IDENTITY DEFAULT` (or `USING INDEX`), the receiver fills - unchanged TOAST columns from its local copy of the row, which may be - stale if a concurrent update on another node modified a different column - of the same row. 
- - To check or set: - - ```sql - -- Check current replica identity - SELECT relname, relreplident FROM pg_class - WHERE relname = 'your_table'; - -- 'd' = DEFAULT, 'n' = NOTHING, 'f' = FULL, 'i' = USING INDEX - - -- Set REPLICA IDENTITY FULL - ALTER TABLE your_table REPLICA IDENTITY FULL; - ``` - - **Limitation:** `REPLICA IDENTITY FULL` does not enable replicating - rows whose total TOAST data exceeds approximately 1 GB. Updates on - such rows fail at the source with `invalid memory alloc request - size` because PostgreSQL must flatten the entire pre-update row in - memory before WAL logging, and the flattened tuple exceeds - `MaxAllocSize`. There is currently no replication-safe path for - rows of that size; avoid creating them. +Tables that hold `TOAST`-able column types (`text`, `bytea`, `jsonb`, large `varchar`, and so on) must use `REPLICA IDENTITY FULL`. PGD 6 sets this by default for new tables via the `BDR_AUTO` replica identity. If you override that default, restore it explicitly on every table that can store `TOAST` values: + +```sql +ALTER TABLE your_table REPLICA IDENTITY FULL; +``` !!! +When an `UPDATE` doesn't change a `TOAST` column, the column is sent in the replication stream as an unchanged marker rather than as full data. Under `REPLICA IDENTITY FULL`, Postgres flattens the entire pre-update row (including its `TOAST` data) into WAL, so the receiver always has an authoritative source for unchanged columns. Under `REPLICA IDENTITY DEFAULT` (or `USING INDEX`), the receiver fills unchanged `TOAST` columns from its local copy of the row, which may be stale if a concurrent update on another node modified a different column of the same row, resulting in data divergence between nodes. + +!!! Note +`REPLICA IDENTITY FULL` doesn't enable replicating rows whose total `TOAST` data exceeds approximately 1 GB. Updates on such rows fail at the source with `invalid memory alloc request size` because Postgres must flatten the entire pre-update row in memory before WAL logging, and the flattened tuple exceeds `MaxAllocSize`.
There's currently no replication-safe path for rows of that size. Avoid creating them. +!!! ### Other restrictions PGD can't work correctly if Replica Identity columns are marked as external.