Implement cluster shrink (2nd phase) by whitehawk · Pull Request #2247 · arenadata/gpdb

whitehawk · 2026-02-16T03:03:42Z

Implement cluster shrink (2nd phase)

List of changes:

- Change the order in which the processes of shrunk segments are stopped. Mirrors
  are now stopped strictly after primaries in order to avoid hanging replication
  processes.
- Do not stop the tool execution if some of the shrunk segments could not be
  stopped; only emit a warning. This is done to comply with the requirements.
- Rework fault injection for segment stopping because of the item above: an
  exception inside the 'SegmentStopAfterShrink' worker no longer stops the tool.
  Now, when a fault is injected, SIGINT is sent to the ggrebalance process to
  halt its work.
- Improve logging inside 'SegmentStopAfterShrink'.
- Add support for redistribution of materialized views, writable external
  tables, partitioned tables, and unlogged tables. Skip processing of temp tables.
  This is done to comply with the requirements.
- Add 'relkind' info to the rebalance schema, as materialized views require
  2-phase handling: 1st to perform 'ALTER TABLE ... REBALANCE' and 2nd to
  perform 'REFRESH MATERIALIZED VIEW'.
- Implement the mentioned 2-phase handling for materialized views. Add a separate
  worker type, 'MatViewRefreshTask' (derived from 'TableRebalanceTask'), to work
  during the 2nd phase. The 2nd phase doesn't overlap with the 1st phase, because
  REFRESH may not properly update the materialized view if the table it depends
  on is currently being rebalanced. For example:

1: create table t(a int) distributed by (a);
1: create materialized view mv_t as select a from t distributed by (a);
1: alter materialized view mv_t rebalance 1;

2: begin;
2: alter table t rebalance 1;

1: refresh materialized view mv_t;
2: commit;

In this case the materialized view will contain values from only 1 segment. This
is a consequence of how Postgres/Greengage work with snapshots of the system
catalog and of the table's data. When the REFRESH command starts, it works with a
snapshot of the table data (from 't') in which the data is still distributed
across several segments (as the transaction of session 2 is not yet committed).
Then the REFRESH command suspends its execution, as it depends on 't', which is
locked by the other transaction. Once the second transaction commits, REFRESH is
unblocked and immediately sees the changes to the system catalog (this works as
designed in Postgres/Greengage), so it sees that the distribution of 't' is the
same as that of 'mv_t' and creates a plan to update data in 'mv_t' without
motions. As a result, the materialized view contains data from only one segment.
To resolve this situation, we separate the table rebalance stage and the
materialized view refresh stage in time.
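
The separation of the two stages can be illustrated with a minimal Python sketch. It is not the code from this PR: `run_two_phase` and the `rebalance` and `refresh` callables are hypothetical stand-ins for the workers that issue 'ALTER TABLE ... REBALANCE' and 'REFRESH MATERIALIZED VIEW'. The only point it shows is the barrier: no refresh starts until every rebalance has finished.

```python
from concurrent.futures import ThreadPoolExecutor


def run_two_phase(relations, rebalance, refresh, max_workers=4):
    """Run all rebalances first, then refresh materialized views.

    relations: iterable of (name, relkind) pairs; relkind 'm' marks a
               materialized view, as in pg_class.relkind.
    rebalance, refresh: hypothetical callables taking a relation name.
    """
    relations = list(relations)

    # Phase 1: rebalance every relation, materialized views included.
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # list() waits for all tasks and re-raises the first worker exception.
        list(pool.map(rebalance, [name for name, _ in relations]))

    # Leaving the 'with' block joins all workers, so phase 2 cannot
    # overlap with phase 1.

    # Phase 2: refresh only the materialized views.
    matviews = [name for name, kind in relations if kind == 'm']
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        list(pool.map(refresh, matviews))
```

With this ordering, a refresh of 'mv_t' can never wait on a lock held by an in-flight rebalance of 't', which is the interleaving shown in the example above.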

- Add checks that the database and the table exist before we actually start
  to rebalance the table. This is needed because either could be dropped in
  parallel after we have created the rebalance table list.
- Add retry logic to the table rebalance worker. It is needed when, for example,
  another session opens a transaction after we have created the rebalance table
  list, drops the table before we start to rebalance it, and commits the
  transaction after we have started to rebalance the table (and are waiting on
  the table's locks).
- Remove the unused flag 'needs_repopulate'.
- Add new behave test cases and update old ones to cover the new functionality.
- Add new behave step definitions to support the updates in the tests.
- Fix behave test steps for view/matview creation: they opened a connection
  but didn't use it. Instead, they tried to use the connection from the context,
  which was not properly configured.
- Update code in the behave utils to support new test step definitions for
  materialized views and unlogged tables.
- Add to the fault injector the ability to suspend execution instead of
  crashing it.
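
The existence check and the retry described above might look roughly like the following Python sketch. All names here (`RelationGone`, `rebalance_with_retry`, the `exists` and `rebalance` callables) are hypothetical and are not taken from the PR; the actual worker may retry on different conditions.

```python
import time


class RelationGone(Exception):
    """Hypothetical error: the relation vanished while we waited on its lock."""


def rebalance_with_retry(exists, rebalance, max_attempts=3, delay=0.0):
    """Rebalance one relation, re-checking its existence before each attempt.

    exists:    callable returning True while the database and table exist.
    rebalance: callable that performs the rebalance; assumed to raise
               RelationGone if the table was dropped by a concurrent
               transaction that committed while we waited.
    Returns 'done' or 'skipped'.
    """
    for attempt in range(1, max_attempts + 1):
        if not exists():
            # Dropped after the rebalance table list was built: nothing to do.
            return 'skipped'
        try:
            rebalance()
            return 'done'
        except RelationGone:
            if attempt == max_attempts:
                raise
            time.sleep(delay)
```

In the scenario from the list, the first attempt fails once the dropping transaction commits, and the second attempt sees that the table no longer exists and skips it instead of reporting an error.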

whitehawk · 2026-02-19T10:07:20Z

2nd to perform 'REFRESH MATERIALIZED VIEW'.

Why is just rebalancing not enough? Does gpexpand refresh mat views after expanding?

After a face-to-face discussion, we need to evaluate the CTAS approach for mat views, as the current approach has a potential race condition if one mat view depends on another mat view. Created GG-225.

KnightMurloc · 2026-02-19T10:24:28Z

Skip processing of temp tables

Why? If the database is still available to users during shrink, what will be the state of their temporary tables after that?

bimboterminator1 · 2026-02-19T10:51:42Z

Created GG-225.

Current approach won't be cut off for now?

bimboterminator1 · 2026-02-19T10:54:20Z

gpMgmt/test/behave/mgmt_utils/steps/mgmt_utils.py

+@when('the user waits till {process_name} prints "{log_msg}" in the logs')
+@then('the user waits till {process_name} prints "{log_msg}" in the logs')
+def impl(context, process_name, log_msg):
+    command = "while sleep 0.1; " \


Is a timeout needed in case log_msg never appears? Or is that only relevant for run_async_command?
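
As a rough illustration of the timeout being asked about (not code from this PR), a bounded wait could be written as below. `wait_for_log_message`, its parameters, and the assumption that the log is a plain text file are all hypothetical, and the sketch targets Python 3.

```python
import time


def wait_for_log_message(log_path, log_msg, timeout=60.0, poll_interval=0.1):
    """Poll a log file until log_msg appears or the timeout expires.

    Returns True if the message was found, False on timeout.
    """
    deadline = time.monotonic() + timeout
    while True:
        try:
            with open(log_path, 'r', errors='replace') as log_file:
                if any(log_msg in line for line in log_file):
                    return True
        except FileNotFoundError:
            # The process may not have created its log yet; keep polling.
            pass
        if time.monotonic() >= deadline:
            return False
        time.sleep(poll_interval)
```

A step definition could then fail the scenario explicitly when the function returns False, rather than hanging until the test runner is killed.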

KnightMurloc · 2026-02-19T10:47:47Z

gpMgmt/bin/gprebalance_modules/shrink.py

+                    if self.rel_kind == 'm' and table_exists:
+                        self.shrink.rebalance_schema.setStatusForTableToRebalance(self.db_name, self.schema_name, self.rel_name, self.STATUS_MAT_VIEW_REFRESH_REQUIRED)


Why don't we need to run ANALYZE for mat views? Mat views have their own statistics in pg_statistic.

KnightMurloc · 2026-02-19T11:00:22Z

gpMgmt/test/behave/mgmt_utils/steps/mgmt_utils.py

+@given('a long-run session starts')
+@when('a long-run session starts')
+@then('a long-run session starts')
+def impl(context):
+    dbname = 'gptest'
+    context.long_run_conn = dbconn.connect(dbconn.DbURL(dbname=dbname), unsetSearchPath=False)
+
+@given('a long-run session ends')
+@when('a long-run session ends')
+@then('a long-run session ends')
+def impl(context):
+    if context.long_run_conn != None:
+        context.long_run_conn.close()
+    context.long_run_conn = None
+
+@given('sql "{sql}" is executed in a long-run session')
+@when('sql "{sql}" is executed in a long-run session')
+@then('sql "{sql}" is executed in a long-run session')
+def impl(context, sql):
+    dbconn.execSQL(context.long_run_conn, sql)
+


Can we use:
the user connects to "{dbname}" with named connection "{cname}"
the user drops the named connection "{cname}"
and
the user executes "{sql}" with named connection "{cname}"

whitehawk added 12 commits February 10, 2026 12:57

Exclude temp tables from shrink + test unlogged tables

affd3c0

Support shrink of matviews

ae5133e

Add check for shrink of partitioned tables

ff134f9

Support ext writable tables in shrink

afec7ba

Rework matviews handling, add more table types into tests, improve logging

11024d2

Add more interruption points into the test with cluster restart

8014899

Check table and db existence

1c3f8fe

Updates for mat views

0848321

Update segments stop procedure

7e1ecf2

Fix the case when table is dropped in a parallel transaction

643f838

Cosmetic changes

ad6430e

Merge branch 'feature/ADBDEV-6608' into GG-110

3ff91f0

whitehawk changed the title ~~Gg 110~~ Implement cluster shrink (2nd phase) Feb 18, 2026

whitehawk added 3 commits February 18, 2026 15:21

Fix tests

0c4487d

Remove redundant test

6fec958

Cosmetic changes

35bab85

whitehawk marked this pull request as ready for review February 18, 2026 05:48

bimboterminator1 reviewed Feb 19, 2026

KnightMurloc reviewed Feb 19, 2026

whitehawk wants to merge 15 commits into feature/ADBDEV-6608 from GG-110
