ADBDEV-3951: Backport of "Implemented InPlaceUpdate to be used for updates made on non-distribution columns"#609
KnightMurloc wants to merge 2 commits into adb-6.x-dev
Allure report https://allure-ee.adsw.io/launch/53331
a199223 to 64582c6
Allure report https://allure-ee.adsw.io/launch/53341
6d7cfac to ddc5dde
Allure report https://allure-ee.adsw.io/launch/53456
ddc5dde to 20e1c18
Allure report https://allure-ee.adsw.io/launch/53506
20e1c18 to 92a9469
Allure report https://allure-ee.adsw.io/launch/53536
92a9469 to f7148c0
Allure report https://allure-ee.adsw.io/launch/53552
If it's a backport, maybe you should have cherry-picked it to keep the authorship.
Allure report https://allure-ee.adsw.io/launch/60486
Failed job Resource group isolation tests on x86_64: https://gitlab.adsw.io/arenadata/github_mirroring/gpdb/-/jobs/890515
Failed job Resource group isolation tests on ppc64le: https://gitlab.adsw.io/arenadata/github_mirroring/gpdb/-/jobs/890516
Failed job Regression tests with ORCA on x86_64: https://gitlab.adsw.io/arenadata/github_mirroring/gpdb/-/jobs/890509
Failed job Regression tests with ORCA on ppc64le: https://gitlab.adsw.io/arenadata/github_mirroring/gpdb/-/jobs/890510
129e453 to a02d221
Allure report https://allure-ee.adsw.io/launch/60723
Failed job Orca unittests on ppc64le: https://gitlab.adsw.io/arenadata/github_mirroring/gpdb/-/jobs/905014
Failed job ORCA code linter on x86_64: https://gitlab.adsw.io/arenadata/github_mirroring/gpdb/-/jobs/905019
Failed job Orca unittests on x86_64: https://gitlab.adsw.io/arenadata/github_mirroring/gpdb/-/jobs/905013
If it doesn't apply to the specified backport, might it be better to make it separate commit(s) (perhaps backport(s) of some other commit(s))?
Failed job Resource group isolation tests on x86_64: https://gitlab.adsw.io/arenadata/github_mirroring/gpdb/-/jobs/905017
Failed job Resource group isolation tests on ppc64le: https://gitlab.adsw.io/arenadata/github_mirroring/gpdb/-/jobs/905018
These changes are necessary for the patch to work. GP7 uses the ModifyTable node instead of the DML node, so I'm not sure whether we want to port these changes.
ee9bb4e to c6b7dd6
Allure report https://allure-ee.adsw.io/launch/61589
Failed job Resource group isolation tests on x86_64: https://gitlab.adsw.io/arenadata/github_mirroring/gpdb/-/jobs/958078
Failed job Resource group isolation tests on ppc64le: https://gitlab.adsw.io/arenadata/github_mirroring/gpdb/-/jobs/958079
Failed job Regression tests with ORCA on x86_64: https://gitlab.adsw.io/arenadata/github_mirroring/gpdb/-/jobs/958072
Failed job Regression tests with ORCA on ppc64le: https://gitlab.adsw.io/arenadata/github_mirroring/gpdb/-/jobs/958073
c6b7dd6 to 0031781
Allure report https://allure-ee.adsw.io/launch/61705
Failed job Resource group isolation tests on x86_64: https://gitlab.adsw.io/arenadata/github_mirroring/gpdb/-/jobs/962859
Failed job Resource group isolation tests on ppc64le: https://gitlab.adsw.io/arenadata/github_mirroring/gpdb/-/jobs/962861
0031781 to f2124d9
Added the execUpdate call in the DML node. Added partition selection in case of an In-Place update by sorted table. Added a call to reconstructMatchingTupleSlot in execUpdate so that update plans generated by ORCA handle cases of dropped attributes.
…tion columns.
Currently, ORCA uses Split-Update for updates on both distribution
and non-distribution columns. With this commit,
ORCA uses an InPlaceUpdate whenever none of the modified columns
is a distribution or partition key, and Split Update
if any of the modified columns is a distribution or partition key.
Consider the setup below, where we update the
non-distribution column b in the table foo.
```
create table foo(a int, b int);
explain update foo set b=4;
```
ORCA produces a plan with Split and Update nodes:
```
Update on public.foo
-> Result
Output: foo_1.a, foo_1.b, (DMLAction), foo_1.ctid, foo_1.gp_segment_id
-> Split
Output: foo_1.a, foo_1.b, foo_1.ctid, foo_1.gp_segment_id, DMLAction
-> Seq Scan on public.foo foo_1
Output: foo_1.a, foo_1.b, 4, foo_1.ctid, foo_1.gp_segment_id
```
There is no point in using Split and Update for this, as we are updating a
non-distribution column, which does not require any redistribution. This
commit uses an InPlace Update to perform updates on non-distribution
columns, as the Planner does. Below is the new plan produced with this commit.
New Plan
```
Update on public.foo
-> Seq Scan on public.foo foo_1
Output: foo_1.a, 4, foo_1.ctid, foo_1.gp_segment_id
Optimizer: Pivotal Optimizer (GPORCA)
```
Greenplum 6 specific changes:
1. Some constructors have been changed because the argument lists in 6X and 7X
are different.
2. Fixed a bug in CParseHandlerPhysicalDML::startElement where preserve_oids_xml
was used instead of fSplit, which could lead to SIGSEGV during DML node parsing.
3. Changed the create_index_hot test: removed disabling the optimizer before updating,
since ORCA no longer uses split update in this case.
(cherry picked from commit 3ced85b)
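The selection rule described in this commit message can be sketched roughly as follows. This is a minimal illustration, not ORCA's actual code: the names `UpdateStrategy`, `contains_column`, and `choose_update_strategy` are hypothetical, and columns are represented simply as integer attribute numbers.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical names for illustration only; not ORCA's real API. */
typedef enum { UPDATE_IN_PLACE, UPDATE_SPLIT } UpdateStrategy;

/* Returns true if attribute number col appears in cols[0..n). */
static bool contains_column(const int *cols, size_t n, int col)
{
    for (size_t i = 0; i < n; i++)
        if (cols[i] == col)
            return true;
    return false;
}

/*
 * Chooses Split Update if any modified column is a distribution key or a
 * partition key; otherwise chooses an in-place update.
 */
static UpdateStrategy choose_update_strategy(const int *modified, size_t n_modified,
                                             const int *dist_keys, size_t n_dist,
                                             const int *part_keys, size_t n_part)
{
    assert(modified != NULL || n_modified == 0);

    for (size_t i = 0; i < n_modified; i++)
    {
        if (contains_column(dist_keys, n_dist, modified[i]) ||
            contains_column(part_keys, n_part, modified[i]))
            return UPDATE_SPLIT;
    }
    return UPDATE_IN_PLACE;
}
```

Assuming foo is distributed by column a (attribute 1) and is not partitioned, `update foo set b=4` modifies only attribute 2, so this sketch would pick the in-place strategy, while an update of a would pick the split strategy.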
f2124d9 to c89d8ac
Allure report https://allure-ee.adsw.io/launch/61708
Failed job Resource group isolation tests on x86_64: https://gitlab.adsw.io/arenadata/github_mirroring/gpdb/-/jobs/962989
Failed job Resource group isolation tests on ppc64le: https://gitlab.adsw.io/arenadata/github_mirroring/gpdb/-/jobs/962990
Failed job Regression tests with ORCA on x86_64: https://gitlab.adsw.io/arenadata/github_mirroring/gpdb/-/jobs/962981
/*
 * Perform partition selection for in place update
 */
if (isUpdate && !AttributeNumberIsValid(plannode->actionColIdx))
We don't need to perform partition selection when updating a leaf partition, do we?
 * Perform partition selection for in place update
 */
if (isUpdate && !AttributeNumberIsValid(plannode->actionColIdx))
	node->ps.state->es_result_relation_info =
Do the tests check this code branch?
checkPartitionUpdate(estate, slot, resultRelInfo);

if (planGen == PLANGEN_OPTIMIZER)
	slot = reconstructMatchingTupleSlot(slot, resultRelInfo);
Do the tests check this code branch?
Implemented InPlaceUpdate to be used for updates made on non-distribution columns.
Currently, ORCA uses Split-Update for updates on both distribution
and non-distribution columns. With this commit,
ORCA uses an InPlaceUpdate whenever none of the modified columns
is a distribution or partition key, and Split Update
if any of the modified columns is a distribution or partition key.
Consider the setup below, where we update the
non-distribution column b in the table foo.
create table foo(a int, b int);
explain update foo set b=4;
ORCA produces a plan with Split and Update nodes.
There is no point in using Split and Update for this, as we are updating a
non-distribution column, which does not require any redistribution. This
commit uses an InPlace Update to perform updates on non-distribution
columns, as the Planner does. Below is the new plan produced with this commit.
New Plan
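Greenplum 6 specific changes:
1. Some constructors have been changed because the argument lists in 6X and 7X
are different.
2. Fixed a bug in CParseHandlerPhysicalDML::startElement where preserve_oids_xml
was used instead of fSplit, which could lead to SIGSEGV during DML node parsing.
3. Changed the create_index_hot test: removed disabling the optimizer before updating,
since ORCA no longer uses split update in this case.
4. Added the execUpdate call in the DML
node. Added partition selection in case of an In-Place update by sorted table.
Added a call to reconstructMatchingTupleSlot in execUpdate so that update plans
generated by ORCA handle cases of dropped attributes.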
original commit: https://github.com/greenplum-db/gpdb/commit/3ced85b65732728edff66fd9a2ce6d7485d65a06