HIVE-29522: Prevent cleaner from creating COMPLETED_COMPACTIONS entries for soft-deleted ACID tables #6399
VenuReddy2103 wants to merge 2 commits into apache:master
...store-server/src/main/java/org/apache/hadoop/hive/metastore/txn/entities/CompactionInfo.java
  NamedParameterJdbcTemplate jdbcTemplate = jdbcResource.getJdbcTemplate();
  MapSqlParameterSource param;
- if (!info.isAbortedTxnCleanup()) {
+ if (!info.isAbortedTxnCleanup() && !info.isSoftDelete()) {
Can we just add this at the top?
    if (info.isSoftDelete()) {
      removeTxnComponents(info, jdbcResource);
      return;
    }
If it is not a soft delete, we want to add a row to COMPLETED_COMPACTIONS, delete the row from COMPACTION_QUEUE (in removeCompactionAndAbortRetryEntries()), and delete rows from COMPLETED_TXN_COMPONENTS and TXN_COMPONENTS too. I have refactored a bit so that !info.isSoftDelete() is not checked in too many places.
Sorry, I updated the snippet and removed the NOT.
Actually, the table's entries in TXN_COMPONENTS would already have been deleted when the table was dropped, so we don't need to call removeTxnComponents() here when soft delete is true.
We only need to remove the table's entry from COMPACTION_QUEUE when soft delete is true, and that happens in removeCompactionAndAbortRetryEntries().
I have modified it as below to return early:
    if (info.isSoftDelete()) {
      removeCompactionAndAbortRetryEntries(info, jdbcTemplate);
      return null;
    }
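The control flow agreed on in this thread can be illustrated with a small in-memory model. This is only a sketch under stated assumptions: `MarkCleanedSketch`, its nested `Info` class, and the two collections are hypothetical stand-ins for `MarkCleanedFunction`, `CompactionInfo`, and the COMPACTION_QUEUE and COMPLETED_COMPACTIONS tables. It is not the code in the patch.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Hypothetical in-memory model of the discussed flow; NOT the Hive implementation.
class MarkCleanedSketch {
  // Stand-in for the CompactionInfo fields mentioned in the review.
  static final class Info {
    final long id;
    final boolean softDelete;
    final boolean abortedTxnCleanup;

    Info(long id, boolean softDelete, boolean abortedTxnCleanup) {
      this.id = id;
      this.softDelete = softDelete;
      this.abortedTxnCleanup = abortedTxnCleanup;
    }
  }

  final Set<Long> compactionQueue = new HashSet<>();          // models COMPACTION_QUEUE
  final List<Long> completedCompactions = new ArrayList<>();  // models COMPLETED_COMPACTIONS

  void markCleaned(Info info) {
    if (info.softDelete) {
      // The table is already dropped: only clear its queue entry and
      // record nothing in the completed-compactions history.
      compactionQueue.remove(info.id);
      return;
    }
    if (!info.abortedTxnCleanup) {
      // Regular compaction cleanup: record a completion entry.
      completedCompactions.add(info.id);
    }
    compactionQueue.remove(info.id);
    // The real function also deletes rows from COMPLETED_TXN_COMPONENTS and
    // TXN_COMPONENTS at this point; that part is omitted from the model.
  }

  public static void main(String[] args) {
    MarkCleanedSketch sketch = new MarkCleanedSketch();
    sketch.compactionQueue.add(1L);
    sketch.compactionQueue.add(2L);
    sketch.markCleaned(new Info(1L, true, false));   // soft-deleted table
    sketch.markCleaned(new Info(2L, false, false));  // table that still exists
    System.out.println("completed: " + sketch.completedCompactions);
    System.out.println("queue: " + sketch.compactionQueue);
  }
}
```

Putting the soft-delete check first keeps the regular path free of repeated `!info.isSoftDelete()` conditions, which was the concern raised earlier in the thread.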
      cleanUsingAcidDir(ci, t, path, cleanerWaterMark);
    }
  } else {
+   ci.setSoftDelete(true);
Could we set the softDelete flag inside cleanUsingLocation? I think we call it only for a DB/table/partition soft delete.
We should also skip adding entries for soft-deleted partitions.
Agreed and done.
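A rough sketch of the suggestion, assuming the location-based cleanup path is reached only for soft-deleted databases, tables, or partitions. `CleanerSketch`, its `Info` class, and the simplified method signatures are hypothetical; the real `cleanUsingLocation` and `cleanUsingAcidDir` take more arguments and do the actual file removal.

```java
// Hypothetical model of where the flag is set; NOT the Hive cleaner code.
class CleanerSketch {
  // Stand-in for the softDelete flag on CompactionInfo.
  static final class Info {
    boolean softDelete;
  }

  // Regular path: the table or partition still exists in the metastore.
  static void cleanUsingAcidDir(Info ci) {
    // Obsolete-file cleanup would happen here; omitted from the model.
  }

  // Soft-delete path: setting the flag here covers every caller,
  // including the partition case raised in the review.
  static void cleanUsingLocation(Info ci) {
    ci.softDelete = true;
    // Removal of the dropped object's directory would happen here; omitted.
  }

  static void clean(Info ci, boolean objectStillExists) {
    if (objectStillExists) {
      cleanUsingAcidDir(ci);
    } else {
      cleanUsingLocation(ci);
    }
  }
}
```

With the flag set inside the method, callers cannot forget to mark the request as a soft delete before it is marked cleaned.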



What changes were proposed in this pull request?
When a table is soft-deleted, MarkCleanedFunction does not add a row to the COMPLETED_COMPACTIONS table.
Why are the changes needed?
When the configuration property hive.acid.createtable.softdelete is set to true, the data of a dropped ACID table is deleted asynchronously by a background cleaner thread.
As part of this process, the cleaner thread removes the entry for the ACID table from COMPACTION_QUEUE and records a completion entry in the COMPLETED_COMPACTIONS table using org.apache.hadoop.hive.metastore.txn.TxnStore#markCleaned().
However, by the time this operation runs, the associated ACID table has already been soft-deleted in HMS. As a result, a COMPLETED_COMPACTIONS entry is created for a table that no longer exists.
Does this PR introduce any user-facing change?
No
How was this patch tested?
Tested manually