Skip to content

HIVE-29565: Add DB configs to HS2 docker-compose to enable Hive compaction#6430

Open
deniskuzZ wants to merge 1 commit intoapache:masterfrom
deniskuzZ:docker_compaction
Open

HIVE-29565: Add DB configs to HS2 docker-compose to enable Hive compaction#6430
deniskuzZ wants to merge 1 commit intoapache:masterfrom
deniskuzZ:docker_compaction

Conversation

@deniskuzZ
Copy link
Copy Markdown
Member

@deniskuzZ deniskuzZ commented Apr 14, 2026

What changes were proposed in this pull request?

Add missing DB configs to HS2

Why are the changes needed?

HS2 uses a direct connection to backend DB for Compaction

Does this PR introduce any user-facing change?

No

How was this patch tested?

docker-compose + beeline with compaction sql

@deniskuzZ deniskuzZ requested a review from abstractdog April 14, 2026 07:24
@deniskuzZ deniskuzZ changed the title HIVE-29565: Add DB configs to HS2 to enable Hive compaction HIVE-29565: Add DB configs to HS2 doker-compose to enable Hive compaction Apr 14, 2026
@deniskuzZ deniskuzZ changed the title HIVE-29565: Add DB configs to HS2 doker-compose to enable Hive compaction HIVE-29565: Add DB configs to HS2 docker-compose to enable Hive compaction Apr 14, 2026
@sonarqubecloud
Copy link
Copy Markdown

@abstractdog
Copy link
Copy Markdown
Contributor

abstractdog commented Apr 14, 2026

@deniskuzZ: I tested this with docker llap + compaction:

export POSTGRES_LOCAL_PATH=~/.m2/repository/org/postgresql/postgresql/42.7.3/postgresql-42.7.3.jar
./build.sh -hive 4.3.0-SNAPSHOT -hadoop 3.4.1 -tez 0.10.5
./start-hive.sh --llap
docker compose --profile llap logs -f

# in beeline as you suggested:

set hive.mapred.mode=nonstrict;
set hive.support.concurrency=true;
set hive.txn.manager=org.apache.hadoop.hive.ql.lockmgr.DbTxnManager;

-- Check with update
drop table if exists test_major_minor;
CREATE TABLE test_major_minor (
  name VARCHAR(50),
  age TINYINT,
  num_clicks BIGINT)
stored as orc TBLPROPERTIES ("transactional"="true", 'NO_AUTO_COMPACTION'='true');

insert into test_major_minor values ('amy', 35, 12341234);
insert into test_major_minor values ('bob', 66, 1234712348712);
insert into test_major_minor values ('cal', 21, 431);
insert into test_major_minor values ('fse', 28, 8456);

--before compaction
SELECT * FROM test_major_minor;

ALTER TABLE test_major_minor  COMPACT 'major' ;

unfortunately, the full successful testing is blocked on HIVE-29566, but at least I reached the QueryCompactor codepath with your patch, which is fine and promising
however, there are some confusing things here, that need clarification before moving on:

HS2 uses a direct connection to the backend DB for Compaction

this is confusing, can you confirm if this is by design? does this mean that on hive clusters, the backend RDBMS configs have to be replicated HS2 in order to run query-based compaction?
depending on your answer, I can imagine 2 possible outcomes of this patch:

  1. keep it as is, considering that CRUD major compaction is the primary supported one (is it?)
  2. remove CRUD compaction config with and all the postgres connection configs from HS2 to simplify this patch

I'm fine with 1) if at least a comment in docker compose file can emphasize that HS2 really needs postgres connection configs (for CRUD compaction), otherwise, we cannot guarantee that confusing folks in the future (like me now :) ) won't remove this config

if we can agree on / clarify this, the patch is +1 from my side, as fixing/supporting compaction is a great improvement for our wonderful docker-based testing

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants