feat(hive): vendor Hive 3.1 metastore + fb303 Thrift IDL#694

Open
MisterRaindrop wants to merge 1 commit into apache:main from MisterRaindrop:chore/hive-vendor-idl

Conversation

@MisterRaindrop

Contributor

Vendor the Apache Hive 3.1 standalone-metastore IDL and the fb303 helper IDL it includes into third_party/hive_metastore/. These files are the input for the C++ HMS client bindings, generated by a follow-up commit that invokes thrift --gen cpp at build time.

Provenance:

  • hive_metastore.thrift - apache/hive @ branch-3.1, standalone-metastore
  • share/fb303/if/fb303.thrift - apache/thrift @ master, contrib/fb303

Both upstream files retain their Apache 2.0 license headers; only trailing whitespace and final newlines were normalized by the repository's pre-commit hooks. third_party/hive_metastore/NOTICE records the upstream sources, and the project root NOTICE references it. .github/.licenserc.yaml gains third_party/** to paths-ignore so the license-eye check skips the vendored tree.

@wgtmac wgtmac left a comment

Member

Thanks for importing the Hive-related files. I've left some minor comments. BTW, this change is not a chore. Perhaps rename the title to feat(hive): vendor Hive 3.1 metastore + fb303 Thrift IDL

Comment thread .github/.licenserc.yaml Outdated
- 'requirements.txt'
- 'src/iceberg/util/murmurhash3_internal.*'
- 'src/iceberg/test/resources/**'
- 'third_party/**'

Member

Should we rename third_party to thirdparty, which is more widely used?

Comment thread third_party/hive_metastore/NOTICE Outdated
================

* hive_metastore.thrift
Apache Hive 3.1 standalone-metastore.

Member

Why this specific version? How do we want to upgrade or maintain multiple versions in the future?

Contributor Author

Good question. I selected Hive 3.1.3 because it is a mature and widely deployed HMS
version. I also considered the precedent from iceberg-rust, which maintains a single
client generated from a Hive 2.3 IDL and integration-tests it against Hive 3.1.3.

For future maintenance, I propose keeping a single vendored IDL pinned to an
immutable Hive release tag or commit, rather than maintaining separate generated
clients for each Hive version. The implementation should use RPCs shared across the
supported versions whenever possible.

If future Hive releases introduce incompatible RPC changes, we can add narrowly
scoped runtime adapters or fallback logic, allowing one build of iceberg-cpp to
support multiple Hive versions. We should validate and document the supported
versions through a CI compatibility matrix before claiming compatibility.
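
A minimal sketch of the fallback idea, assuming hypothetical names (`UnknownMethodError`, `TableInfo`, `TableFetcher`) that are not part of this PR or of the generated Thrift bindings. It tries the newer RPC first and drops to the older one when the server reports that it does not know the method. With a real Thrift client the failure would most likely surface as a `TApplicationException` with type `UNKNOWN_METHOD`, but that mapping is an assumption here.

```cpp
#include <functional>
#include <stdexcept>
#include <string>
#include <utility>

// Hypothetical error type: the server does not implement the requested RPC.
struct UnknownMethodError : std::runtime_error {
  using std::runtime_error::runtime_error;
};

// Hypothetical stand-in for the result of a table lookup.
struct TableInfo {
  std::string name;
  std::string fetched_via;  // which RPC variant produced the result
};

// Hypothetical adapter that wraps two variants of the same logical call.
class TableFetcher {
 public:
  using Rpc =
      std::function<TableInfo(const std::string& db, const std::string& table)>;

  TableFetcher(Rpc newer, Rpc older)
      : newer_(std::move(newer)), older_(std::move(older)) {}

  TableInfo Get(const std::string& db, const std::string& table) {
    if (use_newer_) {
      try {
        return newer_(db, table);
      } catch (const UnknownMethodError&) {
        // Remember the result of the probe so later calls skip the newer RPC.
        use_newer_ = false;
      }
    }
    return older_(db, table);
  }

  bool uses_newer_rpc() const { return use_newer_; }

 private:
  Rpc newer_;
  Rpc older_;
  bool use_newer_ = true;
};
```

Keeping the probe result per fetcher means an older server pays for the failed call only once. Any error other than the unknown-method case propagates to the caller unchanged, so the fallback stays narrowly scoped.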

Comment thread NOTICE Outdated
This product includes software developed at
The Apache Software Foundation (http://www.apache.org/).

Third-party Thrift IDLs vendored under third_party/hive_metastore/ are

Member

IIUC, ASF projects are exempted here. cc expert @jbonofre

Comment thread third_party/hive_metastore/NOTICE Outdated

* hive_metastore.thrift
Apache Hive 3.1 standalone-metastore.
Source: https://github.com/apache/hive/blob/branch-3.1/standalone-metastore/src/main/thrift/hive_metastore.thrift

Contributor

Should we also pin the exact upstream commit SHAs (or tags) for the vendored files in third_party/hive_metastore/NOTICE or a README, so that provenance is reproducible and future updates are deterministic?

@MisterRaindrop force-pushed the chore/hive-vendor-idl branch from 291b25f to 3d67f5d on June 8, 2026 07:52
@MisterRaindrop changed the title from "chore(hive): vendor Hive 3.1 metastore + fb303 Thrift IDL" to "feat(hive): vendor Hive 3.1 metastore + fb303 Thrift IDL" on Jun 8, 2026
Vendor the Apache Hive standalone-metastore IDL and the fb303 helper
IDL it includes into thirdparty/hive_metastore/. These files are the
input for the C++ HMS client bindings, generated by a follow-up commit
that invokes `thrift --gen cpp` at build time.

Provenance is pinned to immutable upstream tags and commit SHAs so it
is reproducible and future updates are deterministic:
* hive_metastore.thrift       - apache/hive rel/release-3.1.3
                                @ 04c1b307d1bbd1ae268ad47dc36ca4f50c6d9cd8
* share/fb303/if/fb303.thrift - apache/thrift v0.14.0
                                @ 8411e189b0af09e5baad34031555870cf692c1ad

Both upstream files retain their original Apache 2.0 license headers;
only trailing whitespace and final newlines were normalized by the
repository's pre-commit hooks. thirdparty/hive_metastore/README.md
records the pinned sources. The vendored tree consists of other ASF
projects' files, so no NOTICE entry is required (ASF projects are
exempt); .github/.licenserc.yaml adds thirdparty/** to paths-ignore so
the license-eye check skips it.

Part of the iceberg-cpp HiveCatalog port that follows iceberg-rust's
iceberg-catalog-hms crate as a blueprint.
@MisterRaindrop force-pushed the chore/hive-vendor-idl branch from 3d67f5d to d9fc3f8 on June 8, 2026 07:59

3 participants