Draft: Secondary index specification by pvary · Pull Request #16961 · apache/iceberg

pvary · 2026-06-25T16:07:15Z

No description provided.

Co-authored-by: Huaxin Gao <huaxin.gao11@gmail.com>

RussellSpitzer · 2026-06-26T19:29:44Z

+
+Indexes are optional. Engines may choose to create, maintain, consume, or ignore them.
+
+## Goals


Not sure we need the goals section here?

I think the spec normally has a Goals section (see udf-spec and view-spec). The overlap comes from the "This specification defines:" list in Background. I think Background should be the motivation and what an index is, and Goals should hold that list. So I suggest keeping Goals and removing the list from Background.

RussellSpitzer · 2026-06-26T19:32:32Z

+The index type communicates the capabilities of an index to query engines and helps determine whether an index is
+applicable to a particular query.
+
+### Index Transform Function


I think these sections (Transform, Instance, Snapshot) should follow the overview. Currently they have a lot of undefined terms in them

Agree to move these definitions to after the Overview section.

RussellSpitzer · 2026-06-26T22:24:05Z

+
+## Overview
+
+Indexes are stored as independent catalog objects.


Suggestion: let's just have a short description of the whole thing right here.

Suggested change

-Indexes are stored as independent catalog objects.
+Indexes are stored as a collection of files with some Iceberg-table-like semantics. At a high level they consist of a tracking file (similar to a root manifest file) which contains listings for a defined set of leaf files (similar to data files). Leaf files store an ordered set of rows containing at least a key, the path of an Iceberg table data file, and the position within that file of the row where that key is stored. The organization of leaf files is defined by an Indexing Transform which varies based on the type of index. This structure is recorded in an Index metadata.json file which contains a set of snapshots, each of which points to a single tracking file mapping to the complete state of an Iceberg table at a given Iceberg table snapshot.

Agree this is cleaner.
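
To make the suggested description easier to picture, here is a minimal Python sketch of the layout it describes: metadata holds snapshots, each snapshot points to one tracking file, and the tracking file lists leaf files whose rows map a key to a data file path and position. All class and field names below (`LeafRow`, `LeafFile`, `IndexSnapshot`, `IndexMetadata`, `lookup`) are hypothetical and only illustrate the shape; they are not taken from the spec draft.

```python
from dataclasses import dataclass, field
from typing import List, Tuple


@dataclass
class LeafRow:
    key: int        # indexed key value
    file_path: str  # Iceberg table data file that holds the row
    position: int   # row position inside that data file


@dataclass
class LeafFile:
    location: str
    rows: List[LeafRow] = field(default_factory=list)  # stand-in for the file contents


@dataclass
class IndexSnapshot:
    source_table_snapshot_id: int
    tracking_file: List[LeafFile]  # one tracking file, listing the leaf files


@dataclass
class IndexMetadata:
    snapshots: List[IndexSnapshot]


def lookup(snapshot: IndexSnapshot, key: int) -> List[Tuple[str, int]]:
    """Return (file_path, position) pairs for a key by scanning every leaf file."""
    return [
        (row.file_path, row.position)
        for leaf in snapshot.tracking_file
        for row in leaf.rows
        if row.key == key
    ]


leaf = LeafFile(
    "leaf-0.parquet",
    [LeafRow(1, "data/a.parquet", 0), LeafRow(7, "data/b.parquet", 3)],
)
snapshot = IndexSnapshot(source_table_snapshot_id=100, tracking_file=[leaf])
metadata = IndexMetadata(snapshots=[snapshot])
```

This sketch scans every leaf file; a real reader would presumably prune leaf files first using whatever statistics the tracking file carries.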

RussellSpitzer · 2026-06-26T22:25:40Z

+| required    | uuid                | string                   | Stable UUID assigned at creation                |
+| required    | table-uuid          | string                   | UUID of the indexed table                       |
+| required    | location            | string                   | Index root location                             |
+| required    | type                | string                   | Logical index type                              |


Are we going to have this only be chosen from a set of index types we define? Feels like we should if these are going to be interoperable. This also makes me think a bit about the "reserved" terms above. I think basically everything should be reserved unless we define it here imho.

Agree it should be a closed set for interoperability. One step back though: I think we only scoped the key-lookup index, we didn't actually agree on a SCALAR/VECTOR/TERM type yet.

RussellSpitzer · 2026-06-26T22:26:13Z

+| required    | table-uuid          | string                   | UUID of the indexed table                       |
+| required    | location            | string                   | Index root location                             |
+| required    | type                | string                   | Logical index type                              |
+| required    | transform-function  | string                   | Physical organization transform                 |


This probably needs to be well defined? An expression or something we explicitly make here?

RussellSpitzer · 2026-06-26T22:26:58Z

+| required    | transform-function  | string                   | Physical organization transform                 |
+| required    | key-column-ids      | list<int>                | Indexed columns                                 |
+| optional    | included-column-ids | list<int>                | Included columns                                |
+| required    | file-format         | string                   | Leaf file format                                |


Why do we need to define the leaf file format? Shouldn't this be done per row in the tracking file?

Agree this should be defined in the tracking file.

RussellSpitzer · 2026-06-26T22:27:36Z

+| optional    | included-column-ids | list<int>                | Included columns                                |
+| required    | file-format         | string                   | Leaf file format                                |
+| optional    | properties          | map<string,string>       | Index properties applicable for every snapshot  |
+| required    | snapshots           | list<index-snapshot>     | Known index snapshots                           |


Why "known"?

Agree "known" should be removed.

huaxingao · 2026-06-27T16:46:10Z

+
+| Type   |
+|--------|
+| SCALAR |


SCALAR is listed but never defined. I suggest adding a description column.

huaxingao · 2026-06-27T17:35:37Z

+The transform function determines the physical organization of the indexed data and therefore influences which query
+patterns can efficiently leverage the index.
+
+The following index types are reserved for future specifications:


Suggested change

-The following index types are reserved for future specifications:
+The following transform functions are defined in this specification:

huaxingao · 2026-06-27T17:41:37Z

+|-----------|
+| IDENTITY  |
+| HASH      |
+| HILBERT   |


I think we agreed the organization transform is an Iceberg-style transform with a sort order, so I think we should use the Iceberg transform names: use bucket instead of hash.

I think for now the key-lookup index only needs identity and bucket, so we should move hilbert to the reserved table below.

Also add a sentence somewhere to say that tuple transforms like (bucket(key, 256), key) (bucket first, then sort) are also supported.
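
As a rough illustration of the tuple transform `(bucket(key, 256), key)`, the sketch below orders keys by bucket first and by key within each bucket. Note the assumption: the `bucket` helper here uses CRC32 as a stand-in hash purely to keep the example short; it is not the Murmur3-based hash that Iceberg's bucket transform specifies, so the bucket numbers will differ from real Iceberg output. The function names are hypothetical.

```python
import zlib
from typing import List, Tuple


def bucket(key: int, num_buckets: int) -> int:
    # Stand-in hash (CRC32 over the 8-byte little-endian key).
    # NOT the Murmur3 hash used by Iceberg's bucket transform.
    digest = zlib.crc32(key.to_bytes(8, "little", signed=True))
    return digest % num_buckets


def tuple_transform(key: int, num_buckets: int = 256) -> Tuple[int, int]:
    # (bucket(key, N), key): bucket first, then the key itself.
    return (bucket(key, num_buckets), key)


def order_rows(keys: List[int], num_buckets: int = 256) -> List[int]:
    # Sort by bucket, then by key inside each bucket.
    return sorted(keys, key=lambda k: tuple_transform(k, num_buckets))


ordered = order_rows([42, 7, 1000, -3, 7], num_buckets=4)
```

With this ordering, all rows of one bucket are contiguous and sorted by key, which is what lets a lookup go to one bucket and then search by key.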

huaxingao · 2026-06-27T17:52:10Z

+- The transform function
+- The indexed columns
+- The included columns
+- Index properties


Shall we mark "The included columns" and "Index properties" as optional?

huaxingao · 2026-06-27T17:59:03Z

+```text
+Index Metadata
+    |
+    +-- Index Snapshot


+-- Index Snapshot (one or more)?

huaxingao · 2026-06-27T18:13:23Z

+| optional    | included-column-ids | list<int>                | Included columns                                |
+| required    | file-format         | string                   | Leaf file format                                |
+| optional    | properties          | map<string,string>       | Index properties applicable for every snapshot  |
+| required    | snapshots           | list<index-snapshot>     | Known index snapshots                           |


The metadata has a snapshots list but nothing says which one is current. Should we add a current-snapshot-id, or define current as the snapshot whose source-table-snapshot-id matches the table's current snapshot?
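
The two options in this question can be sketched side by side. This is only an illustration under assumptions: `current-snapshot-id` and `source-table-snapshot-id` are the names used in the comment above, while the per-snapshot `snapshot-id` field and both function names are hypothetical.

```python
from typing import Optional


def current_by_pointer(metadata: dict) -> Optional[dict]:
    # Option 1: an explicit current-snapshot-id field in the index metadata.
    wanted = metadata.get("current-snapshot-id")
    for snap in metadata["snapshots"]:
        if snap["snapshot-id"] == wanted:
            return snap
    return None


def current_by_table_snapshot(metadata: dict, table_snapshot_id: int) -> Optional[dict]:
    # Option 2: "current" is whichever index snapshot was built for the
    # table's current snapshot.
    for snap in metadata["snapshots"]:
        if snap["source-table-snapshot-id"] == table_snapshot_id:
            return snap
    return None


index_metadata = {
    "current-snapshot-id": 2,
    "snapshots": [
        {"snapshot-id": 1, "source-table-snapshot-id": 10},
        {"snapshot-id": 2, "source-table-snapshot-id": 20},
    ],
}
```

One difference the sketch makes visible: under option 2, if the table has moved on to a snapshot the index has not been built for, the lookup returns nothing, whereas option 1 still returns the latest index snapshot and leaves the staleness check to the reader.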

huaxingao · 2026-06-27T18:15:23Z

+| 103      | record_count       | long    | required     | Number of records contained in the referenced file or aggregated under the referenced tracking file.         |
+| 104      | file_size_in_bytes | long    | required     | Total file size in bytes.                                                                                    |
+| 146      | content_stats      | struct  | optional     | Statistics used for planning and pruning, including transform-key statistics and optional column statistics. |
+| 131      | key_metadata       | binary  | optional     | Implementation-specific key metadata, used for leaf file encryption.                                         |


key_metadata -> key-metadata?

huaxingao · 2026-06-27T18:17:42Z

+
+The tracking file may be stored using any supported metadata file format.
+
+### Tracking File Entry


remove this

huaxingao · 2026-06-27T18:22:49Z

+| 101      | file_format        | string  | required     | File format name, such as parquet, avro, or orc.                                                             |
+| 103      | record_count       | long    | required     | Number of records contained in the referenced file or aggregated under the referenced tracking file.         |
+| 104      | file_size_in_bytes | long    | required     | Total file size in bytes.                                                                                    |
+| 146      | content_stats      | struct  | optional     | Statistics used for planning and pruning, including transform-key statistics and optional column statistics. |


Does content_stats contain the transform bounds (transform_min / transform_max)? If so, I think we should make them explicit, required fields. They're needed for routing and non-overlapping ranges, but content_stats is marked optional here, so the bounds could be missing.
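
A small sketch of why missing bounds matter for routing. It assumes each tracking file entry is a dict with `location` and the `transform_min` / `transform_max` names from the comment above; the function names are hypothetical. An entry without bounds cannot be pruned, so it always has to be read.

```python
from typing import List


def candidate_leaf_files(entries: List[dict], value: int) -> List[str]:
    """Return the leaf files that may contain the given transform value."""
    hits = []
    for entry in entries:
        lo = entry.get("transform_min")
        hi = entry.get("transform_max")
        if lo is None or hi is None:
            hits.append(entry["location"])  # no bounds: cannot prune
        elif lo <= value <= hi:
            hits.append(entry["location"])
    return hits


def ranges_overlap(entries: List[dict]) -> bool:
    """Check whether any two bounded entries cover overlapping ranges."""
    bounded = [e for e in entries if e.get("transform_min") is not None
               and e.get("transform_max") is not None]
    ordered = sorted(bounded, key=lambda e: e["transform_min"])
    return any(a["transform_max"] >= b["transform_min"]
               for a, b in zip(ordered, ordered[1:]))


entries = [
    {"location": "leaf-0", "transform_min": 0, "transform_max": 99},
    {"location": "leaf-1", "transform_min": 100, "transform_max": 199},
    {"location": "leaf-2"},  # bounds missing because content_stats is optional
]
```

Here a lookup for value 150 has to read both `leaf-1` and `leaf-2`, even though `leaf-2` may not contain it.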

huaxingao · 2026-06-27T21:31:14Z

+|----------|--------------------|---------|--------------|--------------------------------------------------------------------------------------------------------------|
+| 100      | location           | string  | required     | Location of the referenced file.                                                                             |
+| 101      | file_format        | string  | required     | File format name, such as parquet, avro, or orc.                                                             |
+| 103      | record_count       | long    | required     | Number of records contained in the referenced file or aggregated under the referenced tracking file.         |


Remove "or aggregated under the referenced tracking file"?

huaxingao · 2026-06-27T21:46:43Z

+
+The schema of a leaf file is determined by the index definition and contains:
+- All key columns defined by the index
+- All included columns defined by the index


Since this is optional, maybe word it as "Any included columns defined by the index" to make clear it can be empty?

huaxingao · 2026-06-27T21:49:53Z

+The schema of a leaf file is determined by the index definition and contains:
+- All key columns defined by the index
+- All included columns defined by the index
+- The transform value produced by the transform function


For an identity transform on the key, the transform value equals the key column. Do we still want to save the transform value?

huaxingao · 2026-06-27T22:03:47Z

+
+The following index types are reserved for future specifications:
+
+| Transform |


The Leaf Files Transform functions section also has this table and the reserved table below it. Should we remove the tables here, or remove them from the Leaf Files section, so the list lives in only one place?

huaxingao · 2026-06-27T22:19:27Z

+|-----------|-----------------|--------|------------------------------------------------------------------------|
+| TBD       | transform_value | long   | The result of applying the index transform function to the key columns |
+| TBD       | file_path       | string | The path of the source data file the entry references                  |
+| TBD       | position        | long   | The row position of the entry within the source data file              |


file_path and position are basically Iceberg's reserved _file (2147483646) and _pos (2147483645). Should we reuse those reserved IDs and give transform_value another reserved ID?
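
For reference, the two reserved IDs mentioned above sit just below the 32-bit integer maximum. The sketch below only restates those values; the constant names and the `LEAF_REFERENCE_FIELDS` list are hypothetical, and the ID for `transform_value` is left out because it is still TBD in the draft.

```python
INT32_MAX = 2**31 - 1  # 2147483647

FILE_PATH_FIELD_ID = INT32_MAX - 1     # 2147483646, Iceberg's reserved _file
ROW_POSITION_FIELD_ID = INT32_MAX - 2  # 2147483645, Iceberg's reserved _pos

# Hypothetical leaf file columns that would reuse the reserved IDs.
LEAF_REFERENCE_FIELDS = [
    {"id": FILE_PATH_FIELD_ID, "name": "file_path", "type": "string"},
    {"id": ROW_POSITION_FIELD_ID, "name": "position", "type": "long"},
]
```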

huaxingao · 2026-06-27T22:22:55Z

+
+| Field Id  | Column          | Type   | Description                                                            |
+|-----------|-----------------|--------|------------------------------------------------------------------------|
+| TBD       | transform_value | long   | The result of applying the index transform function to the key columns |


The type is not always long. Maybe change it to "determined by the transform function"?
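
A tiny sketch of what "determined by the transform function" could mean for the two transforms discussed in this review. It relies on the Iceberg table spec's rule that an identity transform keeps the source type and a bucket transform produces an int; the function name and the string encoding of transforms and types are hypothetical.

```python
def transform_value_type(transform: str, source_type: str) -> str:
    """Return the type of transform_value for a given transform and key type."""
    if transform == "identity":
        return source_type  # identity keeps the key column's type
    if transform.startswith("bucket"):
        return "int"        # bucket numbers are ints regardless of key type
    raise ValueError(f"unsupported transform: {transform}")
```

For example, an identity transform on a string key yields a string transform value, so a fixed `long` column type would not fit.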

huaxingao · 2026-06-27T22:25:46Z

+Transform Function:
+
+```text
+HASH(primary_key)


change to bucket(primary_key, N)?

huaxingao · 2026-06-27T22:26:45Z

+| file_path        |
+| position         |
+
+The leaf files are organized by hash key, while the tracking file stores summary information and pruning statistics.


The leaf files are organized by hash key -> The leaf files are organized by transform value?

huaxingao · 2026-06-27T23:47:06Z

Since @pvary is out, I'll make the simple/mechanical changes now to keep this PR moving forward and leave the design decisions for him to review when he's back.

Co-authored-by: pvary <peter.vary.apache@gmail.com>

github-actions bot added the Specification label (issues that may introduce spec changes) on Jun 25, 2026

Index.md

4151eb9

pvary force-pushed the index_spec branch from cc0b7e4 to 4151eb9 on June 25, 2026, 16:08

Add co-author

23ba6ab

Co-authored-by: Huaxin Gao <huaxin.gao11@gmail.com>

RussellSpitzer reviewed Jun 26, 2026


huaxingao reviewed Jun 27, 2026


Spec: Address review comments on the index spec

ed3c525

Co-authored-by: pvary <peter.vary.apache@gmail.com>


3 participants