Skip to content

Filter out Elasticsearch metadata columns from result sets by default #45

@fupelaqu

Description

@fupelaqu

Description

Query results currently include Elasticsearch internal metadata fields (_id, _index, _score) appended to every row alongside business columns. This is confusing for end users and pollutes result sets in downstream tools like Superset, DuckDB, and JDBC clients.

Example

SELECT * FROM ecommerce LIMIT 1

Current result:

order_id customer_name ... _id _index _score
ORD-001 Alice Martin ... 1 ecommerce 1.0

Expected Behaviour

Default: exclude metadata columns

SELECT * FROM ecommerce LIMIT 1
order_id customer_name ...
ORD-001 Alice Martin ...

Explicit access: allow when requested

SELECT _id, _score, customer_name FROM ecommerce LIMIT 1
_id _score customer_name
1 1.0 Alice Martin

Metadata Fields to Filter

Field Description
_id Document ID
_index Index name
_score Relevance score (from full-text search)

Impact

  • Analytics tools (Superset, DuckDB) display irrelevant metadata columns
  • Arrow Flight SQL and JDBC clients receive unexpected columns in the schema
  • SELECT * returns a different column count than expected from the index mapping

Metadata

Metadata

Assignees

No one assigned

    Labels

    enhancementNew feature or request

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions