Bug: multi-index search drops index provenance (hits become indistinguishable / id collisions) #38

@kapral18

Problem

When querying multiple indices in Elasticsearch (e.g. index: kibana,elasticsearch), each hit carries an _index field naming the index it came from.

This MCP server currently drops that provenance when mapping hits into tool responses. For example, semantic_code_search returns only { id, score, type, language, kind, content, locations }, with no index field.

That makes multi-index results indistinguishable to the caller and also creates a correctness hazard:

  • _id values are only unique per index, not globally. Two different indices can return the same _id, but this MCP treats id as globally unique.
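
To make the collision concrete, here is a minimal sketch. The sample hits and the _id value "abc123" are made up for illustration; only the _index/_id/_score field names come from Elasticsearch. It shows one hit being silently lost when results are keyed by _id alone, and both surviving when keyed by (index, id):

```typescript
// Minimal shape of an Elasticsearch hit; only the fields relevant here.
interface Hit {
  _index: string;
  _id: string;
  _score: number;
}

// Hypothetical sample data: two indices that happen to share an _id.
const hits: Hit[] = [
  { _index: "kibana", _id: "abc123", _score: 1.2 },
  { _index: "elasticsearch", _id: "abc123", _score: 0.9 },
];

// Keying by _id alone: the second hit overwrites the first.
const byId = new Map<string, Hit>();
for (const hit of hits) {
  byId.set(hit._id, hit);
}

// Keying by (index, id): both hits survive.
// JSON.stringify of the pair gives an unambiguous string key.
const byIndexAndId = new Map<string, Hit>();
for (const hit of hits) {
  byIndexAndId.set(JSON.stringify([hit._index, hit._id]), hit);
}

console.log(byId.size); // 1
console.log(byIndexAndId.size); // 2
```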

Impact

  • Users cannot tell which repo/index a hit came from.
  • Follow-up operations that rely on a single id can be ambiguous or wrong if _id collisions occur across indices.

Code pointers

  • src/mcp_server/tools/semantic_code_search.ts: hit._id is used but hit._index is not exposed.
  • Other tools also key results purely by filePath (which can also collide across repos/indices).
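
A rough sketch of what carrying _index through the hit mapping could look like. This is not the actual code in semantic_code_search.ts: the SearchHit, CodeChunk, and SearchResult types and the toSearchResult helper are hypothetical names, and the source fields are a subset of the response shape quoted above.

```typescript
// Subset of an Elasticsearch search hit (assumed shape for this sketch).
interface SearchHit<T> {
  _index: string;
  _id: string;
  _score: number | null;
  _source?: T;
}

// Hypothetical source document; field names follow the response shape in this issue.
interface CodeChunk {
  type: string;
  language: string;
  kind: string;
  content: string;
}

// Tool response entry with provenance added.
interface SearchResult extends CodeChunk {
  index: string; // the index the hit came from
  id: string;
  score: number | null;
}

// Map one hit into a response entry, keeping hit._index instead of dropping it.
function toSearchResult(hit: SearchHit<CodeChunk>): SearchResult | undefined {
  if (!hit._source) return undefined;
  // Spread the source first so index/id/score cannot be shadowed by source fields.
  return { ...hit._source, index: hit._index, id: hit._id, score: hit._score };
}
```

With a mapping like this, callers can show which repo/index each hit belongs to and pass (index, id) to follow-up reads.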

Expected

  • Include index provenance in responses (e.g. index or sourceIndex per hit), and ensure downstream joins/reads can disambiguate using (index, id).
  • If multi-index is not supported, fail fast with a clear error instead of returning blended/untraceable results.
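
For the fail-fast option, a guard along these lines might be enough. This is only a sketch: assertSingleIndex is a hypothetical helper, and it assumes the tool receives the index as a single string that may contain a comma-separated list or wildcard pattern.

```typescript
// Reject index expressions that can resolve to more than one index.
// Caveat: an alias can also point at several indices; detecting that
// would require asking the cluster, which this sketch does not do.
function assertSingleIndex(indexExpression: string): string {
  const names = indexExpression
    .split(",")
    .map((name) => name.trim())
    .filter((name) => name.length > 0);

  const [only, ...rest] = names;
  if (only === undefined) {
    throw new Error("No index name provided.");
  }
  if (rest.length > 0 || only.includes("*")) {
    throw new Error(
      `Multi-index search is not supported (got "${indexExpression}"). ` +
        "Query one index at a time."
    );
  }
  return only;
}

console.log(assertSingleIndex("kibana")); // kibana
```

Calling assertSingleIndex("kibana,elasticsearch") throws instead of returning blended results, so the caller learns about the limitation up front.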
