[routing] Add LLM-powered ClusteringEngine implementation for semantic tool grouping #153

@dgenio

Description

Context

Issue #47 defines the ClusteringEngine protocol for partitioning tools into groups. The default implementation will be JaccardClusterer, wrapping the existing tag/Jaccard-based logic from TreeBuilder. However, no LLM-powered implementation is planned — and this is where the core Pillar 3 value lies: "use an LM to better understand the relationship between tools."
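For reference, the tag-based baseline mentioned above rests on Jaccard similarity between tag sets. A minimal sketch, assuming each tool carries a set of string tags; the `jaccard` helper and the example tags are hypothetical and are not taken from TreeBuilder:

```python
def jaccard(a: set[str], b: set[str]) -> float:
    """Jaccard similarity |a & b| / |a | b|; defined here as 0.0 when both sets are empty."""
    union = a | b
    if not union:
        return 0.0
    return len(a & b) / len(union)


# Hypothetical tag sets for three tools.
email_tags = {"communication", "email"}
slack_tags = {"communication", "chat"}
sql_tags = {"data", "database"}

print(jaccard(email_tags, slack_tags))  # one shared tag out of three distinct tags
print(jaccard(email_tags, sql_tags))    # no shared tags
```

Two tools with related descriptions but disjoint tags score 0.0 under this measure, which is the gap a description-aware engine is meant to close.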

Current state

Why it matters

  • Vision centerpiece — "Use an LM to understand the relationship between tools" requires semantic understanding of tool descriptions, not just tag matching.
  • Better ChoiceGraph quality — LLM-grouped trees present more intuitive navigation to the agent: "Communication tools" vs "Data tools" vs "Admin tools" instead of arbitrary tag-based splits.
  • Scale — At 500+ tools, manual tagging breaks down. Semantic clustering scales without human curation.

Acceptance Criteria

  • LLMClusteringEngine class implementing the ClusteringEngine protocol (from #47, "[routing] Add EngineRegistry with pluggable Retriever, Reranker, and ClusteringEngine protocols")
  • Accepts an llm_fn: Callable[[str], str] parameter — no dependency on any LLM provider
  • Clusters tools by semantic similarity of descriptions using the LLM
  • Prompt template: presents all tool names + descriptions, asks the LLM to propose ~k groups with names and membership
  • Parses structured LLM output (JSON/YAML) into dict[str, list[SelectableItem]]
  • Graceful fallback: if LLM output is unparseable, falls back to JaccardClusterer
  • Deterministic for same inputs + same LLM response (no randomness in clustering logic itself)
  • Registers in EngineRegistry as the "llm" clustering engine
  • Unit tests with mock llm_fn (valid response, invalid response, empty tools, single group)
  • pyproject.toml: no new runtime dependencies (LLM is user-provided via callable)
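The prompt-template and parsing criteria above could look roughly like the sketch below. It is one possible shape, not the planned implementation: `Tool` is a hypothetical stand-in for SelectableItem (only `name` and `description` are assumed), the prompt wording is illustrative, and `ParseError`, `build_prompt`, and `parse_response` are names invented for this example. JSON is used because it needs nothing beyond the standard library.

```python
import json
from dataclasses import dataclass


@dataclass(frozen=True)
class Tool:
    """Hypothetical stand-in for SelectableItem."""
    name: str
    description: str


class ParseError(ValueError):
    """Raised when the LLM response is not a usable grouping."""


def build_prompt(items: list[Tool], k: int) -> str:
    """List every tool name and description and ask for about k named groups."""
    lines = [f"- {t.name}: {t.description}" for t in items]
    return (
        f"Group the following tools into about {k} named groups.\n"
        'Reply with JSON only, shaped as {"group name": ["tool_name", ...]}.\n'
        "Tools:\n" + "\n".join(lines)
    )


def parse_response(response: str, items: list[Tool]) -> dict[str, list[Tool]]:
    """Map the LLM's JSON reply back onto the original items, or raise ParseError."""
    by_name = {t.name: t for t in items}
    try:
        raw = json.loads(response)
    except json.JSONDecodeError as exc:
        raise ParseError(str(exc)) from exc
    if not isinstance(raw, dict):
        raise ParseError("expected a JSON object")
    groups: dict[str, list[Tool]] = {}
    seen: set[str] = set()
    for group, names in raw.items():
        if not isinstance(names, list):
            raise ParseError(f"group {group!r} is not a list")
        for name in names:
            # Reject names the LLM invented and tools placed in two groups.
            if not isinstance(name, str) or name not in by_name or name in seen:
                raise ParseError(f"unknown or duplicate tool: {name!r}")
            seen.add(name)
            groups.setdefault(group, []).append(by_name[name])
    if seen != set(by_name):
        raise ParseError("some tools were not assigned to a group")
    return groups


tools = [Tool("send_email", "Send an email"), Tool("run_sql", "Run a SQL query")]
reply = '{"Communication": ["send_email"], "Data": ["run_sql"]}'
groups = parse_response(reply, tools)
print({g: [t.name for t in ts] for g, ts in groups.items()})
```

Rejecting unknown, duplicated, or missing tool names keeps the result a true partition of the input, and because group order follows the reply, the same response always produces the same output.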

Implementation Notes

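from __future__ import annotations

from collections.abc import Callable

# ClusteringEngine and SelectableItem are the project types referenced in #47.


class ParseError(ValueError):
    """Raised when the LLM response cannot be parsed into clusters."""


class LLMClusteringEngine:
    def __init__(self, llm_fn: Callable[[str], str], *, fallback: ClusteringEngine | None = None) -> None:
        self.llm_fn = llm_fn
        self.fallback = fallback  # e.g. JaccardClusterer

    def cluster(self, items: list[SelectableItem], k: int) -> dict[str, list[SelectableItem]]:
        prompt = self._build_prompt(items, k)
        response = self.llm_fn(prompt)
        try:
            return self._parse_response(response, items)
        except ParseError:
            # Unparseable output: defer to the fallback engine if one was given.
            if self.fallback is not None:
                return self.fallback.cluster(items, k)
            raise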
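To make the fallback path concrete, here is a self-contained sketch driven by mock `llm_fn` callables, in the spirit of the unit tests listed in the acceptance criteria. Several parts are assumptions made for illustration: items are plain strings rather than SelectableItem, `SingleGroupClusterer` is a hypothetical stand-in for JaccardClusterer, and the parser only checks that the reply is a JSON object.

```python
import json
from collections.abc import Callable


class ParseError(ValueError):
    """Hypothetical error type for unusable LLM output."""


class SingleGroupClusterer:
    """Hypothetical stand-in for JaccardClusterer: puts every item in one group."""

    def cluster(self, items: list[str], k: int) -> dict[str, list[str]]:
        return {"all": list(items)}


class LLMClusteringEngine:
    def __init__(self, llm_fn: Callable[[str], str], *, fallback=None) -> None:
        self.llm_fn = llm_fn
        self.fallback = fallback

    def cluster(self, items: list[str], k: int) -> dict[str, list[str]]:
        prompt = f"Group these tools into about {k} groups as JSON: {items}"
        response = self.llm_fn(prompt)
        try:
            return self._parse_response(response)
        except ParseError:
            # No fallback configured: surface the error to the caller.
            if self.fallback is not None:
                return self.fallback.cluster(items, k)
            raise

    def _parse_response(self, response: str) -> dict[str, list[str]]:
        try:
            groups = json.loads(response)
        except json.JSONDecodeError as exc:
            raise ParseError(str(exc)) from exc
        if not isinstance(groups, dict):
            raise ParseError("expected a JSON object")
        return groups


tools = ["send_email", "run_sql"]
good = LLMClusteringEngine(lambda _: '{"Comms": ["send_email"], "Data": ["run_sql"]}')
bad = LLMClusteringEngine(lambda _: "sorry, I cannot help", fallback=SingleGroupClusterer())
strict = LLMClusteringEngine(lambda _: "sorry, I cannot help")

print(good.cluster(tools, 2))  # groups parsed from the mock reply
print(bad.cluster(tools, 2))   # result produced by the fallback engine
```

Calling `strict.cluster(tools, 2)` re-raises `ParseError`, since no fallback was supplied. Because the mock callables return fixed strings, each call is deterministic, which matches the determinism criterion.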

Files likely touched:

  • src/contextweaver/engines.py (or new src/contextweaver/extras/clustering_llm.py)
  • tests/test_engines.py

Dependencies

Metadata

Assignees

No one assigned

    Labels

    • area/routing: Routing engine: catalog, graph, router, cards
    • complexity/complex: Cross-cutting, significant design or risk
    • enhancement: New feature or request
    • priority/medium: Medium priority - production readiness
