Add mlx-stack integrate command for AI agent harness onboarding #49

@weklund-agent

Summary

Users who already have an AI agent harness installed (Hermes Agent, Claude Code, Cursor, etc.) currently have to manually edit config files to point at mlx-stack's endpoint. This is error-prone and requires knowing the config schemas of both tools. A single command like `mlx-stack integrate hermes` would eliminate this friction.

Motivation

Setting up Hermes Agent to use mlx-stack today involves several manual steps:

  1. Editing ~/.hermes/config.yaml to set provider: "custom", base_url: "http://localhost:4000/v1", and model: "standard"
  2. Setting context_length explicitly (auto-detection probes from 2M down, wasting time)
  3. Routing auxiliary tasks (compression, vision) to the fast tier via separate config keys
  4. Knowing which mlx-stack tier names map to which harness concepts
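Taken together, the manual edits above amount to roughly the following in `~/.hermes/config.yaml` (a sketch only; key names follow the steps and example output in this issue and may not match Hermes's actual schema exactly):

```yaml
model:
  provider: "custom"
  base_url: "http://localhost:4000/v1"
  default: "standard"
  context_length: 32768   # set explicitly to skip the 2M-down auto-detection probe
compression:
  summary_provider: "main"
  summary_model: "fast"   # route auxiliary summarization to the fast tier
```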

This is a common onboarding path — users discover mlx-stack because they want local inference for an agent they already use. Making this a one-command experience would significantly reduce setup friction.

Proposed UX

# Auto-detect and configure a known harness
mlx-stack integrate hermes

# List supported harnesses
mlx-stack integrate --list

# Dry-run: show what would change without writing
mlx-stack integrate hermes --dry-run

# Undo: restore the harness config to its previous state
mlx-stack integrate hermes --remove

Example output

$ mlx-stack integrate hermes

  Detected Hermes Agent v0.7.0 at ~/.hermes/

  Applying configuration:
    model.provider:       custom
    model.base_url:       http://localhost:4000/v1
    model.default:        standard (Qwen3-Coder-30B-A3B)
    model.context_length: 32768
    compression.summary_model:    fast (Qwen3-8B)
    compression.summary_provider: main

  ✓ Updated ~/.hermes/config.yaml
  ✓ Backed up original to ~/.hermes/config.yaml.bak

  Run `hermes` to start using local models.

Suggested initial harness support

| Harness | Config Location | Key Mappings |
| --- | --- | --- |
| Hermes Agent | `~/.hermes/config.yaml` | `model.default` → standard tier, `compression` → fast tier |
| Claude Code | `~/.claude/settings.json` | API proxy / MCP server config |
| Cursor | `.cursor/mcp.json` or settings | model endpoint override |
| Continue.dev | `~/.continue/config.json` | `models[].apiBase` |
| Open Interpreter | env vars | `OPENAI_BASE_URL` + `OPENAI_API_KEY` |
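Internally, the table above could be captured as a small registry that also backs `--list` (illustrative Python; harness names and paths come from the table, the structure is hypothetical):

```python
from pathlib import Path

# Map each supported harness to where its config lives.
# Paths come from the support table; None means "configured via env vars".
HARNESS_CONFIGS = {
    "hermes": Path.home() / ".hermes" / "config.yaml",
    "claude-code": Path.home() / ".claude" / "settings.json",
    "cursor": Path(".cursor") / "mcp.json",
    "continue": Path.home() / ".continue" / "config.json",
    "open-interpreter": None,  # uses OPENAI_BASE_URL / OPENAI_API_KEY env vars
}

def supported_harnesses() -> list[str]:
    """Names printed by `mlx-stack integrate --list`."""
    return sorted(HARNESS_CONFIGS)
```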

Each integration would need to:

  • Detect if the harness is installed (check config dir / binary)
  • Read the current stack tiers and map them to harness-specific roles
  • Back up the existing config before modifying
  • Support --remove to cleanly revert
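A minimal sketch of that flow for a file-based harness, assuming the backup convention from the example output above (function names and the flat renderer are hypothetical):

```python
import shutil
from pathlib import Path

def render(settings: dict) -> str:
    # Flat "key: value" lines; a real implementation would parse and
    # merge the harness's existing config rather than overwrite it.
    return "".join(f"{key}: {value}\n" for key, value in settings.items())

def integrate(config_path: Path, settings: dict, dry_run: bool = False) -> dict:
    """Apply mlx-stack settings to a harness config, backing it up first."""
    if not config_path.exists():
        # Detection: refuse to run if the harness isn't installed.
        raise FileNotFoundError(f"harness config not found at {config_path}")
    if not dry_run:
        backup = config_path.with_suffix(config_path.suffix + ".bak")
        shutil.copy2(config_path, backup)        # back up before modifying
        config_path.write_text(render(settings))
    return settings                              # what --dry-run would print

def remove(config_path: Path) -> None:
    """--remove: cleanly revert by restoring the backup."""
    backup = config_path.with_suffix(config_path.suffix + ".bak")
    backup.replace(config_path)                  # atomic restore
```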

Additional considerations

  • Tier-to-role mapping: Different harnesses have different concepts (Hermes has main + auxiliary, Claude Code has a single model). The integration should map tiers intelligently based on the harness's capabilities.
  • Stack changes: If a user runs mlx-stack setup --add/--remove after integrating, the harness config could become stale. Consider a post-setup hook or a warning.
  • Plugin system: If the number of supported harnesses grows, this could be plugin-based — each harness integration is a small module that knows how to read/write that harness's config.
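If the plugin route is taken, each harness module could implement one small shared interface and self-register (a hypothetical sketch, not an existing mlx-stack API):

```python
from abc import ABC, abstractmethod

class HarnessIntegration(ABC):
    """One subclass per supported harness; looked up by name."""

    name: str  # e.g. "hermes", "cursor"

    @abstractmethod
    def detect(self) -> bool:
        """Return True if this harness is installed (config dir / binary)."""

    @abstractmethod
    def apply(self, tiers: dict) -> None:
        """Map stack tiers to this harness's roles and write its config."""

    @abstractmethod
    def remove(self) -> None:
        """Restore the harness config to its pre-integration state."""

REGISTRY: dict[str, type[HarnessIntegration]] = {}

def register(cls: type[HarnessIntegration]) -> type[HarnessIntegration]:
    """Class decorator so each plugin module registers itself on import."""
    REGISTRY[cls.name] = cls
    return cls
```

With this shape, `mlx-stack integrate --list` is just the registry's keys, and adding a harness means adding one module.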
