Production-grade AI Agent framework, rewritten in Rust for speed and reliability
Inspired by Nous Research's Hermes Agent — built from the ground up in Rust
Install · Why Hakimi · Capabilities · Architecture · Compare · 中文
Linux / macOS:
curl -sSL https://raw.githubusercontent.com/Mouseww/hakimi-agent/main/install.sh | bashWindows (PowerShell):
irm https://raw.githubusercontent.com/Mouseww/hakimi-agent/main/install.ps1 | iexBuild from source (any platform with Rust):
cargo install hakimi-agentQuick setup:
hakimi setup # guided configuration wizard
hakimi doctor # diagnose setup and connectivity
hakimi --serve # start the embedded WebUI/API on 127.0.0.1:3005v0.3.282 — 智能上下文压缩优化 (Smart Context Compression Tuning):
- 🗜️ 更早触发压缩:压缩阈值从 70% 降低到 60%,避免接近 token 限制时才压缩
- 📉 更激进的压缩策略:
- Tier 1 (Drop Tool Results): 保留 50% 最近消息(原 60%)
- Tier 2 (Summarize Old Turns): 保留 30% 最近消息(原 40%)
- Tier 3 (Sliding Window): 保留最近 10 条消息(不变)
- 🎯 优化压缩梯度:
- Tier 1 触发: 60-75% 占用(原 70-85%)
- Tier 2 触发: 75-90% 占用(原 85-95%)
- Tier 3 触发: >90% 占用(原 >95%)
- ✅ 效果:显著降低 context overflow 概率,为长对话腾出更多空间
v0.3.277 — WebUI 流式输出优化 + 工具调用显示完善 (Streaming & Tool Display Polish):
- 🚀 真正的实时流式输出:流式阶段使用纯文本显示(
textContent),消除 Markdown 重复渲染导致的闪烁 - 🛠️ 工具调用完全隐藏:预判
incompleteLine中的工具调用前缀(hakimi_tool:、表情符号),永不显示在消息中 - 🎨 完成后格式化:流式结束时才渲染 Markdown,既快速又美观
- 📊 性能提升:减少 DOM 操作 90%+,流式输出更流畅
- 🔧 底部状态栏:工具调用状态仅显示在底部状态栏(紧凑单行,最多 3 条)
- ✅ 用户体验提升:无闪烁、无"工具调用行出现后消失"现象
v0.3.276 — WebUI Gateway 控制面板完整实现 (Gateway Control Panel Complete):
- 🎨 完整前端界面:WebUI 新增"网关"标签页,提供完整的 Gateway 可视化管理
- 📊 实时状态监控:显示网关运行状态、已连接平台(Telegram、ClawBot)和连接详情
- ⚙️ 配置实时修改:可在 WebUI 中直接修改网关配置(忙碌模式 queue/interrupt)
- 🔄 一键重启:提供安全的网关重启功能(通过 systemd),自动重连
- ✅ 功能验证完成:Gateway 面板已完整实现,可在浏览器中管理所有网关功能
v0.3.275 — 消息处理逻辑统一重构 (Message Processing Unification):
- 🔄 代码重构:提取
process_gateway_messages_loop()函数(1229 行),消除重复代码 - 🎯 统一逻辑:分离模式和统一模式共享相同的消息处理逻辑
- 📉 代码简化:entry.rs 从 10580 行减少到 10510 行,消除 1200+ 行重复代码
- ✅ 功能完整:所有命令、Agent 调用、流式响应、工具调用状态更新全部保留
- 🛡️ 可维护性提升:未来修复和功能添加只需修改一处代码
v0.3.274 — 统一模式成为默认行为 (Unified Mode by Default):
- 🚀 默认统一模式:
hakimi命令现在默认启动统一模式(WebUI + Gateway),无需额外参数 - 🔄 重启功能实现:WebUI Gateway 面板的重启按钮现在可用(通过 systemd 重启)
- ⚙️ 服务配置更新:systemd 服务配置简化为
hakimi --addr 127.0.0.1:3005(自动启用统一模式) - 📦 向后兼容:
--serve和--gateway参数仍可用于显式控制模式 - 🎯 用户体验提升:新用户无需理解复杂的模式选项,直接运行
hakimi即可获得完整功能 - 🔧 部署简化:单个 systemd 服务
hakimi.service替代原hakimi-unified.service
v0.3.273 — Phase 4: WebUI Gateway 管理界面 (Gateway Management UI):
- 🎨 完整前端界面:新增 Gateway 标签页,提供可视化管理面板
- 📊 状态监控面板:实时显示 Gateway 运行状态、已连接平台、消息计数
- ⚙️ 配置编辑器:可视化编辑繁忙模式、访问控制、用户白名单、过滤选项
- 💾 一键保存:配置修改自动持久化到
~/.hakimi/config.yaml - 🔄 重启控制:WebUI 内一键重启 Gateway(统一模式)
- 🎯 实时刷新:手动刷新按钮获取最新 Gateway 状态
- 🖼️ 深色主题:UI 样式与现有 WebUI 风格完全一致(268 行组件 + 300 行 CSS)
- ✅ TypeScript 全栈:前后端类型安全,完整的 API 类型定义
- 📦 生产就绪:前端已构建并集成到 WebUI,立即可用
v0.3.272 — Phase 3C: WebUI Gateway 配置 API (Gateway Control Backend):
- 🎯 Gateway 配置 API:实现 4 个 REST 端点供 WebUI 控制 Gateway
- 📡 状态监控:
GET /api/gateway/status获取运行状态、平台连接、消息计数 - ⚙️ 配置读写:
GET /api/gateway/config和PATCH /api/gateway/config管理配置 - 🔄 重启控制:
POST /api/gateway/restart重启 Gateway(仅统一模式) - 🔒 认证保护:所有 Gateway API 端点需要 Bearer Token 认证
- 💾 配置持久化:PATCH 更新自动保存到
~/.hakimi/config.yaml - ✅ 测试验证:所有端点通过 curl 测试(分离模式 + 统一模式)
- 🏗️ 下一步 Phase 4:WebUI 前端界面(连接状态监控、配置编辑器、日志查看)
v0.3.271 — Critical Fix: 修复 Gateway 幽灵任务 Bug (Phantom Task Fix):
- 🐛 修复幽灵任务:修复
active_tasks清理逻辑导致的"正在处理之前的消息"错误提示 - 🔧 根本原因:任务完成时的
task_id匹配条件在并发场景下可能失败,导致记录永久残留 - 🎯 修复方案:重写清理逻辑,嵌套
if代替let chains确保兼容性和正确性 - ✅ 影响范围:所有使用 Gateway 的用户(Telegram Bot、Discord Bot 等)
- 🔒 稳定性提升:消除了导致 Gateway 假死的关键 bug
v0.3.270 — Phase 2: Gateway 完整集成到统一服务器 (Full Gateway Integration):
- 🎯 Gateway 完整集成:
hakimi --serve --gateway start现在完全整合 Gateway 和 WebUI - 🔄 共享资源:Agent、SessionDB、Config 在 WebUI 和 Gateway 间共享(
Arc<Mutex<T>>) - 🚀 并行运行:Gateway 消息循环和 WebUI HTTP 服务器在独立 tokio 任务中运行
- 📡 平台连接:统一模式下 Gateway 自动连接所有配置的平台(Telegram、ClawBot 等)
- 🔒 进程独占锁:Gateway lock 文件防止多实例冲突
- 📤 出站消息处理:后台任务自动处理工具调用的出站消息队列
- ⚙️ 基础架构就绪:为 Phase 3(Gateway 控制 API)铺路
- 🏗️ TODO Phase 3:Gateway 消息处理循环(目前为占位符,接收但不处理消息)
v0.3.269 — Phase 1: 统一服务器架构 (Unified Server Architecture):
- 🏗️ 统一模式:
hakimi --serve --gateway start启动合并进程(WebUI + Gateway) - 🎯 向后兼容:
--serve和--gateway单独使用时保持原有行为 - 📦 共享资源基础:扩展
AppState结构,准备共享 Agent/SessionDB/Config - 🚀 MVP 发布:统一模式当前运行 WebUI(Gateway 集成后续完善)
- 🎨 CLI 参数优化:添加统一模式文档,明确
--gateway与--serve组合语义 - 🔧 依赖更新:
hakimi-server添加hakimi-gateway依赖,为后续集成铺路
v0.3.268 — 修复 /clear 命令的多用户 bug (Critical Security Fix):
- 🔐 安全修复:移除
/clear命令中的agent.clear_messages()调用 - 🐛 问题诊断:Gateway 使用共享 Agent 架构,
agent.clear_messages()会影响所有聊天 - ✅ 修复方案:只清空当前聊天的
histories和usage数据,不操作全局 Agent - 📝 代码注释:添加详细说明,防止未来重复引入此问题
- 🎯 用户隔离:保证不同用户/聊天的对话历史完全隔离
v0.3.267 — 修复 /stop 命令队列清空功能:
- 🛑 队列清空:
/stop命令现在会同时清空消息队列 - 📊 详细反馈:显示清空了多少条排队消息
- 🎯 用户体验:解决 queue 模式下
/stop后队列消息继续执行的问题
v0.3.266 — 消息队列功能 (参考 Hermes Agent):
- 📬 Queue 模式:新消息加入队列,任务按顺序执行(默认)
- ⚡ Interrupt 模式:新消息立即中断当前任务(向后兼容)
- 🎛️ 配置项:
gateways.busy_input_mode("queue"|"interrupt") - 💬 用户提示:排队时显示友好消息
- 🔄 架构对齐:参考 Hermes Agent 的并发处理机制
v0.3.265 — 修复上下文压缩通知格式问题 (Hotfix):
- 🐛 修复 Unicode 转义:
\\u{001e}→\u{001e},Telegram 现在能正确识别消息前缀 - 📦 独立消息框:压缩开始和完成通知现在分别显示在独立消息框中
- ✅ 完整格式:
\u{001e}hakimi_tool:🗜️ ...与工具调用通知保持一致 - 🎯 用户反馈:感谢用户报告格式问题,快速修复部署
v0.3.264 — 上下文压缩实时通知 (Context Compression Notifications):
- 🗜️ 主动压缩通知:当上下文接近限制时(
should_compress()触发),通过streaming_callback实时通知用户"正在自动压缩" ⚠️ 溢出恢复通知:API 返回context_length_exceeded错误时,通知用户"上下文溢出,正在压缩并重试"- ✅ 完成反馈:压缩完成后发送"压缩完成,继续/重试任务",让用户清楚任务未中断
- 🎯 用户体验提升:杜绝长任务假死错觉 — 用户始终知道 Agent 在做什么
- 📍 实现位置:
loop_impl.rs第 159-173 行(主动压缩)+ 196-214 行(溢出恢复)
v0.3.263 — 修复 Router index=1 工具调用导致的空占位符 bug:
- 过滤掉流式累积产生的空工具调用(某些 Router 后端从 index=1 开始计数)
process_tool_calls在处理前移除name.is_empty()的占位符工具调用- 增强调试日志:显示原始/有效工具调用数量和详细信息
- 修复
WebuiConfigClippy 警告(使用derive(Default)替代手动实现) - 彻底解决"思考循环":空工具名不再触发 guardrail 警告或错误提示
v0.3.262 — SSE 格式自动检测:
ChatCompletionsTransport使用SseEventStream::auto()实现自动格式检测- 首个 SSE 事件动态决定解析器模式(Anthropic/OpenAI/Gemini)
- 消除因
api_mode: chat_completions配置错误导致的"思考循环" - Router 端点返回 Anthropic Messages API 格式时无缝工作
v0.3.261 — WebUI password + session workdir binding:
- WebUI password can now be configured in
config.yaml(webui.password) or via Settings panel - Sessions table extended with
workdirfield for workspace-session binding - Password defaults to config, falls back to
HAKIMI_WEBUI_PASSWORDenv var - All API routes protected by Bearer token middleware (empty password = no auth)
The WebUI Control Center can create, pause, resume, run-now, and delete persisted cron jobs via /api/cron/jobs, the /clear slash command now persists by deleting the current session transcript via /api/sessions/{id}/messages, and the mobile layout lets the conversation title toggle the session list so phones keep the chat area usable.
hakimi --serve ships the WebUI assets inside the release binary, so /, /static/style.css, /static/hakimi.js, /static/composer.js, /static/workspace.js, and /static/favicon.svg work from any current directory without copying a separate static/ folder. The WebUI workspace browser treats / as the active working-directory root (not the OS filesystem root), while still rejecting .. path escapes. Control-center modals honor native hidden state and can be dismissed via close button, overlay click, or Escape. When HAKIMI_WEBUI_PASSWORD is set, the WebUI prompts for the password on the first authenticated API call, stores it locally as a Bearer token, retries automatically, and renders send/auth errors inline instead of silently dropping messages. The embedded server persists WebUI sessions in ~/.hakimi/sessions.db and initializes the schema on startup, so creating a chat session works immediately after launch. Streaming WebUI chat requests now carry the active session_id, restore that transcript before each turn, and persist both the user prompt and assistant reply back into the same session; the frontend also commits finalized streamed replies into its in-memory message list so a second send or session switch does not erase the previous response. The WebUI also exposes persisted cron jobs through /api/cron/jobs, supports session deletion from the sidebar, de-duplicates client-provided session titles during create/fork so repeated "New Chat" actions do not hit the SQLite title uniqueness constraint, and ships a polished skin system with Linear Dark, Obsidian, Midnight, Light, and System appearance choices persisted in localStorage. Theme switching now writes the resolved skin CSS variables directly at runtime, so color changes are immediate and resilient to cached/static stylesheet ordering. The refreshed UI uses glassy panels, richer message cards, focused composer chrome, theme swatches in Settings → Appearance, keeps theme switching local and instant, adds a mobile drawer sidebar with a compact top-bar menu, hides the workspace panel on narrow screens, and adapts messages/composer controls to safe-area mobile viewports. It also shows directory-style SKILL.md skills using their parent directory names instead of generic SKILL labels, and releases the composer immediately after completed streamed replies so follow-up messages remain responsive.
Python agent frameworks are slow, memory-hungry, and crash at runtime. Hakimi is built different — from the ground up in Rust with production reliability baked in.
| Metric | Python Agent | Hakimi (Rust) |
|---|---|---|
| Startup | ~2s | ~50ms |
| Idle memory | ~150MB | ~15MB |
| Async model | asyncio + GIL | tokio native async |
| Tool safety | Runtime crashes | Compile-time guarantees |
| Tests | ~500 | 1767 |
Not a wrapper. Not a demo. A real production system:
- 20+ error types auto-classified with recovery strategies
- Hermes-style turn retry state for one-shot recovery guards
- Multi-key credential pool with circuit breakers
- 3-tier context compression (no manual window management)
- Contextual first-touch onboarding hints tracked in
onboarding.seen - Decision tree conversation history with backtracking
- Intent reasoning engine — predicts what tools you need
- Role adaptation — automatically switches between Coder, Researcher, Writer modes
Smart Context Management
- Three-tier compression: drop stale tool results → LLM summarization → sliding window
- No manual context window management — Hakimi handles it automatically
- Model-aware context windows:
model.context_lengthoverrides static metadata before compression and tool disclosure thresholds - Intent classification into 10 categories with next-tool prediction
Built-in Tools (63+)
- Files: read, write, search, patch with safe-root sandbox
- Shell: terminal, background processes
- Web: search, extract, browser automation (Chromium with screenshot vision capture, Playwright cache/headless-shell discovery, raw CDP dispatch, CDP frame-tree inspection, cloud-provider readiness status, and provider CDP endpoint routing)
- Desktop: Hermes-style
computer_usereadiness surface with safe wait, macOS screenshot/list-app discovery, and guarded action schema - Code: Python/JS/Bash execution with sandbox
- Media: vision analysis, video analysis, TTS, transcription with silence-hallucination filtering and oversized WAV chunking
- Memory: persistent memory + FTS5 full-text search +
hakimi knowledge/ TUI/knowledge/ gateway/knowledgegraph operations - Productivity: todo, Kanban boards with profile routing, worker logs, event trails, diagnostics, notification subscriptions, swarm graph creation, dashboard read/write management, cron scheduler with interval/five-field cron expressions and home-channel fan-out delivery
- Meta: sub-agent delegation, Mixture-of-Agents reasoning, skills system, MCP plugins
- Evaluation: Hermes-compatible ShareGPT JSONL trajectory saving for completed and failed turns
Multi-Platform Gateway
- Telegram · Discord · Slack · Mattermost · Webhook · Microsoft Graph webhook · Signal · SMS/Twilio · Email/SMTP · WhatsApp Business Cloud · Home Assistant · Matrix · DingTalk · WeCom · Feishu/Lark · BlueBubbles/iMessage · QQBot outbound · WeChat (via iLink/ClawBot) · Weixin/iLink alias
- Config-driven multi-adapter fan-in: run chat and webhook gateways simultaneously
- Real-time streaming with progressive edits, native Telegram draft previews, flood-control backoff, per-platform preview policy, and UTF-8-safe overflow chunking for long replies
- Persistent lifecycle diagnostics record adapter, connect, route, filter, and edit events to
~/.hakimi/logs/gateway-events.log;/logs,/logs events, and/logs gatewayread recent logs without shelling out totail - Gateway
/undo [N]rewinds recent in-memory chat turns and echoes the target prompt for editing before resend - Gateway
/stopimmediately cancels the running task and clears any queued messages, supporting bothinterruptandqueuemodes configured viagateways.busy_input_mode - Gateway
/usageshows last-turn token/cost/rate-limit data, best-effort OpenRouter-compatible/v1/modelslive pricing with a profile-scoped freshness cache and request fees, OpenRouter/creditsplus/keyquota/usage, Anthropic OAuth account windows, Codex usage windows, and a shared Nous rate-limit guard without exposing credentials - Cron jobs scheduled from chat with
/cron add, including30m/2hintervals, five-field cron syntax such as*/15 * * * *or0 9 * * MON-FRI, and delivery targets likelocal,origin,all,platform,platform:home, orplatform:#channel - Gateway
/voice on|off|tts|status|doctortoggles spoken-response guidance and reports voice I/O readiness without polluting prompt cache or chat history - Gateway
/updatesends the in-chat restart notice, then the restarted gateway proactively reports update success, current version, and release-note feature bullets after adapters connect - TUI
/config [field]shows sanitized runtime configuration,/gateway [cmd]inspects configured adapters, cached channel targets, and lifecycle events,/sessions [cmd]browses saved SQLite sessions,/skills [cmd]browses/searches local Skills Hub metadata,/cron [cmd]manages the persistent cron DB locally,/undo [N]prefills recent prompts for editing,/checkpoints [cmd]inspects the shared shadow-git checkpoint store without entering the model loop, and/voice statusplus configurable Ctrl+B/Ctrl+letter push-to-talk share the samevoice.*config, TTS/transcription tools, audio environment checks, PCM16 WAV recording artifact validation, oversized WAV chunked STT dispatch, local TTS playback launch, recorder-backedvoice_capture, automatic transcript submission, continuous restart mode, second-press capture cancellation, three-no-speech auto-exit, and Hermes-style start/stop audio cues
Extensibility
- MCP (Model Context Protocol) client — stdio / HTTP / SSE transports, CLI/gateway catalog search and config snippets, and stdio server-initiated sampling with tool schema forwarding plus
tool_usehandoff - HTTP plugin system with YAML templates
- HTTP API discovery — OpenAI-compatible
/v1/models,/v1/capabilities,/v1/skills,/v1/toolsets, text/v1/chat/completionswith completed SSE snapshots forstream=true,/v1/responseswith SQLite-backedprevious_response_idchaining plus completed SSE snapshots, pollable and cancellable/v1/runswith live lifecycle SSE events, and session lifecycle/messages/search discovery for external UI feature detection - Dashboard admin API —
/api/status,/api/sessionscreate/update/delete/fork plus message/search inspection,/api/mcp/servers,/api/credentials/pool,/api/webhooks, and Kanban/api/kanbanboard/task read-write management expose redacted operational state plus runtime-scoped admin writes for WebUI/admin panels - Hakimi WebUI — Hermes-inspired React/Vite operator console with left-side session browsing, central
/api/chatlive turns, right-side runtime/tool/skill/control panels, Bearer token support, and runtime config editing through the existing HTTP API - Skills Hub — install community skills with
/skills install - Static i18n foundation —
display.language,HAKIMI_LANGUAGE/HERMES_LANGUAGE, Hermes-compatible language aliases, YAML catalog directory loading, English fallback, and named placeholders for static user-facing messages - CLI Skin Engine —
hakimi skin list|inspect|set|pathplus gateway/skindiscover built-in and~/.hakimi/skins/*.yamlthemes, inherit missing values fromdefault, persistdisplay.skin, apply selected branding/colors/logo/hero to the CLI startup banner, and drive TUI thinking spinner faces/verbs/wings plus status, session, selection, completion, help, input, response, tool-prefix, tool emoji labels, running-tool progress, and tool-panel colors - Isolated profiles — manage named workspaces, clone/export profile archives, install/update shareable
distribution.yamlprofile distributions, create~/.hakimi/bin/<profile>wrapper aliases, use gateway/profile, and bind--profile/ stickyactive_profileruns to profile-scoped config, memory, sessions, skills, cron, trajectories, gateway logs, and TUI defaults - 10 curated MCP catalog entries: GitHub, filesystem, Brave Search, PostgreSQL, Puppeteer, memory, fetch, SQLite, sequential-thinking, and the Hermes-reviewed n8n bridge
- Secret redaction — API keys, JWTs, tokens masked before output
- Prompt injection detection — scans skills, cron prompts, context files
- SSRF protection — blocks private/metadata URL fetches
- Command safety guard — blocks malicious shell patterns
- Tool loop guardrails — warns on repeated no-progress read-only calls and blocks runaway exact-call loops
- One-time onboarding hints — first-touch CLI/gateway tips persist under
onboarding.seen - Write safe-root sandbox — config-protected directories
- Read credential guard — protects config files
- Shared shadow-git checkpoints —
checkpointand gateway/checkpointssnapshots live under~/.hakimi/checkpoints/store, not the project.git - Tool output limits — configurable
tools.output.max_bytesboundary before tool results enter context
20 crates, each with a single responsibility:
hakimi-agent/
├── hakimi-core/ # Agent loop, error classifier, credential pool
├── hakimi-transports/ # OpenAI, Anthropic, Gemini, Bedrock transports + prompt caching/rate guards
├── hakimi-tools/ # 63+ built-in tools + plugin registry
├── hakimi-session/ # SQLite WAL + FTS5, decision tree history
├── hakimi-context/ # Context engine, compression, intent reasoning, roles
├── hakimi-knowledge/ # Knowledge graph (petgraph)
├── hakimi-skills/ # Skill system + meta-skill extraction
├── hakimi-cron/ # Persistent cron scheduler
├── hakimi-gateway/ # 19 runtime-exposed platform adapters
├── hakimi-mcp/ # MCP client (stdio/HTTP/SSE)
├── hakimi-cli/ # REPL CLI + setup wizard + doctor
└── hakimi-tui/ # ratatui terminal UI
User Message
│
▼
┌─────────────────────────────────────┐
│ Intent Classification │
│ → Which tools does this need? │
├─────────────────────────────────────┤
│ Role Adaptation │
│ → Coder / Researcher / Writer... │
├─────────────────────────────────────┤
│ Build Context │
│ → System prompt + knowledge graph │
│ → Apply 3-tier compression │
├─────────────────────────────────────┤
│ Credential Pool │
│ → Acquire API key (rotation-ready) │
│ → LLM call via SSE streaming │
├─────────────────────────────────────┤
│ Tool Dispatch │
│ → Execute + guardrails check │
│ → Error classification + recovery │
├─────────────────────────────────────┤
│ Decision Tree │
│ → Record response + backtrack Capable│
└─────────────────────────────────────┘
│
▼
Response + Memory + Stats
| Feature | Hermes (Python) | Hakimi (Rust) |
|---|---|---|
| Language | Python 3.11+ | Rust 2024 |
| Startup time | ~2s | ~50ms |
| Idle memory | ~150MB | ~15MB |
| Async model | asyncio + GIL | tokio native |
| Tool registration | Runtime AST | Compile-time trait |
| Error recovery | Basic retry | 20+ classifiers |
| Knowledge model | Flat file | Graph DB (petgraph) |
| Intent detection | None | 10-category classifier |
| Role adaptation | None | 8 roles auto-detected |
| Conversation model | Flat list | Decision tree |
| Tests | ~500 | 1767 |
# Build everything
cargo build --workspace
# Run tests
cargo test --workspace
# Lint
cargo clippy --workspace
# Debug logging
RUST_LOG=debug cargo run -p hakimi-cli- Core agent loop + tool dispatch
- OpenAI / Anthropic / Gemini transports + SSE streaming, plus non-streaming AWS Bedrock Converse
- 63+ built-in tools
- 19 runtime-exposed platform adapters
- Gateway target directory + send_message channel resolution
- MCP client + CLI/gateway server catalog
- HTTP API model/capability discovery + text Chat Completions/Responses SSE snapshots + cancellable Runs with live lifecycle events
- Dashboard admin API summaries + runtime writes + Kanban read/write management
- Gateway
/usagerate-limit, account-limit, live pricing with request fees, Nous shared rate guard, and offline OpenAI/Anthropic/Gemini/DeepSeek/MiniMax/Bedrock cost estimates - Plugin system + HTTP templates
- Profile distributions with install/update/info and protected user data
- CLI skin engine with built-in/user YAML themes,
display.skinpersistence, startup banner theming, and TUI spinner, status, completion, help, tool emoji/progress, and surface theming - ratatui TUI with local slash commands, sanitized config browser, and gateway status panel
- Smart context compression (3-tier)
- Error classifier + credential pool
- Prompt caching (Anthropic)
- Vision + video analysis
- Knowledge graph memory with CLI/TUI/gateway operator commands
- Intent reasoning engine
- Decision tree backtracking
- Role adaptation
- Meta-skill auto-extraction
- Browser automation (Chromium + Playwright cache discovery + CDP readiness probe + frame-tree inspection + cloud-provider readiness status + provider CDP endpoint routing)
- Computer Use readiness surface
- Kanban task boards + notification cursors + swarm graphs + dashboard read/write management
- Gateway voice-response mode
- TUI voice readiness and media-tool config parity
- Voice environment diagnostics and STT silence-hallucination filtering
- PCM16 WAV recording artifact validation for voice capture
- Voice TTS playback text cleanup, MP3 cache planning, and local player launch
- Voice capture tool with system recorder backends and STT dispatch
- Oversized WAV chunking for captured-recording STT dispatch
- TUI Ctrl+B continuous push-to-talk capture loop
- TUI checkpoint viewer and manager slash command
- TUI saved session browser slash command
- TUI skill browser slash command
- TUI cron job manager slash command
- Voice capture second-press interrupt key
- Voice capture start/stop audio cues
- Voice capture continuous restart mode
- Mixture-of-Agents reasoning via OpenRouter
- OpenRouter, Anthropic, and Codex account usage display in gateway
/usage - Basic Hakimi WebUI operator console
- WASM plugin runtime
- Web dashboard PTY terminal, session-scoped streaming, and full Kanban UI
消息队列功能 - 参考 Hermes Agent 实现并发消息处理
- 🎯 新增配置项
gateways.busy_input_mode(默认"queue")"queue"模式:新消息排队等待当前任务完成"interrupt"模式:立即取消当前任务(原有行为)
- 🔄 队列处理机制:任务完成后自动处理队列中的下一条消息
- 💬 用户友好提示:队列模式下显示 "⏳ 正在处理之前的消息,已将新消息加入队列..."
- 🧩 无缝集成:支持文本和媒体附件的队列化处理
- 🐛 向后兼容:可通过配置切换回 interrupt 模式
技术细节:
- 添加
QueuedMessage结构体和 per-chat 队列存储 - 实现基于
task_key的消息队列管理(platform:bot_id:chat_id) - 任务完成后通过
gateway.route_message()递归触发下一条消息
MIT License