diff --git a/AGENTS.md b/AGENTS.md index c043713..f6e5fa0 100644 --- a/AGENTS.md +++ b/AGENTS.md @@ -12,6 +12,23 @@ It contains: - Knowledge base structure and lookup rules - Session state management (`.trtc-session.yaml`) +## Conversational AI routing (AI customer service) + +When the user describes a Conversational AI scenario — building or integrating +an AI-powered customer service agent, intelligent Q&A system, or TRTC +Conversational AI capabilities — route directly to the `trtc-ai-service` skill: + +**Conversational AI triggers** (match any): +- "AI客服" / "智能客服" / "AI customer service" +- "build AI agent" / "搭建AI客服" / "搭建智能客服" +- "conversational AI" / "conversational AI demo" / "TRTC Conversational AI" +- "integrate AI service" / "集成AI客服" / "AI customer service agent" +- "voice agent" + "customer service" / "语音助手" + "客服" + +When triggered, read `skills/trtc-ai-service/SKILL.md` and follow its guided +workflow. Conversational AI has its own capability model and adapter layer — +it does NOT go through the standard product/platform/scenario routing below. + Below are additional rendering-specific rules that apply to all agents: # ui-mode diff --git a/CLAUDE.md b/CLAUDE.md index 91d81dd..c876a42 100644 --- a/CLAUDE.md +++ b/CLAUDE.md @@ -10,13 +10,15 @@ 当用户提出 TRTC 相关问题时: -1. **识别产品**:Chat / Call / RTC Engine / Live / Conference +1. **识别产品**:Chat / Call / RTC Engine / Live / Conference / Conversational AI 2. **识别平台**:Web / Android / iOS / Flutter / Electron / Unity 3. **读取知识库**:先读 `knowledge-base/slices/{product}/` 下的产品级概览,再读 `{product}/{platform}/` 下的平台实现细节 4. 
**引用来源**:标明参考的 slice ID 和官方文档链接 **新用户检测**:当用户首次使用或描述从零开始的集成需求时,优先进入 `skills/trtc-onboarding/SKILL.md` 引导流程。 +**AI 客服场景识别**:当用户描述搭建 AI 客服、智能客服、对话式 AI 等场景时,直接路由到 `skills/trtc-ai-service/SKILL.md`。该路径使用 TRTC Conversational AI,拥有独立的原子能力模型和适配器层,不经过标准的 onboarding/topic 流程。 + ## 关键路径 | 用途 | 路径 | diff --git a/CODEBUDDY.md b/CODEBUDDY.md index 4417dc2..9702d38 100644 --- a/CODEBUDDY.md +++ b/CODEBUDDY.md @@ -45,6 +45,7 @@ Before identifying product / platform, check if an onboarding session is already | **RTC Engine** | 进房、退房、推流、拉流、混流、音视频、采集、编码、码率、低延时、SEI、TRTC 引擎 | enter room, leave room, publish stream, play stream, mix stream, audio/video, capture, encoding, bitrate, low latency, SEI, RTC engine | `TRTC`, `TRTCCloud` | | **Live** | 直播、推流、连麦、观众、主播、弹幕、礼物、打赏、美颜、变声、开播、下播、PK、房管 | live streaming, publish, co-guest, co-host, audience, host, anchor, barrage, danmu, gift, beauty filter, voice changer, start broadcast, end broadcast, PK, moderator | `AtomicXCore`, `LiveCoreView`, `LiveListStore` | | **Conference** | 会议、多人视频、视频会议、入会、离会、创建会议、预约会议、参会人、会控、屏幕共享、举手、录制、等候室、虚拟背景、静音全员 | meeting, multi-person video, video conferencing, join meeting, leave meeting, create meeting, schedule meeting, participant, moderation, screen share, raise hand, record, waiting room, virtual background, mute all | `TUIRoomKit` | +| **Conversational AI** | AI客服、智能客服、对话式AI、语音客服、语音助手 | AI customer service, conversational AI demo, voice agent, intelligent Q&A, build AI agent, integrate AI service | TRTC Conversational AI | If ambiguous, ask — keep it easy: "Your question sounds like it could be about Chat (messaging) or RTC Engine (audio/video). Which one?" @@ -68,6 +69,7 @@ If the user doesn't specify and it matters for the answer, ask. 
Conceptual quest | User intent | Skill to follow | |-------------|----------------| +| **"build AI customer service" / "搭建AI客服" / "智能客服"** (AI customer service scenario) | `skills/trtc-ai-service/SKILL.md` — uses TRTC Conversational AI, bypasses standard product/platform routing | | **"get started" / "help me integrate" / "I'm new"** | `skills/trtc-onboarding/SKILL.md` | | **"I want to ADD / BUILD / IMPLEMENT X"** (feature or demo) | `skills/trtc-onboarding/SKILL.md` Path A2 — **never dump slice content directly** | | **"从零开始" / "帮我接入" / "try the demo"** | `skills/trtc-onboarding/SKILL.md` | diff --git a/README.md b/README.md index 83a4379..05e7423 100644 --- a/README.md +++ b/README.md @@ -50,11 +50,21 @@ The skill saves your progress in the project. If you close the tool and come bac | Product | Description | Availability | |---------|-------------|--------------| | **Conference** | Video conferencing — multi-party meetings, screen sharing, in-meeting chat | Web ✅ | +| **Conversational AI** | TRTC Conversational AI — AI voice agents, intelligent Q&A, human handoff, session summaries | Web ✅ | | **Live** | Interactive live streaming — anchor/audience roles, co-hosting, barrage, gifts, beauty filters | Coming soon | | **Chat** | Instant messaging — messages, conversations, groups, user profiles | Coming soon | | **Call** | Audio/video calling — 1-on-1 and group calls | Coming soon | | **RTC Engine** | Low-level real-time audio/video engine — room management, publishing, subscribing | Coming soon | +### Conversational AI — Supported Scenarios + +Conversational AI is a standalone product built on TRTC's voice and LLM capabilities. 
The following application scenarios are supported: + +| Scenario | Description | Status | +|----------|-------------|--------| +| **AI Customer Service** | Build an AI-powered voice customer service agent that can answer customer questions, look up knowledge base articles, escalate to human agents, and summarize conversations — all through natural voice interaction. | ✅ Available | +| **AI Oral Practice** | Practice spoken language with an AI conversation partner that listens, responds, and gives feedback in real time — ideal for language learning and interview rehearsal. | Coming soon | + --- diff --git a/README.zh.md b/README.zh.md index b61d5cc..3cabe77 100644 --- a/README.zh.md +++ b/README.zh.md @@ -50,11 +50,21 @@ Skill 会在项目中保存你的进度。关掉工具下次回来,可以从 | 产品 | 说明 | 可用状态 | |------|------|---------| | **Conference** | 视频会议——多人会议、屏幕共享、会中聊天 | Web ✅ | +| **Conversational AI** | TRTC 对话式 AI——AI 语音助手、智能问答、人工转接、会话摘要 | Web ✅ | | **Live** | 互动直播——主播/观众、连麦、弹幕、礼物、美颜 | 即将支持 | | **Chat** | 即时通信——消息、会话、群组、用户资料 | 即将支持 | | **Call** | 音视频通话——1v1 和群组通话 | 即将支持 | | **RTC Engine** | 实时音视频引擎——进房、推流、拉流 | 即将支持 | +### Conversational AI — 支持的场景 + +Conversational AI 是基于 TRTC 语音和 LLM 能力构建的独立产品,目前支持以下应用场景: + +| 场景 | 说明 | 状态 | +|------|------|------| +| **AI 客服** | 搭建 AI 语音客服助手,通过自然语音对话回答客户问题、检索知识库、转接人工坐席、自动生成会话摘要。 | ✅ 已支持 | +| **AI 口语陪练** | 与 AI 对话伙伴进行口语练习,实时倾听、回应并给出反馈——适合语言学习和面试模拟。 | 即将支持 | + --- diff --git a/ai-instructions/base.md b/ai-instructions/base.md index 741168f..3e5d15f 100644 --- a/ai-instructions/base.md +++ b/ai-instructions/base.md @@ -10,4 +10,21 @@ It contains: - Knowledge base structure and lookup rules - Session state management (`.trtc-session.yaml`) +## Conversational AI routing (AI customer service) + +When the user describes a Conversational AI scenario — building or integrating +an AI-powered customer service agent, intelligent Q&A system, or TRTC +Conversational AI capabilities — route directly to the `trtc-ai-service` skill: + +**Conversational AI triggers** (match any): 
+- "AI客服" / "智能客服" / "AI customer service" +- "build AI agent" / "搭建AI客服" / "搭建智能客服" +- "conversational AI" / "conversational AI demo" / "TRTC Conversational AI" +- "integrate AI service" / "集成AI客服" / "AI customer service agent" +- "voice agent" + "customer service" / "语音助手" + "客服" + +When triggered, read `skills/trtc-ai-service/SKILL.md` and follow its guided +workflow. Conversational AI has its own capability model and adapter layer — +it does NOT go through the standard product/platform/scenario routing below. + Below are additional rendering-specific rules that apply to all agents: diff --git a/bin/cli.js b/bin/cli.js index ee2fcd9..f136b4a 100755 --- a/bin/cli.js +++ b/bin/cli.js @@ -5,9 +5,9 @@ /** * @tencent-rtc/trtc-agent-skills installer * - * Installs the TRTC AI Integration skill suite (6 cross-referencing skills: - * trtc + trtc-onboarding/docs/topic/search/apply) plus the shared - * knowledge-base into your IDE's skills directory, and wires up the + * Installs the TRTC AI Integration skill suite (cross-referencing skills: + * trtc + trtc-onboarding/docs/topic/search/apply + trtc-ai-service) plus the + * shared knowledge-base into your IDE's skills directory, and wires up the * `tencent-rtc-skill-tool` MCP server (used for prompt / runtime telemetry). * * IMPORTANT — why skills are copied as SIBLING DIRECTORIES: @@ -50,15 +50,21 @@ const SKILLS_SRC = path.join(PKG_ROOT, "skills"); const KB_SRC = path.join(PKG_ROOT, "knowledge-base"); const HOOKS_SRC = path.join(PKG_ROOT, "hooks"); -// The 6 skills that make up the suite. Order is cosmetic; `trtc` is the entry. -const SKILL_NAMES = [ - "trtc", - "trtc-onboarding", - "trtc-docs", - "trtc-topic", - "trtc-search", - "trtc-apply", -]; +// Dynamically discover all skills under SKILLS_SRC. Each skill must be a +// directory containing a SKILL.md entry point. `trtc` is always listed first; +// the rest are sorted alphabetically. 
This avoids the stale-hardcoded-list +// problem — adding a new skill directory is enough to get it installed. +function getSkillNames() { + return fs.readdirSync(SKILLS_SRC, { withFileTypes: true }) + .filter(entry => entry.isDirectory()) + .map(entry => entry.name) + .filter(name => fs.existsSync(path.join(SKILLS_SRC, name, "SKILL.md"))) + .sort((a, b) => { + if (a === "trtc") return -1; + if (b === "trtc") return 1; + return a.localeCompare(b); + }); +} // IDE skill-install targets (project-level). Each IDE reads skills from a // different directory, but the layout inside is identical: one dir per skill. @@ -290,13 +296,20 @@ function printHelp() { } function listSkills() { + const descriptions = { + "trtc": "Entry router — detects product/platform, routes to sub-skills", + "trtc-ai-service": "AI customer service scenarios (TRTC Conversational AI)", + "trtc-onboarding": "Get-started / integration / troubleshooting flow", + "trtc-docs": "Docs & error-code lookup", + "trtc-topic": "Step-by-step scenario walkthrough", + "trtc-search": "Internal slice lookup (AI-facing)", + "trtc-apply": "Internal compile/integration quality gate", + }; console.log(`\n ${c.bold("Skills shipped in this package:")}\n`); - console.log(` ${c.cyan("trtc/")} ${c.dim("Entry router — detects product/platform, routes to sub-skills")}`); - console.log(` ${c.cyan("trtc-onboarding/")} ${c.dim("Get-started / integration / troubleshooting flow")}`); - console.log(` ${c.cyan("trtc-docs/")} ${c.dim("Docs & error-code lookup")}`); - console.log(` ${c.cyan("trtc-topic/")} ${c.dim("Step-by-step scenario walkthrough")}`); - console.log(` ${c.cyan("trtc-search/")} ${c.dim("Internal slice lookup (AI-facing)")}`); - console.log(` ${c.cyan("trtc-apply/")} ${c.dim("Internal compile/integration quality gate")}`); + for (const name of getSkillNames()) { + const desc = descriptions[name] || ""; + console.log(` ${c.cyan(name + "/")}` + (desc ? 
` ${c.dim(desc)}` : "")); + } console.log(""); } @@ -304,7 +317,7 @@ function listSkills() { function cleanSkills(skillsRootAbs) { if (!fs.existsSync(skillsRootAbs)) return 0; let wiped = 0; - for (const name of SKILL_NAMES) { + for (const name of getSkillNames()) { const target = path.join(skillsRootAbs, name); if (fs.existsSync(target)) { rmrf(target); wiped++; } } @@ -385,7 +398,7 @@ function cleanHooksSettings(ideList, resolvedRoot) { function installSkills(skillsRootAbs) { ensureDir(skillsRootAbs); - for (const name of SKILL_NAMES) { + for (const name of getSkillNames()) { const src = path.join(SKILLS_SRC, name); if (fs.existsSync(src)) { copyRecursive(src, path.join(skillsRootAbs, name)); @@ -812,7 +825,7 @@ function main() { } installSkills(skillsRootAbs); - for (const name of SKILL_NAMES) console.log(c.green(" ✓ ") + name + "/"); + for (const name of getSkillNames()) console.log(c.green(" ✓ ") + name + "/"); const kbDest = copyKnowledgeBase(skillsRootAbs); console.log(c.green(" ✓ ") + "knowledge-base/ " + c.dim("→ " + kbDest)); diff --git a/skills/trtc-ai-service/README.md b/skills/trtc-ai-service/README.md new file mode 100644 index 0000000..05ae948 --- /dev/null +++ b/skills/trtc-ai-service/README.md @@ -0,0 +1,195 @@ +# TRTC AI Customer Service Skill + +[English](README.md) | [中文](README.zh-CN.md) + +> A zero-code AI customer service builder. Type one sentence in the chat window and the AI will guide you step by step to get your customer service system up and running — no terminal, no scripts, no coding required. + +## What is this? + +This Skill packages the work of building an AI customer service agent with TRTC Conversational AI into a plug-and-play workflow: + +``` +You (in your IDE's AI chat window): + "Build me an AI customer service agent with TRTC" + +AI (does everything automatically): + 1. Checks your runtime environment + 2. Lets you choose a setup mode (Quick Experience / Integrate into My System)
Guides you through configuring 3 keys (cloud service credentials) + 4. Installs dependencies and assembles the customer service capabilities + 5. Starts the service and gives you a browser URL to see it in action + +You never open a terminal or run a script manually. +``` + +## Two ways to start + +> The core capability of this Skill is **TRTC Conversational AI (voice agent)**. + +| Mode | Who it's for | What you get | What you need to do | +|------|-------------|-------------|---------------------| +| **Quick Experience** | First-timers who want to see what it looks like | A complete voice agent web UI + ticket management dashboard | Configure 3 keys | +| **Integrate into My System** | Users who already have a website or app and want to embed the AI agent's "brain" | Backend API endpoints + interface specs + sample code (no UI generated) | Configure 3 keys + choose capabilities and interaction modes | + +**No matter which one you choose, the AI will walk you through every step** — zero programming experience needed. + +## The only entry point + +[`SKILL.md`](./SKILL.md) — Read and executed by your Coding Agent (CodeBuddy / Cursor / Claude Code). + +> **Install anywhere**: This Skill can live in a project subdirectory, `.agents/skills/`, `.codebuddy/skills/`, or anywhere else — +> it does **not** need to be at the workspace root. Scripts are self-locating; the Agent just needs to use absolute paths. + +### Installation + +#### Codex CLI + +**User-level** (recommended — available across all projects): +```bash +/skills install https://github.com/Burgerham4R/ai-customer-service-skill +``` + +**Project-level** (only available in the current project): +```bash +# The skill will be installed to ./.codex/skills/ (Cmd+Shift+. 
to show hidden folders in Finder) +/skills install --project https://github.com/Burgerham4R/ai-customer-service-skill +``` + +#### Claude Code CLI + +**User-level** (recommended — available across all projects): +```bash +mkdir -p ~/.claude/skills +git clone https://github.com/Burgerham4R/ai-customer-service-skill.git ~/.claude/skills/ai-customer-service-skill +``` + +**Project-level** (only available in the current project): +```bash +mkdir -p ./.claude/skills +git clone https://github.com/Burgerham4R/ai-customer-service-skill.git ./.claude/skills/ai-customer-service-skill +``` + +#### Other agents (CodeBuddy / Cursor / etc.) + +Clone to any location and point your agent to `SKILL.md`: +```bash +git clone https://github.com/Burgerham4R/ai-customer-service-skill.git +# Then tell your agent: +# "Load the Skill from /path/to/ai-customer-service-skill/SKILL.md" +``` + +> **After installation, restart your CLI session** to ensure the Skill is properly registered and loaded. + +### Trigger keywords + +- "AI customer service" / "build customer service" / "customer service bot" +- "TRTC + customer service" / "voice agent + customer service" +- "Build me an AI customer service agent with TRTC" + +## What are the 3 keys? + +To get the customer service agent running, you need 3 cloud service credentials. Don't worry — they're just 3 strings you copy-paste from the corresponding websites. + +> **Tencent RTC (trtc.io)** is Tencent Cloud's international Real-Time Communication brand. The TRTC Conversational AI service runs on Tencent Cloud infrastructure — your TRTC account and Tencent Cloud account are linked through a unified login system. When you get your API Key, the system will automatically sync your login session. 
+ +| Key | Purpose | Where to find it | +|-----|---------|-----------------| +| Key 1: Tencent Cloud API Key | Proves you have permission to use Tencent Cloud voice & calling services | https://console.tencentcloud.com/cam/capi | +| Key 2: TRTC Application Credentials | Lets the agent make calls and do voice chat | https://console.trtc.io/app | +| Key 3: LLM API Key | Lets the agent "think" — understand queries and respond | Your registered AI service website (e.g. OpenAI, DeepSeek, etc.) | + +> The AI will tell you exactly how to get each key step by step. Your key info is only used for this configuration session — the system does not log or leak it. + +## What capabilities does the agent have? + +| Capability | Description | Quick Experience | Integration | +|------------|-------------|:---:|:---:| +| Conversation | Voice + text two-way communication | ✅ Auto-assembled | ✅ Default included | +| Knowledge Base | Upload docs, agent auto-retrieves and answers FAQ | ✅ Simulated demo | 🔘 Optional | +| Human Handoff | Complex issues auto-escalate to a human agent | ✅ Simulated demo | 🔘 Optional | +| Tool Calling | Agent can proactively query data from your system | ❌ Not supported | 🔘 Optional | +| Session Summary | Auto-generates a summary after each conversation | ✅ Simulated demo | 🔘 Optional | + +> "Simulated demo" means: the UI and workflow are complete, but uses demo data without connecting to a real business system. For real integration, choose "Integrate into My System". 
+ +## Communication modes (optional for integration) + +| Mode | Description | Best for | +|------|-------------|---------| +| Text-only IM | Agent replies via text chat | Web chat widgets, in-app messaging | +| Text + TTS | Agent types replies + reads them aloud | Smart speakers, voice assistants | +| Omni-modal | Text, voice, and video all supported | Advanced customer service scenarios | +| Voice-only Call | Agent communicates only via phone | Call centers, hotlines | + +## Advanced: Customize TRTC Conversational AI + +If you want to fine-tune the AI agent's voice behavior or change the underlying models, refer to the official TRTC Conversational AI documentation: + +### Adjust voice parameters (speed / pitch / timbre) + +Both STT (speech-to-text) and TTS (text-to-speech) are powered by Tencent's in-house engines. You can adjust voice parameters via the following documentation: + +| Stage | Documentation | +|-------|--------------| +| STT (Speech-to-Text) | [STT configuration parameters](https://trtc.io/document/69592?product=conversationalai) | +| TTS (Text-to-Speech) | [TTS configuration parameters](https://trtc.io/document/68340?product=conversationalai) | + +### Switch STT / LLM / TTS models + +To change the underlying STT, LLM, or TTS models, check the model overview for each pipeline stage and follow the integration guide: + +| Stage | Documentation | +|-------|--------------| +| STT (Speech-to-Text) | [STT Model Overview](https://trtc.io/document/69592?product=conversationalai) | +| LLM (Language Model) | [LLM Model Overview](https://trtc.io/document/68338?product=conversationalai) | +| TTS (Text-to-Speech) | [TTS Model Overview](https://trtc.io/document/68340?product=conversationalai) | + +### Full documentation + +For any other configuration needs, start from the Conversational AI overview and navigate from there: + +- [TRTC Conversational AI Overview](https://trtc.io/document/conversational-ai-overview?product=conversationalai) + +## Directory 
structure + +``` +ai-service-skill/ +├── SKILL.md # ★ The only entry point (triggered by Coding Agent) +├── start.sh # Bootstrap script (auto-install deps + start service) +│ +├── scripts/ # AI-invoked tool scripts +│ ├── verify-credentials.py # 3-key verification +│ ├── setup-credentials.py # Interactive developer setup +│ ├── add-capability.py # Capability assembly +│ ├── contract-adapt.py # Interface contract adaptation +│ └── lib/ # Shared modules +│ +├── capabilities/ +│ ├── conversation-core/ # Generic Voice Agent skeleton +│ ├── knowledge-base/ # FAQ knowledge base retrieval +│ ├── tool-calling/ # Tool calling +│ ├── human-handoff/ # Human handoff + ticket management +│ ├── session-summary/ # Session summaries +│ └── digital-human/ # Digital human (placeholder) +│ +├── scenarios/ +│ ├── customer-service/ # Path A: Demo UI +│ └── custom-builder/ # Path B: Capability selection wizard +│ +├── auto_adapters/ # Tech stack adapters +└── tests/ # Test suite +``` + +## FAQ + +| Issue | Solution | +|-------|----------| +| Key verification failed | Go back to the key configuration step and double-check each key value | +| Port 3000 is in use | Use a different port (e.g. `--port 8080`) or stop the program occupying the port | +| Python version too low | Download and install Python 3.9+ from python.org | +| Browser shows blank page after startup | Hard refresh: `Cmd+Shift+R` (Mac) or `Ctrl+Shift+R` (Windows) | +| Want to connect a real business system | Re-run the workflow and choose "Integrate into My System" | + +--- + +> **One last thing**: This Skill is designed so that anyone — even with zero coding experience — can get an AI customer service agent up and running. If you run into any issues along the way, just tell the AI in the chat window and it'll help you resolve them. 
diff --git a/skills/trtc-ai-service/README.zh-CN.md b/skills/trtc-ai-service/README.zh-CN.md new file mode 100644 index 0000000..77d2151 --- /dev/null +++ b/skills/trtc-ai-service/README.zh-CN.md @@ -0,0 +1,193 @@ +# TRTC AI 客服 Skill + +[English](README.md) | [中文](README.zh-CN.md) + +> 零代码 AI 客服搭建器。只需在聊天窗口中说一句话,AI 会一步步引导你完成客服系统的搭建——无需终端、无需脚本、无需编程。 + +## 这是什么? + +将"基于 TRTC Conversational AI 的 AI 客服智能体"打包成一个即插即用的 Skill: + +``` +你(在 IDE 的 AI 聊天窗口中说): + "帮我用 TRTC 搭建一个 AI 客服" + +AI(自动完成所有操作): + 1. 检查你的运行环境 + 2. 让你选择搭建模式(快速体验 / 集成到我的系统) + 3. 引导你完成 3 个密钥的配置(云服务凭证) + 4. 安装依赖并组装客服能力 + 5. 启动服务并给你一个浏览器地址,直接查看效果 + +你完全不需要打开终端或手动执行任何脚本。 +``` + +## 两种方式开始 + +> 本 Skill 的核心能力是 **TRTC Conversational AI(语音智能体)**。 + +| 模式 | 适合谁 | 能得到什么 | 需要做什么 | +|------|--------|-----------|-----------| +| **快速体验** | 想先看看效果的新用户 | 一个完整的语音智能体 Web 界面 + 工单管理后台 | 配置 3 个密钥 | +| **集成到我的系统** | 已有网站或应用、想嵌入 AI 智能体"大脑"的用户 | 后端 API 接口 + 接口规范 + 示例代码(不生成 UI) | 配置 3 个密钥 + 选择能力和交互模式 | + +**无论选择哪种方式,AI 都会引导你走完每一步**——零编程经验也没问题。 + +## 唯一入口 + +[`SKILL.md`](./SKILL.md) — 由你的编程助手(CodeBuddy / Cursor / Claude Code)读取和执行。 + +> **任意位置安装**:本 Skill 可以放在项目子目录、`.agents/skills/`、`.codebuddy/skills/` 或任何位置—— +> **不需要**放在工作区根目录。脚本会自动定位自身路径,Agent 只需使用绝对路径。 + +### 安装方式 + +#### Codex CLI + +**用户级安装**(推荐 — 所有项目均可使用): +```bash +/skills install https://github.com/Burgerham4R/ai-customer-service-skill +``` + +**项目级安装**(仅当前项目可用): +```bash +# Skill 将安装到 ./.codex/skills/(访达中按 Cmd+Shift+. 
可显示隐藏文件夹) +/skills install --project https://github.com/Burgerham4R/ai-customer-service-skill +``` + +#### Claude Code CLI + +**用户级安装**(推荐 — 所有项目均可使用): +```bash +mkdir -p ~/.claude/skills +git clone https://github.com/Burgerham4R/ai-customer-service-skill.git ~/.claude/skills/ai-customer-service-skill +``` + +**项目级安装**(仅当前项目可用): +```bash +mkdir -p ./.claude/skills +git clone https://github.com/Burgerham4R/ai-customer-service-skill.git ./.claude/skills/ai-customer-service-skill +``` + +#### 其他 Agent(CodeBuddy / Cursor 等) + +克隆到任意位置,然后让 Agent 加载 `SKILL.md`: +```bash +git clone https://github.com/Burgerham4R/ai-customer-service-skill.git +# 然后对你的 Agent 说: +# "从 /path/to/ai-service-skill/SKILL.md 加载这个 Skill" +``` + +> **安装完成后,请重启 CLI 会话** 以确保 Skill 被正确注册和加载。 + +- "AI客服" / "搭建客服" / "客服机器人" +- "TRTC + 客服" / "语音智能体 + 客服" +- "帮我用 TRTC 搭建一个 AI 客服" + +## 3 个密钥是什么? + +要让客服智能体跑起来,你需要 3 个云服务凭证。别担心——它们只是从相应网站复制粘贴的 3 个字符串: + +> **Tencent RTC (trtc.io)** 是腾讯云旗下的国际实时音视频通信品牌。TRTC Conversational AI 服务基于腾讯云基础设施运行——你的 TRTC 账号和腾讯云账号通过统一登录体系关联。获取 API Key 时,系统会自动同步你的登录状态。 + +| 密钥 | 用途 | 获取地址 | +|------|------|---------| +| 密钥 1:Tencent Cloud API Key | 证明你有权限使用 TRTC 语音和通话服务 | https://console.tencentcloud.com/cam/capi | +| 密钥 2:TRTC 应用凭证 | 让智能体能够拨打电话和进行语音聊天 | https://console.trtc.io/app | +| 密钥 3:LLM API Key | 让智能体能够"思考"——理解用户问题并回复 | 你注册的 AI 服务网站(如 OpenAI、DeepSeek 等) | + +> AI 会一步步详细告诉你如何获取每个密钥。你的密钥信息仅用于本次配置会话——系统不会记录或泄露。 + +## 智能体有哪些能力? 
+ +| 能力 | 描述 | 快速体验 | 集成模式 | +|------|------|:---:|:---:| +| 对话 | 语音 + 文字双向交流 | ✅ 自动组装 | ✅ 默认包含 | +| 知识库 | 上传文档,智能体自动检索并回答常见问题 | ✅ 模拟演示 | 🔘 可选 | +| 人工转接 | 复杂问题自动转接至人工客服 | ✅ 模拟演示 | 🔘 可选 | +| 工具调用 | 智能体可主动查询你的系统中的数据 | ❌ 不支持 | 🔘 可选 | +| 会话摘要 | 每次对话后自动生成摘要 | ✅ 模拟演示 | 🔘 可选 | + +> "模拟演示"指:界面和工作流是完整的,但使用的是演示数据,未接入真实业务系统。如需真实接入,请选择"集成到我的系统"。 + +## 通信模式(集成模式可选) + +| 模式 | 描述 | 适用场景 | +|------|------|---------| +| 纯文字即时消息 | 智能体通过文字聊天回复 | 网页聊天插件、应用内消息 | +| 文字 + TTS | 智能体打字回复 + 语音朗读 | 智能音箱、语音助手 | +| 全模态 | 文字、语音、视频全部支持 | 高级客服场景 | +| 纯语音通话 | 智能体仅通过电话沟通 | 呼叫中心、热线 | + +## 进阶:自定义 TRTC Conversational AI + +如果你想微调 AI 智能体的语音行为或更换底层模型,请参阅 TRTC Conversational AI 官方文档: + +### 调整声音参数(语速 / 音调 / 音色) + +STT(语音识别)和 TTS(语音合成)均使用腾讯自研引擎。你可以通过以下文档调整声音参数: + +| 阶段 | 文档 | +|------|------| +| STT(语音识别) | [STT 配置参数](https://trtc.io/document/69592?product=conversationalai) | +| TTS(语音合成) | [TTS 配置参数](https://trtc.io/document/68340?product=conversationalai) | + +### 切换 STT / LLM / TTS 模型 + +如需更换底层 STT、LLM 或 TTS 模型,请查看各环节的模型总览并按照接入指引操作: + +| 阶段 | 文档 | +|------|------| +| STT(语音识别) | [STT 模型总览](https://trtc.io/document/69592?product=conversationalai) | +| LLM(大语言模型) | [LLM 模型总览](https://trtc.io/document/68338?product=conversationalai) | +| TTS(语音合成) | [TTS 模型总览](https://trtc.io/document/68340?product=conversationalai) | + +### 完整文档 + +如有其他配置需求,请从 Conversational AI 总览页出发寻找相应答案: + +- [TRTC Conversational AI 总览](https://trtc.io/document/conversational-ai-overview?product=conversationalai) + +## 目录结构 + +``` +ai-service-skill/ +├── SKILL.md # ★ 唯一入口(由编程助手触发) +├── start.sh # 启动脚本(自动安装依赖 + 启动服务) +│ +├── scripts/ # AI 调用的工具脚本 +│ ├── verify-credentials.py # 三密钥验证 +│ ├── setup-credentials.py # 交互式开发者配置 +│ ├── add-capability.py # 能力组装 +│ ├── contract-adapt.py # 接口契约适配 +│ └── lib/ # 共享模块 +│ +├── capabilities/ +│ ├── conversation-core/ # 通用语音智能体骨架 +│ ├── knowledge-base/ # FAQ 知识库检索 +│ ├── tool-calling/ # 工具调用 +│ ├── human-handoff/ # 人工转接 + 工单管理 +│ ├── session-summary/ # 会话摘要 +│ └── digital-human/ # 数字人(占位) +│ +├── 
scenarios/ +│ ├── customer-service/ # 路径 A:演示界面 +│ └── custom-builder/ # 路径 B:能力选择向导 +│ +├── auto_adapters/ # 技术栈适配器 +└── tests/ # 测试套件 +``` + +## 常见问题 + +| 问题 | 解决方案 | +|------|---------| +| 密钥验证失败 | 返回密钥配置步骤,仔细检查每个密钥值 | +| 端口 3000 被占用 | 使用其他端口(如 `--port 8080`)或停止占用端口的程序 | +| Python 版本过低 | 从 python.org 下载安装 Python 3.9+ | +| 启动后浏览器显示空白页 | 强制刷新:`Cmd+Shift+R`(Mac)或 `Ctrl+Shift+R`(Windows) | +| 想接入真实业务系统 | 重新运行工作流,选择"集成到我的系统" | + +--- + +> **最后再说一句**:本 Skill 的设计目标是让任何人——即使完全不会编程——都能搭起一个 AI 客服智能体。如果在过程中遇到任何问题,直接在聊天窗口中告诉 AI,它会帮你解决。 diff --git a/skills/trtc-ai-service/SKILL.md b/skills/trtc-ai-service/SKILL.md new file mode 100644 index 0000000..7e2d125 --- /dev/null +++ b/skills/trtc-ai-service/SKILL.md @@ -0,0 +1,931 @@ +--- +name: trtc-ai-service +description: > + AI customer service scenario skill for TRTC Conversational AI. Guide users + step-by-step through building an AI-powered customer service application — + from zero to a working demo, or integrate AI service capabilities into an + existing project. Use this skill when the user wants to "build an AI + customer service agent", "set up intelligent Q&A", "create a smart客服 + system", or describes a complete AI service use case. This skill loads + scenario files that define the sequence of capabilities to implement and + guides the user through each step with code examples, UI components, and + verification checkpoints. +--- + +# AI Customer Service Skill (v1.2) + +> This document is the Coding Agent's execution SOP. It also serves as a user-friendly guide reference. +> For any natural-language intent involving "build / integrate AI customer service," the AI must **read this file first** before taking action. +> All script calls must strictly follow §12 Tool Whitelist. + +## Entry points + +This skill is reached two ways: + +1. **Direct routing from `../trtc/SKILL.md`** — the primary path. The root skill has identified the user's intent as Conversational AI / AI customer service and routed here directly. 
No onboarding session is required; proceed to §0 below. + +2. **Handoff from `../trtc-topic/SKILL.md`** — when the user has gone through onboarding and explicitly selected an AI service scenario. In this case, the scenario id is already resolved; skip to the relevant path section. + +--- + +## 0. Path Baseline (SKILL_ROOT / PROJECT_ROOT) —— 🔴 Top Priority — Read First + +All runtime assets of this Skill (`capabilities/`, `scripts/`, `scenarios/`, `auto_adapters/`, +`start.sh`) reside in the **Skill's own directory** and are **not necessarily** at the user's workspace root. +The Skill can be installed in arbitrary locations: a project subdirectory, `.agents/skills/`, `.codebuddy/skills/`, +`.codex/skills/`, and will work across IDEs (Claude Code / Codex / Cursor). Therefore, **never assume "Skill root == Workspace root."** + +### 0.1 Definition of the Two Roots + +| Variable | Meaning | How to Obtain | +|---|---|---| +| `SKILL_ROOT` | **Skill's own directory** (contains `SKILL.md` / `scripts/` / `capabilities/` …) | = The absolute path of the **Base directory** injected by the system when this Skill is loaded. In the agent-skills repo context, this is `${CLAUDE_PLUGIN_ROOT}/skills/trtc-ai-service`. The Agent must remember it. | +| `PROJECT_ROOT` | **User's current project root** (= workspace root; the integration target for Path B) | = The absolute path of the current workspace root. | + +> Demo path (A) uses `SKILL_ROOT` (fetch capability sources + start core) and `PROJECT_ROOT` (where demo artifacts land); +> Integration path (B) uses `SKILL_ROOT` (fetch capability sources + start core) and `PROJECT_ROOT` (integration target). +> These two **may or may not be the same** — do not mix them up. + +### 0.2 Hard Rules for Path Usage + +1. **All commands that call Skill-bundled scripts / assets must use the absolute path of `SKILL_ROOT`**, e.g.: + ```bash + cd "$SKILL_ROOT" && python3 scripts/add-capability.py ... 
+ # or + python3 "$SKILL_ROOT/scripts/add-capability.py" ... + ``` + **Do not** write bare relative paths (e.g., `python3 scripts/...`) assuming they resolve against the workspace root — that was the root cause of bugs in previous versions. +2. For all command templates in this document that contain `$SKILL_ROOT` / `$PROJECT_ROOT`, the Agent **must substitute them with actual absolute paths** before execution. +3. The scripts themselves (`start.sh` / `add-capability.py` / `post-install-patch.py`) already self-locate + (via `__file__` / `BASH_SOURCE`), so as long as you invoke them with their **absolute path**, they work regardless of cwd. +4. If `SKILL_ROOT` cannot be determined immediately, fall back to a one-shot detection (do not ask the user to move directories): + ```bash + find "$PWD" -maxdepth 4 -name SKILL.md -path '*ai-service*' 2>/dev/null | head -1 + ``` + If still not found, ask the user where the Skill is installed. **Never ask the user to move the Skill directory to the workspace top level.** + +--- + +## 1. When to Use This Skill + +**Trigger conditions** (activate this Skill if any match): +- The user message contains one of the triggers.keywords +- The user message contains "TRTC" and refers to "customer service / after-sales / customer support" +- The user, in a session where this Skill is already loaded, explicitly expresses "let's start / run it / integrate it" + +**Not applicable** (refuse and explain): +- Purely a general voice conversation demo (not customer service business) → direct to the conversation-core README +- Requires digital human / outbound phone calls → not in current scope +- User is in a non-TRTC ecosystem (Agora / Shengwang) → suggest the corresponding Skill + +> **Product positioning note**: This Skill encapsulates **TRTC Conversational AI (voice)** capabilities. The selling point is "voice customer service." +> Therefore the demo scenario (Path A) is voice-first. 
If the user only wants plain text and merely reuses the RTC channel, advise them to configure it themselves. +> This Skill does **not** generate artifacts for text-only scenarios. + +--- + +## 2. Interaction Language Detection (Hard Constraint Throughout the Process) + +> **Purpose**: Throughout the setup process, all of the AI's guidance text, `ask_followup_question` questions / options, +> prompts, and summaries must **follow the natural language of the user's first prompt**. Do not hardcode Chinese. + +**Detection rules** (complete after Skill start, before §3; store the result in the internal variable `interaction_lang`): +- Use the **message that triggered this Skill** as the basis for detection +- Predominantly Chinese → `interaction_lang = zh` +- Predominantly English (or other non-Chinese language) → `interaction_lang = en` (approximate other languages with English) +- If the user explicitly requests a language switch mid-session → update `interaction_lang` immediately and apply to all subsequent interactions + +**Scope (must be followed)**: + +| Scenario | Requirement | +|---|---| +| Path selection options | question and each option use `interaction_lang` | +| Path B Q&A dialogue | use `interaction_lang` | +| Three-Keys setup dialogue | use `interaction_lang` | +| Contract alignment options and checklist | use `interaction_lang` | +| Post-launch entry list / trial suggestions | use `interaction_lang` | +| Error recovery / warning messages | use `interaction_lang` | + +**Relationship with artifact UI language** (only Path A involves UI): +- `interaction_lang` controls the language of the **setup process dialogue**. +- **Path A** artifact UI default language (`recipe.yaml metadata.language`) **defaults to following `interaction_lang`**, + unless the user specifies otherwise. +- **Path B** generates no UI, so there is only "dialogue language," not "artifact UI language." Delivered code comments / READMEs use `interaction_lang`. 
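The detection rules in §2 can be approximated with a simple character-count heuristic. The following is a minimal sketch, not the Skill's actual implementation: the function name `detect_interaction_lang` and the choice to compare CJK characters against Latin words are assumptions made for illustration.

```python
import re

# CJK Unified Ideographs (basic block). Counting these approximates
# "predominantly Chinese"; note that Japanese kanji also fall in this range.
_CJK_CHAR = re.compile(r"[\u4e00-\u9fff]")
_LATIN_WORD = re.compile(r"[A-Za-z]+")


def detect_interaction_lang(message: str) -> str:
    """Return 'zh' if the triggering message is predominantly Chinese, else 'en'.

    Hypothetical helper: any non-Chinese language is approximated as 'en',
    mirroring the rule stated in the detection rules of this section.
    """
    cjk_chars = len(_CJK_CHAR.findall(message))
    latin_words = len(_LATIN_WORD.findall(message))
    return "zh" if cjk_chars > latin_words else "en"
```

Comparing characters against words (rather than characters against characters) keeps mixed prompts such as a Chinese sentence containing "TRTC" or "AI" classified as Chinese. An explicit mid-session language switch would simply overwrite the stored value.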
+ +> Do not default to Chinese in conversations just because SKILL.md itself is originally written in Chinese. **Follow the user's language.** + +--- + +## 3. Environment Check (Fully Automatic — No User Action Needed) + +> **AI guidance text** (output the following in `interaction_lang`): + +Before we officially start, the system will automatically check if your runtime environment meets the requirements. You don't need to do anything for this step — just wait a moment. + +**Checks performed**: +- Python version >= 3.9 +- Skill directory files are intact +- Whether the three keys (cloud service credentials) have been configured + +If all checks pass, we'll automatically move to the next step. If something fails, the system will tell you exactly what's missing and how to fix it. + +--- + +**AI execution actions** (substitute `$SKILL_ROOT` in all commands with the absolute path determined in §0 before execution): + +### 3.1 Python ≥ 3.9 +```bash +python3 -c "import sys; assert sys.version_info >= (3, 9), sys.version" && echo OK || echo BAD_PY +``` +Fail → tell the user: +> Your Python version is too old. You need version 3.9 or above. You can download the latest version at https://www.python.org/downloads/. Once installed, we'll continue. + +**Do not proceed** until the Python version is satisfied. + +### 3.2 SKILL_ROOT Verification +```bash +test -f "$SKILL_ROOT/capabilities/conversation-core/manifest.yaml" && echo OK || echo MISSING +``` +- OK → path baseline is correct. Continue. +- MISSING → `$SKILL_ROOT` was set incorrectly. Use the `find` fallback from §0.2 item 4 to re-determine `SKILL_ROOT`, then rerun this check. Only ask the user for the Skill install location if it still fails. + +### 3.3 .env Status +```bash +test -f "$SKILL_ROOT/capabilities/conversation-core/.env" && echo OK || echo MISSING +``` +- OK → Indicates the three keys have been configured before. Tell the user: + > I see you've configured your keys before. We can reuse them directly. 
If you want to reconfigure, just let me know. + Can skip §5 (unless the user explicitly wants to "reconfigure keys"). +- MISSING → The first step must be §5 Three-Keys Configuration. + +--- + +## 4. Path Selection + +> **AI guidance text**: + +Environment check passed! Now let's make a choice — how would you like to get started? + +--- + +**First required action**: Use the `ask_followup_question` tool to present a **single-choice question**: + +```json +[{ + "id": "path", + "question": "How would you like to set up your AI customer service?", + "options": [ + "Quick Start — Get the agent running right away. You'll see the results in your browser (a web chat window + ticket management dashboard). You'll need to configure 3 keys, and the system will automatically install default capabilities. You should see results within 2-3 minutes. Best for first-timers who want to see 'what this thing looks like'", + "Integrate into My System (backend capabilities only) — If you already have your own website or app and want to plug in the AI customer service 'brain', choose this. The system will provide a set of API interfaces with no web UI generated. You'll need to configure 3 keys, then choose the interaction mode and additional capabilities" + ], + "multiSelect": false +}] +``` + +- Choose A → Go to §6 (Path A: Quick Start) +- Choose B → Go to §7 (Path B: Integrate into My System) + +> Fallback when Coding Agent does not support `ask_followup_question`: +> List both paths in natural language and collect the user's answer from the conversation. **Do not make assumptions.** + +**Key boundaries the AI should proactively explain**: +> Whichever you choose, I'll walk you through it step by step. Here's a quick summary of the two paths: +> - Quick Start: I'll generate a complete customer service web interface for you. You'll be able to see and experience it right in your browser. 
+> - Integrate into My System: I'll give you the AI customer service backend capabilities only (API interfaces). The UI is yours: I'll hand you the API docs and sample code, and your developers can connect to them directly. + +--- + +## 5. Three-Keys Configuration + +> **Trigger condition**: §3.3 returned MISSING, or a key later failed verification by `verify-credentials.py`. +> Substitute `$SKILL_ROOT` in commands with absolute paths before execution. + +--- + +> **AI guidance text**: + +To get the customer service agent running, you'll need to configure 3 keys, which act as access passes for the cloud services. Don't worry, I'll walk you through each one. + +--- + +### 5.1 Configuration Methods + +You can configure keys in one of two ways: + +**Method 1: Fill them in yourself** +Open the `.env` file at `$SKILL_ROOT/capabilities/conversation-core/.env`, find the corresponding configuration items, and replace the values on the right side of the equals sign with your own. A complete configuration template is provided in §5.2; you can copy and paste the whole block into your `.env` file. + +**Method 2: Send them to me and I'll fill them in** +Send each key's value through the chat, and I'll write them into the `.env` file for you. Your key information is used only for this configuration write; it is never echoed back in chat or written to logs.
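As a rough illustration of what "writing the keys into `.env`" under Method 2 could involve, the sketch below updates or appends `KEY=value` pairs, leaves unrelated lines untouched, and restricts the file to mode 600 as required by the security constraints in §5.4. The helper names `upsert_env` and `masked_ack` are hypothetical and are not part of the Skill's scripts; the sketch assumes a POSIX filesystem.

```python
import os
from pathlib import Path


def upsert_env(env_path: Path, updates: dict[str, str]) -> None:
    """Set KEY=value pairs in a .env file, keeping all other lines as they are."""
    lines = env_path.read_text(encoding="utf-8").splitlines() if env_path.exists() else []
    remaining = dict(updates)
    out = []
    for line in lines:
        key = line.split("=", 1)[0].strip()
        is_assignment = "=" in line and not line.lstrip().startswith("#")
        if is_assignment and key in remaining:
            # Replace the existing value in place so ordering is preserved.
            out.append(f"{key}={remaining.pop(key)}")
        else:
            out.append(line)
    # Keys that were not present yet are appended at the end.
    out.extend(f"{k}={v}" for k, v in remaining.items())
    env_path.parent.mkdir(parents=True, exist_ok=True)
    env_path.write_text("\n".join(out) + "\n", encoding="utf-8")
    os.chmod(env_path, 0o600)  # owner read/write only


def masked_ack(name: str, value: str) -> str:
    """Build a confirmation that reveals only the length, never the key itself."""
    return f"{name}: received, length {len(value)}"
```

Returning only a length-based acknowledgement keeps the full key out of the chat transcript, which is the behavior the collection steps in §5.3 ask for.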
+ +--- + +### 5.2 Complete Configuration Template (can be given to the user for copy-paste) + +```bash +# ========================================== +# AI Customer Service Skill - Environment Variable Template +# Copy the entire block into your .env file and replace the values on the right side of the equals sign +# ========================================== + +# --- Key 1: Tencent Cloud API Credentials --- +# Get them here: https://console.tencentcloud.com/cam/capi +TENCENT_CLOUD_SECRET_ID=yourSecretId +TENCENT_CLOUD_SECRET_KEY=yourSecretKey + +# --- Key 2: TRTC Application Credentials --- +# Get them here: https://console.trtc.io/app +# (China-region accounts use: https://console.cloud.tencent.com/trtc) +# SDKAppID is an integer, e.g., 1400000000 +TRTC_SDK_APP_ID=yourSDKAppID +# SDKSecretKey is a 64-character string +TRTC_SDK_SECRET_KEY=yourSDKSecretKey + +# --- Key 3: LLM API Key --- +# Enter the API Key for the AI language model service you're using +LLM_API_KEY=yourAPIKey +# Fill in LLM_API_URL if using a non-OpenAI service +LLM_API_URL=yourAPIEndpoint +# Model name, e.g., gpt-4o / deepseek-chat / claude-3-opus +LLM_MODEL=yourModelName +``` + +--- + +### 5.3 Key-by-Key Collection Process + +#### Key 1: Tencent Cloud API Credentials (SecretId / SecretKey) + +**The AI should say**: +> Let's start with Key 1: Tencent Cloud API Credentials. This key proves you have permission to use TRTC's voice and calling services. +> +> A quick note: Tencent RTC (trtc.io) is Tencent Cloud's international RTC brand. Your TRTC account and Tencent Cloud account are connected, so you'll use a unified login. +> +> To get it: +> 1. If you haven't signed up yet, go to https://trtc.io and create a TRTC account first +> 2. After logging in, open: https://console.tencentcloud.com/cam/capi (your login session will sync automatically) +> 3. You'll see a page called "API Key Management." There will be a **SecretId** and a **SecretKey** (you may need to click "Show" to see the full content) +> +> Fill the two values into the code block below.
Make sure to replace the placeholder text (`yourSecretId` and `yourSecretKey`), then **copy and send the entire block to me**: +> +> ``` +> # My Tencent Cloud API credentials +> TENCENT_CLOUD_SECRET_ID=yourSecretId +> TENCENT_CLOUD_SECRET_KEY=yourSecretKey +> ``` + +**After the user replies with the code block**, the AI must parse the values on the right side of the equals sign: +1. Validate format: SecretId is typically 36 characters, `^[A-Za-z0-9]+$`; SecretKey is not empty +2. Tool: `write_to_file("$SKILL_ROOT/capabilities/conversation-core/.env", )` +3. Do not echo the full key; only confirm "Received, length/format OK" +4. Tool: `execute_command("cd \"$SKILL_ROOT\" && python3 scripts/verify-credentials.py --type tencent")` +5. Parse stdout JSON: + - `{"ok": true, ...}` → Tell the user "Key 1 verified successfully" and proceed to Key 2 + - `{"ok": false, "error": "E001"}` → Respond per the §5.5 error code table; ask the user to retry + - `{"ok": false, "error": "E000"}` → Check whether any value in the user's code block is still an unreplaced placeholder (e.g., `yourSecretId`); if so, prompt "I noticed a value still looks like a placeholder. Please send the complete code block again with all values filled in" + +#### Key 2: TRTC Application Credentials (SDKAppID / SDKSecretKey) + +**The AI should say**: +> All set! Now Key 2: TRTC Application Credentials. This key lets the customer service agent place voice calls and talk with customers by voice. +> +> To get it: +> 1. Open this page: https://console.trtc.io/app (If you're using a Tencent Cloud China-region account, use: https://console.cloud.tencent.com/trtc) +> 2. Find the "Conversational AI" application you previously created (create a new one if you don't have one yet) +> 3. Once inside the application, you'll need to find two pieces of information: +> - **SDKAppID**: a string of numbers +> - **SDKSecretKey**: a long string of mixed letters and numbers (found in the "Server-side Integration" section) +> 4.
⚠️ Important: There may also be something called STSecretKey on the page — that one is for the client side. We don't want that. We need the **SDKSecretKey** (the server-side one) +> +> Fill the two values into the code block below. Make sure to replace the placeholder text (`yourSDKAppID` and `yourSDKSecretKey`), then **copy and send the entire block to me**: +> +> ``` +> # My TRTC application credentials +> TRTC_SDK_APP_ID=yourSDKAppID +> TRTC_SDK_SECRET_KEY=yourSDKSecretKey +> ``` + +**After the user replies with the code block**, the AI must parse the values on the right side of the equals sign: +1. Validate: SDKAppID is an integer; SDKSecretKey must be 64 characters `[0-9a-f]` + (**Special case**: if 128 characters detected and the first 64 equal the last 64 → auto-truncate to first 64 and inform the user) +2. Tool: `write_to_file` append `TRTC_SDK_APP_ID=` + `TRTC_SDK_SECRET_KEY=` (default international site; do not write TRTC_REGION) +3. Tool: `execute_command("cd \"$SKILL_ROOT\" && python3 scripts/verify-credentials.py --type trtc")` +4. Parse stdout JSON as above (on failure, respond per §5.5 error code table; if value is still a placeholder, prompt to resend) + +#### Key 3: LLM API Key + +**The AI should say**: +> Great! Last one — the LLM API Key. This key lets the customer service agent "think" — understand customer questions and generate replies. You'll need an account with an AI language model service provider. +> +> If you don't have an LLM account yet, you can pick one from the providers below, sign up, and get an API Key. 
(The API Key page link is listed for each — just click to go directly): +> +> | Provider | Model Series | Get your API Key here | +> |----------|-------------|----------------------| +> | OpenAI | GPT Series | https://platform.openai.com/api-keys | +> | Anthropic | Claude Series | https://console.anthropic.com/settings/keys | +> | Google AI | Gemini Series | https://aistudio.google.com/apikey | +> | DeepSeek | DeepSeek Series (cost-effective, strong Chinese) | https://platform.deepseek.com/api_keys | +> | Together AI | Open-source model hosting | https://api.together.ai/settings/api-keys | +> | Groq | High-performance inference | https://console.groq.com/keys | +> | Cohere | Enterprise AI | https://dashboard.cohere.com/api-keys | +> | Mistral AI | Mistral Series (EU provider) | https://console.mistral.ai/api-keys | +> +> Once you've chosen a provider and gotten your API Key, fill it into the code block below (replace the placeholder text), then **copy and send the entire block to me**: +> +> ``` +> # My LLM API configuration +> LLM_API_KEY=yourAPIKey +> LLM_API_URL=yourAPIEndpoint +> LLM_MODEL=yourModelName +> ``` +> +> Things to keep in mind: +> - If you're using **OpenAI**, you can delete the `LLM_API_URL` line (the default is OpenAI's endpoint) +> - If you're using another provider (e.g., DeepSeek, Claude, Gemini, etc.), you must fill in both `LLM_API_URL` and `LLM_MODEL`. Check your provider's documentation for the exact values — search for "API Base URL" and "Model Name" + +**After the user replies with the code block**, the AI must parse the values on the right side of the equals sign: +1. Validate: `LLM_API_KEY` is not empty +2. If `LLM_API_URL` is empty or is a placeholder, default to `https://api.openai.com/v1` +3. If `LLM_MODEL` is empty or is a placeholder, default to `gpt-4o` +4. Tool: `write_to_file` append `LLM_API_KEY=` + `LLM_API_URL=...` + `LLM_MODEL=...` +5. 
Tool: `execute_command("cd \"$SKILL_ROOT\" && python3 scripts/verify-credentials.py --type llm")` +6. Parse stdout JSON: + - `{"ok": true, ...}` → Tell the user "All three keys are ready, moving to the next step" + - `{"ok": false, "error": "E003"}` → Respond per §5.5 error code table with hints + - If a value is still an unreplaced placeholder, prompt "I see the code block still has placeholder text that hasn't been replaced. Please fill in all values and resend" + +--- + +### 5.4 Security Constraints (Red Lines: Violations Are Considered Defects) + +| Red Line | Correct Approach | +|---|---| +| Do not pass keys as command-line arguments to any script | Write to .env via write_to_file, then call verify-credentials.py with only `--type`, never with a key value | +| Do not echo the full key value in chat replies | Only confirm "Received + length/format OK" | +| Do not output keys to logs / stdout | verify-credentials.py automatically outputs only ok/error/message/latency_ms | +| Do not use `echo $SECRET` / `cat .env` | Never print secrets to the terminal; shell history / terminal logs will record them | +| After writing .env, its permissions must be 600 | execute_command("chmod 600 \"$SKILL_ROOT/capabilities/conversation-core/.env\"") | + +--- + +### 5.5 Error Codes → AI Response Templates + +| error | Meaning | What the AI should tell the user | +|---|---|---| +| E000 | Credential not configured / empty | "It looks like this entry in .env is empty or missing. Please send it again" | +| E001 | Tencent Cloud API verification failed | "Tencent Cloud API verification failed. Common causes: ① Id/Key order might be swapped ② Key may have been disabled ③ STS service not enabled on your account. Please check at console.cloud.tencent.com/cam" | +| E002 | TRTC verification failed | "TRTC verification failed. Please double-check: ① Does the SDKAppID belong to your account ② Did you mix up SDKSecretKey and STSecretKey ③ China-region apps may need `TRTC_REGION=cn` added to .env" | +| E003 | LLM verification failed | "LLM verification failed.
If you're using a non-OpenAI service, you may need to update the API endpoint. Which provider are you using?" | +| E004 | Network unreachable | "Cannot reach the verification server. Please check: ① Do you need a proxy ② Is there a corporate firewall ③ Is your network working. You can also skip deep verification and continue" | + +--- + +## 6. Path A: Quick Start + +> User selected A in §4. +> Default artifact: **Voice Customer Service UI** (TRTC real connection, FAQ silent RAG, handoff queue animation + simulated connection, product/order business panel). +> Substitute `$SKILL_ROOT` in all commands with absolute paths before execution. + +--- + +> **AI guidance text** (Path A entry point): +> Alright, going with the Quick Start path! I'll set up the entire customer service system for you. You don't need to do anything — just wait a moment. +> +> This path will automatically install the following capabilities: +> - **Conversation capability**: The agent can actually understand what you say and respond (because real AI keys are configured) +> - **Human handoff**: You'll see the handoff flow and UI (using demo data) +> - **Knowledge base**: You'll see KB search results in action (using demo documents) +> - **Session summary**: Default-installed in Path A — when a handoff ticket is created, an LLM-generated summary of the conversation is written into the ticket Description so agents see the context immediately +> +> Once set up, you'll open your browser and see a full customer service chat interface and a ticket management dashboard. 
+ +--- + +### 6.0 Deployment Parameters (Adjustable) + +| Parameter | Default | Description | +|---|---|---| +| Deployment directory | `$PROJECT_ROOT/ai-customer-service-demo/` | Where the demo UI lands (independent of the Skill folder, easy for later customization); use a different directory if the user requests it | +| Port | `3000` | If occupied or user specifies a different one: `bash "$SKILL_ROOT/start.sh" --port `, then sync the port in all subsequent health checks / URLs | + +--- + +### 6.1 Step Sequence (6 Steps) + +**Step 1: Configure the Three Keys** +- Tool: `execute_command("test -f \"$SKILL_ROOT/capabilities/conversation-core/.env\" && echo OK || echo MISSING")` +- Returns OK → Proceed to Step 2 +- Returns MISSING → Enter §5 Three-Keys sub-flow, then return to Step 2 when done + +**Step 2: Assemble Capability Packages** + +> **AI should tell the user**: +> Installing dependencies and assembling default capabilities — this should take about 30-60 seconds... + +- Tool: `execute_command("cd \"$SKILL_ROOT\" && python3 scripts/add-capability.py knowledge-base human-handoff --apply --json")` +- Expected: JSON output with all `reports[*].errors == []`, no fatal `injection.error` +- Failure handling: + - Circular dependency / version conflict → explain to user based on stderr output; stop + - L2 (templates) → installed to templates directory; tell user where to manually inject + - L3 (manual) → output the path `$SKILL_ROOT/auto_adapters/integration_templates/generic-frontend.md` + +**Step 2.5: Post-Install Patch (Must Run)** +- Tool: `execute_command("cd \"$SKILL_ROOT\" && python3 scripts/post-install-patch.py")` +- Expected: returns `{"ok": true, ...}` +- This script does 3 things: + - Fixes stale extension point injection errors + - Appends recipe default capability config to `.env` (existing values untouched) + - Verifies `server.py`'s `StaticFiles(html=True)` + +**Step 3: UI Overlay (Must Run — Path A Exclusive) —— Default Voice Customer Service UI** +- 
Artifacts deployed to `$PROJECT_ROOT/ai-customer-service-demo/` (independent of Skill directory, easy for later edits) +- Tool (one command to create directory and copy): + ```bash + execute_command( + "mkdir -p \"$PROJECT_ROOT\"/ai-customer-service-demo/admin && \ + cp \"$SKILL_ROOT\"/scenarios/customer-service/ui/voice-customer-service/{index.html,app.js,styles.css,data.js,mock-shop.json,tokens.css} \ + \"$PROJECT_ROOT\"/ai-customer-service-demo/ && \ + cp -R \"$SKILL_ROOT\"/scenarios/customer-service/ui/admin-board/. \ + \"$PROJECT_ROOT\"/ai-customer-service-demo/admin/ && \ + echo \"WEB_DEMO_DIR=$PROJECT_ROOT/ai-customer-service-demo\" >> \"$SKILL_ROOT\"/capabilities/conversation-core/.env" + ) + ``` +- Expected: `$PROJECT_ROOT/ai-customer-service-demo/` contains `index.html / app.js / styles.css / data.js / mock-shop.json / tokens.css` + `admin/` subdirectory, and `WEB_DEMO_DIR` is written to `.env` +- Failure handling: check that `$SKILL_ROOT/scenarios/customer-service/ui/voice-customer-service/` is intact + +**Step 4: Proactively List business_contract** (enter §9) + +**Step 5: Start the Demo** + +> **AI should tell the user**: +> Starting the customer service system. The first launch needs to install some dependency packages and may take 30-60 seconds. Please wait... + +- Tool: `execute_command("cd \"$SKILL_ROOT\" && nohup bash start.sh > /tmp/ai-cs-start.log 2>&1 &")` +- Tool: `execute_command("sleep 8 && curl -sS http://localhost:3000/api/v1/health")` +- First launch creates venv + runs pip install, **typically takes 30-60s** + - If health check fails after sleep 8 → Tool: `execute_command("sleep 25 && curl -sS http://localhost:3000/api/v1/health")` try again + - Still fails → `tail -80 /tmp/ai-cs-start.log` check for pip install errors / port conflicts +- Health check returns `{"status":"ok",...}` → Proceed to Step 6 + +**Step 6: Output Entry List + Trial Suggestions** + +> **The AI should say**: +> All done! 
Your AI customer service agent is up and running. Open the following URLs in your browser to see it in action: + +| Page | URL | Description | +|---|---|---| +| AI Voice Agent | http://localhost:3000 | (customer service chat interface) | +| Admin board | http://localhost:3000/static/admin/ | (ticket management dashboard) | +| API docs (Swagger) | http://localhost:3000/docs | (API documentation) | +| Health probe | http://localhost:3000/api/v1/health | (health check) | + +``` +Try saying / typing: + · "How do I get a refund" → AI replies; KB silently augments answer + · "Talk to agent" → handoff queue + 8s progress bar + simulated connect + · Click any product / order card → auto-asks the AI about that item +``` + +> Note: The human handoff and knowledge base are using simulated data, so you won't see real business integration effects. If you want to connect to a real business system, you can start over and choose "B — Integrate into My System." + +--- + +### 6.2 Don'ts + +- ❌ Use bare relative paths to call scripts (must `cd "$SKILL_ROOT"` or use absolute paths — see §0) +- ❌ Skip .env check before assembling capability packages +- ❌ Pass any key via command-line arguments to scripts +- ❌ Modify `capabilities/*/src/core/` (this is the skeleton layer; do not touch) +- ❌ Skip Step 2.5 (not running post-install-patch.py leaves stale injection errors from add-capability → NameError on startup) +- ❌ Skip Step 3 (not running UI overlay leaves `/` at the conversation-core built-in voice self-check page → not the intended artifact) +- ❌ Say the admin board URL is `/admin/tickets` (**the correct path is `/static/admin/`**) +- ❌ Execute `git commit` / `git push` (unless the user explicitly requests it) +- ❌ Echo full key content in chat replies + +--- + +## 7. Path B: Integrate into My System (Backend Capabilities Only) + +> User selected B in §4. 
+> **Key positioning**: Integrate TRTC Conversational AI **backend capabilities** into the user's **existing project** (`PROJECT_ROOT`). +> - `conversation-core` is the core: must **end-to-end verify the voice conversation pipeline** (test until you can actually converse). +> - Other incremental capabilities (knowledge-base / human-handoff / session-summary / tool-calling): +> Only deliver **interface specifications + mock implementations + sample code**. The user replaces them with their own systems as needed. +> - **This path NEVER generates any frontend UI** — the UI is the user's own frontend/backend responsibility. + +--- + +> **AI guidance text** (Path B entry point — must explicitly state boundaries): +> Alright, going with the "Integrate into My System" path. This path will plug the AI customer service **backend capabilities** into your existing project. +> +> Here's what I'll do: +> - Install the voice conversation core (conversation-core) and run it end-to-end to confirm it can actually converse +> - For extra capabilities like knowledge base, human handoff, session summaries, etc., I'll only provide **interface specs + mock implementations + sample code**. You swap in your own real systems as needed +> - **I will not generate any web UI** — the UI is handled by your own project's frontend +> +> Now, let's walk through a few steps: first confirm your project, then pick capabilities, and finally choose the interaction mode for the agent. + +--- + +### 7.1 Confirm Integration Target (PROJECT_ROOT & Tech Stack) + +1. Confirm `PROJECT_ROOT` (default = current workspace root). If the user's project is in a subdirectory, have them specify it as `--target-project`. +2. Let the script auto-detect the project tech stack (no manual entry needed): + ```bash + cd "$SKILL_ROOT" && python3 scripts/add-capability.py --list --json + ``` + Tech stack detection is triggered automatically by `--target-project` during Step 7.3 assembly (`stack_detector`). 
+ If auto-detection is inaccurate, override with `--tech-stack `. + +### 7.2 Configure Three Keys +- Tool: `execute_command("test -f \"$SKILL_ROOT/capabilities/conversation-core/.env\" && echo OK || echo MISSING")` +- MISSING → Enter §5 to complete the three keys (voice core hard-depends on all three keys — all are mandatory). + +### 7.3 Capability Selection (Optional Incremental Capabilities — Multi-Select) + +> **The AI should say** (using `ask_followup_question` multi-select mode): +> Now let's decide what extra capabilities the agent should have. Besides the built-in voice conversation capability, you can add the following. You can pick multiple, or none at all. Without any extras, the agent will only have basic conversation ability. + +| # | Capability Package | Description | What you'll get | +|---|---|---|---| +| 1 | Knowledge Base | FAQ / KB search | Upload a return policy PDF — the agent automatically answers "How do I return this?" | +| 2 | Human Handoff | Auto-escalate to a human when the bot can't handle it | Complex issues (complaints, refund disputes) are automatically routed to a human agent, with a ticket dashboard | +| 3 | Tool Calling | Let the agent query your system's data | Customer asks "Where's my order?" → agent queries your database and returns shipping status | +| 4 | Session Summary | Auto-generate a summary after each conversation | After each chat, a summary is written so you can review what the customer said and archive it | + +```json +[{ + "id": "capabilities", + "question": "Which additional capabilities do you need? (multi-select)", + "options": [ + "① Knowledge Base — FAQ / KB search", + "② Human Handoff — Escalate to human + ticket flow", + "③ Tool Calling — Let AI call your business tools", + "④ Session Summary — Auto-generate summaries after sessions", + "(None — just basic conversation)" + ], + "multiSelect": true +}] +``` + +> Made your choice? Just tell me the numbers (e.g., "1, 2, 3" or "all"). 
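When `ask_followup_question` is unavailable and the user answers in free text, the reply has to be mapped back to capability package ids. The sketch below is one possible way to do that; the function name `parse_selection` is hypothetical, and the numbering simply follows the capability table in this section (1 = knowledge-base, 2 = human-handoff, 3 = tool-calling, 4 = session-summary).

```python
import re

# Numbering follows the capability table in this section.
CAPABILITIES = {
    1: "knowledge-base",
    2: "human-handoff",
    3: "tool-calling",
    4: "session-summary",
}

# Circled digits appear in the option labels, so users may paste them back.
_CIRCLED = {"①": 1, "②": 2, "③": 3, "④": 4}


def parse_selection(reply: str) -> list[str]:
    """Map a free-text reply such as '1, 2, 3' or 'all' to capability ids."""
    text = reply.strip().lower()
    if text == "all":
        return list(CAPABILITIES.values())
    for symbol, number in _CIRCLED.items():
        text = text.replace(symbol, f" {number} ")
    picked: list[str] = []
    for token in re.findall(r"\d+", text):
        name = CAPABILITIES.get(int(token))
        if name and name not in picked:  # ignore unknown numbers and duplicates
            picked.append(name)
    return picked
```

An empty result corresponds to the "(None, just basic conversation)" option, in which case the assembly command is skipped and only the voice core runs.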
+ +**Assembly command** (renders incremental capability adapters / samples into the user's project): +```bash +cd "$SKILL_ROOT" && python3 scripts/add-capability.py \ + --target-project "$PROJECT_ROOT" --apply --json +# If none selected, skip this command (only runs voice core) +``` +- `--target-project` triggers `auto_adapters` three-tier fallback rendering: + - L1: Based on detected tech stack, renders **room-entry components / backend proxy route examples** into `$PROJECT_ROOT` + - L2: Renders to templates directory and lists TODOs + - L3: Outputs generic integration guide for manual connection +- Parse the returned JSON and tell the user where the files landed + +**Post-install patch (must run)**: +```bash +cd "$SKILL_ROOT" && python3 scripts/post-install-patch.py +``` + +### 7.4 I/O Modality Selection (Choose the Agent's "Communication Method") + +> **The AI should say** (using `ask_followup_question` single-choice mode): +> Now let's decide the agent's "communication method" — how will your customer service agent interact with customers? 
Here are 4 options — **pick the one** that best fits your business: + +| # | Modality | Plain Description | Best For | +|---|---|---|---| +| 1 | Text-only IM | Agent replies via text chat only | Web live chat, in-app messaging, WeChat customer service | +| 2 | Text + TTS | Agent replies in text, with text-to-speech read aloud to the customer | Need voice feedback but don't want a phone line — e.g., smart speakers, app voice assistants | +| 3 | Full Modality | Text and voice both available — the most complete interaction | High-end scenarios requiring both text and voice | +| 4 | Voice-only Call | Agent communicates only via voice call, no text interface | Call centers, 400-phone customer service, voice hotlines | + +```json +[{ + "id": "modality", + "question": "Which communication method?", + "options": [ + "① Text-only IM — Chat via text", + "② Text + TTS — Text replies + voice readout", + "③ Full Modality — Both text and voice", + "④ Voice-only Call — Voice call only" + ], + "multiSelect": false +}] +``` + +> Made your choice? Just tell me the number. + +### 7.5 End-to-End Verification (Voice Core — No UI) + +> Since no UI is provided, voice quality is verified by the user in their own frontend. The Skill-side acceptance criteria are as follows (all three passing = end-to-end verified): + +1. **Health self-check**: `GET /api/v1/health` — three LEDs (tencent_cloud / trtc / llm) all green +2. **Control plane up**: `POST /api/v1/agent/start` returns `TaskId / SessionId` successfully +3. **Integration sample delivered**: Room-entry / control sample code rendered per the user's tech stack has landed in `$PROJECT_ROOT` + +Start core: +```bash +cd "$SKILL_ROOT" && nohup bash start.sh > /tmp/ai-cs-start.log 2>&1 & +sleep 8 && curl -sS http://localhost:3000/api/v1/health +``` + +### 7.6 Final Deliverables + +> **The AI should say**: +> Assembly complete! Your AI customer service backend capabilities are ready. 
Here's what has been delivered: + +- `/api/v1/*` backend API contract (output in §9) +- Outbound contract checklist + mock descriptions + replacement guide for each incremental capability +- Integration sample code paths matching your tech stack +- Integration entry point: `/docs` (Swagger) after launch + +> What you need to do next: hand the API checklist to your developers and have them follow the documentation to integrate the AI customer service capabilities into your website or app. If you run into issues during integration, come back anytime. + +### 7.7 Don'ts (Path B) +- ❌ Generate any frontend UI / apply voice-customer-service / widget-floating UI (those are Path A only) +- ❌ Use bare relative paths to call scripts (see §0) +- ❌ Replace mock implementations with real business systems on behalf of the user (only provide specs + adapters; the user decides when to swap) +- ❌ Modify `capabilities/*/src/core/` skeleton layer + +--- + +## 8. Capability Linking: Human Handoff ↔ Session Summary (Implemented) + +> When **human-handoff and session-summary are both installed** they automatically link up — no extra configuration by the AI needed. +> In Path A, session-summary is **default-installed** (see `recipe.yaml` `install:`), so the linkage is active out of the box. In Path B it links up only if the user selected session-summary. + +**Behavior**: When human-handoff **creates a ticket**, it best-effort triggers session-summary to generate an **LLM one-paragraph summary** of the conversation (from AI connect → handoff trigger) and writes it into the ticket's **`Description`** field. When an agent opens the ticket details on the dashboard, they **directly see** this conversation summary under "Conversation summary" — no separate "Session Summary" block, no manual "Generate Summary" click needed. 
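The best-effort behavior described above can be pictured as a small wrapper that never lets a summary failure block ticket creation. This is an illustrative sketch only: `attach_summary_best_effort` and its parameters are hypothetical names, and the real linkage code in the Skill may be structured differently. The only detail taken from the text is that the summary lands in the ticket's `Description` field.

```python
from typing import Callable, Optional


def attach_summary_best_effort(
    ticket: dict,
    record: dict,
    summarize: Optional[Callable[[dict], str]],
) -> dict:
    """Return the ticket with an LLM summary in 'Description' when one is available.

    `summarize` is None when session-summary is not installed. Any failure
    leaves the ticket unchanged so the main handoff flow is never affected.
    """
    if summarize is None:
        return ticket
    try:
        summary = summarize(record)
    except Exception:
        # Soft dependency: swallow the error and keep the original ticket.
        return ticket
    if not summary:
        # No recorded turns or LLM not configured: keep the description as is.
        return ticket
    return {**ticket, "Description": summary}
```

Returning a new dictionary instead of mutating the input keeps the original ticket intact if the caller needs to retry or log the pre-summary state.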
+ +**Implementation notes (for maintainers)**: +- Linkage entry point: `capabilities/human-handoff/src/summary_link.py` (`attach_summary_to_ticket`) +- Summary generation: `capabilities/session-summary/src/summarizer.py` → `summarize_paragraph(record)` (LLM, uses `LLM_API_KEY` / `LLM_API_URL` / `LLM_MODEL`). Falls back to leaving the description unchanged if LLM is not configured or the session has no recorded turns. +- **Soft dependency**: Dynamically loads session-summary via conversation-core's `_capability_loader`; not installed / any exception → silently skip, **does not affect the main handoff flow** +- Dashboard rendering (Path A): `admin-board/app.js` renders the ticket `description` as "Conversation summary"; the legacy structured `session_summary` block has been removed +- The transcript is uploaded by the frontend right before `/handoff/request` via `POST /api/v1/summary/{session_id}/record`, so the recorder has the turns to summarize + +> The LLM summary call runs synchronously inside the ticket-creation chain and may take a few seconds. This is acceptable because the Path A frontend fires `/handoff/request` **without awaiting** it (the handoff animation plays in parallel); the ticket `Description` is populated by the time the agent refreshes the board. + +--- + +## 9. API Contract Alignment + +> Trigger condition: mandatory after assembly is complete. Substitute `$SKILL_ROOT` in commands with absolute paths before execution. + +### 9.1 List Outbound APIs for Current Capability Packages + +Read `manifest.yaml.business_contract.external_apis` for each capability package. +**Only list entries where `direction == outbound`**, outputting in the following natural language format: + +``` +✓ Installed: conversation-core + knowledge-base + human-handoff. +This session uses mock / local implementations as demo data. + +Our capability packages call the following external business APIs: + + 1. POST /tickets ← human-handoff ticket creation + 2. 
GET /tickets/{ticket_id} ← human-handoff ticket status query + 3. POST /tickets/{ticket_id}/cancel ← human-handoff ticket cancellation + 4. POST /faq/search ← knowledge-base FAQ search + 5. GET /faq ← knowledge-base FAQ list + 6. POST /faq ← knowledge-base FAQ create/update + 7. DELETE /faq/{entry_id} ← knowledge-base FAQ delete +``` + +> Path B reminder: The contract checklist is one of the **core deliverables** to the user. Even if the user chooses "run with mocks first," leave this checklist with them. + +### 9.2 Ask the User + +> **The AI should say**: +> Do you want to switch to a real ticketing / knowledge base system? +> - Connect to my own system and adapt the interfaces accordingly +> - Run with mock data for now: skip interface adaptation and start directly + +Use `ask_followup_question` single-choice: +- Connect own system → Enter §9.3 +- Run with mocks → Jump to §10 (do not change adapter config) + +### 9.3 contract-adapt Flow + +1. AI asks "Which capability package to align?" (multi-select: human-handoff / knowledge-base) +2. For each capability package: + - AI asks "Paste your API description: ① curl command ② OpenAPI YAML file path" + - After collecting, **write to temp files**: + - curl → `write_to_file(/tmp/adapt_.curl.txt, )` + - OpenAPI → user already has a path; pass it directly + - Tool: `execute_command("cd \"$SKILL_ROOT\" && python3 scripts/contract-adapt.py --curl-file /tmp/adapt_.curl.txt --json")` + or `--openapi-file ` +3. Parse returned JSON: + - `{"level":"L1","artifact":""}` → Tell the user "Generated user_custom.py — ready to enable" + - `{"level":"L2","artifact":"","todos":[...]}` → List TODOs for the user to fill in + - `{"level":"L3","guide":"INTERFACE_ADAPT.md#section"}` → Have the user follow the documentation + +### 9.4 Enable user_custom + +`write_to_file` append to `$SKILL_ROOT/capabilities/conversation-core/.env`: +``` +HH_ADAPTER=user_custom # or KB_ADAPTER=user_custom +HH_REST_BASE_URL=https://... +HH_REST_TOKEN=... 
# if applicable +``` + +--- + +## 10. Launch & Verification + +> Default port is 3000. Adjust with `--port ` if needed. If port is changed, sync all URLs / health checks below. +> Substitute `$SKILL_ROOT` in commands with absolute paths before execution. + +### 10.1 Launch +```bash +cd "$SKILL_ROOT" && nohup bash start.sh > /tmp/ai-cs-start.log 2>&1 & +# Custom port: cd "$SKILL_ROOT" && nohup bash start.sh --port 8080 > /tmp/ai-cs-start.log 2>&1 & +``` + +### 10.2 Health Self-Check (first launch needs ≥30s — pip install) +```bash +sleep 8 && curl -sS http://localhost:3000/api/v1/health +# If connection refused: wait longer +sleep 25 && curl -sS http://localhost:3000/api/v1/health +``` +Expected: response contains `"status":"ok"`, three LEDs (tencent_cloud / trtc / llm) all ok. + +### 10.3A Path A — All Green → Output Final Message +``` +Setup complete. Open the following URLs: + + · AI Voice Agent http://localhost:3000 (default) + · Admin board http://localhost:3000/static/admin/ + · API docs (Swagger) http://localhost:3000/docs + · Health probe http://localhost:3000/api/v1/health + +To stop: lsof -ti :3000 -sTCP:LISTEN | xargs kill +``` + +> **Correct entry**: The admin dashboard path is `/static/admin/` (**not** `/admin/tickets` — that route does not exist). + +### 10.3B Path B — Verification → Output Final Message (No UI) +``` +Backend capabilities integrated. Verification: + + · Health probe http://localhost:3000/api/v1/health (3 LEDs green) + · Control-plane POST /api/v1/agent/start → returns TaskId / SessionId + · API docs (Swagger) http://localhost:3000/docs (integration entry point) + +Delivered to your project ($PROJECT_ROOT): + · Integration sample code (room entry / control), invocation order: get_config → enter room → agent/start → agent/control → agent/stop + · Outbound contract checklist + mock descriptions for each incremental capability (swap with your real system as needed) + +UI is implemented by your own frontend. 
Verify voice quality from your frontend after entering a room. +To stop: lsof -ti :3000 -sTCP:LISTEN | xargs kill +``` + +--- + +## 11. Common Issues + +If you encounter any of the following issues, here are the corresponding solutions: + +| Issue | Cause | Solution | +|---|---|---| +| Key verification failed | Configured key expired or incorrect | Go back to §5 and recheck each key value. You can re-enter only the one that failed | +| Port is occupied | Port 3000 is in use by another program | Switch to a different port (e.g., `--port 8080`), or stop the program using port 3000 | +| Network unreachable | Corporate network or firewall restriction | Check if you need a proxy, or contact your network administrator to open the relevant domains | +| Python version too old | Python < 3.9 | Download the latest version from https://www.python.org/downloads/ | +| Error on startup | Dependency version conflict | The system will auto-fix it. If errors persist, send me the error message | +| Browser shows old UI (Path A) | Browser cached the old page | `Cmd+Shift+R` (Mac) or `Ctrl+Shift+R` (Windows) to force refresh | +| `/admin/tickets` returns 404 (Path A) | That route doesn't exist | The correct entry is `http://localhost:3000/static/admin/` | + +--- + +### 11.1 Error Recovery Technical Details + +#### Can't find assets / "No such file" / scripts won't run +- **Root cause**: Bare relative paths used; cwd is not `SKILL_ROOT` (most common issue in older versions). +- **Solution**: Re-determine the absolute `SKILL_ROOT` per §0. All commands must `cd "$SKILL_ROOT"` or use absolute paths. Rerun. +- **Never** ask the user to move the Skill directory to the workspace top level. + +#### .env exists but keys are invalid +1. AI proactively asks "Reconfigure? Keep old values or overwrite all?" +2. Choose "reconfigure" → During §5 flow, only re-ask for the failed key (keep others) +3. 
Choose "overwrite all" → Backup .env as .env.bak, start from §5 Key 1 + +#### Port is occupied +- Tool: `execute_command("lsof -ti :3000 -sTCP:LISTEN")` +- Ask the user "Kill process PID xxx? Or use a different port?" +- Change port: `cd "$SKILL_ROOT" && bash start.sh --port 8080` +- Kill process: `kill ` (requires explicit user consent) + +#### add-capability reports circular dependency +- Parse "circular dependency among: [...]" from stderr +- Tell the user which capabilities conflict; guide them to modify **manifest.yaml.dependencies** and retry +- Do not modify any manifest yourself + +#### LLM verification failed but user is using a non-OpenAI service +- Ask which service the user is using (DeepSeek / Qwen / Moonshot / Anthropic, etc.) +- Guide them to update `LLM_API_URL` and `LLM_MODEL`: + - DeepSeek: `https://api.deepseek.com/chat/completions`, model: `deepseek-chat` + - Others: Have the user provide the official base_url + chat completions path +- Rerun: `cd "$SKILL_ROOT" && python3 scripts/verify-credentials.py --type llm` + +#### verify-credentials.py returns E004 (network unreachable) +- Ask if behind a corporate network / need a proxy +- Quick fix: Have the user append `HTTPS_PROXY=...` to .env +- TRTC deep verification failure can be downgraded: `--no-deep` for local UserSig self-consistency check only + +#### `NameError: name 'session_id' is not defined` after startup +- **Root cause**: Stale injection position from older version +- **Solution**: Run `cd "$SKILL_ROOT" && python3 scripts/post-install-patch.py` +- Do not manually edit agent.py — let the patch script handle it + +#### contract-adapt.py parse failure +- Outputs `{"level":"L3", ...}` → Guide the user to read the corresponding capability package's INTERFACE_ADAPT.md +- Do not ask the user to repaste curl more than 2 times. On the 3rd attempt, go directly to L3 manual flow + +--- + +## 12. 
AI Tool Whitelist (Mandatory) + +> Substitute all `$SKILL_ROOT` / `$PROJECT_ROOT` with absolute paths before execution. Always `cd "$SKILL_ROOT"` or use absolute paths when calling scripts. + +### 12.1 Allowed Commands (execute_command) + +| Command | Purpose | +|---|---| +| `python3 -c "import sys; assert sys.version_info >= (3,9)"` | Prerequisite check | +| `test -f "$SKILL_ROOT/" && echo OK \|\| echo MISSING` | File existence check | +| `find "$PWD" -maxdepth 4 -name SKILL.md -path '*ai-service*'` | SKILL_ROOT fallback detection | +| `cd "$SKILL_ROOT" && python3 scripts/verify-credentials.py [--type tencent\|trtc\|llm] [--no-deep]` | Key verification | +| `cd "$SKILL_ROOT" && python3 scripts/add-capability.py --apply --json [--target-project "$PROJECT_ROOT"] [--tech-stack ...]` | Capability assembly | +| `cd "$SKILL_ROOT" && python3 scripts/post-install-patch.py` | Post-install patch | +| `cd "$SKILL_ROOT" && python3 scripts/contract-adapt.py [--curl-file ... \| --openapi-file ...] --json` | API contract adaptation | +| `cp "$SKILL_ROOT"/scenarios/customer-service/ui/voice-customer-service/{index.html,app.js,styles.css,data.js,mock-shop.json,tokens.css} "$PROJECT_ROOT"/ai-customer-service-demo/` | UI overlay (Path A only) | +| `cp -R "$SKILL_ROOT"/scenarios/customer-service/ui/admin-board/. 
"$PROJECT_ROOT"/ai-customer-service-demo/admin/` | Admin board mount (Path A only) | +| `mkdir -p "$PROJECT_ROOT"/ai-customer-service-demo/admin` | Create demo deployment directory (Path A only) | +| `echo "WEB_DEMO_DIR=" >> "$SKILL_ROOT"/capabilities/conversation-core/.env` | Write demo directory path (Path A only) | +| `cd "$SKILL_ROOT" && bash start.sh [--port N] [--https]` | Launch | +| `cd "$SKILL_ROOT" && nohup bash start.sh > /tmp/ai-cs-start.log 2>&1 &` | Background launch | +| `sleep N && curl -sS http://localhost:3000/api/v1/health` | Health check | +| `tail -80 /tmp/ai-cs-start.log` | Startup failure diagnostics | +| `lsof -ti :3000 -sTCP:LISTEN` | Check port usage | +| `chmod 600 "$SKILL_ROOT/capabilities/conversation-core/.env"` | Tighten permissions | + +### 12.2 Forbidden Commands + +| Command | Reason Prohibited | +|---|---| +| `python3 scripts/setup-credentials.py validate-tencent-cloud --secret-id ...` | Key passed via command line → shell history leak | +| `echo $TENCENT_CLOUD_SECRET_ID` | shell history leak | +| `cat "$SKILL_ROOT/capabilities/conversation-core/.env"` | May leak via terminal recording / screenshots | +| `git add . && git commit` | Credentials may be committed by mistake | +| Any command with plaintext keys as arguments | Same as above | +| Bare relative paths to call scripts (`python3 scripts/...` without `cd "$SKILL_ROOT"`) | Wrong cwd assumption → can't find assets | + +### 12.3 File Write Whitelist (write_to_file) + +| Path | Purpose | +|---|---| +| `$SKILL_ROOT/capabilities/conversation-core/.env` | Key write | +| `$PROJECT_ROOT/` | Path B: integration samples (written by script, not AI) | +| `$SKILL_ROOT/capabilities//src/adapters/user_custom.py` | Generated by contract-adapt.py | +| `/tmp/adapt_.curl.txt` | Temporary storage for user's curl | +| `/tmp/ai-cs-start.log` | nohup startup log | + +Other file writes require **explicit user confirmation** before writing. 
+**Special note**: `capabilities/conversation-core/src/agent.py` and `capabilities/conversation-core/src/server.py` are the skeleton layer. The AI should **not** directly edit them by hand. + +--- + +## 13. Design Standards Reference (Path A UI Only) + +> Path B generates no UI. This section does not apply to Path B. + +Path A UI must follow `$SKILL_ROOT/scenarios/customer-service/ui/design-system/DESIGN_GUIDELINES.md`: + +| Item | Mandatory Requirement | +|---|---| +| Theme | Light glassmorphism locked (soft purple + light pink + pale blue ambient over `#f7f3ff`; no dark mode toggle) | +| Colors | Everything via CSS variables from `tokens.css`; **no hardcoded hex values** | +| Font | `SF Pro / Inter / Helvetica Neue`, Chinese fallback to system default | +| Icons | Lucide / Phosphor style monoline SVG icons, sizes: 16/20/24/32 | +| Emoji | **Completely disabled** in the UI rendering layer (use SVG icons + text instead) | +| Glassmorphism panels | `backdrop-filter: blur(20px)` + `@supports` fallback | + +### 13.1 Top Bar LED Tooltip Convention + +The 3 LEDs in the top right each show a tooltip on hover: + +| LED | Title | Explanation | +|---|---|---| +| Cloud | Tencent Cloud API | Control-plane (CAM/STS); used to issue temporary credentials | +| TRTC | TRTC (Real-Time Communication) | Data-plane media channel; carries voice streams / subtitles / custom messages | +| LLM | LLM provider | Inference engine; OpenAI-compatible protocol; swappable with DeepSeek / GPT / Claude, etc. | + +--- + +> **Final Reminders** (for the Coding Agent to internalize): +> - 🔴 **Path baseline first**: Determine `SKILL_ROOT` (= injected Base directory) and `PROJECT_ROOT` per §0 before anything else. +> Always `cd "$SKILL_ROOT"` or use absolute paths for all script/asset commands. 
**Never ask the user to move directories.** +> - At each step, first call the tool to get facts, then explain to the user (don't answer from memory) +> - Tool call failure → give the user a stderr summary; **do not** hide errors +> - Uncertain field / path → use `read_file` to check the manifest, then answer +> - Strictly follow §12 Tool Whitelist and §5.4 Security Red Lines throughout +> - **This Skill's selling point is voice**; text-only requests → advise the user to configure it themselves; do not generate artifacts +> - **Path A** must run all 6 steps. Never skip Step 2.5 (post-install-patch) or Step 3 (UI overlay) +> - **Path B** never generates any UI; core end-to-end verified + incremental capabilities provide specs/mocks/samples only +> - human-handoff legacy API field is `state` (values: `waiting/connected/closed/canceled/timeout`), not `status`/`queued`/`cancelled` diff --git a/skills/trtc-ai-service/auto_adapters/README.md b/skills/trtc-ai-service/auto_adapters/README.md new file mode 100644 index 0000000..1ccaba2 --- /dev/null +++ b/skills/trtc-ai-service/auto_adapters/README.md @@ -0,0 +1,40 @@ +# auto_adapters - Tech Stack Decoupling Adapter Component Library + +> Read by the Agent during the integration phase. Based on the tech stack identified by `stack_detector`, +> selects the corresponding adapter, renders template code, and injects it into the user's project. 
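
The selection step can be sketched as below, using the stack-to-adapter mapping from the Adapter Index and the L1 / L2 / L3 levels from the degradation chain in this README. The function names `select_adapter` and `integrate`, and the `render` callable, are assumptions made for illustration; the real logic lives in `scripts/lib/degrader.py` and `scripts/add-capability.py` and may differ.

```python
from typing import Callable, Dict, Optional, Set

# Mapping copied from the Adapter Index table in this README.
ADAPTERS: Dict[str, Set[str]] = {
    "frontend-spa": {"react", "vue", "angular", "next"},
    "node-backend": {"express", "koa", "fastify"},
    "java-backend": {"spring-boot", "quarkus"},
    "python-backend": {"flask", "fastapi", "django"},
}


def select_adapter(primary: Optional[str]) -> Optional[str]:
    """Return the adapter name for a detected stack, or None if unsupported."""
    if primary is None:
        return None
    for adapter, stacks in ADAPTERS.items():
        if primary in stacks:
            return adapter
    return None


def integrate(primary: Optional[str], render: Callable[[str], None]) -> str:
    """Return the degradation level reached: 'L1', 'L2', or 'L3'."""
    adapter = select_adapter(primary)
    if adapter is None:
        return "L3"  # unknown or unsupported stack: manual REST API guide
    try:
        render(adapter)  # hypothetical: render template and write to project
    except Exception:
        return "L2"  # rendering failed: fall back to INTEGRATION_GUIDE.md
    return "L1"  # fully automatic
```

For example, `integrate("vue", render)` selects `frontend-spa` and reports `L1` when `render` succeeds, while `integrate(None, render)` reports `L3` without calling `render`.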
+ +## Adapter Index + +| Adapter | Matching Tech Stack | Injected Artifact | Default Target | +|:---|:---|:---|:---| +| `frontend-spa` | `react` / `vue` / `angular` / `next` | `VoiceAgent.{tsx,vue,ts}` component | `src/components/` | +| `node-backend` | `express` / `koa` / `fastify` | Reverse proxy middleware | `routes/voice-agent.js` | +| `java-backend` | `spring-boot` / `quarkus` | `Filter` or `Quarkus Filter` | `src/main/java/.../VoiceAgentFilter.java` | +| `python-backend` | `flask` / `fastapi` / `django` | Decorator / sub-router | `voice_agent_proxy.py` | + +## Template Rendering Variables + +All `.tpl` files use `${VAR}` placeholders (to avoid conflicts with JS / Python `{{}}`): + +| Variable | Default | Description | +|:---|:---|:---| +| `${SKELETON_BASE_URL}` | `http://localhost:3000` | conversation-core process address | +| `${API_PREFIX}` | `/api/v1` | Skeleton REST prefix | +| `${COMPONENT_NAME}` | `VoiceAgent` | Frontend component name | +| `${ROUTE_PREFIX}` | `/voice-agent` | Backend proxy route prefix | + +## Three-Level Degradation Chain + +``` +L1 Full Auto: stack_detector.primary matched → adapter.render() → write to user project + │ + │ Failed (syntax conflict / path conflict) + ▼ +L2 Semi-Auto: Output INTEGRATION_GUIDE.md (based on integration_templates/generic-*.md) + │ + │ stack_detector.primary is None + ▼ +L3 Manual API: Output integration_templates/generic-rest-api.md +``` + +See `scripts/lib/degrader.py` and `scripts/add-capability.py` for details. diff --git a/skills/trtc-ai-service/auto_adapters/frontend-spa/README.md b/skills/trtc-ai-service/auto_adapters/frontend-spa/README.md new file mode 100644 index 0000000..7935101 --- /dev/null +++ b/skills/trtc-ai-service/auto_adapters/frontend-spa/README.md @@ -0,0 +1,27 @@ +# frontend-spa Adapter + +> Connect the conversation-core skeleton REST API to any frontend SPA. 
+> The Agent selects the appropriate subdirectory template based on the current `tech_stack` during the L1 phase and renders it into the user's project. + +| tech_stack | Template | Default Target | +|:---|:---|:---| +| react / next | `react/VoiceAgent.tsx.tpl` | `src/components/${COMPONENT_NAME}.tsx` | +| vue | `vue/VoiceAgent.vue.tpl` | `src/components/${COMPONENT_NAME}.vue` | +| angular | `angular/voice-agent.component.ts.tpl` | `src/app/voice-agent/voice-agent.component.ts` | + +## Dependency Installation (written by Agent into package.json) + +- `trtc-sdk-v5 >= 5.0.0` + +## Security + +- Templates use `${SKELETON_BASE_URL}` for fetch; in production this must be replaced with an HTTPS address (skeleton manifest `security.network.enforce_https = true`). +- TRTC SDK requires a wss channel; allow in CSP: + ``` + connect-src https://${SKELETON_BASE_URL} wss://*.trtc.tencent-cloud.com; + ``` + +## Capability Overlay + +- With `tool-calling` installed: type `/tool xxx {...}` in the Send box to trigger local tool calls. +- With `human-handoff` installed: sending keywords like "talk to agent" triggers the queue-and-connect flow. 
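
For reference, the `${VAR}` placeholder syntax used by these templates is the same syntax Python's standard `string.Template` understands, so rendering can be sketched as follows. This is an illustration under assumptions, not the adapter's actual renderer: `render_template` and `DEFAULTS` are hypothetical names, and the default values are the ones listed in the Template Rendering Variables table.

```python
from string import Template
from typing import Mapping, Optional

# Defaults as documented in the Template Rendering Variables table.
DEFAULTS = {
    "SKELETON_BASE_URL": "http://localhost:3000",
    "API_PREFIX": "/api/v1",
    "COMPONENT_NAME": "VoiceAgent",
    "ROUTE_PREFIX": "/voice-agent",
}


def render_template(source: str,
                    overrides: Optional[Mapping[str, str]] = None) -> str:
    """Substitute known ${VAR} placeholders; leave unknown ones untouched."""
    values = {**DEFAULTS, **(overrides or {})}
    # safe_substitute never raises on unknown or malformed placeholders,
    # so any other `$` sequences in the template pass through unchanged.
    return Template(source).safe_substitute(values)
```

Using `safe_substitute` rather than `substitute` is a deliberate choice in this sketch: a template that happens to contain another `${...}` sequence is passed through instead of aborting the render.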
diff --git a/skills/trtc-ai-service/auto_adapters/frontend-spa/angular/voice-agent.component.ts.tpl b/skills/trtc-ai-service/auto_adapters/frontend-spa/angular/voice-agent.component.ts.tpl new file mode 100644 index 0000000..cc85bdd --- /dev/null +++ b/skills/trtc-ai-service/auto_adapters/frontend-spa/angular/voice-agent.component.ts.tpl @@ -0,0 +1,131 @@ +// voice-agent.component.ts +// Auto-generated by conversation-core/auto_adapters/frontend-spa (Angular) +// Connect to TRTC ConversationAI via skeleton REST +import { Component, OnInit } from '@angular/core'; +import { HttpClient } from '@angular/common/http'; + +const SKELETON_BASE_URL = '${SKELETON_BASE_URL}'; +const API_PREFIX = '${API_PREFIX}'; + +interface SessionCfg { + session_id: string; + sdk_app_id: number; + room_id: string; + user_id: string; + user_sig: string; +} + +@Component({ + selector: 'app-voice-agent', + template: ` +
+

TRTC Voice Agent

+
+ + + {{ kv.name }} + +
+ + + + + + +
{{ logs.join('\n') }}
+
+ `, +}) +export class VoiceAgentComponent implements OnInit { + health: any = null; + healthEntries: { name: string; status: string }[] = []; + session: SessionCfg | null = null; + logs: string[] = []; + text = ''; + trtc: any = null; + + constructor(private http: HttpClient) {} + + log(m: string) { + this.logs = [...this.logs.slice(-20), '[' + new Date().toLocaleTimeString() + '] ' + m]; + } + + ngOnInit() { + this.http.get(SKELETON_BASE_URL + API_PREFIX + '/health').subscribe({ + next: (h) => { + this.health = h; + this.healthEntries = Object.entries(h.checks || {}).map(([name, v]: [string, any]) => ({ + name, + status: v.status, + })); + }, + error: (e) => this.log('health failed: ' + e.message), + }); + } + + async start() { + try { + const cfg: any = await this.http + .post(SKELETON_BASE_URL + API_PREFIX + '/get_config', {}) + .toPromise(); + this.session = cfg.data; + this.log('session=' + cfg.data.session_id); + + const TRTC = (await import('trtc-sdk-v5')).default; + this.trtc = TRTC.create(); + await this.trtc.enterRoom({ + roomId: Number(cfg.data.room_id), + sdkAppId: cfg.data.sdk_app_id, + userId: cfg.data.user_id, + userSig: cfg.data.user_sig, + }); + await this.trtc.startLocalAudio(); + await this.http + .post(SKELETON_BASE_URL + API_PREFIX + '/agent/start', { + session_id: cfg.data.session_id, + language: 'zh', + }) + .toPromise(); + this.log('agent started'); + } catch (e: any) { + this.log('start failed: ' + e.message); + } + } + + async stop() { + if (!this.session) return; + try { + await this.http + .post(SKELETON_BASE_URL + API_PREFIX + '/agent/stop', { + session_id: this.session.session_id, + }) + .toPromise(); + if (this.trtc) { + await this.trtc.exitRoom(); + await this.trtc.destroy(); + this.trtc = null; + } + this.session = null; + this.log('stopped'); + } catch (e: any) { + this.log('stop failed: ' + e.message); + } + } + + async send() { + if (!this.session || !this.text.trim()) return; + try { + await this.http + .post(SKELETON_BASE_URL 
+ API_PREFIX + '/agent/control', { + session_id: this.session.session_id, + text: this.text, + interrupt: true, + }) + .toPromise(); + this.log('pushed: ' + this.text); + this.text = ''; + } catch (e: any) { + this.log('push failed: ' + e.message); + } + } +} diff --git a/skills/trtc-ai-service/auto_adapters/frontend-spa/manifest.yaml b/skills/trtc-ai-service/auto_adapters/frontend-spa/manifest.yaml new file mode 100644 index 0000000..efff14e --- /dev/null +++ b/skills/trtc-ai-service/auto_adapters/frontend-spa/manifest.yaml @@ -0,0 +1,57 @@ +# frontend-spa adapter manifest +# Rendered by Agent when targeting a frontend SPA project (React / Vue / Angular / Next). + +name: "frontend-spa" +version: "1.0.0" +target_role: "frontend" +tech_stack: ["react", "vue", "angular", "next"] + +# Render entries: tech_stack -> template file + default target path + install hints +templates: + react: + file: "react/VoiceAgent.tsx.tpl" + target_path: "src/components/${COMPONENT_NAME}.tsx" + install_hint: | + In your root component render section, add: + import { ${COMPONENT_NAME} } from './components/${COMPONENT_NAME}'; + <${COMPONENT_NAME} /> + package_dependencies: + - "trtc-sdk-v5@>=5.0.0" + next: + file: "react/VoiceAgent.tsx.tpl" + target_path: "components/${COMPONENT_NAME}.tsx" + install_hint: | + In app/page.tsx (or pages/index.tsx): + 'use client'; + import dynamic from 'next/dynamic'; + const ${COMPONENT_NAME} = dynamic(() => import('@/components/${COMPONENT_NAME}'), { ssr: false }); + package_dependencies: + - "trtc-sdk-v5@>=5.0.0" + vue: + file: "vue/VoiceAgent.vue.tpl" + target_path: "src/components/${COMPONENT_NAME}.vue" + install_hint: | + In parent component: + + + package_dependencies: + - "trtc-sdk-v5@>=5.0.0" + angular: + file: "angular/voice-agent.component.ts.tpl" + target_path: "src/app/voice-agent/voice-agent.component.ts" + install_hint: | + Add VoiceAgentComponent to your NgModule's declarations; + use in your template. 
+ package_dependencies: + - "trtc-sdk-v5@>=5.0.0" + +# Default variables (overridden by the top-level default_variables) +defaults: + SKELETON_BASE_URL: "http://localhost:3000" + API_PREFIX: "/api/v1" + COMPONENT_NAME: "VoiceAgent" + +# Security +security: + csp_hint: "TRTC SDK requires allowing wss://*.trtc.tencent-cloud.com and https://${SKELETON_BASE_URL}" + https_required: true diff --git a/skills/trtc-ai-service/auto_adapters/frontend-spa/react/VoiceAgent.tsx.tpl b/skills/trtc-ai-service/auto_adapters/frontend-spa/react/VoiceAgent.tsx.tpl new file mode 100644 index 0000000..1f38c12 --- /dev/null +++ b/skills/trtc-ai-service/auto_adapters/frontend-spa/react/VoiceAgent.tsx.tpl @@ -0,0 +1,142 @@ +// ${COMPONENT_NAME}.tsx +// Auto-generated by conversation-core/auto_adapters/frontend-spa +// Connect to TRTC ConversationAI via skeleton REST (minimal join-room + status display, non-business version) +// Install dependency: trtc-sdk-v5 +import React, { useEffect, useRef, useState } from 'react'; + +const SKELETON_BASE_URL = '${SKELETON_BASE_URL}'; +const API_PREFIX = '${API_PREFIX}'; + +interface HealthResp { + status: 'ok' | 'partial_failure' | string; + checks: Record; +} + +interface SessionConfig { + session_id: string; + sdk_app_id: number; + room_id: string; + user_id: string; + user_sig: string; + agent_user_id: string; +} + +export const ${COMPONENT_NAME}: React.FC = () => { + const [health, setHealth] = useState(null); + const [session, setSession] = useState(null); + const [logs, setLogs] = useState([]); + const [text, setText] = useState(''); + const trtcRef = useRef(null); + + const log = (msg: string) => + setLogs((prev) => [...prev.slice(-20), '[' + new Date().toLocaleTimeString() + '] ' + msg]); + + const callJson = async (path: string, body?: unknown) => { + const resp = await fetch(SKELETON_BASE_URL + API_PREFIX + path, { + method: body ? 'POST' : 'GET', + headers: { 'Content-Type': 'application/json' }, + body: body ?
JSON.stringify(body) : undefined, + }); + if (!resp.ok) throw new Error(path + ' -> HTTP ' + resp.status); + return resp.json(); + }; + + useEffect(() => { + callJson('/health').then(setHealth).catch((e) => log('health failed: ' + e.message)); + }, []); + + const start = async () => { + try { + const cfg = await callJson('/get_config', {}); + const data: SessionConfig = cfg.data; + setSession(data); + log('session=' + data.session_id + ' room=' + data.room_id); + + // Join room (user side) + const TRTC = (await import('trtc-sdk-v5')).default; + const trtc = TRTC.create(); + trtcRef.current = trtc; + await trtc.enterRoom({ + roomId: Number(data.room_id), + sdkAppId: data.sdk_app_id, + userId: data.user_id, + userSig: data.user_sig, + }); + await trtc.startLocalAudio(); + log('entered room'); + + // Start AI channel bot + await callJson('/agent/start', { session_id: data.session_id, language: 'zh' }); + log('agent started'); + } catch (e: any) { + log('start failed: ' + e.message); + } + }; + + const stop = async () => { + if (!session) return; + try { + await callJson('/agent/stop', { session_id: session.session_id }); + if (trtcRef.current) { + await trtcRef.current.exitRoom(); + await trtcRef.current.destroy(); + trtcRef.current = null; + } + log('stopped'); + setSession(null); + } catch (e: any) { + log('stop failed: ' + e.message); + } + }; + + const send = async () => { + if (!session || !text.trim()) return; + try { + await callJson('/agent/control', { + session_id: session.session_id, + text, + interrupt: true, + }); + log('text pushed: ' + text); + setText(''); + } catch (e: any) { + log('push failed: ' + e.message); + } + }; + + return ( +
+

TRTC Voice Agent

+
+ {health + ? Object.entries(health.checks).map(([k, v]) => ( + + + {' '} + {k} + + )) + : 'loading...'} +
+ {!session ? ( + + ) : ( + <> + + setText(e.target.value)} + placeholder="type message..." + style={{ marginLeft: 8, padding: 4, width: 240 }} + /> + + + )} +
+        {logs.join('\n')}
+      
+
+ ); +}; + +export default ${COMPONENT_NAME}; diff --git a/skills/trtc-ai-service/auto_adapters/frontend-spa/vue/VoiceAgent.vue.tpl b/skills/trtc-ai-service/auto_adapters/frontend-spa/vue/VoiceAgent.vue.tpl new file mode 100644 index 0000000..fcdda7d --- /dev/null +++ b/skills/trtc-ai-service/auto_adapters/frontend-spa/vue/VoiceAgent.vue.tpl @@ -0,0 +1,121 @@ + + + + + + diff --git a/skills/trtc-ai-service/auto_adapters/integration_templates/generic-backend.md b/skills/trtc-ai-service/auto_adapters/integration_templates/generic-backend.md new file mode 100644 index 0000000..1d8d509 --- /dev/null +++ b/skills/trtc-ai-service/auto_adapters/integration_templates/generic-backend.md @@ -0,0 +1,45 @@ +# Generic Backend Integration Guide (L2 Semi-Auto) + +> When the Agent has identified your backend tech stack but auto-rendering failed, follow these steps to complete the integration. +> Core idea: mount a reverse proxy route in your web framework that transparently forwards `${ROUTE_PREFIX}/*` +> to the skeleton process's `${API_PREFIX}/*`. + +## 1. Deploy conversation-core Skeleton + +```bash +cd capabilities/conversation-core +python -m src.server # Default listens on 0.0.0.0:3000 +``` + +## 2. Copy Reverse Proxy Template + +| Framework | Template | +|:---|:---| +| Express | `auto_adapters/node-backend/express.js.tpl` | +| Koa | `auto_adapters/node-backend/koa.js.tpl` | +| Fastify | `auto_adapters/node-backend/fastify.js.tpl` | +| Spring Boot | `auto_adapters/java-backend/springboot/VoiceAgentFilter.java.tpl` | +| Quarkus | `auto_adapters/java-backend/quarkus/VoiceAgentFilter.java.tpl` | +| Flask | `auto_adapters/python-backend/flask.py.tpl` | +| FastAPI | `auto_adapters/python-backend/fastapi.py.tpl` | +| Django | `auto_adapters/python-backend/django.py.tpl` | + +Replace placeholder variables (`${SKELETON_BASE_URL}` / `${API_PREFIX}` / `${ROUTE_PREFIX}`). + +## 3. 
Register Route + +Mount the router / filter / blueprint in your app entry point as described in the template's `install_hint`. + +## 4. Security Checklist + +- **HTTPS**: Enforce in production. +- **SSRF**: The skeleton address must not come directly from user input; if connecting to an internal network skeleton, first confirm with the user explicitly. +- **Request Body Limit**: Default `64KB`, preventing large payloads from overwhelming the ASR/LLM pipeline. +- **Auth**: Inject auth logic (JWT / API Key) in the reverse proxy; the skeleton itself only trusts requests from the reverse proxy source. + +## 5. Verify + +```bash +curl -s http://localhost:8000${ROUTE_PREFIX}/health | jq .status +# Expected: "ok" +``` diff --git a/skills/trtc-ai-service/auto_adapters/integration_templates/generic-frontend.md b/skills/trtc-ai-service/auto_adapters/integration_templates/generic-frontend.md new file mode 100644 index 0000000..dc6015f --- /dev/null +++ b/skills/trtc-ai-service/auto_adapters/integration_templates/generic-frontend.md @@ -0,0 +1,51 @@ +# Generic Frontend Integration Guide (L2 Semi-Auto) + +> When the Agent has identified your frontend tech stack but auto-rendering failed due to path conflicts or syntax errors, +> follow these steps to complete the integration manually. 
+ +## Step 1 · Install Dependencies + +```bash +npm install trtc-sdk-v5 +``` + +## Step 2 · Copy Component Template + +Choose the corresponding template from `auto_adapters/frontend-spa/` based on your framework: + +- React / Next: `react/VoiceAgent.tsx.tpl` +- Vue: `vue/VoiceAgent.vue.tpl` +- Angular: `angular/voice-agent.component.ts.tpl` + +Copy the template content to your project's components directory and replace these placeholder variables with real values: + +| Placeholder | Default | Description | +|:---|:---|:---| +| `${SKELETON_BASE_URL}` | `http://localhost:3000` | Skeleton process address | +| `${API_PREFIX}` | `/api/v1` | Skeleton REST prefix | +| `${COMPONENT_NAME}` | `VoiceAgent` | Component / file name | + +## Step 3 · Mount in Parent Component + +```tsx +import { VoiceAgent } from './components/VoiceAgent'; + +export default function Page() { + return
; +} +``` + +## Step 4 · CSP & HTTPS + +If CSP is deployed in production, append: + +``` +connect-src https://${SKELETON_BASE_URL} wss://*.trtc.tencent-cloud.com; +``` + +## Step 5 · Verify + +Open the page; you should see three LEDs at the top: `tencent_cloud / trtc / llm`. +Once all are green, click `Start` to join the room and talk to the AI. + +If any LED is red, check `.env` and the 3 keys based on the diagnostic JSON output in the page console. diff --git a/skills/trtc-ai-service/auto_adapters/integration_templates/generic-rest-api.md b/skills/trtc-ai-service/auto_adapters/integration_templates/generic-rest-api.md new file mode 100644 index 0000000..9068768 --- /dev/null +++ b/skills/trtc-ai-service/auto_adapters/integration_templates/generic-rest-api.md @@ -0,0 +1,93 @@ +# Generic REST API Integration Guide (L3 Manual Fallback) + +> When the Agent cannot identify your tech stack, or it's not in the supported adapter list, +> connect directly via the REST API exposed by conversation-core. + +## 1. Start the Skeleton + +```bash +cd capabilities/conversation-core +python -m src.server # Default 0.0.0.0:3000 +``` + +## 2. Endpoint List + +| Method | Path | Description | +|:---|:---|:---| +| GET | `/api/v1/health` | Real-time connectivity check for 3 keys | +| POST | `/api/v1/get_config` | Issue RoomId / UserSig | +| POST | `/api/v1/agent/start` | Start AI channel bot | +| POST | `/api/v1/agent/stop` | Stop AI channel bot | +| POST | `/api/v1/agent/control` | Text injection / interrupt | +| GET | `/api/v1/sessions` | In-memory session list (debugging) | + +Capability extension endpoints: + +| Capability | Prefix | +|:---|:---| +| knowledge-base | `/api/v1/kb/*` | +| tool-calling | `/api/v1/tools/*` | +| human-handoff | `/api/v1/handoff/*` | +| session-summary | `/api/v1/summary/*` | +| digital-human | `/api/v1/digital-human/*` | + +## 3. 
Call Examples + +### 3.1 Request Room Credentials + +```bash +curl -X POST http://localhost:3000/api/v1/get_config \ + -H "Content-Type: application/json" \ + -d '{}' +``` + +Response: + +```json +{ + "code": 0, + "data": { + "session_id": "xxx", + "sdk_app_id": 1234567890, + "room_id": "987654321", + "user_id": "u_abc", + "user_sig": "...", + "agent_user_id": "ai_xyz", + "io_modality": { "voice_input": { ... } } + } +} +``` + +### 3.2 Start AI Bot + +```bash +curl -X POST http://localhost:3000/api/v1/agent/start \ + -H "Content-Type: application/json" \ + -d '{"session_id":"xxx","language":"zh"}' +``` + +### 3.3 Text Injection + +```bash +curl -X POST http://localhost:3000/api/v1/agent/control \ + -H "Content-Type: application/json" \ + -d '{"session_id":"xxx","text":"Hello","interrupt":true}' +``` + +## 4. SDK Packages + +If you'd rather not call REST directly, use these SDKs: + +| Ecosystem | Package | +|:---|:---| +| npm | `@trtc/voice-agent-sdk` | +| maven | `com.tencent.trtc:voice-agent-sdk` | +| pypi | `trtc-voice-agent` | + +> SDK versions align with skeleton manifest; in Phase 2, REST is authoritative. + +## 5. Security Compliance + +- **HTTPS**: Enforce in production. +- **SecretKey not sent to client**: The skeleton only sends `user_sig` (with TTL) to the client; never exposes `SDKSecretKey`. +- **Log redaction**: The skeleton includes a built-in `RedactingFilter`; the reverse proxy layer should also suppress Authorization header logging. diff --git a/skills/trtc-ai-service/auto_adapters/java-backend/README.md b/skills/trtc-ai-service/auto_adapters/java-backend/README.md new file mode 100644 index 0000000..812150d --- /dev/null +++ b/skills/trtc-ai-service/auto_adapters/java-backend/README.md @@ -0,0 +1,25 @@ +# java-backend Adapter + +Connect the conversation-core skeleton as a Filter into Spring Boot / Quarkus projects. 
+ +| Framework | Template | Default Target | +|:---|:---|:---| +| Spring Boot | `springboot/VoiceAgentFilter.java.tpl` | `src/main/java/com/example/voiceagent/VoiceAgentFilter.java` | +| Quarkus | `quarkus/VoiceAgentFilter.java.tpl` | Same as above | + +## Configuration + +`application.yml` / `application.properties`: + +```yaml +skeleton: + base-url: ${SKELETON_BASE_URL} + api-prefix: ${API_PREFIX} + route-prefix: ${ROUTE_PREFIX} +``` + +## Notes + +- The template package `com.example.voiceagent` is replaced by the Agent during L1 rendering based on the user's actual project package name. +- Default `connectTimeout=3s`, `request timeout=10s`; adjustable as needed. +- Spring Boot registers `voiceAgentFilter` with order 10; should be placed before business Filters. diff --git a/skills/trtc-ai-service/auto_adapters/java-backend/manifest.yaml b/skills/trtc-ai-service/auto_adapters/java-backend/manifest.yaml new file mode 100644 index 0000000..82aa515 --- /dev/null +++ b/skills/trtc-ai-service/auto_adapters/java-backend/manifest.yaml @@ -0,0 +1,30 @@ +name: "java-backend" +version: "1.0.0" +target_role: "backend" +tech_stack: ["spring-boot", "quarkus"] +description: "Java backend Filter: forward requests to skeleton process" + +templates: + spring-boot: + file: "springboot/VoiceAgentFilter.java.tpl" + target_path: "src/main/java/com/example/voiceagent/VoiceAgentFilter.java" + install_hint: | + Filter already includes @Component; Spring Boot auto-scans it. + To customize the path prefix, modify the PATH_PREFIX constant or inject via application.yml. + package_dependencies: + - "org.springframework.boot:spring-boot-starter-web" + quarkus: + file: "quarkus/VoiceAgentFilter.java.tpl" + target_path: "src/main/java/com/example/voiceagent/VoiceAgentFilter.java" + install_hint: | + Filter already includes @Provider; place the file in any package path scanned by Quarkus. 
+ package_dependencies: + - "io.quarkus:quarkus-rest-client-reactive" + +defaults: + SKELETON_BASE_URL: "http://localhost:3000" + API_PREFIX: "/api/v1" + ROUTE_PREFIX: "/voice-agent" + +security: + https_required: true diff --git a/skills/trtc-ai-service/auto_adapters/java-backend/quarkus/VoiceAgentFilter.java.tpl b/skills/trtc-ai-service/auto_adapters/java-backend/quarkus/VoiceAgentFilter.java.tpl new file mode 100644 index 0000000..7520a2b --- /dev/null +++ b/skills/trtc-ai-service/auto_adapters/java-backend/quarkus/VoiceAgentFilter.java.tpl @@ -0,0 +1,64 @@ +// VoiceAgentFilter.java (Quarkus, JAX-RS ContainerRequestFilter) +// Auto-generated by conversation-core/auto_adapters/java-backend +package com.example.voiceagent; + +import jakarta.ws.rs.client.Client; +import jakarta.ws.rs.client.ClientBuilder; +import jakarta.ws.rs.client.Entity; +import jakarta.ws.rs.container.ContainerRequestContext; +import jakarta.ws.rs.container.ContainerRequestFilter; +import jakarta.ws.rs.container.PreMatching; +import jakarta.ws.rs.core.Response; +import jakarta.ws.rs.ext.Provider; +import org.eclipse.microprofile.config.inject.ConfigProperty; + +import java.io.IOException; +import java.io.InputStream; + +@Provider +@PreMatching +public class VoiceAgentFilter implements ContainerRequestFilter { + + @ConfigProperty(name = "skeleton.base-url", defaultValue = "${SKELETON_BASE_URL}") + String skeletonBaseUrl; + + @ConfigProperty(name = "skeleton.api-prefix", defaultValue = "${API_PREFIX}") + String apiPrefix; + + @ConfigProperty(name = "skeleton.route-prefix", defaultValue = "${ROUTE_PREFIX}") + String routePrefix; + + private final Client client = ClientBuilder.newClient(); + + @Override + public void filter(ContainerRequestContext ctx) throws IOException { + String path = ctx.getUriInfo().getPath(); + if (!("/" + path).startsWith(routePrefix)) { + return; + } + String suffix = ("/" + path).substring(routePrefix.length()); + String upstream = skeletonBaseUrl + apiPrefix + suffix; + 
+ try (InputStream in = ctx.getEntityStream()) { + byte[] body = in.readAllBytes(); + Response.ResponseBuilder rb; + try { + Response resp; + if ("GET".equalsIgnoreCase(ctx.getMethod())) { + resp = client.target(upstream).request().get(); + } else { + resp = client.target(upstream).request() + .method(ctx.getMethod(), + Entity.entity(body, "application/json")); + } + rb = Response.status(resp.getStatus()) + .entity(resp.readEntity(byte[].class)); + } catch (Exception e) { + rb = Response.status(502) + .entity("{\"code\":\"bad_gateway\",\"message\":\"" + + e.getMessage().replace("\"", "'") + "\"}"); + } + ctx.abortWith(rb.type("application/json").build()); + } + } +} diff --git a/skills/trtc-ai-service/auto_adapters/java-backend/springboot/VoiceAgentFilter.java.tpl b/skills/trtc-ai-service/auto_adapters/java-backend/springboot/VoiceAgentFilter.java.tpl new file mode 100644 index 0000000..244d474 --- /dev/null +++ b/skills/trtc-ai-service/auto_adapters/java-backend/springboot/VoiceAgentFilter.java.tpl @@ -0,0 +1,91 @@ +// VoiceAgentFilter.java (Spring Boot) +// Auto-generated by conversation-core/auto_adapters/java-backend +// Reverse-proxy ${ROUTE_PREFIX}/* to skeleton ${SKELETON_BASE_URL}${API_PREFIX}/* +package com.example.voiceagent; + +import jakarta.servlet.*; +import jakarta.servlet.http.HttpServletRequest; +import jakarta.servlet.http.HttpServletResponse; +import org.springframework.beans.factory.annotation.Value; +import org.springframework.boot.web.servlet.FilterRegistrationBean; +import org.springframework.context.annotation.Bean; +import org.springframework.stereotype.Component; + +import java.io.IOException; +import java.io.InputStream; +import java.net.URI; +import java.net.http.HttpClient; +import java.net.http.HttpRequest; +import java.net.http.HttpResponse; +import java.nio.charset.StandardCharsets; +import java.time.Duration; + +@Component +public class VoiceAgentFilter implements Filter { + + @Value("${skeleton.base-url:${SKELETON_BASE_URL}}") + 
private String skeletonBaseUrl; + + @Value("${skeleton.api-prefix:${API_PREFIX}}") + private String apiPrefix; + + @Value("${skeleton.route-prefix:${ROUTE_PREFIX}}") + private String routePrefix; + + private final HttpClient client = HttpClient.newBuilder() + .connectTimeout(Duration.ofSeconds(3)) + .build(); + + @Override + public void doFilter(ServletRequest req, ServletResponse res, FilterChain chain) + throws IOException, ServletException { + HttpServletRequest httpReq = (HttpServletRequest) req; + HttpServletResponse httpRes = (HttpServletResponse) res; + String path = httpReq.getRequestURI(); + if (!path.startsWith(routePrefix)) { + chain.doFilter(req, res); + return; + } + String suffix = path.substring(routePrefix.length()); + String upstream = skeletonBaseUrl + apiPrefix + suffix; + try (InputStream in = httpReq.getInputStream()) { + byte[] body = in.readAllBytes(); + HttpRequest.Builder builder = HttpRequest.newBuilder() + .uri(URI.create(upstream)) + .timeout(Duration.ofSeconds(10)) + .header("Content-Type", "application/json"); + switch (httpReq.getMethod()) { + case "GET": + case "HEAD": + builder.GET(); + break; + case "DELETE": + builder.DELETE(); + break; + default: + builder.method(httpReq.getMethod(), + HttpRequest.BodyPublishers.ofByteArray(body)); + } + HttpResponse<byte[]> resp = client.send(builder.build(), + HttpResponse.BodyHandlers.ofByteArray()); + httpRes.setStatus(resp.statusCode()); + resp.headers().firstValue("Content-Type").ifPresent(httpRes::setContentType); + httpRes.getOutputStream().write(resp.body()); + } catch (Exception e) { + httpRes.setStatus(HttpServletResponse.SC_BAD_GATEWAY); + // String.valueOf guards against exceptions whose message is null + httpRes.getWriter().write( + "{\"code\":\"bad_gateway\",\"message\":\"" + + String.valueOf(e.getMessage()).replace("\"", "'") + "\"}" + ); + } + } + + @Bean + public FilterRegistrationBean<VoiceAgentFilter> voiceAgentFilterRegistration() { + FilterRegistrationBean<VoiceAgentFilter> reg = new FilterRegistrationBean<>(this); + reg.addUrlPatterns(routePrefix + "/*"); + reg.setName("voiceAgentFilter"); +
reg.setOrder(10); + return reg; + } +} diff --git a/skills/trtc-ai-service/auto_adapters/manifest.yaml b/skills/trtc-ai-service/auto_adapters/manifest.yaml new file mode 100644 index 0000000..3bf925c --- /dev/null +++ b/skills/trtc-ai-service/auto_adapters/manifest.yaml @@ -0,0 +1,43 @@ +# auto_adapters index manifest +# For add-capability CLI / Agent to parse adapter-to-tech-stack mappings. + +version: "1.0.0" +description: "Tech stack decoupling adapter component library, Phase 2 initial coverage of 4 adapter types / 12 tech stacks" + +# Rendering variable format ${VAR}, replaced by CLI before writing +default_variables: + SKELETON_BASE_URL: "http://localhost:3000" + API_PREFIX: "/api/v1" + COMPONENT_NAME: "VoiceAgent" + ROUTE_PREFIX: "/voice-agent" + +adapters: + - name: "frontend-spa" + path: "frontend-spa" + tech_stack: ["react", "vue", "angular", "next"] + description: "Frontend SPA adapter, generates join-room component and connects to skeleton /api/v1/*" + target_role: "frontend" + + - name: "node-backend" + path: "node-backend" + tech_stack: ["express", "koa", "fastify"] + description: "Node.js backend adapter, generates reverse proxy middleware" + target_role: "backend" + + - name: "java-backend" + path: "java-backend" + tech_stack: ["spring-boot", "quarkus"] + description: "Java backend adapter, generates Filter / Quarkus Filter" + target_role: "backend" + + - name: "python-backend" + path: "python-backend" + tech_stack: ["flask", "fastapi", "django"] + description: "Python backend adapter, generates decorators / sub-router mounting" + target_role: "backend" + +# General templates referenced by three-level degrader +fallback_templates: + guided_frontend: "integration_templates/generic-frontend.md" + guided_backend: "integration_templates/generic-backend.md" + manual_rest_api: "integration_templates/generic-rest-api.md" diff --git a/skills/trtc-ai-service/auto_adapters/node-backend/README.md b/skills/trtc-ai-service/auto_adapters/node-backend/README.md 
new file mode 100644 index 0000000..bbdd325 --- /dev/null +++ b/skills/trtc-ai-service/auto_adapters/node-backend/README.md @@ -0,0 +1,25 @@ +# node-backend Adapter + +Connect the conversation-core skeleton as a reverse proxy into a Node.js backend, +keeping the skeleton address hidden from the frontend and allowing the backend to inject auth / rate limiting / business policies before and after forwarding. + +| Framework | Template | Default Install Location | +|:---|:---|:---| +| Express | `express.js.tpl` | `routes/voice-agent.js` | +| Koa | `koa.js.tpl` | `routes/voice-agent.js` | +| Fastify | `fastify.js.tpl` | `routes/voice-agent.js` | + +## Configuration + +| Env Variable | Default | Description | +|:---|:---|:---| +| `SKELETON_BASE_URL` | `http://localhost:3000` | Skeleton process address | +| `API_PREFIX` | `/api/v1` | Skeleton REST prefix | +| `ROUTE_PREFIX` | `/voice-agent` | Self-mounting path | + +## Security + +- **SSRF Protection**: The template detects whether `SKELETON_BASE_URL` falls within private network ranges (`10/192.168/172.16-31/9/11/21/30/127`), + and will output a warning in production; internal network access requires explicit user confirmation. +- **HTTPS**: Enforced in production deployment. +- **Request Body Limit**: Default `64KB`, preventing malicious large payloads from overwhelming the skeleton ASR/LLM pipeline. diff --git a/skills/trtc-ai-service/auto_adapters/node-backend/express.js.tpl b/skills/trtc-ai-service/auto_adapters/node-backend/express.js.tpl new file mode 100644 index 0000000..61a0f5a --- /dev/null +++ b/skills/trtc-ai-service/auto_adapters/node-backend/express.js.tpl @@ -0,0 +1,40 @@ +// routes/voice-agent.js (Express) +// Auto-generated by conversation-core/auto_adapters/node-backend. +// Reverse-proxy user requests to the skeleton process, avoiding direct frontend exposure of the skeleton address. 
+const express = require('express'); +const fetch = (...args) => import('node-fetch').then(({ default: f }) => f(...args)); + +const SKELETON_BASE_URL = process.env.SKELETON_BASE_URL || '${SKELETON_BASE_URL}'; +const API_PREFIX = process.env.API_PREFIX || '${API_PREFIX}'; + +const router = express.Router(); +router.use(express.json({ limit: '64kb' })); + +// Security: prevent reverse proxy from pointing to private networks (aligned with global SSRF protection) +function isPrivate(host) { + if (!host) return false; + const blocks = [/^10\./, /^192\.168\./, /^172\.(1[6-9]|2\d|3[0-1])\./, /^9\./, /^11\./, /^21\./, /^30\./, /^127\./]; + return blocks.some((re) => re.test(host)); +} + +const target = new URL(SKELETON_BASE_URL); +if (isPrivate(target.hostname) && process.env.NODE_ENV !== 'development') { + console.warn('[voice-agent] WARNING: SKELETON_BASE_URL points to a private network'); +} + +router.all('*', async (req, res) => { + const url = SKELETON_BASE_URL + API_PREFIX + req.path; + try { + const resp = await fetch(url, { + method: req.method, + headers: { 'Content-Type': 'application/json' }, + body: ['GET', 'HEAD'].includes(req.method) ? undefined : JSON.stringify(req.body || {}), + }); + const text = await resp.text(); + res.status(resp.status).type(resp.headers.get('content-type') || 'application/json').send(text); + } catch (err) { + res.status(502).json({ code: 'bad_gateway', message: err.message }); + } +}); + +module.exports = router; diff --git a/skills/trtc-ai-service/auto_adapters/node-backend/fastify.js.tpl b/skills/trtc-ai-service/auto_adapters/node-backend/fastify.js.tpl new file mode 100644 index 0000000..6d7ad68 --- /dev/null +++ b/skills/trtc-ai-service/auto_adapters/node-backend/fastify.js.tpl @@ -0,0 +1,27 @@ +// routes/voice-agent.js (Fastify plugin) +// Auto-generated by conversation-core/auto_adapters/node-backend. 
+const fetch = (...args) => import('node-fetch').then(({ default: f }) => f(...args)); + +const SKELETON_BASE_URL = process.env.SKELETON_BASE_URL || '${SKELETON_BASE_URL}'; +const API_PREFIX = process.env.API_PREFIX || '${API_PREFIX}'; + +module.exports = async function voiceAgent(fastify, _opts) { + fastify.all('/*', async (request, reply) => { + const url = SKELETON_BASE_URL + API_PREFIX + request.url.replace(/^\/voice-agent/, ''); + try { + const body = ['GET', 'HEAD'].includes(request.method) + ? undefined + : JSON.stringify(request.body || {}); + const resp = await fetch(url, { + method: request.method, + headers: { 'Content-Type': 'application/json' }, + body, + }); + reply.code(resp.status).type(resp.headers.get('content-type') || 'application/json'); + return await resp.text(); + } catch (err) { + reply.code(502); + return { code: 'bad_gateway', message: err.message }; + } + }); +}; diff --git a/skills/trtc-ai-service/auto_adapters/node-backend/koa.js.tpl b/skills/trtc-ai-service/auto_adapters/node-backend/koa.js.tpl new file mode 100644 index 0000000..840a92a --- /dev/null +++ b/skills/trtc-ai-service/auto_adapters/node-backend/koa.js.tpl @@ -0,0 +1,31 @@ +// routes/voice-agent.js (Koa) +// Auto-generated by conversation-core/auto_adapters/node-backend. +const Router = require('@koa/router'); +const fetch = (...args) => import('node-fetch').then(({ default: f }) => f(...args)); + +const SKELETON_BASE_URL = process.env.SKELETON_BASE_URL || '${SKELETON_BASE_URL}'; +const API_PREFIX = process.env.API_PREFIX || '${API_PREFIX}'; +const ROUTE_PREFIX = process.env.ROUTE_PREFIX || '${ROUTE_PREFIX}'; + +const router = new Router({ prefix: ROUTE_PREFIX }); + +router.all('(.*)', async (ctx) => { + const subPath = ctx.path.startsWith(ROUTE_PREFIX) ? ctx.path.slice(ROUTE_PREFIX.length) : ctx.path; + const url = SKELETON_BASE_URL + API_PREFIX + subPath; + try { + const body = ['GET', 'HEAD'].includes(ctx.method) ? 
undefined : JSON.stringify(ctx.request.body || {}); + const resp = await fetch(url, { + method: ctx.method, + headers: { 'Content-Type': 'application/json' }, + body, + }); + ctx.status = resp.status; + ctx.set('Content-Type', resp.headers.get('content-type') || 'application/json'); + ctx.body = await resp.text(); + } catch (err) { + ctx.status = 502; + ctx.body = { code: 'bad_gateway', message: err.message }; + } +}); + +module.exports = router; diff --git a/skills/trtc-ai-service/auto_adapters/node-backend/manifest.yaml b/skills/trtc-ai-service/auto_adapters/node-backend/manifest.yaml new file mode 100644 index 0000000..27055e2 --- /dev/null +++ b/skills/trtc-ai-service/auto_adapters/node-backend/manifest.yaml @@ -0,0 +1,47 @@ +name: "node-backend" +version: "1.0.0" +target_role: "backend" +tech_stack: ["express", "koa", "fastify"] +description: "Node.js backend reverse proxy: transparently proxy ${ROUTE_PREFIX}/* to skeleton ${SKELETON_BASE_URL}${API_PREFIX}/*" + +templates: + express: + file: "express.js.tpl" + target_path: "routes/voice-agent.js" + install_hint: | + In Express app: + const voiceAgent = require('./routes/voice-agent'); + app.use('${ROUTE_PREFIX}', voiceAgent); + package_dependencies: + - "express@^4.17.0" + - "node-fetch@^3.3.0" + koa: + file: "koa.js.tpl" + target_path: "routes/voice-agent.js" + install_hint: | + In Koa app: + const Router = require('@koa/router'); + const proxy = require('./routes/voice-agent'); + app.use(proxy.routes()).use(proxy.allowedMethods()); + package_dependencies: + - "@koa/router@^12.0.0" + - "node-fetch@^3.3.0" + fastify: + file: "fastify.js.tpl" + target_path: "routes/voice-agent.js" + install_hint: | + In Fastify app: + const voiceAgent = require('./routes/voice-agent'); + fastify.register(voiceAgent, { prefix: '${ROUTE_PREFIX}' }); + package_dependencies: + - "fastify@^4.0.0" + - "node-fetch@^3.3.0" + +defaults: + SKELETON_BASE_URL: "http://localhost:3000" + API_PREFIX: "/api/v1" + ROUTE_PREFIX: "/voice-agent" + 
+security: + https_required: true + forbid_internal_ip: true # Reverse proxy target defaults to rejecting private network ranges (10/192.168/172.16-31/9/11/21/30) diff --git a/skills/trtc-ai-service/auto_adapters/python-backend/README.md b/skills/trtc-ai-service/auto_adapters/python-backend/README.md new file mode 100644 index 0000000..cece994 --- /dev/null +++ b/skills/trtc-ai-service/auto_adapters/python-backend/README.md @@ -0,0 +1,22 @@ +# python-backend Adapter + +Connect the conversation-core skeleton as a reverse proxy into a Python backend. + +| Framework | Template | Default Target | +|:---|:---|:---| +| Flask | `flask.py.tpl` | `voice_agent_proxy.py` (Blueprint) | +| FastAPI | `fastapi.py.tpl` | `voice_agent_proxy.py` (APIRouter) | +| Django | `django.py.tpl` | `voice_agent_proxy/views.py` (function view) | + +## Configuration + +| Env Variable | Default | Description | +|:---|:---|:---| +| `SKELETON_BASE_URL` | `http://localhost:3000` | Skeleton address | +| `API_PREFIX` | `/api/v1` | Skeleton prefix | +| `ROUTE_PREFIX` | `/voice-agent` | Self-mounting path | + +## Notes + +- The Django template uses `@csrf_exempt`, suitable only for reverse proxy scenarios; for CSRF support, integrate DRF separately. +- The FastAPI template is based on `httpx.AsyncClient`, aligning with the skeleton's async pipeline. 
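The three variables in the configuration table above combine into one path mapping: a request under `ROUTE_PREFIX` is forwarded to `SKELETON_BASE_URL` + `API_PREFIX` + the remaining sub-path. The following is a minimal sketch of that mapping, not code taken from the templates; the helper name `build_upstream_url` is hypothetical, and the defaults are the ones listed in the table.

```python
import os

# Defaults mirror the configuration table; real values come from the environment.
SKELETON_BASE_URL = os.getenv("SKELETON_BASE_URL", "http://localhost:3000")
API_PREFIX = os.getenv("API_PREFIX", "/api/v1")
ROUTE_PREFIX = os.getenv("ROUTE_PREFIX", "/voice-agent")


def build_upstream_url(request_path: str,
                       base_url: str = SKELETON_BASE_URL,
                       api_prefix: str = API_PREFIX,
                       route_prefix: str = ROUTE_PREFIX) -> str:
    """Hypothetical helper: map a path under ROUTE_PREFIX to the skeleton URL."""
    # Drop the mount prefix when present; otherwise forward the path unchanged.
    if request_path.startswith(route_prefix):
        sub_path = request_path[len(route_prefix):]
    else:
        sub_path = request_path
    # Keep exactly one slash between the API prefix and the sub-path.
    if not sub_path.startswith("/"):
        sub_path = "/" + sub_path
    return base_url.rstrip("/") + api_prefix + sub_path
```

With the documented defaults passed explicitly, `build_upstream_url("/voice-agent/health", "http://localhost:3000", "/api/v1", "/voice-agent")` returns `http://localhost:3000/api/v1/health`, which is the health endpoint the skeleton exposes.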
diff --git a/skills/trtc-ai-service/auto_adapters/python-backend/django.py.tpl b/skills/trtc-ai-service/auto_adapters/python-backend/django.py.tpl new file mode 100644 index 0000000..e4eb02b --- /dev/null +++ b/skills/trtc-ai-service/auto_adapters/python-backend/django.py.tpl @@ -0,0 +1,32 @@ +# voice_agent_proxy/views.py (Django function view) +# Auto-generated by conversation-core/auto_adapters/python-backend +import json +import os + +import requests +from django.http import HttpResponse, JsonResponse +from django.views.decorators.csrf import csrf_exempt + +SKELETON_BASE_URL = os.getenv("SKELETON_BASE_URL", "${SKELETON_BASE_URL}") +API_PREFIX = os.getenv("API_PREFIX", "${API_PREFIX}") + + +@csrf_exempt +def proxy_view(request, rest: str = ""): + upstream = f"{SKELETON_BASE_URL}{API_PREFIX}/{rest}" + try: + kwargs = {"timeout": 10} + if request.method != "GET": + try: + kwargs["json"] = json.loads(request.body or b"{}") + except json.JSONDecodeError: + kwargs["json"] = {} + resp = requests.request( + request.method, upstream, + headers={"Content-Type": "application/json"}, + **kwargs, + ) + ctype = resp.headers.get("Content-Type", "application/json") + return HttpResponse(resp.content, status=resp.status_code, content_type=ctype) + except requests.RequestException as exc: + return JsonResponse({"code": "bad_gateway", "message": str(exc)}, status=502) diff --git a/skills/trtc-ai-service/auto_adapters/python-backend/fastapi.py.tpl b/skills/trtc-ai-service/auto_adapters/python-backend/fastapi.py.tpl new file mode 100644 index 0000000..a928ce8 --- /dev/null +++ b/skills/trtc-ai-service/auto_adapters/python-backend/fastapi.py.tpl @@ -0,0 +1,35 @@ +# voice_agent_proxy.py (FastAPI sub-router) +# Auto-generated by conversation-core/auto_adapters/python-backend +import os +from typing import Any, Dict, Optional + +import httpx +from fastapi import APIRouter, HTTPException, Request + +SKELETON_BASE_URL = os.getenv("SKELETON_BASE_URL", "${SKELETON_BASE_URL}") +API_PREFIX = 
os.getenv("API_PREFIX", "${API_PREFIX}") + +router = APIRouter() + + +@router.api_route("/{subpath:path}", methods=["GET", "POST", "PUT", "DELETE"]) +async def proxy(subpath: str, request: Request) -> Any: + upstream = f"{SKELETON_BASE_URL}{API_PREFIX}/{subpath}" + body: Optional[Dict[str, Any]] = None + if request.method != "GET": + try: + body = await request.json() + except Exception: + body = {} + async with httpx.AsyncClient(timeout=10) as client: + try: + resp = await client.request( + request.method, upstream, + headers={"Content-Type": "application/json"}, + json=body if request.method != "GET" else None, + ) + except httpx.HTTPError as exc: + raise HTTPException(status_code=502, detail={"code": "bad_gateway", "message": str(exc)}) + if resp.headers.get("content-type", "").startswith("application/json"): + return resp.json() + return resp.text diff --git a/skills/trtc-ai-service/auto_adapters/python-backend/flask.py.tpl b/skills/trtc-ai-service/auto_adapters/python-backend/flask.py.tpl new file mode 100644 index 0000000..755f33d --- /dev/null +++ b/skills/trtc-ai-service/auto_adapters/python-backend/flask.py.tpl @@ -0,0 +1,31 @@ +# voice_agent_proxy.py (Flask Blueprint) +# Auto-generated by conversation-core/auto_adapters/python-backend +import os +from flask import Blueprint, request, Response +import requests + +SKELETON_BASE_URL = os.getenv("SKELETON_BASE_URL", "${SKELETON_BASE_URL}") +API_PREFIX = os.getenv("API_PREFIX", "${API_PREFIX}") + +bp = Blueprint("voice_agent_proxy", __name__) + + +@bp.route("/<path:subpath>", methods=["GET", "POST", "PUT", "DELETE"]) +def proxy(subpath: str): + upstream = f"{SKELETON_BASE_URL}{API_PREFIX}/{subpath}" + try: + kwargs = {"timeout": 10} + if request.method != "GET": + kwargs["json"] = request.get_json(silent=True) or {} + resp = requests.request( + request.method, upstream, + headers={"Content-Type": "application/json"}, + **kwargs, + ) + return Response( + resp.content, + status=resp.status_code, +
content_type=resp.headers.get("Content-Type", "application/json"), + ) + except requests.RequestException as exc: + return {"code": "bad_gateway", "message": str(exc)}, 502 diff --git a/skills/trtc-ai-service/auto_adapters/python-backend/manifest.yaml b/skills/trtc-ai-service/auto_adapters/python-backend/manifest.yaml new file mode 100644 index 0000000..71f9d4f --- /dev/null +++ b/skills/trtc-ai-service/auto_adapters/python-backend/manifest.yaml @@ -0,0 +1,45 @@ +name: "python-backend" +version: "1.0.0" +target_role: "backend" +tech_stack: ["flask", "fastapi", "django"] +description: "Python backend reverse proxy: decorator / sub-router form to connect to skeleton" + +templates: + flask: + file: "flask.py.tpl" + target_path: "voice_agent_proxy.py" + install_hint: | + In main app: + from voice_agent_proxy import bp + app.register_blueprint(bp, url_prefix='${ROUTE_PREFIX}') + package_dependencies: + - "Flask>=3.0" + - "requests>=2.31" + fastapi: + file: "fastapi.py.tpl" + target_path: "voice_agent_proxy.py" + install_hint: | + In main app: + from voice_agent_proxy import router + app.include_router(router, prefix='${ROUTE_PREFIX}') + package_dependencies: + - "fastapi>=0.110" + - "httpx>=0.27" + django: + file: "django.py.tpl" + target_path: "voice_agent_proxy/views.py" + install_hint: | + In urls.py: + from voice_agent_proxy.views import proxy_view + urlpatterns += [path('${ROUTE_PREFIX}/', proxy_view)] + package_dependencies: + - "Django>=4.2" + - "requests>=2.31" + +defaults: + SKELETON_BASE_URL: "http://localhost:3000" + API_PREFIX: "/api/v1" + ROUTE_PREFIX: "/voice-agent" + +security: + https_required: true diff --git a/skills/trtc-ai-service/capabilities/__init__.py b/skills/trtc-ai-service/capabilities/__init__.py new file mode 100644 index 0000000..d9aa6ce --- /dev/null +++ b/skills/trtc-ai-service/capabilities/__init__.py @@ -0,0 +1,43 @@ +"""capabilities namespace root. 
+ +Subdirectories use hyphenated names (manifest style), but Python modules require underscore names. +This file creates aliases on import as needed (only when the corresponding directory exists). + +Example: + capabilities.knowledge-base/ → import capabilities.knowledge_base +""" +from __future__ import annotations + +import importlib +import sys +from pathlib import Path + +_ROOT = Path(__file__).resolve().parent + +# Hyphenated directory → underscore module alias +_ALIASES = { + "knowledge-base": "knowledge_base", + "tool-calling": "tool_calling", + "human-handoff": "human_handoff", + "session-summary": "session_summary", + "digital-human": "digital_human", +} + + +def _install_alias(dirname: str, modname: str) -> None: + full_dir = _ROOT / dirname + if not full_dir.exists(): + return + full_name = f"{__name__}.{modname}" + if full_name in sys.modules: + return + # Register a namespace package that sub-modules can continue importing + import types + + pkg = types.ModuleType(full_name) + pkg.__path__ = [str(full_dir)] # type: ignore[attr-defined] + sys.modules[full_name] = pkg + + +for _d, _m in _ALIASES.items(): + _install_alias(_d, _m)
diff --git a/skills/trtc-ai-service/capabilities/conversation-core/.env.example b/skills/trtc-ai-service/capabilities/conversation-core/.env.example new file mode 100644 index 0000000..a9c047d --- /dev/null +++ b/skills/trtc-ai-service/capabilities/conversation-core/.env.example @@ -0,0 +1,29 @@ +# conversation-core · environment variable template for the three keys +# ---------------------------------------------------------- +# Usage: +# 1. Preferred: generate .env through the interactive guide, `python scripts/setup-credentials.py` +# 2. To configure manually, copy this file to .env and fill in the actual values +# Note: all credentials come from environment variables only (env-only); never write them in plaintext in code + +# [1/3] Tencent Cloud API key +TENCENT_CLOUD_SECRET_ID= +TENCENT_CLOUD_SECRET_KEY= +TENCENT_CLOUD_REGION=ap-guangzhou + +# [2/3] TRTC Conversational AI application credentials +# Default: an international-site application created at https://console.trtc.io (trtc.intl.tencentcloudapi.com) +TRTC_SDK_APP_ID= +TRTC_SDK_SECRET_KEY= +# (Advanced) If your application was created on the China site, console.cloud.tencent.com/trtc, uncomment the next line: +# TRTC_REGION=cn + +# [3/3] External LLM access (OpenAI-compatible protocol) +LLM_API_KEY= +LLM_API_URL=https://api.openai.com/v1/chat/completions +LLM_MODEL=gpt-4o-mini +LLM_TYPE=openai + +# Service listener +HOST=0.0.0.0 +PORT=3000 +LOG_LEVEL=INFO diff --git a/skills/trtc-ai-service/capabilities/conversation-core/INTEGRATION.md b/skills/trtc-ai-service/capabilities/conversation-core/INTEGRATION.md new file mode 100644 index 0000000..e3d8c63 --- /dev/null +++ b/skills/trtc-ai-service/capabilities/conversation-core/INTEGRATION.md @@ -0,0 +1,134 @@ +# conversation-core · Integration Guide (Agent-readable) + +> This document is for **AI coding assistants / integration agents** to automatically +> merge the conversation-core skeleton into user projects. All instructions are designed for programmatic parsing and execution.
+ +--- + +## Section 1 · Tech Stack Detection + +At the integration entry point, the Agent detects user project characteristics in the following order, outputting a `tech_stack` label: + +| Signal File | Key Field | Inferred Tech Stack | +|:---|:---|:---| +| `package.json` | `dependencies.react` | `react` | +| `package.json` | `dependencies.vue` | `vue` | +| `package.json` | `dependencies['@angular/core']` | `angular` | +| `package.json` | `dependencies.express` / `koa` / `fastify` | `express` / `koa` / `fastify` | +| `package.json` | `dependencies.next` | `next` | +| `pom.xml` | `spring-boot-starter` | `spring-boot` | +| `build.gradle` | `org.springframework.boot` | `spring-boot` | +| `pom.xml` | `quarkus-core` | `quarkus` | +| `requirements.txt` / `pyproject.toml` | `flask` / `fastapi` / `django` | `flask` / `fastapi` / `django` | + +If multiple tech stack candidates are detected, the most specific one takes priority: +`next > react/vue/angular > express/koa/fastify > spring-boot/quarkus > flask/fastapi/django`. + +--- + +## Section 2 · Adapter Rule Matching + +Read the `integration.auto_adapters` list from this capability's `manifest.yaml`; the first entry whose `tech_stack` matches becomes the target adapter: + +```text +match(tech_stack_detected, manifest.integration.auto_adapters[*].tech_stack) + → adapter_name (e.g. "frontend-spa") +``` + +The mapping from adapter name to actual generator is provided by Phase 2; this skeleton only declares the interface contract. + +--- + +## Section 3 · Code Generation and Merging + +Phase 1 skeleton only exposes REST APIs (default port `3000`). 
How integrators call the skeleton from their own projects: + +### 3.1 Frontend (any SPA) + +```js +// 1) Health check (top status bar) +const health = await fetch('http://localhost:3000/api/v1/health').then(r => r.json()); + +// 2) Request room credentials +const cfg = await fetch('http://localhost:3000/api/v1/get_config', { + method: 'POST', headers: { 'Content-Type': 'application/json' }, body: '{}' +}).then(r => r.json()); +const { session_id, sdk_app_id, room_id, user_id, user_sig } = cfg.data; + +// 3) Join room via TRTC Web SDK using sdk_app_id / user_sig (encapsulated by frontend capability package) + +// 4) Start AI channel bot +await fetch('http://localhost:3000/api/v1/agent/start', { + method: 'POST', headers: { 'Content-Type': 'application/json' }, + body: JSON.stringify({ session_id, language: 'zh' }) +}); +``` + +### 3.2 Backend (any runtime) + +| Runtime | Injection Point | Generator Output | +|:---|:---|:---| +| Express / Koa / Fastify | Router layer | Middleware code (reverse proxy `/api/v1/*` to skeleton process) | +| Spring Boot / Quarkus | Filter Chain | Filter code + `@Value("${trtc.voice-agent.endpoint}")` injection | +| Flask / FastAPI / Django | Route handler | Decorators + sub-router mounting | + +### 3.3 Injection Points (declarative) + +`manifest.yaml.injection_points` declares 5 standard injection points. 
Phase 2 capability packages reference them by `id`, e.g.: + +```yaml +# knowledge-base capability manifest.yaml snippet +extensions: + - inject_at: "agent.before_start" + code_template: "templates/inject_kb_to_instructions.py.tpl" +``` + +--- + +## Section 4 · Three-Level Degradation Path + +| Level | Trigger Condition | Agent Behavior | +|:---:|:---|:---| +| **L1 Full Auto-Merge** | Tech stack detected successfully and code generation has no conflicts | Write directly into user project and auto-run `npm install` / `pip install` | +| **L2 Semi-Auto Guide** | Tech stack detected successfully but code generation fails (syntax / path conflicts) | Output `INTEGRATION_GUIDE.md` with template code + manual injection steps | +| **L3 Manual API Fallback** | Tech stack cannot be identified | Output REST API docs (base path `/api/v1`) + SDK package install commands | + +L2 / L3 output templates are located in `integration-templates/` (provided by Phase 2). + +--- + +## Section 5 · Verification Checks + +After integration, the Agent must execute these checks in order: + +1. **Process alive** — `curl -s http://localhost:3000/api/v1/health | jq .status`, expected `"ok"`. +2. **Three LEDs** — `health.checks.{tencent_cloud,trtc,llm}.status == "ok"`. +3. **Session handshake** — `POST /api/v1/get_config` → returns non-empty `session_id` and `user_sig`. +4. **Text injection** — After starting AI, call `POST /api/v1/agent/control { text: "ping" }` expecting `delivered: true`. +5. **Graceful stop** — `POST /api/v1/agent/stop` returns `status: "stopped"`. 
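The five checks above can be sketched as a small Python runner. This is a minimal sketch, not the skeleton's own tooling: `run_checks`, `http_transport`, `_payload` and `_diag` are hypothetical names, the `data` envelope is assumed from the `get_config` example in Section 3.1, and passing `session_id` to `/agent/control` and `/agent/stop` is an assumption about the request bodies. The remediation strings other than the `get_config` one are illustrative.

```python
import json
import urllib.request
from typing import Any, Callable, Dict, Optional

# A transport takes (method, path, body) and returns the decoded JSON response.
Transport = Callable[[str, str, Optional[dict]], Dict[str, Any]]


def http_transport(base_url: str = "http://localhost:3000") -> Transport:
    """Build a transport that calls the skeleton over HTTP using only the stdlib."""
    def call(method: str, path: str, body: Optional[dict] = None) -> Dict[str, Any]:
        data = json.dumps(body or {}).encode() if method == "POST" else None
        req = urllib.request.Request(
            base_url + path, data=data, method=method,
            headers={"Content-Type": "application/json"},
        )
        with urllib.request.urlopen(req, timeout=10) as resp:
            return json.load(resp)
    return call


def _payload(resp: Dict[str, Any]) -> Dict[str, Any]:
    # Assumption: results may be wrapped in "data" (as get_config is); fall back to top level.
    inner = resp.get("data")
    return inner if isinstance(inner, dict) else resp


def _diag(step: str, error: str, remediation: str) -> Dict[str, str]:
    return {"step": step, "error": error, "remediation": remediation}


def run_checks(call: Transport) -> Optional[Dict[str, str]]:
    """Run the checks in order; return a diagnostic dict on the first failure, else None."""
    step = "health"
    try:
        health = call("GET", "/api/v1/health", None)
        if health.get("status") != "ok":
            return _diag(step, f"status={health.get('status')!r}",
                         "Check that the skeleton process is running")
        step = "three_leds"
        checks = health.get("checks", {})
        bad = [k for k in ("tencent_cloud", "trtc", "llm")
               if checks.get(k, {}).get("status") != "ok"]
        if bad:
            return _diag(step, "not ok: " + ", ".join(bad),
                         "Re-run python scripts/setup-credentials.py")
        step = "get_config"
        cfg = _payload(call("POST", "/api/v1/get_config", {}))
        if not cfg.get("session_id") or not cfg.get("user_sig"):
            return _diag(step, "empty session_id or user_sig",
                         "Check that TRTC_SDK_APP_ID in .env is an integer")
        sid = cfg["session_id"]
        step = "agent_control"
        call("POST", "/api/v1/agent/start", {"session_id": sid, "language": "zh"})
        ctl = _payload(call("POST", "/api/v1/agent/control",
                            {"session_id": sid, "text": "ping"}))
        if ctl.get("delivered") is not True:
            return _diag(step, "text was not delivered",
                         "Check that the AI task started successfully")
        step = "agent_stop"
        stop = _payload(call("POST", "/api/v1/agent/stop", {"session_id": sid}))
        if stop.get("status") != "stopped":
            return _diag(step, f"status={stop.get('status')!r}",
                         "Inspect the skeleton logs for the stop request")
    except Exception as exc:  # network errors, HTTP errors, malformed JSON
        return _diag(step, str(exc), "Check that the skeleton is reachable on port 3000")
    return None
```

Against a running skeleton, `run_checks(http_transport())` would return `None` when all five checks pass. Injecting the transport keeps the check order testable without a live service.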
+ +On any step failure, the Agent must output a diagnostic JSON: + +```json +{ "step": "get_config", "error": "...", "remediation": "Check that TRTC_SDK_APP_ID in .env is an integer" } +``` + +--- + +## Appendix A · Error Code Dictionary + +| Error Code | Meaning | Remediation | +|:---|:---|:---| +| E001 | Tencent Cloud SecretId/SecretKey invalid | Re-run `python scripts/setup-credentials.py` | +| E002 | TRTC SDKAppID/SDKSecretKey invalid or UserSig generation failed | Verify SDKAppID is an integer; SecretKey is complete | +| E003 | LLM API Key invalid | Check that `LLM_API_URL` is an OpenAI-compatible endpoint | +| E004 | Network unreachable | Check egress IP whitelist / proxy | +| E005 | Service not activated | Enable Conversational AI in TRTC Console | + +## Appendix B · Security Compliance + +- Credentials only from environment variables (`security.credential_storage.source = env-only`) +- Credential cache and `.env` file enforced to permission `0600` +- End-to-end HTTPS (`security.network.enforce_https = true`) +- Log redaction filter installed at process startup (see `src/log_filter.py`) +- XSS / Prompt Injection protection switches declared in `security.injection_protection` diff --git a/skills/trtc-ai-service/capabilities/conversation-core/INTERFACE_ADAPT.md b/skills/trtc-ai-service/capabilities/conversation-core/INTERFACE_ADAPT.md new file mode 100644 index 0000000..a29b6d0 --- /dev/null +++ b/skills/trtc-ai-service/capabilities/conversation-core/INTERFACE_ADAPT.md @@ -0,0 +1,111 @@ +# conversation-core Interface Adaptation SOP + +> Skeleton-layer interface adaptation guide. In this release, conversation-core has **not been refactored to ports/adapters/core** (Phase 1 compromise, deferred to Phase 4), +> so this document only explains "which interfaces can be replaced and how", without providing automated generation entry points. + +--- + +## 1. Default Contract Overview + +| Contract | Method | Path | Adaptable? 
| +|---|---|---|---| +| `llm.chat_completions` | POST | `/v1/chat/completions` (OpenAI-compatible) | **Adaptable** | +| `trtc.start_ai_conversation` | POST | Tencent Cloud TencentCloudAPI | **Not adaptable** (tightly bound to Tencent Cloud) | + +Full field definitions are in `manifest.yaml.business_contract.external_apis`. + +--- + +## 2. LLM Interface Replacement (most common) + +The skeleton calls the LLM using the OpenAI Chat Completions protocol by default: +- Default `LLM_API_URL = https://api.openai.com/v1/chat/completions` +- Supports any OpenAI-compatible proxy (DeepSeek / Qwen / Tencent Hunyuan OpenAPI / vLLM etc.) + +### 2.1 OpenAI-Compatible Protocol (recommended path) + +Only the environment variables need to change; **no code changes are required**: + +```bash +# Switch to DeepSeek +export LLM_API_URL=https://api.deepseek.com/v1/chat/completions +export LLM_API_KEY=sk-xxx +export LLM_MODEL=deepseek-chat + +# Switch to Qwen (DashScope OpenAI-compatible endpoint) +export LLM_API_URL=https://dashscope.aliyuncs.com/compatible-mode/v1/chat/completions +export LLM_API_KEY=sk-xxx +export LLM_MODEL=qwen-turbo + +# Switch to self-hosted vLLM (must be https:// unless it runs on localhost) +export LLM_API_URL=https://your-vllm:8000/v1/chat/completions +export LLM_API_KEY=any-string +export LLM_MODEL=Qwen2.5-7B-Instruct +``` + +> Security: A self-hosted LLM endpoint must use https://; http is allowed only for localhost. See `security_rules`. + +### 2.2 Non-OpenAI Protocols (e.g. Claude Anthropic Messages API) + +This requires an "LLM protocol adapter" at the skeleton layer, a mechanism that is not delivered in this release. Temporary workaround: + +1. Deploy an OpenAI ↔ Anthropic protocol translation gateway (e.g. LiteLLM) in the user project +2. Point the skeleton's `LLM_API_URL` to the gateway +3. 
The gateway handles protocol translation + +```bash +# Launch LiteLLM gateway (see https://docs.litellm.ai/) +litellm --model anthropic/claude-3-5-sonnet --port 4000 + +# Skeleton configuration +export LLM_API_URL=http://localhost:4000/v1/chat/completions +export LLM_API_KEY=sk-anthropic-xxx +export LLM_MODEL=anthropic/claude-3-5-sonnet +``` + +### 2.3 Phase 4 Plan: LLM Adapter Abstraction + +The skeleton will introduce a `LlmClient` abstraction (same pattern as human-handoff / knowledge-base): + +``` +capabilities/conversation-core/src/ +├── ports/ +│ └── llm_client.py # ABC: chat / stream_chat / count_tokens +└── adapters/ + ├── openai_compat.py # Current default implementation + ├── claude_anthropic.py # Native Anthropic Messages API + ├── tencent_hunyuan.py # Tencent Hunyuan native OpenAPI + └── user_custom.py # User integration wizard generator +``` + +This document will be supplemented with automated adaptation workflows at that time. + +--- + +## 3. TRTC Conversational AI Control Plane (Not Adaptable) + +`trtc.start_ai_conversation` / `StopAIConversation` / `ControlAIConversation` / +`ServerPushText` and other control plane interfaces are **tightly bound to the Tencent Cloud protocol**. If the user's business does not use TRTC, +they should not continue using this capability package; suggest switching to a text-only conversation approach (using conversation-core's +`text_input` / `text_output` channels, bypassing the TRTC control plane). + +--- + +## 4. ASR / TTS Service Replacement + +The skeleton uses TRTC's built-in ASR/TTS by default (declared via `STTConfig` / `TTSConfig` in StartAIConversation +requests). To switch to your own ASR/TTS, replace the provider name in the manifest's +`config.io_modality.voice_input.provider` / `voice_output.provider` fields and implement +a custom provider extension in the user project per the TRTC ConversationAI documentation. + +Custom provider scaffolding is not provided in this release. + +--- + +## 5. 
Security Checklist + +- [ ] All 3 keys come from environment variables only (**no hardcoding**) +- [ ] `LLM_API_URL` must use https:// or http://localhost +- [ ] Reject private network addresses for self-hosted LLM (except localhost) +- [ ] `LLM_API_KEY` / `Authorization` headers auto-redacted in logs (handled by skeleton `log_redaction`) +- [ ] Credential cache file permissions enforced to 600 diff --git a/skills/trtc-ai-service/capabilities/conversation-core/QUICK_START.md b/skills/trtc-ai-service/capabilities/conversation-core/QUICK_START.md new file mode 100644 index 0000000..f28159b --- /dev/null +++ b/skills/trtc-ai-service/capabilities/conversation-core/QUICK_START.md @@ -0,0 +1,62 @@ +# conversation-core · Quick Start + +> Configure → Run → Verify, done in three steps. + +## 0. Prerequisites + +- Python ≥ 3.9 +- Activated: Tencent Cloud account + TRTC Conversational AI application + any OpenAI-compatible LLM service + +## 1. Install + +```bash +# From repo root +pip install -r capabilities/conversation-core/requirements.txt +``` + +## 2. Configure the 3 Keys + +```bash +python scripts/setup-credentials.py +``` + +The script interactively guides you through `[1/3] Tencent Cloud → [2/3] TRTC → [3/3] LLM` in order, +running a self-check immediately after each key is entered. On failure, it won't proceed to the next key; +if interrupted mid-way, re-running it will auto-skip keys that already passed (checkpoint resume). + +Output artifacts on success: + +| Path | Contents | Permissions | +|:---|:---|:---:| +| `.env` | Environment variable declarations for the 3 keys | 600 | +| `.credentials_cache` | SHA256 hashes of verified keys | 600 | +| `config-report.json` | Verification timestamp / latency / status for each key | 644 | + +## 3. Launch Web Demo + +```bash +bash start.sh +# Equivalent to: +# cd capabilities/conversation-core && python -m src.server +``` + +Open the Web Demo in your browser (the service listens on port `3000` by default). + +## 4. 
Acceptance Criteria + +- [x] ASR/LLM/TTS pipeline has no hard-coded business logic (protocol passthrough only) +- [x] `setup-credentials.py` supports real-time connectivity self-check and checkpoint resume +- [x] Web Demo top status bar: all three indicator LEDs green +- [x] manifest.yaml includes skeleton type / injection points / modality / security declarations +- [x] INTEGRATION.md provides Agent-readable detection logic and three-level degradation path +- [x] `.credentials_cache` / `.env` permissions 600; no plain-text keys in logs + +## 5. Next Steps + +Overlay business capability packages at the 5 injection points declared in `manifest.yaml.injection_points`: + +```bash +voice-agent add knowledge-base +voice-agent add tool-calling +voice-agent add human-handoff +``` diff --git a/skills/trtc-ai-service/capabilities/conversation-core/manifest.yaml b/skills/trtc-ai-service/capabilities/conversation-core/manifest.yaml new file mode 100644 index 0000000..dd6edac --- /dev/null +++ b/skills/trtc-ai-service/capabilities/conversation-core/manifest.yaml @@ -0,0 +1,250 @@ +# conversation-core capability self-describing manifest +# Type: skeleton (always installed by default; no built-in business scenarios) + +name: "conversation-core" +version: "1.0.0" +type: "skeleton" +description: "Generic Voice Agent skeleton providing ASR/LLM/TTS/session management core capabilities without any built-in business logic" + +# --------------------------------------------------------------------------- +# Dependency declaration +# --------------------------------------------------------------------------- +dependencies: [] # Skeleton layer has no dependencies + +# --------------------------------------------------------------------------- +# Injection points (for Phase 2 capability overlay) +# Convention: position field uses before:function_name / after:function_name / replace:function_name +# --------------------------------------------------------------------------- 
+injection_points: + - id: "agent.before_start" + target: "src/agent.py" + position: "before:_ext_before_start_" + description: "Inject code inside the start_agent method body (e.g. prepend KB search results to instructions); anchor is inside start_agent method body to ensure access to local variables like config/info" + + - id: "agent.after_start" + target: "src/agent.py" + position: "before:_ext_after_start_" + description: "Inject session-level hooks after the AI task is successfully launched (e.g. initialize human handoff state machine); anchor is inside start_agent method body" + + - id: "agent.before_push_text" + target: "src/agent.py" + position: "before:_ext_before_push_text_" + description: "Intercept before text injection is sent (e.g. keyword-triggered handoff / tool calling); anchor is inside push_text method body" + + - id: "server.router_extension" + target: "src/server.py" + position: "after:app.include_router" + description: "Append business capability FastAPI sub-routers" + + - id: "modality.channel_resolver" + target: "src/modality.py" + position: "replace:resolve_input_channel" + description: "Override default channel selection strategy, e.g. 
disable voice input in e-commerce scenarios" + +# --------------------------------------------------------------------------- +# Configuration interface +# --------------------------------------------------------------------------- +config: + credentials: + - key: "TENCENT_CLOUD_SECRET_ID" + required: true + description: "Tencent Cloud API SecretId (Key 1)" + - key: "TENCENT_CLOUD_SECRET_KEY" + required: true + description: "Tencent Cloud API SecretKey (Key 1)" + - key: "TENCENT_CLOUD_REGION" + required: false + default: "ap-guangzhou" + description: "Tencent Cloud API region" + - key: "TRTC_SDK_APP_ID" + required: true + description: "TRTC Conversational AI application SDKAppID (Key 2)" + - key: "TRTC_SDK_SECRET_KEY" + required: true + description: "TRTC Conversational AI application SDKSecretKey (Key 2)" + - key: "LLM_API_KEY" + required: true + description: "External LLM access key (Key 3)" + - key: "LLM_API_URL" + required: false + default: "https://api.openai.com/v1/chat/completions" + description: "OpenAI-compatible protocol endpoint" + - key: "LLM_MODEL" + required: false + default: "gpt-4o-mini" + description: "Default LLM model name" + + io_modality: + voice_input: + enabled: true + provider: "trtc-asr" + fallback: "text_input" + timeout_ms: 5000 + text_input: + enabled: true + provider: null + fallback: null + timeout_ms: 0 + voice_output: + enabled: true + provider: "trtc-tts" + fallback: "text_output" + timeout_ms: 3000 + text_output: + enabled: true + provider: null + fallback: null + timeout_ms: 0 + +# --------------------------------------------------------------------------- +# Exposed API endpoints +# --------------------------------------------------------------------------- +endpoints: + - method: GET + path: /api/v1/health + description: Real-time connectivity self-check for 3 keys (Web Demo status bar data source) + - method: POST + path: /api/v1/get_config + description: Issue room number / UserSig / modality configuration + - method: POST + 
path: /api/v1/agent/start + description: Start AI channel bot (StartAIConversation) + - method: POST + path: /api/v1/agent/stop + description: Stop AI channel bot (StopAIConversation) + - method: POST + path: /api/v1/agent/control + description: Text injection / interruption (ControlAIConversation/ServerPushText) + - method: GET + path: /api/v1/sessions + description: In-memory session list (debugging only) + +# --------------------------------------------------------------------------- +# Tech stack integration rules +# --------------------------------------------------------------------------- +integration: + mode: "auto" + auto_adapters: + - tech_stack: ["react", "vue", "angular"] + adapter: "frontend-spa" + description: "Frontend SPA adapter, generates join-room component and embeds web-demo to call skeleton API" + - tech_stack: ["express", "koa", "fastify", "next"] + adapter: "node-backend" + description: "Node.js backend adapter, generates proxy routes forwarding to skeleton /api/v1/*" + - tech_stack: ["spring-boot", "quarkus"] + adapter: "java-backend" + description: "Java backend adapter, generates Filter forwarding to skeleton" + - tech_stack: ["flask", "fastapi", "django"] + adapter: "python-backend" + description: "Python backend adapter, generates decorators or sub-app mounting" + fallback: + guided_templates: + - "integration-templates/generic-frontend.md" + - "integration-templates/generic-backend.md" + manual_api: + rest_endpoint: "/api/v1" + sdk_packages: + - npm: "@trtc/voice-agent-sdk" + - maven: "com.tencent.trtc:voice-agent-sdk" + - pypi: "trtc-voice-agent" + +# --------------------------------------------------------------------------- +# Business contract (Phase 3; follows references/business-contract-spec.md v1.0) +# Note: This is a skeleton capability; source code is not refactored in this phase; +# only declares the external contract for contract-adapt.py consumption. 
+# port_class / default_adapter fields remain null, to be filled in Phase 4 refactor. +# --------------------------------------------------------------------------- +business_contract: + port_class: null + default_adapter: null + mock_adapter: null + customization_sop: "INTERFACE_ADAPT.md" + external_apis: + # LLM: OpenAI-compatible protocol is the default contract; can be replaced with Claude / Qwen / DeepSeek etc. + - name: llm.chat_completions + direction: outbound + method: POST + path: /v1/chat/completions + description: "Call external LLM for text response (OpenAI Chat Completions compatible protocol)" + request_schema: + model: string + messages: + - role: enum[system, user, assistant, tool] + content: string + temperature: float + max_tokens: int + stream: bool + response_schema: + choices: + - message: + role: string + content: string + finish_reason: string + usage: + prompt_tokens: int + completion_tokens: int + adapter_slots: + - request.model + - request.messages + - response.choices + auth: + type: bearer + location: header + name: Authorization + timeout_ms: 30000 + + # Tencent Cloud TRTC Conversational AI (control plane): StartAIConversation etc. 
+ # Unlike LLM, the TRTC control plane contract is not adaptable (tightly bound to Tencent Cloud); + # declared here only for assembly-phase awareness + - name: trtc.start_ai_conversation + direction: outbound + method: POST + path: "tencentcloudapi.com/?Action=StartAIConversation" + description: "Tencent Cloud TRTC control plane: launch AI conversation (tightly bound to Tencent Cloud protocol, not adaptable)" + request_schema: + SdkAppId: int + RoomId: string + AgentConfig: object + STTConfig: object + LLMConfig: object + TTSConfig: object + response_schema: + TaskId: string + SessionId: string + adapter_slots: [] # No remapping allowed + timeout_ms: 10000 + +# --------------------------------------------------------------------------- +# Security declarations (P0 mitigations in place) +# --------------------------------------------------------------------------- +security: + log_redaction: + enabled: true + patterns: + - "secret_id" + - "secret_key" + - "api_key" + - "app_key" + - "token" + - "usersig" + - "credential" + - "authorization" + injection_protection: + xss_guard: true + prompt_injection_guard: true + credential_storage: + source: "env-only" # Credentials only from environment variables + cache_file: ".credentials_cache" + cache_permission: "0600" + env_file_permission: "0600" + network: + enforce_https: true # End-to-end HTTPS + +# --------------------------------------------------------------------------- +# Acceptance criteria (aligned with dev guide §11.1) +# --------------------------------------------------------------------------- +acceptance: + - "ASR/LLM/TTS pipeline has no hard-coded business logic" + - "3-key connectivity self-check executed immediately upon input" + - "Web Demo top status bar: all 3 indicator LEDs green" + - "Credential cache file permissions set to 600; no plain-text keys in logs" diff --git a/skills/trtc-ai-service/capabilities/conversation-core/requirements.txt 
b/skills/trtc-ai-service/capabilities/conversation-core/requirements.txt new file mode 100644 index 0000000..b2c660a --- /dev/null +++ b/skills/trtc-ai-service/capabilities/conversation-core/requirements.txt @@ -0,0 +1,6 @@ +fastapi>=0.110.0 +uvicorn[standard]>=0.27.0 +pydantic>=2.5.0 +python-dotenv>=1.0.0 +requests>=2.31.0 +PyYAML>=6.0 diff --git a/skills/trtc-ai-service/capabilities/conversation-core/src/__init__.py b/skills/trtc-ai-service/capabilities/conversation-core/src/__init__.py new file mode 100644 index 0000000..b208864 --- /dev/null +++ b/skills/trtc-ai-service/capabilities/conversation-core/src/__init__.py @@ -0,0 +1,10 @@ +"""conversation-core: Voice Agent generic skeleton. + +This package implements the pipeline orchestration for ASR / LLM / TTS / session management only, +with no built-in industry knowledge bases, FAQ templates, or business rules. +All business capabilities are overlaid via external standalone capability packages +using manifest.yaml injection points. +""" + +__version__ = "1.0.0" +__all__ = ["__version__"] diff --git a/skills/trtc-ai-service/capabilities/conversation-core/src/_capability_loader.py b/skills/trtc-ai-service/capabilities/conversation-core/src/_capability_loader.py new file mode 100644 index 0000000..bce92dd --- /dev/null +++ b/skills/trtc-ai-service/capabilities/conversation-core/src/_capability_loader.py @@ -0,0 +1,218 @@ +"""Dynamic loader for sibling capability packages (independent of cwd / repo directory name / hyphens). + +Why this module? +================ +Capability directories use hyphenated names (e.g. ``knowledge-base``, ``human-handoff``), +but Python ``import`` syntax cannot recognize hyphens. Additionally, the ``start.sh`` +process working directory is ``capabilities/conversation-core/``, so the project root +is NOT in ``sys.path``. 
Therefore, the import style in manifest.yaml: + + from capabilities.knowledge_base.src.retriever import attach_faq_to_instructions + +will **never work**, because it implicitly assumes: +1. Directory names use underscores (they actually use hyphens); +2. The project root is in ``sys.path`` (it is not). + +This module uses ``importlib.util`` to proactively register each directory level as a +valid Python package, bypassing package name restrictions; the project root is derived +from ``__file__``, so **renaming the repo directory has no effect**. Relative imports +such as ``from .x import y`` inside sub-modules also work correctly. + +Usage +----- + from ._capability_loader import load_capability + + retriever = load_capability("knowledge-base", "src/retriever.py") + new_text = retriever.attach_faq_to_instructions(text) + + router_mod = load_capability("knowledge-base", "src/router.py") + app.include_router(router_mod.router, prefix="/api/v1/kb") +""" +from __future__ import annotations + +import importlib.util +import logging +import sys +from pathlib import Path +from threading import RLock +from types import ModuleType +from typing import Optional + +logger = logging.getLogger(__name__) + +# --------------------------------------------------------------------------- +# Path resolution: derive repo_root from __file__, independent of cwd / repo directory name +# This file is at {repo_root}/capabilities/conversation-core/src/_capability_loader.py +# parents[3] = {repo_root} (the level containing capabilities/) +# --------------------------------------------------------------------------- +_HERE = Path(__file__).resolve() +_REPO_ROOT = _HERE.parents[3] +_CAPABILITIES_ROOT = _REPO_ROOT / "capabilities" + +_CAPS_NAMESPACE = "_capabilities" + +_lock = RLock() +_module_cache: dict[str, ModuleType] = {} + + +def repo_root() -> Path: + """Return the repo root directory (the level containing ``capabilities/``).""" + return _REPO_ROOT + + +def capabilities_root() -> Path: + return _CAPABILITIES_ROOT + + +def _safe_name(part: str) -> str: + 
"""Convert a directory segment to a valid Python identifier (hyphens → underscores).""" + return part.replace("-", "_") + + +def _ensure_namespace_root() -> ModuleType: + """Register the ``_capabilities`` top-level namespace package in ``sys.modules``.""" + mod = sys.modules.get(_CAPS_NAMESPACE) + if mod is not None: + return mod + spec = importlib.util.spec_from_loader(_CAPS_NAMESPACE, loader=None, is_package=True) + if spec is None: + raise RuntimeError("failed to build namespace spec") + mod = importlib.util.module_from_spec(spec) + mod.__path__ = [str(_CAPABILITIES_ROOT)] # Let importlib find sub-packages under this directory + sys.modules[_CAPS_NAMESPACE] = mod + return mod + + +def _ensure_package(qualified_name: str, dir_path: Path) -> ModuleType: + """Register ``dir_path`` as a Python package named ``qualified_name``. + + If a ``__init__.py`` with the same name exists, exec it normally; otherwise treat as a namespace package. + Idempotent: if already in ``sys.modules``, returns immediately. 
+ """ + cached = sys.modules.get(qualified_name) + if cached is not None: + return cached + + init_file = dir_path / "__init__.py" + if init_file.is_file(): + spec = importlib.util.spec_from_file_location( + qualified_name, + init_file, + submodule_search_locations=[str(dir_path)], + ) + else: + spec = importlib.util.spec_from_loader(qualified_name, loader=None, is_package=True) + if spec is None: + raise ModuleNotFoundError(f"failed to build spec for package: {qualified_name}") + + pkg = importlib.util.module_from_spec(spec) + if not hasattr(pkg, "__path__"): + pkg.__path__ = [str(dir_path)] # type: ignore[attr-defined] + sys.modules[qualified_name] = pkg + + if init_file.is_file() and spec.loader is not None: + try: + spec.loader.exec_module(pkg) + except Exception: + sys.modules.pop(qualified_name, None) + raise + return pkg + + +def load_capability(cap_name: str, module_rel: str) -> ModuleType: + """Load a Python file under a given capability package and return its module object. + + Parameters + ---------- + cap_name + Capability directory name, e.g. ``"knowledge-base"`` (with hyphens). + module_rel + Python file path relative to the capability root, e.g. ``"src/retriever.py"``. + + Returns + ------- + ModuleType + The executed module object. Raises :class:`ModuleNotFoundError` on failure. + + Notes + ----- + - In-process cache: the same ``(cap_name, module_rel)`` is loaded only once. + - Full module name is e.g. ``_capabilities.knowledge_base.src.retriever``, + so relative imports like ``from .x import y`` inside capabilities work correctly. 
+ """ + cache_key = f"{cap_name}::{module_rel}" + with _lock: + cached = _module_cache.get(cache_key) + if cached is not None: + return cached + + cap_dir = _CAPABILITIES_ROOT / cap_name + file_path = cap_dir / module_rel + if not file_path.is_file(): + raise ModuleNotFoundError( + f"capability '{cap_name}' module '{module_rel}' not found at {file_path}" + ) + + # 1) Top-level namespace _capabilities.* + _ensure_namespace_root() + + # 2) Capability package name _capabilities.{cap_safe} (e.g. _capabilities.knowledge_base) + cap_safe = _safe_name(cap_name) + cap_qual = f"{_CAPS_NAMESPACE}.{cap_safe}" + _ensure_package(cap_qual, cap_dir) + + # 3) Register each intermediate directory level as a sub-package + rel_parts = Path(module_rel).parts + *dir_parts, leaf = rel_parts + parent_qual = cap_qual + parent_dir = cap_dir + for part in dir_parts: + parent_dir = parent_dir / part + parent_qual = f"{parent_qual}.{_safe_name(part)}" + _ensure_package(parent_qual, parent_dir) + + # 4) Load leaf module + leaf_basename = Path(leaf).stem + leaf_qual = f"{parent_qual}.{_safe_name(leaf_basename)}" + + cached_leaf = sys.modules.get(leaf_qual) + if cached_leaf is not None: + with _lock: + _module_cache[cache_key] = cached_leaf + return cached_leaf + + spec = importlib.util.spec_from_file_location(leaf_qual, file_path) + if spec is None or spec.loader is None: + raise ModuleNotFoundError( + f"failed to build spec for capability '{cap_name}' / '{module_rel}'" + ) + module = importlib.util.module_from_spec(spec) + sys.modules[leaf_qual] = module + try: + spec.loader.exec_module(module) + except Exception: + sys.modules.pop(leaf_qual, None) + raise + + with _lock: + _module_cache[cache_key] = module + logger.debug("capability loaded: %s -> %s", leaf_qual, file_path) + return module + + +def try_load_capability( + cap_name: str, module_rel: str +) -> Optional[ModuleType]: + """Same as :func:`load_capability`, but returns ``None`` on failure instead of raising. 
+ + Suitable for "capability is optionally installed" scenarios: silently degrades + on missing, without affecting skeleton operation. + """ + try: + return load_capability(cap_name, module_rel) + except Exception as exc: # noqa: BLE001 + logger.info( + "capability '%s' module '%s' not loaded (skipped): %s", + cap_name, module_rel, exc, + ) + return None diff --git a/skills/trtc-ai-service/capabilities/conversation-core/src/agent.py b/skills/trtc-ai-service/capabilities/conversation-core/src/agent.py new file mode 100644 index 0000000..fa0bb64 --- /dev/null +++ b/skills/trtc-ai-service/capabilities/conversation-core/src/agent.py @@ -0,0 +1,231 @@ +"""Voice Agent session orchestration (unified ASR / LLM / TTS / session management pipeline). + +The skeleton only handles protocol orchestration: +1) Client obtains RoomId / user UserSig via /api/v1/get_config +2) Frontend SDK joins the room and calls /api/v1/agent/start +3) Server uses trtc_client.start() to launch the AI channel bot + ↳ TRTC ConversationAI internally chains ASR → LLM → TTS full pipeline +4) /api/v1/agent/stop closes the task +5) /api/v1/agent/control is used for text injection / interrupt (covering text_input modality) + +Note: This module does not introduce any business prompts, industry knowledge bases, or FAQ templates. +All business logic is overlaid via external capability packages using manifest.yaml injection points. 
+""" +from __future__ import annotations + +import logging +import os +import secrets +import time +from dataclasses import dataclass, field +from threading import RLock +from typing import Any, Dict, Optional + +from .credentials import Credentials +from .modality import IoModality +from .trtc_client import AgentLifecycleConfig, TrtcConversationClient +from .usersig import gen_user_sig + +logger = logging.getLogger(__name__) + + +@dataclass +class SessionInfo: + session_id: str + room_id: str + user_id: str + agent_user_id: str + user_sig: str + agent_user_sig: str + task_id: Optional[str] = None + started_at: float = field(default_factory=time.time) + request_id: Optional[str] = None + + +class ConversationAgent: + """Voice Agent session manager. + + Singleton-style, instantiated and reused by server at startup. Only maintains in-memory + session mapping, no persistence (production persistence handled by external capability packages). + """ + + def __init__(self, credentials: Credentials, io_modality: IoModality) -> None: + if not credentials.fully_configured: + raise ValueError( + f"credentials missing: {credentials.missing()}; " + "please run scripts/setup-credentials.py to complete configuration first" + ) + self._cred = credentials + self._io = io_modality + self._client = TrtcConversationClient( + tencent=credentials.tencent_cloud, + trtc=credentials.trtc, + llm=credentials.llm, + ) + self._sessions: Dict[str, SessionInfo] = {} + self._lock = RLock() + + # ------------------------------------------------------------------ + # /api/v1/get_config + # ------------------------------------------------------------------ + def issue_config( + self, + user_id: Optional[str] = None, + room_id: Optional[str] = None, + ) -> Dict[str, Any]: + """Generate room credentials (room number / user UserSig / AI bot UserSig). 
+ + Defaults to **numeric room IDs** (matching TRTC console default applications), + avoiding ``InvalidParameter.UserSig`` false positives with apps that have + PrivateMapKey disabled. UserId names use only ``[A-Za-z0-9_-]``, length ≤ 32 (TRTC hard constraint). + """ + # Numeric room ID: random positive integer within 32-bit range + room = str(room_id) if room_id else str(secrets.randbelow(2_000_000_000) + 1) + u_id = str(user_id) if user_id else f"u_{secrets.token_hex(6)}" + agent_u_id = f"ai_{secrets.token_hex(6)}" + # TRTC UserId max length is 32; defensive truncation + u_id = u_id[:32] + agent_u_id = agent_u_id[:32] + user_sig = gen_user_sig( + sdk_app_id=self._cred.trtc.sdk_app_id, + sdk_secret_key=self._cred.trtc.sdk_secret_key, + user_id=u_id, + ) + agent_sig = gen_user_sig( + sdk_app_id=self._cred.trtc.sdk_app_id, + sdk_secret_key=self._cred.trtc.sdk_secret_key, + user_id=agent_u_id, + ) + sid = secrets.token_urlsafe(12) + info = SessionInfo( + session_id=sid, + room_id=room, + user_id=u_id, + agent_user_id=agent_u_id, + user_sig=user_sig, + agent_user_sig=agent_sig, + ) + with self._lock: + self._sessions[sid] = info + logger.info("issue_config session=%s room=%s user=%s", sid, room, u_id) + return { + "session_id": sid, + "sdk_app_id": self._cred.trtc.sdk_app_id, + "room_id": room, + "room_id_type": 0, # Numeric room ID + "user_id": u_id, + "user_sig": user_sig, + "agent_user_id": agent_u_id, + "io_modality": self._io.to_dict(), + } + + # ------------------------------------------------------------------ + # /api/v1/agent/start + # ------------------------------------------------------------------ + def start_agent( + self, + session_id: str, + config: Optional[AgentLifecycleConfig] = None, + ) -> Dict[str, Any]: + info = self._require_session(session_id) + # _ext_before_start_ (capability extension anchor; do not remove) + # Capabilities (e.g. 
knowledge-base) injected via add-capability.py land here, + # inside the start_agent method body, where `config` and `info` are in scope. + # + # [knowledge-base] If knowledge-base capability is installed, prepend matched FAQ to instructions + # via dynamic loading through _capability_loader, independent of cwd / repo directory name / hyphenated directories + if config is not None and getattr(config, "instructions", None): + from ._capability_loader import try_load_capability + _kb = try_load_capability("knowledge-base", "src/retriever.py") + if _kb is not None: + try: + config.instructions = _kb.attach_faq_to_instructions(config.instructions) + except Exception as exc: # noqa: BLE001 + logger.warning("knowledge-base FAQ injection failed: %s", exc) + result = self._client.start( + room_id=info.room_id, + agent_user_id=info.agent_user_id, + agent_user_sig=info.agent_user_sig, + target_user_id=info.user_id, + config=config, + room_id_type=0, # Numeric room ID (consistent with issue_config) + ) + with self._lock: + info.task_id = result.get("task_id") + info.request_id = result.get("request_id") + logger.info("start_agent session=%s task=%s", session_id, info.task_id) + # _ext_after_start_ (capability extension anchor; do not remove) + # Capabilities (e.g. human-handoff) injected via add-capability.py land here, + # inside the method body, where `session_id` and `info` are in scope. 
+ return { + "session_id": session_id, + "task_id": info.task_id, + "request_id": info.request_id, + "status": "started", + } + + # ------------------------------------------------------------------ + # /api/v1/agent/stop + # ------------------------------------------------------------------ + def stop_agent(self, session_id: str) -> Dict[str, Any]: + info = self._require_session(session_id) + if info.task_id: + self._client.stop(info.task_id) + with self._lock: + self._sessions.pop(session_id, None) + logger.info("stop_agent session=%s task=%s", session_id, info.task_id) + return {"session_id": session_id, "status": "stopped"} + + # ------------------------------------------------------------------ + # /api/v1/agent/control + # ------------------------------------------------------------------ + def push_text( + self, + session_id: str, + text: str, + interrupt: bool = True, + ) -> Dict[str, Any]: + """Text input channel: inject text into a running AI task.""" + info = self._require_session(session_id) + if not info.task_id: + raise RuntimeError("session has no active task; call start_agent first") + if not text or not text.strip(): + raise ValueError("text cannot be empty") + # _ext_before_push_text_ (capability extension anchor; do not remove) + # Capabilities (human-handoff / tool-calling / session-summary) injected + # via add-capability.py land here, inside push_text's body, where the + # locals `session_id` and `text` are in scope. 
+ self._client.control( + task_id=info.task_id, + command="ServerPushText", + text=text, + interrupt=interrupt, + ) + return {"session_id": session_id, "task_id": info.task_id, "delivered": True} + + # ------------------------------------------------------------------ + # Helpers + # ------------------------------------------------------------------ + def list_sessions(self) -> Dict[str, Any]: + with self._lock: + return { + "sessions": [ + { + "session_id": s.session_id, + "room_id": s.room_id, + "user_id": s.user_id, + "task_id": s.task_id, + "started_at": s.started_at, + } + for s in self._sessions.values() + ] + } + + def _require_session(self, session_id: str) -> SessionInfo: + if not session_id: + raise ValueError("session_id is required") + with self._lock: + info = self._sessions.get(session_id) + if not info: + raise ValueError(f"session not found: {session_id}") + return info diff --git a/skills/trtc-ai-service/capabilities/conversation-core/src/credentials.py b/skills/trtc-ai-service/capabilities/conversation-core/src/credentials.py new file mode 100644 index 0000000..7de42a3 --- /dev/null +++ b/skills/trtc-ai-service/capabilities/conversation-core/src/credentials.py @@ -0,0 +1,132 @@ +"""3-Key credential reading and encapsulation. + +Credentials come only from environment variables (P0 Secrets spec: env-only); never +read from code or configuration files in plain text. After reading, they are exposed +to upper layers as dataclasses. Callers should not log the entire object — instead, +rely on log_filter.RedactingFilter for fallback redaction. 
+""" +from __future__ import annotations + +import os +from dataclasses import dataclass, field +from typing import Optional + + +@dataclass(frozen=True) +class TencentCloudCredential: + """Key 1: Tencent Cloud API keys (used for STS / TRTC control plane REST calls).""" + + secret_id: str + secret_key: str + region: str = "ap-guangzhou" + + @property + def configured(self) -> bool: + return bool(self.secret_id and self.secret_key) + + +@dataclass(frozen=True) +class TrtcCredential: + """Key 2: TRTC Conversational AI application credentials. + + SDKAppID and SDKSecretKey are used to generate UserSig and call ConversationAI. + region determines whether to call the international or China endpoint: + - "intl" → Application registered at https://console.trtc.io (default) + - "cn" → Application registered at https://console.cloud.tencent.com/trtc + """ + + sdk_app_id: int + sdk_secret_key: str + region: str = "intl" # intl | cn + + @property + def configured(self) -> bool: + return bool(self.sdk_app_id and self.sdk_secret_key) + + @property + def trtc_endpoint(self) -> str: + return ( + "trtc.intl.tencentcloudapi.com" + if self.region == "intl" + else "trtc.tencentcloudapi.com" + ) + + @property + def trtc_region(self) -> str: + return "ap-singapore" if self.region == "intl" else "ap-guangzhou" + + +@dataclass(frozen=True) +class LlmCredential: + """Key 3: External LLM access key (OpenAI-compatible protocol).""" + + api_key: str + api_url: str = "https://api.openai.com/v1/chat/completions" + model: str = "gpt-4o-mini" + llm_type: str = "openai" + + @property + def configured(self) -> bool: + return bool(self.api_key and self.api_url and self.model) + + +@dataclass(frozen=True) +class Credentials: + """3-Key aggregate container.""" + + tencent_cloud: TencentCloudCredential + trtc: TrtcCredential + llm: LlmCredential + + @property + def fully_configured(self) -> bool: + return all( + ( + self.tencent_cloud.configured, + self.trtc.configured, + self.llm.configured, + ) + ) 
+ + def missing(self) -> list[str]: + miss: list[str] = [] + if not self.tencent_cloud.configured: + miss.append("tencent_cloud") + if not self.trtc.configured: + miss.append("trtc") + if not self.llm.configured: + miss.append("llm") + return miss + + +def _int_env(key: str, default: int = 0) -> int: + raw = os.getenv(key, "") + try: + return int(raw) + except (TypeError, ValueError): + return default + + +def load_from_env() -> Credentials: + """Load the 3 keys from environment variables. + + All key names match .env.example / setup-credentials.py output. + """ + return Credentials( + tencent_cloud=TencentCloudCredential( + secret_id=os.getenv("TENCENT_CLOUD_SECRET_ID", ""), + secret_key=os.getenv("TENCENT_CLOUD_SECRET_KEY", ""), + region=os.getenv("TENCENT_CLOUD_REGION", "ap-guangzhou"), + ), + trtc=TrtcCredential( + sdk_app_id=_int_env("TRTC_SDK_APP_ID", 0), + sdk_secret_key=os.getenv("TRTC_SDK_SECRET_KEY", ""), + region=os.getenv("TRTC_REGION", "intl"), + ), + llm=LlmCredential( + api_key=os.getenv("LLM_API_KEY", ""), + api_url=os.getenv("LLM_API_URL", "https://api.openai.com/v1/chat/completions"), + model=os.getenv("LLM_MODEL", "gpt-4o-mini"), + llm_type=os.getenv("LLM_TYPE", "openai"), + ), + ) diff --git a/skills/trtc-ai-service/capabilities/conversation-core/src/health.py b/skills/trtc-ai-service/capabilities/conversation-core/src/health.py new file mode 100644 index 0000000..3f547d5 --- /dev/null +++ b/skills/trtc-ai-service/capabilities/conversation-core/src/health.py @@ -0,0 +1,355 @@ +"""Real-time connectivity self-check for the 3 keys. + +Each key is validated immediately after input; failures get instant feedback without proceeding to the next key. 
+""" +from __future__ import annotations + +import base64 +import hashlib +import hmac +import json +import time +from dataclasses import dataclass +from datetime import datetime, timezone +from typing import Tuple +from urllib.parse import urlparse + +import requests + +from .credentials import LlmCredential, TencentCloudCredential, TrtcCredential +from .usersig import gen_user_sig + + +@dataclass +class CheckResult: + ok: bool + latency_ms: int + error_code: str = "" + detail: str = "" + + def to_dict(self) -> dict: + return { + "status": "ok" if self.ok else "failed", + "latency_ms": self.latency_ms, + "error_code": self.error_code, + "detail": self.detail, + } + + +# --------------------------------------------------------------------------- +# 1) Tencent Cloud API Key: Call STS GetFederationToken +# --------------------------------------------------------------------------- +_STS_HOST = "sts.tencentcloudapi.com" +_STS_SERVICE = "sts" +_STS_VERSION = "2018-08-13" +_STS_ACTION = "GetFederationToken" + + +def _sign_tc3(secret_key: str, date: str, service: str, string_to_sign: str) -> str: + k_date = hmac.new(("TC3" + secret_key).encode("utf-8"), date.encode("utf-8"), hashlib.sha256).digest() + k_service = hmac.new(k_date, service.encode("utf-8"), hashlib.sha256).digest() + k_signing = hmac.new(k_service, b"tc3_request", hashlib.sha256).digest() + return hmac.new(k_signing, string_to_sign.encode("utf-8"), hashlib.sha256).hexdigest() + + +def check_tencent_cloud(cred: TencentCloudCredential, timeout: float = 5.0) -> CheckResult: + """Call STS GetFederationToken to verify SecretId/SecretKey validity.""" + if not cred.configured: + return CheckResult(ok=False, latency_ms=0, error_code="E001", detail="empty credential") + + payload = json.dumps( + { + "Name": "trtc-voice-agent-credential-check", + "Policy": json.dumps( + { + "version": "2.0", + "statement": [ + { + "effect": "deny", + "action": ["*"], + "resource": ["*"], + } + ], + } + ), + "DurationSeconds": 1800, + 
}, + separators=(",", ":"), + ) + timestamp = int(time.time()) + date = datetime.fromtimestamp(timestamp, tz=timezone.utc).strftime("%Y-%m-%d") + + canonical_headers = ( + f"content-type:application/json; charset=utf-8\n" + f"host:{_STS_HOST}\n" + f"x-tc-action:{_STS_ACTION.lower()}\n" + ) + signed_headers = "content-type;host;x-tc-action" + hashed_payload = hashlib.sha256(payload.encode("utf-8")).hexdigest() + canonical_request = ( + f"POST\n/\n\n{canonical_headers}\n{signed_headers}\n{hashed_payload}" + ) + credential_scope = f"{date}/{_STS_SERVICE}/tc3_request" + hashed_canonical = hashlib.sha256(canonical_request.encode("utf-8")).hexdigest() + string_to_sign = ( + f"TC3-HMAC-SHA256\n{timestamp}\n{credential_scope}\n{hashed_canonical}" + ) + signature = _sign_tc3(cred.secret_key, date, _STS_SERVICE, string_to_sign) + authorization = ( + f"TC3-HMAC-SHA256 Credential={cred.secret_id}/{credential_scope}, " + f"SignedHeaders={signed_headers}, Signature={signature}" + ) + headers = { + "Authorization": authorization, + "Content-Type": "application/json; charset=utf-8", + "Host": _STS_HOST, + "X-TC-Action": _STS_ACTION, + "X-TC-Timestamp": str(timestamp), + "X-TC-Version": _STS_VERSION, + "X-TC-Region": cred.region, + } + started = time.perf_counter() + try: + resp = requests.post( + f"https://{_STS_HOST}", + headers=headers, + data=payload.encode("utf-8"), + timeout=timeout, + ) + elapsed = int((time.perf_counter() - started) * 1000) + body = resp.json() if resp.content else {} + err = body.get("Response", {}).get("Error") + if resp.status_code == 200 and not err: + return CheckResult(ok=True, latency_ms=elapsed) + if err and err.get("Code", "").startswith("AuthFailure"): + return CheckResult( + ok=False, + latency_ms=elapsed, + error_code="E001", + detail=err.get("Message", "AuthFailure"), + ) + return CheckResult( + ok=False, + latency_ms=elapsed, + error_code="E001", + detail=(err or {}).get("Message") or f"HTTP {resp.status_code}", + ) + except requests.Timeout: 
+ return CheckResult(ok=False, latency_ms=int(timeout * 1000), error_code="E004", detail="timeout") + except requests.RequestException as exc: + return CheckResult(ok=False, latency_ms=0, error_code="E004", detail=str(exc)) + + +# --------------------------------------------------------------------------- +# 2) TRTC Application Credentials: Call TRTC OpenAPI DescribeTRTCRealTimeQualityData to verify +# that SDKAppID + Tencent Cloud API Key combination actually works. +# Note: During StartAIConversation, TRTC server also validates SDKSecretKey +# against the room ID, so local UserSig generation alone cannot detect SecretKey +# misconfiguration. Hence we do a lightweight real OpenAPI call as a fallback +# (rate limit 50/s won't trigger throttling). +# endpoint switches by trtc.region: intl → trtc.intl.tencentcloudapi.com +# --------------------------------------------------------------------------- +_TRTC_SERVICE = "trtc" +_TRTC_VERSION = "2019-07-22" + + +def _trtc_sign_tc3(secret_key: str, date: str, string_to_sign: str) -> str: + k_date = hmac.new( + ("TC3" + secret_key).encode("utf-8"), + date.encode("utf-8"), + hashlib.sha256, + ).digest() + k_service = hmac.new(k_date, _TRTC_SERVICE.encode("utf-8"), hashlib.sha256).digest() + k_signing = hmac.new(k_service, b"tc3_request", hashlib.sha256).digest() + return hmac.new(k_signing, string_to_sign.encode("utf-8"), hashlib.sha256).hexdigest() + + +def check_trtc( + cred: TrtcCredential, + tencent: TencentCloudCredential | None = None, + timeout: float = 5.0, +) -> CheckResult: + """Verify TRTC credentials. + + 1. Required: Local UserSig generation (verify SDKAppID + SDKSecretKey self-consistency) + 2.
Recommended: Call TRTC OpenAPI ``DescribeTRTCRealTimeQualityData`` + to verify SDKAppID really exists under the Tencent Cloud account (depends on ``tencent`` cred) + endpoint switches by cred.region (intl / cn) + """ + if not cred.configured: + return CheckResult(ok=False, latency_ms=0, error_code="E002", detail="empty credential") + + started = time.perf_counter() + # —— Step 1: Local UserSig generation —— + try: + sig = gen_user_sig( + sdk_app_id=cred.sdk_app_id, + sdk_secret_key=cred.sdk_secret_key, + user_id="credential_check", + expire_seconds=60, + ) + except Exception as exc: + return CheckResult(ok=False, latency_ms=0, error_code="E002", detail=str(exc)) + if not sig or len(sig) < 32: + return CheckResult( + ok=False, + latency_ms=int((time.perf_counter() - started) * 1000), + error_code="E002", + detail="invalid usersig length", + ) + + # —— Step 2: Real OpenAPI verification (requires Tencent Cloud API credentials) —— + if tencent is None or not tencent.configured: + elapsed = int((time.perf_counter() - started) * 1000) + return CheckResult(ok=True, latency_ms=elapsed, detail="local-only (no tencent cred)") + + trtc_host = cred.trtc_endpoint + trtc_region = cred.trtc_region + + # Use DescribeTRTCRealTimeQualityData for a minimal probe: pass SdkAppId + very short time window + now_ts = int(time.time()) + payload = json.dumps( + { + "SdkAppId": cred.sdk_app_id, + "StartTime": now_ts - 60, + "EndTime": now_ts, + }, + separators=(",", ":"), + ) + timestamp = now_ts + date = datetime.fromtimestamp(timestamp, tz=timezone.utc).strftime("%Y-%m-%d") + canonical_headers = ( + f"content-type:application/json; charset=utf-8\n" + f"host:{trtc_host}\n" + f"x-tc-action:describetrtcrealtimequalitydata\n" + ) + signed_headers = "content-type;host;x-tc-action" + hashed_payload = hashlib.sha256(payload.encode("utf-8")).hexdigest() + canonical_request = ( + f"POST\n/\n\n{canonical_headers}\n{signed_headers}\n{hashed_payload}" + ) + credential_scope = 
f"{date}/{_TRTC_SERVICE}/tc3_request" + hashed_canonical = hashlib.sha256(canonical_request.encode("utf-8")).hexdigest() + string_to_sign = ( + f"TC3-HMAC-SHA256\n{timestamp}\n{credential_scope}\n{hashed_canonical}" + ) + signature = _trtc_sign_tc3(tencent.secret_key, date, string_to_sign) + authorization = ( + f"TC3-HMAC-SHA256 Credential={tencent.secret_id}/{credential_scope}, " + f"SignedHeaders={signed_headers}, Signature={signature}" + ) + headers = { + "Authorization": authorization, + "Content-Type": "application/json; charset=utf-8", + "Host": trtc_host, + "X-TC-Action": "DescribeTRTCRealTimeQualityData", + "X-TC-Timestamp": str(timestamp), + "X-TC-Version": _TRTC_VERSION, + "X-TC-Region": trtc_region, + } + try: + resp = requests.post( + f"https://{trtc_host}", + headers=headers, + data=payload.encode("utf-8"), + timeout=timeout, + ) + elapsed = int((time.perf_counter() - started) * 1000) + body = resp.json() if resp.content else {} + err = body.get("Response", {}).get("Error") + if resp.status_code == 200 and not err: + return CheckResult( + ok=True, + latency_ms=elapsed, + detail=f"region={cred.region}, endpoint={trtc_host}", + ) + # Distinguish two error types: SdkAppId not under this account vs others + if err: + code = err.get("Code", "") + if "SdkAppId" in code or "AuthFailure" in code or "ResourceNotFound" in code: + return CheckResult( + ok=False, + latency_ms=elapsed, + error_code="E002", + detail=f"{code}: {err.get('Message', '')} (region={cred.region})", + ) + # Other business errors (e.g. 
a sub-capability not enabled) don't affect SdkAppId ownership; treat as pass + return CheckResult( + ok=True, + latency_ms=elapsed, + detail=f"sdkappid valid; api warning: {code}", + ) + return CheckResult( + ok=False, + latency_ms=elapsed, + error_code="E002", + detail=f"HTTP {resp.status_code}", + ) + except requests.Timeout: + return CheckResult( + ok=False, + latency_ms=int(timeout * 1000), + error_code="E004", + detail="trtc api timeout", + ) + except requests.RequestException as exc: + return CheckResult(ok=False, latency_ms=0, error_code="E004", detail=str(exc)) + + +# --------------------------------------------------------------------------- +# 3) External LLM: Send a minimal prompt to verify the key is valid. +# --------------------------------------------------------------------------- +def check_llm(cred: LlmCredential, timeout: float = 10.0) -> CheckResult: + if not cred.configured: + return CheckResult(ok=False, latency_ms=0, error_code="E003", detail="empty credential") + + parsed = urlparse(cred.api_url) + if parsed.scheme not in ("http", "https"): + return CheckResult(ok=False, latency_ms=0, error_code="E003", detail="invalid api_url scheme") + + headers = { + "Authorization": f"Bearer {cred.api_key}", + "Content-Type": "application/json", + } + body = { + "model": cred.model, + "messages": [{"role": "user", "content": "ping"}], + "max_tokens": 1, + "temperature": 0, + "stream": False, + } + started = time.perf_counter() + try: + resp = requests.post(cred.api_url, headers=headers, json=body, timeout=timeout) + elapsed = int((time.perf_counter() - started) * 1000) + if resp.status_code == 200: + return CheckResult(ok=True, latency_ms=elapsed) + if resp.status_code in (401, 403): + return CheckResult( + ok=False, + latency_ms=elapsed, + error_code="E003", + detail=f"unauthorized: {resp.status_code}", + ) + return CheckResult( + ok=False, + latency_ms=elapsed, + error_code="E003", + detail=f"HTTP {resp.status_code}: {resp.text[:200]}", + ) + except 
requests.Timeout: + return CheckResult(ok=False, latency_ms=int(timeout * 1000), error_code="E004", detail="timeout") + except requests.RequestException as exc: + return CheckResult(ok=False, latency_ms=0, error_code="E004", detail=str(exc)) + + +def check_all( + tencent: TencentCloudCredential, + trtc: TrtcCredential, + llm: LlmCredential, +) -> Tuple[CheckResult, CheckResult, CheckResult]: + return ( + check_tencent_cloud(tencent), + check_trtc(trtc, tencent=tencent), + check_llm(llm), + ) diff --git a/skills/trtc-ai-service/capabilities/conversation-core/src/log_filter.py b/skills/trtc-ai-service/capabilities/conversation-core/src/log_filter.py new file mode 100644 index 0000000..572c28f --- /dev/null +++ b/skills/trtc-ai-service/capabilities/conversation-core/src/log_filter.py @@ -0,0 +1,76 @@ +"""Log redaction filter (P0 security requirement implemented). + +Performs irreversible masking of credential fields in log records before logging, +based on the keywords declared in manifest.yaml security.log_redaction.patterns. 
+""" +from __future__ import annotations + +import logging +import re +from typing import Iterable + +# Default sensitive field names to match (aligned with manifest.yaml security.log_redaction.patterns) +_DEFAULT_PATTERNS = ( + "secret_id", + "secret_key", + "api_key", + "app_key", + "token", + "usersig", + "credential", + "authorization", +) + + +def _build_regex(patterns: Iterable[str]) -> re.Pattern[str]: + # Matches three common formats: key=value / "key": "value" / key: value + keys = "|".join(re.escape(p) for p in patterns) + pattern = ( + r"(?i)(?P<key>" + keys + r")" + r"(?P<sep>\s*[:=]\s*\"?)" + r"(?P<val>[A-Za-z0-9_\-\.\+/=]{4,})" + ) + return re.compile(pattern) + + +def _mask(value: str) -> str: + if len(value) <= 8: + return "***" + return f"{value[:2]}***{value[-2:]}" + + +class RedactingFilter(logging.Filter): + """Mask sensitive fields in log messages / args.""" + + def __init__(self, patterns: Iterable[str] = _DEFAULT_PATTERNS) -> None: + super().__init__() + self._regex = _build_regex(patterns) + + def filter(self, record: logging.LogRecord) -> bool: # noqa: A003 + try: + if isinstance(record.msg, str): + record.msg = self._regex.sub( + lambda m: f"{m.group('key')}{m.group('sep')}{_mask(m.group('val'))}", + record.msg, + ) + if record.args: + record.args = tuple( + self._regex.sub( + lambda m: f"{m.group('key')}{m.group('sep')}{_mask(m.group('val'))}", + str(a), + ) + if isinstance(a, str) + else a + for a in record.args + ) + except Exception: # Redaction failure must not affect the main logging flow + pass + return True + + +def install_redacting_filter(logger: logging.Logger | None = None) -> None: + """Attach the redacting filter to the specified Logger (defaults to root Logger).""" + target = logger or logging.getLogger() + if any(isinstance(f, RedactingFilter) for f in target.filters): + return + target.addFilter(RedactingFilter()) diff --git a/skills/trtc-ai-service/capabilities/conversation-core/src/modality.py
b/skills/trtc-ai-service/capabilities/conversation-core/src/modality.py new file mode 100644 index 0000000..77b653f --- /dev/null +++ b/skills/trtc-ai-service/capabilities/conversation-core/src/modality.py @@ -0,0 +1,109 @@ +"""I/O modality configuration and degradation strategy (conversation-core built-in). + +Four channels (voice input / voice output / text input / text output) are independently configurable. +When a channel service is unavailable, the system automatically degrades to an available channel +according to the strategy declared in this module, ensuring session continuity. +""" +from __future__ import annotations + +from dataclasses import dataclass, field +from enum import Enum +from typing import Optional + + +class Channel(str, Enum): + VOICE_INPUT = "voice_input" + TEXT_INPUT = "text_input" + VOICE_OUTPUT = "voice_output" + TEXT_OUTPUT = "text_output" + + +@dataclass +class ChannelConfig: + enabled: bool = True + provider: Optional[str] = None + fallback: Optional[Channel] = None + timeout_ms: int = 0 + + +@dataclass +class IoModality: + voice_input: ChannelConfig = field( + default_factory=lambda: ChannelConfig( + enabled=True, provider="trtc-asr", fallback=Channel.TEXT_INPUT, timeout_ms=5000 + ) + ) + text_input: ChannelConfig = field(default_factory=lambda: ChannelConfig(enabled=True)) + voice_output: ChannelConfig = field( + default_factory=lambda: ChannelConfig( + enabled=True, provider="trtc-tts", fallback=Channel.TEXT_OUTPUT, timeout_ms=3000 + ) + ) + text_output: ChannelConfig = field(default_factory=lambda: ChannelConfig(enabled=True)) + + def to_dict(self) -> dict: + def _dump(c: ChannelConfig) -> dict: + return { + "enabled": c.enabled, + "provider": c.provider, + "fallback": c.fallback.value if c.fallback else None, + "timeout_ms": c.timeout_ms, + } + + return { + Channel.VOICE_INPUT.value: _dump(self.voice_input), + Channel.TEXT_INPUT.value: _dump(self.text_input), + Channel.VOICE_OUTPUT.value: _dump(self.voice_output), + 
Channel.TEXT_OUTPUT.value: _dump(self.text_output), + } + + def resolve_input_channel(self, voice_available: bool) -> Channel: + """Return the input channel to use, based on enabled status + service availability.""" + if self.voice_input.enabled and voice_available: + return Channel.VOICE_INPUT + if self.voice_input.fallback and self.text_input.enabled: + return Channel.TEXT_INPUT + if self.text_input.enabled: + return Channel.TEXT_INPUT + # Edge case: all input channels unavailable -> upper layer enters silent wait + raise RuntimeError("no usable input channel") + + def resolve_output_channel(self, voice_available: bool) -> Channel: + if self.voice_output.enabled and voice_available: + return Channel.VOICE_OUTPUT + if self.voice_output.fallback and self.text_output.enabled: + return Channel.TEXT_OUTPUT + if self.text_output.enabled: + return Channel.TEXT_OUTPUT + raise RuntimeError("no usable output channel") + + +def from_dict(data: dict) -> IoModality: + """Construct IoModality instance from the io_modality section of manifest.yaml.""" + if not data: + return IoModality() + + def _channel(value: Optional[str]) -> Optional[Channel]: + if not value: + return None + return Channel(value) + + def _build(cfg: dict | None, default: ChannelConfig) -> ChannelConfig: + if not cfg: + return default + return ChannelConfig( + enabled=cfg.get("enabled", default.enabled), + provider=cfg.get("provider", default.provider), + fallback=_channel(cfg.get("fallback")) + if "fallback" in cfg + else default.fallback, + timeout_ms=cfg.get("timeout_ms", default.timeout_ms), + ) + + base = IoModality() + return IoModality( + voice_input=_build(data.get("voice_input"), base.voice_input), + text_input=_build(data.get("text_input"), base.text_input), + voice_output=_build(data.get("voice_output"), base.voice_output), + text_output=_build(data.get("text_output"), base.text_output), + ) diff --git a/skills/trtc-ai-service/capabilities/conversation-core/src/server.py 
b/skills/trtc-ai-service/capabilities/conversation-core/src/server.py new file mode 100644 index 0000000..c307032 --- /dev/null +++ b/skills/trtc-ai-service/capabilities/conversation-core/src/server.py @@ -0,0 +1,312 @@ +"""FastAPI entry point: exposes skeleton REST API + static Web Demo. + +Routes: + GET /api/v1/health —— Real-time connectivity check for 3 keys + POST /api/v1/get_config —— Generate RoomId / UserSig / modality config + POST /api/v1/agent/start —— Start AI conversation task + POST /api/v1/agent/stop —— Stop AI conversation task + POST /api/v1/agent/control —— Text injection / interrupt + GET / —— Web Demo static page + +Design principles (aligned with §3.3): + - Zero industry assumptions: all routes only do protocol orchestration, no built-in business prompts + - Configuration as verification: health endpoint provides data source for Web Demo's three LEDs + - Security compliance: log redaction filter installed at startup; credentials from env vars only +""" +from __future__ import annotations + +import logging +import os +from pathlib import Path +from typing import Any, Dict, Optional + +from dotenv import load_dotenv + +# Load .env before importing business modules to ensure credentials module reads correct env vars +_BASE_DIR = Path(__file__).resolve().parent.parent +load_dotenv(_BASE_DIR / ".env.local") +load_dotenv(_BASE_DIR / ".env") + +from fastapi import APIRouter, FastAPI, HTTPException +from fastapi.middleware.cors import CORSMiddleware +from fastapi.responses import FileResponse +from fastapi.staticfiles import StaticFiles +from pydantic import BaseModel, Field + +from .agent import ConversationAgent +from .credentials import load_from_env +from .health import check_all +from .log_filter import install_redacting_filter +from .modality import IoModality +from .trtc_client import AgentLifecycleConfig + +logger = logging.getLogger("conversation_core") + +# Install log redaction filter (P0 security requirement) +logging.basicConfig( + 
level=os.getenv("LOG_LEVEL", "INFO"), + format="%(asctime)s [%(levelname)s] %(name)s: %(message)s", +) +install_redacting_filter(logging.getLogger()) + + +# --------------------------------------------------------------------------- +# Global Agent singleton (startup failure does not prevent /api/v1/health from giving clear diagnostics) +# --------------------------------------------------------------------------- +_credentials = load_from_env() +_io_modality = IoModality() # Phase 1 default: all modalities enabled +_agent: Optional[ConversationAgent] = None +_init_error: Optional[str] = None +try: + _agent = ConversationAgent(_credentials, _io_modality) + logger.info("ConversationAgent initialized") +except Exception as exc: # Credential missing etc. must not crash the process + _init_error = str(exc) + logger.warning("ConversationAgent not initialized: %s", _init_error) + + +# --------------------------------------------------------------------------- +# Pydantic Models +# --------------------------------------------------------------------------- +class GetConfigRequest(BaseModel): + user_id: Optional[str] = None + room_id: Optional[str] = None + + +class StartAgentRequest(BaseModel): + session_id: str = Field(..., description="session_id returned by get_config") + instructions: Optional[str] = None + greeting: Optional[str] = None + language: Optional[str] = "en" # en | zh + voice_id: Optional[str] = None # Leave empty to use DEFAULT_VOICE_IDS selected by language + max_idle_time: Optional[int] = 60 + + +class StopAgentRequest(BaseModel): + session_id: str + + +class ControlRequest(BaseModel): + session_id: str + text: str + interrupt: bool = True + + +# --------------------------------------------------------------------------- +# FastAPI App +# --------------------------------------------------------------------------- +app = FastAPI( + title="conversation-core", + version="1.0.0", + description="TRTC Voice Agent generic skeleton (no business assumptions)", 
+) +app.add_middleware( + CORSMiddleware, + allow_origins=["*"], + allow_credentials=True, + allow_methods=["*"], + allow_headers=["*"], +) + +api = APIRouter(prefix="/api/v1") + + +def _to_http_error(exc: Exception) -> HTTPException: + if isinstance(exc, ValueError): + return HTTPException(status_code=400, detail=str(exc)) + if isinstance(exc, RuntimeError): + return HTTPException(status_code=500, detail=str(exc)) + return HTTPException(status_code=500, detail=f"internal: {exc}") + + +def _require_agent() -> ConversationAgent: + if _agent is None: + raise HTTPException( + status_code=503, + detail={ + "code": "credentials_missing", + "message": _init_error or "credentials not configured", + "hint": "run scripts/setup-credentials.py first", + }, + ) + return _agent + + +# --------------------------------------------------------------------------- +# Health +# --------------------------------------------------------------------------- +@api.get("/health") +def health() -> Dict[str, Any]: + """Real-time probe of 3 keys' connectivity, used by Web Demo top status bar.""" + cred = load_from_env() + tc, trtc, llm = check_all(cred.tencent_cloud, cred.trtc, cred.llm) + overall = "ok" if tc.ok and trtc.ok and llm.ok else "partial_failure" + return { + "status": overall, + "checks": { + "tencent_cloud": tc.to_dict(), + "trtc": trtc.to_dict(), + "llm": llm.to_dict(), + }, + "configured": cred.fully_configured, + "missing": cred.missing(), + "io_modality": _io_modality.to_dict(), + } + + +# --------------------------------------------------------------------------- +# Config / Lifecycle +# --------------------------------------------------------------------------- +@api.post("/get_config") +def get_config(req: GetConfigRequest) -> Dict[str, Any]: + agent = _require_agent() + try: + data = agent.issue_config(user_id=req.user_id, room_id=req.room_id) + return {"code": 0, "msg": "success", "data": data} + except Exception as exc: + raise _to_http_error(exc) + + 
+@api.post("/agent/start") +def agent_start(req: StartAgentRequest) -> Dict[str, Any]: + agent = _require_agent() + try: + defaults = AgentLifecycleConfig() + cfg = AgentLifecycleConfig( + instructions=req.instructions or defaults.instructions, + greeting=req.greeting or defaults.greeting, + language=req.language or "en", + voice_id=req.voice_id or "", + max_idle_time=req.max_idle_time or 60, + ) + return {"code": 0, "msg": "success", "data": agent.start_agent(req.session_id, cfg)} + except Exception as exc: + raise _to_http_error(exc) + + +@api.post("/agent/stop") +def agent_stop(req: StopAgentRequest) -> Dict[str, Any]: + agent = _require_agent() + try: + return {"code": 0, "msg": "success", "data": agent.stop_agent(req.session_id)} + except Exception as exc: + raise _to_http_error(exc) + + +@api.post("/agent/control") +def agent_control(req: ControlRequest) -> Dict[str, Any]: + agent = _require_agent() + try: + return { + "code": 0, + "msg": "success", + "data": agent.push_text(req.session_id, req.text, req.interrupt), + } + except Exception as exc: + raise _to_http_error(exc) + + +@api.get("/sessions") +def sessions_list() -> Dict[str, Any]: + agent = _require_agent() + return {"code": 0, "data": agent.list_sessions()} + + +# --------------------------------------------------------------------------- +# Debug endpoint: for troubleshooting InvalidParameter.UserSig etc. 
+# Outputs current config + a test UserSig for comparison against the TRTC official tool: +# https://console.cloud.tencent.com/trtc/usersigtools +# Security: only returns SDKAppID / region / endpoint / test UserSig; never exposes plaintext SecretKey +# --------------------------------------------------------------------------- +@api.get("/debug/usersig") +def debug_usersig(user_id: str = "test_user_001") -> Dict[str, Any]: + cred = load_from_env() + if not cred.trtc.configured: + raise HTTPException(status_code=503, detail="TRTC credential not configured") + from .usersig import gen_user_sig + + sig = gen_user_sig( + sdk_app_id=cred.trtc.sdk_app_id, + sdk_secret_key=cred.trtc.sdk_secret_key, + user_id=user_id, + expire_seconds=86400, + ) + return { + "sdk_app_id": cred.trtc.sdk_app_id, + "region": cred.trtc.region, + "trtc_endpoint": cred.trtc.trtc_endpoint, + "test_user_id": user_id, + "test_user_sig": sig, + "user_sig_length": len(sig), + "verify_url": "https://console.cloud.tencent.com/trtc/usersigtools", + "hint": ( + "Paste sdk_app_id / test_user_id / test_user_sig into the TRTC console official verification tool. " + "If the tool shows UserSig verification passed → SDKSecretKey is correct; " + "If the tool shows verification failed → the TRTC_SDK_SECRET_KEY you entered does not match this SDKAppID. " + "Please re-check the SDKSecretKey in the TRTC console (note: this is NOT the Tencent Cloud API SecretKey)." 
+ ), + } + + +app.include_router(api) +# [human-handoff] mount sub-router +from ._capability_loader import try_load_capability as _try_load_capability +_hh_router_mod = _try_load_capability("human-handoff", "src/router.py") +if _hh_router_mod is not None and hasattr(_hh_router_mod, "router"): + app.include_router( + _hh_router_mod.router, prefix="/api/v1/handoff", tags=["human-handoff"] + ) + +# [session-summary] mount sub-router (default installed; supplies ticket context summaries) +_ss_router_mod = _try_load_capability("session-summary", "src/router.py") +if _ss_router_mod is not None and hasattr(_ss_router_mod, "router"): + app.include_router( + _ss_router_mod.router, prefix="/api/v1/summary", tags=["session-summary"] + ) + +# --------------------------------------------------------------------------- +# Capability route mounting (optional; dynamically loaded via _capability_loader, silently +# skipped if the capability package is not installed). +# Injected by add-capability; all use try_load_capability to avoid hyphenated import failures. +# --------------------------------------------------------------------------- +from ._capability_loader import try_load_capability as _try_load_capability # noqa: E402 + +# [knowledge-base] mount sub-router +_kb_router_mod = _try_load_capability("knowledge-base", "src/router.py") +if _kb_router_mod is not None and hasattr(_kb_router_mod, "router"): + app.include_router( + _kb_router_mod.router, prefix="/api/v1/kb", tags=["knowledge-base"] + ) + + +# --------------------------------------------------------------------------- +# Web Demo static pages (minimal verification page, no business content) +# Can point to a custom demo directory (e.g. Path A artifact directory) via +# the WEB_DEMO_DIR environment variable. Defaults to conversation-core's own web-demo self-check page. 
+# --------------------------------------------------------------------------- +_DEMO_DIR = Path(os.getenv("WEB_DEMO_DIR", str(_BASE_DIR / "web-demo"))) +if _DEMO_DIR.exists(): + app.mount( + "/static", + StaticFiles(directory=str(_DEMO_DIR), html=True), + name="static", + ) + + @app.get("/") + def index() -> FileResponse: + return FileResponse(str(_DEMO_DIR / "index.html")) + + +# --------------------------------------------------------------------------- +# Entry point +# --------------------------------------------------------------------------- +def main() -> None: + import uvicorn + + port = int(os.getenv("PORT", "3000")) + host = os.getenv("HOST", "0.0.0.0") + uvicorn.run(app, host=host, port=port) + + +if __name__ == "__main__": + main() diff --git a/skills/trtc-ai-service/capabilities/conversation-core/src/trtc_client.py b/skills/trtc-ai-service/capabilities/conversation-core/src/trtc_client.py new file mode 100644 index 0000000..57ac685 --- /dev/null +++ b/skills/trtc-ai-service/capabilities/conversation-core/src/trtc_client.py @@ -0,0 +1,315 @@ +"""TRTC Conversational AI control plane client. + +Encapsulates the minimal call chain for three REST APIs: +StartAIConversation / StopAIConversation / ControlAIConversation. +The skeleton layer only does "protocol encapsulation + credential signing", +with no built-in business prompts, industry knowledge bases, or FAQ templates. 
+ +API docs: +- StartAIConversation: https://cloud.tencent.com/document/product/647/108514 +- StopAIConversation: https://cloud.tencent.com/document/product/647/108513 +- ControlAIConversation: https://cloud.tencent.com/document/product/647/109408 +""" +from __future__ import annotations + +import hashlib +import hmac +import json +import logging +import time +from dataclasses import dataclass +from datetime import datetime, timezone +from typing import Any, Dict, Optional + +import requests + +from .credentials import LlmCredential, TencentCloudCredential, TrtcCredential + +logger = logging.getLogger(__name__) + +_SERVICE = "trtc" +_VERSION = "2019-07-22" + + +def _sign_tc3(secret_key: str, date: str, string_to_sign: str) -> str: + k_date = hmac.new( + ("TC3" + secret_key).encode("utf-8"), + date.encode("utf-8"), + hashlib.sha256, + ).digest() + k_service = hmac.new(k_date, _SERVICE.encode("utf-8"), hashlib.sha256).digest() + k_signing = hmac.new(k_service, b"tc3_request", hashlib.sha256).digest() + return hmac.new(k_signing, string_to_sign.encode("utf-8"), hashlib.sha256).hexdigest() + + +def _signed_request( + cred: TencentCloudCredential, + host: str, + region: str, + action: str, + payload: Dict[str, Any], + timeout: float = 5.0, +) -> Dict[str, Any]: + body = json.dumps(payload, separators=(",", ":"), ensure_ascii=False) + timestamp = int(time.time()) + date = datetime.fromtimestamp(timestamp, tz=timezone.utc).strftime("%Y-%m-%d") + + canonical_headers = ( + f"content-type:application/json; charset=utf-8\n" + f"host:{host}\n" + f"x-tc-action:{action.lower()}\n" + ) + signed_headers = "content-type;host;x-tc-action" + hashed_payload = hashlib.sha256(body.encode("utf-8")).hexdigest() + canonical_request = f"POST\n/\n\n{canonical_headers}\n{signed_headers}\n{hashed_payload}" + credential_scope = f"{date}/{_SERVICE}/tc3_request" + hashed_canonical = hashlib.sha256(canonical_request.encode("utf-8")).hexdigest() + string_to_sign = 
f"TC3-HMAC-SHA256\n{timestamp}\n{credential_scope}\n{hashed_canonical}" + signature = _sign_tc3(cred.secret_key, date, string_to_sign) + authorization = ( + f"TC3-HMAC-SHA256 Credential={cred.secret_id}/{credential_scope}, " + f"SignedHeaders={signed_headers}, Signature={signature}" + ) + headers = { + "Authorization": authorization, + "Content-Type": "application/json; charset=utf-8", + "Host": host, + "X-TC-Action": action, + "X-TC-Timestamp": str(timestamp), + "X-TC-Version": _VERSION, + "X-TC-Region": region, + } + resp = requests.post( + f"https://{host}", + headers=headers, + data=body.encode("utf-8"), + timeout=timeout, + ) + if resp.status_code != 200: + raise RuntimeError(f"TRTC API HTTP {resp.status_code}: {resp.text[:200]}") + parsed = resp.json() + response_obj = parsed.get("Response", {}) + err = response_obj.get("Error") + request_id = response_obj.get("RequestId", "n/a") + if err: + raise RuntimeError( + f"TRTC API error {err.get('Code')}: {err.get('Message')} " + f"[endpoint={host}, action={action}, RequestId={request_id}]" + ) + return response_obj + + +# Generic voice-assistant guardrails (NO industry/business assumptions). +# Baked into the skeleton defaults so that (1) in any environment and with any LLM model, and (2) when the frontend does not pass instructions, +# the TTS never reads Markdown symbols aloud and the AI trusts the authoritative context injected by the system. +_DEFAULT_INSTRUCTIONS = ( + "You are a helpful voice assistant for an online store's customer service. " + "Always answer in plain spoken language suitable for text-to-speech. " + "Do NOT use any Markdown or formatting symbols such as asterisks (*), underscores (_), " + "pound signs (#), backticks (`), tildes (~) or bullet / numbered-list markup, and never " + "read such symbols aloud. " + "Keep replies concise, ideally one to three sentences. " + "For general questions such as product recommendations or shopping advice, answer helpfully " + "and freely from your own knowledge — never claim you lack a product catalog or cannot help " + "with general guidance; offer concrete, natural suggestions instead. 
" + "Any information provided to you inside a message that begins with [system] is authoritative " + "context: use it directly to answer the user, and never say you cannot find it or ask the user " + "to repeat an identifier (such as an order number) that was already given to you." +) + + +@dataclass +class AgentLifecycleConfig: + """Session lifecycle parameters (business-logic independent).""" + + instructions: str = _DEFAULT_INSTRUCTIONS + greeting: str = "Hello, how can I help you?" + max_idle_time: int = 60 # seconds + welcome_message: str = "" + language: str = "en" # Default English (widest compatibility; Chinese requires TRTC app to enable corresponding capability) + voice_id: str = "v-female-A4b9KqP2" # TRTC FlowTTS default female voice (English Articulate Narrator) + tts_model: str = "flow_01_turbo" + + +# TRTC FlowTTS verified voice IDs (taken from oral-coach project, confirmed working) +# Full voice list: https://trtc.io/document/79682?product=conversationalai +DEFAULT_VOICE_IDS = { + ("en", "female"): "v-female-p9Xy7Q1L", # Articulate Narrator + ("en", "male"): "v-male-A4b9KqP2", # Scholarly Lecturer + ("zh", "female"): "female-kefu-xiaoyue", + ("zh", "male"): "male-kefu-xiaoxu", +} + + +class TrtcConversationClient: + """Thin wrapper around TRTC ConversationAI control plane. + + Constructor parameters: + tencent: Tencent Cloud API keys (used to sign REST requests). + trtc: TRTC SDKAppID / SDKSecretKey (the SdkAppId in StartAIConversation). + llm: LLM credentials, used to populate LLMConfig (passthrough only, not called within the skeleton). 
+ """ + + def __init__( + self, + tencent: TencentCloudCredential, + trtc: TrtcCredential, + llm: LlmCredential, + ) -> None: + if not tencent.configured: + raise ValueError("tencent cloud credential not configured") + if not trtc.configured: + raise ValueError("trtc credential not configured") + if not llm.configured: + raise ValueError("llm credential not configured") + self.tencent = tencent + self.trtc = trtc + self.llm = llm + + # ------------------------------------------------------------------ + # StartAIConversation + # ------------------------------------------------------------------ + def start( + self, + room_id: str, + agent_user_id: str, + agent_user_sig: str, + target_user_id: str, + config: Optional[AgentLifecycleConfig] = None, + room_id_type: int = 0, + ) -> Dict[str, Any]: + cfg = config or AgentLifecycleConfig() + # Resolve voice_id: user explicit > default per language + voice_id = cfg.voice_id or DEFAULT_VOICE_IDS.get( + (cfg.language, "female"), + DEFAULT_VOICE_IDS[("en", "female")], + ) + payload: Dict[str, Any] = { + "SdkAppId": self.trtc.sdk_app_id, + "RoomId": str(room_id), + "RoomIdType": room_id_type, + "AgentConfig": { + "UserId": agent_user_id, + "UserSig": agent_user_sig, + "MaxIdleTime": cfg.max_idle_time, + "TargetUserId": target_user_id, + "WelcomeMessage": cfg.welcome_message or cfg.greeting, + # Smart interrupt (critical): + # InterruptMode 2 = auto + manual dual-track + # • Auto: user speaks beyond InterruptSpeechDuration ms → stop TTS + # • Manual: frontend sends type:20001 custom message → immediately stop TTS + # (for text input: send interrupt before text, then type:20000 triggers new turn) + "InterruptMode": 2, + "InterruptSpeechDuration": 500, + # Subtitle mode: 1 = deliver both user and AI subtitles to client + "SubtitleMode": 1, + # Single-word filter: prevent ASR from splitting filler sounds like "um/ah" into single words + "FilterOneWord": True, + # Turn detection: 3 = semantic + VAD dual-signal to detect when user 
has finished speaking + "TurnDetectionMode": 3, + "TurnDetection": {"SemanticEagerness": "auto"}, + }, + "STTConfig": { + "Language": cfg.language, + "VadLevel": 3, + "VadSilenceTime": 1000, + }, + "LLMConfig": json.dumps( + { + "LLMType": self.llm.llm_type, + "Model": self.llm.model, + "APIKey": self.llm.api_key, + "APIUrl": self.llm.api_url, + "Streaming": True, + "SystemPrompt": cfg.instructions, + "History": 20, + "Temperature": 0.4, + }, + separators=(",", ":"), + ensure_ascii=False, + ), + "TTSConfig": json.dumps( + { + "TTSType": "flow", + "Model": cfg.tts_model, + "VoiceId": voice_id, + "Speed": 1.0, + "Volume": 1.0, + "Pitch": 0, + "Language": cfg.language, + }, + separators=(",", ":"), + ensure_ascii=False, + ), + } + # Log key diagnostics before starting (UserSig redacted) + logger.info( + "StartAIConversation: endpoint=%s region=%s SdkAppId=%s RoomId=%s " + "agent=%s target=%s userSig=%s...%s(len=%d) lang=%s voice=%s", + self.trtc.trtc_endpoint, + self.trtc.trtc_region, + self.trtc.sdk_app_id, + room_id, + agent_user_id, + target_user_id, + agent_user_sig[:6], + agent_user_sig[-4:], + len(agent_user_sig), + cfg.language, + voice_id, + ) + resp = _signed_request( + self.tencent, + host=self.trtc.trtc_endpoint, + region=self.trtc.trtc_region, + action="StartAIConversation", + payload=payload, + timeout=10.0, + ) + return { + "task_id": resp.get("TaskId"), + "request_id": resp.get("RequestId"), + } + + # ------------------------------------------------------------------ + # StopAIConversation + # ------------------------------------------------------------------ + def stop(self, task_id: str) -> None: + if not task_id: + raise ValueError("task_id is required") + _signed_request( + self.tencent, + host=self.trtc.trtc_endpoint, + region=self.trtc.trtc_region, + action="StopAIConversation", + payload={"TaskId": task_id}, + timeout=5.0, + ) + + # ------------------------------------------------------------------ + # ControlAIConversation: used for text 
injection / interrupt + # ------------------------------------------------------------------ + def control( + self, + task_id: str, + command: str, + text: Optional[str] = None, + interrupt: bool = True, + ) -> Dict[str, Any]: + """Inject text or issue a control command to a running conversation task.""" + if not task_id or not command: + raise ValueError("task_id and command are required") + payload: Dict[str, Any] = {"TaskId": task_id, "Command": command} + if text is not None: + payload["ServerPushText"] = { + "Text": text, + "Interrupt": interrupt, + } + return _signed_request( + self.tencent, + host=self.trtc.trtc_endpoint, + region=self.trtc.trtc_region, + action="ControlAIConversation", + payload=payload, + timeout=5.0, + ) diff --git a/skills/trtc-ai-service/capabilities/conversation-core/src/usersig.py b/skills/trtc-ai-service/capabilities/conversation-core/src/usersig.py new file mode 100644 index 0000000..ce69def --- /dev/null +++ b/skills/trtc-ai-service/capabilities/conversation-core/src/usersig.py @@ -0,0 +1,90 @@ +"""TLS-SIG-API-v2 UserSig generator (pure Python, no third-party dependencies). + +TRTC room authentication uses SDKAppID + SDKSecretKey to sign the UserId via HMAC-SHA256, +then compresses with zlib + base64url encodes to produce the UserSig. This implementation +matches the official ``TLSSigAPIv2`` behavior, enabling usage in a minimal skeleton without +additional SDKs. 
+ +Reference: https://cloud.tencent.com/document/product/647/17275 +""" +from __future__ import annotations + +import base64 +import hashlib +import hmac +import json +import time +import zlib + + +def _base64_encode(data: bytes) -> str: + s = base64.b64encode(data).decode("utf-8") + # TRTC custom base64url: + → *, / → -, = → _ + return s.replace("+", "*").replace("/", "-").replace("=", "_") + + +def _hmac_sha256( + sdk_app_id: int, + user_id: str, + secret_key: str, + current_ts: int, + expire: int, + base64_userbuf: str | None = None, +) -> str: + raw_to_sign = ( + f"TLS.identifier:{user_id}\n" + f"TLS.sdkappid:{sdk_app_id}\n" + f"TLS.time:{current_ts}\n" + f"TLS.expire:{expire}\n" + ) + if base64_userbuf is not None: + raw_to_sign += f"TLS.userbuf:{base64_userbuf}\n" + digest = hmac.new( + secret_key.encode("utf-8"), + raw_to_sign.encode("utf-8"), + hashlib.sha256, + ).digest() + return base64.b64encode(digest).decode("utf-8") + + +def gen_user_sig( + sdk_app_id: int, + sdk_secret_key: str, + user_id: str, + expire_seconds: int = 86400, +) -> str: + """Generate a UserSig. + + Args: + sdk_app_id: TRTC SDKAppID (integer). + sdk_secret_key: TRTC SDKSecretKey. + user_id: User identifier within the room; must remain stable. + expire_seconds: Validity duration in seconds, default 24 hours. + + Returns: + A UserSig string ready for use with the TRTC Web SDK for room entry. 
+ """ + if not sdk_app_id or not sdk_secret_key: + raise ValueError("sdk_app_id and sdk_secret_key are required") + if not user_id: + raise ValueError("user_id is required") + + current_ts = int(time.time()) + sig = _hmac_sha256( + sdk_app_id=sdk_app_id, + user_id=user_id, + secret_key=sdk_secret_key, + current_ts=current_ts, + expire=expire_seconds, + ) + + payload = { + "TLS.ver": "2.0", + "TLS.identifier": str(user_id), + "TLS.sdkappid": int(sdk_app_id), + "TLS.expire": int(expire_seconds), + "TLS.time": int(current_ts), + "TLS.sig": sig, + } + compressed = zlib.compress(json.dumps(payload).encode("utf-8")) + return _base64_encode(compressed) diff --git a/skills/trtc-ai-service/capabilities/conversation-core/tests/__init__.py b/skills/trtc-ai-service/capabilities/conversation-core/tests/__init__.py new file mode 100644 index 0000000..e69de29 diff --git a/skills/trtc-ai-service/capabilities/conversation-core/tests/test_skeleton.py b/skills/trtc-ai-service/capabilities/conversation-core/tests/test_skeleton.py new file mode 100644 index 0000000..939787c --- /dev/null +++ b/skills/trtc-ai-service/capabilities/conversation-core/tests/test_skeleton.py @@ -0,0 +1,216 @@ +"""Skeleton core pipeline unit tests. + +Verification targets: + - 3-key credential encapsulation loads correctly from environment variables + - I/O modality channel selection and degradation strategy work correctly + - Log redaction filter masks common sensitive fields + - UserSig generator produces non-empty, reasonably-sized signatures for valid input + - Skeleton source code contains no hardcoded business logic (FAQ / industry prompts etc.) 
+""" +from __future__ import annotations + +import logging +import os +import sys +from pathlib import Path + +import pytest + +_HERE = Path(__file__).resolve().parent +_CORE = _HERE.parent +sys.path.insert(0, str(_CORE)) + +from src.credentials import load_from_env # noqa: E402 +from src.log_filter import RedactingFilter # noqa: E402 +from src.modality import Channel, IoModality, from_dict # noqa: E402 +from src.usersig import gen_user_sig # noqa: E402 + + +# --------------------------------------------------------------------------- +# credentials +# --------------------------------------------------------------------------- +def test_credentials_from_env(monkeypatch): + monkeypatch.setenv("TENCENT_CLOUD_SECRET_ID", "AKID_xxx") + monkeypatch.setenv("TENCENT_CLOUD_SECRET_KEY", "secret_xxx") + monkeypatch.setenv("TRTC_SDK_APP_ID", "1400000000") + monkeypatch.setenv("TRTC_SDK_SECRET_KEY", "trtc_secret") + monkeypatch.setenv("LLM_API_KEY", "sk-xxx") + + cred = load_from_env() + assert cred.fully_configured is True + assert cred.tencent_cloud.secret_id == "AKID_xxx" + assert cred.trtc.sdk_app_id == 1400000000 + assert cred.llm.model == "gpt-4o-mini" # default + assert cred.missing() == [] + + +def test_credentials_missing(monkeypatch): + for k in ( + "TENCENT_CLOUD_SECRET_ID", "TENCENT_CLOUD_SECRET_KEY", + "TRTC_SDK_APP_ID", "TRTC_SDK_SECRET_KEY", "LLM_API_KEY", + ): + monkeypatch.delenv(k, raising=False) + cred = load_from_env() + assert cred.fully_configured is False + assert set(cred.missing()) == {"tencent_cloud", "trtc", "llm"} + + +# --------------------------------------------------------------------------- +# modality +# --------------------------------------------------------------------------- +def test_modality_default_resolve(): + mod = IoModality() + assert mod.resolve_input_channel(voice_available=True) == Channel.VOICE_INPUT + assert mod.resolve_output_channel(voice_available=True) == Channel.VOICE_OUTPUT + + +def 
test_modality_fallback_to_text_when_voice_unavailable(): + mod = IoModality() + assert mod.resolve_input_channel(voice_available=False) == Channel.TEXT_INPUT + assert mod.resolve_output_channel(voice_available=False) == Channel.TEXT_OUTPUT + + +def test_modality_text_only_scenario(): + mod = from_dict( + { + "voice_input": {"enabled": False}, + "voice_output": {"enabled": False}, + "text_input": {"enabled": True}, + "text_output": {"enabled": True}, + } + ) + assert mod.resolve_input_channel(voice_available=True) == Channel.TEXT_INPUT + assert mod.resolve_output_channel(voice_available=True) == Channel.TEXT_OUTPUT + + +def test_modality_all_disabled_raises(): + mod = from_dict( + { + "voice_input": {"enabled": False}, + "voice_output": {"enabled": False}, + "text_input": {"enabled": False}, + "text_output": {"enabled": False}, + } + ) + with pytest.raises(RuntimeError): + mod.resolve_input_channel(voice_available=False) + + +# --------------------------------------------------------------------------- +# log redaction (P0 security) +# --------------------------------------------------------------------------- +def test_log_redacting_filter(): + rec = logging.LogRecord( + name="t", level=logging.INFO, pathname="", lineno=1, + msg="boot with secret_key=ABCDEFGHIJKLMNOP and api_key: sk-1234567890abcdef", + args=(), exc_info=None, + ) + flt = RedactingFilter() + assert flt.filter(rec) is True + masked = rec.getMessage() + assert "ABCDEFGHIJKLMNOP" not in masked + assert "sk-1234567890abcdef" not in masked + assert "secret_key" in masked # Field name preserved + assert "api_key" in masked + + +# --------------------------------------------------------------------------- +# usersig +# --------------------------------------------------------------------------- +def test_usersig_basic(): + sig = gen_user_sig( + sdk_app_id=1400000000, + sdk_secret_key="dummy_secret_for_unit_test", + user_id="user_123", + expire_seconds=60, + ) + assert isinstance(sig, str) and len(sig) > 32 
+ # TRTC custom base64url charset (+ → *, / → -, = → _) + allowed = set( + "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789*-_" + ) + assert all(c in allowed for c in sig) + + +def test_usersig_input_validation(): + with pytest.raises(ValueError): + gen_user_sig(0, "k", "user") + with pytest.raises(ValueError): + gen_user_sig(1400000000, "", "user") + with pytest.raises(ValueError): + gen_user_sig(1400000000, "k", "") + + +# --------------------------------------------------------------------------- +# Skeleton purity: source code should not contain industry keywords +# (e-commerce / orders / restaurant / reservation etc.). +# Check target: after stripping comments and docstrings, actual code must not contain hardcoded business logic. +# --------------------------------------------------------------------------- +import ast +import io +import tokenize + + +def _strip_comments_and_docstrings(source: str) -> str: + """Return code after stripping comments and docstrings.""" + # 1) Remove # comments + out_tokens = [] + g = tokenize.generate_tokens(io.StringIO(source).readline) + for tok_type, tok_val, *_ in g: + if tok_type == tokenize.COMMENT: + continue + out_tokens.append((tok_type, tok_val)) + no_comments = tokenize.untokenize(out_tokens) + # 2) Remove docstrings: rewrite AST without Expr(Constant(str)) nodes + try: + tree = ast.parse(no_comments) + except SyntaxError: + return no_comments + + class _DocstringRemover(ast.NodeTransformer): + def _strip(self, node): + if ( + node.body + and isinstance(node.body[0], ast.Expr) + and isinstance(node.body[0].value, ast.Constant) + and isinstance(node.body[0].value.value, str) + ): + node.body = node.body[1:] or [ast.Pass()] + return node + + def visit_Module(self, node): + self.generic_visit(node) + return self._strip(node) + + def visit_FunctionDef(self, node): + self.generic_visit(node) + return self._strip(node) + + def visit_AsyncFunctionDef(self, node): + self.generic_visit(node) + return 
self._strip(node) + + def visit_ClassDef(self, node): + self.generic_visit(node) + return self._strip(node) + + cleaned = _DocstringRemover().visit(tree) + ast.fix_missing_locations(cleaned) + return ast.unparse(cleaned) + + +def test_skeleton_purity_no_business_keywords(): + forbidden = [] + forbidden = [ + + "FAQ", + ] + src_dir = _CORE / "src" + offenders = [] + for py in src_dir.glob("*.py"): + raw = py.read_text(encoding="utf-8") + code_only = _strip_comments_and_docstrings(raw) + for kw in forbidden: + if kw in code_only: + offenders.append((py.name, kw)) + assert offenders == [], f"Skeleton contains business keywords: {offenders}" diff --git a/skills/trtc-ai-service/capabilities/conversation-core/web-demo/README.md b/skills/trtc-ai-service/capabilities/conversation-core/web-demo/README.md new file mode 100644 index 0000000..7982430 --- /dev/null +++ b/skills/trtc-ai-service/capabilities/conversation-core/web-demo/README.md @@ -0,0 +1,48 @@ +# Web Demo · 3-Step Quick Start Guide + +> This directory is the minimal runnable verification page for conversation-core, **containing no business logic**. + +## Three Steps to Start + +```bash +# 1. Install dependencies (first time) +pip install -r capabilities/conversation-core/requirements.txt + +# 2. Configure three keys (interactive guide) +python scripts/setup-credentials.py + +# 3. Launch the Demo +bash start.sh +# or: cd capabilities/conversation-core && python -m src.server +``` + +Open your browser and visit . + +## Verification Checklist + +After opening the page, check in the following order: + +1. The three indicator LEDs in the top status bar go from `gray` → `yellow (pending)` → `green`. +2. Once all three LEDs are green, the "Start Conversation" button becomes clickable. +3. Clicking it automatically calls `/api/v1/get_config` and `/api/v1/agent/start`; the console will output `task_id`. +4. Send any text in the text input box; you can see the ServerPushText injection record in the TRTC console. 
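The first three checklist steps can also be exercised from a script rather than the browser. The following is a minimal sketch, not part of the demo itself. It assumes the server is listening on `http://localhost:3000` (the `PORT` default in `src/server.py`), that the `get_config` response carries a `session_id` field inside `data`, and that the `agent/start` response exposes `task_id` inside `data`. The helper names (`all_checks_ok`, `build_start_payload`, `call`, `run_checklist`) are hypothetical and exist only for this illustration.

```python
"""Sketch: run checklist steps 1-3 against a locally running conversation-core."""
import json
import urllib.request
from typing import Optional

# Assumption: the server runs locally on the default PORT=3000.
BASE_URL = "http://localhost:3000"
KEYS = ("tencent_cloud", "trtc", "llm")


def all_checks_ok(health: dict) -> bool:
    """Return True only when all three checks report status 'ok' (all LEDs green)."""
    checks = health.get("checks", {})
    return all(checks.get(key, {}).get("status") == "ok" for key in KEYS)


def build_start_payload(session_id: str, language: str = "en") -> dict:
    """Build the JSON body for POST /api/v1/agent/start."""
    if not session_id:
        raise ValueError("session_id is required")
    return {"session_id": session_id, "language": language}


def call(path: str, body: Optional[dict] = None) -> dict:
    """Send GET when body is None, otherwise POST JSON; return the decoded reply."""
    data = json.dumps(body).encode("utf-8") if body is not None else None
    req = urllib.request.Request(
        BASE_URL + path,
        data=data,
        headers={"Content-Type": "application/json"},
        method="POST" if body is not None else "GET",
    )
    with urllib.request.urlopen(req, timeout=10) as resp:
        return json.loads(resp.read().decode("utf-8"))


def run_checklist() -> Optional[str]:
    """Health check, then get_config, then agent/start; return the task_id if present."""
    if not all_checks_ok(call("/api/v1/health")):
        raise RuntimeError("health check failed; fix credentials first")
    config = call("/api/v1/get_config", {})["data"]
    # Assumption: the get_config payload carries a `session_id` field.
    started = call("/api/v1/agent/start", build_start_payload(config["session_id"]))
    # Assumption: the start payload exposes `task_id` inside `data`.
    return started["data"].get("task_id")
```

Once the server is up, calling `run_checklist()` should mirror what the "Start Conversation" button does. The two pure helpers can be checked without any network access.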
+ +## Troubleshooting + +Click "Recheck" in the top-right corner to force a connectivity refresh. On failure, the browser console will output structured diagnostics, e.g.: + +```json +{ + "tencent_cloud": { "status": "ok", "latency_ms": 120 }, + "trtc": { "status": "ok", "latency_ms": 12 }, + "llm": { "status": "failed", "error_code": "E003", "detail": "unauthorized: 401" } +} +``` + +Cross-reference the `error_code` with the `INTEGRATION.md` troubleshooting dictionary to locate the issue. + +## Not in Scope for This Demo + +- Real audio capture and TRTC RTC room entry (handled by Phase 2 `frontend-spa` adapter or the integrator) +- Business knowledge base / FAQ / tool calling (overlaid by standalone capability packages) +- Digital human rendering, handoff, session summaries, etc. (overlaid by standalone capability packages) diff --git a/skills/trtc-ai-service/capabilities/conversation-core/web-demo/app.js b/skills/trtc-ai-service/capabilities/conversation-core/web-demo/app.js new file mode 100644 index 0000000..b94b4b3 --- /dev/null +++ b/skills/trtc-ai-service/capabilities/conversation-core/web-demo/app.js @@ -0,0 +1,415 @@ +/* TRTC Conversational AI · Web Demo + * Health check → request room credentials (get_config) → TRTC enterRoom + start AI + * Subtitles come from TRTC custom messages: type=10000 (subtitle), type=10001 (AI state) + * User text messages sent to the AI bot: sendCustomMessage(cmdId:2, type:20000) + * → AI treats it as "user speech" and responds (voice + subtitle) + */ +(function () { + 'use strict'; + + const elIndicators = document.querySelectorAll('.indicator'); + const btnRecheck = document.getElementById('btn-recheck'); + const btnStart = document.getElementById('btn-start'); + const btnMic = document.getElementById('btn-mic'); + const btnStop = document.getElementById('btn-stop'); + const btnSend = document.getElementById('btn-send'); + const txt = document.getElementById('text-input'); + const conv = 
document.getElementById('conversation'); + const micLabel = document.getElementById('mic-label'); + const micIcon = document.getElementById('mic-icon'); + const agentStatusEl = document.getElementById('agent-status'); + + const state = { + healthy: false, + sessionId: null, + sdkAppId: 0, + roomId: null, + userId: null, + userSig: null, + agentUserId: null, + taskId: null, + trtcClient: null, + micEnabled: false, + aiRoundText: {}, // roundid -> accumulated text (frames may be incremental or cumulative) + aiRoundLast: {}, // roundid -> previous frame + aiRoundBubbleId: {}, // roundid -> dom id + userTurnText: '', + userBubbleId: null, + }; + + // ====================== HTTP helpers ====================== + async function api(path, options = {}) { + const resp = await fetch(path, { + method: options.method || 'GET', + headers: { 'Content-Type': 'application/json' }, + body: options.body ? JSON.stringify(options.body) : undefined, + }); + let data; + try { data = await resp.json(); } catch { data = {}; } + if (!resp.ok) { + const msg = (data && (data.detail?.message || data.detail || data.msg)) || resp.statusText; + throw new Error(typeof msg === 'string' ? msg : JSON.stringify(msg)); + } + return data; + } + + // ====================== UI helpers ====================== + function setIndicator(key, ok, latency) { + elIndicators.forEach((el) => { + if (el.dataset.key !== key) return; + const led = el.querySelector('.led'); + const lat = el.querySelector('.latency'); + led.classList.remove('led-unknown', 'led-ok', 'led-fail', 'led-pending'); + if (ok === 'pending') { + led.classList.add('led-pending'); + lat.textContent = '...'; + } else if (ok) { + led.classList.add('led-ok'); + lat.textContent = latency != null ? `${latency}ms` : 'ok'; + } else { + led.classList.add('led-fail'); + lat.textContent = latency != null ? 
`${latency}ms` : 'fail'; + } + }); + } + + function clearPlaceholder() { + const p = conv.querySelector('.placeholder'); + if (p) p.remove(); + } + + function makeBubble(role, text) { + clearPlaceholder(); + const div = document.createElement('div'); + div.className = `bubble ${role}`; + div.textContent = text || ''; + conv.appendChild(div); + div.scrollIntoView({ behavior: 'smooth', block: 'end' }); + return div; + } + + function updateBubble(el, text) { + if (!el) return; + el.textContent = text; + el.scrollIntoView({ behavior: 'smooth', block: 'end' }); + } + + function setAgentStatus(label, cls = '') { + agentStatusEl.innerHTML = `<span class="agent-state agent-state-${cls}">${label}</span>`; + } + + function updateMicUI() { + btnMic.classList.toggle('btn-active', state.micEnabled); + micLabel.textContent = state.micEnabled ? 'Mic on' : 'Mic off'; + micIcon.textContent = state.micEnabled ? '🔴' : '🎙️'; + } + + // ====================== Health ====================== + async function runHealthCheck() { + ['tencent_cloud', 'trtc', 'llm'].forEach((k) => setIndicator(k, 'pending')); + btnStart.disabled = true; + try { + const data = await api('/api/v1/health'); + const checks = data.checks || {}; + let allOk = true; + for (const k of ['tencent_cloud', 'trtc', 'llm']) { + const c = checks[k] || {}; + const ok = c.status === 'ok'; + if (!ok) allOk = false; + setIndicator(k, ok, c.latency_ms); + if (!ok && c.detail) console.warn(`[health] ${k}:`, c.error_code, c.detail); + } + state.healthy = allOk; + btnStart.disabled = !allOk; + if (!data.configured) { + clearPlaceholder(); + makeBubble( + 'agent', + `Credentials missing: ${(data.missing || []).join(', ')}\nRun: python scripts/setup-credentials.py` + ); + } + } catch (err) { + ['tencent_cloud', 'trtc', 'llm'].forEach((k) => setIndicator(k, false)); + console.error('[health] error', err); + } + } + + // ====================== TRTC custom message handling ====================== + // type:10000 subtitles (user / AI), 
type:10001 AI state + function handleCustomMessage(data, 
eventUserId) { + if (!data || typeof data !== 'object') return; + + if (data.type === 10001) { + const stateCode = data.payload && data.payload.state; + const map = { + 1: ['listening', 'listening'], + 2: ['thinking', 'thinking'], + 3: ['speaking', 'speaking'], + 4: ['interrupted', 'idle'], + }; + const m = map[stateCode] || ['idle', 'idle']; + setAgentStatus(m[0], m[1]); + return; + } + + if (data.type !== 10000) return; + const text = (data.payload && data.payload.text) || ''; + const sender = data.sender || eventUserId || ''; + const end = data.payload && data.payload.end === true; + const roundid = (data.payload && data.payload.roundid) || ''; + const isUser = sender === state.userId; + + if (isUser) { + // User speech ASR: only write to the bubble when end=true, to avoid garbled partial sentences + if (end && text.trim()) { + if (!state.userBubbleId) { + state.userBubbleId = makeBubble('user', text.trim()); + } else { + const cur = state.userBubbleId.textContent || ''; + updateBubble(state.userBubbleId, cur ? `${cur} ${text.trim()}` : text.trim()); + } + // After a user utterance is finalized, the next ASR result starts a new bubble + setTimeout(() => { state.userBubbleId = null; }, 1500); + } + return; + } + + // AI subtitles: handle both incremental and cumulative frames; aggregate into one bubble per roundid + if (text.trim() && roundid) { + const last = state.aiRoundLast[roundid] || ''; + const cur = state.aiRoundText[roundid] || ''; + const isAccumulative = last && text.startsWith(last); + state.aiRoundText[roundid] = isAccumulative ? 
text : (cur + text); + state.aiRoundLast[roundid] = text; + + let bubble = state.aiRoundBubbleId[roundid]; + if (!bubble) { + bubble = makeBubble('agent', state.aiRoundText[roundid]); + state.aiRoundBubbleId[roundid] = bubble; + } else { + updateBubble(bubble, state.aiRoundText[roundid]); + } + } + + if (end && roundid) { + const bubble = state.aiRoundBubbleId[roundid]; + const finalText = state.aiRoundText[roundid] || text; + if (bubble) updateBubble(bubble, finalText); + delete state.aiRoundText[roundid]; + delete state.aiRoundLast[roundid]; + delete state.aiRoundBubbleId[roundid]; + } + } + + // ====================== Start / Stop ====================== + async function startConversation() { + btnStart.disabled = true; + setAgentStatus('connecting...', 'thinking'); + try { + // 1) get_config + const cfg = await api('/api/v1/get_config', { method: 'POST', body: {} }); + const data = cfg.data || {}; + Object.assign(state, { + sessionId: data.session_id, + sdkAppId: data.sdk_app_id, + roomId: parseInt(data.room_id, 10), // TRTC enterRoom requires roomId to be a number (numeric room ID) + userId: data.user_id, + userSig: data.user_sig, + agentUserId: data.agent_user_id, + }); + + // 2) TRTC create + enterRoom + if (typeof TRTC === 'undefined') throw new Error('TRTC Web SDK not loaded'); + state.trtcClient = TRTC.create(); + + state.trtcClient.on(TRTC.EVENT.CUSTOM_MESSAGE, (event) => { + try { + const raw = new TextDecoder().decode(event.data); // named `raw` to avoid shadowing the outer `txt` input element + const parsed = JSON.parse(raw); + handleCustomMessage(parsed, event.userId); + } catch (e) { console.warn('parse custom msg failed', e); } + }); + + state.trtcClient.on(TRTC.EVENT.ERROR, (err) => { + console.error('[trtc] error', err); + }); + + await state.trtcClient.enterRoom({ + roomId: state.roomId, + scene: 'rtc', + sdkAppId: state.sdkAppId, + userId: state.userId, + userSig: state.userSig, + }); + + // 3) start local audio (muted by default; enabled only 
after the user clicks Mic) + try { + await state.trtcClient.startLocalAudio(); + await state.trtcClient.updateLocalAudio({ mute: 
true }); + state.micEnabled = false; + } catch (e) { + console.warn('mic init failed (continue text-only):', e); + } + + // 4) StartAIConversation + await api('/api/v1/agent/start', { + method: 'POST', + body: { session_id: state.sessionId, language: 'en' }, + }); + + btnStop.disabled = false; + btnSend.disabled = false; + btnMic.disabled = false; + txt.disabled = false; + txt.focus(); + updateMicUI(); + setAgentStatus('ready', 'idle'); + } catch (err) { + console.error('[start] error', err); + makeBubble('agent', `Start failed: ${err.message || err}`); + setAgentStatus('error', 'idle'); + await safeExitRoom(); + btnStart.disabled = !state.healthy; + } + } + + async function stopConversation() { + btnStop.disabled = true; + btnMic.disabled = true; + btnSend.disabled = true; + txt.disabled = true; + txt.value = ''; + try { + if (state.sessionId) { + await api('/api/v1/agent/stop', { + method: 'POST', + body: { session_id: state.sessionId }, + }); + } + } catch (e) { + console.warn('[stop] api error', e); + } + await safeExitRoom(); + state.sessionId = null; + state.taskId = null; + state.micEnabled = false; + updateMicUI(); + setAgentStatus('idle', 'idle'); + btnStart.disabled = !state.healthy; + } + + async function safeExitRoom() { + try { + if (state.trtcClient) { + await state.trtcClient.exitRoom(); + state.trtcClient.destroy(); + } + } catch (e) { + console.warn('exitRoom failed', e); + } finally { + state.trtcClient = null; + } + } + + // ====================== Mic toggle ====================== + async function toggleMic() { + if (!state.trtcClient) return; + state.micEnabled = !state.micEnabled; + try { + await state.trtcClient.updateLocalAudio({ mute: !state.micEnabled }); + // The user just turned the mic on and is about to speak → interrupt the AI's current TTS immediately (smart interruption) + if (state.micEnabled) sendInterrupt(); + } catch (e) { + console.error('toggleMic failed', e); + state.micEnabled = !state.micEnabled; // revert + } + 
updateMicUI(); + } + + // ====================== Text injection (client → AI bot) 
====================== + // Protocol reference: https://cloud.tencent.com/document/product/647/115412 + // - type:20000 = feed text to the AI bot as if the user had spoken it, triggering one round of LLM → TTS + subtitles + // the payload field name must be `message` (not text) + // - type:20001 = interrupt the AI's current TTS immediately (smart interruption when the user starts a new input) + // Sent end to end via TRTC sendCustomMessage(cmdId:2). + + function uuid() { + return 'xxxxxxxx-xxxx-4xxx-yxxx-xxxxxxxxxxxx'.replace(/[xy]/g, (c) => { + const r = (Math.random() * 16) | 0; + const v = c === 'x' ? r : (r & 0x3) | 0x8; + return v.toString(16); + }); + } + + async function sendCustomToAgent(message) { + if (!state.trtcClient) return false; + try { + await state.trtcClient.sendCustomMessage({ + cmdId: 2, + data: new TextEncoder().encode(JSON.stringify(message)).buffer, + }); + return true; + } catch (e) { + console.warn('sendCustomMessage failed', e); + return false; + } + } + + function sendInterrupt() { + return sendCustomToAgent({ + type: 20001, + sender: state.userId, + receiver: [state.agentUserId], + payload: { id: uuid(), timestamp: Date.now() }, + }); + } + + async function sendText() { + const text = (txt.value || '').trim(); + if (!text || !state.trtcClient) return; + txt.value = ''; + // Render the user bubble immediately + makeBubble('user', text); + + // 1) First interrupt the AI's current TTS (smart interruption) + sendInterrupt(); + + // 2) Wait briefly so the interrupt takes effect, then push the user turn to the AI + // On receiving type:20000 the AI bot skips ASR and goes straight to LLM → TTS + subtitle playback + setTimeout(() => { + sendCustomToAgent({ + type: 20000, + sender: state.userId, + receiver: [state.agentUserId], + payload: { + id: uuid(), + message: text, // ← field name required by the protocol (not text) + timestamp: Date.now(), + }, + }); + }, 120); + } + + // ====================== Bindings ====================== + 
btnRecheck.addEventListener('click', runHealthCheck); + btnStart.addEventListener('click', startConversation); + btnStop.addEventListener('click', stopConversation); + btnMic.addEventListener('click', toggleMic); + btnSend.addEventListener('click', sendText); + + // IME (Chinese input method) compatibility: pressing Space/Enter to commit pinyin should not trigger a send + // - compositionstart / compositionend track whether the IME is composing + // - 
event.isComposing is a standard W3C property + // - event.keyCode === 229 is the fallback signal for Enter during IME composition (older browsers) + let imeComposing = false; + txt.addEventListener('compositionstart', () => { imeComposing = true; }); + txt.addEventListener('compositionend', () => { imeComposing = false; }); + txt.addEventListener('keydown', (e) => { + if (e.key !== 'Enter') return; + if (imeComposing || e.isComposing || e.keyCode === 229) return; + e.preventDefault(); + sendText(); + }); + + runHealthCheck(); +})(); diff --git a/skills/trtc-ai-service/capabilities/conversation-core/web-demo/index.html b/skills/trtc-ai-service/capabilities/conversation-core/web-demo/index.html new file mode 100644 index 0000000..fef499d --- /dev/null +++ b/skills/trtc-ai-service/capabilities/conversation-core/web-demo/index.html @@ -0,0 +1,68 @@ + + + + + + TRTC Conversational AI + + + +
+
+ + TRTC Conversational AI +
+
+
+ + Tencent Cloud + -- +
+
+ + TRTC + -- +
+
+ + LLM + -- +
+ +
+
+ +
+
+
+

Voice Agent Demo

+

All three indicators must be green before starting.

+
    +
  • Click Start to join the room and bring up the AI agent.
  • +
  • Click Mic to talk; the AI will answer with voice + live captions.
  • +
  • Or type a message — the AI will reply by voice + caption too.
  • +
+
+
+
+ +
+ + + +
+ + +
+
+ idle +
+
+ + + + + + diff --git a/skills/trtc-ai-service/capabilities/conversation-core/web-demo/styles.css b/skills/trtc-ai-service/capabilities/conversation-core/web-demo/styles.css new file mode 100644 index 0000000..5e0e3b4 --- /dev/null +++ b/skills/trtc-ai-service/capabilities/conversation-core/web-demo/styles.css @@ -0,0 +1,136 @@ +:root { + --bg: #0e1116; + --panel: #161b22; + --border: #30363d; + --text: #e6edf3; + --muted: #8b949e; + --green: #3fb950; + --red: #f85149; + --yellow: #d29922; + --accent: #2f81f7; +} +* { box-sizing: border-box; } +html, body { height: 100%; margin: 0; } +body { + font-family: -apple-system, BlinkMacSystemFont, "PingFang SC", "Helvetica Neue", Arial, sans-serif; + background: var(--bg); + color: var(--text); + display: flex; + flex-direction: column; +} + +/* ---------------- Status bar ---------------- */ +.status-bar { + display: flex; + align-items: center; + justify-content: space-between; + padding: 12px 24px; + background: var(--panel); + border-bottom: 1px solid var(--border); +} +.brand { display: flex; align-items: center; gap: 8px; font-size: 15px; } +.brand .dot { width: 8px; height: 8px; border-radius: 50%; background: var(--accent); } +.indicators { display: flex; align-items: center; gap: 12px; } +.indicator { + display: flex; align-items: center; gap: 6px; + background: rgba(255,255,255,0.03); + padding: 6px 10px; border-radius: 999px; + border: 1px solid var(--border); + font-size: 13px; +} +.led { + width: 10px; height: 10px; border-radius: 50%; + display: inline-block; + box-shadow: 0 0 6px currentColor; +} +.led-unknown { background: var(--muted); color: var(--muted); } +.led-ok { background: var(--green); color: var(--green); } +.led-fail { background: var(--red); color: var(--red); } +.led-pending { background: var(--yellow); color: var(--yellow); animation: blink 1s infinite; } +@keyframes blink { 50% { opacity: .35; } } +.indicator .latency { color: var(--muted); font-variant-numeric: tabular-nums; min-width: 
38px; text-align: right; } + +/* ---------------- Conversation ---------------- */ +main { flex: 1; overflow: auto; padding: 24px; } +.conversation { max-width: 880px; margin: 0 auto; } +.placeholder { + border: 1px dashed var(--border); + border-radius: 12px; + padding: 32px; + color: var(--muted); +} +.placeholder h1 { color: var(--text); margin-top: 0; } +.hint-list { line-height: 1.8; padding-left: 20px; } +.hint-list em { color: var(--accent); font-style: normal; } + +.bubble { + max-width: 70%; + padding: 10px 14px; + border-radius: 14px; + margin: 8px 0; + white-space: pre-wrap; + word-break: break-word; + line-height: 1.5; + font-size: 14px; +} +.bubble.user { + background: var(--accent); + margin-left: auto; + border-bottom-right-radius: 4px; +} +.bubble.agent { + background: var(--panel); + border: 1px solid var(--border); + border-bottom-left-radius: 4px; +} + +/* ---------------- Control bar ---------------- */ +.control-bar { + display: flex; + flex-wrap: wrap; + align-items: center; + gap: 12px; + padding: 12px 24px; + border-top: 1px solid var(--border); + background: var(--panel); +} +.btn { + border: 1px solid var(--border); + background: transparent; + color: var(--text); + padding: 8px 16px; + border-radius: 8px; + font-size: 14px; + cursor: pointer; + transition: opacity .15s, background .15s; + display: inline-flex; align-items: center; gap: 6px; +} +.btn:disabled { opacity: .4; cursor: not-allowed; } +.btn-primary { background: var(--accent); border-color: var(--accent); } +.btn-danger { background: var(--red); border-color: var(--red); } +.btn-ghost:hover:not(:disabled) { background: rgba(255,255,255,.05); } +.btn-active { background: var(--red); border-color: var(--red); color: #fff; } + +.text-input-group { flex: 1; display: flex; gap: 8px; min-width: 240px; } +.text-input-group input { + flex: 1; + padding: 8px 12px; + border-radius: 8px; + border: 1px solid var(--border); + background: var(--bg); + color: var(--text); + font-size: 14px; 
+} +.text-input-group input:focus { border-color: var(--accent); outline: none; } + +.agent-status { font-size: 12px; color: var(--muted); } +.agent-state { + padding: 4px 10px; + border-radius: 999px; + border: 1px solid var(--border); + text-transform: lowercase; +} +.agent-state-idle { color: var(--muted); } +.agent-state-listening { color: var(--green); border-color: var(--green); } +.agent-state-thinking { color: var(--yellow); border-color: var(--yellow); animation: blink 1.2s infinite; } +.agent-state-speaking { color: var(--accent); border-color: var(--accent); } diff --git a/skills/trtc-ai-service/capabilities/digital-human/README.md b/skills/trtc-ai-service/capabilities/digital-human/README.md new file mode 100644 index 0000000..1a24f8b --- /dev/null +++ b/skills/trtc-ai-service/capabilities/digital-human/README.md @@ -0,0 +1,31 @@ +# digital-human · Digital Human Capability (Placeholder) + +> Phase 2 only declares the interface contract. Rendering / lip-sync / expression driving +> and other rendering layer implementations are deferred to future iterations (Phase 3+). + +## Current Capabilities + +- Register placeholder REST endpoints via manifest: `/api/v1/digital-human/*` +- Does not modify skeleton runtime behavior; only serves as an integration anchor for the future rendering layer + +## REST Placeholders + +| Method | Path | Behavior | +|:---|:---|:---| +| GET | `/api/v1/digital-human/status` | Returns current status / roadmap | +| POST | `/api/v1/digital-human/render` | Always returns `501 Not Implemented` | + +## Roadmap + +1. Integrate third-party rendering SDKs (Avatar / Lipsync / Expression) +2. Push rendering driver data via WebRTC datachannel +3. 
Align frame output with conversation-core TTS + +## Configuration + +| Env Variable | Default | Description | +|:---|:---|:---| +| `DH_ENABLED` | `false` | Keep false before real enablement to avoid accidental use | +| `DH_AVATAR_ID` | _(empty)_ | Avatar ID | +| `DH_LIPSYNC_PROVIDER` | `tencent-cloud-vmp` | Lip-sync provider | +| `DH_EXPRESSION_PROVIDER` | `internal-rule` | Expression driver provider | diff --git a/skills/trtc-ai-service/capabilities/digital-human/manifest.yaml b/skills/trtc-ai-service/capabilities/digital-human/manifest.yaml new file mode 100644 index 0000000..abb7bbe --- /dev/null +++ b/skills/trtc-ai-service/capabilities/digital-human/manifest.yaml @@ -0,0 +1,64 @@ +# digital-human capability self-describing manifest +# Type: capability (reserved placeholder) +# +# Phase 2 goal: only declare manifest, provide "integration interface contract"; +# actual rendering / lip-sync / expression driving etc. deferred to future iterations. +# After injection into skeleton, this package does no code injection, only exposes REST endpoint stubs. + +name: "digital-human" +version: "0.1.0" +type: "capability" +description: "Digital human capability placeholder (rendering / lip-sync / expression driving). 
Only declares interface contract; rendering layer not enabled" + +dependencies: + - name: "conversation-core" + version: ">=1.0.0,<2.0.0" + +extensions: + - inject_at: "server.router_extension" + inline_code: | + # [digital-human] mount sub-router + from ._capability_loader import try_load_capability as _try_load_capability + _dh_router_mod = _try_load_capability("digital-human", "src/router.py") + if _dh_router_mod is not None and hasattr(_dh_router_mod, "router"): + app.include_router( + _dh_router_mod.router, prefix="/api/v1/digital-human", tags=["digital-human"] + ) + +config: + enabled: + description: "Whether to enable (stub implementation, defaults to false to avoid accidental use)" + default: false + avatar_id: + description: "Default digital human avatar ID (used when integrating third-party rendering service)" + default: "" + lipsync_provider: + description: "Lip-sync provider placeholder" + default: "tencent-cloud-vmp" + expression_provider: + description: "Expression driving provider placeholder" + default: "internal-rule" + +endpoints: + - method: GET + path: /api/v1/digital-human/status + description: Returns placeholder status and future plans + - method: POST + path: /api/v1/digital-human/render + description: Rendering interface contract (currently returns 501 Not Implemented) + +integration: + mode: "manual" + fallback: + manual_api: + rest_endpoint: "/api/v1/digital-human" + sdk_packages: [] + +security: + log_redaction: + enabled: true + patterns: ["api_key", "token"] + +acceptance: + - "Manifest passes manifest_resolver validation (dependencies / injection points valid)" + - "REST stub endpoints return explicit 'not implemented' marker" diff --git a/skills/trtc-ai-service/capabilities/digital-human/src/__init__.py b/skills/trtc-ai-service/capabilities/digital-human/src/__init__.py new file mode 100644 index 0000000..508693f --- /dev/null +++ b/skills/trtc-ai-service/capabilities/digital-human/src/__init__.py @@ -0,0 +1,2 @@ +"""digital-human 
capability placeholder. Phase 2 only declares interface contract.""" +__version__ = "0.1.0" diff --git a/skills/trtc-ai-service/capabilities/digital-human/src/router.py b/skills/trtc-ai-service/capabilities/digital-human/src/router.py new file mode 100644 index 0000000..325e315 --- /dev/null +++ b/skills/trtc-ai-service/capabilities/digital-human/src/router.py @@ -0,0 +1,43 @@ +"""digital-human FastAPI placeholder router. + +Interface contract fixed: +- GET /status Returns placeholder status + roadmap +- POST /render Returns 501 Not Implemented, deferred to future iterations +""" +from __future__ import annotations + +import os + +from fastapi import APIRouter, HTTPException + +router = APIRouter() + + +@router.get("/status") +def status() -> dict: + return { + "code": 0, + "data": { + "enabled": os.getenv("DH_ENABLED", "false").lower() == "true", + "avatar_id": os.getenv("DH_AVATAR_ID", ""), + "lipsync_provider": os.getenv("DH_LIPSYNC_PROVIDER", "tencent-cloud-vmp"), + "expression_provider": os.getenv("DH_EXPRESSION_PROVIDER", "internal-rule"), + "phase": "placeholder", + "roadmap": [ + "Phase 3+: Integrate third-party rendering SDK (avatar / lipsync / expression)", + "Support WebRTC datachannel driver data push", + ], + }, + } + + +@router.post("/render") +def render() -> dict: + raise HTTPException( + status_code=501, + detail={ + "code": "not_implemented", + "message": "digital-human render is a placeholder; rendering layer not shipped in Phase 2", + "hint": "follow capabilities/digital-human/README.md for integration roadmap", + }, + ) diff --git a/skills/trtc-ai-service/capabilities/human-handoff/INTERFACE_ADAPT.md b/skills/trtc-ai-service/capabilities/human-handoff/INTERFACE_ADAPT.md new file mode 100644 index 0000000..467fcb2 --- /dev/null +++ b/skills/trtc-ai-service/capabilities/human-handoff/INTERFACE_ADAPT.md @@ -0,0 +1,353 @@ +# human-handoff Interface Adaptation SOP + +> When the user's existing ticket / agent dispatch system interface differs from 
this capability's default contract, follow this document for scenario-specific operations. +> Recommended: use `python scripts/contract-adapt.py human-handoff` for automated generation; this document is the manual fallback. + +--- + +## 1. Default Contract Overview + +This capability **calls** the user's ticket system interfaces (outbound) and exposes one callback for the ticket system to call (inbound): + +| Contract | Method | Path | Purpose | +|---|---|---|---| +| `ticket.create` | POST | `/tickets` | Create ticket | +| `ticket.status_query` | GET | `/tickets/{ticket_id}` | Query ticket status | +| `ticket.cancel` | POST | `/tickets/{ticket_id}/cancel` | Cancel ticket | +| `ticket.status_callback` | POST | `/api/v1/handoff/callback/ticket-status` | Business callback (inbound) | + +Full field definitions are in `manifest.yaml` under `business_contract.external_apis`. + +--- + +## 2. Three-Layer Defense Mechanism + +| Layer | Differences Covered | Applicable Scenario | +|---|---|---| +| **L1 Field Mapping** | Field name / simple type differences only | 90% of common cases | +| **L2 Adapter Subclass** | Auth, transport headers, error codes, URL template differences | Different auth mechanism / path/routing style | +| **L3 Full Custom Implementation** | Protocol-level differences (webhook / MQ / gRPC) | Non-REST protocols | + +All three layers land in `capabilities/human-handoff/src/adapters/user_custom.py` and are enabled via `HH_ADAPTER=user_custom`. + +--- + +## 3. 
L1 Field Mapping (Most Common) + +### 3.1 Applicability + +- User interface is still REST + JSON +- Only field name / field path differences (within `adapter_slots` scope) +- Field types are consistent (string ↔ string, int ↔ int) + +### 3.2 Steps + +**Step 1**: Paste the user's curl or OpenAPI + +```bash +# User's ticket creation interface +curl -X POST https://crm.example.com/api/v2/work_orders \ + -H 'X-Auth-Token: xxx' \ + -d '{ + "customer_id": "u001", + "title": "Refund issue", + "level": "P2", + "messages": ["..."] + }' +# Response: { "id": "WO123", "rank": 5, "wait_estimate": 150 } +``` + +**Step 2**: Write mapping file `capabilities/human-handoff/src/adapters/user_custom_mapping.yaml` + +```yaml +# Field path mapping: left = default contract field, right = user's actual field +ticket.create: + request: + user_id: customer_id + subject: title + priority: level # Value mapping below + transcript: messages + response: + ticket_id: id + queue_position: rank + eta_seconds: wait_estimate + # Enum value mapping + enum_map: + request.priority: + low: P3 + normal: P2 + high: P1 + urgent: P0 + +ticket.status_query: + request: + ticket_id: id + response: + ticket_id: id + status: state + enum_map: + response.status: + pending: queued + processing: in_progress + closed: done + canceled: cancelled +``` + +**Step 3**: Generate adapter (via tool) + +```bash +python scripts/contract-adapt.py human-handoff \ + --base-url https://crm.example.com \ + --auth-header "X-Auth-Token" \ + --mapping capabilities/human-handoff/src/adapters/user_custom_mapping.yaml +``` + +The tool generates `user_custom.py` based on the mapping, automatically inheriting `DefaultRestHandoffClient` and overriding field mapping logic. + +**Step 4**: Enable + +```bash +export HH_ADAPTER=user_custom +export HH_REST_BASE_URL=https://crm.example.com +export HH_REST_TOKEN= # Optional; leave empty if no token +``` + +--- + +## 4. 
L2 Adapter Subclass (Auth / Path Style Differences) + +### 4.1 Applicability + +- Auth method is not Bearer (e.g. `X-Auth-Token`, `HMAC-SHA256` signature, dual token) +- Different path templates (e.g. `/tickets/{id}` vs `/work-orders/by-id/{id}`) +- Error codes are not HTTP standard (e.g. returns 200 but body has `code != 0`) + +### 4.2 Template Code + +```python +# capabilities/human-handoff/src/adapters/user_custom.py +from typing import List, Optional + +from ..core.models import Ticket, TicketStatus +from .default_rest import DefaultRestHandoffClient + + +class UserCustomHandoffClient(DefaultRestHandoffClient): + """User ticket system adapter (L2).""" + + def _headers(self) -> dict: + # Override auth method + h = {"Content-Type": "application/json"} + if self._token: + h["X-Auth-Token"] = self._token # Not Bearer + return h + + def create_ticket( + self, + *, + user_id: str, + subject: str = "", + description: str = "", + priority: str = "normal", + transcript: Optional[List[str]] = None, + ) -> Ticket: + # TODO Field remapping + payload = { + "customer_id": user_id, + "title": subject, + "level": {"low": "P3", "normal": "P2", "high": "P1", "urgent": "P0"}[priority], + "messages": list(transcript or []), + } + data = self._post("/api/v2/work_orders", payload) + return Ticket( + ticket_id=str(data["id"]), + user_id=user_id, + subject=subject, + description=description, + priority=priority, + queue_position=int(data.get("rank", 0)), + eta_seconds=int(data.get("wait_estimate", 0)), + transcript=list(transcript or []), + ) + + def query_status(self, ticket_id: str) -> Optional[TicketStatus]: + # TODO Path template remapping + data = self._get(f"/api/v2/work_orders/by-id/{ticket_id}", optional=True) + if data is None: + return None + # TODO Status enum remapping + status_map = {"queued": "pending", "in_progress": "processing", "done": "closed"} + return TicketStatus( + ticket_id=str(data["id"]), + status=status_map.get(data.get("state", ""), data.get("state", 
"pending")), + agent_id=data.get("operator"), + ) + + +def from_env() -> Optional["UserCustomHandoffClient"]: + import os + base = os.getenv("HH_REST_BASE_URL") + if not base: + return None + return UserCustomHandoffClient( + base_url=base, + token=os.getenv("HH_REST_TOKEN"), + timeout_ms=int(os.getenv("HH_REST_TIMEOUT_MS", "5000")), + ) +``` + +### 4.3 Enable + +```bash +export HH_ADAPTER=user_custom +export HH_REST_BASE_URL=https://crm.example.com +export HH_REST_TOKEN= +``` + +--- + +## 5. L3 Full Custom (Protocol Differences) + +### 5.1 Applicability + +- Business side uses webhooks (you push messages; business side async callbacks) +- Business side uses message queues (Kafka / RocketMQ / RabbitMQ) +- Business side uses gRPC / gRPC-Web +- User system is fully custom; no "generic ticket interface" concept + +### 5.2 Template Code + +```python +# capabilities/human-handoff/src/adapters/user_custom.py +from typing import List, Optional + +from ..core.models import OverallStatus, Ticket, TicketStatus, TicketStatusEnum, now_ts +from ..ports.handoff_client import HandoffClient + + +class UserCustomHandoffClient(HandoffClient): + """User custom protocol adapter (L3: directly implements HandoffClient).""" + + def __init__(self, **kwargs): + # TODO Initialize your client: Kafka producer / gRPC channel / webhook poster etc. + ... + + def create_ticket(self, *, user_id, subject="", description="", + priority="normal", transcript=None) -> Ticket: + # TODO Send ticket creation message using your own protocol + # e.g.: self._kafka.send("ticket.create", {...}) + ticket_id = Ticket.new_id() + return Ticket( + ticket_id=ticket_id, + user_id=user_id, + subject=subject, + description=description, + priority=priority, + status=TicketStatusEnum.PENDING.value, + transcript=list(transcript or []), + created_at=now_ts(), + ) + + def query_status(self, ticket_id: str) -> Optional[TicketStatus]: + # TODO Query status from your storage / API + ... 
+ + def cancel_ticket(self, ticket_id: str, reason: str = "") -> Optional[Ticket]: + # TODO + ... + + def overall_status(self) -> OverallStatus: + return OverallStatus( + agent_pool_size=-1, available_agents=-1, waiting=-1, connected=-1, capacity=-1 + ) + + def list_tickets(self, *, limit=50, status=None) -> List[Ticket]: + # Dashboard may use this; return empty if remote backend doesn't support enumeration + return [] + + +def from_env(): + import os + return UserCustomHandoffClient( + broker=os.getenv("HH_BROKER_URL", ""), + ) +``` + +--- + +## 6. Inbound Callback Integration (`ticket.status_callback`) + +If the user's ticket system supports proactive callbacks, enabling inbound mode is recommended: + +### 6.1 Our Exposed Callback Endpoint + +``` +POST /api/v1/handoff/callback/ticket-status +Content-Type: application/json +{ + "ticket_id": "WO123", + "status": "processing", + "agent_id": "alice" +} +``` + +Returns `{"code": 0, "message": "ok"}`. + +> **Note**: `router.py` in this release does **not implement** this inbound endpoint. To use inbound mode, register and implement a FastAPI route yourself in `user_custom.py`, or wait for `contract-adapt.py` to generate it automatically in Phase 4. + +### 6.2 Inbound Field Mapping (different callback field names) + +If the user system's callback uses field names like `id` / `state` / `operator`, add an inbound mapping in `user_custom.py`: + +```python +# Register the callback endpoint in the router and call this function to convert the payload +def _map_inbound(payload: dict) -> dict: + return { + "ticket_id": payload.get("id") or payload.get("ticket_id"), + # Unmapped states fall back to the default-contract `status`, then to the raw `state` + "status": {"queued": "pending", "in_progress": "processing"}.get( + payload.get("state"), payload.get("status") or payload.get("state") + ), + "agent_id": payload.get("operator") or payload.get("agent_id"), + } +``` + +--- + +## 7. 
Switch / Verify + +### 7.1 Enable user_custom + +```bash +export HH_ADAPTER=user_custom +# Takes effect after service restart +``` + +### 7.2 Unit Self-Check + +```bash +python -c " +from capabilities.human_handoff.src.adapters.factory import build_default +c = build_default() +print('adapter:', type(c).__name__) +t = c.create_ticket(user_id='u_test', subject='ping') +print('created:', t.to_dict()) +print('queried:', c.query_status(t.ticket_id)) +" +``` + +### 7.3 End-to-End + +```bash +curl -X POST http://localhost:3000/api/v1/handoff/request \ + -H 'Content-Type: application/json' \ + -d '{"session_id":"u_test","reason":"I want to complain"}' +``` + +--- + +## 8. Security Checklist + +- [ ] `HH_REST_BASE_URL` must use https:// (localhost excepted) +- [ ] Default reject private network addresses (9.* / 10.* / 172.16-31.* / 192.168.* / 169.254.*) +- [ ] Auth token only from environment variables — **no hardcoding** in user_custom.py +- [ ] Remote exceptions do not print response bodies (may contain PII) +- [ ] `Authorization` / `X-Auth-Token` headers auto-redacted in logs (handled by skeleton `log_redaction`) diff --git a/skills/trtc-ai-service/capabilities/human-handoff/README.md b/skills/trtc-ai-service/capabilities/human-handoff/README.md new file mode 100644 index 0000000..0041585 --- /dev/null +++ b/skills/trtc-ai-service/capabilities/human-handoff/README.md @@ -0,0 +1,44 @@ +# human-handoff · Handoff + Queue Status Sync + +> Provides semantic-triggered human handoff + queue status sync + agent connection capabilities on top of conversation-core. 
+ + ## Install + + ```bash + python scripts/add-capability.py human-handoff + ``` + + ## Configuration + + | Env Variable | Default | Description | + |:---|:---|:---| + | `HH_TRIGGERS` | See below | Strong trigger keywords, CSV | + | `HH_INTENT_KEYWORDS` | See below | Weak intent keywords, CSV | + | `HH_QUEUE_CAPACITY` | 50 | Queue capacity | + | `HH_AGENT_POOL_SIZE` | 1 | Available agent count | + | `HH_WAIT_PER_SLOT` | 30 | Estimated wait seconds per slot | + + Default strong triggers: `talk to agent / real person / human support` + Default weak intent keywords: `complain / manager / supervisor / not working` (ignored when the message contains a negation) + + ## REST API + + | Method | Path | Purpose | + |:---|:---|:---| + | GET | `/api/v1/handoff/status` | Overall queue status | + | GET | `/api/v1/handoff/{session_id}` | Single session status | + | POST | `/api/v1/handoff/request` | Explicit handoff request | + | POST | `/api/v1/handoff/connect` | Simulate agent connection | + | POST | `/api/v1/handoff/cancel` | Cancel request | + + ## State Machine + + ``` + idle ──request──▶ waiting ──connect──▶ connected + │ ▲ │ + │ │ ▼ + cancel/timeout cancel + ``` + + Integrators should poll `/api/v1/handoff/status` or `/api/v1/handoff/{session_id}` from their agent system to keep queue status in sync.
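One way an integrator might keep its agent system in sync is a small polling loop against the single-session endpoint. The sketch below is illustrative only: the `state` response field is an assumption (the README does not document the response body), the state names are taken from the state machine diagram above, and `fetch_session_status` / `wait_until_connected` are hypothetical helper names, not part of this capability.

```python
import json
import time
import urllib.parse
import urllib.request
from typing import Callable, Optional


def fetch_session_status(base_url: str, session_id: str, timeout: float = 3.0) -> dict:
    """GET /api/v1/handoff/{session_id} and decode the JSON body."""
    safe_id = urllib.parse.quote(session_id, safe="")
    url = f"{base_url.rstrip('/')}/api/v1/handoff/{safe_id}"
    with urllib.request.urlopen(url, timeout=timeout) as resp:
        return json.loads(resp.read().decode("utf-8"))


def wait_until_connected(
    fetch: Callable[[str], dict],
    session_id: str,
    *,
    interval: float = 2.0,
    max_polls: int = 30,
    sleep: Callable[[float], None] = time.sleep,
) -> Optional[dict]:
    """Poll until the session reports "connected".

    Returns the final status dict, or None if the session leaves the
    waiting state without connecting or the poll budget runs out.
    """
    for attempt in range(max_polls):
        status = fetch(session_id)
        # ASSUMPTION: the field name "state" is hypothetical; adjust it to
        # whatever the endpoint actually returns.
        state = status.get("state")
        if state == "connected":
            return status
        if state != "waiting":
            # idle (never requested, canceled, or timed out) or unknown
            return None
        if attempt < max_polls - 1:
            sleep(interval)
    return None
```

A caller would bind the base URL once, for example `wait_until_connected(lambda sid: fetch_session_status("http://localhost:3000", sid), "u_test")`; the host and port here are placeholders. Passing `fetch` and `sleep` as parameters keeps the loop testable without a running server.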
+ diff --git a/skills/trtc-ai-service/capabilities/human-handoff/manifest.yaml b/skills/trtc-ai-service/capabilities/human-handoff/manifest.yaml new file mode 100644 index 0000000..4898998 --- /dev/null +++ b/skills/trtc-ai-service/capabilities/human-handoff/manifest.yaml @@ -0,0 +1,227 @@ +# human-handoff capability self-describing manifest +# Type: capability (optional install) + +name: "human-handoff" +version: "1.1.0" +type: "capability" +description: "Semantic trigger for human handoff + ticket lifecycle (Phase 3 refactored to ports/adapters/core three-layer architecture)" + +dependencies: + - name: "conversation-core" + version: ">=1.0.0,<2.0.0" + +# --------------------------------------------------------------------------- +# Injection points (maintains full Phase 2 compatibility; trigger.py / queue.py are now facades) +# --------------------------------------------------------------------------- +extensions: + # Detect whether to trigger "handoff" before text injection; if matched, route to human channel + - inject_at: "agent.before_push_text" + inline_code: | + # [human-handoff] semantic trigger detection + from ._capability_loader import try_load_capability + _hh_trigger = try_load_capability("human-handoff", "src/trigger.py") + if _hh_trigger is not None: + _hand = _hh_trigger.maybe_handoff(session_id, text) + if _hand is not None: + text = _hand # Replace original text with queue/connect messaging to prevent LLM self-answering + # Register the session with the handoff state machine after the agent starts + - inject_at: "agent.after_start" + inline_code: | + # [human-handoff] register session for handoff state machine + from ._capability_loader import try_load_capability + _hh_queue = try_load_capability("human-handoff", "src/queue.py") + if _hh_queue is not None: + _hh_queue.attach_session(session_id, info=info if 'info' in locals() else None) + - inject_at: "server.router_extension" + inline_code: | + # [human-handoff] mount sub-router + from ._capability_loader import try_load_capability
as _try_load_capability + _hh_router_mod = _try_load_capability("human-handoff", "src/router.py") + if _hh_router_mod is not None and hasattr(_hh_router_mod, "router"): + app.include_router( + _hh_router_mod.router, prefix="/api/v1/handoff", tags=["human-handoff"] + ) + +# --------------------------------------------------------------------------- +# Configuration +# --------------------------------------------------------------------------- +config: + triggers: + description: "Strong trigger keyword list (literal substring match, case-insensitive)" + default: + - "talk to agent" + - "human support" + - "real person" + - "speak to a human" + intent_keywords: + description: "Weak intent keywords (ignored when the text contains a negation)" + default: + - "complain" + - "manager" + - "supervisor" + - "not working" + queue_capacity: + description: "Local queue capacity (effective only for local_queue / mock adapter)" + default: 50 + agent_pool_size: + description: "Number of available agents (auto-connect when >0)" + default: 1 + estimated_wait_seconds_per_slot: + default: 30 + retention_seconds: + description: "Retention duration for completed sessions (used for summary queries)" + default: 600 + adapter: + description: "Which HandoffClient implementation to use: local_queue | mock | default_rest | user_custom" + default: "local_queue" + env: "HH_ADAPTER" + +# --------------------------------------------------------------------------- +# API endpoints +# --------------------------------------------------------------------------- +endpoints: + - method: GET + path: /api/v1/handoff/status + description: Query overall queue status + - method: GET + path: /api/v1/handoff/{session_id} + description: Query handoff status for a single session (legacy field format, backward compatible) + - method: POST + path: /api/v1/handoff/request + description: "Explicit handoff request submission (body shape: {session_id, reason?})" + - method: POST + path: /api/v1/handoff/connect +
description: "Simulate agent connection (body shape: {session_id, agent_id})" + - method: POST + path: /api/v1/handoff/cancel + description: "Cancel queue (body shape: {session_id})" + # ----- Phase 3 new: Ticket agent dashboard endpoints ----- + - method: GET + path: /api/v1/handoff/admin/tickets + description: "Ticket list (supports ?status=&limit= filtering)" + - method: GET + path: /api/v1/handoff/admin/tickets/{ticket_id} + description: "Ticket details" + - method: POST + path: /api/v1/handoff/admin/tickets/{ticket_id}/status + description: "Agent manually switches ticket status (pending|processing|closed|canceled|timeout)" + +# --------------------------------------------------------------------------- +# Business contract (Phase 3 new; follows references/business-contract-spec.md v1.0) +# --------------------------------------------------------------------------- +business_contract: + port_class: "src.ports.handoff_client.HandoffClient" + default_adapter: "src.adapters.local_queue.LocalQueueHandoffClient" + mock_adapter: "src.adapters.mock.MockHandoffClient" + customization_sop: "INTERFACE_ADAPT.md" + external_apis: + - name: ticket.create + direction: outbound + method: POST + path: /tickets + description: "Create a new ticket in the ticketing system when user triggers handoff" + request_schema: + user_id: string + subject: string + description: string + priority: enum[low, normal, high, urgent] + transcript: string[] + response_schema: + ticket_id: string + queue_position: int + eta_seconds: int + adapter_slots: + - request.subject + - request.priority + - response.ticket_id + - response.queue_position + - response.eta_seconds + auth: + type: bearer + location: header + name: Authorization + timeout_ms: 5000 + + - name: ticket.status_query + direction: outbound + method: GET + path: /tickets/{ticket_id} + description: "Poll ticket status for queue progress updates" + request_schema: + ticket_id: string + response_schema: + ticket_id: string + status: 
enum[pending, processing, closed, canceled] + agent_id: string + updated_at: int + adapter_slots: + - response.status + - response.agent_id + timeout_ms: 3000 + + - name: ticket.cancel + direction: outbound + method: POST + path: /tickets/{ticket_id}/cancel + description: "Notify ticketing system when user cancels handoff" + request_schema: + ticket_id: string + reason: string + response_schema: + ticket_id: string + canceled: bool + adapter_slots: + - request.reason + timeout_ms: 3000 + + - name: ticket.status_callback + direction: inbound + method: POST + path: /api/v1/handoff/callback/ticket-status + description: "Callback from ticketing system to notify status changes (optional; when disabled, status_query polling is used instead)" + request_schema: + ticket_id: string + status: enum[pending, processing, closed, canceled] + agent_id: string + response_schema: + code: int + message: string + adapter_slots: + - request.status + - request.agent_id + +# --------------------------------------------------------------------------- +# Integration +# --------------------------------------------------------------------------- +integration: + mode: "auto" + auto_adapters: + - tech_stack: ["react", "vue", "angular"] + adapter: "frontend-spa" + description: "Append queue status indicator component to SPA" + - tech_stack: ["express", "koa", "fastify", "next"] + adapter: "node-backend" + - tech_stack: ["flask", "fastapi", "django"] + adapter: "python-backend" + - tech_stack: ["spring-boot", "quarkus"] + adapter: "java-backend" + fallback: + guided_templates: + - "../../auto_adapters/integration_templates/generic-frontend.md" + manual_api: + rest_endpoint: "/api/v1/handoff" + +security: + log_redaction: + enabled: true + patterns: ["phone", "mobile", "user_id", "credential", "authorization"] + network: + enforce_https: true # default_rest adapter enforces HTTPS for non-localhost + +acceptance: + - "Keyword match immediately enters ticket flow, status pollable by frontend" + - 
"Connects instantly when agent pool > 0; returns estimated wait time when = 0" + - "Correctly releases slot when session is canceled / connected" + - "Switching HH_ADAPTER requires no business code changes (local_queue / mock / default_rest / user_custom)" + - "default_rest adapter rejects private network addresses (except localhost)" diff --git a/skills/trtc-ai-service/capabilities/human-handoff/src/__init__.py b/skills/trtc-ai-service/capabilities/human-handoff/src/__init__.py new file mode 100644 index 0000000..5fe55cb --- /dev/null +++ b/skills/trtc-ai-service/capabilities/human-handoff/src/__init__.py @@ -0,0 +1,2 @@ +"""human-handoff capability: semantic-triggered handoff + queue status sync.""" +__version__ = "1.0.0" diff --git a/skills/trtc-ai-service/capabilities/human-handoff/src/adapters/__init__.py b/skills/trtc-ai-service/capabilities/human-handoff/src/adapters/__init__.py new file mode 100644 index 0000000..ceecf7f --- /dev/null +++ b/skills/trtc-ai-service/capabilities/human-handoff/src/adapters/__init__.py @@ -0,0 +1,9 @@ +"""human-handoff adapter implementations.""" +from .factory import build_default, get_client, reset_client, set_client + +__all__ = [ + "build_default", + "get_client", + "reset_client", + "set_client", +] diff --git a/skills/trtc-ai-service/capabilities/human-handoff/src/adapters/default_rest.py b/skills/trtc-ai-service/capabilities/human-handoff/src/adapters/default_rest.py new file mode 100644 index 0000000..5b7eff3 --- /dev/null +++ b/skills/trtc-ai-service/capabilities/human-handoff/src/adapters/default_rest.py @@ -0,0 +1,242 @@ +"""DefaultRestHandoffClient — call external ticketing system per business_contract default contract. 
+ +Corresponding contracts: +- POST /tickets ticket.create +- GET /tickets/{ticket_id} ticket.status_query +- POST /tickets/{ticket_id}/cancel ticket.cancel + +Environment variables: +- HH_REST_BASE_URL Ticketing system base URL (required; must not point to private network, see §Security) +- HH_REST_TOKEN Bearer Token (optional) +- HH_REST_TIMEOUT_MS Timeout (default 5000) + +Security constraints (aligned with project security_rules): +- Only allow https:// or http://localhost / 127.0.0.1 +- Default reject common private network ranges (9.* / 10.* / 11.* / 21.* / 30.* / 169.254.* / 172.16-31.* / 192.168.*) +- Log redaction auto-masks Authorization + +Dependency: only requests (already in conversation-core/requirements.txt), no new deps. +""" +from __future__ import annotations + +import logging +import os +import re +from typing import List, Optional +from urllib.parse import urlparse + +try: + import requests # type: ignore +except ImportError: # pragma: no cover + requests = None # type: ignore + +from ..core.models import ( + OverallStatus, + Ticket, + TicketStatus, + TicketStatusEnum, + now_ts, +) +from ..ports.handoff_client import HandoffClient + + +logger = logging.getLogger(__name__) + + +# --------------------------------------------------------------------------- +# Security: private network and loopback check +# --------------------------------------------------------------------------- +_PRIVATE_PATTERNS = [ + re.compile(r"^9\."), + re.compile(r"^10\."), + re.compile(r"^11\."), + re.compile(r"^21\."), + re.compile(r"^30\."), + re.compile(r"^169\.254\."), + re.compile(r"^172\.(1[6-9]|2[0-9]|3[01])\."), + re.compile(r"^192\.168\."), +] + + +def _is_localhost(host: str) -> bool: + return host in {"localhost", "127.0.0.1", "::1"} + + +def _is_private(host: str) -> bool: + return any(p.match(host) for p in _PRIVATE_PATTERNS) + + +def _validate_base_url(url: str) -> str: + parsed = urlparse(url) + if parsed.scheme not in {"http", "https"}: + raise 
ValueError(f"unsupported scheme: {parsed.scheme}") + host = parsed.hostname or "" + if not host: + raise ValueError("empty host in HH_REST_BASE_URL") + # Non-HTTPS only allowed for localhost + if parsed.scheme == "http" and not _is_localhost(host): + raise ValueError( + "non-HTTPS HH_REST_BASE_URL only allowed for localhost" + ) + # Reject private network ranges (prevent SSRF) unless HH_REST_ALLOW_PRIVATE=1 is set + if _is_private(host) and os.getenv("HH_REST_ALLOW_PRIVATE") != "1": + raise ValueError( + f"access to private network host '{host}' is denied; " + "set HH_REST_ALLOW_PRIVATE=1 to override (not recommended)" + ) + return url.rstrip("/") + + +# --------------------------------------------------------------------------- +# Client implementation +# --------------------------------------------------------------------------- +class DefaultRestHandoffClient(HandoffClient): + """Call external ticketing system per default REST contract.""" + + def __init__( + self, + *, + base_url: str, + token: Optional[str] = None, + timeout_ms: int = 5000, + ) -> None: + if requests is None: + raise RuntimeError( + "requests library is required for DefaultRestHandoffClient" + ) + self._base = _validate_base_url(base_url) + self._token = token + self._timeout = max(0.5, timeout_ms / 1000.0) + self._session = requests.Session() + + # ------------------------------------------------------------------ + # HandoffClient required implementations + # ------------------------------------------------------------------ + def create_ticket( + self, + *, + user_id: str, + subject: str = "", + description: str = "", + priority: str = "normal", + transcript: Optional[List[str]] = None, + ) -> Ticket: + payload = { + "user_id": user_id, + "subject": subject, + "description": description, + "priority": priority or "normal", + "transcript": list(transcript or []), + } + data = self._post("/tickets", payload) + ticket_id = str(data.get("ticket_id") or "").strip() + if not ticket_id: + raise RuntimeError("remote ticket service did not return ticket_id") + return Ticket( +
ticket_id=ticket_id, + user_id=user_id, + subject=subject, + description=description, + priority=priority or "normal", + status=TicketStatusEnum.PENDING.value, + queue_position=int(data.get("queue_position") or 0), + eta_seconds=int(data.get("eta_seconds") or 0), + transcript=list(transcript or []), + reason=description[:128] if description else "", + created_at=now_ts(), + updated_at=now_ts(), + ) + + def query_status(self, ticket_id: str) -> Optional[TicketStatus]: + data = self._get(f"/tickets/{ticket_id}", optional=True) + if data is None: + return None + return TicketStatus( + ticket_id=str(data.get("ticket_id") or ticket_id), + status=str(data.get("status") or TicketStatusEnum.PENDING.value), + agent_id=data.get("agent_id"), + updated_at=float(data.get("updated_at") or 0.0) or None, + ) + + def cancel_ticket(self, ticket_id: str, reason: str = "") -> Optional[Ticket]: + data = self._post( + f"/tickets/{ticket_id}/cancel", + {"ticket_id": ticket_id, "reason": reason}, + optional=True, + ) + if data is None: + return None + return Ticket( + ticket_id=ticket_id, + user_id="", # remote may not return user_id + status=TicketStatusEnum.CANCELED.value, + reason=reason, + updated_at=now_ts(), + closed_at=now_ts(), + ) + + def overall_status(self) -> OverallStatus: + # Remote backend does not expose overall status; return placeholder (for /api/v1/handoff/status compatibility) + return OverallStatus( + agent_pool_size=-1, + available_agents=-1, + waiting=-1, + connected=-1, + capacity=-1, + ) + + # ------------------------------------------------------------------ + # Internal:HTTP + # ------------------------------------------------------------------ + def _headers(self) -> dict: + h = {"Content-Type": "application/json"} + if self._token: + h["Authorization"] = f"Bearer {self._token}" + return h + + def _get(self, path: str, *, optional: bool = False): + url = self._base + path + resp = self._session.get(url, headers=self._headers(), timeout=self._timeout) + return 
self._handle(resp, optional=optional) + + def _post(self, path: str, payload: dict, *, optional: bool = False): + url = self._base + path + resp = self._session.post( + url, + json=payload, + headers=self._headers(), + timeout=self._timeout, + ) + return self._handle(resp, optional=optional) + + @staticmethod + def _handle(resp, *, optional: bool): + if resp.status_code == 404 and optional: + return None + if resp.status_code >= 400: + # Do not print response body (may contain sensitive info) + raise RuntimeError( + f"remote ticket service returned HTTP {resp.status_code}" + ) + try: + data = resp.json() + except ValueError as exc: + raise RuntimeError("remote ticket service returned non-JSON") from exc + # Response may be {"data": {...}} or flat {...} + if isinstance(data, dict) and "data" in data and isinstance(data["data"], dict): + return data["data"] + return data + + +# --------------------------------------------------------------------------- +# Factory +# --------------------------------------------------------------------------- +def from_env() -> Optional[DefaultRestHandoffClient]: + base = os.getenv("HH_REST_BASE_URL") + if not base: + return None + return DefaultRestHandoffClient( + base_url=base, + token=os.getenv("HH_REST_TOKEN"), + timeout_ms=int(os.getenv("HH_REST_TIMEOUT_MS", "5000")), + ) diff --git a/skills/trtc-ai-service/capabilities/human-handoff/src/adapters/factory.py b/skills/trtc-ai-service/capabilities/human-handoff/src/adapters/factory.py new file mode 100644 index 0000000..ed9a664 --- /dev/null +++ b/skills/trtc-ai-service/capabilities/human-handoff/src/adapters/factory.py @@ -0,0 +1,89 @@ +"""adapter factory: selects HandoffClient implementation based on environment variable. 
+ +Environment variable `HH_ADAPTER`: + local_queue Default local in-memory queue (production-ready, zero dependencies) + mock Demo data (includes several preset tickets for video recording) + default_rest Call remote ticketing system per business_contract default contract + user_custom User integration wizard (contract-adapt.py) generated implementation + +When not set or invalid, fall back to local_queue (keeps behavior compatibility with Phase 2). +""" +from __future__ import annotations + +import logging +import os +from typing import Optional + +from ..ports.handoff_client import HandoffClient + + +logger = logging.getLogger(__name__) + + +_VALID = ("local_queue", "mock", "default_rest", "user_custom") + + +def _build(name: str) -> Optional[HandoffClient]: + if name == "local_queue": + from .local_queue import from_env as build_local + return build_local() + if name == "mock": + from .mock import from_env as build_mock + return build_mock() + if name == "default_rest": + from .default_rest import from_env as build_rest + c = build_rest() + if c is None: + logger.warning( + "HH_ADAPTER=default_rest but HH_REST_BASE_URL is empty; " + "falling back to local_queue" + ) + return c + if name == "user_custom": + try: + from .user_custom import from_env as build_custom # type: ignore + except ImportError: + logger.warning( + "HH_ADAPTER=user_custom but src/adapters/user_custom.py is missing; " + "run scripts/contract-adapt.py human-handoff to generate it" + ) + return None + return build_custom() + return None + + +def build_default() -> HandoffClient: + """Build default client from environment variables; invalid config falls back to local_queue.""" + name = (os.getenv("HH_ADAPTER") or "local_queue").strip().lower() + if name not in _VALID: + logger.warning("HH_ADAPTER=%s is not recognised; using local_queue", name) + name = "local_queue" + client = _build(name) + if client is None: + from .local_queue import from_env as build_local + client = build_local() + return 
client + + +# --------------------------------------------------------------------------- +# Global singleton +# --------------------------------------------------------------------------- +_singleton: Optional[HandoffClient] = None + + +def get_client() -> HandoffClient: + global _singleton + if _singleton is None: + _singleton = build_default() + return _singleton + + +def set_client(client: HandoffClient) -> None: + """For testing only: inject a custom client.""" + global _singleton + _singleton = client + + +def reset_client() -> None: + global _singleton + _singleton = None diff --git a/skills/trtc-ai-service/capabilities/human-handoff/src/adapters/local_queue.py b/skills/trtc-ai-service/capabilities/human-handoff/src/adapters/local_queue.py new file mode 100644 index 0000000..e7379d0 --- /dev/null +++ b/skills/trtc-ai-service/capabilities/human-handoff/src/adapters/local_queue.py @@ -0,0 +1,258 @@ +"""LocalQueueHandoffClient — default local implementation. + +Zero external dependencies, in-process queuing + agent allocation. Migrated from the original queue.py implementation as the "default out-of-the-box" version. + +Implementation notes: +- user_id also serves as ticket_id (keeping behavior consistent with old session_id) +- State machine: + PENDING ──connect──▶ PROCESSING + │ ▲ │ + │ │ ▼ + cancel/timeout cancel/close +- Single-process RLock protection; cross-process sync handled by integrator upper layer (e.g. 
Redis) +- Capacity and agent count read from environment variables; can be overridden via constructor +""" +from __future__ import annotations + +import os +import threading +from typing import Dict, List, Optional + +from ..core.models import ( + OverallStatus, + Ticket, + TicketStatus, + TicketStatusEnum, + now_ts, +) +from ..ports.handoff_client import HandoffClient + + +class LocalQueueHandoffClient(HandoffClient): + """In-process in-memory queuing HandoffClient implementation.""" + + def __init__( + self, + *, + capacity: int = 50, + agent_pool_size: int = 1, + estimated_wait_per_slot: int = 30, + ) -> None: + self._lock = threading.RLock() + self._tickets: Dict[str, Ticket] = {} # ticket_id -> Ticket + self._waiting: List[str] = [] # ticket_id list, FIFO + self._connected: Dict[str, str] = {} # ticket_id -> agent_id + self._capacity = int(capacity) + self._pool = int(agent_pool_size) + self._wait_per_slot = max(1, int(estimated_wait_per_slot)) + + # ------------------------------------------------------------------ + # HandoffClient required implementations + # ------------------------------------------------------------------ + def create_ticket( + self, + *, + user_id: str, + subject: str = "", + description: str = "", + priority: str = "normal", + transcript: Optional[List[str]] = None, + ) -> Ticket: + if not user_id: + raise ValueError("user_id is required") + with self._lock: + # Only one in-progress ticket per user; if exists, refresh position and return + existing = self._find_active_by_user(user_id) + if existing is not None: + if existing.status == TicketStatusEnum.PROCESSING.value: + return existing + self._refresh_position(existing) + return existing + + ticket_id = user_id # Compatible with old behavior: session_id is ticket_id + t = Ticket( + ticket_id=ticket_id, + user_id=user_id, + subject=subject, + description=description, + priority=priority or "normal", + transcript=list(transcript or []), + reason=description[:128] if description else "", 
+ created_at=now_ts(), + updated_at=now_ts(), + ) + + # Queue full and no available agents: mark TIMEOUT + if ( + len(self._waiting) >= self._capacity + and self._available_agents() == 0 + ): + t.status = TicketStatusEnum.TIMEOUT.value + t.closed_at = now_ts() + self._tickets[ticket_id] = t + return t + + t.status = TicketStatusEnum.PENDING.value + self._tickets[ticket_id] = t + self._waiting.append(ticket_id) + + # Auto-connect if an agent is available + if self._available_agents() > 0: + self._auto_connect() + self._refresh_position(t) + return t + + def query_status(self, ticket_id: str) -> Optional[TicketStatus]: + with self._lock: + t = self._tickets.get(ticket_id) + if t is None: + return None + return TicketStatus.from_ticket(t) + + def cancel_ticket(self, ticket_id: str, reason: str = "") -> Optional[Ticket]: + with self._lock: + t = self._tickets.get(ticket_id) + if t is None: + return None + if t.status == TicketStatusEnum.PROCESSING.value: + self._connected.pop(ticket_id, None) + self._waiting = [s for s in self._waiting if s != ticket_id] + t.status = TicketStatusEnum.CANCELED.value + t.reason = reason or t.reason + t.closed_at = now_ts() + t.updated_at = now_ts() + self._refresh_all_positions() + return t + + def overall_status(self) -> OverallStatus: + with self._lock: + return OverallStatus( + agent_pool_size=self._pool, + available_agents=self._available_agents(), + waiting=len(self._waiting), + connected=len(self._connected), + capacity=self._capacity, + ) + + # ------------------------------------------------------------------ + # Dashboard helper methods + # ------------------------------------------------------------------ + def list_tickets( + self, + *, + limit: int = 50, + status: Optional[str] = None, + ) -> List[Ticket]: + with self._lock: + items = list(self._tickets.values()) + if status: + items = [t for t in items if t.status == status] + items.sort( + key=lambda x: (x.created_at or 0.0), + reverse=True, + ) + return items[: max(1, 
int(limit))] + + def update_status( + self, + ticket_id: str, + status: str, + *, + agent_id: Optional[str] = None, + ) -> Optional[Ticket]: + try: + new_status = TicketStatusEnum(status).value + except ValueError as exc: + raise ValueError(f"invalid status: {status}") from exc + + with self._lock: + t = self._tickets.get(ticket_id) + if t is None: + return None + + old_status = t.status + t.status = new_status + t.updated_at = now_ts() + + if new_status == TicketStatusEnum.PROCESSING.value: + if old_status != TicketStatusEnum.PROCESSING.value: + if self._available_agents() <= 0 and ticket_id not in self._connected: + # Force connect (manual): open new slot outside agent pool + pass + t.agent_id = agent_id or t.agent_id or f"agent_{ticket_id[-4:]}" + self._connected[ticket_id] = t.agent_id + self._waiting = [s for s in self._waiting if s != ticket_id] + elif new_status in ( + TicketStatusEnum.CLOSED.value, + TicketStatusEnum.CANCELED.value, + TicketStatusEnum.TIMEOUT.value, + ): + t.closed_at = now_ts() + self._connected.pop(ticket_id, None) + self._waiting = [s for s in self._waiting if s != ticket_id] + elif new_status == TicketStatusEnum.PENDING.value: + self._connected.pop(ticket_id, None) + if ticket_id not in self._waiting: + self._waiting.append(ticket_id) + + self._refresh_all_positions() + if self._available_agents() > 0: + self._auto_connect() + return t + + def get_or_attach(self, user_id: str) -> Optional[Ticket]: + with self._lock: + return self._find_active_by_user(user_id) + + # ------------------------------------------------------------------ + # Internal + # ------------------------------------------------------------------ + def _available_agents(self) -> int: + return max(0, self._pool - len(self._connected)) + + def _find_active_by_user(self, user_id: str) -> Optional[Ticket]: + for t in self._tickets.values(): + if t.user_id == user_id and t.status in ( + TicketStatusEnum.PENDING.value, + TicketStatusEnum.PROCESSING.value, + ): + return t + 
return None + + def _auto_connect(self) -> None: + while self._waiting and self._available_agents() > 0: + tid = self._waiting.pop(0) + t = self._tickets.get(tid) + if t is None: + continue + t.status = TicketStatusEnum.PROCESSING.value + t.updated_at = now_ts() + t.agent_id = f"agent_auto_{int(t.updated_at)}" + self._connected[tid] = t.agent_id + + def _refresh_position(self, t: Ticket) -> None: + if ( + t.status == TicketStatusEnum.PENDING.value + and t.ticket_id in self._waiting + ): + pos = self._waiting.index(t.ticket_id) + 1 + t.queue_position = pos + t.eta_seconds = pos * self._wait_per_slot + else: + t.queue_position = 0 + t.eta_seconds = 0 + + def _refresh_all_positions(self) -> None: + for t in self._tickets.values(): + self._refresh_position(t) + + +# --------------------------------------------------------------------------- +# Factory: build default parameters from environment variables +# --------------------------------------------------------------------------- +def from_env() -> LocalQueueHandoffClient: + return LocalQueueHandoffClient( + capacity=int(os.getenv("HH_QUEUE_CAPACITY", "50")), + agent_pool_size=int(os.getenv("HH_AGENT_POOL_SIZE", "1")), + estimated_wait_per_slot=int(os.getenv("HH_WAIT_PER_SLOT", "30")), + ) diff --git a/skills/trtc-ai-service/capabilities/human-handoff/src/adapters/mock.py b/skills/trtc-ai-service/capabilities/human-handoff/src/adapters/mock.py new file mode 100644 index 0000000..2fc9245 --- /dev/null +++ b/skills/trtc-ai-service/capabilities/human-handoff/src/adapters/mock.py @@ -0,0 +1,132 @@ +"""MockHandoffClient — mock implementation for Recipe demo recording. + +Inherits LocalQueueHandoffClient, pre-seeds several sample tickets on startup so the agent dashboard has content immediately. 
+ +Differences from LocalQueueHandoffClient: +- Seeds example tickets at construction (one each for pending / processing / closed) +- Marks `is_mock = True`, making it easy to add a "demo data" watermark on the dashboard +- Demo data uses stable ticket_id prefix `demo_` for reproducible screenshots +""" +from __future__ import annotations + +from typing import List, Optional + +from ..core.models import Ticket, TicketStatusEnum, now_ts +from .local_queue import LocalQueueHandoffClient + + +class MockHandoffClient(LocalQueueHandoffClient): + """Mock implementation for demos.""" + + is_mock = True + + def __init__( + self, + *, + capacity: int = 50, + agent_pool_size: int = 2, + estimated_wait_per_slot: int = 30, + seed_demo_data: bool = True, + ) -> None: + super().__init__( + capacity=capacity, + agent_pool_size=agent_pool_size, + estimated_wait_per_slot=estimated_wait_per_slot, + ) + if seed_demo_data: + self._seed() + + def _seed(self) -> None: + """Seed demo data. Agent pool occupies 1 slot, leaving 1 free for live connection demos.""" + ts_base = now_ts() - 600 + + # 1) Closed ticket (10 minutes ago) + closed = Ticket( + ticket_id="demo_closed_001", + user_id="demo_user_001", + subject="Invoice header correction", + description="Shipped order needs invoice header correction", + priority="normal", + status=TicketStatusEnum.CLOSED.value, + agent_id="agent_alice", + transcript=[ + "User: Hi, I need to update the invoice header", + "AI: Which order is this regarding?", + "User: Order number SO20260601-0042", + "[handoff] User requested human agent", + "Agent Alice: Done. 
New invoice will be issued within 30 minutes.", + ], + reason="Invoice header correction", + created_at=ts_base, + updated_at=ts_base + 120, + closed_at=ts_base + 240, + ) + self._tickets[closed.ticket_id] = closed + + # 2) Processing ticket (occupies 1 agent slot) + processing = Ticket( + ticket_id="demo_processing_001", + user_id="demo_user_002", + subject="Return logistics issue", + description="Return not picked up for 5 days", + priority="high", + status=TicketStatusEnum.PROCESSING.value, + agent_id="agent_bob", + transcript=[ + "User: My return has been requested 5 days and no one has picked it up yet", + "AI: Let me check the logistics status for you...", + "AI: Sorry, the logistics API is temporarily unavailable for real-time info", + "[handoff] Escalating to human agent", + "Agent Bob: Hello, I am following up on your logistics issue", + ], + reason="Return logistics issue", + created_at=ts_base + 300, + updated_at=ts_base + 320, + ) + self._tickets[processing.ticket_id] = processing + self._connected[processing.ticket_id] = processing.agent_id # type: ignore[assignment] + + # 3) Pending ticket (FIFO queue head) + pending = Ticket( + ticket_id="demo_pending_001", + user_id="demo_user_003", + subject="Refund progress inquiry", + description="Refund applied 3 days ago not received", + priority="normal", + status=TicketStatusEnum.PENDING.value, + transcript=[ + "User: My refund from 3 days ago hasn't arrived yet", + "AI: Please provide your order number for lookup", + "User: Order SO20260605-0099", + "[handoff] Transfer to agent", + ], + reason="Refund progress", + created_at=ts_base + 540, + updated_at=ts_base + 540, + ) + self._tickets[pending.ticket_id] = pending + self._waiting.append(pending.ticket_id) + self._refresh_all_positions() + + def list_tickets( + self, + *, + limit: int = 50, + status: Optional[str] = None, + ) -> List[Ticket]: + # Mock mode defaults to reverse chrono by creation time, consistent with LocalQueue behavior + return 
super().list_tickets(limit=limit, status=status) + + +# --------------------------------------------------------------------------- +# Factory +# --------------------------------------------------------------------------- +def from_env() -> MockHandoffClient: + import os + + return MockHandoffClient( + capacity=int(os.getenv("HH_QUEUE_CAPACITY", "50")), + agent_pool_size=int(os.getenv("HH_AGENT_POOL_SIZE", "2")), + estimated_wait_per_slot=int(os.getenv("HH_WAIT_PER_SLOT", "30")), + seed_demo_data=os.getenv("HH_MOCK_SEED", "1") not in ("0", "false", "False"), + ) diff --git a/skills/trtc-ai-service/capabilities/human-handoff/src/core/__init__.py b/skills/trtc-ai-service/capabilities/human-handoff/src/core/__init__.py new file mode 100644 index 0000000..5eea74f --- /dev/null +++ b/skills/trtc-ai-service/capabilities/human-handoff/src/core/__init__.py @@ -0,0 +1,25 @@ +"""human-handoff core module.""" +from .intent_detector import IntentDetector, is_handoff_intent +from .models import ( + OverallStatus, + Ticket, + TicketStatus, + TicketStatusEnum, + now_ts, + to_legacy_state, +) +from .service import HandoffService, get_default_service, reset_default_service + +__all__ = [ + "HandoffService", + "IntentDetector", + "OverallStatus", + "Ticket", + "TicketStatus", + "TicketStatusEnum", + "get_default_service", + "is_handoff_intent", + "now_ts", + "reset_default_service", + "to_legacy_state", +] diff --git a/skills/trtc-ai-service/capabilities/human-handoff/src/core/intent_detector.py b/skills/trtc-ai-service/capabilities/human-handoff/src/core/intent_detector.py new file mode 100644 index 0000000..2bbe4c3 --- /dev/null +++ b/skills/trtc-ai-service/capabilities/human-handoff/src/core/intent_detector.py @@ -0,0 +1,75 @@ +"""Handoff intent detection: keyword strong matching + weak intent (with negative context recognition). + +Migrated from original trigger.py. This module does **not depend** on any adapter or global state; pure functions. 
+""" +from __future__ import annotations + +import os +import re +from typing import List + + +_DEFAULT_TRIGGERS = [ + "agent", "help", "support", + "real person", "talk to agent", "speak to a human", "human agent", +] +_DEFAULT_INTENT = ["complain", "manager", "unsatisfied", "escalate"] + + +def _csv_env(key: str, default: List[str]) -> List[str]: + raw = os.getenv(key) + if not raw: + return list(default) + return [item.strip() for item in raw.split(",") if item.strip()] + + +class IntentDetector: + """Determine whether input text expresses a "handoff" intent using regex.""" + + def __init__( + self, + *, + triggers: List[str] | None = None, + intent_keywords: List[str] | None = None, + ) -> None: + self._triggers = triggers if triggers is not None else _csv_env( + "HH_TRIGGERS", _DEFAULT_TRIGGERS + ) + self._intent = intent_keywords if intent_keywords is not None else _csv_env( + "HH_INTENT_KEYWORDS", _DEFAULT_INTENT + ) + self._triggers_re = re.compile( + "|".join(re.escape(k) for k in self._triggers), re.IGNORECASE + ) + self._intent_re = re.compile( + "|".join(re.escape(k) for k in self._intent), re.IGNORECASE + ) + self._negative_re = re.compile( + r"\b(not|don't|do not|no|never)\b", re.IGNORECASE + ) + + def is_handoff_intent(self, text: str) -> bool: + if not text or len(text) > 4096: + return False + if self._triggers_re.search(text): + return True + if self._intent_re.search(text) and not self._negative_re.search(text): + return True + return False + + +# --------------------------------------------------------------------------- +# Default singleton (keeps behavior consistent with old trigger.py; tests can manually construct new instances to override) +# --------------------------------------------------------------------------- +_default_detector: IntentDetector | None = None + + +def get_default_detector() -> IntentDetector: + global _default_detector + if _default_detector is None: + _default_detector = IntentDetector() + return _default_detector + + 
+def is_handoff_intent(text: str) -> bool: + return get_default_detector().is_handoff_intent(text) diff --git a/skills/trtc-ai-service/capabilities/human-handoff/src/core/models.py b/skills/trtc-ai-service/capabilities/human-handoff/src/core/models.py new file mode 100644 index 0000000..c0aa77c --- /dev/null +++ b/skills/trtc-ai-service/capabilities/human-handoff/src/core/models.py @@ -0,0 +1,163 @@ +"""human-handoff core models. + +Defines unified domain models: +- TicketStatusEnum Ticket status (aligned with business_contract.ticket.status_query.response.status) +- Ticket Complete ticket record (transport object between adapters) +- TicketStatus Lightweight status view (returned by status_query) +- OverallStatus Overall queue status (dashboard use) + +The core layer does not know any specific backend implementation; all adapters must use this module's data structures. +""" +from __future__ import annotations + +import time +import uuid +from dataclasses import dataclass, field +from enum import Enum +from typing import Any, Dict, List, Optional + + +class TicketStatusEnum(str, Enum): + """Ticket status enum. 
+ + Business semantics mapping: + - PENDING User has applied; not yet assigned an agent (equivalent to old HandoffState.WAITING) + - PROCESSING Agent assigned, processing (equivalent to old HandoffState.CONNECTED) + - CLOSED Agent closed ticket + - CANCELED User actively canceled + - TIMEOUT Timeout, no agent connected + """ + + PENDING = "pending" + PROCESSING = "processing" + CLOSED = "closed" + CANCELED = "canceled" + TIMEOUT = "timeout" + + +# Status name mapping for old API compatibility (HandoffState era) +_LEGACY_STATE_MAP = { + TicketStatusEnum.PENDING.value: "waiting", + TicketStatusEnum.PROCESSING.value: "connected", + TicketStatusEnum.CLOSED.value: "closed", + TicketStatusEnum.CANCELED.value: "canceled", + TicketStatusEnum.TIMEOUT.value: "timeout", +} + + +def to_legacy_state(status: str) -> str: + """Convert new TicketStatusEnum value back to old API state name.""" + if not status: + return "idle" + return _LEGACY_STATE_MAP.get(status, status) + + +@dataclass +class Ticket: + """Ticket record. Transport object between adapters. + + user_id and ticket_id default to the same value in LocalQueue implementation (using session_id), + REST implementation uses ticket_id returned by the business side. 
+ """ + + ticket_id: str + user_id: str + subject: str = "" + description: str = "" + priority: str = "normal" + status: str = TicketStatusEnum.PENDING.value + queue_position: int = 0 + eta_seconds: int = 0 + agent_id: Optional[str] = None + transcript: List[str] = field(default_factory=list) + reason: str = "" # Trigger reason summary (compatible with old field) + created_at: Optional[float] = None + updated_at: Optional[float] = None + closed_at: Optional[float] = None + extra: Dict[str, Any] = field(default_factory=dict) + + @staticmethod + def new_id() -> str: + return f"tk_{uuid.uuid4().hex[:12]}" + + def to_dict(self) -> dict: + return { + "ticket_id": self.ticket_id, + "session_id": self.user_id, + "user_id": self.user_id, + "subject": self.subject, + "description": self.description, + "priority": self.priority, + "status": self.status, + "queue_position": self.queue_position, + "eta_seconds": self.eta_seconds, + "agent_id": self.agent_id, + "transcript": list(self.transcript), + "reason": self.reason, + "created_at": self.created_at, + "updated_at": self.updated_at, + "closed_at": self.closed_at, + # Written at ticket creation by human-handoff - session-summary linkage (None if capability not installed) + "session_summary": self.extra.get("session_summary"), + # Written by HandoffService.submit_feedback (None if not yet rated) + "feedback": self.extra.get("feedback"), + } + + def to_legacy_dict(self) -> dict: + """Field format returned by old REST API (/api/v1/handoff/*), keeps Web Demo compatibility.""" + return { + "session_id": self.user_id, + "state": to_legacy_state(self.status), + "reason": self.reason, + "requested_at": self.created_at, + "connected_at": self.updated_at if self.status == TicketStatusEnum.PROCESSING.value else None, + "closed_at": self.closed_at, + "agent_id": self.agent_id, + "queue_position": self.queue_position, + "estimated_wait_seconds": self.eta_seconds, + } + + +@dataclass +class TicketStatus: + """Response model corresponding 
to business_contract.ticket.status_query.""" + + ticket_id: str + status: str + agent_id: Optional[str] = None + queue_position: int = 0 + eta_seconds: int = 0 + updated_at: Optional[float] = None + + @classmethod + def from_ticket(cls, t: Ticket) -> "TicketStatus": + return cls( + ticket_id=t.ticket_id, + status=t.status, + agent_id=t.agent_id, + queue_position=t.queue_position, + eta_seconds=t.eta_seconds, + updated_at=t.updated_at or t.created_at, + ) + + +@dataclass +class OverallStatus: + agent_pool_size: int + available_agents: int + waiting: int + connected: int + capacity: int + + def to_dict(self) -> dict: + return { + "agent_pool_size": self.agent_pool_size, + "available_agents": self.available_agents, + "waiting": self.waiting, + "connected": self.connected, + "capacity": self.capacity, + } + + +def now_ts() -> float: + return time.time() diff --git a/skills/trtc-ai-service/capabilities/human-handoff/src/core/service.py b/skills/trtc-ai-service/capabilities/human-handoff/src/core/service.py new file mode 100644 index 0000000..f97ca70 --- /dev/null +++ b/skills/trtc-ai-service/capabilities/human-handoff/src/core/service.py @@ -0,0 +1,192 @@ +"""HandoffService — Application service chaining IntentDetector with HandoffClient. + +Only depends on ports (HandoffClient interface), does not know the specific backend implementation. +Switching adapter only requires re-injecting client (adapters.factory.set_client). 
+""" +from __future__ import annotations + +from typing import List, Optional + +from ..ports.handoff_client import HandoffClient +from ..summary_link import attach_summary_to_ticket +from .intent_detector import IntentDetector, get_default_detector +from .models import OverallStatus, Ticket, TicketStatus, TicketStatusEnum + + +class HandoffService: + """Handoff business service.""" + + def __init__( + self, + *, + client: HandoffClient, + detector: Optional[IntentDetector] = None, + ) -> None: + self._client = client + self._detector = detector or get_default_detector() + + # ------------------------------------------------------------------ + # Intent detection + trigger (for injection into conversation-core.before_push_text) + # ------------------------------------------------------------------ + def maybe_handoff(self, session_id: str, text: str) -> Optional[str]: + """Recognize handoff intent; if matched, request ticket and return assembled script; otherwise return None.""" + if not session_id or not text: + return None + if not self._detector.is_handoff_intent(text): + return None + + # Reuse existing ticket (don't duplicate for same user in progress) + existing = self._client.get_or_attach(session_id) + if existing is not None and existing.status in ( + TicketStatusEnum.PENDING.value, + TicketStatusEnum.PROCESSING.value, + ): + return self._render_handoff_message(existing) + + ticket = self._client.create_ticket( + user_id=session_id, + subject=text[:64], + description=text[:512], + priority="normal", + ) + # Attach session summary at ticket creation so agents can immediately see the issue on the dashboard (no-op if session-summary is not installed) + attach_summary_to_ticket(ticket) + return self._render_handoff_message(ticket) + + @staticmethod + def _render_handoff_message(t: Ticket) -> Optional[str]: + if t.status == TicketStatusEnum.PROCESSING.value: + return ( + f"[handoff state=connected agent={t.agent_id}]\n" + "You are now connected to a human 
agent. Please wait a moment." + ) + if t.status == TicketStatusEnum.PENDING.value: + return ( + f"[handoff state=waiting position={t.queue_position} " + f"eta={t.eta_seconds}s]\n" + f"You are number {t.queue_position} in the agent queue, " + f"estimated wait {t.eta_seconds} seconds." + ) + if t.status == TicketStatusEnum.TIMEOUT.value: + return "[handoff state=timeout]\nNo agents are currently available. Please try again later." + return None + + # ------------------------------------------------------------------ + # Explicit operations (for router calls) + # ------------------------------------------------------------------ + def request( + self, + session_id: str, + *, + reason: str = "", + subject: Optional[str] = None, + description: Optional[str] = None, + ) -> Ticket: + existing = self._client.get_or_attach(session_id) + if existing is not None and existing.status in ( + TicketStatusEnum.PENDING.value, + TicketStatusEnum.PROCESSING.value, + ): + return existing + ticket = self._client.create_ticket( + user_id=session_id, + subject=(subject or reason or "human handoff")[:64], + description=(description or reason or "")[:512], + priority="normal", + ) + # Attach session summary at ticket creation (no-op if session-summary not installed; does not affect main flow) + attach_summary_to_ticket(ticket) + return ticket + + def connect(self, session_id: str, agent_id: str) -> Ticket: + """For /api/v1/handoff/connect: force-connect a session to a specified agent.""" + # Look up active ticket by user_id first + ticket = self._client.get_or_attach(session_id) + if ticket is None: + raise ValueError(f"session {session_id} not waiting") + if ticket.status == TicketStatusEnum.PROCESSING.value: + return ticket + updated = self._client.update_status( + ticket.ticket_id, + TicketStatusEnum.PROCESSING.value, + agent_id=agent_id, + ) + if updated is None: + raise ValueError(f"ticket {ticket.ticket_id} not found") + return updated + + def cancel(self, session_id: str, *, reason: 
str = "") -> Ticket: + ticket = self._client.get_or_attach(session_id) + if ticket is None: + raise ValueError(f"session not found: {session_id}") + result = self._client.cancel_ticket(ticket.ticket_id, reason=reason) + if result is None: + raise ValueError(f"ticket {ticket.ticket_id} not found") + return result + + def get_by_session(self, session_id: str) -> Optional[Ticket]: + return self._client.get_or_attach(session_id) + + def overall_status(self) -> OverallStatus: + return self._client.overall_status() + + # ------------------------------------------------------------------ + # Dashboard helpers + # ------------------------------------------------------------------ + def list_tickets( + self, + *, + limit: int = 50, + status: Optional[str] = None, + ) -> List[Ticket]: + return self._client.list_tickets(limit=limit, status=status) + + def update_ticket_status( + self, + ticket_id: str, + status: str, + *, + agent_id: Optional[str] = None, + ) -> Optional[Ticket]: + return self._client.update_status(ticket_id, status, agent_id=agent_id) + + def query_ticket(self, ticket_id: str) -> Optional[TicketStatus]: + return self._client.query_status(ticket_id) + + # ------------------------------------------------------------------ + # Customer satisfaction feedback + # ------------------------------------------------------------------ + def submit_feedback(self, session_id: str, rating: int, comment: str = "") -> dict: + """Persist a satisfaction rating and, if a ticket exists for the session, + attach it to that ticket so agents can see the score on the dashboard.""" + from ..feedback_store import make_feedback, save_feedback + + fb = make_feedback(rating, comment) + save_feedback(session_id, fb) + try: + ticket = self._client.get_or_attach(session_id) + if ticket is not None: + ticket.extra["feedback"] = fb + except Exception: # noqa: BLE001 - feedback must never break on ticket lookup + pass + return {"session_id": session_id, "feedback": fb} + + +# 
--------------------------------------------------------------------------- +# Default service singleton +# --------------------------------------------------------------------------- +_default_service: Optional[HandoffService] = None + + +def get_default_service() -> HandoffService: + """Build service singleton from current environment (client from adapters.factory).""" + global _default_service + if _default_service is None: + from ..adapters.factory import get_client + _default_service = HandoffService(client=get_client()) + return _default_service + + +def reset_default_service() -> None: + global _default_service + _default_service = None diff --git a/skills/trtc-ai-service/capabilities/human-handoff/src/feedback_store.py b/skills/trtc-ai-service/capabilities/human-handoff/src/feedback_store.py new file mode 100644 index 0000000..ef05fcf --- /dev/null +++ b/skills/trtc-ai-service/capabilities/human-handoff/src/feedback_store.py @@ -0,0 +1,54 @@ +"""In-process satisfaction feedback store (CSAT). + +Lightweight memory store keyed by session_id. Used by HandoffService.submit_feedback +so the dashboard can look up a session's rating without requiring a ticket to exist. + +Why memory-only? +- The default adapter (HH_ADAPTER=local_queue) is itself in-process, so an in-memory + feedback store keeps the demo self-contained with zero external dependencies. +- Production deployments that persist tickets out-of-band should swap this for a + persistent sink; the public surface (make_feedback / save_feedback / get_feedback) + stays stable. 
+""" +from __future__ import annotations + +import threading +import time +import uuid +from typing import Any, Dict, Optional + +_lock = threading.RLock() +_store: Dict[str, Dict[str, Any]] = {} + + +def make_feedback(rating: int, comment: str = "") -> Dict[str, Any]: + """Build a normalized feedback record (does not persist).""" + rating = int(rating) + if rating < 1: + rating = 1 + elif rating > 5: + rating = 5 + return { + "feedback_id": f"fb_{uuid.uuid4().hex[:12]}", + "rating": rating, + "comment": (comment or "").strip()[:1000], + "created_at": time.time(), + } + + +def save_feedback(session_id: str, feedback: Dict[str, Any]) -> Dict[str, Any]: + """Persist a feedback record keyed by session_id (overwrites prior entry).""" + if not session_id: + raise ValueError("session_id is required") + with _lock: + _store[session_id] = dict(feedback) + return feedback + + +def get_feedback(session_id: str) -> Optional[Dict[str, Any]]: + """Return the stored feedback for a session, or None if not rated yet.""" + if not session_id: + return None + with _lock: + fb = _store.get(session_id) + return dict(fb) if fb is not None else None diff --git a/skills/trtc-ai-service/capabilities/human-handoff/src/ports/__init__.py b/skills/trtc-ai-service/capabilities/human-handoff/src/ports/__init__.py new file mode 100644 index 0000000..46fabbc --- /dev/null +++ b/skills/trtc-ai-service/capabilities/human-handoff/src/ports/__init__.py @@ -0,0 +1,4 @@ +"""human-handoff abstract ports module.""" +from .handoff_client import HandoffClient + +__all__ = ["HandoffClient"] diff --git a/skills/trtc-ai-service/capabilities/human-handoff/src/ports/handoff_client.py b/skills/trtc-ai-service/capabilities/human-handoff/src/ports/handoff_client.py new file mode 100644 index 0000000..2aa978a --- /dev/null +++ b/skills/trtc-ai-service/capabilities/human-handoff/src/ports/handoff_client.py @@ -0,0 +1,86 @@ +"""human-handoff abstract port (Port). 
+ +One-to-one correspondence with manifest.yaml.business_contract fields: +- create_ticket -> ticket.create +- query_status -> ticket.status_query +- cancel_ticket -> ticket.cancel +- overall_status -> internal status (not in external contract; for local dashboard use only) + +All concrete implementations (local_queue / default_rest / mock / user_custom) must inherit this ABC. +The core layer only depends on this interface, unaware of any specific backend type; switching backends only requires changing the adapter. +""" +from __future__ import annotations + +from abc import ABC, abstractmethod +from typing import List, Optional + +from ..core.models import OverallStatus, Ticket, TicketStatus + + +class HandoffClient(ABC): + """Unified interface contract for handoff / ticketing backends.""" + + # --- Methods aligned with business_contract -------------------------- + + @abstractmethod + def create_ticket( + self, + *, + user_id: str, + subject: str = "", + description: str = "", + priority: str = "normal", + transcript: Optional[List[str]] = None, + ) -> Ticket: + """Create ticket. Corresponds to business_contract.ticket.create.""" + + @abstractmethod + def query_status(self, ticket_id: str) -> Optional[TicketStatus]: + """Query single ticket status. Corresponds to ticket.status_query. + + Returns None if ticket does not exist. + """ + + @abstractmethod + def cancel_ticket(self, ticket_id: str, reason: str = "") -> Optional[Ticket]: + """Cancel ticket. Corresponds to ticket.cancel. 
Returns None if ticket does not exist.""" + + @abstractmethod + def overall_status(self) -> OverallStatus: + """Overall queue status (dashboard use, not external contract).""" + + # --- Dashboard helper methods (default implementation; remote backends may not override) --------- + + def list_tickets( + self, + *, + limit: int = 50, + status: Optional[str] = None, + ) -> List[Ticket]: + """List tickets (default returns empty; remote backends override as needed).""" + return [] + + def update_status( + self, + ticket_id: str, + status: str, + *, + agent_id: Optional[str] = None, + ) -> Optional[Ticket]: + """Agent manually updates ticket status (not supported by default; can be overridden by mock / local_queue).""" + raise NotImplementedError( + f"{type(self).__name__} does not support manual status update" + ) + + # --- Bridge interface compatible with old trigger.maybe_handoff --------- + + def get_or_attach(self, user_id: str) -> Optional[Ticket]: + """Find existing ticket by user_id (old session_id); returns None if not found. + + This method allows the facade layer to query existing ticket status without breaking the old API. + Default implementation iterates list_tickets. + """ + for t in self.list_tickets(limit=200): + if t.user_id == user_id: + return t + return None diff --git a/skills/trtc-ai-service/capabilities/human-handoff/src/queue.py b/skills/trtc-ai-service/capabilities/human-handoff/src/queue.py new file mode 100644 index 0000000..1fb1983 --- /dev/null +++ b/skills/trtc-ai-service/capabilities/human-handoff/src/queue.py @@ -0,0 +1,62 @@ +"""queue.py — compatibility facade. + +Keeps the `attach_session` public symbol for manifest.extensions (agent.after_start) +to continue calling as `_hh_queue.attach_session(session_id, info=...)`. + +Old version's `get_queue()` / `HandoffQueue` / `HandoffRecord` / `HandoffState` +are no longer used under the new architecture; kept as deprecated shims for gradual migration of third-party code. 
+ +New code should use directly: +- adapters.factory.get_client() Get HandoffClient instance +- core.service.get_default_service() Get HandoffService instance +""" +from __future__ import annotations + +import warnings +from typing import Any + +from .adapters.factory import get_client +from .core.models import ( # noqa: F401 (backward compatible export) + OverallStatus, + Ticket, + TicketStatus, + TicketStatusEnum, +) + + +def attach_session(session_id: str, info: Any = None) -> None: + """For conversation-core.after_start injection point use. + + Under the refactored implementation, "session registration" is done by client on create_ticket as needed; + kept here as a no-op entry point to avoid breaking old calls from manifest.extensions. + info parameter reserved for compatibility with old signature. + """ + # Trigger client singleton init to surface config errors early during startup + _ = get_client() + return None + + +# -------------------------------------------------------------------- +# Deprecated shim (for old tests / old external code gradual migration only; new code should not depend) +# -------------------------------------------------------------------- +def get_queue(): + warnings.warn( + "human_handoff.queue.get_queue() is deprecated; " + "use adapters.factory.get_client() or core.service.get_default_service() instead.", + DeprecationWarning, + stacklevel=2, + ) + return get_client() + + +# Old symbol aliases (some integrators may directly import) +HandoffState = TicketStatusEnum +HandoffRecord = Ticket + + +__all__ = [ + "HandoffRecord", + "HandoffState", + "attach_session", + "get_queue", +] diff --git a/skills/trtc-ai-service/capabilities/human-handoff/src/router.py b/skills/trtc-ai-service/capabilities/human-handoff/src/router.py new file mode 100644 index 0000000..424b19d --- /dev/null +++ b/skills/trtc-ai-service/capabilities/human-handoff/src/router.py @@ -0,0 +1,201 @@ +"""human-handoff FastAPI sub-router. 
+ +Mounted on skeleton: app.include_router(router, prefix="/api/v1/handoff") + +Refactoring notes: +- All business logic delegated to core.service.HandoffService +- Response fields remain fully consistent with Phase 2 (to_legacy_dict), not breaking Web Demo +- New /admin/* sub-routes for Phase 3 Path A ticket agent dashboard +""" +from __future__ import annotations + +from typing import List, Optional + +from fastapi import APIRouter, HTTPException, Query +from pydantic import BaseModel, Field + +from .core.models import TicketStatusEnum +from .core.service import get_default_service + + +router = APIRouter() + + +# --------------------------------------------------------------------------- +# Request body +# --------------------------------------------------------------------------- +class RequestBody(BaseModel): + session_id: str = Field(..., max_length=64) + reason: Optional[str] = Field(default="", max_length=512) + + +class ConnectBody(BaseModel): + session_id: str = Field(..., max_length=64) + agent_id: str = Field(..., max_length=64) + + +class CancelBody(BaseModel): + session_id: str = Field(..., max_length=64) + + +class DetectBody(BaseModel): + """Body for handoff intent detection (reuses intent_detector pure function).""" + + text: str = Field(..., max_length=4096) + + +class FeedbackBody(BaseModel): + """Body for submitting a post-call satisfaction rating.""" + + session_id: str = Field(..., max_length=64) + rating: int = Field(..., ge=1, le=5) + comment: Optional[str] = Field(default="", max_length=1000) + + +class AdminUpdateBody(BaseModel): + status: str = Field(..., max_length=32) + agent_id: Optional[str] = Field(default=None, max_length=64) + + +# --------------------------------------------------------------------------- +# Existing endpoints (fully compatible with Phase 2) +# --------------------------------------------------------------------------- +@router.get("/status") +def overall() -> dict: + return {"code": 0, "data": 
get_default_service().overall_status().to_dict()} + + +@router.get("/{session_id}") +def session_status(session_id: str) -> dict: + ticket = get_default_service().get_by_session(session_id) + if ticket is None: + raise HTTPException( + status_code=404, detail=f"session not tracked: {session_id}" + ) + return {"code": 0, "data": ticket.to_legacy_dict()} + + +@router.post("/request") +def request_handoff(body: RequestBody) -> dict: + ticket = get_default_service().request( + body.session_id, reason=body.reason or "" + ) + return {"code": 0, "data": ticket.to_legacy_dict()} + + +@router.post("/connect") +def connect(body: ConnectBody) -> dict: + try: + ticket = get_default_service().connect(body.session_id, body.agent_id) + except ValueError as exc: + raise HTTPException(status_code=404, detail=str(exc)) from exc + except RuntimeError as exc: + raise HTTPException(status_code=409, detail=str(exc)) from exc + return {"code": 0, "data": ticket.to_legacy_dict()} + + +@router.post("/cancel") +def cancel(body: CancelBody) -> dict: + try: + ticket = get_default_service().cancel(body.session_id) + except ValueError as exc: + raise HTTPException(status_code=404, detail=str(exc)) from exc + return {"code": 0, "data": ticket.to_legacy_dict()} + + +# --------------------------------------------------------------------------- +# Intent detection + satisfaction feedback (issue 6 / issue 7) +# --------------------------------------------------------------------------- +@router.post("/detect") +def detect_handoff(body: DetectBody) -> dict: + """Pure intent detection: returns whether the text implies a handoff request. + + Reuses core.intent_detector (same logic the backend uses internally), so the + frontend fast-path and the backend stay in sync. 
+ """ + from .core.intent_detector import is_handoff_intent + + matched = bool(is_handoff_intent(body.text or "")) + return {"code": 0, "data": {"matched": matched, "text": body.text or ""}} + + +@router.post("/feedback") +def submit_feedback(body: FeedbackBody) -> dict: + """Persist a post-call CSAT rating and attach it to the session's ticket (if any).""" + result = get_default_service().submit_feedback( + body.session_id, body.rating, body.comment or "" + ) + return {"code": 0, "data": result} + + +@router.get("/feedback/{session_id}") +def get_feedback(session_id: str) -> dict: + """Return the stored feedback for a session (404 if not yet rated).""" + from .feedback_store import get_feedback as _get + + fb = _get(session_id) + if fb is None: + raise HTTPException( + status_code=404, detail=f"no feedback for session: {session_id}" + ) + return {"code": 0, "data": {"session_id": session_id, "feedback": fb}} + + +# --------------------------------------------------------------------------- +# New: Ticket agent dashboard endpoints (Phase 3 Path A) +# Path: /admin/tickets +# These endpoints output "new version" fields (including ticket_id / subject / priority / transcript), +# coexisting with the legacy field format of existing /handoff/{session_id}. 
+# --------------------------------------------------------------------------- +@router.get("/admin/tickets") +def admin_list_tickets( + limit: int = Query(default=50, ge=1, le=200), + status: Optional[str] = Query(default=None, max_length=32), +) -> dict: + items = get_default_service().list_tickets(limit=limit, status=status) + return { + "code": 0, + "data": { + "items": [t.to_dict() for t in items], + "count": len(items), + }, + } + + +@router.get("/admin/tickets/{ticket_id}") +def admin_get_ticket(ticket_id: str) -> dict: + status = get_default_service().query_ticket(ticket_id) + if status is None: + raise HTTPException(status_code=404, detail=f"ticket not found: {ticket_id}") + items = [ + t for t in get_default_service().list_tickets(limit=200) + if t.ticket_id == ticket_id + ] + if not items: + raise HTTPException(status_code=404, detail=f"ticket not found: {ticket_id}") + return {"code": 0, "data": items[0].to_dict()} + + +@router.post("/admin/tickets/{ticket_id}/status") +def admin_update_status(ticket_id: str, body: AdminUpdateBody) -> dict: + # Validate status value + try: + TicketStatusEnum(body.status) + except ValueError as exc: + raise HTTPException( + status_code=400, + detail=f"invalid status: {body.status}", + ) from exc + + try: + ticket = get_default_service().update_ticket_status( + ticket_id, body.status, agent_id=body.agent_id + ) + except NotImplementedError as exc: + raise HTTPException(status_code=405, detail=str(exc)) from exc + except ValueError as exc: + raise HTTPException(status_code=400, detail=str(exc)) from exc + + if ticket is None: + raise HTTPException(status_code=404, detail=f"ticket not found: {ticket_id}") + return {"code": 0, "data": ticket.to_dict()} diff --git a/skills/trtc-ai-service/capabilities/human-handoff/src/summary_link.py b/skills/trtc-ai-service/capabilities/human-handoff/src/summary_link.py new file mode 100644 index 0000000..084578e --- /dev/null +++ 
b/skills/trtc-ai-service/capabilities/human-handoff/src/summary_link.py @@ -0,0 +1,77 @@ +"""Best-effort linkage with session-summary (optional capability). + +When a handoff ticket is created, auto-generate a session summary and attach it to the ticket, +so agents can see customer issue context directly in the ticket details — no need to manually click "generate summary". + +Design principles: +- Soft dependency: silently no-ops when session-summary is not installed; does not affect the handoff main flow. +- Non-blocking: defaults to heuristic summary (local, zero latency); does not call LLM in the ticket creation chain. +- Decoupled: dynamically loaded via conversation-core's _capability_loader; + human-handoff has no static import dependency on session-summary. +""" +from __future__ import annotations + +import importlib.util +import logging +from pathlib import Path +from typing import Any, Optional + +logger = logging.getLogger(__name__) + +_loader: Optional[Any] = None +_loader_resolved = False + + +def _get_loader() -> Optional[Any]: + """Dynamically load conversation-core's _capability_loader (no relative imports, can load independently).""" + global _loader, _loader_resolved + if _loader_resolved: + return _loader + _loader_resolved = True + try: + # /capabilities/human-handoff/src/summary_link.py → parents[3] = + repo_root = Path(__file__).resolve().parents[3] + loader_path = ( + repo_root / "capabilities" / "conversation-core" / "src" / "_capability_loader.py" + ) + if not loader_path.is_file(): + return None + spec = importlib.util.spec_from_file_location("_hh_capability_loader", loader_path) + if spec is None or spec.loader is None: + return None + mod = importlib.util.module_from_spec(spec) + spec.loader.exec_module(mod) + _loader = mod + except Exception as exc: # noqa: BLE001 + logger.info("session-summary link unavailable: %s", exc) + _loader = None + return _loader + + +def attach_summary_to_ticket(ticket: Any) -> None: + """Generate an LLM 
narrative summary of the session chat and write it into the ticket's + Description field (from AI connect → handoff trigger). + + The ticket Description becomes an LLM summary of the conversation, so agents see the + context directly without a separate "Session Summary" block. session-summary not + installed / any exception → silently skip (does not affect ticket creation main flow). + """ + loader = _get_loader() + if loader is None: + return + try: + recorder_mod = loader.try_load_capability("session-summary", "src/recorder.py") + summarizer_mod = loader.try_load_capability("session-summary", "src/summarizer.py") + if recorder_mod is None or summarizer_mod is None: + return + session_id = ticket.user_id + recorder = recorder_mod.get_recorder() + rec = recorder.get(session_id) + if rec is None: + return # No transcript recorded for this session (e.g. manually inserted test ticket), skip + # LLM-generated one-paragraph summary of the chat → ticket Description. + paragraph = summarizer_mod.summarize_paragraph(rec) + if paragraph: + ticket.description = paragraph + except Exception as exc: # noqa: BLE001 + logger.info("attach description summary skipped: %s", exc) diff --git a/skills/trtc-ai-service/capabilities/human-handoff/src/trigger.py b/skills/trtc-ai-service/capabilities/human-handoff/src/trigger.py new file mode 100644 index 0000000..83b18e3 --- /dev/null +++ b/skills/trtc-ai-service/capabilities/human-handoff/src/trigger.py @@ -0,0 +1,25 @@ +"""trigger.py — compatibility facade. + +Keeps the `maybe_handoff` / `is_handoff_intent` public symbols for manifest.extensions +(agent.before_push_text) to continue calling as `_hh_trigger.maybe_handoff(session_id, text)`, +internally delegated to the refactored core.service.HandoffService. + +New code should use core.service / core.intent_detector directly; do not depend on this facade. 
+""" +from __future__ import annotations + +from typing import Optional + +from .core.intent_detector import is_handoff_intent # noqa: F401 (public API) +from .core.service import get_default_service + + +def maybe_handoff(session_id: str, text: str) -> Optional[str]: + """For conversation-core.before_push_text injection point use. + + Signature fully consistent with original: returns None when not triggered; returns a string when text has been replaced with handoff script. + """ + return get_default_service().maybe_handoff(session_id, text) + + +__all__ = ["is_handoff_intent", "maybe_handoff"] diff --git a/skills/trtc-ai-service/capabilities/knowledge-base/INTERFACE_ADAPT.md b/skills/trtc-ai-service/capabilities/knowledge-base/INTERFACE_ADAPT.md new file mode 100644 index 0000000..1a6bb1e --- /dev/null +++ b/skills/trtc-ai-service/capabilities/knowledge-base/INTERFACE_ADAPT.md @@ -0,0 +1,297 @@ +# knowledge-base Interface Adaptation SOP + +> When the user's existing knowledge base / FAQ / retrieval system API diverges from this capability's default contract, follow this document for scenario-specific operations. +> Recommended: use `python scripts/contract-adapt.py knowledge-base` for automated generation; this document is the manual fallback. + +--- + +## 1. Default Contract Overview + +This capability **calls** the user's knowledge base interfaces (outbound): + +| Contract | Method | Path | Purpose | +|---|---|---|---| +| `faq.search` | POST | `/faq/search` | Keyword search | +| `faq.list` | GET | `/faq` | List all entries | +| `faq.upsert` | POST | `/faq` | Create / update | +| `faq.delete` | DELETE | `/faq/{entry_id}` | Delete entry | + +Full field definitions in `manifest.yaml` `business_contract.external_apis`. + +--- + +## 2. 
Three-Layer Defense Mechanism + +| Layer | Difference from Default Contract | Typical Scenario | +|---|---|---| +| **L1 Field Mapping** | Only field name / simple type differences | 90% of common cases | +| **L2 Adapter Subclass** | Auth / path / error code differences | User's own KB system | +| **L3 Full Custom Implementation** | Protocol-level differences (vector DB / GraphQL / gRPC) | Non-REST protocols | + +All layers land in `capabilities/knowledge-base/src/adapters/user_custom.py` and are enabled via `KB_ADAPTER=user_custom`. + +--- + +## 3. L1 Field Mapping (most common) + +### 3.1 Applicability Check + +- User API is still REST + JSON +- Only field name differences (within `adapter_slots` scope) + +### 3.2 Steps + +**Step 1**: Paste user's curl or OpenAPI + +```bash +curl -X POST https://kb.example.com/api/v3/search \ + -H 'X-Api-Key: xxx' \ + -d '{ + "keyword": "refund", + "limit": 3 + }' +# Response: +# { +# "results": [ +# { "doc_id": "k001", "title": "How to refund", "content": "...", "tags": ["refund"], "relevance": 0.92 } +# ] +# } +``` + +**Step 2**: Write mapping table `capabilities/knowledge-base/src/adapters/user_custom_mapping.yaml` + +```yaml +faq.search: + request: + query: keyword # Field name mapping + top_k: limit + response: + # response is array form; transformer needs to map results[] to hits[] + hits: results + "hits[].entry.id": "results[].doc_id" + "hits[].entry.question": "results[].title" + "hits[].entry.answer": "results[].content" + "hits[].entry.keywords": "results[].tags" + "hits[].score": "results[].relevance" + +faq.list: + response: + items: data # User uses data not items + "items[].id": "data[].doc_id" + "items[].question": "data[].title" + "items[].answer": "data[].content" + "items[].keywords": "data[].tags" +``` + +**Step 3**: Generate adapter + +```bash +python scripts/contract-adapt.py knowledge-base \ + --base-url https://kb.example.com \ + --auth-header "X-Api-Key" \ + --mapping 
capabilities/knowledge-base/src/adapters/user_custom_mapping.yaml +``` + +**Step 4**: Enable + +```bash +export KB_ADAPTER=user_custom +export KB_REST_BASE_URL=https://kb.example.com +export KB_REST_TOKEN= +``` + +--- + +## 4. L2 Adapter Subclass (auth / path style differences) + +### 4.1 Applicability Check + +- Auth method is not Bearer (e.g. `X-Api-Key`, signature-based auth) +- Different path templates +- Different response wrapping (e.g. `{ code, msg, data: { ... } }`) + +### 4.2 Template Code + +```python +# capabilities/knowledge-base/src/adapters/user_custom.py +from typing import List, Optional + +from ..core.models import FaqEntry, SearchHit +from .default_rest import DefaultRestKbClient + + +class UserCustomKbClient(DefaultRestKbClient): + """User's own KB system adapter (L2).""" + + def _headers(self) -> dict: + h = {"Content-Type": "application/json"} + if self._token: + h["X-Api-Key"] = self._token # Not Bearer + return h + + def search( + self, + query: str, + *, + top_k: Optional[int] = None, + min_score: Optional[float] = None, + ) -> List[SearchHit]: + if not query.strip(): + return [] + payload = { + "keyword": query, # Field remapping + "limit": int(top_k or 3), + } + # User API path is different + data = self._post("/api/v3/search", payload) + results = data.get("results", []) if isinstance(data, dict) else (data or []) + hits: List[SearchHit] = [] + for r in results: + hits.append( + SearchHit( + entry=FaqEntry( + id=str(r.get("doc_id", "")), + question=str(r.get("title", "")), + answer=str(r.get("content", "")), + keywords=list(r.get("tags") or []), + source="remote_api", + ), + score=float(r.get("relevance", 0.0)), + ) + ) + return hits + + def list_all(self) -> List[FaqEntry]: + data = self._get("/api/v3/docs") + items = data.get("data", []) if isinstance(data, dict) else (data or []) + return [ + FaqEntry( + id=str(it.get("doc_id", "")), + question=str(it.get("title", "")), + answer=str(it.get("content", "")), + 
keywords=list(it.get("tags") or []), + source="remote_api", + ) + for it in items + ] + + +def from_env() -> Optional["UserCustomKbClient"]: + import os + base = os.getenv("KB_REST_BASE_URL") + if not base: + return None + return UserCustomKbClient( + base_url=base, + token=os.getenv("KB_REST_TOKEN"), + timeout_ms=int(os.getenv("KB_REST_TIMEOUT_MS", "5000")), + ) +``` + +--- + +## 5. L3 Full Custom (vector DB / GraphQL / gRPC) + +### 5.1 Applicability Check + +- User uses vector database (Milvus / Pinecone / Qdrant) for semantic search +- User uses GraphQL instead of REST +- User uses gRPC + +### 5.2 Template Code (vector DB example) + +```python +# capabilities/knowledge-base/src/adapters/user_custom.py +from typing import List, Optional + +from ..core.models import FaqEntry, KbStats, SearchHit +from ..ports.kb_client import KnowledgeBaseClient + + +class UserCustomKbClient(KnowledgeBaseClient): + """Vector DB adapter example (L3: directly implements KnowledgeBaseClient).""" + + def __init__(self, **kwargs): + # TODO Initialize vector DB client: + # self._milvus = MilvusClient(uri=...) + # self._embedder = SentenceTransformer(...) + ... 
+ + def search(self, query, *, top_k=None, min_score=None) -> List[SearchHit]: + # TODO Call embedder + vector retrieval + # vec = self._embedder.encode(query) + # results = self._milvus.search(vec, top_k=top_k or 3) + results = [] + return [ + SearchHit( + entry=FaqEntry(id=r["id"], question=r["q"], answer=r["a"]), + score=float(r["distance"]), + ) + for r in results + ] + + def list_all(self) -> List[FaqEntry]: + # TODO Vector DBs may not support enumeration; return empty or raise NotSupported + return [] + + def upsert(self, entry: FaqEntry) -> FaqEntry: + # TODO Write vectors + return entry + + def delete(self, entry_id: str) -> bool: + # TODO + return False + + def stats(self) -> KbStats: + return KbStats(backend="vector_db", entry_count=-1) + + +def from_env(): + import os + return UserCustomKbClient( + endpoint=os.getenv("KB_VECTOR_ENDPOINT", ""), + api_key=os.getenv("KB_VECTOR_TOKEN", ""), + ) +``` + +--- + +## 6. Switch / Verify + +### 6.1 Enable user_custom + +```bash +export KB_ADAPTER=user_custom +# Takes effect after service restart +``` + +### 6.2 Unit Self-Check + +```bash +python -c " +from capabilities.knowledge_base.src.adapters.factory import build_default +c = build_default() +print('adapter:', type(c).__name__) +hits = c.search('refund') +for h in hits: + print(' ', h.score, h.entry.question) +" +``` + +### 6.3 End-to-End + +```bash +curl -X POST http://localhost:3000/api/v1/kb/search \ + -H 'Content-Type: application/json' \ + -d '{"query":"refund","top_k":3}' +``` + +--- + +## 7. 
Security Checklist + +- [ ] `KB_REST_BASE_URL` must use https:// (localhost excepted) +- [ ] Private network addresses are rejected by default +- [ ] Auth token is read only from environment variables +- [ ] User-uploaded FAQ content is sanitized via `_strip_html` (built into router) +- [ ] Errors from remote calls do not print response bodies diff --git a/skills/trtc-ai-service/capabilities/knowledge-base/README.md b/skills/trtc-ai-service/capabilities/knowledge-base/README.md new file mode 100644 index 0000000..b4accea --- /dev/null +++ b/skills/trtc-ai-service/capabilities/knowledge-base/README.md @@ -0,0 +1,51 @@ +# knowledge-base · FAQ Retrieval Capability + +> Adds minimal FAQ retrieval to the conversation-core skeleton with zero external dependencies. + +## Install + +```bash +# From repo root +python scripts/add-capability.py knowledge-base +``` + +## Configuration + +| Env Variable | Default | Description | +|:---|:---|:---| +| `KB_DATA_FILE` | `capabilities/knowledge-base/data/faq.json` | FAQ data file | +| `KB_TOP_K` | `3` | Max entries to backfill per query | +| `KB_MIN_SCORE` | `0.1` | Hit threshold (entries below this are not injected) | + +## REST API + +| Method | Path | Purpose | +|:---|:---|:---| +| GET | `/api/v1/kb/list` | List all entries | +| POST | `/api/v1/kb/search` | Keyword search | +| POST | `/api/v1/kb/upsert` | Create / update | +| DELETE | `/api/v1/kb/{entry_id}` | Delete | +| POST | `/api/v1/kb/reload` | Hot-reload from file | + +## Injection Strategy + +- `agent.before_start`: Append matched FAQ entries to the end of LLM `instructions`. +- `server.router_extension`: Mount `/api/v1/kb/*` sub-router. + +## Data Format + +```json +[ + { + "id": "faq_xxx", + "question": "What ...?", + "answer": "...", + "keywords": ["alias1", "alias2"] + } +] +``` + +## Security + +- HTML tags are automatically stripped when writing entries (XSS defense). +- Length limits: `question ≤ 1024`, `answer ≤ 4096`, `query ≤ 256`. 
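The README's Security rules (strip HTML on write, cap field lengths) can be sketched roughly as below. This is a minimal illustration, not the router's actual `_strip_html` implementation, which is not shown in this diff: the names `strip_html` and `sanitize_entry` are hypothetical, the regex-based tag removal is a naive stand-in, and rejecting over-long fields (rather than truncating them) is an assumption.

```python
import re

# Limits quoted from the README above; everything else here is illustrative.
MAX_QUESTION = 1024
MAX_ANSWER = 4096

# Naive pattern: anything between "<" and ">" is treated as a tag.
_TAG_RE = re.compile(r"<[^>]*>")


def strip_html(text: str) -> str:
    """Remove tag-like substrings and trim surrounding whitespace."""
    return _TAG_RE.sub("", text).strip()


def sanitize_entry(raw: dict) -> dict:
    """Return a cleaned copy of an FAQ entry, or raise ValueError."""
    question = strip_html(str(raw.get("question", "")))
    answer = strip_html(str(raw.get("answer", "")))
    if not raw.get("id") or not question:
        raise ValueError("id and question are required")
    # Assumption: over-long fields are rejected rather than truncated.
    if len(question) > MAX_QUESTION:
        raise ValueError("question exceeds 1024 characters")
    if len(answer) > MAX_ANSWER:
        raise ValueError("answer exceeds 4096 characters")
    keywords = [strip_html(str(k)) for k in raw.get("keywords") or []]
    return {
        "id": str(raw["id"]),
        "question": question,
        "answer": answer,
        "keywords": [k for k in keywords if k],  # drop empty keywords
    }
```

A regex stripper like this only removes the tags themselves and keeps the text between them, so a production deployment would likely want a real HTML sanitizer; the sketch only shows where such a check sits in the write path.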
diff --git a/skills/trtc-ai-service/capabilities/knowledge-base/data/faq.json b/skills/trtc-ai-service/capabilities/knowledge-base/data/faq.json new file mode 100644 index 0000000..1d1ef7e --- /dev/null +++ b/skills/trtc-ai-service/capabilities/knowledge-base/data/faq.json @@ -0,0 +1,20 @@ +[ + { + "id": "faq_shipping_time", + "question": "What is the typical shipping time?", + "answer": "Standard shipping takes 3-5 business days; express shipping arrives within 1-2 business days.", + "keywords": ["shipping", "delivery", "tracking", "logistics"] + }, + { + "id": "faq_return_policy", + "question": "How do I return a product?", + "answer": "Return requests can be filed within 30 days of delivery. Visit Account > Orders to initiate a return.", + "keywords": ["return", "refund", "exchange", "money back"] + }, + { + "id": "faq_change_address", + "question": "Can I change the shipping address after ordering?", + "answer": "You can update the shipping address before the order enters the picking stage. 
Use Order Detail > Modify Address.", + "keywords": ["address", "modify", "change shipping", "update address"] + } +] diff --git a/skills/trtc-ai-service/capabilities/knowledge-base/manifest.yaml b/skills/trtc-ai-service/capabilities/knowledge-base/manifest.yaml new file mode 100644 index 0000000..3237db5 --- /dev/null +++ b/skills/trtc-ai-service/capabilities/knowledge-base/manifest.yaml @@ -0,0 +1,211 @@ +# knowledge-base capability self-describing manifest +# Type: capability (optional install) + +name: "knowledge-base" +version: "1.1.0" +type: "capability" +description: "FAQ retrieval + keyword matching (Phase 3 refactored to ports/adapters/core three-layer architecture)" + +# --------------------------------------------------------------------------- +# Dependencies: only depends on the generic skeleton; no external databases +# --------------------------------------------------------------------------- +dependencies: + - name: "conversation-core" + version: ">=1.0.0,<2.0.0" + +# --------------------------------------------------------------------------- +# Injection extensions into skeleton (maintains full Phase 2 compatibility; retriever.py is now a facade) +# --------------------------------------------------------------------------- +extensions: + # Prepend matched FAQ results to instructions before start_agent + - inject_at: "agent.before_start" + inline_code: | + # [knowledge-base] inject FAQ context if any user keywords present + if config is not None and getattr(config, "instructions", None): + from ._capability_loader import try_load_capability + _kb = try_load_capability("knowledge-base", "src/retriever.py") + if _kb is not None: + try: + config.instructions = _kb.attach_faq_to_instructions(config.instructions) + except Exception as _kb_exc: # noqa: BLE001 + logger.warning("knowledge-base FAQ injection failed: %s", _kb_exc) + # Mount sub-router after server.include_router + - inject_at: "server.router_extension" + inline_code: | + # [knowledge-base] 
mount sub-router + from ._capability_loader import try_load_capability as _try_load_capability + _kb_router_mod = _try_load_capability("knowledge-base", "src/router.py") + if _kb_router_mod is not None and hasattr(_kb_router_mod, "router"): + app.include_router( + _kb_router_mod.router, prefix="/api/v1/kb", tags=["knowledge-base"] + ) + +# --------------------------------------------------------------------------- +# Configuration interface +# --------------------------------------------------------------------------- +config: + data_file: + description: "FAQ data file path (JSON), used only by local_json adapter" + default: "capabilities/knowledge-base/data/faq.json" + env: "KB_DATA_FILE" + matcher: + description: "Matching strategy: keyword | tfidf | hybrid (local_json adapter defaults to tfidf)" + default: "tfidf" + top_k: + description: "Maximum number of FAQ entries to backfill" + default: 3 + env: "KB_TOP_K" + min_score: + description: "Score threshold; entries below this threshold are not injected" + default: 0.1 + env: "KB_MIN_SCORE" + adapter: + description: "Which KnowledgeBaseClient implementation to use: local_json | mock | default_rest | user_custom" + default: "local_json" + env: "KB_ADAPTER" + +# --------------------------------------------------------------------------- +# API endpoints +# --------------------------------------------------------------------------- +endpoints: + - method: GET + path: /api/v1/kb/list + description: List all knowledge entries + - method: POST + path: /api/v1/kb/search + description: "Keyword search FAQ (body shape: {query, top_k?})" + - method: POST + path: /api/v1/kb/upsert + description: "Create or update entry (body shape: {id, question, answer, keywords[]})" + - method: DELETE + path: /api/v1/kb/{entry_id} + description: "Delete entry" + - method: POST + path: /api/v1/kb/reload + description: "Reload data from external source" + - method: GET + path: /api/v1/kb/stats + description: "Returns backend / entry count / 
load time (Phase 3 new)" + +# --------------------------------------------------------------------------- +# Business contract (Phase 3 new; follows references/business-contract-spec.md v1.0) +# --------------------------------------------------------------------------- +business_contract: + port_class: "src.ports.kb_client.KnowledgeBaseClient" + default_adapter: "src.adapters.local_json.LocalJsonKbClient" + mock_adapter: "src.adapters.mock.MockKbClient" + customization_sop: "INTERFACE_ADAPT.md" + external_apis: + - name: faq.search + direction: outbound + method: POST + path: /faq/search + description: "Perform keyword search against external FAQ service" + request_schema: + query: string + top_k: int + min_score: float + response_schema: + hits: + - entry: + id: string + question: string + answer: string + keywords: string[] + score: float + adapter_slots: + - request.query + - request.top_k + - response.hits + auth: + type: bearer + location: header + name: Authorization + timeout_ms: 5000 + + - name: faq.list + direction: outbound + method: GET + path: /faq + description: "List all FAQ entries" + request_schema: {} + response_schema: + items: + - id: string + question: string + answer: string + keywords: string[] + adapter_slots: + - response.items + timeout_ms: 5000 + + - name: faq.upsert + direction: outbound + method: POST + path: /faq + description: "Create or update FAQ entry" + request_schema: + id: string + question: string + answer: string + keywords: string[] + response_schema: + id: string + question: string + answer: string + keywords: string[] + adapter_slots: + - request.id + - request.question + - request.answer + - request.keywords + timeout_ms: 5000 + + - name: faq.delete + direction: outbound + method: DELETE + path: /faq/{entry_id} + description: "Delete FAQ entry" + request_schema: + entry_id: string + response_schema: + deleted: string + adapter_slots: + - request.entry_id + timeout_ms: 3000 + +# 
--------------------------------------------------------------------------- +# Integration rules +# --------------------------------------------------------------------------- +integration: + mode: "auto" + auto_adapters: + - tech_stack: ["react", "vue", "angular"] + adapter: "frontend-spa" + description: "Inject an FAQ editor panel component into SPA" + fallback: + guided_templates: + - "../../auto_adapters/integration_templates/generic-frontend.md" + manual_api: + rest_endpoint: "/api/v1/kb" + sdk_packages: [] + +# --------------------------------------------------------------------------- +# Security +# --------------------------------------------------------------------------- +security: + log_redaction: + enabled: true + patterns: ["api_key", "token", "credential", "authorization"] + input_validation: + max_query_length: 256 + forbid_html_tags: true # Strip HTML when writing entries to prevent injection + network: + enforce_https: true # default_rest adapter enforces HTTPS for non-localhost + +acceptance: + - "Provide keyword/tfidf retrieval interface with minimal implementation" + - "API endpoints available immediately after skeleton startup" + - "Instructions injection does not modify skeleton files; solely via manifest.extensions" + - "Switching KB_ADAPTER requires no business code changes (local_json / mock / default_rest / user_custom)" + - "default_rest adapter rejects private network addresses (except localhost)" diff --git a/skills/trtc-ai-service/capabilities/knowledge-base/src/__init__.py b/skills/trtc-ai-service/capabilities/knowledge-base/src/__init__.py new file mode 100644 index 0000000..e23b709 --- /dev/null +++ b/skills/trtc-ai-service/capabilities/knowledge-base/src/__init__.py @@ -0,0 +1,8 @@ +"""knowledge-base capability: FAQ retrieval + keyword matching. 
+ +Minimal implementation: +- Data source: local JSON file (hot-reloadable) +- Matching: keyword weighted scoring + optional TF-IDF (stop-word filtering) +- Zero external dependencies, pure Python implementation +""" +__version__ = "1.0.0" diff --git a/skills/trtc-ai-service/capabilities/knowledge-base/src/adapters/__init__.py b/skills/trtc-ai-service/capabilities/knowledge-base/src/adapters/__init__.py new file mode 100644 index 0000000..27c0614 --- /dev/null +++ b/skills/trtc-ai-service/capabilities/knowledge-base/src/adapters/__init__.py @@ -0,0 +1,9 @@ +"""knowledge-base adapter implementations.""" +from .factory import build_default, get_client, reset_client, set_client + +__all__ = [ + "build_default", + "get_client", + "reset_client", + "set_client", +] diff --git a/skills/trtc-ai-service/capabilities/knowledge-base/src/adapters/default_rest.py b/skills/trtc-ai-service/capabilities/knowledge-base/src/adapters/default_rest.py new file mode 100644 index 0000000..3bcce05 --- /dev/null +++ b/skills/trtc-ai-service/capabilities/knowledge-base/src/adapters/default_rest.py @@ -0,0 +1,209 @@ +"""DefaultRestKbClient — call external FAQ / search API per business_contract default contract. + +Corresponding contracts: +- POST /faq/search faq.search +- GET /faq faq.list +- POST /faq faq.upsert +- DELETE /faq/{entry_id} faq.delete + +Environment variables: +- KB_REST_BASE_URL FAQ service base URL +- KB_REST_TOKEN Bearer Token (optional) +- KB_REST_TIMEOUT_MS Timeout (default 5000) + +Security: consistent with human-handoff default_rest — only https / localhost; reject private networks. 
+""" +from __future__ import annotations + +import logging +import os +import re +from typing import List, Optional +from urllib.parse import urlparse + +try: + import requests # type: ignore +except ImportError: # pragma: no cover + requests = None # type: ignore + +from ..core.models import FaqEntry, KbStats, SearchHit +from ..ports.kb_client import KnowledgeBaseClient + + +logger = logging.getLogger(__name__) + + +_PRIVATE_PATTERNS = [ + re.compile(r"^9\."), + re.compile(r"^10\."), + re.compile(r"^11\."), + re.compile(r"^21\."), + re.compile(r"^30\."), + re.compile(r"^169\.254\."), + re.compile(r"^172\.(1[6-9]|2[0-9]|3[01])\."), + re.compile(r"^192\.168\."), +] + + +def _is_localhost(host: str) -> bool: + return host in {"localhost", "127.0.0.1", "::1"} + + +def _is_private(host: str) -> bool: + return any(p.match(host) for p in _PRIVATE_PATTERNS) + + +def _validate_base_url(url: str) -> str: + parsed = urlparse(url) + if parsed.scheme not in {"http", "https"}: + raise ValueError(f"unsupported scheme: {parsed.scheme}") + host = parsed.hostname or "" + if not host: + raise ValueError("empty host in KB_REST_BASE_URL") + if parsed.scheme == "http" and not _is_localhost(host): + raise ValueError( + "non-HTTPS KB_REST_BASE_URL only allowed for localhost" + ) + if _is_private(host): + raise ValueError( + f"access to private network host '{host}' is denied" + ) + return url.rstrip("/") + + +class DefaultRestKbClient(KnowledgeBaseClient): + """Call external FAQ service per default REST contract.""" + + def __init__( + self, + *, + base_url: str, + token: Optional[str] = None, + timeout_ms: int = 5000, + ) -> None: + if requests is None: + raise RuntimeError( + "requests library is required for DefaultRestKbClient" + ) + self._base = _validate_base_url(base_url) + self._token = token + self._timeout = max(0.5, timeout_ms / 1000.0) + self._session = requests.Session() + + # ------------------------------------------------------------------ + def search( + self, + query: 
str, + *, + top_k: Optional[int] = None, + min_score: Optional[float] = None, + ) -> List[SearchHit]: + if not query or not query.strip(): + return [] + payload: dict = {"query": query} + if top_k is not None: + payload["top_k"] = int(top_k) + if min_score is not None: + payload["min_score"] = float(min_score) + data = self._post("/faq/search", payload) + items = data if isinstance(data, list) else data.get("hits", []) + hits: List[SearchHit] = [] + for it in items or []: + entry = it.get("entry") if isinstance(it, dict) else None + if entry is None and isinstance(it, dict) and "question" in it: + entry = it + score = float(it.get("score", 0.0)) + else: + score = float(it.get("score", 0.0)) if isinstance(it, dict) else 0.0 + if not entry: + continue + hits.append( + SearchHit( + entry=FaqEntry.from_dict({**entry, "source": "remote_api"}), + score=score, + ) + ) + return hits + + def list_all(self) -> List[FaqEntry]: + data = self._get("/faq") + items = data if isinstance(data, list) else data.get("items", []) + return [FaqEntry.from_dict({**it, "source": "remote_api"}) for it in items] + + def upsert(self, entry: FaqEntry) -> FaqEntry: + if not entry.id or not entry.question: + raise ValueError("id and question are required") + data = self._post("/faq", entry.to_dict()) + return FaqEntry.from_dict({**data, "source": "remote_api"}) + + def delete(self, entry_id: str) -> bool: + url = self._base + f"/faq/{entry_id}" + resp = self._session.delete( + url, headers=self._headers(), timeout=self._timeout + ) + if resp.status_code == 404: + return False + if resp.status_code >= 400: + raise RuntimeError( + f"remote kb service returned HTTP {resp.status_code}" + ) + return True + + def stats(self) -> KbStats: + try: + items = self.list_all() + return KbStats( + backend="remote_api", + entry_count=len(items), + data_source=self._base, + ) + except Exception: # noqa: BLE001 + return KbStats(backend="remote_api", entry_count=-1, data_source=self._base) + + # 
------------------------------------------------------------------ + def _headers(self) -> dict: + h = {"Content-Type": "application/json"} + if self._token: + h["Authorization"] = f"Bearer {self._token}" + return h + + def _get(self, path: str): + resp = self._session.get( + self._base + path, headers=self._headers(), timeout=self._timeout + ) + return self._handle(resp) + + def _post(self, path: str, payload: dict): + resp = self._session.post( + self._base + path, + json=payload, + headers=self._headers(), + timeout=self._timeout, + ) + return self._handle(resp) + + @staticmethod + def _handle(resp): + if resp.status_code >= 400: + raise RuntimeError( + f"remote kb service returned HTTP {resp.status_code}" + ) + try: + data = resp.json() + except ValueError as exc: + raise RuntimeError("remote kb service returned non-JSON") from exc + if isinstance(data, dict) and "data" in data and isinstance(data["data"], (dict, list)): + return data["data"] + return data + + +# --------------------------------------------------------------------------- +def from_env() -> Optional[DefaultRestKbClient]: + base = os.getenv("KB_REST_BASE_URL") + if not base: + return None + return DefaultRestKbClient( + base_url=base, + token=os.getenv("KB_REST_TOKEN"), + timeout_ms=int(os.getenv("KB_REST_TIMEOUT_MS", "5000")), + ) diff --git a/skills/trtc-ai-service/capabilities/knowledge-base/src/adapters/factory.py b/skills/trtc-ai-service/capabilities/knowledge-base/src/adapters/factory.py new file mode 100644 index 0000000..afc281d --- /dev/null +++ b/skills/trtc-ai-service/capabilities/knowledge-base/src/adapters/factory.py @@ -0,0 +1,86 @@ +"""adapter factory: selects KnowledgeBaseClient implementation based on environment variable. 
+ +Environment variable `KB_ADAPTER`: + local_json Default local JSON file search (production-ready, zero dependencies) + mock Built-in demo FAQ (for Recipe video recording) + default_rest Call remote FAQ service per business_contract default contract + user_custom Implementation generated by the user integration wizard + +When unset or invalid, falls back to local_json (preserves Phase 2 behavior). +""" +from __future__ import annotations + +import logging +import os +from typing import Optional + +from ..ports.kb_client import KnowledgeBaseClient + + +logger = logging.getLogger(__name__) + + +_VALID = ("local_json", "mock", "default_rest", "user_custom") + + +def _build(name: str) -> Optional[KnowledgeBaseClient]: + if name == "local_json": + from .local_json import from_env as build_local + return build_local() + if name == "mock": + from .mock import from_env as build_mock + return build_mock() + if name == "default_rest": + from .default_rest import from_env as build_rest + c = build_rest() + if c is None: + logger.warning( + "KB_ADAPTER=default_rest but KB_REST_BASE_URL is empty; " + "falling back to local_json" + ) + return c + if name == "user_custom": + try: + from .user_custom import from_env as build_custom # type: ignore + except ImportError: + logger.warning( + "KB_ADAPTER=user_custom but src/adapters/user_custom.py is missing; " + "run scripts/contract-adapt.py knowledge-base to generate it" + ) + return None + return build_custom() + return None + + +def build_default() -> KnowledgeBaseClient: + name = (os.getenv("KB_ADAPTER") or "local_json").strip().lower() + if name not in _VALID: + logger.warning("KB_ADAPTER=%s is not recognised; using local_json", name) + name = "local_json" + client = _build(name) + if client is None: + from .local_json import from_env as build_local + client = build_local() + return client + + +# --------------------------------------------------------------------------- +_singleton: Optional[KnowledgeBaseClient] = 
None + + +def get_client() -> KnowledgeBaseClient: + global _singleton + if _singleton is None: + _singleton = build_default() + return _singleton + + +def set_client(client: KnowledgeBaseClient) -> None: + """For testing only: inject a custom client.""" + global _singleton + _singleton = client + + +def reset_client() -> None: + global _singleton + _singleton = None diff --git a/skills/trtc-ai-service/capabilities/knowledge-base/src/adapters/local_json.py b/skills/trtc-ai-service/capabilities/knowledge-base/src/adapters/local_json.py new file mode 100644 index 0000000..bee52f0 --- /dev/null +++ b/skills/trtc-ai-service/capabilities/knowledge-base/src/adapters/local_json.py @@ -0,0 +1,172 @@ +"""LocalJsonKbClient — default local implementation. + +Zero external dependencies. Reads FAQ from JSON file + TF-IDF scoring. Migrated from the original retriever.py implementation as the "default out-of-the-box" version. +""" +from __future__ import annotations + +import json +import os +import threading +import time +from pathlib import Path +from typing import Dict, List, Optional + +from ..core.models import FaqEntry, KbStats, SearchHit +from ..core.scoring import build_df, tfidf_score, tokenize +from ..ports.kb_client import KnowledgeBaseClient + + +class LocalJsonKbClient(KnowledgeBaseClient): + """FAQ retriever based on local JSON file.""" + + def __init__( + self, + data_file: Optional[str | Path] = None, + *, + min_score: float = 0.1, + top_k: int = 3, + ) -> None: + self._lock = threading.RLock() + self._entries: List[FaqEntry] = [] + self._df: Dict[str, int] = {} + self._min_score = float(min_score) + self._top_k = int(top_k) + self._data_file: Optional[Path] = ( + Path(data_file) if data_file else None + ) + self._loaded_at: Optional[float] = None + if self._data_file and self._data_file.exists(): + self.reload() + + # ------------------------------------------------------------------ + # KnowledgeBaseClient required implementations + # 
------------------------------------------------------------------ + def search( + self, + query: str, + *, + top_k: Optional[int] = None, + min_score: Optional[float] = None, + ) -> List[SearchHit]: + if not query or not query.strip(): + return [] + k = top_k or self._top_k + threshold = float(min_score) if min_score is not None else self._min_score + q_tokens = tokenize(query) + if not q_tokens: + return [] + with self._lock: + n = max(1, len(self._entries)) + hits: List[SearchHit] = [] + for entry in self._entries: + score = tfidf_score(entry, q_tokens, df=self._df, n_docs=n) + if score >= threshold: + hits.append(SearchHit(entry=entry, score=score)) + hits.sort(key=lambda h: h.score, reverse=True) + return hits[:k] + + def list_all(self) -> List[FaqEntry]: + with self._lock: + return [ + FaqEntry( + id=e.id, + question=e.question, + answer=e.answer, + keywords=list(e.keywords), + source=e.source or "local_json", + ) + for e in self._entries + ] + + def upsert(self, entry: FaqEntry) -> FaqEntry: + if not entry.id or not entry.question: + raise ValueError("id and question are required") + with self._lock: + for i, e in enumerate(self._entries): + if e.id == entry.id: + self._entries[i] = entry + self._rebuild_df() + self._persist() + return entry + self._entries.append(entry) + self._rebuild_df() + self._persist() + return entry + + def delete(self, entry_id: str) -> bool: + with self._lock: + before = len(self._entries) + self._entries = [e for e in self._entries if e.id != entry_id] + removed = before != len(self._entries) + if removed: + self._rebuild_df() + self._persist() + return removed + + def stats(self) -> KbStats: + with self._lock: + return KbStats( + backend="local_json", + entry_count=len(self._entries), + loaded_at=self._loaded_at, + data_source=str(self._data_file) if self._data_file else None, + ) + + def reload(self) -> int: + if not self._data_file or not self._data_file.exists(): + return 0 + raw = 
json.loads(self._data_file.read_text(encoding="utf-8")) + with self._lock: + self._entries = [FaqEntry.from_dict(item) for item in raw] + for e in self._entries: + e.source = e.source or "local_json" + self._rebuild_df() + self._loaded_at = time.time() + return len(self._entries) + + # ------------------------------------------------------------------ + @property + def data_file(self) -> Optional[Path]: + return self._data_file + + @property + def min_score(self) -> float: + return self._min_score + + @property + def top_k(self) -> int: + return self._top_k + + # ------------------------------------------------------------------ + # Internal + # ------------------------------------------------------------------ + def _rebuild_df(self) -> None: + self._df = build_df(self._entries) + + def _persist(self) -> None: + if not self._data_file: + return + # Ensure directory exists + self._data_file.parent.mkdir(parents=True, exist_ok=True) + tmp = self._data_file.with_suffix(self._data_file.suffix + ".tmp") + tmp.write_text( + json.dumps( + [e.to_dict() for e in self._entries], + ensure_ascii=False, + indent=2, + ), + encoding="utf-8", + ) + os.replace(tmp, self._data_file) + + +# --------------------------------------------------------------------------- +# Factory +# --------------------------------------------------------------------------- +def from_env() -> LocalJsonKbClient: + default_file = Path(__file__).resolve().parent.parent.parent / "data" / "faq.json" + return LocalJsonKbClient( + data_file=os.getenv("KB_DATA_FILE", str(default_file)), + min_score=float(os.getenv("KB_MIN_SCORE", "0.1")), + top_k=int(os.getenv("KB_TOP_K", "3")), + ) diff --git a/skills/trtc-ai-service/capabilities/knowledge-base/src/adapters/mock.py b/skills/trtc-ai-service/capabilities/knowledge-base/src/adapters/mock.py new file mode 100644 index 0000000..cd98272 --- /dev/null +++ b/skills/trtc-ai-service/capabilities/knowledge-base/src/adapters/mock.py @@ -0,0 +1,91 @@ +"""MockKbClient — 
mock implementation for Recipe demo recording. + +Inherits LocalJsonKbClient; seeds embedded demo FAQ into memory on construction (no disk file dependency). + +Applicable scenarios: +- User's first Recipe launch works out of the box even before data/faq.json has been filled in +- Stable data for video demos, won't break due to external file changes +""" +from __future__ import annotations + +import os +from typing import List + +from ..core.models import FaqEntry +from .local_json import LocalJsonKbClient + + +_DEMO_FAQ: List[dict] = [ + { + "id": "demo_refund", + "question": "How do I request a refund?", + "answer": "Go to My Orders, pick the item, tap Request refund and fill in the reason. Once approved, the amount returns to your original payment within 1-3 business days.", + "keywords": ["refund", "money back", "return", "cancel order"], + "source": "demo_seed", + }, + { + "id": "demo_logistics", + "question": "How long does shipping take?", + "answer": "Standard items ship within 48 hours. Major cities receive in 3-5 days; remote areas 5-7 days. You can track shipments in real time under My Orders.", + "keywords": ["shipping", "delivery", "tracking", "where is my package", "logistics"], + "source": "demo_seed", + }, + { + "id": "demo_invoice", + "question": "How can I get an invoice?", + "answer": "Tick \"Need invoice\" at checkout and fill in your billing details. For existing orders, open the order in My Orders and tap Request invoice. E-invoices are emailed within 24 hours.", + "keywords": ["invoice", "receipt", "billing", "tax"], + "source": "demo_seed", + }, + { + "id": "demo_size", + "question": "What if the size doesn't fit?", + "answer": "We offer 7-day free size exchange — return shipping is on us. Open My Orders, tap Exchange size and follow the instructions. 
If your size is out of stock you can request a refund and re-order.", + "keywords": ["size", "exchange", "doesn't fit", "wrong size"], + "source": "demo_seed", + }, + { + "id": "demo_after_sale", + "question": "What is the warranty period?", + "answer": "All products carry a 1-year warranty. Electronics may be replaced within 30 days; after 30 days the manufacturer's warranty policy applies. Provide your order ID and a description of the issue and we'll open a ticket for you.", + "keywords": ["warranty", "after-sale", "repair", "broken", "defective"], + "source": "demo_seed", + }, +] + + +class MockKbClient(LocalJsonKbClient): + """Mock implementation for demos.""" + + is_mock = True + + def __init__( + self, + *, + min_score: float = 0.05, # More lenient threshold for demos + top_k: int = 3, + seed_demo_data: bool = True, + ) -> None: + # No data_file specified — data is entirely memory-resident + super().__init__( + data_file=None, + min_score=min_score, + top_k=top_k, + ) + if seed_demo_data: + self._seed() + + def _seed(self) -> None: + for raw in _DEMO_FAQ: + entry = FaqEntry.from_dict(raw) + self._entries.append(entry) + self._rebuild_df() + + +# --------------------------------------------------------------------------- +def from_env() -> MockKbClient: + return MockKbClient( + min_score=float(os.getenv("KB_MIN_SCORE", "0.05")), + top_k=int(os.getenv("KB_TOP_K", "3")), + seed_demo_data=os.getenv("KB_MOCK_SEED", "1") not in ("0", "false", "False"), + ) diff --git a/skills/trtc-ai-service/capabilities/knowledge-base/src/core/__init__.py b/skills/trtc-ai-service/capabilities/knowledge-base/src/core/__init__.py new file mode 100644 index 0000000..69db768 --- /dev/null +++ b/skills/trtc-ai-service/capabilities/knowledge-base/src/core/__init__.py @@ -0,0 +1,12 @@ +"""knowledge-base core module.""" +from .models import FaqEntry, KbStats, SearchHit +from .service import KbService, get_default_service, reset_default_service + +__all__ = [ + "FaqEntry", + "KbService", 
+ "KbStats", + "SearchHit", + "get_default_service", + "reset_default_service", +] diff --git a/skills/trtc-ai-service/capabilities/knowledge-base/src/core/models.py b/skills/trtc-ai-service/capabilities/knowledge-base/src/core/models.py new file mode 100644 index 0000000..20dad44 --- /dev/null +++ b/skills/trtc-ai-service/capabilities/knowledge-base/src/core/models.py @@ -0,0 +1,77 @@ +"""knowledge-base core models. + +Defines unified domain models: +- FaqEntry Knowledge entry (id / question / answer / keywords / source) +- SearchHit Search hit (with score) +- KbStats Knowledge base statistics (entry count / data source type / load time) + +All adapters must use this module's data structures as transport objects. +""" +from __future__ import annotations + +from dataclasses import dataclass, field +from typing import List, Optional + + +@dataclass +class FaqEntry: + """A single FAQ knowledge entry.""" + + id: str + question: str + answer: str + keywords: List[str] = field(default_factory=list) + # Optional: annotate entry source (local_json / remote_api / user_uploaded etc.), useful for dashboard display + source: Optional[str] = None + + def to_dict(self) -> dict: + return { + "id": self.id, + "question": self.question, + "answer": self.answer, + "keywords": list(self.keywords), + **({"source": self.source} if self.source else {}), + } + + @classmethod + def from_dict(cls, raw: dict) -> "FaqEntry": + return cls( + id=str(raw.get("id") or raw.get("question", ""))[:64] or "auto", + question=str(raw.get("question", "")).strip(), + answer=str(raw.get("answer", "")).strip(), + keywords=[ + str(k).strip() + for k in (raw.get("keywords") or []) + if str(k).strip() + ], + source=raw.get("source"), + ) + + +@dataclass +class SearchHit: + """Search hit.""" + + entry: FaqEntry + score: float + + def to_dict(self) -> dict: + return {"entry": self.entry.to_dict(), "score": round(float(self.score), 4)} + + +@dataclass +class KbStats: + """Knowledge base statistics (dashboard 
use).""" + + backend: str # "local_json" / "remote_api" / "mock" / "user_custom" + entry_count: int + loaded_at: Optional[float] = None + data_source: Optional[str] = None + + def to_dict(self) -> dict: + return { + "backend": self.backend, + "entry_count": self.entry_count, + "loaded_at": self.loaded_at, + "data_source": self.data_source, + } diff --git a/skills/trtc-ai-service/capabilities/knowledge-base/src/core/scoring.py b/skills/trtc-ai-service/capabilities/knowledge-base/src/core/scoring.py new file mode 100644 index 0000000..b620a60 --- /dev/null +++ b/skills/trtc-ai-service/capabilities/knowledge-base/src/core/scoring.py @@ -0,0 +1,73 @@ +"""Tokenization + scoring utilities: migrated from original retriever.py. + +No external dependencies, pure Python: +- Chinese: character bigram splitting +- English: word-level splitting +- TF-IDF scoring (keywords weighted 3x) +""" +from __future__ import annotations + +import math +import re +from typing import Dict, Iterable, List + +from .models import FaqEntry + + +_WORD_RE = re.compile(r"[A-Za-z0-9_]+") +_CN_RE = re.compile(r"[\u4e00-\u9fff]+") + + +def tokenize(text: str) -> List[str]: + """Mixed Chinese/English tokenization.""" + if not text: + return [] + text = text.lower() + tokens: List[str] = [] + tokens.extend(_WORD_RE.findall(text)) + for blob in _CN_RE.findall(text): + if len(blob) == 1: + tokens.append(blob) + else: + tokens.extend(blob[i : i + 2] for i in range(len(blob) - 1)) + return tokens + + +def doc_tokens(entry: FaqEntry) -> List[str]: + """Generate token list for scoring a single entry (keywords weighted 3x).""" + toks = tokenize(entry.question) + tokenize(entry.answer) + for kw in entry.keywords: + toks.extend(tokenize(kw) * 3) + return toks + + +def build_df(entries: Iterable[FaqEntry]) -> Dict[str, int]: + """Build document frequency table.""" + df: Dict[str, int] = {} + for e in entries: + for t in set(doc_tokens(e)): + df[t] = df.get(t, 0) + 1 + return df + + +def tfidf_score( + entry: 
FaqEntry, + q_tokens: List[str], + *, + df: Dict[str, int], + n_docs: int, +) -> float: + """TF-IDF score (normalized to [0, 1]).""" + d_tokens = doc_tokens(entry) + if not d_tokens: + return 0.0 + tf: Dict[str, int] = {} + for t in d_tokens: + tf[t] = tf.get(t, 0) + 1 + score = 0.0 + for t in q_tokens: + if t not in tf: + continue + idf = math.log((n_docs + 1) / (1 + df.get(t, 0))) + 1 + score += (tf[t] / len(d_tokens)) * idf + return max(0.0, min(score, 1.0)) diff --git a/skills/trtc-ai-service/capabilities/knowledge-base/src/core/service.py b/skills/trtc-ai-service/capabilities/knowledge-base/src/core/service.py new file mode 100644 index 0000000..963f29e --- /dev/null +++ b/skills/trtc-ai-service/capabilities/knowledge-base/src/core/service.py @@ -0,0 +1,78 @@ +"""KbService — Application service layer, combining KnowledgeBaseClient with instruction concatenation logic. + +Only depends on ports (KnowledgeBaseClient interface); does not know the specific backend. +""" +from __future__ import annotations + +from typing import List, Optional + +from ..ports.kb_client import KnowledgeBaseClient +from .models import FaqEntry, KbStats, SearchHit + + +class KbService: + + def __init__(self, *, client: KnowledgeBaseClient) -> None: + self._client = client + + # ------------------------------------------------------------------ + def search( + self, + query: str, + *, + top_k: Optional[int] = None, + min_score: Optional[float] = None, + ) -> List[SearchHit]: + return self._client.search(query, top_k=top_k, min_score=min_score) + + def list_all(self) -> List[FaqEntry]: + return self._client.list_all() + + def upsert(self, entry: FaqEntry) -> FaqEntry: + return self._client.upsert(entry) + + def delete(self, entry_id: str) -> bool: + return self._client.delete(entry_id) + + def stats(self) -> KbStats: + return self._client.stats() + + def reload(self) -> int: + return self._client.reload() + + # ------------------------------------------------------------------ + # For 
injection into conversation-core.before_start: concatenate search results into instructions + # ------------------------------------------------------------------ + def attach_faq_to_instructions( + self, + instructions: str, + *, + max_hits: int = 3, + ) -> str: + if not instructions: + return instructions + hits = self._client.search(instructions, top_k=max_hits) + if not hits: + return instructions + block = ["", "# Retrieved Knowledge (auto-injected by knowledge-base capability)"] + for h in hits: + block.append(f"- Q: {h.entry.question}") + block.append(f" A: {h.entry.answer}") + return instructions + "\n" + "\n".join(block) + + +# --------------------------------------------------------------------------- +_default_service: Optional[KbService] = None + + +def get_default_service() -> KbService: + global _default_service + if _default_service is None: + from ..adapters.factory import get_client + _default_service = KbService(client=get_client()) + return _default_service + + +def reset_default_service() -> None: + global _default_service + _default_service = None diff --git a/skills/trtc-ai-service/capabilities/knowledge-base/src/ports/__init__.py b/skills/trtc-ai-service/capabilities/knowledge-base/src/ports/__init__.py new file mode 100644 index 0000000..c22f788 --- /dev/null +++ b/skills/trtc-ai-service/capabilities/knowledge-base/src/ports/__init__.py @@ -0,0 +1,4 @@ +"""knowledge-base abstract ports module.""" +from .kb_client import KnowledgeBaseClient + +__all__ = ["KnowledgeBaseClient"] diff --git a/skills/trtc-ai-service/capabilities/knowledge-base/src/ports/kb_client.py b/skills/trtc-ai-service/capabilities/knowledge-base/src/ports/kb_client.py new file mode 100644 index 0000000..bf06060 --- /dev/null +++ b/skills/trtc-ai-service/capabilities/knowledge-base/src/ports/kb_client.py @@ -0,0 +1,61 @@ +"""knowledge-base abstract port (Port). 
+
+Aligned with manifest.yaml.business_contract.external_apis:
+- search    -> faq.search
+- list_all  -> faq.list
+- upsert    -> faq.upsert
+- delete    -> faq.delete
+
+All concrete implementations (local_json / default_rest / mock / user_custom) must inherit this ABC.
+The core layer only depends on this interface.
+"""
+from __future__ import annotations
+
+from abc import ABC, abstractmethod
+from typing import List, Optional
+
+from ..core.models import FaqEntry, KbStats, SearchHit
+
+
+class KnowledgeBaseClient(ABC):
+    """Unified interface contract for knowledge base backends."""
+
+    # ------------------------------------------------------------------
+    # Aligned with business_contract
+    # ------------------------------------------------------------------
+    @abstractmethod
+    def search(
+        self,
+        query: str,
+        *,
+        top_k: Optional[int] = None,
+        min_score: Optional[float] = None,
+    ) -> List[SearchHit]:
+        """Search for matching FAQ entries. Corresponds to business_contract.faq.search."""
+
+    @abstractmethod
+    def list_all(self) -> List[FaqEntry]:
+        """List all entries. Corresponds to business_contract.faq.list."""
+
+    @abstractmethod
+    def upsert(self, entry: FaqEntry) -> FaqEntry:
+        """Create or update a single entry. Corresponds to business_contract.faq.upsert."""
+
+    @abstractmethod
+    def delete(self, entry_id: str) -> bool:
+        """Delete a single entry. Corresponds to business_contract.faq.delete."""
+
+    # ------------------------------------------------------------------
+    # Dashboard helper methods (default implementations; remote backends need not override them)
+    # ------------------------------------------------------------------
+    def stats(self) -> KbStats:
+        """Return statistics (defaults to live calculation based on list_all)."""
+        items = self.list_all()
+        return KbStats(
+            backend=type(self).__name__,
+            entry_count=len(items),
+        )
+
+    def reload(self) -> int:
+        """Reload data from external source. Default reloads nothing and returns the current entry count; local implementations can override to re-read files."""
+        return len(self.list_all())
diff --git a/skills/trtc-ai-service/capabilities/knowledge-base/src/retriever.py b/skills/trtc-ai-service/capabilities/knowledge-base/src/retriever.py
new file mode 100644
index 0000000..cea5706
--- /dev/null
+++ b/skills/trtc-ai-service/capabilities/knowledge-base/src/retriever.py
@@ -0,0 +1,56 @@
+"""retriever.py: compatibility facade.
+
+Keeps the original `attach_faq_to_instructions` / `get_retriever` / `FaqEntry` / `SearchHit`
+public symbols so that manifest.extensions (agent.before_start) and external callers can keep using them.
+
+New code should call these directly:
+- adapters.factory.get_client()          Get KnowledgeBaseClient instance
+- core.service.get_default_service()     Get KbService instance
+"""
+from __future__ import annotations
+
+import warnings
+
+from .core.models import FaqEntry, SearchHit  # noqa: F401 (public API)
+from .core.service import get_default_service
+
+
+def attach_faq_to_instructions(instructions: str) -> str:
+    """For use by conversation-core.before_start injection point.
+
+    Keeps old signature: single instructions parameter, returns concatenated instructions.
+    """
+    return get_default_service().attach_faq_to_instructions(instructions)
+
+
+# --------------------------------------------------------------------
+# Deprecated shim: original FaqRetriever global instance / class
+# --------------------------------------------------------------------
+def get_retriever():
+    """[DEPRECATED] Return KnowledgeBaseClient instance.
+
+    Old FaqRetriever class method names (list_entries/upsert/delete/search/reload)
+    have corresponding methods on the new client (list_all/upsert/delete/search/reload),
+    though one method is renamed (list_entries -> list_all).
+ """ + warnings.warn( + "knowledge_base.retriever.get_retriever() is deprecated; " + "use adapters.factory.get_client() or core.service.get_default_service() instead.", + DeprecationWarning, + stacklevel=2, + ) + from .adapters.factory import get_client + return get_client() + + +# Legacy alias +FaqRetriever = "FaqRetriever (deprecated; use adapters.local_json.LocalJsonKbClient)" + + +__all__ = [ + "FaqEntry", + "FaqRetriever", + "SearchHit", + "attach_faq_to_instructions", + "get_retriever", +] diff --git a/skills/trtc-ai-service/capabilities/knowledge-base/src/router.py b/skills/trtc-ai-service/capabilities/knowledge-base/src/router.py new file mode 100644 index 0000000..25e0737 --- /dev/null +++ b/skills/trtc-ai-service/capabilities/knowledge-base/src/router.py @@ -0,0 +1,85 @@ +"""knowledge-base FastAPI sub-router. + +Mounted on skeleton: app.include_router(router, prefix="/api/v1/kb") + +Refactoring notes: +- All business logic delegated to core.service.KbService +- Response fields remain fully consistent with Phase 2 (using SearchHit.to_dict / FaqEntry.to_dict) +""" +from __future__ import annotations + +import re +from typing import List, Optional + +from fastapi import APIRouter, HTTPException +from pydantic import BaseModel, Field + +from .core.models import FaqEntry +from .core.service import get_default_service + + +router = APIRouter() + + +_HTML_TAG_RE = re.compile(r"<[^>]+>") + + +def _strip_html(s: str) -> str: + return _HTML_TAG_RE.sub("", s).strip() + + +class UpsertRequest(BaseModel): + id: str = Field(..., max_length=64) + question: str = Field(..., max_length=1024) + answer: str = Field(..., max_length=4096) + keywords: List[str] = Field(default_factory=list) + + +class SearchRequest(BaseModel): + query: str = Field(..., max_length=256) + top_k: Optional[int] = Field(default=None, ge=1, le=20) + + +@router.get("/list") +def list_entries() -> dict: + items = get_default_service().list_all() + return {"code": 0, "data": [e.to_dict() for e in 
items]} + + +@router.post("/search") +def search(req: SearchRequest) -> dict: + hits = get_default_service().search(req.query, top_k=req.top_k) + return {"code": 0, "data": [h.to_dict() for h in hits]} + + +@router.post("/upsert") +def upsert(req: UpsertRequest) -> dict: + entry = FaqEntry( + id=req.id.strip(), + question=_strip_html(req.question), + answer=_strip_html(req.answer), + keywords=[_strip_html(k) for k in (req.keywords or []) if k.strip()], + ) + if not entry.id or not entry.question: + raise HTTPException(status_code=400, detail="id and question are required") + saved = get_default_service().upsert(entry) + return {"code": 0, "data": saved.to_dict()} + + +@router.delete("/{entry_id}") +def delete(entry_id: str) -> dict: + ok = get_default_service().delete(entry_id) + if not ok: + raise HTTPException(status_code=404, detail=f"entry not found: {entry_id}") + return {"code": 0, "data": {"deleted": entry_id}} + + +@router.post("/reload") +def reload_data() -> dict: + n = get_default_service().reload() + return {"code": 0, "data": {"count": n}} + + +@router.get("/stats") +def stats() -> dict: + return {"code": 0, "data": get_default_service().stats().to_dict()} diff --git a/skills/trtc-ai-service/capabilities/session-summary/INTERFACE_ADAPT.md b/skills/trtc-ai-service/capabilities/session-summary/INTERFACE_ADAPT.md new file mode 100644 index 0000000..26d4088 --- /dev/null +++ b/skills/trtc-ai-service/capabilities/session-summary/INTERFACE_ADAPT.md @@ -0,0 +1,99 @@ +# session-summary Interface Adaptation SOP + +> Session summary + structured summary write-back to CRM / ticketing / data platform. +> This capability's source code has not been refactored to ports/adapters/core in this release (Phase 1 compromise). + +--- + +## 1. 
Default Contract Overview + +| Contract | Method | Path | Purpose | +|---|---|---|---| +| `summary.write_to_crm` | POST | `/sessions/{session_id}/summary` | Write finalized summary to user's CRM / ticketing system | +| `summary.llm_summarize` | — | Reuses conversation-core `llm.chat_completions` | LLM secondary summarization, no separate adaptation needed | + +Full field details in `manifest.yaml.business_contract.external_apis`. + +--- + +## 2. Default Behavior + +session-summary currently **only writes to local files by default** (path `data/.json`), +and does not actively write to any remote system. This means: + +- Integrators need to scan the `data/` directory themselves or call `/api/v1/summary/{session_id}` to pull +- No outbound calls are enabled, keeping security risk low +- Suitable as a "draft state"; integrators plug into business systems on demand + +--- + +## 3. Enabling CRM Write-Back + +### 3.1 Configure Default Contract + +```bash +# Enable CRM write-back + user's CRM API fully matches default contract +export SS_CRM_WRITE_ENABLED=1 +export SS_CRM_BASE_URL=https://crm.example.com +export SS_CRM_TOKEN=sk-xxx +``` + +> The current capability source code has **not implemented** the `SS_CRM_*` environment variable logic; +> you need to write an adapter subclass per §4 below before enabling. After Phase 4's full refactor, this will become out-of-the-box. + +### 3.2 Custom Field Mapping + +If the user's CRM API has different field names (e.g. `summary` / `priority` / `tags`), write a simple +webhook middleware layer or append an HTTP call at the end of `recorder.py`'s `finalize_session` function. 
+
+Reference code snippet (manually append to `src/recorder.py`'s finalize section):
+
+```python
+import os, requests
+def _maybe_write_crm(session_id: str, summary_payload: dict):
+    base = os.getenv("SS_CRM_BASE_URL")
+    if not base or os.getenv("SS_CRM_WRITE_ENABLED") != "1":
+        return
+    # Field remapping example
+    body = {
+        "session_id": session_id,
+        "summary": summary_payload.get("topic"),      # User's field is called summary
+        "priority": summary_payload.get("outcome"),   # User's field is called priority
+        "tags": summary_payload.get("next_actions"),
+    }
+    requests.post(
+        f"{base}/sessions/{session_id}/summary",
+        json=body,
+        headers={"Authorization": f"Bearer {os.getenv('SS_CRM_TOKEN', '')}"},
+        timeout=5,
+    )
+```
+
+---
+
+## 4. Phase 4 Plan: Full ports/adapters Refactor
+
+The following layout is planned for a future release:
+
+```
+capabilities/session-summary/src/
+├── ports/
+│   └── crm_client.py          # ABC: write_summary / query_summary
+└── adapters/
+    ├── local_file.py          # Default implementation: local disk only (current behavior)
+    ├── default_rest.py        # Calls per default CRM contract
+    ├── mock.py                # Mock for demos
+    └── user_custom.py         # Generated by the user integration wizard
+```
+
+Once that refactor lands, setting `SS_ADAPTER=user_custom` will switch adapters directly, without manual changes to `recorder.py`.
+
+---
+
+## 5. Security Checklist
+
+- [ ] CRM write-back auto-redacts sensitive fields (`secret_id` / `api_key` / `token` etc.)
+- [ ] `SS_CRM_BASE_URL` must use https:// (localhost excepted)
+- [ ] Reject private network addresses
+- [ ] Disk file permissions enforced to 0600 (declared in manifest.security.storage)
+- [ ] Summary transcripts filter out sensitive user PII (phone / ID numbers); the current capability has **not implemented** this, so the user's business layer must handle it
diff --git a/skills/trtc-ai-service/capabilities/session-summary/README.md b/skills/trtc-ai-service/capabilities/session-summary/README.md
new file mode 100644
index 0000000..a351c35
--- /dev/null
+++ b/skills/trtc-ai-service/capabilities/session-summary/README.md
@@ -0,0 +1,47 @@
+# session-summary · Session Summary + Structured Abstract
+
+> Auto-archives turn records for each session on top of conversation-core,
+> and produces structured summaries (topics / user_intents / next_actions) after calling `finalize`.
+
+## Install
+
+```bash
+python scripts/add-capability.py session-summary
+```
+
+## Configuration
+
+| Env Variable | Default | Description |
+|:---|:---|:---|
+| `SS_STORAGE_DIR` | `capabilities/session-summary/data/` | Summary landing directory (permissions 0600) |
+| `SS_RETENTION_DAYS` | `30` | Retention days; auto-cleanup after expiry |
+| `SS_LLM_SUMMARY` | `true` | Whether to call LLM for secondary summarization (depends on `LLM_API_KEY`) |
+
+## REST API
+
+| Method | Path | Purpose |
+|:---|:---|:---|
+| GET | `/api/v1/summary/_list?_offset=0&_limit=20` | Recent summary list |
+| GET | `/api/v1/summary/{session_id}` | Single session summary details |
+| POST | `/api/v1/summary/{session_id}/finalize` | Close session and trigger summary |
+
+## Summary Output
+
+```json
+{
+  "topics": ["order", "shipping"],
+  "user_intents": ["When will my order ship?"],
+  "next_actions": ["Please update my address"],
+  "highlights": ["12 turns recorded"],
+  "engine": "heuristic",
+  "model": null
+}
+```
+
+If the LLM path fails, the capability falls back to the local heuristic implementation, so summaries remain available offline.
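The fallback behavior described above can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the capability's actual code: `summarize_turns`, `_heuristic_summary`, and the `llm_summarize` callable are hypothetical names, and the question-mark and "please" rules are invented here only to produce the output shape shown in the JSON example.

```python
from typing import Callable, Dict, List, Optional

Turn = Dict[str, str]  # assumed shape: {"role": "user" | "assistant", "text": "..."}


def _heuristic_summary(turns: List[Turn]) -> dict:
    """Hypothetical offline fallback: no network access, no LLM."""
    user_texts = [t["text"] for t in turns if t.get("role") == "user"]
    return {
        "topics": [],  # topic extraction is left out of this sketch
        # Assumed rule: user sentences ending in "?" count as intents.
        "user_intents": [t for t in user_texts if t.rstrip().endswith("?")],
        # Assumed rule: user sentences starting with "please" count as actions.
        "next_actions": [t for t in user_texts if t.lower().startswith("please")],
        "highlights": [f"{len(turns)} turns recorded"],
        "engine": "heuristic",
        "model": None,
    }


def summarize_turns(
    turns: List[Turn],
    llm_summarize: Optional[Callable[[List[Turn]], dict]] = None,
) -> dict:
    """Try the LLM path first; fall back to the heuristic on any failure."""
    if llm_summarize is not None:
        try:
            return llm_summarize(turns)
        except Exception:
            # Assumed policy: any LLM error degrades to the local heuristic.
            pass
    return _heuristic_summary(turns)
```

Passing the LLM call in as a plain callable keeps the sketch testable without credentials; omitting it, or passing one that raises, exercises the offline path.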
+
+## Security
+
+- Disk file permissions enforced to `0600`
+- Sensitive fields (`secret_id / api_key / token / credential`) redacted before writing
+- Transcript text max length `4096`
diff --git a/skills/trtc-ai-service/capabilities/session-summary/data/test_session.json b/skills/trtc-ai-service/capabilities/session-summary/data/test_session.json
new file mode 100644
index 0000000..0d231f0
--- /dev/null
+++ b/skills/trtc-ai-service/capabilities/session-summary/data/test_session.json
@@ -0,0 +1,18 @@
+{
+  "session_id": "test_session",
+  "opened_at": 1780905036.267712,
+  "closed_at": null,
+  "turns": [
+    {
+      "role": "user",
+      "text": "Can you check the delivery status of order A1234?",
+      "ts": 1780905036.26798
+    },
+    {
+      "role": "assistant",
+      "text": "Usually 3-5 business days.",
+      "ts": 1780905036.2682061
+    }
+  ],
+  "summary": null
+}
\ No newline at end of file
diff --git a/skills/trtc-ai-service/capabilities/session-summary/manifest.yaml b/skills/trtc-ai-service/capabilities/session-summary/manifest.yaml
new file mode 100644
index 0000000..3a3e4d7
--- /dev/null
+++ b/skills/trtc-ai-service/capabilities/session-summary/manifest.yaml
@@ -0,0 +1,165 @@
+# session-summary capability self-describing manifest
+# Type: capability (optional install)
+
+name: "session-summary"
+version: "1.0.0"
+type: "capability"
+description: "Automatic session summarization + structured summary output (turn recording / key fact extraction / LLM secondary summarization)"
+
+dependencies:
+  - name: "conversation-core"
+    version: ">=1.0.0,<2.0.0"
+
+# Optional integration (soft dependency, not hard dependency):
+# When human-handoff is also installed, ticket creation will auto-trigger this capability
+# to generate a summary for that session and attach it to the ticket. Agents can see the
+# customer issue context directly in the dashboard ticket details (no need to manually click "generate summary").
+# The integration is implemented as a best-effort call from human-handoff (capabilities/human-handoff/src/summary_link.py); +# when this capability is not installed, the call silently skips without affecting the handoff main flow. +optional_integrations: + - capability: "human-handoff" + trigger: "ticket.create" + behavior: "auto-finalize session summary (heuristic) and embed into ticket payload" + +extensions: + # Archive as a user turn before text injection + - inject_at: "agent.before_push_text" + inline_code: | + # [session-summary] record user turn + from ._capability_loader import try_load_capability + _ss_recorder = try_load_capability("session-summary", "src/recorder.py") + if _ss_recorder is not None: + _ss_recorder.record_user_turn(session_id, text) + - inject_at: "agent.after_start" + inline_code: | + # [session-summary] open session + from ._capability_loader import try_load_capability + _ss_recorder = try_load_capability("session-summary", "src/recorder.py") + if _ss_recorder is not None: + _ss_recorder.open_session(session_id) + - inject_at: "server.router_extension" + inline_code: | + # [session-summary] mount sub-router + from ._capability_loader import try_load_capability as _try_load_capability + _ss_router_mod = _try_load_capability("session-summary", "src/router.py") + if _ss_router_mod is not None and hasattr(_ss_router_mod, "router"): + app.include_router( + _ss_router_mod.router, prefix="/api/v1/summary", tags=["session-summary"] + ) + +config: + adapter: + description: "Summary write-back target: mock (default, no external deps) | local_json | default_rest" + default: "mock" + env: "SS_ADAPTER" + options: + mock: "Log-only + returns mock record_id, suitable for local / Path A demos" + local_json: "Append to local JSONL (data/_writeback.jsonl)" + default_rest: "POST to real CRM / ticketing system (requires SS_REST_BASE_URL[/SS_REST_TOKEN])" + storage_dir: + description: "Summary file landing directory (one JSON per session)" + default: 
"capabilities/session-summary/data/" + llm_summary: + description: "Whether to enable LLM secondary summarization (consumes LLM_API_KEY; set false for offline scenarios)" + default: true + retention_days: + description: "Summary retention days; auto-cleanup after expiry" + default: 30 + redact_patterns: + description: "Fields to auto-redact before writing summaries (same standard as skeleton log_redaction)" + default: + - "secret_id" + - "secret_key" + - "api_key" + - "token" + +endpoints: + - method: GET + path: /api/v1/summary/{session_id} + description: Get single session summary (includes turn list and structured summary) + - method: POST + path: /api/v1/summary/{session_id}/finalize + description: Close session and trigger LLM secondary summarization + - method: GET + path: /api/v1/summary/_list + description: List most recent N summaries (_offset / _limit query) + +integration: + mode: "auto" + auto_adapters: + - tech_stack: ["express", "koa", "fastify"] + adapter: "node-backend" + - tech_stack: ["flask", "fastapi", "django"] + adapter: "python-backend" + - tech_stack: ["react", "vue", "angular"] + adapter: "frontend-spa" + description: "Append summary viewer panel to SPA" + fallback: + guided_templates: + - "../../auto_adapters/integration_templates/generic-backend.md" + manual_api: + rest_endpoint: "/api/v1/summary" + +security: + log_redaction: + enabled: true + patterns: ["secret_id", "secret_key", "api_key", "token", "credential"] + storage: + file_permission: "0600" + +# --------------------------------------------------------------------------- +# Business contract (Phase 3 new; follows references/business-contract-spec.md v1.0) +# Note: Source code for this capability is not refactored in this phase (compromise strategy); +# only declares the external contract for contract-adapt.py consumption. 
+# --------------------------------------------------------------------------- +business_contract: + port_class: "src.adapters.base.SummarySink" + default_adapter: "src.adapters.default_rest.DefaultRestSink" + mock_adapter: "src.adapters.mock.MockSink" + customization_sop: "INTERFACE_ADAPT.md" + external_apis: + # CRM / Ticketing: write session summary back + - name: summary.write_to_crm + direction: outbound + method: POST + path: /sessions/{session_id}/summary + description: "After session finalize, write structured summary to user's CRM / ticketing / data platform" + request_schema: + session_id: string + user_id: string + topic: string + outcome: enum[resolved, transferred, abandoned, follow_up] + next_actions: string[] + full_transcript: string[] + structured_facts: object + finalized_at: int + response_schema: + record_id: string + accepted: bool + adapter_slots: + - request.topic + - request.outcome + - request.next_actions + - request.structured_facts + - response.record_id + auth: + type: bearer + location: header + name: Authorization + timeout_ms: 5000 + + # Secondary summarization call (reuses conversation-core LLM interface; declared here only as reference) + - name: summary.llm_summarize + direction: outbound + description: "Reuses conversation-core llm.chat_completions interface; + uses same LLM_API_URL / LLM_API_KEY config; no separate adaptation needed." 
+ method: POST + path: "(see conversation-core: llm.chat_completions)" + request_schema: {} + response_schema: {} + adapter_slots: [] + +acceptance: + - "Each user/assistant turn persisted as JSONL" + - "After calling finalize, produce structured summary (includes topics / next_actions)" + - "File permissions set to 0600, sensitive fields auto-redacted" diff --git a/skills/trtc-ai-service/capabilities/session-summary/src/__init__.py b/skills/trtc-ai-service/capabilities/session-summary/src/__init__.py new file mode 100644 index 0000000..742f2f6 --- /dev/null +++ b/skills/trtc-ai-service/capabilities/session-summary/src/__init__.py @@ -0,0 +1,2 @@ +"""session-summary capability: session summary + structured abstract.""" +__version__ = "1.0.0" diff --git a/skills/trtc-ai-service/capabilities/session-summary/src/adapters/__init__.py b/skills/trtc-ai-service/capabilities/session-summary/src/adapters/__init__.py new file mode 100644 index 0000000..c2a7830 --- /dev/null +++ b/skills/trtc-ai-service/capabilities/session-summary/src/adapters/__init__.py @@ -0,0 +1,5 @@ +"""session-summary write-back adapter layer.""" +from .base import SummarySink +from .factory import get_sink + +__all__ = ["SummarySink", "get_sink"] diff --git a/skills/trtc-ai-service/capabilities/session-summary/src/adapters/base.py b/skills/trtc-ai-service/capabilities/session-summary/src/adapters/base.py new file mode 100644 index 0000000..d0b8ca0 --- /dev/null +++ b/skills/trtc-ai-service/capabilities/session-summary/src/adapters/base.py @@ -0,0 +1,31 @@ +"""session-summary write-back adapter layer. + +Consistent "three-tier adapter + factory" paradigm matching knowledge-base / human-handoff: + mock — Default; summary only logged + returns mock record_id, no external system needed, + allows local / Path A demos to immediately see "write-back success". + local_json — Append to local JSONL (data/_writeback.jsonl), can be checked offline. 
+ default_rest — POST to user's real CRM / ticketing system (SS_REST_BASE_URL). + +External contract see manifest.business_contract.external_apis[summary.write_to_crm]. +When interface fields don't align, refer to INTERFACE_ADAPT.md for request/response mapping. +""" +from __future__ import annotations + +import abc +from typing import Any, Dict + + +class SummarySink(abc.ABC): + """Unified abstraction for session summary write-back targets.""" + + name: str = "base" + + @abc.abstractmethod + def write(self, summary_record: Dict[str, Any]) -> Dict[str, Any]: + """Write a finalized summary record to the target system. + + Returns + ------- + dict: {"record_id": str, "accepted": bool, "sink": str} + """ + raise NotImplementedError diff --git a/skills/trtc-ai-service/capabilities/session-summary/src/adapters/default_rest.py b/skills/trtc-ai-service/capabilities/session-summary/src/adapters/default_rest.py new file mode 100644 index 0000000..a3f845e --- /dev/null +++ b/skills/trtc-ai-service/capabilities/session-summary/src/adapters/default_rest.py @@ -0,0 +1,67 @@ +"""default_rest write-back sink — POST to user's real CRM / ticketing system. + +Environment variables: + SS_REST_BASE_URL Required, e.g. https://crm.example.com/api + SS_REST_TOKEN Optional, Bearer Token + SS_REST_TIMEOUT_MS Optional, default 5000 + +Aligned with manifest.business_contract.external_apis[summary.write_to_crm]: + POST {base}/sessions/{session_id}/summary +Adjust _build_payload in this file or refer to INTERFACE_ADAPT.md when fields don't align. 
+""" +from __future__ import annotations + +import logging +import os +from typing import Any, Dict + +from .base import SummarySink + +logger = logging.getLogger(__name__) + + +class DefaultRestSink(SummarySink): + name = "default_rest" + + def __init__(self) -> None: + self._base = (os.getenv("SS_REST_BASE_URL") or "").rstrip("/") + self._token = os.getenv("SS_REST_TOKEN") or "" + self._timeout = max(int(os.getenv("SS_REST_TIMEOUT_MS", "5000")), 100) / 1000.0 + if not self._base: + raise RuntimeError("SS_REST_BASE_URL not configured for default_rest sink") + # Security: forbid plaintext http (except localhost debugging), prevent credentials / summaries over unencrypted channels + if not self._base.startswith("https://") and "localhost" not in self._base: + raise RuntimeError(f"SS_REST_BASE_URL must use HTTPS: {self._base}") + + def _build_payload(self, rec: Dict[str, Any]) -> Dict[str, Any]: + summary = rec.get("summary") or {} + return { + "session_id": rec.get("session_id"), + "topic": (summary.get("topics") or [None])[0], + "outcome": summary.get("outcome", "follow_up"), + "next_actions": summary.get("next_actions") or [], + "full_transcript": [t.get("text", "") for t in rec.get("turns") or []], + "structured_facts": summary, + "finalized_at": rec.get("closed_at"), + } + + def write(self, summary_record: Dict[str, Any]) -> Dict[str, Any]: + import requests + + sid = summary_record.get("session_id", "") + url = f"{self._base}/sessions/{sid}/summary" + headers = {"Content-Type": "application/json"} + if self._token: + headers["Authorization"] = f"Bearer {self._token}" + resp = requests.post(url, json=self._build_payload(summary_record), + headers=headers, timeout=self._timeout) + resp.raise_for_status() + try: + data = resp.json() + except ValueError: + data = {} + return { + "record_id": data.get("record_id", ""), + "accepted": bool(data.get("accepted", True)), + "sink": self.name, + } diff --git 
a/skills/trtc-ai-service/capabilities/session-summary/src/adapters/factory.py b/skills/trtc-ai-service/capabilities/session-summary/src/adapters/factory.py new file mode 100644 index 0000000..1632e8f --- /dev/null +++ b/skills/trtc-ai-service/capabilities/session-summary/src/adapters/factory.py @@ -0,0 +1,51 @@ +"""Write-back sink factory — selects implementation by env SS_ADAPTER (consistent with KB/handoff factory paradigm). + + SS_ADAPTER=mock Default; no external dependencies + SS_ADAPTER=local_json Write local JSONL + SS_ADAPTER=default_rest POST to real CRM (requires SS_REST_BASE_URL) + +When any implementation initialization fails (e.g. default_rest missing base_url), safely degrades to mock, +ensuring the finalize flow is never interrupted due to unavailable write-back targets. +""" +from __future__ import annotations + +import logging +import os +import threading +from typing import Optional + +from .base import SummarySink +from .mock import MockSink + +logger = logging.getLogger(__name__) + +_lock = threading.RLock() +_instance: Optional[SummarySink] = None +_instance_key: Optional[str] = None + + +def _build(name: str) -> SummarySink: + name = (name or "mock").strip().lower() + if name == "local_json": + from .local_json import LocalJsonSink + return LocalJsonSink() + if name == "default_rest": + from .default_rest import DefaultRestSink + return DefaultRestSink() + return MockSink() + + +def get_sink() -> SummarySink: + """Return the currently configured write-back sink (cached by SS_ADAPTER; rebuilds on env change).""" + global _instance, _instance_key + key = (os.getenv("SS_ADAPTER", "mock") or "mock").strip().lower() + with _lock: + if _instance is not None and _instance_key == key: + return _instance + try: + _instance = _build(key) + except Exception as exc: # noqa: BLE001 + logger.warning("session-summary sink '%s' init failed, fallback to mock: %s", key, exc) + _instance = MockSink() + _instance_key = key + return _instance diff --git 
a/skills/trtc-ai-service/capabilities/session-summary/src/adapters/local_json.py b/skills/trtc-ai-service/capabilities/session-summary/src/adapters/local_json.py new file mode 100644 index 0000000..de38c5a --- /dev/null +++ b/skills/trtc-ai-service/capabilities/session-summary/src/adapters/local_json.py @@ -0,0 +1,42 @@ +"""local_json write-back sink — append to local JSONL for offline verification.""" +from __future__ import annotations + +import hashlib +import json +import os +import threading +import time +from pathlib import Path +from typing import Any, Dict + +from .base import SummarySink + +_LOCK = threading.RLock() + + +class LocalJsonSink(SummarySink): + name = "local_json" + + def __init__(self) -> None: + base = os.getenv( + "SS_STORAGE_DIR", + str(Path(__file__).resolve().parents[2] / "data"), + ) + self._file = Path(base) / "_writeback.jsonl" + self._file.parent.mkdir(parents=True, exist_ok=True) + + def write(self, summary_record: Dict[str, Any]) -> Dict[str, Any]: + sid = str(summary_record.get("session_id", "")) + record_id = "CRM-LOCAL-" + hashlib.md5(sid.encode("utf-8")).hexdigest()[:10].upper() + line = json.dumps( + {"record_id": record_id, "written_at": int(time.time()), "record": summary_record}, + ensure_ascii=False, + ) + with _LOCK: + with self._file.open("a", encoding="utf-8") as f: + f.write(line + "\n") + try: + os.chmod(self._file, 0o600) + except OSError: + pass + return {"record_id": record_id, "accepted": True, "sink": self.name} diff --git a/skills/trtc-ai-service/capabilities/session-summary/src/adapters/mock.py b/skills/trtc-ai-service/capabilities/session-summary/src/adapters/mock.py new file mode 100644 index 0000000..ec97a7b --- /dev/null +++ b/skills/trtc-ai-service/capabilities/session-summary/src/adapters/mock.py @@ -0,0 +1,22 @@ +"""mock write-back sink — default implementation, no external dependencies, suitable for demos.""" +from __future__ import annotations + +import hashlib +import logging +from typing import Any, 
Dict + +from .base import SummarySink + +logger = logging.getLogger(__name__) + + +class MockSink(SummarySink): + name = "mock" + + def write(self, summary_record: Dict[str, Any]) -> Dict[str, Any]: + sid = str(summary_record.get("session_id", "")) + record_id = "CRM-MOCK-" + hashlib.md5(sid.encode("utf-8")).hexdigest()[:10].upper() + topics = (summary_record.get("summary") or {}).get("topics") or [] + logger.info("[session-summary][mock] write-back session=%s topics=%s -> %s", + sid, topics[:3], record_id) + return {"record_id": record_id, "accepted": True, "sink": self.name, "_mock": True} diff --git a/skills/trtc-ai-service/capabilities/session-summary/src/recorder.py b/skills/trtc-ai-service/capabilities/session-summary/src/recorder.py new file mode 100644 index 0000000..78e18f5 --- /dev/null +++ b/skills/trtc-ai-service/capabilities/session-summary/src/recorder.py @@ -0,0 +1,210 @@ +"""Session turn recorder. + +Persistence strategy: +- Each session generates ``{session_id}.json`` under storage_dir, permissions 0600. +- Sensitive fields redacted before writing (same standard as skeleton log_filter). 
+- File structure: + { + "session_id": "...", + "opened_at": 1717830000.0, + "closed_at": null, + "turns": [ + {"role": "user", "ts": 1717830001.0, "text": "..."}, + {"role": "assistant", "ts": 1717830002.0, "text": "..."} + ], + "summary": null # Populated after finalize + } +""" +from __future__ import annotations + +import json +import os +import re +import threading +import time +from dataclasses import dataclass, field +from pathlib import Path +from typing import Any, Dict, List, Optional + + +_DEFAULT_STORAGE_DIR = Path( + os.getenv( + "SS_STORAGE_DIR", + str(Path(__file__).resolve().parent.parent / "data"), + ) +) +_RETENTION_DAYS = int(os.getenv("SS_RETENTION_DAYS", "30")) + + +# Default redaction mode (aligned with skeleton log_filter) +_REDACT_PATTERNS = [ + re.compile(r"(?i)(secret_id|secret_key|api_key|token|credential|authorization)\s*[:=]\s*([^\s,'\"\\]+)"), +] + + +def _redact(text: str) -> str: + if not text: + return text + out = text + for pat in _REDACT_PATTERNS: + out = pat.sub(lambda m: f"{m.group(1)}=***", out) + return out + + +@dataclass +class Turn: + role: str + text: str + ts: float = field(default_factory=time.time) + + def to_dict(self) -> dict: + return {"role": self.role, "text": self.text, "ts": self.ts} + + +@dataclass +class SessionRecord: + session_id: str + opened_at: float = field(default_factory=time.time) + closed_at: Optional[float] = None + turns: List[Turn] = field(default_factory=list) + summary: Optional[Dict[str, Any]] = None + + def to_dict(self) -> dict: + return { + "session_id": self.session_id, + "opened_at": self.opened_at, + "closed_at": self.closed_at, + "turns": [t.to_dict() for t in self.turns], + "summary": self.summary, + } + + +class Recorder: + def __init__(self, storage_dir: Optional[Path] = None) -> None: + self._lock = threading.RLock() + self._dir = Path(storage_dir or _DEFAULT_STORAGE_DIR) + self._dir.mkdir(parents=True, exist_ok=True) + self._cache: Dict[str, SessionRecord] = {} + 
self._cleanup_old_files() + + @property + def storage_dir(self) -> Path: + return self._dir + + def open(self, session_id: str) -> SessionRecord: + with self._lock: + rec = self._cache.get(session_id) + if rec is not None: + return rec + path = self._path(session_id) + if path.exists(): + rec = self._load(path) + else: + rec = SessionRecord(session_id=session_id) + self._cache[session_id] = rec + self._persist(rec) + return rec + + def add_turn(self, session_id: str, role: str, text: str) -> None: + if not text or not text.strip(): + return + if role not in ("user", "assistant", "system", "tool"): + return + with self._lock: + rec = self._cache.get(session_id) or self.open(session_id) + rec.turns.append(Turn(role=role, text=_redact(text)[:4096])) + self._persist(rec) + + def get(self, session_id: str) -> Optional[SessionRecord]: + with self._lock: + rec = self._cache.get(session_id) + if rec is not None: + return rec + path = self._path(session_id) + if path.exists(): + rec = self._load(path) + self._cache[session_id] = rec + return rec + return None + + def list_recent(self, offset: int = 0, limit: int = 20) -> List[Dict[str, Any]]: + files = sorted( + self._dir.glob("*.json"), + key=lambda p: p.stat().st_mtime, + reverse=True, + )[offset : offset + limit] + out: List[Dict[str, Any]] = [] + for p in files: + try: + out.append(json.loads(p.read_text(encoding="utf-8"))) + except (OSError, json.JSONDecodeError): + continue + return out + + def finalize(self, session_id: str, summary: Dict[str, Any]) -> SessionRecord: + with self._lock: + rec = self.get(session_id) + if rec is None: + raise ValueError(f"session not found: {session_id}") + rec.closed_at = time.time() + rec.summary = summary + self._persist(rec) + return rec + + # ------------------------------------------------------------------ + def _path(self, session_id: str) -> Path: + safe = re.sub(r"[^A-Za-z0-9_\-]", "_", session_id)[:64] + return self._dir / f"{safe}.json" + + def _load(self, path: Path) -> 
SessionRecord: + data = json.loads(path.read_text(encoding="utf-8")) + rec = SessionRecord( + session_id=data.get("session_id", path.stem), + opened_at=float(data.get("opened_at", time.time())), + closed_at=data.get("closed_at"), + summary=data.get("summary"), + ) + for t in data.get("turns") or []: + rec.turns.append(Turn(role=t.get("role", "user"), text=t.get("text", ""), ts=float(t.get("ts", 0)))) + return rec + + def _persist(self, rec: SessionRecord) -> None: + path = self._path(rec.session_id) + tmp = path.with_suffix(".json.tmp") + tmp.write_text(json.dumps(rec.to_dict(), ensure_ascii=False, indent=2), encoding="utf-8") + os.replace(tmp, path) + try: + os.chmod(path, 0o600) + except OSError: + pass + + def _cleanup_old_files(self) -> None: + cutoff = time.time() - _RETENTION_DAYS * 86400 + for p in self._dir.glob("*.json"): + try: + if p.stat().st_mtime < cutoff: + p.unlink() + except OSError: + continue + + +# --------------------------------------------------------------------------- +# Global singleton + manifest.extensions injection functions +# --------------------------------------------------------------------------- +_global_recorder = Recorder() + + +def get_recorder() -> Recorder: + return _global_recorder + + +def open_session(session_id: str) -> None: + _global_recorder.open(session_id) + + +def record_user_turn(session_id: str, text: str) -> None: + _global_recorder.add_turn(session_id, "user", text) + + +def record_assistant_turn(session_id: str, text: str) -> None: + _global_recorder.add_turn(session_id, "assistant", text) diff --git a/skills/trtc-ai-service/capabilities/session-summary/src/router.py b/skills/trtc-ai-service/capabilities/session-summary/src/router.py new file mode 100644 index 0000000..803b946 --- /dev/null +++ b/skills/trtc-ai-service/capabilities/session-summary/src/router.py @@ -0,0 +1,93 @@ +"""session-summary FastAPI sub-router.""" +from __future__ import annotations + +import os +from typing import List, Optional + +from 
fastapi import APIRouter, HTTPException +from pydantic import BaseModel, Field + +from .recorder import get_recorder +from .summarizer import summarize + +router = APIRouter() + + +# --------------------------------------------------------------------------- +# Request bodies +# --------------------------------------------------------------------------- +class TurnItem(BaseModel): + role: str = Field(..., max_length=16) + text: str = Field("", max_length=4096) + + +class RecordBody(BaseModel): + """Batch-upload conversation turns so a summary can be attached to a ticket + before the frontend requests a handoff.""" + + turns: List[TurnItem] = Field(default_factory=list) + + +@router.get("/_list") +def list_recent(_offset: int = 0, _limit: int = 20) -> dict: + if _limit < 1 or _limit > 200: + raise HTTPException(status_code=400, detail="_limit out of range [1,200]") + return {"code": 0, "data": get_recorder().list_recent(offset=_offset, limit=_limit)} + + +@router.get("/{session_id}") +def get_summary(session_id: str) -> dict: + rec = get_recorder().get(session_id) + if rec is None: + raise HTTPException(status_code=404, detail=f"session not found: {session_id}") + return {"code": 0, "data": rec.to_dict()} + + +@router.post("/{session_id}/record") +def record_turns(session_id: str, body: RecordBody) -> dict: + """Batch-record conversation turns for a session. + + Called by the frontend right before requesting a human handoff, so that + attach_summary_to_ticket can produce a context summary from the transcript. + Not idempotent: every call appends the submitted turns again, so send only turns that have not been recorded yet.
+ """ + recorder = get_recorder() + recorder.open(session_id) + accepted = 0 + for t in body.turns: + role = (t.role or "").strip().lower() + if role not in ("user", "assistant", "system", "tool"): + continue + if not t.text: + continue + recorder.add_turn(session_id, role, t.text) + accepted += 1 + rec = recorder.get(session_id) + return { + "code": 0, + "data": { + "session_id": session_id, + "accepted": accepted, + "total_turns": len(rec.turns) if rec else 0, + }, + } + + +@router.post("/{session_id}/finalize") +def finalize(session_id: str) -> dict: + rec = get_recorder().get(session_id) + if rec is None: + raise HTTPException(status_code=404, detail=f"session not found: {session_id}") + prefer_llm = os.getenv("SS_LLM_SUMMARY", "true").lower() == "true" + summary = summarize(rec, prefer_llm=prefer_llm) + rec = get_recorder().finalize(session_id, summary) + # Write-back: select mock / local_json / default_rest by SS_ADAPTER (safe degradation to mock on failure) + writeback = None + try: + from .adapters.factory import get_sink + writeback = get_sink().write(rec.to_dict()) + except Exception as exc: # noqa: BLE001 + writeback = {"accepted": False, "error": str(exc)} + data = rec.to_dict() + data["writeback"] = writeback + return {"code": 0, "data": data} diff --git a/skills/trtc-ai-service/capabilities/session-summary/src/summarizer.py b/skills/trtc-ai-service/capabilities/session-summary/src/summarizer.py new file mode 100644 index 0000000..18ccc15 --- /dev/null +++ b/skills/trtc-ai-service/capabilities/session-summary/src/summarizer.py @@ -0,0 +1,163 @@ +"""Structured summary generator. + +Strategy: +- Offline heuristic (default): extract questions / to-do keywords / key nouns, done locally with zero dependencies. +- LLM secondary summarization (optional, requires LLM_API_KEY): serialize turns and call OpenAI-compatible protocol. 
+ +Output JSON: + { + "topics": ["..."], + "user_intents": ["..."], + "next_actions": ["..."], + "highlights": ["..."], + "engine": "heuristic" | "llm", + "model": "gpt-4o-mini" | null + } +""" +from __future__ import annotations + +import json +import logging +import os +import re +from typing import Any, Dict, List, Optional + +from .recorder import SessionRecord + +logger = logging.getLogger(__name__) + + +_QUESTION_RE = re.compile(r"[^??]+[??]") +_ACTION_RE = re.compile(r"(I want|please help|need to|please)([^。.!??!\n]+)", re.IGNORECASE) +_NOUN_RE = re.compile(r"[A-Z][A-Za-z0-9_]{2,}|[\u4e00-\u9fff]{2,}") +_STOPWORDS = {"the", "a", "an", "because", "so", "when", "can", "need"} + + +def _heuristic(record: SessionRecord) -> Dict[str, Any]: + topics: List[str] = [] + intents: List[str] = [] + actions: List[str] = [] + highlights: List[str] = [] + seen_topic, seen_intent, seen_action = set(), set(), set() + for t in record.turns: + if t.role != "user": + continue + for q in _QUESTION_RE.findall(t.text): + q = q.strip() + if q and q not in seen_intent: + intents.append(q[:120]) + seen_intent.add(q) + for m in _ACTION_RE.finditer(t.text): + phrase = (m.group(1) + m.group(2)).strip()[:120] + if phrase and phrase not in seen_action: + actions.append(phrase) + seen_action.add(phrase) + for noun in _NOUN_RE.findall(t.text): + if noun in _STOPWORDS or len(noun) > 24: + continue + if noun not in seen_topic and len(topics) < 8: + topics.append(noun) + seen_topic.add(noun) + if record.turns: + highlights.append(f"{len(record.turns)} turns recorded") + return { + "topics": topics, + "user_intents": intents[:5], + "next_actions": actions[:5], + "highlights": highlights, + "engine": "heuristic", + "model": None, + } + + +def _llm_summarize(record: SessionRecord) -> Dict[str, Any]: + api_key = os.getenv("LLM_API_KEY") + api_url = os.getenv("LLM_API_URL", "https://api.openai.com/v1/chat/completions") + model = os.getenv("LLM_MODEL", "gpt-4o-mini") + if not api_key: + raise 
RuntimeError("LLM_API_KEY not configured") + import requests + + transcript = "\n".join(f"[{t.role}] {t.text}" for t in record.turns[-50:]) + prompt = ( + "You are a session summary assistant. Summarize the following conversation as JSON with keys: topics, user_intents," + " next_actions, highlights. Each value is a string array (max 5 items)." + "Do not include any sensitive information (API Key/Token etc.).\n" + f"Conversation content:\n{transcript}\n" + "Output JSON only." + ) + resp = requests.post( + api_url, + headers={"Authorization": f"Bearer {api_key}", "Content-Type": "application/json"}, + json={ + "model": model, + "messages": [{"role": "user", "content": prompt}], + "temperature": 0.2, + "response_format": {"type": "json_object"}, + }, + timeout=20, + ) + resp.raise_for_status() + data = resp.json() + content = data["choices"][0]["message"]["content"] + try: + parsed = json.loads(content) + except json.JSONDecodeError: + parsed = {"highlights": [content[:512]]} + parsed.setdefault("topics", []) + parsed.setdefault("user_intents", []) + parsed.setdefault("next_actions", []) + parsed.setdefault("highlights", []) + parsed["engine"] = "llm" + parsed["model"] = model + return parsed + + +def summarize(record: SessionRecord, *, prefer_llm: bool = True) -> Dict[str, Any]: + if prefer_llm and os.getenv("LLM_API_KEY"): + try: + return _llm_summarize(record) + except Exception as exc: # noqa: BLE001 + logger.warning("LLM summarize failed, fallback to heuristic: %s", exc) + return _heuristic(record) + + +def summarize_paragraph(record: SessionRecord) -> Optional[str]: + """Generate a one-paragraph narrative summary of the session via LLM. + + Used by the handoff flow to fill a ticket's Description with an LLM summary of the + chat from AI connect → handoff trigger. Returns None if LLM is not configured or the + session has no turns (caller then leaves the description unchanged). 
+ """ + api_key = os.getenv("LLM_API_KEY") + if not api_key: + return None + if not record.turns: + return None + import requests + + api_url = os.getenv("LLM_API_URL", "https://api.openai.com/v1/chat/completions") + model = os.getenv("LLM_MODEL", "gpt-4o-mini") + transcript = "\n".join(f"[{t.role}] {t.text}" for t in record.turns[-50:]) + prompt = ( + "You are a customer-service ticket summarizer. Read the conversation below between " + "a customer and an AI assistant, then write ONE concise paragraph (2-4 sentences) " + "summarizing what the customer asked about and what was discussed, from the moment " + "the AI connected up to the point the customer requested a human agent. Do not invent " + "facts not present in the conversation. Do not include any sensitive data (API key / " + "token etc.). Output only the paragraph, with no preamble.\n" + f"Conversation:\n{transcript}\n" + ) + resp = requests.post( + api_url, + headers={"Authorization": f"Bearer {api_key}", "Content-Type": "application/json"}, + json={ + "model": model, + "messages": [{"role": "user", "content": prompt}], + "temperature": 0.3, + }, + timeout=20, + ) + resp.raise_for_status() + data = resp.json() + return (data["choices"][0]["message"]["content"] or "").strip() or None diff --git a/skills/trtc-ai-service/capabilities/tool-calling/INTERFACE_ADAPT.md b/skills/trtc-ai-service/capabilities/tool-calling/INTERFACE_ADAPT.md new file mode 100644 index 0000000..0e4b6c4 --- /dev/null +++ b/skills/trtc-ai-service/capabilities/tool-calling/INTERFACE_ADAPT.md @@ -0,0 +1,158 @@ +# tool-calling Interface Adaptation SOP + +> Alpha/beta dual-track tool calling. This capability's source code has not been refactored to ports/adapters/core in this release (Phase 1 compromise), +> but the manifest already declares the full `business_contract.alpha_track` / `beta_track` / `arbitration` contracts. + +--- + +## 1. 
Dual-Track Contract Overview + +### 1.1 Alpha Track (local functions) + +```yaml +alpha_track: + registration_schema: + name: string # e.g.: query_order + description: string # Tool description for LLM consumption + parameters: object # JSON Schema + handler: callable # Sync or async Python function +``` + +Alpha track is suitable for: low latency, tight business coupling, tools that are inconvenient to expose as HTTP services. + +### 1.2 Beta Track (remote APIs) + +```yaml +beta_track: + api_schema: + method: GET | POST | PUT | DELETE | PATCH + path: string + request_schema: object + response_schema: object + auth: bearer | api_key | none +``` + +Beta track is suitable for: cross-service calls, tools that need to reuse existing API gateways. + +### 1.3 Arbitration Rules + +```yaml +arbitration: + default_priority: alpha # Alpha first + fallback_on_failure: true # Alpha failure auto-degrades to beta + timeout_ms: 3000 # Single-track call timeout + merge_strategy: first_success # Take first success only +``` + +--- + +## 2. Three Scenarios When User Interfaces Don't Match + +### 2.1 Scenario 1: User's alpha-track function signature differs + +**Symptom**: The user already has local functions (e.g. `def get_order(order_id, user_id) -> dict`), +but the skeleton expects parameters named `id` / `customer_id`. + +**Solution**: Write a thin wrapper registration function — no need to modify the skeleton. 
+ +```python +# Inside user project +from capabilities.tool_calling.src.dispatcher import register_tool + +def get_order(order_id, user_id): + """User's existing function.""" + return {"order_id": order_id, "status": "shipped"} + +# Adaptation layer: parameter remapping +register_tool( + name="query_order", + description="Query order status", + parameters={ + "type": "object", + "properties": { + "id": {"type": "string"}, + "customer_id": {"type": "string"}, + }, + "required": ["id", "customer_id"], + }, + handler=lambda id, customer_id: get_order(id, customer_id), # Field remapping +) +``` + +### 2.2 Scenario 2: User's beta-track remote API uses a different protocol + +**Symptom**: The remote business API is not OpenAI Tool Calling-style JSON-RPC, +but a user's own REST endpoint (e.g. `POST /api/v1/orders/query`). + +**Solution**: Declare the full API schema when registering in `tools.yaml`. + +```yaml +# capabilities/tool-calling/data/tools.yaml +tools: + - name: query_order + description: Query order status + alpha: null # No local implementation + beta: + base_url: https://api.example.com # Must be HTTPS + method: POST + path: /api/v1/orders/query + headers: + X-Api-Key: ${USER_KB_TOKEN} # Read from environment variable + request_template: + body: + order_no: "{{ id }}" # Template render: map tool input id to order_no + uid: "{{ customer_id }}" + response_path: "$.data.order" # Response field extraction (JSONPath) +``` + +> This release's advanced template rendering in tools.yaml is **not fully implemented**; +> for complex mapping needs, it's recommended to rewrite as an alpha-track local function + internal `requests` call. + +### 2.3 Scenario 3: User wants to disable beta track (local functions only) + +```bash +export TC_PRIORITY=alpha +export TC_DISABLE_BETA=1 +``` + +The skeleton then uses only the alpha track, so fallback to beta is never triggered. The reverse also works: set `TC_DISABLE_ALPHA=1` to use only the beta track. + +--- + +## 3. 
Arbitration Priority Override + +The manifest defaults to `priority=alpha`; can be overridden via environment variables: + +```bash +export TC_PRIORITY=beta # Beta first (when alpha implementation is not yet stable) +export TC_PRIORITY=manifest_order # Follow order of alpha/beta fields in tools.yaml +``` + +--- + +## 4. Phase 4 Plan: Full ports/adapters Refactor + +The following will be introduced in the future: + +``` +capabilities/tool-calling/src/ +├── ports/ +│ ├── local_tool.py # ABC: LocalTool +│ └── remote_tool.py # ABC: RemoteToolClient +└── adapters/ + ├── alpha_python.py # Alpha default implementation (current dispatcher behavior) + ├── beta_rest.py # Beta default implementation + └── user_custom.py # User integration wizard generator +``` + +This document will be supplemented with automated adaptation workflows at that time. + +--- + +## 5. Security Checklist + +- [ ] Beta track `base_url` must use https:// (localhost excepted) +- [ ] Reject private network access (9.* / 10.* / 172.16-31.* / 192.168.*) +- [ ] `Authorization` / `X-Api-Key` only from environment variables +- [ ] Alpha track handlers must not expose `eval` / `exec` / arbitrary command execution +- [ ] Tool result re-injection must have prompt injection protection (manifest.security.injection_protection) diff --git a/skills/trtc-ai-service/capabilities/tool-calling/README.md b/skills/trtc-ai-service/capabilities/tool-calling/README.md new file mode 100644 index 0000000..6a221a7 --- /dev/null +++ b/skills/trtc-ai-service/capabilities/tool-calling/README.md @@ -0,0 +1,50 @@ +# tool-calling · Alpha/Beta Dual-Track Tool Calling + +> Provides local function (alpha) + remote API (beta) tool calling on top of conversation-core, +> with alpha-first by default and automatic degradation to beta on alpha failure (P1 arbitration rules). 
+
+## Install
+
+```bash
+python scripts/add-capability.py tool-calling
+```
+
+## Configuration
+
+| Env Variable | Default | Description |
+|:---|:---|:---|
+| `TC_REGISTRY_FILE` | `capabilities/tool-calling/data/tools.yaml` | Tool declaration file |
+
+Tools are declared in `data/tools.yaml`; the registry supports hot-reload (`POST /api/v1/tools/reload`).
+
+## REST API
+
+| Method | Path | Purpose |
+|:---|:---|:---|
+| GET | `/api/v1/tools/list` | List all tools |
+| POST | `/api/v1/tools/invoke` | Explicit invocation `{name, params, priority?}` |
+| POST | `/api/v1/tools/reload` | Reload registry |
+
+## In-Conversation Trigger
+
+Push the following text to `agent/control` to trigger a tool call:
+
+```
+/tool query_order_status {"order_id": "A1234"}
+```
+
+The dispatcher replaces the original text with a `[tool_result ...]...[/tool_result]` block, which is then forwarded to the LLM.
+
+## Arbitration Rules
+
+- `priority=alpha` (default): alpha first, then beta
+- `priority=beta`: beta first, then alpha
+- `priority=manifest_order`: follow declaration order
+
+If a track fails, the call automatically degrades to the next available track; when all tracks fail, the result carries `ok=false` and the `fallback_chain`.
+
+## Security
+
+- The beta track enforces HTTPS (except `http://localhost*`);
+- Tool names are limited to 64 characters and trigger text to 4096 characters;
+- Tool parameters are auto-redacted in logs (the manifest declares `log_redaction.patterns`).
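The arbitration rules in this README can be illustrated with a small, self-contained sketch. It is not the real arbitrator (that lives in `scripts/lib/arbitrator` and is not shown in this diff); `track_order` and `call_with_fallback` are hypothetical names, and the result dictionary only mirrors the `ok` / `track` / `fallback_chain` fields mentioned above. It assumes `fallback_chain` lists every track that was attempted, in order.

```python
from typing import Any, Callable, Dict, List, Optional

# Hypothetical type: a track is any callable that takes the tool params.
Track = Callable[[Dict[str, Any]], Any]


def track_order(priority: str, declared: List[str]) -> List[str]:
    """Return the order in which tracks are tried for a given priority."""
    if priority == "alpha":
        preferred = ["alpha", "beta"]
    elif priority == "beta":
        preferred = ["beta", "alpha"]
    elif priority == "manifest_order":
        preferred = list(declared)  # follow declaration order as-is
    else:
        raise ValueError(f"unknown priority: {priority}")
    # Only tracks the tool actually declares can be tried.
    return [name for name in preferred if name in declared]


def call_with_fallback(
    tracks: Dict[str, Track],
    params: Dict[str, Any],
    priority: str = "alpha",
) -> Dict[str, Any]:
    """Try each track in priority order; degrade to the next one on failure."""
    chain: List[str] = []
    last_error: Optional[str] = None
    for name in track_order(priority, list(tracks)):
        chain.append(name)
        try:
            output = tracks[name](params)
        except Exception as exc:  # any failure triggers degradation
            last_error = f"{name}: {exc}"
            continue
        return {"ok": True, "track": name, "output": output,
                "error": None, "fallback_chain": chain}
    # Every declared track failed (or none was declared).
    return {"ok": False, "track": None, "output": None,
            "error": last_error or "no track available",
            "fallback_chain": chain}
```

In this sketch, a failing alpha handler paired with a working beta handler yields a result with `track` set to `"beta"` and a `fallback_chain` of `["alpha", "beta"]`; per-track timeouts and retries, which the manifest also declares, are deliberately left out.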
diff --git a/skills/trtc-ai-service/capabilities/tool-calling/data/tools.yaml b/skills/trtc-ai-service/capabilities/tool-calling/data/tools.yaml new file mode 100644 index 0000000..848b833 --- /dev/null +++ b/skills/trtc-ai-service/capabilities/tool-calling/data/tools.yaml @@ -0,0 +1,58 @@ +# tool-calling default tool registry — generic AI customer service toolset (industry-neutral) +# +# Dual-track notes: +# alpha (α track) = local function / mock implementation: works out of the box, no real backend needed, easy local demo +# beta (β track) = remote HTTPS API: point the endpoint at your real business system +# Default priority=alpha: prefer local mock; change priority or pass priority=beta in a single invoke to force real API. +# When interfaces don't align, refer to INTERFACE_ADAPT.md for request/response field mapping. + +priority: alpha # alpha | beta | manifest_order + +tools: + - name: query_order_status + description: "Query document/order/ticket status (input order_id) — high-frequency customer service action" + alpha: + module: "capabilities.tool_calling.examples.local_tools" + function: "query_order_status" + timeout_ms: 600 + beta: + endpoint: "https://api.example.com/v1/orders/status" + method: "POST" + timeout_ms: 5000 + headers: {} + + - name: get_business_info + description: "Query business info: hours/address/contact (topic=hours|address|contact|all)" + alpha: + module: "capabilities.tool_calling.examples.local_tools" + function: "get_business_info" + timeout_ms: 400 + beta: + endpoint: "https://api.example.com/v1/business/info" + method: "GET" + timeout_ms: 5000 + headers: {} + + - name: book_appointment + description: "Create appointment/booking: restaurant reservations, service appointments, etc. 
(date, time_slot, party_size)" + alpha: + module: "capabilities.tool_calling.examples.local_tools" + function: "book_appointment" + timeout_ms: 800 + beta: + endpoint: "https://api.example.com/v1/appointments" + method: "POST" + timeout_ms: 5000 + headers: {} + + - name: submit_feedback + description: "Submit satisfaction rating/feedback (rating 1-5, comment) — common end-of-session action" + alpha: + module: "capabilities.tool_calling.examples.local_tools" + function: "submit_feedback" + timeout_ms: 400 + beta: + endpoint: "https://api.example.com/v1/feedback" + method: "POST" + timeout_ms: 5000 + headers: {} diff --git a/skills/trtc-ai-service/capabilities/tool-calling/examples/__init__.py b/skills/trtc-ai-service/capabilities/tool-calling/examples/__init__.py new file mode 100644 index 0000000..95df8f1 --- /dev/null +++ b/skills/trtc-ai-service/capabilities/tool-calling/examples/__init__.py @@ -0,0 +1 @@ +"""tool-calling built-in example tools namespace.""" diff --git a/skills/trtc-ai-service/capabilities/tool-calling/examples/local_tools.py b/skills/trtc-ai-service/capabilities/tool-calling/examples/local_tools.py new file mode 100644 index 0000000..65c3525 --- /dev/null +++ b/skills/trtc-ai-service/capabilities/tool-calling/examples/local_tools.py @@ -0,0 +1,101 @@ +"""tool-calling built-in "Generic AI Customer Service Tool Set" (alpha-track default = local mock). + +Design principles (aligned with SKILL §6 point 2): +- Industry-neutral: not tied to specific verticals; covers common actions across most customer service scenarios + (check document status / check business info / make appointment / submit feedback). +- Works out of the box: each tool has a directly runnable alpha-track mock implementation; + users on Path A or local demos can see the capability in effect immediately, even without a real backend API. 
+- Smoothly replaceable: each tool also declares a beta-track (remote HTTPS) placeholder in data/tools.yaml; + connecting to a real system only requires pointing the beta endpoint to your own API (or adapting per INTERFACE_ADAPT.md). + +Return value must be JSON-serializable; mock data uniformly carries "_mock": true marker, +making it easy for frontend / logs to distinguish "demo data" from "real business data". +""" +from __future__ import annotations + +import hashlib +import time +from typing import Any, Dict + + +def _stable_pick(seed: str, choices): + """Stably select a value based on seed (same input always returns the same result, making demos reproducible).""" + h = int(hashlib.md5(seed.encode("utf-8")).hexdigest(), 16) + return choices[h % len(choices)] + + +def query_order_status(order_id: str = "", **_: Any) -> Dict[str, Any]: + """Query document / order / ticket status (generic customer service action). + + Parameters: + order_id: Document number (order number / ticket number / appointment number fine). + """ + if not order_id: + return {"_mock": True, "error": "order_id is required"} + status = _stable_pick(order_id, ["processing", "confirmed", "in_progress", "completed", "cancelled"]) + return { + "_mock": True, + "order_id": order_id, + "status": status, + "updated_at": int(time.time()), + "note": "Demo data from built-in mock tool; point the β endpoint to your real system to use live data.", + } + + +def get_business_info(topic: str = "hours", **_: Any) -> Dict[str, Any]: + """Query business info (hours / address / contact etc.), common high-frequency customer service question. 
+ + Parameters: + topic: hours | address | contact | all + """ + info = { + "hours": "Mon-Sun 10:00-22:00 (last entry 21:00)", + "address": "No.1 Demo Street, Example District", + "contact": "+86-000-0000-0000 / support@example.com", + } + topic = (topic or "hours").lower() + data = info if topic == "all" else {topic: info.get(topic, info["hours"])} + return {"_mock": True, "topic": topic, **data, + "note": "Demo data from built-in mock tool; replace with your real business profile."} + + +def book_appointment(date: str = "", time_slot: str = "", party_size: int = 2, **_: Any) -> Dict[str, Any]: + """Create reservation / booking (restaurant reservation, service appointment, callback booking etc. generic actions). + + Parameters: + date: Date, e.g. 2026-06-12 + time_slot: Time slot, e.g. 18:30 + party_size: Party size / quantity + """ + if not date or not time_slot: + return {"_mock": True, "error": "date and time_slot are required"} + confirm = "BK" + hashlib.md5(f"{date}{time_slot}{party_size}".encode()).hexdigest()[:8].upper() + return { + "_mock": True, + "confirmation_id": confirm, + "date": date, + "time_slot": time_slot, + "party_size": int(party_size) if str(party_size).isdigit() else party_size, + "status": "confirmed", + "note": "Demo booking created by built-in mock tool; wire the β endpoint to your reservation system.", + } + + +def submit_feedback(rating: int = 5, comment: str = "", **_: Any) -> Dict[str, Any]: + """Submit satisfaction / feedback (common end-of-session action). 
+ + Parameters: + rating: 1-5 rating + comment: Text feedback (optional) + """ + try: + rating = max(1, min(5, int(rating))) + except (TypeError, ValueError): + rating = 5 + return { + "_mock": True, + "received": True, + "rating": rating, + "comment": (comment or "")[:512], + "note": "Demo acknowledgement from built-in mock tool.", + } diff --git a/skills/trtc-ai-service/capabilities/tool-calling/manifest.yaml b/skills/trtc-ai-service/capabilities/tool-calling/manifest.yaml new file mode 100644 index 0000000..f1ed0f3 --- /dev/null +++ b/skills/trtc-ai-service/capabilities/tool-calling/manifest.yaml @@ -0,0 +1,146 @@ +# tool-calling capability self-describing manifest +# Type: capability (optional install) +# Risk mitigation: P1 alpha/beta dual-track priority arbitration (manifest explicitly declares priority) + +name: "tool-calling" +version: "1.0.0" +type: "capability" +description: "Business tool calling - alpha/beta dual-track (local function first, remote API as fallback) + result re-injection" + +dependencies: + - name: "conversation-core" + version: ">=1.0.0,<2.0.0" + +# --------------------------------------------------------------------------- +# Injection extensions into skeleton +# --------------------------------------------------------------------------- +extensions: + - inject_at: "agent.before_push_text" + inline_code: | + # [tool-calling] intercept text injection: recognize "/tool name args" to trigger local tool call + from ._capability_loader import try_load_capability + _tc_dispatcher = try_load_capability("tool-calling", "src/dispatcher.py") + if _tc_dispatcher is not None: + _hijack = _tc_dispatcher.maybe_dispatch(text) + if _hijack is not None: + # Replace original text with tool result, ensuring LLM receives tool-augmented content + text = _hijack + - inject_at: "server.router_extension" + inline_code: | + # [tool-calling] mount sub-router + from ._capability_loader import try_load_capability as _try_load_capability + _tc_router_mod = 
_try_load_capability("tool-calling", "src/router.py") + if _tc_router_mod is not None and hasattr(_tc_router_mod, "router"): + app.include_router( + _tc_router_mod.router, prefix="/api/v1/tools", tags=["tool-calling"] + ) + +# --------------------------------------------------------------------------- +# Configuration interface (alpha/beta dual-track) +# --------------------------------------------------------------------------- +config: + priority: + description: "Priority arbitration rule: alpha | beta | manifest_order" + default: "alpha" + timeout_ms_alpha: + default: 1000 + timeout_ms_beta: + default: 5000 + retry_alpha: + default: 0 + retry_beta: + default: 1 + registry_file: + description: "Tool registration declaration file (YAML), supports hot-reload" + default: "capabilities/tool-calling/data/tools.yaml" + +# --------------------------------------------------------------------------- +# API endpoints +# --------------------------------------------------------------------------- +endpoints: + - method: GET + path: /api/v1/tools/list + description: List registered alpha/beta tools and their priority + - method: POST + path: /api/v1/tools/invoke + description: "Explicit tool invocation (body shape: {name, params, priority?})" + - method: POST + path: /api/v1/tools/reload + description: "Hot-reload registry" + +# --------------------------------------------------------------------------- +# Integration rules +# --------------------------------------------------------------------------- +integration: + mode: "auto" + auto_adapters: + - tech_stack: ["express", "koa", "fastify", "next"] + adapter: "node-backend" + description: "Register beta-track remote tool implementation in Node.js backend" + - tech_stack: ["flask", "fastapi", "django"] + adapter: "python-backend" + description: "Directly register alpha-track local function in Python backend" + - tech_stack: ["spring-boot", "quarkus"] + adapter: "java-backend" + description: "Expose beta-track REST endpoints 
via Filter" + fallback: + guided_templates: + - "../../auto_adapters/integration_templates/generic-backend.md" + manual_api: + rest_endpoint: "/api/v1/tools" + +security: + log_redaction: + enabled: true + patterns: ["api_key", "token", "credential", "authorization"] + injection_protection: + prompt_injection_guard: true # Whitelist validation of tool trigger keywords before entering LLM + network: + enforce_https: true # Beta-track remote calls enforce HTTPS + +# --------------------------------------------------------------------------- +# Business contract (Phase 3 new; follows references/business-contract-spec.md v1.0 §5) +# tool-calling uses its own contract section (alpha/beta/arbitration), not the port/adapter template. +# Source code is not refactored in this phase (compromise strategy); only declares contract for contract-adapt.py consumption. +# --------------------------------------------------------------------------- +business_contract: + customization_sop: "INTERFACE_ADAPT.md" + + # Alpha track: local function registration specification + alpha_track: + interface: "src.dispatcher.LocalToolHandler" # Equivalent abstraction in existing dispatcher + registration_schema: + name: string # Tool name (globally unique, snake_case) + description: string # One-line description for LLM consumption + parameters: object # JSON Schema describing parameters + handler: callable # Sync/async Python callable + invocation_schema: + input: object # Same schema as parameters + output: object # User-defined return structure (must be JSON-serializable) + fail_fast: true # Local exceptions thrown immediately for arbitration decision + + # Beta track: remote business API integration specification + beta_track: + interface: "src.dispatcher.RemoteToolClient" # Remote HTTP/RPC client abstraction + api_schema: + method: enum[GET, POST, PUT, DELETE, PATCH] + path: string + request_schema: object # JSON Schema + response_schema: object # JSON Schema + auth: enum[bearer, api_key, 
none] + timeout_ms: 5000 + retry: + max: 1 + backoff_ms: 200 + + # Arbitration rules + arbitration: + default_priority: alpha # alpha | beta | manifest_order + fallback_on_failure: true + timeout_ms: 3000 # Single-track call timeout; triggers fallback on expiry + merge_strategy: first_success # first_success | alpha_then_beta_diff + +acceptance: + - "Provide ToolRegistry supporting simultaneous alpha/beta implementation declarations" + - "Default alpha-first; auto-degraded to beta on alpha failure/unavailability" + - "REST endpoints available immediately after skeleton startup" diff --git a/skills/trtc-ai-service/capabilities/tool-calling/src/__init__.py b/skills/trtc-ai-service/capabilities/tool-calling/src/__init__.py new file mode 100644 index 0000000..8ea3c91 --- /dev/null +++ b/skills/trtc-ai-service/capabilities/tool-calling/src/__init__.py @@ -0,0 +1,8 @@ +"""tool-calling capability: alpha/beta dual-track tool calling. + +Core modules: +- registry Load tools from YAML registration declarations + Python entry points +- dispatcher Recognize and trigger calls from conversation text ("/tool name {json}") +- router REST endpoints +""" +__version__ = "1.0.0" diff --git a/skills/trtc-ai-service/capabilities/tool-calling/src/dispatcher.py b/skills/trtc-ai-service/capabilities/tool-calling/src/dispatcher.py new file mode 100644 index 0000000..7e633bb --- /dev/null +++ b/skills/trtc-ai-service/capabilities/tool-calling/src/dispatcher.py @@ -0,0 +1,54 @@ +"""Text injection interceptor: recognize "/tool" invocations from the conversation stream. + +Expected text format: + /tool {json_params} + +Example: + /tool get_order {"order_id": "A1234"} + +Dispatcher parses and calls ToolRegistry, returning the result as a structured string, +which is then forwarded to the LLM by conversation-core's injection point. 
+""" +from __future__ import annotations + +import json +import re +from typing import Optional + +from .registry import get_loader + +_TOOL_RE = re.compile(r"^\s*/tool\s+([A-Za-z0-9_\-]{1,64})\s*(\{.*\})?\s*$", re.DOTALL) +_MAX_TEXT_LEN = 4096 + + +def maybe_dispatch(text: str) -> Optional[str]: + """Recognize "/tool" trigger, return new text with tool result; return None when not triggered.""" + if not text or len(text) > _MAX_TEXT_LEN: + return None + m = _TOOL_RE.match(text) + if not m: + return None + name = m.group(1) + raw_params = m.group(2) or "{}" + try: + params = json.loads(raw_params) + if not isinstance(params, dict): + params = {} + except json.JSONDecodeError: + params = {} + result = get_loader().call(name, params) + payload = { + "tool": result.tool, + "track": result.track, + "ok": result.ok, + "output": result.output, + "error": result.error, + "latency_ms": result.latency_ms, + "fallback_chain": result.fallback_chain, + } + # Return in convention block, making it easy for LLM system prompt to recognize tool results + return ( + f"[tool_result name={result.tool} track={result.track} ok={result.ok}]\n" + + json.dumps(payload, ensure_ascii=False) + + "\n[/tool_result]" + ) diff --git a/skills/trtc-ai-service/capabilities/tool-calling/src/registry.py b/skills/trtc-ai-service/capabilities/tool-calling/src/registry.py new file mode 100644 index 0000000..b83c76d --- /dev/null +++ b/skills/trtc-ai-service/capabilities/tool-calling/src/registry.py @@ -0,0 +1,219 @@ +"""ToolRegistry loader. 
+ +YAML declaration example: + priority: alpha # alpha | beta | manifest_order + tools: + - name: get_order + alpha: + module: "capabilities.tool_calling.examples.local_tools" + function: "get_order" + timeout_ms: 800 + beta: + endpoint: "https://internal.example.com/api/orders" + method: "POST" + timeout_ms: 5000 + description: "Query order" + +Loading strategy: +- Alpha-track functions loaded dynamically via ``importlib``; when a module is missing, the tool retains only the beta track. +- Beta track is declaration-only; invocation is done by dispatcher injecting ``beta_invoker``. +""" +from __future__ import annotations + +import importlib +import logging +import os +import threading +from pathlib import Path +from typing import Any, Dict, List, Optional + +import yaml + +# Import arbitrator via relative path (Phase 2 shared infrastructure) +import sys +_PROJECT_ROOT = Path(__file__).resolve().parents[3] +if str(_PROJECT_ROOT) not in sys.path: + sys.path.insert(0, str(_PROJECT_ROOT)) +from scripts.lib.arbitrator import ( # noqa: E402 + AlphaTool, + BetaTool, + ToolCallResult, + ToolRegistry, +) + +logger = logging.getLogger(__name__) + +_DEFAULT_REGISTRY_FILE = Path( + os.getenv( + "TC_REGISTRY_FILE", + str(Path(__file__).resolve().parent.parent / "data" / "tools.yaml"), + ) +) + + +class ToolRegistryLoader: + def __init__(self, registry_file: Optional[Path] = None) -> None: + self._lock = threading.RLock() + self._registry_file = Path(registry_file) if registry_file else _DEFAULT_REGISTRY_FILE + self._registry: ToolRegistry = ToolRegistry() + self._descriptions: Dict[str, str] = {} + if self._registry_file.exists(): + self.reload() + + @property + def registry(self) -> ToolRegistry: + with self._lock: + return self._registry + + def reload(self) -> int: + if not self._registry_file.exists(): + return 0 + raw = yaml.safe_load(self._registry_file.read_text(encoding="utf-8")) or {} + priority = (raw.get("priority") or "alpha").strip() + new_reg = 
ToolRegistry(default_priority=priority) + descriptions: Dict[str, str] = {} + for tool_def in raw.get("tools") or []: + name = str(tool_def.get("name") or "").strip() + if not name: + continue + descriptions[name] = str(tool_def.get("description", "")) + alpha_def = tool_def.get("alpha") + beta_def = tool_def.get("beta") + if alpha_def: + func = self._load_callable(alpha_def) + if func is not None: + new_reg.register_alpha( + AlphaTool( + name=name, + func=func, + timeout_ms=int(alpha_def.get("timeout_ms", 1000)), + description=descriptions[name], + ) + ) + if beta_def and beta_def.get("endpoint"): + new_reg.register_beta( + BetaTool( + name=name, + endpoint=str(beta_def["endpoint"]), + method=str(beta_def.get("method", "POST")), + timeout_ms=int(beta_def.get("timeout_ms", 5000)), + headers=dict(beta_def.get("headers") or {}), + description=descriptions[name], + ) + ) + with self._lock: + self._registry = new_reg + self._descriptions = descriptions + return len(descriptions) + + def list_tools(self) -> List[Dict[str, Any]]: + with self._lock: + return self._registry.list_tools() + + def call( + self, + name: str, + params: Optional[Dict[str, Any]] = None, + *, + priority: Optional[str] = None, + ) -> ToolCallResult: + return self._registry.call( + name, + params, + priority=priority, + beta_invoker=_default_beta_invoker, + ) + + @staticmethod + def _load_callable(alpha_def: Dict[str, Any]): + mod_name = alpha_def.get("module") + func_name = alpha_def.get("function") + if not mod_name or not func_name: + return None + module = None + try: + module = importlib.import_module(mod_name) + except ImportError: + # Fallback: capability directory name has hyphens (tool-calling), standard import cannot resolve it + # capabilities.tool_calling.* - switched to file-path-based loading (registry knows its own location). 
+ module = ToolRegistryLoader._load_module_by_path(mod_name) + if module is None: + logger.warning("alpha tool module not loadable: %s", mod_name) + return None + return getattr(module, func_name, None) + + @staticmethod + def _load_module_by_path(mod_name: str): + """Map dotted module name to file path within capability package and load. + + Convention: module name like ``capabilities.tool_calling.examples.local_tools``, + take the ``examples`` segment and everything after as the path relative to ````. + """ + import importlib.util + + cap_root = Path(__file__).resolve().parent.parent # capabilities/tool-calling/ + parts = mod_name.split(".") + # Strip capabilities. prefix (regardless of underscores / hyphens), keep the examples/... tail + tail: List[str] = [] + seen_examples = False + for p in parts: + if p == "examples": + seen_examples = True + if seen_examples: + tail.append(p) + if not tail: + tail = parts[-2:] # Fallback: take last two segments + file_path = cap_root.joinpath(*tail).with_suffix(".py") + if not file_path.is_file(): + return None + qual = "_tc_local_" + "_".join(tail) + cached = sys.modules.get(qual) + if cached is not None: + return cached + spec = importlib.util.spec_from_file_location(qual, file_path) + if spec is None or spec.loader is None: + return None + module = importlib.util.module_from_spec(spec) + sys.modules[qual] = module + try: + spec.loader.exec_module(module) + except Exception as exc: # noqa: BLE001 + sys.modules.pop(qual, None) + logger.warning("alpha tool file load failed %s: %s", file_path, exc) + return None + return module + + +# --------------------------------------------------------------------------- +# Beta-track default implementation: requests sync POST / GET +# --------------------------------------------------------------------------- +def _default_beta_invoker(tool: BetaTool, params: Dict[str, Any]) -> Any: + import requests # Already in skeleton requirements + + if not tool.endpoint.startswith("https://") and 
not tool.endpoint.startswith("http://localhost"): + # Security: except localhost debugging, beta-track enforces HTTPS (manifest.security.network.enforce_https) + raise RuntimeError(f"β endpoint must use HTTPS: {tool.endpoint}") + headers = {"Content-Type": "application/json", **tool.headers} + timeout = max(tool.timeout_ms, 100) / 1000.0 + method = tool.method.upper() + if method == "GET": + resp = requests.get(tool.endpoint, params=params, headers=headers, timeout=timeout) + else: + resp = requests.request( + method, tool.endpoint, json=params, headers=headers, timeout=timeout + ) + resp.raise_for_status() + ctype = resp.headers.get("Content-Type", "") + if "application/json" in ctype: + return resp.json() + return resp.text + + +# --------------------------------------------------------------------------- +# Global singleton (used by dispatcher / router) +# --------------------------------------------------------------------------- +_global_loader = ToolRegistryLoader() + + +def get_loader() -> ToolRegistryLoader: + return _global_loader diff --git a/skills/trtc-ai-service/capabilities/tool-calling/src/router.py b/skills/trtc-ai-service/capabilities/tool-calling/src/router.py new file mode 100644 index 0000000..fb04cf1 --- /dev/null +++ b/skills/trtc-ai-service/capabilities/tool-calling/src/router.py @@ -0,0 +1,50 @@ +"""tool-calling FastAPI sub-router.""" +from __future__ import annotations + +from typing import Any, Dict, Optional + +from fastapi import APIRouter, HTTPException +from pydantic import BaseModel, Field + +from .registry import get_loader + +router = APIRouter() + + +class InvokeRequest(BaseModel): + name: str = Field(..., max_length=64) + params: Dict[str, Any] = Field(default_factory=dict) + priority: Optional[str] = Field(default=None, pattern="^(alpha|beta|manifest_order)$") + + +@router.get("/list") +def list_tools() -> dict: + return {"code": 0, "data": get_loader().list_tools()} + + +@router.post("/invoke") +def invoke(req: InvokeRequest) -> 
dict: + result = get_loader().call(req.name, req.params, priority=req.priority) + if not result.ok: + # 200 + ok=false or 502? Here we use 200; caller judges by ok field; beta network errors use 502 + if result.track == "beta" and "Connection" in (result.error or ""): + raise HTTPException(status_code=502, detail=result.error) + return { + "code": 0 if result.ok else 1, + "msg": "success" if result.ok else result.error, + "data": { + "tool": result.tool, + "track": result.track, + "ok": result.ok, + "output": result.output, + "error": result.error, + "latency_ms": result.latency_ms, + "fallback_chain": result.fallback_chain, + }, + } + + +@router.post("/reload") +def reload_registry() -> dict: + n = get_loader().reload() + return {"code": 0, "data": {"count": n}} diff --git a/skills/trtc-ai-service/references/business-contract-spec.md b/skills/trtc-ai-service/references/business-contract-spec.md new file mode 100644 index 0000000..1175203 --- /dev/null +++ b/skills/trtc-ai-service/references/business-contract-spec.md @@ -0,0 +1,263 @@ +# `business_contract` Field Specification (v1.0) + +> Scope: the `business_contract` section in `capabilities//manifest.yaml`. +> +> Purpose: enable capability packages to **structurally** declare the business interface contracts they call / are called by, +> so that `scripts/contract-adapt.py` (Phase 3 Stage 4) can generate executable adapters based on them, +> and the assembly guide (Path A / Path B) can proactively list the contract manifest at the end. + +--- + +## 1. 
Top-Level Structure + +```yaml +business_contract: + port_class: "" # Full dotted path to ABC abstract base class + default_adapter: "" # Default implementation (production) + mock_adapter: "" # Mock implementation (demo / video recording) + external_apis: # outbound / inbound interface contract list + - + customization_sop: "INTERFACE_ADAPT.md" # Path to interface adaptation SOP (relative to capability root) +``` + +**Special case**: `tool-calling` does not use the port/adapter abstraction; it uses the alpha/beta dual-track contract section defined in §5 instead. + +--- + +## 2. `external_apis[]` Field Definition + +```yaml +- name: # Contract name (snake_case dot-delimited), globally unique + direction: outbound | inbound # outbound = we call the business side; inbound = business side calls back to us + method: GET | POST | PUT | DELETE | PATCH + path: # Path template, may contain {placeholder} + description: # One-line description (for assembly wrap-up printing) + request_schema: # Request schema (simplified JSON Schema) + : + response_schema: # Response schema + : + adapter_slots: # Field paths allowed for user remapping (dot-delimited) + - . 
+ auth: # (optional) Auth method + type: bearer | api_key | none + location: header | query + name: + retry: # (optional) Retry policy + max: + backoff_ms: + timeout_ms: # (optional) Timeout +``` + +### 2.1 `type` Value Convention + +| Type | Meaning | +|---|---| +| `string` | String | +| `int` / `integer` | Integer | +| `float` / `number` | Float | +| `bool` / `boolean` | Boolean | +| `string[]` / `int[]` / `[]` | Homogeneous array | +| `enum[a, b, c]` | Enum literal | +| `object` | Nested object (can be further expanded as sub-schema) | +| Nested dict | Write nested structure directly | + +### 2.2 `adapter_slots` Field Path Rules + +- Starts with `request.` or `response.` +- Nested levels use dot separation: `response.data.ticket_id` +- Arrays use `[]`: `request.transcript[]` +- Only lists fields **allowed for user remapping**; unlisted fields are treated as "our contract is fixed, user modification forbidden" + +### 2.3 Special Nature of `direction = inbound` + +An inbound contract means "the business side calls back to us", in which case: +- `path` is the endpoint we expose (e.g.: `/api/v1/handoff/callback/ticket-status`) +- `request_schema` is the payload structure sent by the business side +- `response_schema` is the ack structure we return (typically `{code: int, message: string}`) +- `adapter_slots` declares "business side field names may differ from our expectations", used by contract-adapt to generate inbound field mappers + +--- + +## 3. Naming Conventions + +| Element | Rule | Example | +|---|---|---| +| `name` | `.` snake_case | `ticket.create`, `faq.search`, `crm.write` | +| `port_class` | `src.ports..` | `src.ports.handoff_client.HandoffClient` | +| `default_adapter` / `mock_adapter` | `src.adapters..` | `src.adapters.local_queue.LocalQueueHandoffClient` | + +--- + +## 4. 
Full Example: `human-handoff` + +```yaml +business_contract: + port_class: "src.ports.handoff_client.HandoffClient" + default_adapter: "src.adapters.local_queue.LocalQueueHandoffClient" + mock_adapter: "src.adapters.mock.MockHandoffClient" + customization_sop: "INTERFACE_ADAPT.md" + external_apis: + - name: ticket.create + direction: outbound + method: POST + path: /tickets + description: "Create a new ticket in the ticketing system when user triggers handoff" + request_schema: + user_id: string + subject: string + description: string + priority: enum[low, normal, high, urgent] + transcript: string[] + response_schema: + ticket_id: string + queue_position: int + eta_seconds: int + adapter_slots: + - request.subject + - request.priority + - response.ticket_id + - response.queue_position + - response.eta_seconds + auth: + type: bearer + location: header + name: Authorization + timeout_ms: 5000 + + - name: ticket.status_query + direction: outbound + method: GET + path: /tickets/{ticket_id} + description: "Poll ticket status for queue progress updates" + request_schema: + ticket_id: string + response_schema: + ticket_id: string + status: enum[pending, processing, closed, canceled] + agent_id: string + updated_at: int + adapter_slots: + - response.status + - response.agent_id + timeout_ms: 3000 + + - name: ticket.cancel + direction: outbound + method: POST + path: /tickets/{ticket_id}/cancel + description: "Notify ticketing system when user cancels handoff" + request_schema: + ticket_id: string + reason: string + response_schema: + ticket_id: string + canceled: bool + adapter_slots: + - request.reason + timeout_ms: 3000 + + - name: ticket.status_callback + direction: inbound + method: POST + path: /api/v1/handoff/callback/ticket-status + description: "Callback from ticketing system to notify status changes (optional; when disabled, status_query polling is used instead)" + request_schema: + ticket_id: string + status: enum[pending, processing, closed, canceled] + 
agent_id: string + response_schema: + code: int + message: string + adapter_slots: + - request.status + - request.agent_id +``` + +--- + +## 5. `tool-calling` Exclusive Contract Section (replaces §1 port/adapter triple) + +```yaml +business_contract: + alpha_track: # Alpha track: local function registration spec + interface: "src.ports.local_tool.LocalTool" + registration_schema: + name: string # Tool name (globally unique) + description: string + parameters: object # JSON Schema describing parameter structure + handler: callable # Function object (runtime-only) + invocation_schema: + input: object # Same schema as parameters + output: object # User-defined return structure + fail_fast: bool # Default true: local exceptions thrown immediately for arbitration decisions + + beta_track: # Beta track: remote business API integration spec + interface: "src.ports.remote_tool.RemoteToolClient" + api_schema: + method: enum[GET, POST, PUT, DELETE, PATCH] + path: string + request_schema: object + response_schema: object + auth: enum[bearer, api_key, none] + timeout_ms: 5000 + retry: { max: 1, backoff_ms: 200 } + + arbitration: # Arbitration rules + default_priority: enum[alpha, beta, manifest_order] + fallback_on_failure: bool + timeout_ms: int # Single-track call timeout; triggers fallback on expiry + merge_strategy: enum[first_success, alpha_then_beta_diff] +``` + +`merge_strategy` values: + +| Value | Behavior | +|---|---| +| `first_success` | Return immediately on priority track success; backup track not called (default) | +| `alpha_then_beta_diff` | Both tracks called; diff log recorded when results differ (for canary comparison) | + +--- + +## 6. How `contract-adapt.py` Consumes This Field + +1. Read `business_contract.external_apis[].request_schema` / `response_schema` +2. Parse user-submitted curl / OpenAPI, extract the user API's schema +3. Compare against `adapter_slots` list to generate field mapping `mapping.yaml` +4. 
Render adapter template (inheriting `port_class`), output to `src/adapters/user_custom.py` +5. Three-level degradation: + - **L1**: Only `adapter_slots` field differences → fully executable adapter + - **L2**: Schema nesting or type differences exist → adapter template + TODO comments + - **L3**: Protocol-level differences (webhook / MQ / gRPC) or parse failure → output corresponding `INTERFACE_ADAPT.md` section path + +--- + +## 7. Validation Rules (mandatory at resolver stage) + +| Rule | Error Code | Behavior | +|---|---|---| +| `port_class` / `default_adapter` / `mock_adapter` any not importable | `BC001` | Resolve failure, block install | +| `external_apis[].name` duplicate | `BC002` | Same as above | +| `direction = outbound` without `method` or `path` | `BC003` | Same as above | +| `adapter_slots` path not found in `request_schema` / `response_schema` | `BC004` | Warning only, does not block | +| `tool-calling.arbitration.default_priority` invalid value | `BC005` | Resolve failure | +| `auth.type = bearer` but no env variable source declared | `BC006` | Warning only | + +Implementation location: `scripts/lib/contract_resolver.py` (Phase 3 Stage 4 implementation). +In Phase 1, only the field definitions are agreed upon; resolver validation will be implemented in Stage 4. + +--- + +## 8. Relationship with Existing Manifest Fields + +- `business_contract` and existing `extensions` / `endpoints` / `integration` fields **do not affect each other**; can be independently added/removed +- `endpoints` describes "the REST endpoints we expose to the frontend / user" +- `business_contract.external_apis` describes "the interfaces we call / are called back by the business side" +- Both coexist without conflict, serving different consumers (frontend / Agent / business side) + +--- + +## 9. 
Version Compatibility + +- Current spec version: `v1.0` +- No forward compatibility; if a future breaking change is needed, it will be marked with a `business_contract.spec_version: "2.0"` field +- The resource resolver (`manifest_resolver.py`) currently ignores unknown fields; adding this field will not break existing Phase 1/2 functionality diff --git a/skills/trtc-ai-service/scenarios/custom-builder/README.md b/skills/trtc-ai-service/scenarios/custom-builder/README.md new file mode 100644 index 0000000..6dea1a5 --- /dev/null +++ b/skills/trtc-ai-service/scenarios/custom-builder/README.md @@ -0,0 +1,86 @@ +# scenarios/custom-builder —— Path B Custom Flow + +> Companion doc: repo root `SKILL.md` (Path B SOP §6). + +This directory contains all artifacts for **Path B** ("Custom"). It has **no executable scripts** — +the 4-round Q&A is entirely facilitated by the Coding Agent using `ask_followup_question`. +This directory provides only two types of static materials: + +``` +scenarios/custom-builder/ +├── README.md ← This file +├── prompts/ ← Question templates for AI (does not modify user's project) +│ ├── q1-business-scenario.md ← Q1: Business description (free text) +│ ├── q2-io-modality.md ← Q2: I/O modality (4 choose 1) +│ ├── q3-ui-form.md ← Q3: UI form (3 choose 1) +│ └── q4-capabilities.md ← Q4: Capability selection (multi-select; defaults to none) +└── output-templates/ + └── recipe.yaml.j2 ← AI rendering artifact template (output to /recipe.yaml) +``` + +--- + +## AI Execution Flow (aligned with SKILL.md §6) + +| Step | Tool | Source | +|---|---|---| +| 6.1 | `ask_followup_question` (free text) | `prompts/q1-business-scenario.md` | +| 6.2 | `ask_followup_question` (single-select 4 items) | `prompts/q2-io-modality.md` | +| 6.3 | `ask_followup_question` (single-select 3 items) | `prompts/q3-ui-form.md` | +| 6.4 | `ask_followup_question` (multi-select 4 items) | `prompts/q4-capabilities.md` | +| 6.5 | `write_to_file` render `recipe.yaml` | 
`output-templates/recipe.yaml.j2` | +| 6.6 | `execute_command("python3 scripts/add-capability.py --apply --json")` | Q4 selections; skip if none | +| 6.7 | Remaining steps same as Path A (§7 Keys → §8 Contract → §9 Launch) | SKILL.md | + +--- + +## Constraints / Red Lines + +- **No builder.py**: The 4-round Q&A is **entirely** facilitated by AI; do not turn this into a local script (the user experience would drop out of the chat window) +- **No manifest.yaml generation**: Each capability already has its own `manifest.yaml`; Path B does not need to regenerate +- **prompts/q*.md are static materials**: AI **reads** them only; no content modification / re-formatting +- **recipe.yaml.j2 uses Jinja2 syntax**: But the AI does not need to actually invoke a Jinja2 interpreter; it can do string replacement mentally and then `write_to_file` the final yaml. The template is just a **structural contract** for the AI + +--- + +## After Collecting All Answer Variables, AI Constructs This Context + +```python +context = { + # Q1 + "business_desc": "", + "business_name": "", + + # Q2 option → internal enum + "io_modality": "text_with_tts", # text_only | text_with_tts | omni | voice_only + + # Q3 option → internal enum + "ui_form": "floating", # floating | fullscreen | headless + + # Q4 user-selected capability array + "extra_capabilities": [ # any subset; empty array installs skeleton only + "knowledge-base", + "human-handoff", + # "tool-calling", + # "session-summary", + ], + + # Meta info (AI fills in) + "render_time": "", + "rendered_by": "Coding Agent", +} +``` + +Feed this to `output-templates/recipe.yaml.j2` to get the customized `/recipe.yaml`. 
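
Before rendering, the collected context can be sanity-checked against the enums listed in the comments above. The following is a minimal sketch, not part of the repository: the function name `validate_context` and the wording of the messages are hypothetical, and the allowed values are simply copied from the context comments and the Q4 capability list.

```python
from __future__ import annotations

# Allowed values, copied from the context comments above (assumed to be exhaustive).
IO_MODALITIES = {"text_only", "text_with_tts", "omni", "voice_only"}
UI_FORMS = {"floating", "fullscreen", "headless"}
CAPABILITIES = {"knowledge-base", "human-handoff", "tool-calling", "session-summary"}


def validate_context(context: dict) -> list[str]:
    """Hypothetical helper: return a list of problems; an empty list means the context looks usable."""
    problems = []
    # Q1 is required, so an empty or whitespace-only description is rejected.
    if not str(context.get("business_desc", "")).strip():
        problems.append("business_desc is required (Q1)")
    if context.get("io_modality") not in IO_MODALITIES:
        problems.append(f"io_modality must be one of {sorted(IO_MODALITIES)}")
    if context.get("ui_form") not in UI_FORMS:
        problems.append(f"ui_form must be one of {sorted(UI_FORMS)}")
    # Any subset of the four capabilities is valid, including the empty list.
    unknown = [c for c in context.get("extra_capabilities", []) if c not in CAPABILITIES]
    if unknown:
        problems.append("unknown capabilities: " + ", ".join(unknown))
    return problems
```

An empty result means the four answers map cleanly onto the template inputs; anything else is a cue to re-ask the corresponding question rather than guess.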
+ +--- + +## Differences from Path A (reference table) + +| Dimension | Path A | Path B | +|---|---|---| +| Entry | "Build me an AI customer service agent with TRTC" | Same, SKILL.md §4 choose B | +| Installed capabilities | Default `knowledge-base + human-handoff` | Default none; user selects via Q4 | +| Business prompt | Just ask "describe your business" before launch | Q1 is required (more explicit) | +| UI | Floating widget + ticket dashboard (default) | Controlled by Q3: floating / fullscreen / headless | +| recipe.yaml location | `scenarios/customer-service/recipe.yaml` (static in repo) | `/recipe.yaml` (generated each time; can be manually edited and re-installed) | diff --git a/skills/trtc-ai-service/scenarios/custom-builder/output-templates/recipe.yaml.j2 b/skills/trtc-ai-service/scenarios/custom-builder/output-templates/recipe.yaml.j2 new file mode 100644 index 0000000..18b7040 --- /dev/null +++ b/skills/trtc-ai-service/scenarios/custom-builder/output-templates/recipe.yaml.j2 @@ -0,0 +1,194 @@ +{# ========================================================================= + Path B Custom Recipe Render Template (Jinja2) + + Output: /recipe.yaml + Input (collected by AI in Q1~Q4): + business_desc string Q1 user answer (required, raw text) + business_name string Optional (default "Our Company") + io_modality enum Q2 option → internal enum: + text_only | text_with_tts | omni | voice_only + ui_form enum Q3 option → floating | fullscreen | headless + extra_capabilities list Q4 selected subset; values: + knowledge-base | human-handoff | tool-calling | session-summary + + AI rendering notes: + - Do not push rendering logic down to Python; prefer the AI to perform string replacement / Jinja eval + mentally, then write the final recipe.yaml all at once with write_to_file + - If a boolean field cannot be mapped (vague user answer), fallback to the defaults below and confirm with the user +========================================================================= #} +# 
===================================================================== +# Path B Custom Recipe — {{ business_name | default("Our Company") }} AI Customer Service +# +# Generated by SKILL.md §7 flow + custom-builder/output-templates/recipe.yaml.j2 +# Render time: {{ render_time | default("(AI writes ISO timestamp)") }} +# Rendered by: {{ rendered_by | default("Coding Agent") }} +# ===================================================================== +apiVersion: ai-customer-service/v1 +kind: Recipe +metadata: + name: customer-service-custom + scenario: customer-service + source: custom-builder + version: "1.0.0" + description: "Generated from Path B 4-round Q&A; can be manually edited and reassembled" + +# --------------------------------------------------------------------------- +# Q1: Business description (drives system prompt rendering) +# --------------------------------------------------------------------------- +agent_runtime: + language: "{{ language | default('en') }}" + voice_id: "{{ voice_id | default('') }}" + greeting: "{{ greeting | default('Hello, this is the AI customer service for ' ~ (business_name | default('Our Company')) ~ '. 
How can we help you?') }}" + max_idle_time: 60 + system_prompt: + template_file: "../customer-service/system-prompt.template.md" + variables: + business_desc: | + {{ business_desc | trim | replace('\n', '\n ') }} + business_name: "{{ business_name | default('Our Company') }}" + +# --------------------------------------------------------------------------- +# Q2: I/O Modality +# --------------------------------------------------------------------------- +runtime_modality: + preset: "{{ io_modality }}" # text_only | text_with_tts | omni | voice_only +{%- if io_modality == "text_only" %} + voice_input: false + voice_output: false + text_input: true + text_output: true +{%- elif io_modality == "text_with_tts" %} + voice_input: false + voice_output: true + text_input: true + text_output: true +{%- elif io_modality == "omni" %} + voice_input: true + voice_output: true + text_input: true + text_output: true +{%- elif io_modality == "voice_only" %} + voice_input: true + voice_output: true + text_input: false + text_output: false +{%- else %} + # Unknown modality enum; fallback to text_with_tts + voice_input: false + voice_output: true + text_input: true + text_output: true +{%- endif %} + +# --------------------------------------------------------------------------- +# Q3: UI Form +# +# UI base notes (important): +# widget-floating has been upgraded to a "real connection" implementation — it reuses +# conversation-core's TRTC + agent/start pipeline via agent-link.js; AI replies come +# from LLM subtitle stream, and capabilities (handoff / tools / summaries) are +# dynamically mounted based on actual backend enablement. +# (The old hardcoded IM approach is deprecated; always include agent-link.js when +# overlaying the widget-floating directory.) +# +# B2 custom UI: The AI takes the widget-floating "real connection base" and generates +# a customized skin (business panel / copy / token values) based on the user's business +# info and color preferences. 
The core pipeline is unchanged, ensuring the AI is always +# genuinely connected and satisfies design-system hard constraints (fonts / icons / +# no emoji / via tokens). +# --------------------------------------------------------------------------- +ui: + form: "{{ ui_form }}" # floating | fullscreen | headless +{%- if ui_form == "headless" %} + overlay_required: false + ui_overlay: null +{%- else %} + overlay_required: true + ui_overlay: + source_root: "scenarios/customer-service/ui" + target: "capabilities/conversation-core/web-demo" + layers: + - source: "widget-floating" + target: "." + replace: true +{%- if "human-handoff" in extra_capabilities %} + - source: "admin-board" + target: "admin" + replace: true +{%- endif %} +{%- if ui_form == "fullscreen" %} + layout_hint: "fullscreen" # Floating-widget template with a fullscreen class on body (CSS forces full-page fill) +{%- endif %} +{%- endif %} + +# --------------------------------------------------------------------------- +# Q4: Capabilities Checklist +# Assembly command (AI executes in §7.3): +# python3 scripts/add-capability.py {{ extra_capabilities | join(' ') }} --apply --json +# Skip assembly when extra_capabilities is empty +# --------------------------------------------------------------------------- +capabilities: + required: + - name: conversation-core + role: skeleton + install: +{%- for cap in extra_capabilities %} +{%- if cap == "knowledge-base" %} + - name: knowledge-base + role: capability + adapter: mock # Default mock; switch to local_json or default_rest before launch + env: + KB_ADAPTER: mock + KB_TOP_K: "3" + KB_MIN_SCORE: "0.1" +{%- elif cap == "human-handoff" %} + - name: human-handoff + role: capability + adapter: local_queue # Default local queue; switch to default_rest to connect to real ticketing system + env: + HH_ADAPTER: local_queue +{%- elif cap == "tool-calling" %} + - name: tool-calling + role: capability + # For tool-calling alpha/beta dual-track details, see capabilities/tool-calling/INTERFACE_ADAPT.md +{%- elif cap 
== "session-summary" %} + - name: session-summary + role: capability + # Depends on LLM_API_KEY; ensure §5 LLM Key configuration is complete +{%- endif %} +{%- endfor %} +{%- if not extra_capabilities %} + [] # No capability selected; install skeleton only +{%- endif %} + optional: [] + excluded: + - name: digital-human # Not in current scope + +# --------------------------------------------------------------------------- +# Post-launch must-do: contract checklist reminder (see SKILL.md §9) +# --------------------------------------------------------------------------- +post_assembly: + contract_review: + enabled: true + capabilities: +{%- for cap in extra_capabilities %} + - "{{ cap }}" +{%- endfor %} +{%- if not extra_capabilities %} + [] +{%- endif %} + sop_section: "SKILL.md#8" + health_check: + url: "http://localhost:3000/api/v1/health" + expect_field: "status" + expect_value: "ok" + +# --------------------------------------------------------------------------- +# Design language lock (consistent with Path A) +# --------------------------------------------------------------------------- +design: + tokens_file: "../../design_tokens.json" + guidelines: "../customer-service/ui/design-system/DESIGN_GUIDELINES.md" + theme: "dark" + emoji_in_ui: false + font_family: "SF Pro, Inter, 'Helvetica Neue', sans-serif" diff --git a/skills/trtc-ai-service/scenarios/custom-builder/prompts/q1-business-scenario.md b/skills/trtc-ai-service/scenarios/custom-builder/prompts/q1-business-scenario.md new file mode 100644 index 0000000..4007269 --- /dev/null +++ b/skills/trtc-ai-service/scenarios/custom-builder/prompts/q1-business-scenario.md @@ -0,0 +1,43 @@ +# Q1 —— Business Description (free text) + +> Path B Question 1. AI uses `ask_followup_question` as a standalone question, **without options**, letting the user type freely. +> +> AI saves the user's answer to the internal variable `business_desc`, for later use in: +> 1. 
Rendering `{{business_desc}}` in `scenarios/customer-service/system-prompt.template.md` +> 2. Writing to `/recipe.yaml` at `agent_runtime.system_prompt.variables.business_desc` +> +> AI must **not** guess the industry; leave unsaid fields blank and backfill after Q4. + +--- + +## What the AI should say + +> Question 1 (of 4): What business is your customer service bot for? +> Just describe the **business scope** and **typical questions** in a sentence or two. For example: +> +> - "We are an e-commerce store selling smart home appliances — air fryers, robot vacuums, humidifiers. Users usually ask about warranty, returns, and shipping." +> - "I run customer support for a SaaS HR platform. Common issues are login failures, org structure sync, and plan upgrades." +> - "A restaurant delivery service. Users mainly ask about order status, refunds, menu stock, and delivery fees." +> +> The more specific your business, the better the final system prompt will match your real scenario. + +--- + +## Validation after receiving the answer + +- Length ≥ 8 and ≤ 600 characters. If too short, follow up: "That's quite brief — could you add a bit more about typical issues or industry keywords?" +- Must contain at least one noun phrase (industry name, product name, user type); pure interjections or casual chat → **re-ask** +- Do not ask the user to provide brand name / company name (if needed, the template uses placeholder `{{business_name | default('we')}}`) + +--- + +## Answer write-back + +```yaml +# Render to /recipe.yaml +agent_runtime: + system_prompt: + variables: + business_desc: | + +``` diff --git a/skills/trtc-ai-service/scenarios/custom-builder/prompts/q2-io-modality.md b/skills/trtc-ai-service/scenarios/custom-builder/prompts/q2-io-modality.md new file mode 100644 index 0000000..a4156d1 --- /dev/null +++ b/skills/trtc-ai-service/scenarios/custom-builder/prompts/q2-io-modality.md @@ -0,0 +1,57 @@ +# Q2 —— I/O Modality (4 choose 1) + +> Path B Question 2. 
AI uses `ask_followup_question` in **single-select** mode. +> +> Answer written to internal variable `io_modality` (mapped to English enums in the table below), later used to: +> 1. Determine whether `agent_runtime.greeting` uses TTS audio +> 2. Set conversation-core `io_modality.*.enabled` fields +> 3. Decide whether the floating widget UI exposes a microphone button + +--- + +## What the AI should say (recommend copy-pasting directly into ask_followup_question) + +> Question 2: What I/O modality should be used between end users and the AI agent? + +`options` (keep order; order corresponds one-to-one with enums below): + +```text +① Text-only IM (user types → AI replies in text; no voice) +② Text + TTS (user types → AI replies in text + reads aloud, recommended) +③ Omni-modal (voice + text, bidirectional; user can also speak) +④ Voice-only call (user dials → AI answers → full voice; no demo UI in this release) +``` + +`multiSelect: false` + +--- + +## Option → Backend Config Mapping + +| User Option | Internal Enum (`io_modality`) | conversation-core io_modality Config | UI Impact | +|---|---|---|---| +| ① Text-only | `text_only` | voice_input=disabled, voice_output=disabled | Widget shows input box only; mic hidden | +| ② Text + TTS (recommended) | `text_with_tts` | voice_input=disabled, voice_output=enabled (trtc-tts) | Widget shows input box + "read aloud" toggle; mic hidden | +| ③ Omni-modal | `omni` | voice_input=enabled (trtc-asr), voice_output=enabled | Widget shows input box + mic (push-to-talk) | +| ④ Voice-only call | `voice_only` | voice_input=enabled, voice_output=enabled, text_input=disabled | No UI; backend phone gateway only (no floating widget in this release) | + +--- + +## Validation / Fallback + +- User picks ④ but Q3 picks "floating" → warn about conflict, guide user to change Q3 to "headless" +- User picks ② / ③ but LLM verification fails → save the choice, still write to recipe per user's intent; after launch, the widget will show "voice 
output depends on TTS key, currently unavailable" + +--- + +## Answer write-back + +```yaml +# Render to /recipe.yaml +runtime_modality: + preset: text_with_tts # From "Internal Enum" column above + voice_input: false + voice_output: true + text_input: true + text_output: true +``` diff --git a/skills/trtc-ai-service/scenarios/custom-builder/prompts/q3-ui-form.md b/skills/trtc-ai-service/scenarios/custom-builder/prompts/q3-ui-form.md new file mode 100644 index 0000000..7d6380d --- /dev/null +++ b/skills/trtc-ai-service/scenarios/custom-builder/prompts/q3-ui-form.md @@ -0,0 +1,55 @@ +# Q3 —— UI Form (3 choose 1) + +> Path B Question 3. AI uses `ask_followup_question` in **single-select** mode. +> +> Answer written to internal variable `ui_form`, later used to determine: +> 1. Floating widget / fullscreen page / headless three deployment modes +> 2. Whether to overlay UI onto `capabilities/conversation-core/web-demo` +> 3. Whether to generate integration guides (references to `integration-templates/*.md`) + +--- + +## What the AI should say + +> Question 3: Where do you want the AI customer service to "appear"? 
+ +`options`: + +```text +① Floating widget (embedded in the bottom-right corner of your existing page, recommended) +② Fullscreen chat page (standalone page / sub-route, full-page conversation) +③ Backend API only (you build your own frontend / integrate into existing IM, no demo UI needed) +``` + +`multiSelect: false` + +--- + +## Option → Behavior Mapping + +| User Option | Internal Enum (`ui_form`) | recipe.ui_overlay | Notes | +|---|---|---|---| +| ① Floating | `floating` | source=`scenarios/customer-service/ui/widget-floating`, target=`web-demo/` | Same as Path A | +| ② Fullscreen | `fullscreen` | source=`scenarios/customer-service/ui/widget-floating` but `target=web-demo/`; after launch, guide user to open `/?layout=full` (or self-extend a dedicated template) | No dedicated fullscreen template in this release; reuses floating template forced to fullscreen via CSS class hook | +| ③ Backend only | `headless` | `ui_overlay: null` | Only installs capability packages; artifacts only expose `/api/v1/*` | + +> **Note**: This release does not generate a dedicated fullscreen conversation template (`fullscreen` reuses floating CSS forced to fullscreen); +> for finer control, a `widget-fullscreen/` subdirectory under `scenarios/customer-service/ui/` can be added later. 
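
The mapping above, combined with the `admin-board` layer that `recipe.yaml.j2` adds when `human-handoff` is selected, can be read as a small decision function. This is only an illustrative sketch: `overlay_layers` is a hypothetical name, and it returns just the layer source directories rather than the full `ui_overlay` block.

```python
from __future__ import annotations


def overlay_layers(ui_form: str, extra_capabilities: list[str]) -> list[str]:
    """Hypothetical helper: list the UI source directories to overlay for a given Q3 answer."""
    # Headless installs capability packages only, so no UI is copied.
    if ui_form == "headless":
        return []
    # Floating and fullscreen both reuse the widget-floating template in this release.
    layers = ["widget-floating"]
    # The agent dashboard is only overlaid when human-handoff was selected in Q4.
    if "human-handoff" in extra_capabilities:
        layers.append("admin-board")
    return layers
```

Note that `fullscreen` differs from `floating` only by the layout hint described above, not by the set of overlaid directories.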
+ +--- + +## Validation / Fallback + +- Cross-validation with Q2 (`io_modality`): see Q2 "Validation / Fallback" +- Choosing ③ skips cp overlay; but AI must still remind user about external integration docs per §8 (`auto_adapters/integration_templates/generic-frontend.md`) + +--- + +## Answer write-back + +```yaml +# Render to /recipe.yaml +ui: + form: floating # floating | fullscreen | headless + overlay_required: true # false in headless mode +``` diff --git a/skills/trtc-ai-service/scenarios/custom-builder/prompts/q4-capabilities.md b/skills/trtc-ai-service/scenarios/custom-builder/prompts/q4-capabilities.md new file mode 100644 index 0000000..4d1481a --- /dev/null +++ b/skills/trtc-ai-service/scenarios/custom-builder/prompts/q4-capabilities.md @@ -0,0 +1,78 @@ +# Q4 —— Capability Selection (multi-select; defaults to none) + +> Path B Question 4. AI uses `ask_followup_question` in **multi-select** mode (`multiSelect: true`). +> +> Answer written to internal variable `extra_capabilities` (string array), used to determine: +> 1. Capability list for Path B assembly command: `add-capability.py conversation-core ` +> 2. recipe.yaml `capabilities.install` list +> +> **Important**: Unlike Path A (which defaults to KB + HH), Path B defaults to **none**, installing only the conversation-core skeleton. +> Only capabilities **explicitly selected** by the user are added to the list. + +--- + +## What the AI should say + +> Question 4: Besides the conversation skeleton (conversation-core), what additional capabilities do you want to layer on? +> (Multi-select; you can also select none. Defaults to skeleton only.) 
+ +`options`: + +```text +① knowledge-base — FAQ / knowledge base retrieval +② human-handoff — Human handoff + ticket flow (with agent dashboard) +③ tool-calling — Let the AI call your business tools / remote APIs +④ session-summary — Auto-generate a summary / ticket note when a session ends +``` + +`multiSelect: true` + +--- + +## Capabilities not in options (explanation talking points) + +| Capability | Why it's not listed | +|---|---| +| `digital-human` | Currently a placeholder capability (manifest hasn't completed ports/adapters); for digital human, please wait for a future version | + +--- + +## Validation / Fallback + +- Q4 all empty → skip `add-capability.py` call (conversation-core skeleton only, already in the repo) +- Selected `tool-calling` but Q2 picked "voice-only call" → warn "tool calling will not display intermediate status on a voice-only channel"; ask user to confirm whether to keep it +- Selected `session-summary` but `LLM_API_KEY` not configured → warn "session summary depends on LLM Key; please complete LLM Key configuration in §7" + +--- + +## Options → Assembly Command + +```bash +# AI executes at Path B Step 6 (skip the entire command when Q4 is empty): +python3 scripts/add-capability.py \ + knowledge-base human-handoff tool-calling session-summary \ + --apply --json +``` + +> The actual command only includes the capability names the user **selected**; the above is the "select all" example. 
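
As a sketch of how the Q4 selections turn into that command, assuming the argument order shown in the example above (capability names first, then `--apply --json`); the helper name `build_assembly_command` is hypothetical and does not exist in the repository:

```python
from __future__ import annotations

import shlex


def build_assembly_command(selected: list[str]) -> str | None:
    """Hypothetical helper: build the add-capability.py command, or None when Q4 is empty."""
    # Empty selection means skeleton only, so the whole command is skipped.
    if not selected:
        return None
    argv = ["python3", "scripts/add-capability.py", *selected, "--apply", "--json"]
    # shlex.join quotes arguments only when they contain shell-unsafe characters.
    return shlex.join(argv)
```

For a user who picked only `knowledge-base` and `tool-calling`, this yields a command containing just those two names, matching the rule that unselected capabilities are never passed.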
+ +--- + +## Answer write-back + +```yaml +# Render to /recipe.yaml +capabilities: + required: + - name: conversation-core + role: skeleton + install: + # User-selected capabilities (append one entry per selected; adapter defaults to manifest.config.adapter.default) + - name: knowledge-base + adapter: mock + - name: human-handoff + adapter: local_queue + optional: [] + excluded: + - name: digital-human # Not participating in this release +``` diff --git a/skills/trtc-ai-service/scenarios/customer-service/README.md b/skills/trtc-ai-service/scenarios/customer-service/README.md new file mode 100644 index 0000000..4bb4a72 --- /dev/null +++ b/skills/trtc-ai-service/scenarios/customer-service/README.md @@ -0,0 +1,114 @@ +# scenarios/customer-service —— Path A Default Recipe + +> Companion doc: repo root `SKILL.md` (Path A SOP §5). + +This directory is the "ready-to-use" AI customer service demo for Path A — after the user says +"Build me an AI customer service agent with TRTC", the Coding Agent follows the 6-step workflow +in `SKILL.md §5` to install the `knowledge-base` + `human-handoff` + `session-summary` capability packages, run +`post-install-patch.py`, overlay the UI, and serve the **Voice Agent UI** at `http://localhost:3000`. 
+ +--- + +## Default Artifact: voice-customer-service (v1.1) + +- **Real Voice Agent**: Built on conversation-core voice pipeline (TRTC enterRoom + agent/start + ASR/LLM/TTS) +- **Silent RAG**: Frontend calls `/api/v1/kb/search` on every user message; hits are **not shown as cards**, instead absorbed naturally by the LLM (keeping the conversation stream clean) +- **Handoff Queue Animation**: 8s progress bar + shimmer highlight + countdown; then calls `/handoff/connect` simulating `demo_agent_alex` pickup; polling for `state=connected` switches the badge +- **Product / Order Business Panel**: Left sidebar, clicking a card auto-initiates an inquiry +- **Design Compliant**: Light glassmorphism + purple-pink accent + Lucide SVG icons + tokens.css; **zero emoji** in UI +- **English-only**: All UI copy / mock data / FAQ / keyword triggers are in English (targeting overseas developers) +- **Top Bar LED hover tooltips**: Clear separation of Tencent Cloud (CAM/STS control plane) / TRTC (media data plane) / LLM (replaceable inference engine) responsibilities + +--- + +## Directory Overview + +``` +scenarios/customer-service/ +├── README.md ← This file +├── recipe.yaml ← Path A recipe (parsed by AI) +├── system-prompt.template.md ← Neutral system prompt template +├── sample-data/ +│ └── faq-sample.json ← 5 demo FAQ entries (English) +└── ui/ + ├── design-system/ + │ └── DESIGN_GUIDELINES.md ← Design spec (mandatory) + ├── voice-customer-service/ ← ⭐ Default UI (v1.1 Voice Agent) + │ ├── README.md + │ ├── index.html ← Contains Lucide SVG icon defs + three-column layout + │ ├── app.js ← TRTC pipeline + silent KB + HH progress bar + dedup + │ ├── styles.css ← Light glassmorphism + tooltip + progress animation + │ ├── mock-shop.json ← 3 products + 3 orders (English) + │ └── tokens.css ← Light glassmorphism design tokens (hand-aligned to the running theme) + ├── widget-floating/ ← Alternative: lightweight text IM floating widget (no TRTC voice) + └── admin-board/ ← Ticket agent 
dashboard (operations-side) +``` + +--- + +## For AI / Developers: Manual Deployment + +> Path A SOP is driven by `SKILL.md §5`; below is the **bare command version** (for local manual verification): + +```bash +# 1. Install KB + HH + session-summary (default mock + local_queue adapters) +python3 scripts/add-capability.py knowledge-base human-handoff session-summary --apply --json + +# 2. Post-install patch (fix legacy injection misalignment + write default .env capability config + validate server.py) +python3 scripts/post-install-patch.py + +# 3. UI overlay: voice-customer-service (default) + admin-board +cp scenarios/customer-service/ui/voice-customer-service/{index.html,app.js,styles.css,data.js,mock-shop.json,tokens.css} \ + capabilities/conversation-core/web-demo/ +mkdir -p capabilities/conversation-core/web-demo/admin +cp -R scenarios/customer-service/ui/admin-board/. \ + capabilities/conversation-core/web-demo/admin/ + +# 4. Start (first launch creates venv + pip install, 30-60s) +bash start.sh +``` + +After startup, access: + +| Entry | URL | Purpose | +|---|---|---| +| AI Voice Agent | http://localhost:3000 | End-user voice + text dual-mode conversation | +| Admin board | http://localhost:3000/static/admin/ | Agent view / connect / close tickets | +| Health probe | http://localhost:3000/api/v1/health | Three LED JSON | +| API docs | http://localhost:3000/docs | FastAPI Swagger | + +> ⚠ Previous docs incorrectly mentioned `/admin/tickets` — that route does not exist. **Correct path is `/static/admin/`**. + +--- + +## Switching to Another UI + +If you don't want the voice channel and only need a lightweight text IM floating widget, change Step 3 to: + +```bash +cp -R scenarios/customer-service/ui/widget-floating/. \ + capabilities/conversation-core/web-demo/ +``` + +`widget-floating` calls `/api/v1/kb/search` + `/api/v1/handoff/request` (pure REST text IM) — it does **not** open a TRTC room. 
+ +--- + +## Design Language + +- Fully references `design_tokens.json` v1.1.0; all color values / font sizes / spacing in the UI use CSS variables +- Font family locked to `SF Pro / Inter / Helvetica Neue` +- **Emoji disabled** in UI; status indicators use `color.status.{success,info,warning,error}` namespace +- Frosted glass panels include `@supports` fallback; older browsers degrade to semi-transparent solid panels + +Any changes to `tokens.css` must first modify `design_tokens.json`, then recompile. + +--- + +## Going to Production + +1. **KB**: `KB_ADAPTER=mock` → `local_json` (point to real FAQ file) or `default_rest` (connect to external knowledge base) +2. **Handoff**: `HH_ADAPTER=local_queue` → `default_rest`; if the API diverges from the default contract, use the `SKILL.md §8.3` contract-adapt flow to generate `user_custom.py` +3. **UI**: The `