Core Concepts

zhouning edited this page Mar 22, 2026 · 1 revision

GIS Data Agent v14.5 uses a layered architecture that combines AI agents, domain skills, tool functions, and semantic routing to automate the full path from natural-language input to spatial analysis results.

Architecture Layers

Agent

An Agent is the system's execution unit. Each Agent is a Google ADK LlmAgent instance with:

  • Dedicated toolsets: Each Agent is bound to a set of Toolsets that define its capability boundary
  • System prompts: YAML-format prompts defining the Agent's behavioral norms and expertise (stored in prompts/)
  • Output key (output_key): Agents share state via output_key (e.g., data_profile, processed_data, analysis_report)
  • ADK constraint: ADK allows an agent instance only one parent, so the same Agent instance cannot be reused across multiple Pipelines. Factory functions (_make_planner_*()) create an independent instance for each Pipeline

The system defines three Pipelines, each a SequentialAgent chaining multiple specialized Agents for collaborative task completion.
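The single-parent constraint and the factory workaround can be illustrated with a minimal sketch. StubAgent, StubPipeline, and make_planner are hypothetical stand-ins written for this example; they are not the ADK classes or the project's own _make_planner_*() functions, and the real ADK check may behave differently.

```python
class StubAgent:
    """Stand-in for an agent node; NOT the real ADK LlmAgent."""

    def __init__(self, name: str):
        self.name = name
        self.parent = None  # set once the agent is attached to a pipeline


class StubPipeline:
    """Stand-in for a sequential pipeline that allows one parent per agent."""

    def __init__(self, name: str, sub_agents: list):
        self.name = name
        self.sub_agents = []
        for agent in sub_agents:
            if agent.parent is not None:
                # Mimics the "single parent" rule described above.
                raise ValueError(
                    f"{agent.name} already belongs to {agent.parent.name}"
                )
            agent.parent = self
            self.sub_agents.append(agent)


def make_planner(pipeline_name: str) -> StubAgent:
    """Hypothetical factory: builds a fresh planner for each pipeline."""
    return StubAgent(name=f"planner_{pipeline_name}")


# Each pipeline receives its own planner instance, so no agent is shared.
optimization = StubPipeline("optimization", [make_planner("optimization")])
governance = StubPipeline("governance", [make_planner("governance")])
```

Passing optimization.sub_agents[0] to a second StubPipeline raises ValueError in this sketch, which is the situation the factory functions are meant to avoid.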

Skill

A Skill is a fine-grained domain expertise package. The system ships with 18 built-in Skills, each containing:

  • Domain: The professional domain it belongs to (GIS / Governance / Database / Visualization / Analysis / Fusion / General / Collaboration)
  • Intent triggers: Chinese and English keywords that activate the Skill
  • Instructions: Detailed operational procedures and best practices (Markdown SKILL.md)
  • Reference resources: Checklists, templates, evaluation criteria, and supporting files
  • Design pattern: e.g., Generator (template generation), Reviewer (checklist review), Inversion (interview mode)

Three-level incremental loading:

Level | Content                           | When loaded
L1    | Metadata (name, domain, triggers) | Always loaded at system startup
L2    | Instructions (SKILL.md body)      | Loaded when the skill is activated
L3    | Resources (references/, assets/)  | Loaded on demand during analysis
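The three loading levels can be sketched as a small lazy loader. This is an illustration under assumptions, not the project's loader: the LazySkill class, the "buffer-analysis" skill, and its file contents are invented for the example, and metadata is passed to the constructor here rather than parsed from a file.

```python
import tempfile
from functools import cached_property
from pathlib import Path


class LazySkill:
    """Hypothetical skill wrapper that defers file reads until needed."""

    def __init__(self, root: Path, name: str, domain: str, triggers: list):
        # L1: lightweight metadata, held in memory from startup.
        self.root = root
        self.name = name
        self.domain = domain
        self.triggers = triggers

    @cached_property
    def instructions(self) -> str:
        # L2: SKILL.md is read on first access (activation), then cached.
        return (self.root / "SKILL.md").read_text(encoding="utf-8")

    def load_reference(self, filename: str) -> str:
        # L3: reference files are read one at a time, only when requested.
        path = self.root / "references" / filename
        return path.read_text(encoding="utf-8")


with tempfile.TemporaryDirectory() as tmp:
    # Build a throwaway skill directory so the sketch runs anywhere.
    root = Path(tmp) / "buffer-analysis"
    (root / "references").mkdir(parents=True)
    (root / "SKILL.md").write_text("Step 1: validate the CRS.", encoding="utf-8")
    (root / "references" / "checklist.md").write_text(
        "- CRS checked", encoding="utf-8"
    )

    skill = LazySkill(root, "buffer-analysis", "GIS", ["buffer"])
    loaded_before = "instructions" in vars(skill)  # nothing read from disk yet
    body = skill.instructions                      # triggers the L2 read
    loaded_after = "instructions" in vars(skill)
    checklist = skill.load_reference("checklist.md")
```

The point of the split is that startup cost scales with the number of skills only at the metadata level, while the larger instruction and resource files are paid for per use.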

Tool

A Tool is the smallest execution unit an Agent can call — each Tool is a Python function wrapped as an ADK FunctionTool. The system has 170+ tool functions covering data exploration, spatial processing, visualization, database operations, remote sensing, and more.

Tool functions implicitly obtain user identity via contextvars.ContextVar, enabling multi-tenant isolation without explicitly passing user_id in every tool call.
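A minimal sketch of this pattern, using only the standard library, is shown below. The variable current_user_id, the tool list_user_files, and the uploads/ path layout are assumptions made for the example and do not come from the project.

```python
import asyncio
from contextvars import ContextVar

# Hypothetical context variable; the project's actual name may differ.
current_user_id: ContextVar[str] = ContextVar("current_user_id")


def list_user_files() -> str:
    """Hypothetical tool: resolves the caller from context, not arguments."""
    return f"uploads/{current_user_id.get()}/"


async def handle_request(user_id: str) -> str:
    # Each asyncio task runs in its own copy of the context, so this
    # assignment is invisible to requests handled concurrently.
    current_user_id.set(user_id)
    await asyncio.sleep(0)  # yield control so the two requests interleave
    return list_user_files()


async def main() -> list:
    # gather() wraps each coroutine in a separate task.
    return await asyncio.gather(
        handle_request("alice"),
        handle_request("bob"),
    )


results = asyncio.run(main())
```

Even though both requests interleave on one event loop, each tool call sees only the identity set in its own task, which is what makes the implicit lookup safe for multi-tenant use.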

Toolset

A Toolset is a logical grouping of related tools. Each Toolset extends BaseToolset and registers a cohesive set of tools from the same domain. The system has 28 Toolsets. Toolsets support:

  • Tool filtering (tool_filter): Selectively expose a subset of tools
  • Dynamic tools: e.g., McpHubToolset discovers tools from MCP servers at runtime; UserToolset loads user-defined tools
  • Pipeline awareness: Some Toolsets return different tool subsets depending on the active Pipeline
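Tool filtering and pipeline awareness can be sketched with a plain Python class. StubToolset, its hidden_in parameter, and the three tool functions are hypothetical; the real Toolsets extend ADK's BaseToolset, whose interface is not reproduced here.

```python
from typing import Callable, Dict, Iterable, List, Optional, Set


def describe_layer(path: str) -> str:
    return f"described {path}"


def buffer_layer(path: str, distance_m: float) -> str:
    return f"buffered {path} by {distance_m} m"


def delete_layer(path: str) -> str:
    return f"deleted {path}"


class StubToolset:
    """Stand-in for a toolset; not a subclass of the ADK base class."""

    def __init__(
        self,
        tools: Iterable[Callable],
        tool_filter: Optional[List[str]] = None,
        hidden_in: Optional[Dict[str, Set[str]]] = None,
    ):
        self._tools = list(tools)
        self._tool_filter = tool_filter    # names to expose, or None for all
        self._hidden_in = hidden_in or {}  # pipeline name -> hidden tool names

    def get_tools(self, pipeline: Optional[str] = None) -> List[Callable]:
        selected = self._tools
        if self._tool_filter is not None:
            # Tool filtering: keep only the explicitly listed tools.
            selected = [t for t in selected if t.__name__ in self._tool_filter]
        # Pipeline awareness: drop tools hidden for the active pipeline.
        hidden = self._hidden_in.get(pipeline, set())
        return [t for t in selected if t.__name__ not in hidden]


spatial = StubToolset(
    [describe_layer, buffer_layer, delete_layer],
    hidden_in={"general": {"delete_layer"}},
)
read_only = StubToolset(
    [describe_layer, buffer_layer, delete_layer],
    tool_filter=["describe_layer"],
)
```

In this sketch, spatial.get_tools("general") omits delete_layer, while read_only.get_tools() exposes describe_layer alone regardless of pipeline.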

Intent Router

The Intent Router is the semantic dispatch layer. It uses Gemini 2.0 Flash to classify each natural-language user message into one of three intents and routes the message to the matching Pipeline.

Key features:

  • Trilingual support: Chinese, English, and Japanese input is detected automatically
  • Classification prompt: A structured prompt guides the LLM to output one of the labels optimization / governance / general
  • RBAC gating: After routing, a role check blocks the viewer role from the Optimization and Governance Pipelines
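The label handling and the role check can be sketched as follows. The classifier is injected as a plain callable so the example runs offline; keyword_classifier is a stand-in for the Gemini call, and the fallback to general on an unexpected label, the "analyst" role name, and the function names are assumptions made for this example.

```python
from typing import Callable

VALID_INTENTS = ("optimization", "governance", "general")
RESTRICTED_FOR_VIEWER = {"optimization", "governance"}


def parse_intent(raw_label: str) -> str:
    """Normalize the model's reply; assume 'general' for anything unexpected."""
    label = raw_label.strip().lower()
    return label if label in VALID_INTENTS else "general"


def route(message: str, role: str, classify: Callable[[str], str]) -> str:
    """Classify first, then apply the role gate to the chosen pipeline."""
    intent = parse_intent(classify(message))
    if role == "viewer" and intent in RESTRICTED_FOR_VIEWER:
        raise PermissionError(f"role 'viewer' may not use the {intent} pipeline")
    return intent


def keyword_classifier(message: str) -> str:
    """Offline stand-in for the LLM call; returns deliberately messy labels."""
    text = message.lower()
    if "optimi" in text:
        return "Optimization"
    if "audit" in text or "quality" in text:
        return "governance\n"
    return "unsure"


analyst_intent = route("Optimize the land-use layout", "analyst", keyword_classifier)
viewer_intent = route("Show me this layer", "viewer", keyword_classifier)
```

Normalizing the label before the role check means a stray capital letter or trailing newline in the model output cannot bypass the gate.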

Pipeline

A Pipeline is an orchestration container for Agents, using ADK SequentialAgent to chain multiple Agents sequentially. The three Pipelines handle different types of user requests.
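The way sequential Agents hand results to one another through output_key can be approximated in a few lines. StepAgent and run_sequential are simplified stand-ins rather than ADK's SequentialAgent, and the three lambda steps are invented; only the key names data_profile, processed_data, and analysis_report come from the text above.

```python
from typing import Any, Callable, Dict, List

State = Dict[str, Any]


class StepAgent:
    """Stand-in agent: computes a value and stores it under its output_key."""

    def __init__(self, name: str, output_key: str, run: Callable[[State], Any]):
        self.name = name
        self.output_key = output_key
        self._run = run

    def __call__(self, state: State) -> None:
        # Later agents read this entry from the shared state.
        state[self.output_key] = self._run(state)


def run_sequential(agents: List[StepAgent], state: State) -> State:
    """Run the agents strictly in order against one shared state dict."""
    for agent in agents:
        agent(state)
    return state


profiler = StepAgent(
    "profiler", "data_profile", lambda s: {"rows": len(s["records"])}
)
processor = StepAgent(
    "processor",
    "processed_data",
    lambda s: [r for r in s["records"] if r is not None],
)
reporter = StepAgent(
    "reporter",
    "analysis_report",
    lambda s: f"{len(s['processed_data'])} of {s['data_profile']['rows']} records usable",
)

final_state = run_sequential(
    [profiler, processor, reporter], {"records": [1, None, 3]}
)
```

Because the reporter reads both earlier keys, reordering the list would fail with a KeyError, which shows why the Pipeline order matters.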

Architecture Overview

User Message
  └→ Intent Router (Gemini 2.0 Flash classification)
      ├→ Optimization Pipeline (SequentialAgent)
      │   └→ Agent₁ → Agent₂ → ... → AgentN
      │       └→ Tools (from Toolsets)
      │           └→ Skills (scenario expertise)
      ├→ Governance Pipeline
      └→ General Pipeline

Key Interaction Flow

  1. User sends a message (text + optional file upload)
  2. app.py sets ContextVar user identity (per async task)
  3. File uploads are processed (ZIP auto-extraction for .shp/.kml/.geojson/.gpkg)
  4. classify_intent() calls Gemini Flash to classify intent
  5. RBAC role check (viewer restricted)
  6. Dispatch to the corresponding Pipeline
  7. Detect layer_control in response → inject map metadata
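Step 3 can be sketched with the standard zipfile module. The function extract_spatial_files and the sample file names are hypothetical; the project's actual upload handler is not shown in this page and may validate or organize files differently.

```python
import io
import tempfile
import zipfile
from pathlib import Path

# Extensions named in step 3 above.
SPATIAL_SUFFIXES = {".shp", ".kml", ".geojson", ".gpkg"}


def extract_spatial_files(zip_bytes: bytes, dest: Path) -> list:
    """Extract the whole archive, then return the spatial datasets found."""
    with zipfile.ZipFile(io.BytesIO(zip_bytes)) as archive:
        # Everything is extracted so shapefile sidecars (.dbf, .shx, ...)
        # stay next to their .shp file.
        archive.extractall(dest)
    return sorted(
        p for p in dest.rglob("*") if p.suffix.lower() in SPATIAL_SUFFIXES
    )


# Build a small in-memory archive so the sketch is self-contained.
buffer = io.BytesIO()
with zipfile.ZipFile(buffer, "w") as archive:
    for name in ("parcels.shp", "parcels.dbf", "roads.geojson", "notes.txt"):
        archive.writestr(name, b"")

with tempfile.TemporaryDirectory() as tmp:
    found = [p.name for p in extract_spatial_files(buffer.getvalue(), Path(tmp))]
```

Here found lists only parcels.shp and roads.geojson; the .dbf sidecar is extracted but not reported as a dataset of its own.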

User Self-Service Extension

Beyond built-in components, users can extend the system:

  • Custom Skills: Create custom LlmAgent instances through the UI, with user-written instructions, toolset selection, and keyword triggers. Custom Skills are stored in the database and isolated per user
  • User Tools: Declarative tool templates (http_call / sql_query / file_transform / chain) that are built dynamically into FunctionTool instances
  • Workflow Editor: A ReactFlow DAG editor for composing Custom Skills into multi-Agent pipelines
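A DAG drawn in a workflow editor has to be turned into an execution order before its nodes can run. One way to do that, sketched here with the standard graphlib module, is a topological sort. The node names and the execution_order helper are invented for the example, and nothing in this page confirms that the project resolves its workflows this way.

```python
from graphlib import CycleError, TopologicalSorter

# Hypothetical workflow: each node maps to the set of nodes it depends on.
workflow = {
    "profile": set(),
    "clean": {"profile"},
    "analyze": {"clean"},
    "map": {"clean"},
    "report": {"analyze", "map"},
}


def execution_order(graph: dict) -> list:
    """Return the nodes in an order that respects every dependency edge."""
    try:
        return list(TopologicalSorter(graph).static_order())
    except CycleError as err:
        # args[1] holds the list of nodes that form the detected cycle.
        raise ValueError(f"workflow contains a cycle: {err.args[1]}") from err


order = execution_order(workflow)
```

The relative order of independent nodes such as analyze and map is not fixed by the sort, so a scheduler could also run them in parallel.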
