Data Governance Guide
GIS Data Agent v14.5: AI geospatial platform on Google ADK

Repository: https://github.com/zhouning/gisdataagent
GIS Data Agent's data governance workflow performs a 6-dimension quality audit on spatial data, automatically discovering and remediating common issues:
| Dimension | Description |
|---|---|
| Topology integrity | Self-intersections, overlaps, holes, and other geometry errors |
| Attribute completeness | Missing required fields, null-value ratios |
| Data gaps | Blank areas within feature coverage extents |
| CRS consistency | Coordinate Reference System uniformity and projection validity |
| Attribute validity | Field value range checks and outlier detection |
| Duplicates | Records with identical geometry or identical attributes |
Supported formats: Shapefile (.shp), GeoJSON, GeoPackage (.gpkg), CSV (with lat/lng columns)
Drag files directly into the chat panel or use the upload button. ZIP archives containing .shp/.kml/.geojson/.gpkg are auto-extracted.
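The auto-extraction step can be pictured with a minimal sketch using only the Python standard library. The function name `extract_spatial_files` and the list of Shapefile sidecar extensions are assumptions made for illustration; the project's actual upload handler may work differently.

```python
import zipfile
from pathlib import Path

# Extensions named in the text above.
SPATIAL_EXTENSIONS = {".shp", ".kml", ".geojson", ".gpkg"}
# Assumption: sidecar files are extracted too, since a .shp is unusable without them.
SHAPEFILE_SIDECARS = {".shx", ".dbf", ".prj", ".cpg"}


def extract_spatial_files(zip_path, dest_dir):
    """Extract spatial files from a ZIP and return the primary dataset paths."""
    dest = Path(dest_dir)
    dest.mkdir(parents=True, exist_ok=True)
    primary = []
    with zipfile.ZipFile(zip_path) as archive:
        for info in archive.infolist():
            if info.is_dir():
                continue
            suffix = Path(info.filename).suffix.lower()
            if suffix not in SPATIAL_EXTENSIONS | SHAPEFILE_SIDECARS:
                continue  # ignore unrelated files such as readme.txt
            # Keep only the base name so nested folders or "../" entries
            # cannot write outside the destination directory.
            target = dest / Path(info.filename).name
            with archive.open(info) as src, open(target, "wb") as out:
                out.write(src.read())
            if suffix in SPATIAL_EXTENSIONS:
                primary.append(target)
    return sorted(primary)
```

Flattening to base names is a deliberate simplification: it avoids path traversal, at the cost of overwriting files that share a name in different archive folders.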
Type in the chat panel:
- Chinese: `请对这个数据进行治理审计`
- English: `Run a governance audit on this dataset`
The Semantic Intent Router (intent_router.py) uses Gemini 2.0 Flash to classify the request as a "governance" intent and dispatches it to the Governance Pipeline (SequentialAgent):
GovExploration → GovProcessing → GovernanceReportLoop
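The routing and sequencing described above can be sketched in plain Python. This is not the Google ADK API and not the code in `intent_router.py`: the keyword-based `classify_intent` is a stand-in for the Gemini classifier, and the stage functions are empty placeholders that only record the order in which they run.

```python
from typing import Callable, Dict, List


def gov_exploration(state: Dict) -> Dict:
    state["trace"].append("GovExploration")  # placeholder for the quality checks
    return state


def gov_processing(state: Dict) -> Dict:
    state["trace"].append("GovProcessing")  # placeholder for the automatic fixes
    return state


def governance_report_loop(state: Dict) -> Dict:
    state["trace"].append("GovernanceReportLoop")  # placeholder for reporting
    return state


# Stages run strictly in this order, each receiving the previous stage's state.
GOVERNANCE_PIPELINE: List[Callable[[Dict], Dict]] = [
    gov_exploration,
    gov_processing,
    governance_report_loop,
]


def classify_intent(message: str) -> str:
    """Keyword stand-in for the LLM-based classifier (an assumption)."""
    text = message.lower()
    if "governance" in text or "治理" in text:
        return "governance"
    return "other"


def dispatch(message: str) -> Dict:
    """Route a chat message; only governance requests enter the pipeline."""
    state = {"message": message, "trace": []}
    if classify_intent(message) != "governance":
        return state
    for stage in GOVERNANCE_PIPELINE:
        state = stage(state)
    return state
```

Calling `dispatch("Run a governance audit on this dataset")` runs all three stages in order, while an unrelated message leaves the trace empty.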
The governance exploration agent runs 11 quality checks covering all six dimensions. Each check produces a specific list of issues and a severity score.
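As an illustration of "a list of issues plus a severity score", here is a minimal sketch of one completeness check. The `CheckResult` structure, the `check_null_rate` name, and the 0.0 to 1.0 severity scale are assumptions; the source does not specify the real result format.

```python
from dataclasses import dataclass, field
from typing import Any, Dict, List


@dataclass
class CheckResult:
    check: str
    dimension: str
    issues: List[str] = field(default_factory=list)
    # Assumed scale: 0.0 means clean, 1.0 means every record is affected.
    severity: float = 0.0


def check_null_rate(records: List[Dict[str, Any]],
                    required_fields: List[str]) -> CheckResult:
    """Report the share of missing values for each required field."""
    result = CheckResult(check="null_rate", dimension="completeness")
    if not records:
        return result
    worst = 0.0
    for name in required_fields:
        # A value counts as missing if the key is absent, None, or an empty string.
        missing = sum(1 for r in records if r.get(name) in (None, ""))
        if missing:
            result.issues.append(f"{name}: {missing}/{len(records)} values missing")
            worst = max(worst, missing / len(records))
    # Severity is the worst per-field missing rate.
    result.severity = worst
    return result
```

Taking the worst field rather than the average is one possible design choice: it prevents a single badly populated field from being masked by many complete ones.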
The governance processing agent performs automatic fixes based on check results:
- Reprojection: Unify CRS to target reference system
- Topology repair: Fix self-intersections, remove sliver polygons
- Duplicate removal: Exact deduplication based on geometry + attributes
- Attribute standardization: Null filling, value range corrections

The audit results are also visualized:
- Radar chart: Six-dimension quality scores at a glance
- Problem distribution map: Shows where data issues occur geographically
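
The duplicate-removal fix listed above ("exact deduplication based on geometry + attributes") can be sketched without any GIS library. The feature layout (a `geometry` list of `(x, y)` tuples and a `properties` dict), the function names, and the rounding precision are assumptions for illustration only.

```python
from typing import Any, Dict, List, Tuple


def feature_key(feature: Dict[str, Any], precision: int = 7) -> Tuple:
    """Build a hashable key from the vertex sequence and the attribute values."""
    # Rounding absorbs floating-point noise; 7 decimals is an assumed tolerance.
    coords = tuple(
        (round(x, precision), round(y, precision)) for x, y in feature["geometry"]
    )
    # Sort attributes by name so dict ordering does not affect the key.
    # Attribute values are assumed to be hashable (strings, numbers, None).
    props = tuple(sorted(feature["properties"].items()))
    return coords, props


def drop_exact_duplicates(features: List[Dict[str, Any]]) -> List[Dict[str, Any]]:
    """Keep the first occurrence of each (geometry, attributes) combination."""
    seen = set()
    kept = []
    for feature in features:
        key = feature_key(feature)
        if key in seen:
            continue
        seen.add(key)
        kept.append(feature)
    return kept
```

Because the key uses the vertex sequence as given, two polygons that trace the same ring from a different start vertex are not treated as duplicates here; a production check would normalize geometries first.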
The governance report loop agent generates a structured audit report, iterating up to 3 rounds to ensure report quality. The report includes: issue inventory, remediation action log, before/after comparison, and score changes.
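The bounded loop can be sketched as follows. Only the cap of 3 rounds and the four report parts come from the text above; the section keys, the `review` criterion (all sections present and non-empty), and the `draft_fn` callback are hypothetical stand-ins for whatever the agent actually uses to judge report quality.

```python
from typing import Callable, Dict, List, Tuple

MAX_ROUNDS = 3  # iteration cap stated above

# Hypothetical keys for the four report parts named in the text.
REQUIRED_SECTIONS = (
    "issue_inventory",
    "remediation_log",
    "before_after_comparison",
    "score_changes",
)


def review(report: Dict[str, str]) -> List[str]:
    """Return the sections that are still missing or empty."""
    return [name for name in REQUIRED_SECTIONS if not report.get(name)]


def run_report_loop(
    draft_fn: Callable[[Dict[str, str], int], Dict[str, str]],
) -> Tuple[Dict[str, str], int]:
    """Redraft until the review passes or the round cap is reached."""
    report: Dict[str, str] = {}
    for round_no in range(1, MAX_ROUNDS + 1):
        report = draft_fn(report, round_no)
        if not review(report):
            return report, round_no  # quality bar met, stop early
    return report, MAX_ROUNDS  # cap reached, return the best effort
```

The cap guarantees termination even when the draft never satisfies the review, which is the main reason to bound a generate-and-critique loop.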
| Tool Name | Function |
|---|---|
| `check_gaps` | Detect blank gaps in feature coverage |
| `check_completeness` | Assess attribute completeness (null rates, missing fields) |
| `check_attribute_range` | Check whether attribute values fall within valid ranges |
| `check_duplicates` | Find features with duplicate geometry or attributes |
| `check_crs_consistency` | Validate CRS consistency and projection validity |
| `governance_score` | Compute composite governance score (0-100) |
| `governance_summary` | Generate governance summary report |
The composite score ranges from 0 to 100 and maps to a letter grade:
| Score | Grade | Meaning |
|---|---|---|
| 90–100 | A | Excellent — very high data quality |
| 80–89 | B | Good — minor issues present |
| 70–79 | C | Acceptable — some dimensions need attention |
| 60–69 | D | Poor — multiple dimensions have issues |
| <60 | F | Failing — major governance effort required |
Dimension weights are configurable; equal weighting is the default.
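A minimal sketch of the weighted score and grade mapping, assuming the composite is a weighted mean of per-dimension scores on a 0 to 100 scale. The function names and dimension keys are hypothetical, and the real `governance_score` tool may aggregate differently.

```python
from typing import Dict, Optional


def composite_score(scores: Dict[str, float],
                    weights: Optional[Dict[str, float]] = None) -> float:
    """Weighted mean of per-dimension scores; equal weights by default."""
    if weights is None:
        weights = {name: 1.0 for name in scores}
    total_weight = sum(weights[name] for name in scores)
    if total_weight <= 0:
        raise ValueError("weights must sum to a positive number")
    return sum(scores[name] * weights[name] for name in scores) / total_weight


def letter_grade(score: float) -> str:
    """Map a 0-100 score to the grade table above.

    Assumption: fractional scores fall into the band of their lower bound,
    so 89.5 is a B.
    """
    if score >= 90:
        return "A"
    if score >= 80:
        return "B"
    if score >= 70:
        return "C"
    if score >= 60:
        return "D"
    return "F"
```

For example, dimension scores of 95, 85, 70, 100, 60, and 100 average to 85 with equal weights, which is a B.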
The check_consistency tool supports cross-modal comparison auditing — comparing data descriptions from PDF reports against actual Shapefile data to discover inconsistencies between documentation and spatial data.
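The comparison step can be sketched as below, assuming the claims have already been extracted from the PDF into a dictionary and the dataset has been summarized into another. The function name `compare_claims`, the key names, and the 1% numeric tolerance are all illustrative assumptions, not the behavior of the actual `check_consistency` tool.

```python
import math
from typing import Any, Dict, List


def compare_claims(claimed: Dict[str, Any],
                   actual: Dict[str, Any],
                   rel_tol: float = 0.01) -> List[str]:
    """List the documented values that disagree with the dataset summary."""
    findings = []
    for key, claimed_value in claimed.items():
        if key not in actual:
            findings.append(f"{key}: claimed {claimed_value!r}, not found in dataset")
            continue
        actual_value = actual[key]
        both_numeric = isinstance(claimed_value, (int, float)) and isinstance(
            actual_value, (int, float)
        )
        if both_numeric:
            # Numbers match if they agree within the relative tolerance.
            consistent = math.isclose(claimed_value, actual_value, rel_tol=rel_tol)
        else:
            # Everything else is compared as case-insensitive text.
            consistent = (
                str(claimed_value).strip().lower() == str(actual_value).strip().lower()
            )
        if not consistent:
            findings.append(
                f"{key}: document says {claimed_value!r}, data has {actual_value!r}"
            )
    return findings
```

A report claiming 1200 features against a Shapefile holding 1187 would be flagged under this tolerance, while a CRS written as `EPSG:4326` in the report and `epsg:4326` in the data would not.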