Commit 588312d

feat: Enhance database schema handling and security features

- Updated TableSchema to include detailed column information, foreign keys, and primary keys.
- Introduced ColumnInfo and ForeignKey types for better schema representation.
- Added security components: QueryValidator for SQL query validation and PIIMasker for data masking.
- Implemented query modification to enforce row limits on executed queries.
- Added tools for listing tables and describing table schemas with detailed output.
- Enhanced MCPServer to manage custom tools and integrate security features.
- Created tests for query validation, PII masking, schema loading, and tool handling.
1 parent 7985647 commit 588312d

15 files changed

Lines changed: 1691 additions & 49 deletions

CHANGELOG.md

Lines changed: 47 additions & 1 deletion
@@ -5,7 +5,53 @@ All notable changes to this project will be documented in this file.
 The format is based on [Keep a Changelog](https://keepachangelog.com/en/1.0.0/),
 and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0.html).

-## [Unreleased]
+## [0.2.0] - 2026-02-15
+
+### Added
+- **Enterprise-Grade Security Features:**
+  - **AST-Based Query Sanitization**: Uses sqlparser library for deep SQL analysis
+    - Blocks dangerous operations: INSERT, UPDATE, DELETE, DROP, ALTER, TRUNCATE, EXEC, MERGE, etc.
+    - Allows only SELECT and WITH (CTE) queries
+    - Validates queries at AST level, not just regex
+  - **PII Data Masking**: Automatically masks personally identifiable information
+    - Built-in patterns: Credit cards, emails, SSNs, Turkish IDs, IBANs, phone numbers
+    - Configurable custom patterns via regex
+    - Enable/disable per pattern
+    - Enterprise-ready for GDPR/KVKK compliance
+  - **Automatic Row Limiting**: Prevents database overload
+    - Configurable max row limit (default: 1000)
+    - Automatically adds LIMIT clause to queries
+    - Preserves existing LIMIT if present
+- **Dynamic Tool Generation:**
+  - New `list_tables` tool: Lists all tables in a database with summary information
+  - New `describe_table` tool: Shows detailed schema for a specific table
+  - Custom tool support: Define reusable SQL queries as MCP tools in config
+  - Parameter substitution in custom queries using `{{parameter}}` syntax
+- **Automatic Schema Discovery:** CoreMCP now automatically scans database tables, columns, primary keys, and foreign keys on startup
+- **Column Comments/Descriptions Support:** Extracts and presents database column comments (e.g., MS_Description in MSSQL) to AI for better context
+- **Schema Context Prompt:** New `database_schema` MCP prompt that provides complete database structure to Claude automatically
+- **Enhanced Schema Types:**
+  - New `ColumnInfo` type with name, data type, nullable flag, and description
+  - New `ForeignKey` type with full relationship information
+  - Enhanced `TableSchema` with primary keys array
+- **MSSQL Extended Support:**
+  - Automatic extraction of column descriptions from MS_Description extended properties
+  - Full primary key discovery
+  - Complete foreign key relationship mapping
+
+### Changed
+- **Query Execution**: Now uses security validator and row limiter for all queries
+- **Config Structure**: Added `security` section with PII masking and row limit configuration
+- **MCP Server**: Added `list_tables` and `describe_table` handlers
+- **Config Structure**: Added `custom_tools` section for defining custom query tools
+- **MSSQL Adapter:** Enhanced `GetSchema()` to retrieve complete table metadata including types, constraints, and relationships
+- **MCP Server:** Added automatic schema loading during startup via `LoadSchemas()` method
+- **Core Types:** Updated `TableSchema` structure to support rich metadata
+
+### Security
+- **BREAKING**: All queries now validated with AST parser (blocks write operations)
+- **BREAKING**: Automatic LIMIT clause added to all SELECT queries (configurable)
+- PII masking can be enabled via `security.enable_pii_masking` in config

 ## [0.1.0] - 2026-02-14
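
The row-limiting entries in this changelog (append a LIMIT, keep an existing one) can be illustrated with a minimal Go sketch. It works on the query string rather than on an AST, and `enforceRowLimit` and `trailingLimit` are hypothetical names; the commit's actual limiter is not shown in this excerpt.

```go
package main

import (
	"fmt"
	"regexp"
	"strings"
)

// trailingLimit matches a LIMIT clause at the very end of a statement,
// optionally followed by a semicolon.
var trailingLimit = regexp.MustCompile(`(?i)\blimit\s+\d+\s*;?\s*$`)

// enforceRowLimit appends "LIMIT maxRows" unless the query already ends
// with its own LIMIT clause. Hypothetical helper, not CoreMCP's code.
func enforceRowLimit(query string, maxRows int) string {
	trimmed := strings.TrimSpace(query)
	if trailingLimit.MatchString(trimmed) {
		return trimmed // preserve the caller's LIMIT
	}
	// Drop a trailing semicolon so the clause lands inside the statement.
	trimmed = strings.TrimRight(trimmed, "; \t\r\n")
	return fmt.Sprintf("%s LIMIT %d", trimmed, maxRows)
}

func main() {
	fmt.Println(enforceRowLimit("SELECT * FROM users;", 1000))
	fmt.Println(enforceRowLimit("SELECT * FROM users LIMIT 5", 1000))
}
```

T-SQL has no `LIMIT` clause (SQL Server uses `TOP` or `OFFSET ... FETCH`), so a limiter aimed at the MSSQL adapter would presumably need a dialect-specific rewrite; the sketch only mirrors the behaviour the changelog describes.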
README.md

Lines changed: 128 additions & 5 deletions
@@ -12,6 +12,10 @@ CoreMCP by CoreBaseHQ provides a secure, extensible bridge between AI assistants
 ## 🚀 Features
 **⚠️ Safety First: CoreMCP is designed to be Read-Only by default. We strongly recommend creating a specific database user with SELECT permissions only.**
 - 🔌 **Multiple Database Support**: MSSQL, Firebird (coming soon), and extensible adapter system
+- 🧠 **Automatic Schema Discovery**: CoreMCP automatically scans your database tables, columns, foreign keys, and descriptions to provide AI context
+- 📝 **Column Comments Support**: Extracts and presents database column comments/descriptions to the AI for better query understanding
+- 🛠️ **Dynamic Tool Generation**: Built-in tools for common operations (list tables, describe schema) plus custom tool support
+- 🎯 **Custom Query Tools**: Define reusable SQL queries as MCP tools in your config file
 - 🛡️ **Secure**: Read-only mode support, connection string isolation
 - 🎯 **MCP Native**: Built specifically for Model Context Protocol
 - 🔧 **Easy Configuration**: Simple YAML-based setup
@@ -72,6 +76,40 @@ sqlserver://username:password@host:port?database=dbname&encrypt=disable
 dummy://test
 ```

+### Security Configuration
+
+CoreMCP includes enterprise-grade security features:
+
+```yaml
+security:
+  # Maximum rows to return (prevents DB overload)
+  max_row_limit: 1000
+
+  # Enable PII masking
+  enable_pii_masking: true
+
+  # PII patterns to mask
+  pii_patterns:
+    - name: "credit_card"
+      pattern: '\b\d{4}[\s-]?\d{4}[\s-]?\d{4}[\s-]?\d{4}\b'
+      replacement: "****-****-****-****"
+      enabled: true
+    - name: "email"
+      pattern: '\b[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Z|a-z]{2,}\b'
+      replacement: "***@***.***"
+      enabled: true
+    - name: "turkish_id"
+      pattern: '\b[1-9]\d{10}\b'
+      replacement: "***********"
+      enabled: true
+```
+
+**Security Features:**
+- **AST-Based Query Validation**: Uses sqlparser to analyze SQL queries and block dangerous operations (DROP, ALTER, UPDATE, DELETE, TRUNCATE, EXEC, etc.)
+- **Automatic Row Limiting**: Adds LIMIT clause to prevent accidentally returning millions of rows
+- **PII Data Masking**: Automatically masks sensitive data like credit cards, emails, SSNs, Turkish IDs, IBANs
+- **Configurable Patterns**: Define custom regex patterns for your specific PII requirements
+
 ## 🎯 Usage

 ### Start the MCP Server
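
The masking behaviour configured in this hunk can be sketched in Go as follows. `MaskPattern` reuses the four fields that `convertPIIPatterns` copies in `serve.go`; `maskPII` and `samplePatterns` are hypothetical names, and the real `PIIMasker` may work differently. The email pattern here ends in `[A-Za-z]{2,}`, because the README's `[A-Z|a-z]` class also admits a literal `|`.

```go
package main

import (
	"fmt"
	"regexp"
)

// MaskPattern mirrors the fields copied by convertPIIPatterns in serve.go.
type MaskPattern struct {
	Name        string
	Pattern     string
	Replacement string
	Enabled     bool
}

// maskPII applies every enabled pattern in order. Hypothetical helper;
// a real masker would likely compile the patterns once at startup.
func maskPII(patterns []MaskPattern, text string) (string, error) {
	for _, p := range patterns {
		if !p.Enabled {
			continue
		}
		re, err := regexp.Compile(p.Pattern)
		if err != nil {
			return "", fmt.Errorf("pattern %q: %w", p.Name, err)
		}
		// Literal replacement, so "$" in a mask is never read as a group reference.
		text = re.ReplaceAllLiteralString(text, p.Replacement)
	}
	return text, nil
}

// samplePatterns follows the example config; the iban entry is disabled there too.
var samplePatterns = []MaskPattern{
	{"credit_card", `\b\d{4}[\s-]?\d{4}[\s-]?\d{4}[\s-]?\d{4}\b`, "****-****-****-****", true},
	{"email", `\b[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}\b`, "***@***.***", true},
	{"iban", `\b[A-Z]{2}\d{2}[A-Z0-9]{1,30}\b`, "********************", false},
}

func main() {
	out, err := maskPII(samplePatterns, "card 4111-1111-1111-1111 mail jane@example.com")
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println(out)
}
```

Patterns run in list order, so a broad pattern placed early can consume text that a later, more specific pattern was meant to match.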
@@ -121,11 +159,13 @@ coremcp/
 └── coremcp.yaml # Configuration file
 ```

-## 🔌 Available Tools
+## 🔌 Available Tools & Prompts

-### `query_database`
+### Built-in Tools

-Executes SQL queries on configured database sources.
+#### `query_database`
+
+Executes arbitrary SQL queries on configured database sources.

 **Parameters:**
 - `source_name` (required): Name of the database source from config
@@ -136,6 +176,82 @@ Executes SQL queries on configured database sources.
 SELECT * FROM users WHERE id = 1
 ```

+#### `list_tables`
+
+Lists all tables in a database with summary information.
+
+**Parameters:**
+- `source_name` (required): Name of the database source
+
+**Returns:** List of tables with column counts, primary keys, and foreign key counts.
+
+#### `describe_table`
+
+Shows detailed schema information for a specific table.
+
+**Parameters:**
+- `source_name` (required): Name of the database source
+- `table_name` (required): Name of the table to describe
+
+**Returns:** Complete table schema including:
+- Column names and data types
+- Nullable information
+- Primary keys
+- Foreign key relationships
+- Column descriptions/comments
+
+### Custom Tools
+
+You can define reusable SQL queries as custom MCP tools in your `coremcp.yaml`:
+
+```yaml
+custom_tools:
+  - name: "get_daily_sales"
+    description: "Retrieves daily sales summary for a specific date"
+    source: "production_db"
+    query: "SELECT * FROM orders WHERE DATE(created_at) = '{{date}}'"
+    parameters:
+      - name: "date"
+        description: "Date in YYYY-MM-DD format"
+        required: true
+
+  - name: "get_top_customers"
+    description: "Lists top N customers by order count"
+    source: "production_db"
+    query: "SELECT user_id, COUNT(*) as order_count FROM orders GROUP BY user_id ORDER BY order_count DESC LIMIT {{limit}}"
+    parameters:
+      - name: "limit"
+        description: "Number of customers to return"
+        required: true
+        default: "10"
+```
+
+**Benefits:**
+- Encapsulate complex queries
+- Provide simple interfaces for common operations
+- Parameters are automatically validated
+- AI can discover and use these tools automatically
+
+### `database_schema` Prompt
+
+Automatically provides complete database schema context to the AI, including:
+- Table names
+- Column names with data types
+- Primary keys
+- Foreign key relationships
+- Column descriptions/comments from the database
+
+When CoreMCP starts, it automatically:
+1. Connects to all configured databases
+2. Scans the schema (tables, columns, keys, relationships)
+3. Extracts column comments/descriptions (e.g., `MS_Description` in MSSQL)
+4. Creates a comprehensive context prompt for the AI
+
+This allows Claude to understand your database structure and write accurate queries without you having to explain the schema manually.
+
+**Example:**
+When you ask Claude "Show me all sales", Claude can see that you have a `TBLSATIS` table with specific columns and automatically write the correct query.
+
 ## 🛠️ Adding Custom Adapters

 1. Create a new package in `pkg/adapter/yourdb/`
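
The `{{parameter}}` substitution used by the custom tools in this hunk might look like the following sketch. The commit excerpt does not show how CoreMCP substitutes or validates parameters, so `renderQuery`, `placeholder`, and the `safeValue` allow-list are assumptions made for illustration.

```go
package main

import (
	"fmt"
	"regexp"
)

// placeholder matches {{name}} tokens in a query template.
var placeholder = regexp.MustCompile(`\{\{(\w+)\}\}`)

// safeValue is an assumed, deliberately conservative allow-list:
// letters, digits, underscore, dot, colon, space, and hyphen.
var safeValue = regexp.MustCompile(`^[A-Za-z0-9_.: -]+$`)

// renderQuery replaces each {{name}} with its value from params. It fails
// if a value is missing or contains characters outside the allow-list.
func renderQuery(tmpl string, params map[string]string) (string, error) {
	var firstErr error
	out := placeholder.ReplaceAllStringFunc(tmpl, func(m string) string {
		name := placeholder.FindStringSubmatch(m)[1]
		val, ok := params[name]
		if firstErr == nil {
			switch {
			case !ok:
				firstErr = fmt.Errorf("missing parameter %q", name)
			case !safeValue.MatchString(val):
				firstErr = fmt.Errorf("parameter %q contains unsafe characters", name)
			}
		}
		return val
	})
	if firstErr != nil {
		return "", firstErr
	}
	return out, nil
}

func main() {
	tmpl := "SELECT * FROM orders WHERE DATE(created_at) = '{{date}}'"
	q, err := renderQuery(tmpl, map[string]string{"date": "2026-02-15"})
	fmt.Println(q, err)

	_, err = renderQuery(tmpl, map[string]string{"date": "x' OR '1'='1"})
	fmt.Println(err)
}
```

Because the value is spliced into SQL text, the sketch rejects anything outside a narrow character set. Binding values as driver parameters, where the adapter supports it, is generally the safer route.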
@@ -158,19 +274,26 @@ Apache License 2.0 - see [LICENSE](LICENSE) for details.

 ## 🌟 Roadmap

+- [x] **Automatic Schema Discovery** - Load database structure on startup
+- [x] **Column Comments/Descriptions** - Extract and display database metadata
+- [x] **Dynamic Tool Generation** - list_tables, describe_table tools
+- [x] **Custom Query Tools** - Define reusable queries in config
+- [x] **AST-Based Query Sanitization** - Block dangerous SQL operations
+- [x] **PII Data Masking** - Mask sensitive information in results
+- [x] **Automatic Row Limiting** - Prevent database overload
 - [ ] PostgreSQL adapter
 - [ ] MySQL adapter
 - [ ] Firebird adapter (in progress)
-- [ ] Schema introspection tools
 - [ ] Query result caching
 - [ ] HTTP transport support
 - [ ] Write operation support (with strict safety guards)
+- [ ] Audit logging

 ## 💬 Support

 - 🐛 [Report a bug](https://github.com/corebasehq/coremcp/issues/new?template=bug_report.md)
 - 💡 [Request a feature](https://github.com/corebasehq/coremcp/issues/new?template=feature_request.md)
-- 📧 Email: support@corebase.com
+- 📧 Email: support@corebasehq.com

 ---

cmd/coremcp/serve.go

Lines changed: 62 additions & 0 deletions
@@ -6,6 +6,8 @@ import (
 	"os"

 	"github.com/corebasehq/coremcp/pkg/adapter"
+	"github.com/corebasehq/coremcp/pkg/config"
+	"github.com/corebasehq/coremcp/pkg/security"
 	"github.com/corebasehq/coremcp/pkg/server"
 	"github.com/spf13/cobra"
 )
@@ -24,6 +26,21 @@ var serveCmd = &cobra.Command{

 		mcpSrv := server.NewMCPServer(cfg.Server.Name, cfg.Server.Version)

+		// Configure security features
+		log.Println("Configuring security features...")
+		piiPatterns := convertPIIPatterns(cfg.Security.PIIPatterns)
+		if err := mcpSrv.ConfigureSecurity(
+			cfg.Security.MaxRowLimit,
+			cfg.Security.EnablePIIMasking,
+			piiPatterns,
+			cfg.Security.AllowedKeywords,
+			cfg.Security.BlockedKeywords,
+		); err != nil {
+			log.Fatalf("CRITICAL: Failed to configure security: %v", err)
+		}
+		log.Printf("Security configured: MaxRowLimit=%d, PIIMasking=%v",
+			cfg.Security.MaxRowLimit, cfg.Security.EnablePIIMasking)
+
 		for _, sourceCfg := range cfg.Sources {
 			src, err := adapter.NewSource(sourceCfg.Type, sourceCfg.DSN)
 			if err != nil {
@@ -39,6 +56,37 @@ var serveCmd = &cobra.Command{
 			log.Printf("Source ready: %s (%s) [ReadOnly: %v]", sourceCfg.Name, sourceCfg.Type, sourceCfg.ReadOnly)
 		}

+		// Load database schemas for AI context
+		log.Println("Loading database schemas for AI context...")
+		if err := mcpSrv.LoadSchemas(cmd.Context()); err != nil {
+			log.Printf("WARNING: Failed to load schemas: %v", err)
+		} else {
+			log.Println("Database schemas loaded successfully!")
+		}
+
+		// Register custom tools from config
+		if len(cfg.CustomTools) > 0 {
+			log.Printf("Registering %d custom tool(s)...", len(cfg.CustomTools))
+			for _, toolCfg := range cfg.CustomTools {
+				params := make([]string, len(toolCfg.Parameters))
+				for i, p := range toolCfg.Parameters {
+					params[i] = p.Name
+				}
+
+				if err := mcpSrv.AddCustomTool(
+					toolCfg.Name,
+					toolCfg.Description,
+					toolCfg.Source,
+					toolCfg.Query,
+					params,
+				); err != nil {
+					log.Printf("WARNING: Failed to register custom tool %s: %v", toolCfg.Name, err)
+				} else {
+					log.Printf("Custom tool registered: %s", toolCfg.Name)
+				}
+			}
+		}
+
 		transport, _ := cmd.Flags().GetString("transport")
 		if transport == "stdio" {
 			log.Println("CoreMCP started on Stdio. Waiting for MCP client...")
@@ -56,3 +104,17 @@ func init() {

 	serveCmd.Flags().StringP("transport", "t", "stdio", "Transport type: stdio or http")
 }
+
+// convertPIIPatterns converts config PII patterns to security PII patterns.
+func convertPIIPatterns(configPatterns []config.PIIMaskPattern) []security.MaskPattern {
+	patterns := make([]security.MaskPattern, len(configPatterns))
+	for i, p := range configPatterns {
+		patterns[i] = security.MaskPattern{
+			Name:        p.Name,
+			Pattern:     p.Pattern,
+			Replacement: p.Replacement,
+			Enabled:     p.Enabled,
+		}
+	}
+	return patterns
+}
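
`ConfigureSecurity` receives `AllowedKeywords` and `BlockedKeywords` alongside the defaults named in the changelog. How the validator merges them is not visible in this diff; the sketch below shows one plausible arrangement using a plain keyword scan, which is a deliberately simpler stand-in for the AST-based `QueryValidator`. All names (`keywordValidator`, `newKeywordValidator`, `validate`) are hypothetical.

```go
package main

import (
	"fmt"
	"regexp"
	"strings"
)

// word extracts identifier-like tokens; "update_time" stays one token.
var word = regexp.MustCompile(`[A-Za-z_]+`)

// keywordValidator is a simplified stand-in, not the commit's QueryValidator.
type keywordValidator struct {
	allowed map[string]bool // statement types permitted as the first keyword
	blocked map[string]bool // keywords rejected anywhere in the query
}

// newKeywordValidator merges user-supplied lists into assumed defaults.
func newKeywordValidator(extraAllowed, extraBlocked []string) *keywordValidator {
	v := &keywordValidator{
		allowed: map[string]bool{"SELECT": true, "WITH": true},
		blocked: map[string]bool{},
	}
	defaults := []string{"INSERT", "UPDATE", "DELETE", "DROP", "ALTER", "TRUNCATE", "EXEC", "MERGE"}
	for _, k := range defaults {
		v.blocked[k] = true
	}
	for _, k := range extraAllowed {
		v.allowed[strings.ToUpper(k)] = true
	}
	for _, k := range extraBlocked {
		v.blocked[strings.ToUpper(k)] = true
	}
	return v
}

// validate checks the leading keyword, then scans the rest for blocked ones.
func (v *keywordValidator) validate(query string) error {
	tokens := word.FindAllString(strings.ToUpper(query), -1)
	if len(tokens) == 0 {
		return fmt.Errorf("empty query")
	}
	if !v.allowed[tokens[0]] {
		return fmt.Errorf("statement type %s is not allowed", tokens[0])
	}
	for _, t := range tokens[1:] {
		if v.blocked[t] {
			return fmt.Errorf("blocked keyword %s", t)
		}
	}
	return nil
}

func main() {
	v := newKeywordValidator(nil, []string{"grant"})
	fmt.Println(v.validate("SELECT * FROM users"))
	fmt.Println(v.validate("SELECT 1; DELETE FROM users"))
}
```

A keyword scan like this also rejects harmless queries such as `SELECT 'delete'`, which is one reason to validate on the parsed AST instead.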

coremcp.example.yaml

Lines changed: 72 additions & 0 deletions
@@ -34,3 +34,75 @@ sources:
 # DSN Format Examples:
 # MSSQL: sqlserver://username:password@host:port?database=dbname&encrypt=disable
 # Dummy: dummy://anything
+
+# Security configuration
+security:
+  # Maximum number of rows to return from any query (prevents DB overload)
+  max_row_limit: 1000
+
+  # Enable PII (Personally Identifiable Information) masking
+  enable_pii_masking: true
+
+  # PII patterns to mask in query results
+  pii_patterns:
+    - name: "credit_card"
+      pattern: '\b\d{4}[\s-]?\d{4}[\s-]?\d{4}[\s-]?\d{4}\b'
+      replacement: "****-****-****-****"
+      enabled: true
+
+    - name: "email"
+      pattern: '\b[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Z|a-z]{2,}\b'
+      replacement: "***@***.***"
+      enabled: true
+
+    - name: "turkish_id"
+      pattern: '\b[1-9]\d{10}\b'
+      replacement: "***********"
+      enabled: true
+
+    - name: "phone"
+      pattern: '\b\d{3}[-.\s]?\d{3}[-.\s]?\d{4}\b'
+      replacement: "***-***-****"
+      enabled: true
+
+    - name: "iban"
+      pattern: '\b[A-Z]{2}\d{2}[A-Z0-9]{1,30}\b'
+      replacement: "********************"
+      enabled: false # Disabled by default
+
+  # Additional SQL keywords to allow (beyond SELECT/WITH)
+  allowed_keywords: []
+
+  # Additional SQL keywords to block (beyond default dangerous ones)
+  blocked_keywords: []
+
+# Custom tools configuration (optional)
+# Define reusable queries as MCP tools
+custom_tools:
+  # Example: Get daily sales summary
+  - name: "get_daily_sales"
+    description: "Retrieves daily sales summary for a specific date"
+    source: "test_db"
+    query: "SELECT * FROM orders WHERE DATE(created_at) = '{{date}}'"
+    parameters:
+      - name: "date"
+        description: "Date in YYYY-MM-DD format"
+        required: true
+
+  # Example: Get top customers
+  - name: "get_top_customers"
+    description: "Lists top N customers by order count"
+    source: "test_db"
+    query: "SELECT user_id, COUNT(*) as order_count FROM orders GROUP BY user_id ORDER BY order_count DESC LIMIT {{limit}}"
+    parameters:
+      - name: "limit"
+        description: "Number of top customers to return"
+        required: true
+        default: "10"
+
+  # Example: No-parameter custom query
+  - name: "get_pending_orders"
+    description: "Gets all orders with pending status"
+    source: "test_db"
+    query: "SELECT * FROM orders WHERE status = 'pending' ORDER BY created_at DESC"
+    parameters: []
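
A startup check could catch mismatches between a tool's `query` and its `parameters` list before the tool is registered. The sketch below assumes the `{{name}}` placeholder syntax from the examples in this file; `checkCustomTool` is a hypothetical helper and not part of the commit.

```go
package main

import (
	"fmt"
	"regexp"
)

// placeholder matches {{name}} tokens in a custom tool query.
var placeholder = regexp.MustCompile(`\{\{(\w+)\}\}`)

// checkCustomTool verifies that every placeholder in query is declared
// and that every declared parameter is actually used.
func checkCustomTool(query string, declared []string) error {
	used := make(map[string]bool, len(declared))
	for _, name := range declared {
		used[name] = false
	}
	for _, m := range placeholder.FindAllStringSubmatch(query, -1) {
		name := m[1]
		if _, ok := used[name]; !ok {
			return fmt.Errorf("placeholder {{%s}} has no matching parameter", name)
		}
		used[name] = true
	}
	// Iterate the declared slice (not the map) for a deterministic error.
	for _, name := range declared {
		if !used[name] {
			return fmt.Errorf("parameter %q is never used in the query", name)
		}
	}
	return nil
}

func main() {
	q := "SELECT user_id FROM orders ORDER BY user_id LIMIT {{limit}}"
	fmt.Println(checkCustomTool(q, []string{"limit"}))
	fmt.Println(checkCustomTool(q, nil))
}
```

Running such a check where `serve.go` calls `AddCustomTool` would turn a typo in a placeholder name into a startup warning instead of a failed query at call time.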
