SAP Workflow Mining - Security Architecture & Compliance
This document describes the security architecture, data handling practices, and compliance considerations for SAP Workflow Mining.
SAP Workflow Mining is designed for enterprise security requirements:
- Read-only access - No write operations to SAP
- On-premise only - No cloud dependencies, no external APIs
- No telemetry - No phone-home, no usage tracking
- PII protection - Automatic redaction enabled by default
- Audit logging - Complete request/response logging
- Minimal permissions - Principle of least privilege
+============================================================================+
|| YOUR CORPORATE NETWORK ||
|| ||
|| +---------------------------+ +-------------------------------+ ||
|| | SAP ECC 6.0 | | SAP Workflow Mining | ||
|| | | | Server | ||
|| | +-----------------+ | | | ||
|| | | SD Tables | | RFC | +------------------------+ | ||
|| | | VBAK, VBAP |<----------->| | MCP Server | | ||
|| | | LIKP, LIPS | | (1) | | (Node.js/TypeScript) | | ||
|| | | VBRK, VBRP | | | +------------------------+ | ||
|| | | STXH, STXL | | | | (2) | ||
|| | +-----------------+ | | v | ||
|| | | | +------------------------+ | ||
|| | +-----------------+ | | | Pattern Engine | | ||
|| | | Master Data | | | | (Python) | | ||
|| | | KNA1, MARA | | | +------------------------+ | ||
|| | +-----------------+ | | | (3) | ||
|| | | | v | ||
|| +---------------------------+ | +------------------------+ | ||
|| | | Local File System | | ||
|| | | ./output/ | | ||
|| +---------------------------+ | | - pattern_cards.json | | ||
|| | Analyst Workstation | | | - audit_log.json | | ||
|| | | | +------------------------+ | ||
|| | +-------------------+ | HTTP | | (4) | ||
|| | | Browser |<--------->| +------------------------+ | ||
|| | | (localhost:8080) | | (5) | | Web Viewer | | ||
|| | +-------------------+ | | | (localhost only) | | ||
|| | | | +------------------------+ | ||
|| +---------------------------+ +-------------------------------+ ||
|| ||
|| NO EXTERNAL CONNECTIONS ||
|| ||
+============================================================================+
|
X (No outbound traffic)
|
+--------------------+
| Internet |
+--------------------+
Data Flow Steps:
(1) RFC connection to SAP - read-only BAPIs only
(2) Internal processing - text extraction, normalization
(3) Pattern analysis and redaction
(4) Results stored locally
(5) Browser access via localhost only
| Table | Description | Data Type | Sensitivity |
|---|---|---|---|
| VBAK | Sales Order Header | Document metadata | Medium |
| VBAP | Sales Order Items | Line item details | Medium |
| LIKP | Delivery Header | Delivery metadata | Medium |
| LIPS | Delivery Items | Shipped quantities | Medium |
| VBRK | Invoice Header | Billing metadata | Medium |
| VBRP | Invoice Items | Billed amounts | Medium |
| VBFA | Document Flow | Document relationships | Low |
| STXH/STXL | Long Texts | Free-form text fields | High |
| KNA1 | Customer Master | Customer attributes | High |
| MARA | Material Master | Material attributes | Low |
- FI/CO tables (financial accounting)
- HR/HCM tables (employee data)
- Pricing conditions (KONV, A-tables)
- Credit management (KNKK)
- Bank details (BNKA, KNBK)
- Custom Z-tables (unless explicitly configured)
All data access is through SAP BAPIs - no direct SQL or RFC_READ_TABLE by default:
BAPI_SALESORDER_GETLIST - List sales orders (display only)
SD_SALESDOCUMENT_READ - Read order details (display only)
BAPI_SALESDOCU_GETRELATIONS - Document flow relationships
BAPI_OUTB_DELIVERY_GET_DETAIL - Delivery information
BAPI_BILLINGDOC_GETDETAIL - Invoice information
READ_TEXT - Long text content
BAPI_CUSTOMER_GETDETAIL2 - Customer master (display only)
BAPI_MATERIAL_GET_DETAIL - Material master (display only)
All data is stored on the local file system:
./output/
├── pattern_cards.json # Analysis results (redacted)
├── evidence_ledger.json # Document references (IDs only)
├── audit_log.json # Request/response log
├── cluster_analysis.json # Text clustering output
└── timing_analysis.json # Document flow timing
- No AWS S3, Azure Blob, GCP Storage
- No SaaS analytics platforms
- No external databases
- No CDN or edge caching
- No WebSocket connections to external servers
- No long-polling to cloud services
- No background sync operations
SAP Workflow Mining makes zero outbound network connections:
Outbound to Internet: NONE
Outbound to Cloud: NONE
Outbound to CDN: NONE
Outbound to Analytics: NONE
Outbound to Telemetry: NONE
The only network connections are within your corporate network:
| Source | Destination | Port | Protocol | Purpose |
|---|---|---|---|---|
| MCP Server | SAP ECC | 33XX | RFC | SAP data access |
| Browser | Web Viewer | 8080 | HTTP | Results viewing |
| Pattern Engine | MCP Server | 3000 | HTTP | Tool calls |
# ALLOW (internal only)
ALLOW TCP from MCP-Server to SAP-ECC:33XX # RFC
ALLOW TCP from Browser to localhost:8080 # Web viewer
# DENY (external)
DENY ALL from MCP-Server to Internet
DENY ALL from Pattern-Engine to Internet
DENY ALL from Web-Viewer to Internet
- Usage statistics
- Error reports
- Feature analytics
- User behavior
- Performance metrics
- Crash dumps
- No update checks
- No license validation
- No heartbeat signals
- No capability negotiation
You can verify no external connections with:
# Monitor network connections during operation
netstat -an | grep ESTABLISHED
# Or use tcpdump
tcpdump -i any 'not (host YOUR-SAP-SERVER or host localhost)'The following patterns are automatically redacted in all output:
| Pattern Type | Example | Redacted To |
|---|---|---|
| Email addresses | john.doe@company.com | [EMAIL] |
| Phone numbers | +1-555-123-4567 | [PHONE] |
| SSN patterns | 123-45-6789 | [SSN] |
| Credit cards | 4111-1111-1111-1111 | [CARD] |
| IP addresses | 192.168.1.100 | [IP] |
When --mode shareable is enabled:
| Data Type | Treatment |
|---|---|
| Customer names | Hashed to anonymous IDs |
| Customer numbers | Hashed to anonymous IDs |
| Material numbers | Hashed to anonymous IDs |
| Pricing/values | Removed or bucketed |
| Addresses | Removed |
| PO numbers | Removed |
# Redaction is applied BEFORE any data is written to disk
# Located in: pattern-engine/src/redaction/
def redact_text(text: str, mode: str = "default") -> str:
# 1. Regex-based pattern matching
text = redact_emails(text)
text = redact_phones(text)
text = redact_ssn(text)
# 2. Named Entity Recognition (optional)
if ner_enabled:
text = redact_named_entities(text)
# 3. Shareable mode additional redaction
if mode == "shareable":
text = hash_identifiers(text)
text = remove_values(text)
return textEvery tool call is logged with:
{
"timestamp": "2024-01-15T10:30:45.123Z",
"tool": "search_doc_text",
"parameters": {
"pattern": "credit hold",
"date_from": "2024-01-01",
"date_to": "2024-01-15",
"sales_org": ["1000"]
},
"result": {
"row_count": 47,
"execution_ms": 234,
"truncated": false
},
"user_context": {
"session_id": "abc123",
"client_ip": "10.0.0.50"
}
}- Actual document content (only metadata)
- Actual text field values (only match counts)
- Customer names or PII
- SAP credentials
Logs are stored locally in ./output/logs/:
- Default retention: 90 days
- Configurable via environment variable
- No automatic upload or external shipping
See docs/SAP_AUTHORIZATION.md for complete details.
Minimum required authorizations:
S_RFC - RFC execution (Activity 16)
V_VBAK_VKO - Sales org access (Activity 03 - Display)
V_VBAK_AAT - Document type access (Activity 03 - Display)
The RFC user should have:
- Only display access (Activity 03)
- Only required sales organizations
- Only required document types
- No write permissions anywhere
- No SAP_ALL
- No SAP_NEW
- No S_DEVELOP
- No S_TABU_DIS (unless RFC_READ_TABLE enabled)
| Requirement | Implementation |
|---|---|
| Data minimization | Only SD/MM data accessed, no HR |
| Purpose limitation | Process analysis only |
| Storage limitation | Local only, configurable retention |
| Right to erasure | Delete output directory |
| Data portability | JSON output format |
| Privacy by design | Redaction enabled by default |
| Control | Implementation |
|---|---|
| Access control | SAP authorization, no shared accounts |
| Audit logging | Complete request logging |
| Data encryption | Use TLS for RFC (SNC) |
| Change management | Docker image versioning |
| Incident response | Local logs for investigation |
| Safeguard | Implementation |
|---|---|
| Access controls | SAP authorization |
| Audit controls | Complete logging |
| Transmission security | SNC for RFC |
| No PHI processing | Verify no healthcare data in SD texts |
| Requirement | Implementation |
|---|---|
| No card data storage | Credit card patterns redacted |
| Access restriction | SAP authorization |
| Audit trails | Complete logging |
| Network security | No external connections |
| Control | Implementation | Default |
|---|---|---|
| Read-only access | BAPIs only, no write | Enforced |
| Row limits | 200 per query | Enabled |
| Rate limiting | Configurable | Optional |
| PII redaction | Regex + patterns | Enabled |
| Shareable mode | Additional redaction | Optional |
| Audit logging | All requests | Enabled |
| No telemetry | No external calls | Enforced |
| Input validation | Sanitized parameters | Enabled |
| Timeout enforcement | 2 min max | Enabled |
If you discover a security vulnerability:
- Do NOT open a public GitHub issue
- Email security@your-org.com with:
- Description of the vulnerability
- Steps to reproduce
- Potential impact
- Suggested fix (if any)
- Allow 90 days for response before public disclosure
See docs/threat_model.md for:
- Known unmitigated threats
- Risk acceptance decisions
- Defense-in-depth architecture
Before deploying SAP Workflow Mining:
- Create dedicated RFC user with minimal permissions
- Test authorization with SU53 after failed access
- Enable SNC (Secure Network Communications) for RFC
- Review SAP authorization trace (ST01)
- Configure log retention policy
- Verify no outbound network access
- Document data classification of output
- Establish output file handling procedures
- Define access control for analysis results
- Schedule periodic authorization review
- Security architecture: See docs/architecture.md
- Threat model: See docs/threat_model.md
- SAP authorizations: See docs/SAP_AUTHORIZATION.md