

Dennis Braun edited this page Mar 17, 2026 · 5 revisions

Connection Monitor

The Connection Monitor is a DOCSight-native always-on latency monitor. It continuously probes configured targets and tracks latency, packet loss, and outages -- replacing the need for a separate PingPlotter or Smokeping setup.

Availability: All modem types and Generic Router mode. Enable in Settings > Connection Monitor.

Why This Matters

Cable internet issues are often intermittent -- your connection drops for 30 seconds at 8 PM, but by the time you run a speedtest, everything looks fine. The Connection Monitor runs 24/7 and catches exactly these moments:

  • Prove packet loss -- Document drops your ISP claims don't exist
  • Catch evening congestion -- See latency spikes during peak hours
  • Correlate with DOCSIS data -- Compare outages against signal quality, modem events, and speedtest results
  • Build evidence -- Export CSV data for ISP complaints

What You See

Dashboard Card

A compact card on the main dashboard shows the current state across all targets:

| Metric | Description |
| --- | --- |
| Avg Latency | Average response time across all targets |
| Packet Loss | Percentage of failed probes in the last 60 seconds |

Detail View

The dedicated Connection Monitor view (sidebar > Connection Monitor) provides:

Stats Cards

Top-level KPIs for the selected time range: average latency, packet loss percentage, and sample count.

Per-Target Statistics

A comparison table showing each target's average latency, P95 latency, packet loss, and sample count. Includes automatic diagnostics:

  • External issue -- Gateway is fine but external targets show packet loss
  • Internal/ISP issue -- Gateway is also affected

Combined Latency Chart

A PingPlotter-style chart showing latency for all targets on a single timeline with color-coded zones:

| Zone | Latency | Meaning |
| --- | --- | --- |
| Green | < 20 ms | Excellent |
| Yellow | 20-50 ms | Good |
| Orange | 50-100 ms | Degraded |
| Red | > 100 ms | Poor |

Timeout markers (red dots) highlight failed probes.
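
The zone thresholds can be expressed as a small classifier. This is a sketch; behavior at the exact 20/50/100 ms boundaries is an assumption, since the table leaves the edges ambiguous:

```python
def latency_zone(latency_ms: float) -> str:
    """Map a latency sample to its color-coded chart zone.

    Boundary handling (e.g. exactly 50 ms -> yellow) is an assumption.
    """
    if latency_ms < 20:
        return "green"   # Excellent
    if latency_ms <= 50:
        return "yellow"  # Good
    if latency_ms <= 100:
        return "orange"  # Degraded
    return "red"         # Poor
```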

Availability Band

A compact color bar showing uptime vs. downtime across the selected time range.

Outage Log

A table of detected outages with start time, end time, duration, and affected targets. Overlapping outages across targets are automatically grouped (likely same root cause).

Time Ranges

| Range | Resolution | Description |
| --- | --- | --- |
| 1h | Raw samples | Last hour (default) |
| 6h | Raw samples | Last 6 hours |
| 24h | Raw samples | Last 24 hours |
| 7d | Raw samples | Last 7 days |
| 30d | 1-minute averages | Last 30 days |
| 90d | 5-minute averages | Last 90 days |

Short ranges (up to 24h) refresh every 10 seconds; longer ranges refresh every 60 seconds. A resolution indicator below the chart shows which data granularity is active.

Sample Aggregation

To keep long-term storage efficient, DOCSight automatically aggregates older samples into time-bucketed averages:

| Age | Stored as | Granularity |
| --- | --- | --- |
| 0-7 days | Raw samples | Per-probe (e.g. every 5 s) |
| 7-30 days | 1-minute buckets | avg/min/max/P95 latency, packet loss |
| 30-90 days | 5-minute buckets | Re-aggregated from 1-minute data |
| > 90 days | 1-hour buckets | Re-aggregated from 5-minute data |

Aggregation runs automatically every 15 minutes alongside the regular cleanup cycle. Each bucket stores average, minimum, maximum, and P95 latency plus packet loss percentage, so charts still show meaningful detail at any zoom level.

For aggregated time ranges, the latency chart renders a shaded min/max band behind each target's average line to visualize the range of values within each bucket.
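
A rough sketch of that bucketing step, assuming samples arrive as (epoch seconds, latency in ms or None for a timeout) pairs. The field names and the nearest-rank P95 are illustrative, not DOCSight's actual schema:

```python
from statistics import mean

def aggregate(samples, bucket_seconds=60):
    """Group raw probe samples into fixed time buckets and compute
    the per-bucket statistics described above (illustrative sketch).

    Each sample: (epoch_seconds, latency_ms or None for a timeout).
    """
    buckets = {}
    for ts, latency in samples:
        buckets.setdefault(ts - ts % bucket_seconds, []).append(latency)
    result = []
    for start in sorted(buckets):
        vals = buckets[start]
        ok = sorted(v for v in vals if v is not None)
        loss_pct = 100.0 * (len(vals) - len(ok)) / len(vals)
        # Nearest-rank P95 over the successful probes (an assumption)
        p95 = ok[min(len(ok) - 1, int(0.95 * len(ok)))] if ok else None
        result.append({
            "start": start,
            "avg": mean(ok) if ok else None,
            "min": ok[0] if ok else None,
            "max": ok[-1] if ok else None,
            "p95": p95,
            "loss_pct": loss_pct,
            "samples": len(vals),
        })
    return result
```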

Configuration

Enable

Go to Settings > Connection Monitor and toggle Enable Connection Monitor.

Settings

| Setting | Default | Description |
| --- | --- | --- |
| Probe interval | 5000 ms | Time between probes per target; minimum 1000 ms |
| Probe method | Auto | Auto tries ICMP first and falls back to TCP; can be forced to ICMP only or TCP only |
| TCP port | 443 | Port used for TCP probes |
| Keep samples | 0 (keep all) | Delete samples older than this many days |
| Outage threshold | 5 | Consecutive timeouts before an outage event is triggered |
| Loss warning | 2.0% | Packet loss percentage that triggers a warning event |
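
The outage threshold counts consecutive timeouts per target; a minimal sketch of that state machine (the return values are illustrative, not DOCSight's event names):

```python
class OutageDetector:
    """Raise an outage once N consecutive probes time out, and a
    recovery on the first successful probe afterwards (sketch)."""

    def __init__(self, threshold=5):
        self.threshold = threshold
        self.streak = 0          # consecutive timeouts so far
        self.in_outage = False

    def record(self, probe_ok):
        """Feed one probe result; return 'outage', 'recovered', or None."""
        if probe_ok:
            self.streak = 0
            if self.in_outage:
                self.in_outage = False
                return "recovered"  # first success after an outage
            return None
        self.streak += 1
        if not self.in_outage and self.streak >= self.threshold:
            self.in_outage = True
            return "outage"
        return None
```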

Targets

Add targets in the same settings section. Default targets (Cloudflare DNS 1.1.1.1, Google DNS 8.8.8.8) are seeded on first enable.

Each target has a label (display name) and a host (IP or hostname). New targets are disabled until a host is provided, then automatically enabled.

Recommended targets:

| Target | Why |
| --- | --- |
| Your gateway/router IP | Detects local network issues |
| 1.1.1.1 (Cloudflare) | Reliable external reference |
| 8.8.8.8 (Google) | Second external reference |
| Your ISP's DNS | Detects ISP-specific issues |

Probe Methods

The Connection Monitor supports two probe methods:

ICMP (Preferred)

Standard ping. Most accurate latency measurement. Requires the NET_RAW capability in Docker:

```yaml
services:
  docsight:
    cap_add:
      - NET_RAW
```

Without cap_add: [NET_RAW], DOCSight will usually show TCP only or fall back to TCP in Auto mode.

DOCSight now scopes raw-socket access to a small dedicated ICMP helper instead of giving that capability to the whole Python process. cap_add: NET_RAW is still required at the container level, but only the helper uses it and the main DOCSight application process remains unprivileged.

TCP

TCP connect probe to port 443. Works everywhere without special permissions. Slightly higher latency than ICMP due to TCP handshake overhead.
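
For illustration, a TCP connect probe of this kind fits in a few lines of Python. This is a sketch of the technique, not DOCSight's implementation:

```python
import socket
import time

def tcp_probe(host, port=443, timeout=2.0):
    """Measure TCP connect latency in milliseconds.

    Returns the handshake time, or None on timeout / refusal
    (illustrative sketch of a TCP connect probe).
    """
    start = time.perf_counter()
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return (time.perf_counter() - start) * 1000.0
    except OSError:
        return None
```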

Auto Mode

Tries ICMP first. If unavailable (no NET_RAW), falls back to TCP automatically. The detail view shows a badge indicating which mode is active.

Traceroute

When problems are detected, DOCSight can capture the full network path (hop-by-hop) to show exactly where packets are being lost or delayed. This turns "my internet is slow" into "the bottleneck is at hop 5, my ISP's peering point."

How It Works

Traceroute runs automatically when the Connection Monitor detects an outage or significant packet loss. It sends ICMP packets with incrementing TTL values to map every router between you and the target. Each hop reports its IP, hostname (via reverse DNS), latency, and how many of the 3 probes responded.

A dedicated setuid helper binary (docsight-traceroute-helper) handles the raw socket operations with immediate privilege dropping after socket creation.

Automatic Triggers

| Event | Traceroute triggered |
| --- | --- |
| Target unreachable (outage) | Yes |
| Packet loss warning | Yes |
| Stable connection | No |

A 5-minute cooldown per target prevents excessive traces during flapping connections. If your connection is stable, no automatic traces are generated.
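
The cooldown gate amounts to remembering the last trace time per target; a minimal sketch:

```python
import time

class TraceCooldown:
    """Allow at most one automatic trace per target every `cooldown`
    seconds (illustrative sketch of the 5-minute gate)."""

    def __init__(self, cooldown=300.0):
        self.cooldown = cooldown
        self.last = {}  # target_id -> time of last allowed trace

    def should_trace(self, target_id, now=None):
        now = time.monotonic() if now is None else now
        if now - self.last.get(target_id, float("-inf")) < self.cooldown:
            return False  # still cooling down for this target
        self.last[target_id] = now
        return True
```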

Manual Traceroute

Click the Run Traceroute button in the Connection Monitor detail view to trigger an immediate trace to the selected target. The result appears inline with the full hop list.

Trace History

Below the Run Traceroute button, a history table shows all past traces for the selected target:

| Column | Description |
| --- | --- |
| Timestamp | When the trace was captured |
| Hops | Number of hops to the target |
| Fingerprint | First 12 characters of the route's SHA256 hash |
| Reached | Whether the target was reached |
| Trigger | What caused the trace (manual, outage, packet loss) |

Click any row to expand the full hop list with per-hop details: IP address, hostname, latency, and how many of the 3 probes responded.

Route Fingerprint

Each trace generates a SHA256 hash of the hop IP sequence. When the fingerprint changes between traces, the network path has changed -- useful for detecting ISP routing changes or failovers.
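
For illustration, a fingerprint of this shape could be computed as follows; the exact serialization of the hop sequence before hashing is an assumption:

```python
import hashlib

def route_fingerprint(hop_ips):
    """Hash the ordered hop IP sequence and shorten to 12 hex chars
    for display (the comma-join serialization is an assumption)."""
    digest = hashlib.sha256(",".join(hop_ips).encode()).hexdigest()
    return digest[:12]
```

Because the hash covers the ordered sequence, any inserted, removed, or reordered hop yields a different fingerprint.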

Partial Results

If a trace times out (30-second hard cap), DOCSight returns whatever hops were collected up to that point. The trace is marked as "target not reached" with the partial hop list.

Traceroute API Endpoints

| Method | Path | Description |
| --- | --- | --- |
| POST | `/api/connection-monitor/traceroute/<target_id>` | Run manual traceroute (synchronous, returns result) |
| GET | `/api/connection-monitor/traces/<target_id>` | List traces (supports start, end, limit) |
| GET | `/api/connection-monitor/trace/<trace_id>` | Single trace with all hops |

Events

The Connection Monitor generates events that appear in the Event Log:

| Event | Severity | Trigger |
| --- | --- | --- |
| Target unreachable | Critical | Consecutive timeouts exceed the outage threshold |
| Target recovered | Info | First successful probe after an outage |
| Packet loss warning | Warning | Windowed packet loss exceeds the configured percentage (5-minute cooldown) |
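
The packet loss warning amounts to a rolling-window check. A sketch, using a count-based window as a stand-in for the real time window; the 2.0% threshold mirrors the default described above:

```python
from collections import deque

class LossWindow:
    """Keep the most recent probe results and flag when windowed
    packet loss exceeds a threshold (illustrative sketch)."""

    def __init__(self, size=60, warn_pct=2.0):
        self.results = deque(maxlen=size)  # True = probe succeeded
        self.warn_pct = warn_pct

    def add(self, probe_ok):
        """Record one probe result; return True if loss is above threshold."""
        self.results.append(probe_ok)
        return self.loss_pct() > self.warn_pct

    def loss_pct(self):
        if not self.results:
            return 0.0
        return 100.0 * self.results.count(False) / len(self.results)
```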

Pin This Day

The Connection Monitor automatically aggregates older samples to save storage (see Sample Aggregation above). When you need to preserve full raw-resolution data for a specific day, for example to document an ISP outage, you can pin that day. Pinned days are excluded from all cleanup and aggregation, so their raw samples are kept indefinitely.

How to Pin

  1. Open the Connection Monitor detail view
  2. Select the 24h time range (the pin button only appears in this range)
  3. Click Pin this day next to the time range buttons
  4. The day is now preserved and appears as a chip below the time range buttons

Viewing Pinned Days

Pinned days appear as clickable chips below the time range buttons. Clicking a chip loads the full 24-hour period at raw resolution, regardless of how old the data is. Auto-refresh is disabled while viewing a pinned day.

Each chip shows the date and an optional label. Click the x on a chip to unpin the day and allow normal cleanup to reclaim the samples.

When to Use

  • ISP outages -- Pin the day of a major outage before the 7-day raw retention window expires
  • Before/after evidence -- Pin a "bad day" to compare against current performance
  • Complaint preparation -- Preserve raw evidence for ISP complaints or BNetzA filings

API Endpoints

| Method | Path | Description |
| --- | --- | --- |
| GET | `/api/connection-monitor/pinned-days` | List all pinned days with UTC epoch ranges |
| POST | `/api/connection-monitor/pinned-days` | Pin a day (accepts date or timestamp, optional label) |
| DELETE | `/api/connection-monitor/pinned-days/<date>` | Unpin a day |

CSV Export

Each target can be exported as CSV from the detail view. The export automatically uses the same resolution as the current chart view -- raw samples for short ranges, aggregated data for 30d/90d. Columns depend on resolution:

| Resolution | Columns |
| --- | --- |
| Raw | `datetime`, `latency_ms`, `timeout`, `probe_method` |
| Aggregated | `datetime`, `avg_latency_ms`, `min_latency_ms`, `max_latency_ms`, `p95_latency_ms`, `packet_loss_pct`, `sample_count` |
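
A raw-resolution export with those columns could be produced like this (a sketch; the timeout encoding and timestamp format are assumptions, not DOCSight's exact output):

```python
import csv
import io
from datetime import datetime, timezone

def export_raw_csv(samples):
    """Render raw samples with the raw column set shown above.

    Each sample: (epoch_seconds, latency_ms or None, probe_method);
    a None latency is written as an empty cell with timeout = 1.
    """
    buf = io.StringIO()
    writer = csv.writer(buf)
    writer.writerow(["datetime", "latency_ms", "timeout", "probe_method"])
    for ts, latency, method in samples:
        dt = datetime.fromtimestamp(ts, tz=timezone.utc).isoformat()
        writer.writerow([dt,
                         "" if latency is None else latency,
                         int(latency is None),
                         method])
    return buf.getvalue()
```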

Backup

Connection Monitor data (connection_monitor.db) is automatically included in DOCSight backups and restored alongside the main database. See Backup & Restore.

API Endpoints

| Method | Path | Description |
| --- | --- | --- |
| GET | `/api/connection-monitor/targets` | List all targets |
| POST | `/api/connection-monitor/targets` | Create a target |
| PUT | `/api/connection-monitor/targets/<id>` | Update a target |
| DELETE | `/api/connection-monitor/targets/<id>` | Delete a target |
| GET | `/api/connection-monitor/samples/<id>` | Get samples (supports start, end, limit, resolution) |
| GET | `/api/connection-monitor/summary` | Summary stats for all targets |
| GET | `/api/connection-monitor/outages/<id>` | Derived outages (supports start, end, threshold) |
| GET | `/api/connection-monitor/export/<id>` | CSV export (supports start, end, resolution) |
| GET | `/api/connection-monitor/capability` | Current probe method and reason |
| POST | `/api/connection-monitor/traceroute/<id>` | Run manual traceroute (synchronous) |
| GET | `/api/connection-monitor/traces/<id>` | List traces for a target |
| GET | `/api/connection-monitor/trace/<id>` | Single trace with hops |
| GET | `/api/connection-monitor/pinned-days` | List all pinned days with UTC epoch ranges |
| POST | `/api/connection-monitor/pinned-days` | Pin a day (date or timestamp, optional label) |
| DELETE | `/api/connection-monitor/pinned-days/<date>` | Unpin a day |

All endpoints require authentication when an admin password is set. See API Reference.

Troubleshooting

DOCSight shows TCP only

Check these points:

  1. Your container has cap_add: NET_RAW
  2. You redeployed after updating the image
  3. The probe method is not explicitly set to TCP only

TCP works but ICMP still does not

That usually means the container was started without the required capability, or the running container still uses an older image. Recreate the container after pulling the latest image.

Running without Docker

Native Linux installs generally do not need Docker-specific capability settings. If ICMP is unavailable there, use TCP only temporarily and verify the runtime permissions of the DOCSight process.
