
Commit fb0d1f0

Perf: DB query optimization (#29)
## Database & Grafana query performance optimizations

With 130M data points (364 CMLs × 2 sublinks, 26 days at 10s resolution) on an 8 GB / 4 CPU VM, loading raw data in the real-time dashboard was slow. The zoomed-out 1h aggregate view was already fast. This PR addresses the raw data path through three changes:

**1. PostgreSQL memory tuning (`docker-compose.yml`)**

The database container previously ran with PostgreSQL defaults (128 MB `shared_buffers`). Parameters are now tuned for the VM size, keeping recently-used data chunks in RAM and steering the query planner toward index scans over sequential scans.

**2. TimescaleDB compression (`database/init.sql`)**

Chunks older than 7 days are compressed automatically via a background policy, using `(cml_id, sublink_id)` as the segment key, so a query for one sublink of a CML decompresses only ~1/728th of a chunk (~1/364th when it reads both sublinks). The current week stays uncompressed for zero-overhead real-time ingestion. At ~21–51× compression ratio on existing data, the entire compressed history fits in `shared_buffers`. This scales well: as new streams are added the compressed footprint grows slowly while the hot uncompressed window stays bounded at one week.

**3. Adaptive query bucketing in the real-time Grafana dashboard**

Replaced the binary 1h-aggregate / raw-10s switch with a three-tier system based on the selected time range: the 1h aggregate for ranges longer than 3 days in Auto mode, `$__interval` buckets for ranges of 3 days or less in Auto mode, and raw 10s data only when `Raw` is selected. For the middle tier, the RSL panel uses a single CTE scan that computes MIN/MAX/AVG via `time_bucket('$__interval', time)`, matching the panel's pixel density, so the min/max fill band and avg line are generated at no extra query cost. The TSL panel computes only AVG per bucket. The `Raw` mode is now explicit-only, preventing accidental slow queries on wide time ranges.

Minor: disabled point rendering on the RSL panel to suppress dots that appeared on the band boundary at low data density.
Individual commits:

* perf: tune PostgreSQL memory settings for 8 GB VM
* perf: adaptive $__interval bucketing, three-tier auto/raw query split
* perf: optimize SQL query for CML time series data with min, max, and avg calculations
* perf: change timeseries panel to never show points for improved clarity
* perf: enable TimescaleDB compression after 7 days, increase shared_buffers to 2GB
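
The figures quoted in the description can be checked with a little arithmetic. The sketch below assumes every stream reports once per 10 s with no gaps, so it gives an upper bound on the row count. The quoted 130M sits below that bound, which would be consistent with gaps or streams that started later, though the description does not say. The variable names are illustrative only.

```python
# Back-of-the-envelope check of the numbers in the commit description.
CMLS = 364
SUBLINKS_PER_CML = 2
DAYS = 26
SAMPLE_PERIOD_S = 10  # 10 s resolution

# One stream (and one compressed segment per chunk) per (cml_id, sublink_id) pair.
streams = CMLS * SUBLINKS_PER_CML
samples_per_stream = DAYS * 86_400 // SAMPLE_PERIOD_S
max_rows = streams * samples_per_stream  # upper bound, assumes no gaps

print(streams)             # 728
print(samples_per_stream)  # 224640
print(max_rows)            # 163537920

# Share of a chunk touched by one stream, and by one CML with both sublinks.
one_stream = 1 / streams
one_cml = SUBLINKS_PER_CML / streams
print(f"{one_stream:.4%}")  # 0.1374%
print(f"{one_cml:.4%}")     # 0.2747%
```

The dashboard queries in this commit filter on `cml_id` only, so they read both sublinks of a CML, which corresponds to the second figure.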
1 parent 4dcf26e commit fb0d1f0

3 files changed

Lines changed: 62 additions & 8 deletions

database/init.sql

Lines changed: 23 additions & 1 deletion
@@ -112,4 +112,26 @@ SELECT add_continuous_aggregate_policy('cml_data_1h',
     start_offset => INTERVAL '2 days',
     end_offset => INTERVAL '1 hour',
     schedule_interval => INTERVAL '1 hour'
-);
+);
+
+-- ---------------------------------------------------------------------------
+-- Compression for cml_data chunks older than 7 days.
+--
+-- compress_segmentby: each compressed segment contains one (cml_id, sublink_id)
+-- pair, so a query filtered to a single CML decompresses only ~1/728th of a
+-- chunk, not the whole thing.
+-- compress_orderby: matches the query pattern (time range scans), allowing
+-- skip-scan decompression for narrow time windows within a segment.
+--
+-- At ~10-20x compression ratio, the last month of data fits in shared_buffers
+-- after a single cache warm-up, regardless of how many new streams are added.
+-- The current uncompressed week chunk is left untouched so real-time ingestion
+-- and detail-view queries on recent data have no decompression overhead.
+-- ---------------------------------------------------------------------------
+ALTER TABLE cml_data SET (
+    timescaledb.compress,
+    timescaledb.compress_segmentby = 'cml_id, sublink_id',
+    timescaledb.compress_orderby = 'time DESC'
+);
+
+SELECT add_compression_policy('cml_data', INTERVAL '7 days');
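
The SQL comment quotes a ~10-20x compression ratio, while the commit description reports ~21–51× on existing data. A rough way to see what either range implies for the "fits in `shared_buffers`" claim is to ask how much uncompressed data would fit once compressed. This is only a sketch: it assumes the whole 2 GB of `shared_buffers` is available for compressed chunks, which ignores the uncompressed current chunk, indexes, and other buffers, and the function name is hypothetical.

```python
SHARED_BUFFERS_GB = 2  # value set in docker-compose.yml by this commit


def max_uncompressed_gb(ratio, buffers_gb=SHARED_BUFFERS_GB):
    """Largest uncompressed size (GB) whose compressed form fits in buffers_gb.

    Simplification: assumes all of shared_buffers holds compressed data.
    """
    return ratio * buffers_gb


# Ratios quoted in the SQL comment (10, 20) and in the description (21, 51).
for ratio in (10, 20, 21, 51):
    print(f"{ratio}x -> up to {max_uncompressed_gb(ratio)} GB uncompressed")
```

The measured ratio for a given deployment has to come from the database itself. These numbers only bound what each quoted ratio would allow.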

docker-compose.yml

Lines changed: 12 additions & 0 deletions
@@ -73,6 +73,18 @@ services:
     build: ./database
     ports:
       - "5432:5432"
+    # Tune PostgreSQL memory for an 8 GB VM.
+    # shared_buffers: keep recently-used chunks in RAM (default is only 128 MB).
+    # effective_cache_size: hints to the planner how much OS page cache is available.
+    # work_mem: memory per sort/hash operation; speeds up ORDER BY on large result sets.
+    # random_page_cost: tell the planner data is effectively cached, prefer index scans.
+    command: >
+      postgres
+      -c shared_buffers=2GB
+      -c effective_cache_size=4GB
+      -c work_mem=64MB
+      -c maintenance_work_mem=256MB
+      -c random_page_cost=1.1
     healthcheck:
       test: ["CMD-SHELL", "pg_isready -U myuser -d mydatabase"]
       interval: 5s
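
To put the chosen values in proportion to the 8 GB VM, the sketch below expresses each setting as a share of RAM and shows how `work_mem` scales, since it is a per-operation limit and not a global cap. The figure of 20 concurrent sort/hash operations is a made-up illustration, not a measurement from this deployment, and the helper names are hypothetical.

```python
VM_RAM_MB = 8 * 1024  # 8 GB VM, per the commit description

# Values passed to postgres via `command:` in docker-compose.yml.
settings_mb = {
    "shared_buffers": 2 * 1024,
    "effective_cache_size": 4 * 1024,
    "work_mem": 64,
    "maintenance_work_mem": 256,
}


def share_of_ram(name):
    """Fraction of total VM RAM that a setting corresponds to."""
    return settings_mb[name] / VM_RAM_MB


def worst_case_work_mem_mb(concurrent_ops):
    """work_mem applies per sort/hash operation, so usage scales with concurrency."""
    return settings_mb["work_mem"] * concurrent_ops


print(share_of_ram("shared_buffers"))        # 0.25
print(share_of_ram("effective_cache_size"))  # 0.5
print(worst_case_work_mem_mb(20))            # 1280 (hypothetical concurrency)
```

`effective_cache_size` is only a planner hint and allocates nothing, so it is not part of any memory total.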

grafana/provisioning/dashboards/definitions/cml-realtime.json

Lines changed: 27 additions & 7 deletions
@@ -105,7 +105,7 @@
             "scaleDistribution": {
               "type": "linear"
             },
-            "showPoints": "auto",
+            "showPoints": "never",
             "spanNulls": false,
             "stacking": {
               "group": "A",
@@ -373,7 +373,7 @@
           },
           "format": "time_series",
           "rawQuery": true,
-          "rawSql": "SELECT\n bucket AS \"time\",\n sublink_id || ' min' AS metric,\n rsl_min AS value\nFROM cml_data_1h\nWHERE cml_id = '${cml_id}'\n AND bucket >= $__timeFrom()::timestamptz\n AND bucket <= $__timeTo()::timestamptz\nORDER BY 1 ASC",
+          "rawSql": "SELECT\n bucket AS \"time\",\n sublink_id || ' min' AS metric,\n rsl_min AS value\nFROM cml_data_1h\nWHERE cml_id = '${cml_id}'\n AND '${interval}' = 'auto'\n AND EXTRACT(EPOCH FROM ($__timeTo()::timestamptz - $__timeFrom()::timestamptz)) > 259200\n AND bucket >= $__timeFrom()::timestamptz\n AND bucket <= $__timeTo()::timestamptz\nORDER BY 1 ASC",
           "refId": "A"
         },
         {
@@ -383,7 +383,7 @@
           },
           "format": "time_series",
           "rawQuery": true,
-          "rawSql": "SELECT\n bucket AS \"time\",\n sublink_id || ' max' AS metric,\n rsl_max AS value\nFROM cml_data_1h\nWHERE cml_id = '${cml_id}'\n AND bucket >= $__timeFrom()::timestamptz\n AND bucket <= $__timeTo()::timestamptz\nORDER BY 1 ASC",
+          "rawSql": "SELECT\n bucket AS \"time\",\n sublink_id || ' max' AS metric,\n rsl_max AS value\nFROM cml_data_1h\nWHERE cml_id = '${cml_id}'\n AND '${interval}' = 'auto'\n AND EXTRACT(EPOCH FROM ($__timeTo()::timestamptz - $__timeFrom()::timestamptz)) > 259200\n AND bucket >= $__timeFrom()::timestamptz\n AND bucket <= $__timeTo()::timestamptz\nORDER BY 1 ASC",
           "refId": "B"
         },
         {
@@ -403,8 +403,18 @@
           },
           "format": "time_series",
           "rawQuery": true,
-          "rawSql": "SELECT\n time AS \"time\",\n sublink_id AS metric,\n rsl AS value\nFROM cml_data\nWHERE cml_id = '${cml_id}'\n AND (\n ('${interval}' = 'auto' AND EXTRACT(EPOCH FROM ($__timeTo()::timestamptz - $__timeFrom()::timestamptz)) <= 259200)\n OR '${interval}' = 'raw'\n )\n AND time >= $__timeFrom()::timestamptz\n AND time <= $__timeTo()::timestamptz\nORDER BY 1 ASC",
+          "rawSql": "WITH bucketed AS (\n SELECT\n time_bucket('$__interval', time) AS bucket,\n sublink_id,\n MIN(rsl) AS rsl_min,\n MAX(rsl) AS rsl_max,\n AVG(rsl) AS rsl_avg\n FROM cml_data\n WHERE cml_id = '${cml_id}'\n AND '${interval}' = 'auto'\n AND EXTRACT(EPOCH FROM ($__timeTo()::timestamptz - $__timeFrom()::timestamptz)) <= 259200\n AND time >= $__timeFrom()::timestamptz\n AND time <= $__timeTo()::timestamptz\n GROUP BY 1, 2\n)\nSELECT bucket AS \"time\", sublink_id || ' min' AS metric, rsl_min AS value FROM bucketed\nUNION ALL\nSELECT bucket AS \"time\", sublink_id || ' max' AS metric, rsl_max AS value FROM bucketed\nUNION ALL\nSELECT bucket AS \"time\", sublink_id || ' avg' AS metric, rsl_avg AS value FROM bucketed\nORDER BY 1 ASC",
           "refId": "D"
+        },
+        {
+          "datasource": {
+            "type": "grafana-postgresql-datasource",
+            "uid": "PostgreSQL"
+          },
+          "format": "time_series",
+          "rawQuery": true,
+          "rawSql": "SELECT\n time AS \"time\",\n sublink_id AS metric,\n rsl AS value\nFROM cml_data\nWHERE cml_id = '${cml_id}'\n AND '${interval}' = 'raw'\n AND time >= $__timeFrom()::timestamptz\n AND time <= $__timeTo()::timestamptz\nORDER BY 1 ASC",
+          "refId": "E"
         }
       ],
       "title": "CML Time Series - Received Signal Level",
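
The `bucketed` CTE in the RSL panel's refId D query computes all three aggregates in one scan and then fans them out into `min`, `max` and `avg` series with `UNION ALL`. The sketch below mirrors that shape in plain Python. It is a simplification, not TimescaleDB's implementation: buckets are floored to multiples of the width counted from the Unix epoch, which ignores `time_bucket`'s origin handling, and the sample rows and the `sublink_1` identifier are invented for illustration.

```python
from collections import defaultdict


def bucketed_stats(rows, width_s):
    """One pass over (epoch_seconds, sublink_id, rsl) rows.

    Returns (bucket_start, metric_name, value) tuples, three per
    (bucket, sublink) group, like the three UNION ALL branches.
    """
    # Accumulator per group: [min, max, sum, count]
    acc = defaultdict(lambda: [float("inf"), float("-inf"), 0.0, 0])
    for t, sublink, rsl in rows:
        key = (t - t % width_s, sublink)  # floor the timestamp to its bucket
        a = acc[key]
        a[0] = min(a[0], rsl)
        a[1] = max(a[1], rsl)
        a[2] += rsl
        a[3] += 1

    out = []
    for (bucket, sublink), (lo, hi, total, n) in sorted(acc.items()):
        out.append((bucket, f"{sublink} min", lo))
        out.append((bucket, f"{sublink} max", hi))
        out.append((bucket, f"{sublink} avg", total / n))
    return out


# Hypothetical 10 s samples for one sublink, bucketed to 60 s.
rows = [
    (0, "sublink_1", -50.0),
    (10, "sublink_1", -52.0),
    (20, "sublink_1", -51.0),
    (60, "sublink_1", -49.0),
]
stats = bucketed_stats(rows, 60)
for row in stats:
    print(row)
```

Because the raw rows are read once and every series comes from the same accumulator, the band and the average line cost one scan instead of three.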
@@ -708,7 +718,7 @@
           },
           "format": "time_series",
           "rawQuery": true,
-          "rawSql": "SELECT\n bucket AS \"time\",\n sublink_id || ' min' AS metric,\n tsl_min AS value\nFROM cml_data_1h\nWHERE cml_id = '${cml_id}'\n AND bucket >= $__timeFrom()::timestamptz\n AND bucket <= $__timeTo()::timestamptz\nORDER BY 1 ASC",
+          "rawSql": "SELECT\n bucket AS \"time\",\n sublink_id || ' min' AS metric,\n tsl_min AS value\nFROM cml_data_1h\nWHERE cml_id = '${cml_id}'\n AND '${interval}' = 'auto'\n AND EXTRACT(EPOCH FROM ($__timeTo()::timestamptz - $__timeFrom()::timestamptz)) > 259200\n AND bucket >= $__timeFrom()::timestamptz\n AND bucket <= $__timeTo()::timestamptz\nORDER BY 1 ASC",
           "refId": "A"
         },
         {
@@ -718,7 +728,7 @@
           },
           "format": "time_series",
           "rawQuery": true,
-          "rawSql": "SELECT\n bucket AS \"time\",\n sublink_id || ' max' AS metric,\n tsl_max AS value\nFROM cml_data_1h\nWHERE cml_id = '${cml_id}'\n AND bucket >= $__timeFrom()::timestamptz\n AND bucket <= $__timeTo()::timestamptz\nORDER BY 1 ASC",
+          "rawSql": "SELECT\n bucket AS \"time\",\n sublink_id || ' max' AS metric,\n tsl_max AS value\nFROM cml_data_1h\nWHERE cml_id = '${cml_id}'\n AND '${interval}' = 'auto'\n AND EXTRACT(EPOCH FROM ($__timeTo()::timestamptz - $__timeFrom()::timestamptz)) > 259200\n AND bucket >= $__timeFrom()::timestamptz\n AND bucket <= $__timeTo()::timestamptz\nORDER BY 1 ASC",
           "refId": "B"
         },
         {
@@ -738,8 +748,18 @@
           },
           "format": "time_series",
           "rawQuery": true,
-          "rawSql": "SELECT\n time AS \"time\",\n sublink_id AS metric,\n tsl AS value\nFROM cml_data\nWHERE cml_id = '${cml_id}'\n AND (\n ('${interval}' = 'auto' AND EXTRACT(EPOCH FROM ($__timeTo()::timestamptz - $__timeFrom()::timestamptz)) <= 259200)\n OR '${interval}' = 'raw'\n )\n AND time >= $__timeFrom()::timestamptz\n AND time <= $__timeTo()::timestamptz\nORDER BY 1 ASC",
+          "rawSql": "SELECT\n time_bucket('$__interval', time) AS \"time\",\n sublink_id AS metric,\n AVG(tsl) AS value\nFROM cml_data\nWHERE cml_id = '${cml_id}'\n AND '${interval}' = 'auto'\n AND EXTRACT(EPOCH FROM ($__timeTo()::timestamptz - $__timeFrom()::timestamptz)) <= 259200\n AND time >= $__timeFrom()::timestamptz\n AND time <= $__timeTo()::timestamptz\nGROUP BY 1, 2\nORDER BY 1 ASC",
           "refId": "D"
+        },
+        {
+          "datasource": {
+            "type": "grafana-postgresql-datasource",
+            "uid": "PostgreSQL"
+          },
+          "format": "time_series",
+          "rawQuery": true,
+          "rawSql": "SELECT\n time AS \"time\",\n sublink_id AS metric,\n tsl AS value\nFROM cml_data\nWHERE cml_id = '${cml_id}'\n AND '${interval}' = 'raw'\n AND time >= $__timeFrom()::timestamptz\n AND time <= $__timeTo()::timestamptz\nORDER BY 1 ASC",
+          "refId": "E"
         }
       ],
       "title": "CML Time Series - Transmitted Signal Level",
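
Every target in both panels runs on each refresh. The predicates on `${interval}` and on the selected time span decide which of them returns rows. The sketch below restates that gating logic as a single function so the three tiers are easy to see. It is an illustration of the predicates visible in this diff, not code from the dashboard: the function name and return labels are hypothetical, and only the `auto` and `raw` values of `${interval}` appear in the diff, so any other value is treated as matching nothing.

```python
THREE_DAYS_S = 259_200  # 3 * 86400, the threshold in the rawSql predicates


def active_tier(interval_mode, time_from_s, time_to_s):
    """Return which query tier yields rows for a given mode and time range."""
    span_s = time_to_s - time_from_s
    if interval_mode == "raw":
        # refId E: raw 10 s rows, only when Raw is selected explicitly.
        return "raw"
    if interval_mode == "auto":
        if span_s <= THREE_DAYS_S:
            # refId D: time_bucket('$__interval', time) over cml_data.
            return "bucketed"
        # refId A and B: pre-aggregated rows from cml_data_1h.
        return "1h aggregate"
    return None  # no predicate in the diff matches other values


print(active_tier("auto", 0, 6 * 3600))       # bucketed
print(active_tier("auto", 0, 7 * 86_400))     # 1h aggregate
print(active_tier("raw", 0, 7 * 86_400))      # raw
```

A range of exactly 3 days falls in the bucketed tier, because the refId D predicate uses `<=` while the refId A and B predicates use `>`.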
