Context
Spine's current logging mechanism (SPINE_LOG and spine_log()) is highly effective for human readability but deviates from modern SRE and observability best practices. As Cacti deployments scale and integrate with centralized logging stacks (ELK, Splunk, Datadog), the current free-form string approach causes friction.
Identified Gaps
- Lack of Structured Logging: Logs are emitted as free-text strings. Downstream aggregators must use fragile regex to parse out device IDs, thread IDs, or data query numbers.
- In-String Severity: Severity levels (ERROR:, WARNING:, DEBUG:) are hardcoded inside the format strings rather than being passed as metadata to the logging engine.
- No Rate Limiting: High-frequency failures (e.g., database disconnects, widespread SNMP timeouts) can flood the log files and the I/O subsystem, because repeated messages are written without deduplication or backoff.
- Source Leakage: Many error messages expose internal C file names (e.g., util.c), which is noisy for operators.
Proposed Modernization
We should consider an incremental refactor to introduce Structured Logging capabilities to Spine, bringing it in line with 2026 industry norms.
- Phase 1: Macro Refactor (Metadata Extraction)
Separate the log level/severity from the string itself.
Current: SPINE_LOG(("WARNING: Device[%i] ...", host_id))
Proposed: SPINE_LOG_WARN("Device connection failed", "host_id", host_id, "thread", host_thread)
- Phase 2: Structured Output Formats
Update spine_log() to optionally output JSON lines ({"level":"warn", "host_id": 10, "msg":"..."}) based on a config flag (log_format = json).
- Phase 3: Rate Limiting
Implement a token-bucket or "seen-recently" cache in spine_log() to squelch identical error strings bursting within a short time window.
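Phases 1 and 2 could be prototyped together roughly as follows. This is a minimal sketch, not Spine's actual code: spine_vformat_json(), spine_format_json(), spine_log_kv() and this definition of SPINE_LOG_WARN are hypothetical names. It assumes every value is an int, that the level and the keys are plain identifiers needing no escaping, and that the macro is always given at least one key/value pair.

```c
#include <assert.h>
#include <stdarg.h>
#include <stddef.h>
#include <stdio.h>
#include <string.h>

/* Append formatted text at *pos; returns 0, or -1 if it would not fit. */
static int appendf(char *buf, size_t cap, size_t *pos, const char *fmt, ...) {
    va_list ap;
    va_start(ap, fmt);
    int n = vsnprintf(buf + *pos, cap - *pos, fmt, ap);
    va_end(ap);
    if (n < 0 || (size_t)n >= cap - *pos) return -1;
    *pos += (size_t)n;
    return 0;
}

/* Append s with quotes, backslashes and control characters JSON-escaped. */
static int append_escaped(char *buf, size_t cap, size_t *pos, const char *s) {
    for (; *s != '\0'; s++) {
        unsigned char c = (unsigned char)*s;
        int rc;
        if (strchr("\"\\", c) != NULL) rc = appendf(buf, cap, pos, "\\%c", c);
        else if (c < 0x20) rc = appendf(buf, cap, pos, "\\u%04x", (unsigned)c);
        else rc = appendf(buf, cap, pos, "%c", c);
        if (rc != 0) return -1;
    }
    return 0;
}

/* Hypothetical core: writes one JSON line built from level, msg and a
 * NULL-terminated list of (const char *key, int value) pairs.
 * Returns the number of bytes written, or -1 if buf is too small. */
int spine_vformat_json(char *buf, size_t cap, const char *level,
                       const char *msg, va_list ap) {
    size_t pos = 0;
    assert(buf != NULL && cap > 0);
    int rc = appendf(buf, cap, &pos, "{\"level\":\"%s\",\"msg\":\"", level);
    if (rc == 0) rc = append_escaped(buf, cap, &pos, msg);
    if (rc == 0) rc = appendf(buf, cap, &pos, "\"");
    while (rc == 0) {
        const char *key = va_arg(ap, const char *);
        if (key == NULL) break;               /* sentinel ends the pairs */
        rc = appendf(buf, cap, &pos, ",\"%s\":%d", key, va_arg(ap, int));
    }
    if (rc == 0) rc = appendf(buf, cap, &pos, "}\n");
    return rc == 0 ? (int)pos : -1;
}

/* Buffer-based wrapper, convenient for unit tests. */
int spine_format_json(char *buf, size_t cap, const char *level,
                      const char *msg, ...) {
    va_list ap;
    va_start(ap, msg);
    int n = spine_vformat_json(buf, cap, level, msg, ap);
    va_end(ap);
    return n;
}

/* Stream-based wrapper: emits the JSON line on stderr (an assumption; a
 * real version would write wherever spine_log() writes today). */
void spine_log_kv(const char *level, const char *msg, ...) {
    char line[1024];
    va_list ap;
    va_start(ap, msg);
    int n = spine_vformat_json(line, sizeof line, level, msg, ap);
    va_end(ap);
    if (n > 0) fputs(line, stderr);
}

/* The severity becomes metadata; the macro appends the NULL sentinel. */
#define SPINE_LOG_WARN(msg, ...) \
    spine_log_kv("warn", (msg), __VA_ARGS__, (const char *)NULL)
```

With this shape, the call proposed in Phase 1, SPINE_LOG_WARN("Device connection failed", "host_id", host_id, "thread", host_thread), compiles unchanged. The key order in the emitted line differs from the Phase 2 example, which should not matter to JSON consumers. Keeping the existing text format would then be a matter of selecting a different formatter behind the same macro when log_format is not json.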
Benefits
- Better Observability: Allows operators to natively query Spine logs in external systems without writing custom log parsing filters.
- Cleaner Codebase: Removes the need to manually prefix every string with DEBUG: or ERROR:.
- System Stability: Protects disk I/O during catastrophic upstream failures.
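The rate limiting proposed in Phase 3 could take the form of a small "seen-recently" cache. The sketch below illustrates that variant only (not a token bucket). It makes several assumptions: spine_log_should_emit(), SEEN_SLOTS and SEEN_WINDOW are hypothetical names and values, messages are identified by a 32-bit FNV-1a hash rather than a full string comparison, and there is no locking, which a real version would need if spine_log() is called from several threads.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <time.h>

#define SEEN_SLOTS  64   /* hypothetical cache size */
#define SEEN_WINDOW 10   /* hypothetical squelch window, in seconds */

struct seen_entry {
    uint32_t hash;        /* FNV-1a hash of the message text */
    time_t   last_emit;   /* when this message was last written out */
    unsigned suppressed;  /* duplicates dropped since last_emit */
    int      used;        /* slot holds a valid entry */
};

static struct seen_entry seen_cache[SEEN_SLOTS];

/* 32-bit FNV-1a hash of a NUL-terminated string. */
static uint32_t fnv1a(const char *s) {
    uint32_t h = 2166136261u;
    for (; *s != '\0'; s++) {
        h ^= (unsigned char)*s;
        h *= 16777619u;
    }
    return h;
}

/* Returns 1 if msg should be written, 0 if it should be squelched.
 * On a return of 1, *suppressed holds how many duplicates were dropped
 * since the previous emission, so the caller can report the count.
 * Two different messages with the same hash are treated as identical,
 * which is a simplification of this sketch. */
int spine_log_should_emit(const char *msg, time_t now, unsigned *suppressed) {
    assert(msg != NULL && suppressed != NULL);
    uint32_t h = fnv1a(msg);
    struct seen_entry *e = &seen_cache[h % SEEN_SLOTS];
    int same = e->used && e->hash == h;

    *suppressed = 0;
    if (same && difftime(now, e->last_emit) < SEEN_WINDOW) {
        e->suppressed++;          /* still inside the window: drop it */
        return 0;
    }
    if (same) {
        *suppressed = e->suppressed;
    }
    /* New message, expired window, or a colliding slot being evicted. */
    e->hash = h;
    e->last_emit = now;
    e->suppressed = 0;
    e->used = 1;
    return 1;
}
```

A caller would check spine_log_should_emit(text, time(NULL), &dropped) before writing, and could append something like "(repeated N times)" when dropped is non-zero so that the squelched burst stays visible to operators. Passing the timestamp in as a parameter keeps the function easy to test.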