Feature Request: Modernize Logging Architecture (Structured Logging & Rate Limiting) #506

@somethingwithproof

Context

Spine's current logging mechanism (SPINE_LOG and spine_log()) is easy for humans to read, but it falls short of current SRE and observability practice. As Cacti deployments scale and integrate with centralized logging stacks (ELK, Splunk, Datadog), free-form log strings become harder to parse and query.

Identified Gaps

  1. Lack of Structured Logging: Logs are emitted as free-text strings. Downstream aggregators must rely on fragile regexes to extract device IDs, thread IDs, or data query numbers.
  2. In-String Severity: Severity levels (ERROR:, WARNING:, DEBUG:) are hardcoded inside the format strings rather than passed to the logging engine as metadata.
  3. No Rate Limiting: High-frequency failures (e.g., database disconnects, widespread SNMP timeouts) can flood the I/O subsystem and log files, with no deduplication or backoff.
  4. Source Leakage: Many error messages expose internal C file names (e.g., util.c), which is noise for operators.

Proposed Modernization

We should consider an incremental refactor that introduces structured logging to Spine in three phases, bringing it in line with current industry practice.

  1. Phase 1: Macro Refactor (Metadata Extraction)
    Separate the log level/severity from the string itself.
    Current: SPINE_LOG(("WARNING: Device[%i] ...", host_id))
    Proposed: SPINE_LOG_WARN("Device connection failed", "host_id", host_id, "thread", host_thread)
  2. Phase 2: Structured Output Formats
    Update spine_log() to optionally emit JSON Lines, one object per log entry ({"level":"warn","host_id":10,"msg":"..."}), when a config flag is set (log_format = json).
  3. Phase 3: Rate Limiting
    Implement a token bucket or a "seen-recently" cache in spine_log() to suppress identical error strings that repeat within a short time window.
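
To make Phases 1 and 2 more concrete, here is a minimal sketch of a severity-tagged macro that formats one JSON line from a message and key/value pairs. Everything in it is hypothetical and not part of Spine today: the names slog_level_t, slog_format_json() and SLOG_WARN_JSON, the restriction to int values, and the choice to format into a caller-supplied buffer rather than write through spine_log().

```c
#include <assert.h>
#include <stdarg.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical severity levels; Spine's real constants may differ. */
typedef enum { SLOG_DEBUG, SLOG_WARN, SLOG_ERROR } slog_level_t;

static const char *slog_level_name(slog_level_t lvl)
{
    switch (lvl) {
    case SLOG_DEBUG: return "debug";
    case SLOG_WARN:  return "warn";
    case SLOG_ERROR: return "error";
    }
    return "unknown";
}

/* Append src at dst[off] as JSON string content, escaping quotes,
 * backslashes and control bytes. Silently truncates when space runs out.
 * Requires off < cap; returns the new offset and NUL-terminates dst. */
static size_t slog_append_escaped(char *dst, size_t cap, size_t off, const char *src)
{
    for (; *src != '\0' && off + 7 < cap; src++) {
        unsigned char c = (unsigned char)*src;
        if (c == '"' || c == '\\') {
            dst[off++] = '\\';
            dst[off++] = (char)c;
        } else if (c < 0x20) {
            off += (size_t)snprintf(dst + off, cap - off, "\\u%04x", (unsigned)c);
        } else {
            dst[off++] = (char)c;
        }
    }
    dst[off] = '\0';
    return off;
}

/* Format {"level":"...","msg":"...","key":value,...}\n into buf.
 * Variadic arguments are (const char *key, int value) pairs terminated by
 * a NULL key. Keys are assumed to be plain identifiers and are not escaped.
 * Returns the number of bytes written, or -1 if buf is too small. */
static int slog_format_json(char *buf, size_t cap, slog_level_t lvl, const char *msg, ...)
{
    va_list ap;
    const char *key;
    size_t off;
    int n;

    assert(buf != NULL && msg != NULL);
    n = snprintf(buf, cap, "{\"level\":\"%s\",\"msg\":\"", slog_level_name(lvl));
    if (n < 0 || (size_t)n >= cap) return -1;
    off = slog_append_escaped(buf, cap, (size_t)n, msg);
    n = snprintf(buf + off, cap - off, "\"");
    if (n < 0 || (size_t)n >= cap - off) return -1;
    off += (size_t)n;

    va_start(ap, msg);
    while ((key = va_arg(ap, const char *)) != NULL) {
        int value = va_arg(ap, int);
        n = snprintf(buf + off, cap - off, ",\"%s\":%d", key, value);
        if (n < 0 || (size_t)n >= cap - off) { va_end(ap); return -1; }
        off += (size_t)n;
    }
    va_end(ap);

    n = snprintf(buf + off, cap - off, "}\n");
    if (n < 0 || (size_t)n >= cap - off) return -1;
    return (int)(off + (size_t)n);
}

/* The macro supplies the level and the NULL sentinel so call sites stay
 * short. It needs at least one key/value pair in standard C99. */
#define SLOG_WARN_JSON(buf, cap, msg, ...) \
    slog_format_json((buf), (cap), SLOG_WARN, (msg), __VA_ARGS__, (const char *)NULL)
```

A call such as SLOG_WARN_JSON(buf, sizeof(buf), "Device connection failed", "host_id", host_id, "thread", host_thread) would leave a single JSON line in buf. A real implementation would also need string-valued fields and a plain-text fallback for deployments that keep the existing format.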

Benefits

  • Better Observability: Allows operators to natively query Spine logs in external systems without writing custom log parsing filters.
  • Cleaner Codebase: Removes the need to manually prefix every string with DEBUG: or ERROR:.
  • System Stability: Protects disk I/O during catastrophic upstream failures.
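
For the System Stability point, the "seen-recently" cache from Phase 3 could look roughly like the sketch below. The names (seen_recently_allow(), SEEN_SLOTS, SEEN_WINDOW_SECS), the 64-slot table, and the 10-second window are all illustrative assumptions. It matches on a message hash only and takes no lock, so a real version inside spine_log() would need a mutex and probably a full string comparison.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>
#include <time.h>

#define SEEN_SLOTS 64        /* hypothetical cache size */
#define SEEN_WINDOW_SECS 10  /* hypothetical suppression window */

typedef struct {
    uint32_t hash;          /* FNV-1a hash of the message text */
    time_t   window_start;  /* when the message was last let through */
    unsigned suppressed;    /* duplicates dropped since window_start */
    int      used;          /* slot holds a valid entry */
} seen_entry_t;

static seen_entry_t seen_cache[SEEN_SLOTS];

/* 32-bit FNV-1a string hash. */
static uint32_t fnv1a(const char *s)
{
    uint32_t h = 2166136261u;
    for (; *s != '\0'; s++) {
        h ^= (unsigned char)*s;
        h *= 16777619u;
    }
    return h;
}

/* Returns 1 if msg should be written, 0 if it repeats within the window.
 * When returning 1 for a message whose window just expired,
 * *suppressed_out receives how many duplicates were dropped, so the
 * caller can log a "repeated N times" summary. Otherwise it is set to 0. */
static int seen_recently_allow(const char *msg, time_t now, unsigned *suppressed_out)
{
    uint32_t h;
    seen_entry_t *e;

    assert(msg != NULL);
    h = fnv1a(msg);
    e = &seen_cache[h % SEEN_SLOTS];
    if (suppressed_out != NULL) *suppressed_out = 0;

    if (e->used && e->hash == h &&
        difftime(now, e->window_start) < SEEN_WINDOW_SECS) {
        e->suppressed++;
        return 0;
    }

    /* New message, expired window, or a different message in this slot:
     * report the dropped count for a same-hash entry and restart. */
    if (e->used && e->hash == h && suppressed_out != NULL)
        *suppressed_out = e->suppressed;
    e->hash = h;
    e->window_start = now;
    e->suppressed = 0;
    e->used = 1;
    return 1;
}
```

spine_log() would call seen_recently_allow() on the formatted message before writing and skip the write when it returns 0. A token bucket is the alternative when the goal is to cap total log volume rather than deduplicate identical strings.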
