-
Notifications
You must be signed in to change notification settings - Fork 9
fix(iam): expose sql metrics for different dashboards #29
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Changes from all commits
File filter
Filter by extension
Conversations
Jump to
Diff view
Diff view
There are no files selected for viewing
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,105 @@ | ||
| # frozen_string_literal: true | ||
|
|
||
| # Copyright (c) 2019-present, BigCommerce Pty. Ltd. All rights reserved | ||
| # | ||
| # Permission is hereby granted, free of charge, to any person obtaining a copy of this software and associated | ||
| # documentation files (the "Software"), to deal in the Software without restriction, including without limitation the | ||
| # rights to use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies of the Software, and to permit | ||
| # persons to whom the Software is furnished to do so, subject to the following conditions: | ||
| # | ||
| # The above copyright notice and this permission notice shall be included in all copies or substantial portions of the | ||
| # Software. | ||
| # | ||
| # THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE | ||
| # WARRANTIES OF MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR | ||
| # COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR | ||
| # OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE. | ||
| # | ||
| module Bigcommerce | ||
| module Prometheus | ||
| module Integrations | ||
| ## | ||
| # Subscribes to ActiveSupport sql.active_record notifications and pushes a per-operation | ||
| # SQL query duration histogram to the Prometheus exporter. | ||
| # | ||
| class ActiveRecord | ||
|
cursor[bot] marked this conversation as resolved.
|
||
| IGNORED_NAMES = %w[SCHEMA CACHE].freeze | ||
| TYPE = 'bc_sql' | ||
|
|
||
| # Idempotent: returns the same instance on repeated calls within a process, | ||
| # so calling .start more than once (e.g. from both the gem's instrumentor and a | ||
| # consuming app's initializer) does not register duplicate subscribers and | ||
| # double-count every SQL query. | ||
| def self.start(client: nil) | ||
| @start ||= new(client: client || ::Bigcommerce::Prometheus.client).tap(&:subscribe!) | ||
| end | ||
|
|
||
| # Wrapper for instrumentor wiring: registers the AR type collector on the given server, | ||
| # gated on the active_record_enabled config flag, and swallows errors so an additive | ||
| # feature failure cannot take down core instrumentation. | ||
| def self.register_type_collector(server, process_name: nil) | ||
|
Contributor
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. do we need this method? this can be done directly on the hutch and web processes with
Author
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. The wrapper exists for two reasons flagged in earlier review: Same wiring is needed in both Web and Hutch — Cursor's bugbot flagged the duplication when it was inline. |
||
| return unless ::Bigcommerce::Prometheus.active_record_enabled | ||
|
|
||
| server.add_type_collector(::Bigcommerce::Prometheus::TypeCollectors::ActiveRecord.new) | ||
| rescue StandardError => e | ||
| log_warn(process_name, "Failed to register ActiveRecord type collector: #{e.message}") | ||
| end | ||
|
|
||
| # Wrapper for instrumentor wiring: starts the AR integration, gated on the | ||
| # active_record_enabled config flag, and swallows errors so an additive feature | ||
| # failure cannot take down core instrumentation. | ||
| def self.start_safe(client: nil, process_name: nil) | ||
| return unless ::Bigcommerce::Prometheus.active_record_enabled | ||
|
|
||
| start(client: client) | ||
| rescue StandardError => e | ||
| log_warn(process_name, "Failed to start ActiveRecord integration: #{e.message}") | ||
| end | ||
|
|
||
| def self.log_warn(process_name, message) | ||
| process_name ||= ::Bigcommerce::Prometheus.process_name | ||
| ::Bigcommerce::Prometheus.logger&.warn("[bigcommerce-prometheus][#{process_name}] #{message}") | ||
| end | ||
| private_class_method :log_warn | ||
|
|
||
| def initialize(client:) | ||
| @client = client | ||
| end | ||
|
|
||
| def subscribe! | ||
| ActiveSupport::Notifications.subscribe('sql.active_record') do |*args| | ||
| call(ActiveSupport::Notifications::Event.new(*args)) | ||
| end | ||
| end | ||
|
|
||
| def call(event) | ||
| return if IGNORED_NAMES.include?(event.payload[:name]) | ||
|
|
||
| @client.send_json( | ||
| type: TYPE, | ||
| duration_seconds: event.duration / 1000.0, | ||
| custom_labels: { operation: classify(event.payload[:sql]) } | ||
| ) | ||
| rescue StandardError | ||
| # Never let metric instrumentation propagate into the request path. | ||
| nil | ||
| end | ||
|
|
||
| private | ||
|
|
||
| def classify(sql) | ||
| return 'other' if sql.nil? | ||
|
|
||
| first_token = sql.lstrip.split(/\s+/, 2).first&.upcase | ||
| case first_token | ||
| when 'SELECT' then 'select' | ||
| when 'INSERT' then 'insert' | ||
| when 'UPDATE' then 'update' | ||
| when 'DELETE' then 'delete' | ||
| else 'other' | ||
| end | ||
| end | ||
|
cursor[bot] marked this conversation as resolved.
|
||
| end | ||
| end | ||
| end | ||
| end | ||
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,52 @@ | ||
| # frozen_string_literal: true | ||
|
|
||
| # Copyright (c) 2019-present, BigCommerce Pty. Ltd. All rights reserved | ||
| # | ||
| # Permission is hereby granted, free of charge, to any person obtaining a copy of this software and associated | ||
| # documentation files (the "Software"), to deal in the Software without restriction, including without limitation the | ||
| # rights to use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies of the Software, and to permit | ||
| # persons to whom the Software is furnished to do so, subject to the following conditions: | ||
| # | ||
| # The above copyright notice and this permission notice shall be included in all copies or substantial portions of the | ||
| # Software. | ||
| # | ||
| # THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE | ||
| # WARRANTIES OF MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR | ||
| # COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR | ||
| # OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE. | ||
| # | ||
| module Bigcommerce | ||
| module Prometheus | ||
| module TypeCollectors | ||
| ## | ||
| # Render-side counterpart to Integrations::ActiveRecord. Aggregates per-operation | ||
| # SQL query duration into a Prometheus Histogram exposed at /metrics. | ||
| # | ||
| class ActiveRecord < Bigcommerce::Prometheus::TypeCollectors::Base | ||
|
Contributor
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. I feel like OTEL would be a better place to get this info from. Not opposed to adding this here, but we already get this info OOB from the OTEL collectors
Author
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. OTEL captures this at the span level, which is great for tracing. The motivation here is to fix existing Grafana panels that query Thanos/Prometheus for aggregate p99/p95 SQL latency over 5m windows — that's much cheaper as a precomputed histogram than running aggregations over OTEL traces in Lightstep. Both can coexist; this isn't replacing OTEL, just exposing the same signal in the format the dashboards already expect. |
||
| DEFAULT_BUCKETS = [0.001, 0.005, 0.01, 0.025, 0.05, 0.1, 0.25, 0.5, 1, 2.5, 5, 10].freeze | ||
|
|
||
| def initialize(default_labels: {}, buckets: DEFAULT_BUCKETS) | ||
| @buckets = buckets | ||
| super(type: Bigcommerce::Prometheus::Integrations::ActiveRecord::TYPE, default_labels: default_labels) | ||
| end | ||
|
|
||
| def build_metrics | ||
| { | ||
| sql_query_duration_seconds: PrometheusExporter::Metric::Histogram.new( | ||
| 'sql_query_duration_seconds', | ||
| 'ActiveRecord SQL query duration in seconds, labeled by operation.', | ||
| buckets: @buckets || DEFAULT_BUCKETS | ||
| ) | ||
| } | ||
| end | ||
|
|
||
| def collect_metrics(data:, labels: {}) | ||
| duration = data['duration_seconds'] | ||
| return if duration.nil? | ||
|
|
||
| metric(:sql_query_duration_seconds).observe(duration.to_f, labels) | ||
| end | ||
| end | ||
| end | ||
| end | ||
| end | ||
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,151 @@ | ||
| # frozen_string_literal: true | ||
|
|
||
| require 'spec_helper' | ||
| require 'active_support/notifications' | ||
|
|
||
| describe Bigcommerce::Prometheus::Integrations::ActiveRecord do | ||
| let(:client) { instance_double(Bigcommerce::Prometheus::Client, send_json: nil) } | ||
|
|
||
| def event_for(sql:, name: nil, duration_ms: 12.5) | ||
| started = Time.now | ||
| finished = started + (duration_ms / 1000.0) | ||
| ActiveSupport::Notifications::Event.new('sql.active_record', started, finished, 'id', { sql: sql, name: name }) | ||
| end | ||
|
|
||
| describe '.start' do | ||
| before { described_class.instance_variable_set(:@start, nil) } | ||
| after { described_class.instance_variable_set(:@start, nil) } | ||
|
|
||
| it 'registers a listener on sql.active_record' do | ||
| expect(ActiveSupport::Notifications).to receive(:subscribe).with('sql.active_record') | ||
| described_class.start(client: client) | ||
| end | ||
|
|
||
| it 'is idempotent — calling .start twice only registers one subscriber' do | ||
| allow(ActiveSupport::Notifications).to receive(:subscribe).with('sql.active_record') | ||
| described_class.start(client: client) | ||
| described_class.start(client: client) | ||
| expect(ActiveSupport::Notifications).to have_received(:subscribe).with('sql.active_record').once | ||
| end | ||
| end | ||
|
|
||
| describe '#call' do | ||
| let(:integration) { described_class.new(client: client) } | ||
|
|
||
| it 'pushes SELECT events as operation "select" with duration in seconds' do | ||
| event = event_for(sql: 'SELECT * FROM users WHERE id = 1', duration_ms: 25) | ||
|
|
||
| expect(client).to receive(:send_json).with( | ||
| hash_including( | ||
| type: 'bc_sql', | ||
| duration_seconds: a_value_within(0.001).of(0.025), | ||
| custom_labels: { operation: 'select' } | ||
| ) | ||
| ) | ||
|
|
||
| integration.call(event) | ||
| end | ||
|
|
||
| it 'pushes INSERT events as operation "insert"' do | ||
| event = event_for(sql: 'INSERT INTO users (name) VALUES (?)') | ||
| expect(client).to receive(:send_json).with(hash_including(custom_labels: { operation: 'insert' })) | ||
| integration.call(event) | ||
| end | ||
|
|
||
| it 'pushes UPDATE events as operation "update"' do | ||
| event = event_for(sql: 'UPDATE users SET name = ? WHERE id = ?') | ||
| expect(client).to receive(:send_json).with(hash_including(custom_labels: { operation: 'update' })) | ||
| integration.call(event) | ||
| end | ||
|
|
||
| it 'pushes DELETE events as operation "delete"' do | ||
| event = event_for(sql: 'DELETE FROM users WHERE id = ?') | ||
| expect(client).to receive(:send_json).with(hash_including(custom_labels: { operation: 'delete' })) | ||
| integration.call(event) | ||
| end | ||
|
|
||
| it 'pushes unrecognized statements as operation "other"' do | ||
| event = event_for(sql: 'BEGIN TRANSACTION') | ||
| expect(client).to receive(:send_json).with(hash_including(custom_labels: { operation: 'other' })) | ||
| integration.call(event) | ||
| end | ||
|
|
||
| it 'ignores SCHEMA events' do | ||
| event = event_for(sql: 'SHOW FULL FIELDS FROM users', name: 'SCHEMA') | ||
| expect(client).not_to receive(:send_json) | ||
| integration.call(event) | ||
| end | ||
|
|
||
| it 'ignores CACHE events' do | ||
| event = event_for(sql: 'SELECT * FROM users', name: 'CACHE') | ||
| expect(client).not_to receive(:send_json) | ||
| integration.call(event) | ||
| end | ||
|
|
||
| it 'classifies case-insensitively' do | ||
| event = event_for(sql: 'select * from users') | ||
| expect(client).to receive(:send_json).with(hash_including(custom_labels: { operation: 'select' })) | ||
| integration.call(event) | ||
| end | ||
|
|
||
| it 'tolerates leading whitespace and newlines' do | ||
| event = event_for(sql: " \n SELECT 1") | ||
| expect(client).to receive(:send_json).with(hash_including(custom_labels: { operation: 'select' })) | ||
| integration.call(event) | ||
| end | ||
|
|
||
| it 'classifies nil SQL as "other" without raising' do | ||
| event = event_for(sql: nil) | ||
| expect(client).to receive(:send_json).with(hash_including(custom_labels: { operation: 'other' })) | ||
| expect { integration.call(event) }.not_to raise_error | ||
| end | ||
|
|
||
| it 'never raises if the client raises' do | ||
| event = event_for(sql: 'SELECT 1') | ||
| allow(client).to receive(:send_json).and_raise(StandardError, 'boom') | ||
| expect { integration.call(event) }.not_to raise_error | ||
| end | ||
| end | ||
|
|
||
| describe '.register_type_collector' do | ||
| let(:server) { instance_double(Bigcommerce::Prometheus::Server, add_type_collector: nil) } | ||
|
|
||
| it 'registers the AR type collector when the feature is enabled' do | ||
| allow(Bigcommerce::Prometheus).to receive(:active_record_enabled).and_return(true) | ||
| expect(server).to receive(:add_type_collector).with(instance_of(Bigcommerce::Prometheus::TypeCollectors::ActiveRecord)) | ||
| described_class.register_type_collector(server) | ||
| end | ||
|
|
||
| it 'does nothing when the feature is disabled' do | ||
| allow(Bigcommerce::Prometheus).to receive(:active_record_enabled).and_return(false) | ||
| expect(server).not_to receive(:add_type_collector) | ||
| described_class.register_type_collector(server) | ||
| end | ||
|
|
||
| it 'never raises if registration fails' do | ||
| allow(Bigcommerce::Prometheus).to receive(:active_record_enabled).and_return(true) | ||
| allow(server).to receive(:add_type_collector).and_raise(StandardError, 'boom') | ||
| expect { described_class.register_type_collector(server) }.not_to raise_error | ||
| end | ||
| end | ||
|
|
||
| describe '.start_safe' do | ||
| it 'starts the integration when the feature is enabled' do | ||
| allow(Bigcommerce::Prometheus).to receive(:active_record_enabled).and_return(true) | ||
| expect(described_class).to receive(:start).with(client: client) | ||
| described_class.start_safe(client: client) | ||
| end | ||
|
|
||
| it 'does nothing when the feature is disabled' do | ||
| allow(Bigcommerce::Prometheus).to receive(:active_record_enabled).and_return(false) | ||
| expect(described_class).not_to receive(:start) | ||
| described_class.start_safe(client: client) | ||
| end | ||
|
|
||
| it 'never raises if start fails' do | ||
| allow(Bigcommerce::Prometheus).to receive(:active_record_enabled).and_return(true) | ||
| allow(described_class).to receive(:start).and_raise(StandardError, 'boom') | ||
| expect { described_class.start_safe(client: client) }.not_to raise_error | ||
| end | ||
| end | ||
| end |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
should this start after the prometheus server is started?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, intentional. The subscriber pushes through Bigcommerce::Prometheus.client which is independent of the local server lifecycle, but starting it after server.start ensures the exporter is ready to receive when the first SQL event fires.