From 91f4e5522ee24f7fffe8cff229cae19a363e45c0 Mon Sep 17 00:00:00 2001
From: Farid Zakaria
Date: Wed, 6 Aug 2025 13:25:59 -0700
Subject: [PATCH] Add documentation on metrics

---
 METRICS.md | 45 +++++++++++++++++++++++++++++++++++++++++++++
 1 file changed, 45 insertions(+)
 create mode 100644 METRICS.md

diff --git a/METRICS.md b/METRICS.md
new file mode 100644
index 000000000..4c900610f
--- /dev/null
+++ b/METRICS.md
@@ -0,0 +1,45 @@
+# Metrics
+
+The following is a list of useful [PromQL](https://prometheus.io/docs/prometheus/latest/querying/basics/) queries that you can use to visualize the effectiveness and utilization of your remote cache.
+
+### Cache Hit Percentage By Type
+
+```promql
+sum by (kind) (rate(bazel_remote_incoming_requests_total{status="hit"}[$__rate_interval]))
+/
+sum by (kind) (rate(bazel_remote_incoming_requests_total[$__rate_interval]))
+* 100
+```
+
+### Cache Hit Percentage Overall
+
+```promql
+sum(rate(bazel_remote_incoming_requests_total{status="hit"}[$__rate_interval]))
+/
+sum(rate(bazel_remote_incoming_requests_total[$__rate_interval]))
+* 100
+```
+
+### Request Rate
+
+```promql
+sum(rate(bazel_remote_incoming_requests_total[$__rate_interval]))
+```
+
+### Request Duration Quantiles
+
+```promql
+histogram_quantile(0.99, sum by (le) (rate(http_request_duration_seconds_bucket{k8s_cluster_name="bazel-remote-cache"}[$__rate_interval])))
+```
+
+### S3 Cache Hit Percentage Overall
+
+```promql
+sum(rate(bazel_remote_s3_cache_hits_total[$__rate_interval]))
+/
+(sum(rate(bazel_remote_s3_cache_hits_total[$__rate_interval]))
+  +
+  sum(rate(bazel_remote_s3_cache_misses_total[$__rate_interval]))
+)
+* 100
+```
\ No newline at end of file