NETOBSERV-2608: TLS metrics by jotak · Pull Request #2707 · netobserv/netobserv-operator

jotak · 2026-05-05T08:40:24Z

Description

Create specific metrics for TLS, counting flow, labelled by:

TLSVersion
TLSCipherSuite
TLSGroup

Add related charts in the monitoring dashboard

Add related otel mapping

Dependencies

Console (pf5): NETOBSERV-2608: TLS, metrics and charts [pf5] netobserv-web-console#1478

Checklist

Does the changes in PR need specific configuration or environment set up for testing?
- if so please describe it in PR description.
I have added thorough unit tests for the change.
QE requirements (check 1 from the list):
- Standard QE validation, with pre-merge tests unless stated otherwise.
- Regression tests only (e.g. refactoring with no user-facing change).
- No QE (e.g. trivial change with high reviewer's confidence, or per agreement with the QE team).

Summary by CodeRabbit

New Features
- Added TLS flow metrics for nodes, namespaces, and workloads.
- Added a TLS section with charts (TLS traffic, per-version and per-group breakdowns).
UI
- Renamed TLS field/column from "TLS Curve" to "TLS Group" in the console.
Documentation
- Updated FlowCollector CRD and docs to include new TLS metrics.
Chores
- Deployment flow updated to support PF5 console-plugin image handling.

Create specific metrics for TLS, counting flow, labelled by: - TLSVersion - TLSCipherSuite - TLSGroup - TLSTypes - Proto

openshift-ci · 2026-05-05T08:40:28Z

Skipping CI for Draft Pull Request.
If you want CI signal for your change, please convert it to an actual PR.
You can still manually trigger a test run with /test all

openshift-ci-robot · 2026-05-05T08:40:29Z

@jotak: This pull request references NETOBSERV-2608 which is a valid jira issue.

Warning: The referenced jira issue has an invalid target version for the target branch this PR targets: expected the story to target the "5.0.0" version, but no target version was set.

Details

In response to this:

Description

Create specific metrics for TLS, counting flow, labelled by:

TLSVersion

TLSCipherSuite

TLSGroup

TLSTypes

Proto

Dependencies

Agent: Rename TLSCurve => TLSGroup netobserv-ebpf-agent#968

Checklist

Does the changes in PR need specific configuration or environment set up for testing?

if so please describe it in PR description.

I have added thorough unit tests for the change.

QE requirements (check 1 from the list):

Standard QE validation, with pre-merge tests unless stated otherwise.

Regression tests only (e.g. refactoring with no user-facing change).

No QE (e.g. trivial change with high reviewer's confidence, or per agreement with the QE team).

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the openshift-eng/jira-lifecycle-plugin repository.

openshift-ci · 2026-05-05T08:40:30Z

[APPROVALNOTIFIER] This PR is NOT APPROVED

This pull-request has been approved by:
Once this PR has been reviewed and has the lgtm label, please assign mffiedler for approval. For more information see the Code Review Process.

The full list of commands accepted by this bot can be found here.

Details

Needs approval from an approver in each of these files:

DOWNSTREAM_OWNERS

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

coderabbitai · 2026-05-05T08:40:32Z

📝 Walkthrough

Walkthrough

Adds TLS flow metrics and TLSTracking gating, extends CRD/type enums and defaults, updates core metric definitions and TLS charts, adds OTEL TLS mappings, renames frontend TLSCurve→TLSGroup, enables TLSTracking in test specs, and extends Makefile/CSV handling for a PF5 console-plugin image variant.

Changes

Plugin Image PF5 Variant Support

Layer / File(s)	Summary
Build Configuration `.mk/development.mk`	`set-plugin-image` now sets `RELATED_IMAGE_CONSOLE_PLUGIN_PF5` when `CSV` is empty and performs CSV swaps for `console-plugin-pf5` / `RELATED_IMAGE_CONSOLE_PLUGIN_PF5` when `CSV` is provided; updates static plugin image on redeploy.
Console Frontend Config `internal/controller/consoleplugin/config/static-frontend-config.yaml`	Replaces `TLSCurve` with `TLSGroup` for UI columns, filters, and fields documentation.

TLS Metrics Feature

Layer / File(s)	Summary
Type Definitions & Validation / Feature-gate `api/flowcollector/v1beta2/flowcollector_types.go`, `api/flowcollector/v1beta2/flowcollector_defaults.go`, `api/flowcollector/v1beta2/flowcollector_alert_types.go`	Default include lists extended with TLS metrics; `FLPMetric` kubebuilder enum updated; `GetIncludeList` filters out `_tls_` metrics when TLSTracking is disabled.
CRD & Manifests `config/crd/bases/flows.netobserv.io_flowcollectors.yaml`, `bundle/manifests/flows.netobserv.io_flowcollectors.yaml`, `helm/crds/flows.netobserv.io_flowcollectors.yaml`	Added `node_tls_flows_total`, `namespace_tls_flows_total`, and `workload_tls_flows_total` to `includeList` / `additionalIncludeList` enums.
API Documentation `docs/FlowCollector.md`	`additionalIncludeList` and adjacent include-list enums updated with TLS metric names.
Metrics Core Implementation `internal/pkg/metrics/predefined_metrics.go`	Removed `TLSVersion` from baseline labels; added per-group RTT, drop, DNS, netpol, TLS, and IPsec metrics; refactored metric construction and adjusted label-removal logic.
Metrics Charts `internal/pkg/metrics/predefined_charts.go`	Added `tlsStatusChart()` returning three TLS charts: TLS traffic (SingleStat), flows rate per TLS version (Line), and flows rate per TLS group (Line).
Observability Integration `internal/pkg/helper/otel/otel-config.json`	Added OTEL transform mappings for `TLSVersion`, `TLSGroup`, `TLSTypes`, and `TLSCipherSuite`.
Tests & Utilities `internal/controller/flp/flp_pipeline_builder_test.go`, `internal/pkg/dashboards/dashboard_test.go`, `internal/pkg/metrics/predefined_metrics_test.go`, `internal/pkg/test/util/utils.go`, `internal/pkg/helper/otel/otel_config_test.go`	Test expectations updated for TLS metrics and dashboards; TLSTracking added to test FlowCollector spec; OTEL transform test updated; minor test formatting changes.

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~45 minutes

Possibly related PRs

netobserv/netobserv-operator#2598: Both PRs modify the operator deployment/image pipeline to add and handle a PF5 console-plugin variant (introducing/using RELATED_IMAGE_CONSOLE_PLUGIN_PF5 and adding CSV/Makefile swaps for console-plugin-pf5).
netobserv/netobserv-operator#2546: Modifies the same FlowCollectorSpec.GetIncludeList function with overlapping metric-filtering changes.

🚥 Pre-merge checks | ✅ 4 | ❌ 1

❌ Failed checks (1 warning)

Check name	Status	Explanation	Resolution
Docstring Coverage	⚠️ Warning	Docstring coverage is 0.00% which is insufficient. The required threshold is 80.00%.	Write docstrings for the functions missing them to satisfy the coverage threshold.

✅ Passed checks (4 passed)

Check name	Status	Explanation
Title check	✅ Passed	The title 'NETOBSERV-2608: TLS metrics' clearly describes the main change: adding TLS-related metrics functionality referenced by the Jira ticket.
Description check	✅ Passed	The description covers the main changes (TLS metrics, charts, otel mapping) and includes dependencies and QE requirements. All required template sections are present and substantively filled.
Linked Issues check	✅ Passed	Check skipped because no linked issues were found for this pull request.
Out of Scope Changes check	✅ Passed	Check skipped because no linked issues were found for this pull request.

_{✏️ Tip: You can configure your own custom pre-merge checks in the settings.}

✨ Finishing Touches

🧪 Generate unit tests (beta)

Create PR with unit tests

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

_{Comment @coderabbitai help to get the list of available commands and usage tips.}

codecov · 2026-05-05T09:18:00Z

Codecov Report

✅ All modified and coverable lines are covered by tests.
✅ Project coverage is 72.40%. Comparing base (d483a69) to head (322576c).
⚠️ Report is 7 commits behind head on main.

Additional details and impacted files

@@            Coverage Diff             @@
##             main    #2707      +/-   ##
==========================================
- Coverage   72.48%   72.40%   -0.09%     
==========================================
  Files         107      107              
  Lines       11482    11479       -3     
==========================================
- Hits         8323     8311      -12     
- Misses       2659     2667       +8     
- Partials      500      501       +1

Flag	Coverage Δ
unittests	`72.40% <100.00%> (-0.09%)`	⬇️

Flags with carried forward coverage won't be shown. Click here to find out more.

Files with missing lines	Coverage Δ
...flowcollector/v1beta2/flowcollector_alert_types.go	`97.77% <100.00%> (+0.05%)`	⬆️
api/flowcollector/v1beta2/flowcollector_types.go	`100.00% <ø> (ø)`
internal/pkg/metrics/predefined_charts.go	`99.66% <100.00%> (+0.04%)`	⬆️
internal/pkg/metrics/predefined_metrics.go	`98.05% <100.00%> (-0.31%)`	⬇️
internal/pkg/test/util/utils.go	`100.00% <100.00%> (ø)`

... and 5 files with indirect coverage changes

🚀 New features to boost your workflow:

❄️ Test Analytics: Detect flaky tests, report on failures, and find test suite problems.

Also add otel mapping

Amoghrd · 2026-05-06T15:43:16Z

/ok-to-test

github-actions · 2026-05-06T15:47:35Z

New images:

quay.io/netobserv/network-observability-operator:cdee50e
quay.io/netobserv/network-observability-operator-bundle:v0.0.0-sha-cdee50e
quay.io/netobserv/network-observability-operator-catalog:v0.0.0-sha-cdee50e

They will expire in two weeks.

To deploy this build:

# Direct deployment, from operator repo
IMAGE=quay.io/netobserv/network-observability-operator:cdee50e make deploy

# Or using operator-sdk
operator-sdk run bundle quay.io/netobserv/network-observability-operator-bundle:v0.0.0-sha-cdee50e

Or as a Catalog Source:

apiVersion: operators.coreos.com/v1alpha1
kind: CatalogSource
metadata:
  name: netobserv-dev
  namespace: openshift-marketplace
spec:
  sourceType: grpc
  image: quay.io/netobserv/network-observability-operator-catalog:v0.0.0-sha-cdee50e
  displayName: NetObserv development catalog
  publisher: Me
  updateStrategy:
    registryPoll:
      interval: 1m

leandroberetta

lgtm

openshift-ci · 2026-05-07T11:12:08Z

New changes are detected. LGTM label has been removed.

github-actions · 2026-05-07T11:16:40Z

New images:

quay.io/netobserv/network-observability-operator:7583751
quay.io/netobserv/network-observability-operator-bundle:v0.0.0-sha-7583751
quay.io/netobserv/network-observability-operator-catalog:v0.0.0-sha-7583751

They will expire in two weeks.

To deploy this build:

# Direct deployment, from operator repo
IMAGE=quay.io/netobserv/network-observability-operator:7583751 make deploy

# Or using operator-sdk
operator-sdk run bundle quay.io/netobserv/network-observability-operator-bundle:v0.0.0-sha-7583751

Or as a Catalog Source:

apiVersion: operators.coreos.com/v1alpha1
kind: CatalogSource
metadata:
  name: netobserv-dev
  namespace: openshift-marketplace
spec:
  sourceType: grpc
  image: quay.io/netobserv/network-observability-operator-catalog:v0.0.0-sha-7583751
  displayName: NetObserv development catalog
  publisher: Me
  updateStrategy:
    registryPoll:
      interval: 1m

coderabbitai

Actionable comments posted: 2

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (2)

internal/pkg/dashboards/dashboard_test.go (1)
90-170: ⚠️ Potential issue | 🟡 Minor | ⚡ Quick win

Add an edge-case test for TLS chart generation.

The new TLS assertions cover happy-path titles only. Please add one test where TLS metrics are absent/empty and assert TLS panels are not generated (or are handled deterministically), so chart selection regressions are caught.

As per coding guidelines **/*_test.go: include tests for valid cases, invalid input, and edge cases (empty, nil values).
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@internal/pkg/dashboards/dashboard_test.go` around lines 90 - 170, The test
TestCreateFlowMetricsDashboard_DefaultList only asserts TLS happy-path titles;
add an edge-case unit test that simulates missing/empty TLS metrics by calling
CreateFlowMetricsDashboards with a metrics.Definitions set that has TLS metrics
removed (e.g., manipulate the result from
metrics.GetDefinitions(util.SpecForMetrics(), false) to clear TLS-related
entries) and then parse the dashboard JSON via FromBytes and assert that
d.FindRow("TLS") either returns nil/empty or that its Titles() is empty
(deterministic behavior), ensuring TLS panels are not generated when TLS metrics
are absent.
internal/pkg/metrics/predefined_metrics.go (1)
130-143: ⚠️ Potential issue | 🟡 Minor | ⚡ Quick win

Use one family tag for both drop counters.

ignoreTags matching is exact. The packets metric uses "drops", but the bytes metric uses "drop", so disabling the drop family will still leave *_drop_bytes_total enabled.
Suggested fix
-			tags: []string{group, tagBytes, "drop"},
+			tags: []string{group, tagBytes, "drops"},
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@internal/pkg/metrics/predefined_metrics.go` around lines 130 - 143, The bytes
drop metric added to predefinedMetrics uses an inconsistent family tag ("drop")
compared to the packets drop metric ("drops"), so ignoreTags won't match; update
the taggedMetricDefinition for the bytes metric (the entry constructing
metricslatest.FlowMetricSpec within predefinedMetrics) to use the same family
tag as the packets counter (make the tags slice contain "drops" instead of
"drop" or otherwise align both metrics to the same family tag) so disabling the
drop family via ignoreTags correctly disables both *_drop_bytes_total and
*_drops_* counters.

🤖 Prompt for all review comments with AI agents

Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@internal/pkg/metrics/predefined_charts.go`:
- Line 257: The TLS percentage denominator uses a hardcoded OR of
namespace/workload/node totals which yields the leftmost match in mixed-scope
environments; update tlsStatusChart() to accept an extra parameter (e.g.,
matchingTotalMetric string) and use that parameter in the PromQL denominator
instead of the OR expression, then update every call site that constructs TLS
metric charts to pass the appropriate matching total metric name
(namespace_flows_total, workload_flows_total or node_flows_total) so each
scope’s TLS numerator divides by its own total.

In `@internal/pkg/metrics/predefined_metrics.go`:
- Around line 24-26: The change removed the TLSVersion label from the shared
label slices (tagNodes, tagNamespaces, tagWorkloads) which silently mutates the
existing node_*, namespace_*, and workload_* metric schemas; revert or preserve
the legacy TLSVersion label in those slices to avoid breaking existing
PromQL/recording rules/dashboards, and if you need a TLS-specific breakdown
instead, add new metric names (or new dedicated label slices and metric
definitions) that include TLSVersion rather than removing it from
tagNodes/tagNamespaces/tagWorkloads so existing metrics remain backwards
compatible.

---

Outside diff comments:
In `@internal/pkg/dashboards/dashboard_test.go`:
- Around line 90-170: The test TestCreateFlowMetricsDashboard_DefaultList only
asserts TLS happy-path titles; add an edge-case unit test that simulates
missing/empty TLS metrics by calling CreateFlowMetricsDashboards with a
metrics.Definitions set that has TLS metrics removed (e.g., manipulate the
result from metrics.GetDefinitions(util.SpecForMetrics(), false) to clear
TLS-related entries) and then parse the dashboard JSON via FromBytes and assert
that d.FindRow("TLS") either returns nil/empty or that its Titles() is empty
(deterministic behavior), ensuring TLS panels are not generated when TLS metrics
are absent.

In `@internal/pkg/metrics/predefined_metrics.go`:
- Around line 130-143: The bytes drop metric added to predefinedMetrics uses an
inconsistent family tag ("drop") compared to the packets drop metric ("drops"),
so ignoreTags won't match; update the taggedMetricDefinition for the bytes
metric (the entry constructing metricslatest.FlowMetricSpec within
predefinedMetrics) to use the same family tag as the packets counter (make the
tags slice contain "drops" instead of "drop" or otherwise align both metrics to
the same family tag) so disabling the drop family via ignoreTags correctly
disables both *_drop_bytes_total and *_drops_* counters.

🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

Push a commit to this branch (recommended)
Create a new PR with the fixes

ℹ️ Review info

⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

Run ID: 56bc636b-716d-4e2b-b8b1-75834d8c073f

📥 Commits

Reviewing files that changed from the base of the PR and between d483a69 and 6327c8e.

📒 Files selected for processing (17)

.mk/development.mk
api/flowcollector/v1beta2/flowcollector_alert_types.go
api/flowcollector/v1beta2/flowcollector_defaults.go
api/flowcollector/v1beta2/flowcollector_types.go
bundle/manifests/flows.netobserv.io_flowcollectors.yaml
config/crd/bases/flows.netobserv.io_flowcollectors.yaml
docs/FlowCollector.md
helm/crds/flows.netobserv.io_flowcollectors.yaml
internal/controller/consoleplugin/config/static-frontend-config.yaml
internal/controller/flp/flp_pipeline_builder_test.go
internal/pkg/dashboards/dashboard_test.go
internal/pkg/helper/otel/otel-config.json
internal/pkg/helper/otel/otel_config_test.go
internal/pkg/metrics/predefined_charts.go
internal/pkg/metrics/predefined_metrics.go
internal/pkg/metrics/predefined_metrics_test.go
internal/pkg/test/util/utils.go

coderabbitai · 2026-05-07T11:22:44Z

+			Unit:          metricslatest.UnitPercent,
+			Queries: []metricslatest.Query{
+				{
+					PromQL: `sum(rate($METRIC[2m])) / (sum(rate(netobserv_namespace_flows_total[2m])) OR sum(rate(netobserv_workload_flows_total[2m])) OR sum(rate(netobserv_node_flows_total[2m])))`,


⚠️ Potential issue | 🟠 Major | 🏗️ Heavy lift

🧩 Analysis chain

🏁 Script executed:

#!/bin/bash # Verify tlsStatusChart call sites and check whether mixed totals can coexist. set -euo pipefail echo "== tlsStatusChart call sites ==" rg -n --type=go '\btlsStatusChart\s*$' echo echo "== default include lists containing flow totals ==" rg -n --type=go '"(namespace_flows_total|workload_flows_total|node_flows_total|namespace_tls_flows_total|workload_tls_flows_total|node_tls_flows_total)"'

Repository: netobserv/netobserv-operator

Length of output: 5095

🏁 Script executed:

#!/bin/bash # Examine the tlsStatusChart function and its call site context echo "== tlsStatusChart function definition (lines 240-290) ==" sed -n '240,290p' internal/pkg/metrics/predefined_charts.go echo echo "== Call site in predefined_metrics.go (lines 200-220) ==" sed -n '200,220p' internal/pkg/metrics/predefined_metrics.go

Repository: netobserv/netobserv-operator

Length of output: 2429

🏁 Script executed:

#!/bin/bash # Check how TLS metrics relate to flow metrics - are they paired? echo "== TLS metric definitions ==" rg -A 3 'tls.*_total' internal/pkg/metrics/predefined_metrics.go | head -30 echo echo "== How namespace/workload/node metrics are structured ==" rg -B 2 -A 5 'tlsStatusChart\($' internal/pkg/metrics/predefined_metrics.go

Repository: netobserv/netobserv-operator

Length of output: 661

🏁 Script executed:

#!/bin/bash # Find where $METRIC gets substituted/replaced in the code echo "== $METRIC substitution ==" rg -n '\$METRIC' internal/pkg/metrics/ echo echo "== Check if tests mix multiple metric scopes ==" rg -B 2 -A 2 'GetDefinitions.*node.*namespace.*workload' internal/pkg/metrics/predefined_metrics_test.go

Repository: netobserv/netobserv-operator

Length of output: 3225

🏁 Script executed:

#!/bin/bash # Look at how charts are used - does replacement happen? echo "== Chart Query usage and replacement ==" rg -n 'Query.*PromQL\|PromQL.*\$METRIC' internal/pkg/metrics/ | head -20 echo echo "== Monitor or dashboard code that processes these charts ==" rg -l 'Chart\|PromQL' internal/controller/monitoring/ internal/pkg/ | head -5

Repository: netobserv/netobserv-operator

Length of output: 173

🏁 Script executed:

#!/bin/bash # Search for where charts are processed - look in monitoring controller echo "== monitoring_controller.go - full scan for Chart handling ==" rg -n 'Chart|PromQL' internal/controller/monitoring/monitoring_controller.go | head -30 echo echo "== Search for $METRIC replacement in codebase ==" rg -n 'Replace.*\$METRIC\|strings.Replace.*METRIC' internal/ echo echo "== Look for Sprintf or formatting of PromQL ==" rg -n 'fmt.Sprintf.*PromQL\|Sprintf.*%' internal/pkg/metrics/ | head -20

Repository: netobserv/netobserv-operator

Length of output: 225

🏁 Script executed:

#!/bin/bash # Check if there's a dashboard/query builder that handles substitution echo "== Files that import or use predefined_charts ==" rg -l 'predefined_charts\|tlsStatusChart' internal/ echo echo "== Check FlowMetricSpec usage and how charts are stored ==" rg -B 3 -A 3 'Charts:.*\[\]' internal/pkg/metrics/predefined_metrics.go | head -40

Repository: netobserv/netobserv-operator

Length of output: 181

🏁 Script executed:

#!/bin/bash # Read the full tlsStatusChart function and surrounding context echo "== Full tlsStatusChart and context (lines 246-286) ==" sed -n '246,286p' internal/pkg/metrics/predefined_charts.go echo echo "== Examine how metrics are built and charts assigned (lines 195-225) ==" sed -n '195,225p' internal/pkg/metrics/predefined_metrics.go echo echo "== Check MetricName vs Chart relationship ==" sed -n '200,215p' internal/pkg/metrics/predefined_metrics.go | cat -A

Repository: netobserv/netobserv-operator

Length of output: 3433

🏁 Script executed:

#!/bin/bash # Look for where charts might be processed - check for any replacement/substitution echo "== All files importing FlowMetricSpec or Chart types ==" fd -e go -x grep -l 'FlowMetricSpec\|metricslatest\.Chart' {} \; 2>/dev/null | head -10 echo echo "== Search for where metric definitions are consumed ==" rg 'GetDefinitions\|predefinedMetrics' --type go -l

Repository: netobserv/netobserv-operator

Length of output: 636

🏁 Script executed:

#!/bin/bash # Check dashboard.go to see how charts are rendered/processed echo "== dashboard.go - search for METRIC substitution or chart processing ==" grep -n 'METRIC\|PromQL\|Chart' internal/pkg/dashboards/dashboard.go | head -40 echo echo "== helper.go - might have utilities for chart/metric processing ==" cat -n internal/pkg/metrics/helper.go | head -50

Repository: netobserv/netobserv-operator

Length of output: 1518

🏁 Script executed:

#!/bin/bash # Check if MetricName from FlowMetricSpec connects to chart rendering echo "== Examine dashboard.go structure ==" head -100 internal/pkg/dashboards/dashboard.go | tail -80 echo echo "== Search for where $METRIC or placeholders are replaced ==" rg -C 3 'strings\.Replace|Sprintf.*METRIC|METRIC.*Replace' internal/pkg/dashboards/

Repository: netobserv/netobserv-operator

Length of output: 4098

TLS traffic percentage will be incorrect in mixed-scope configurations.

The denominator at line 257 is hardcoded to (netobserv_namespace_flows_total OR netobserv_workload_flows_total OR netobserv_node_flows_total), but the numerator is dynamically substituted per metric. When multiple scopes exist, PromQL OR returns the leftmost match, causing the workload and node TLS charts to divide by namespace totals instead of their own scope.

Pass the matching flow metric name to tlsStatusChart() and use it in the denominator instead of hardcoded OR logic:

Proposed direction

-func tlsStatusChart() []metricslatest.Chart { +func tlsStatusChart(totalFlowsMetric string) []metricslatest.Chart { ... - PromQL: `sum(rate($METRIC[2m])) / (sum(rate(netobserv_namespace_flows_total[2m])) OR sum(rate(netobserv_workload_flows_total[2m])) OR sum(rate(netobserv_node_flows_total[2m])))`, + PromQL: fmt.Sprintf(`sum(rate($METRIC[2m])) / sum(rate(%s[2m]))`, totalFlowsMetric), ... }

Then pass the matching total metric (namespace_flows_total, workload_flows_total, or node_flows_total) from each TLS metric definition call site.

🤖 Prompt for AI Agents

Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@internal/pkg/metrics/predefined_charts.go` at line 257, The TLS percentage denominator uses a hardcoded OR of namespace/workload/node totals which yields the leftmost match in mixed-scope environments; update tlsStatusChart() to accept an extra parameter (e.g., matchingTotalMetric string) and use that parameter in the PromQL denominator instead of the OR expression, then update every call site that constructs TLS metric charts to pass the appropriate matching total metric name (namespace_flows_total, workload_flows_total or node_flows_total) so each scope’s TLS numerator divides by its own total.

the namespace/workload/node variants differ by their aggregation keys but they should all be identical when not aggregated by key. That's what we do here: sum(...) and not sum by (...)(...). Their sum is expected to be equal.

@jotak, you're right. Since the query uses sum(...) with no by (...) clause, the result collapses to a global scalar — and all three metrics represent the same underlying flows just with different label dimensions, so their fully-aggregated sums are equal. The OR is just a safety net for whichever metric happens to be present. My concern was mistaken.

The metrics tagging system is a leftover from old API, and is dead code since that API was removed.

coderabbitai

Actionable comments posted: 1

🤖 Prompt for all review comments with AI agents

Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@internal/pkg/metrics/predefined_metrics.go`:
- Around line 162-176: The tlsStatusChart helper is missing the group argument
so all TLS metrics use identical chart variants; update the loop that builds
FlowMetricSpec (using group, groupTrimmed, labels, tlsLabels, predefinedMetrics)
to call tlsStatusChart(group) instead of tlsStatusChart() so chartVariantsFor
can generate group-specific variants for tagNodes/tagNamespaces/tagWorkloads
(ensuring namespace/workload "infra"/"app" and node-specific configs are
produced).

🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

Push a commit to this branch (recommended)
Create a new PR with the fixes

ℹ️ Review info

⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

Run ID: dc3fb3b9-5707-4b36-88be-3fc21f6d257f

📥 Commits

Reviewing files that changed from the base of the PR and between 6327c8e and 322576c.

📒 Files selected for processing (2)

internal/pkg/metrics/predefined_metrics.go
internal/pkg/metrics/predefined_metrics_test.go

💤 Files with no reviewable changes (1)

internal/pkg/metrics/predefined_metrics_test.go

coderabbitai · 2026-05-07T12:40:42Z

+	// TLS
+	for _, group := range []string{tagNodes, tagNamespaces, tagWorkloads} {
+		groupTrimmed := strings.TrimSuffix(group, "s")
+		labels := mapLabels[group]
+		tlsLabels := labels
+		tlsLabels = append(tlsLabels, "TLSVersion", "TLSCipherSuite", "TLSGroup")
+		predefinedMetrics = append(predefinedMetrics, metricslatest.FlowMetricSpec{
+			MetricName: fmt.Sprintf("%s_tls_flows_total", groupTrimmed),
+			Type:       metricslatest.CounterMetric,
+			Help:       fmt.Sprintf("Total TLS flows per %s", groupTrimmed),
+			Filters:    []metricslatest.MetricFilter{{Field: "TLSTypes", MatchType: metricslatest.MatchPresence}},
+			Labels:     tlsLabels,
+			Charts:     tlsStatusChart(),
+		})
+	}


⚠️ Potential issue | 🟠 Major | ⚡ Quick win

🧩 Analysis chain

🏁 Script executed:

#!/bin/bash # Find the tlsStatusChart and ipsecStatusChart definitions and compare signatures rg -n "func tlsStatusChart\|func ipsecStatusChart" --type go -A 5

Repository: netobserv/netobserv-operator

Length of output: 54

🏁 Script executed:

#!/bin/bash # Try broader search patterns for chart functions echo "=== Searching for tlsStatusChart definition ===" rg -n "tlsStatusChart" --type go -A 3 -B 1 echo "" echo "=== Searching for ipsecStatusChart definition ===" rg -n "ipsecStatusChart" --type go -A 3 -B 1 echo "" echo "=== Searching for other chart helpers ===" rg -n "rttCharts|dropCharts|dnsCharts|netpolCharts" --type go -A 2 -B 1

Repository: netobserv/netobserv-operator

Length of output: 4157

🏁 Script executed:

#!/bin/bash # Check the tlsStatusChart implementation echo "=== tlsStatusChart implementation ===" sed -n '246,280p' internal/pkg/metrics/predefined_charts.go echo "" echo "=== ipsecStatusChart implementation (for comparison) ===" sed -n '218,245p' internal/pkg/metrics/predefined_charts.go

Repository: netobserv/netobserv-operator

Length of output: 2181

🏁 Script executed:

#!/bin/bash # Check what chartVariantsFor does and how it uses the group parameter echo "=== Search for chartVariantsFor definition ===" rg -n "func chartVariantsFor" --type go -A 10

Repository: netobserv/netobserv-operator

Length of output: 1182

🏁 Script executed:

#!/bin/bash # Get the full chartVariantsFor function sed -n '288,330p' internal/pkg/metrics/predefined_charts.go

Repository: netobserv/netobserv-operator

Length of output: 1666

Pass group to tlsStatusChart() to generate proper chart variants for each metric type.

tlsStatusChart() is the only chart helper lacking a group argument. All others—rttCharts(group), dropCharts(group, ...), dnsCharts(group), netpolCharts(group), ipsecStatusChart(group)—receive it so chartVariantsFor() can generate group-specific variants (namespace/workload "infra" and "app" layers, node-specific configs). Without it, all three group types get identical charts, breaking dashboard configurations for namespace and workload metrics.

Diff

- Charts: tlsStatusChart(), + Charts: tlsStatusChart(group),

🤖 Prompt for AI Agents

Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@internal/pkg/metrics/predefined_metrics.go` around lines 162 - 176, The tlsStatusChart helper is missing the group argument so all TLS metrics use identical chart variants; update the loop that builds FlowMetricSpec (using group, groupTrimmed, labels, tlsLabels, predefinedMetrics) to call tlsStatusChart(group) instead of tlsStatusChart() so chartVariantsFor can generate group-specific variants for tagNodes/tagNamespaces/tagWorkloads (ensuring namespace/workload "infra"/"app" and node-specific configs are produced).

not doing so because the charts are not displayed per group in this case (yes, it's a new pattern)
Charts are deduplicated in createFlowMetricsDashboard so it won't create multiple instances of the same

@jotak Got it — the TLS charts are intentionally group-agnostic, relying on deduplication in createFlowMetricsDashboard rather than per-group variants. Makes sense as a new pattern.

( ˙ ɥ ˙ ) ~ learning noted

✏️ Learnings added

Learnt from: jotak Repo: netobserv/netobserv-operator PR: 2707 File: internal/pkg/metrics/predefined_metrics.go:162-176 Timestamp: 2026-05-07T12:45:12.308Z Learning: In `internal/pkg/metrics/predefined_metrics.go` (netobserv/netobserv-operator), `tlsStatusChart()` intentionally takes no `group` argument, unlike other chart helpers (e.g. `ipsecStatusChart(group)`, `rttCharts(group)`). TLS charts are not displayed per group; deduplication in `createFlowMetricsDashboard` prevents duplicate chart instances, so passing `group` is unnecessary and undesirable for this metric type.

Note: Learnings are effective only in the context of similar code segments. To apply general coding guidelines, please configure review instructions. You can manage existing learnings in the app.

github-actions · 2026-05-07T12:51:06Z

New images:

quay.io/netobserv/network-observability-operator:cfbe85d
quay.io/netobserv/network-observability-operator-bundle:v0.0.0-sha-cfbe85d
quay.io/netobserv/network-observability-operator-catalog:v0.0.0-sha-cfbe85d

They will expire in two weeks.

To deploy this build:

# Direct deployment, from operator repo
IMAGE=quay.io/netobserv/network-observability-operator:cfbe85d make deploy

# Or using operator-sdk
operator-sdk run bundle quay.io/netobserv/network-observability-operator-bundle:v0.0.0-sha-cfbe85d

Or as a Catalog Source:

apiVersion: operators.coreos.com/v1alpha1
kind: CatalogSource
metadata:
  name: netobserv-dev
  namespace: openshift-marketplace
spec:
  sourceType: grpc
  image: quay.io/netobserv/network-observability-operator-catalog:v0.0.0-sha-cfbe85d
  displayName: NetObserv development catalog
  publisher: Me
  updateStrategy:
    registryPoll:
      interval: 1m

NETOBSERV-2608: TLS metrics

bea700e

Create specific metrics for TLS, counting flow, labelled by: - TLSVersion - TLSCipherSuite - TLSGroup - TLSTypes - Proto

openshift-ci Bot added the do-not-merge/work-in-progress label May 5, 2026

openshift-ci-robot added the jira/valid-reference label May 5, 2026

jotak mentioned this pull request May 5, 2026

Rename TLSCurve => TLSGroup netobserv/netobserv-ebpf-agent#968

Merged

6 tasks

refresh bundle, fix tests

72ca819

jotak mentioned this pull request May 5, 2026

NETOBSERV-2608: TLS, metrics and charts [pf5] netobserv/netobserv-web-console#1478

Open

6 tasks

jotak added 3 commits May 6, 2026 08:54

add tls metrics by default

1029db7

Remove TLSTypes label from metrics, but use it as a filter instead

cce2d1d

Add monitoring charts for TLS

e768dee

Also add otel mapping

jotak marked this pull request as ready for review May 6, 2026 15:11

openshift-ci Bot removed the do-not-merge/work-in-progress label May 6, 2026

jotak requested a review from leandroberetta May 6, 2026 15:11

openshift-ci Bot added the ok-to-test To set manually when a PR is safe to test. Triggers image build on PR. label May 6, 2026

leandroberetta previously approved these changes May 7, 2026

View reviewed changes

openshift-ci Bot assigned leandroberetta May 7, 2026

openshift-ci Bot added the lgtm label May 7, 2026

Use consistant rate interval in TLS dashboard

6327c8e

jotak dismissed leandroberetta’s stale review via 6327c8e May 7, 2026 11:12

openshift-ci Bot removed the lgtm label May 7, 2026

github-actions Bot removed the ok-to-test To set manually when a PR is safe to test. Triggers image build on PR. label May 7, 2026

jotak added the ok-to-test To set manually when a PR is safe to test. Triggers image build on PR. label May 7, 2026

coderabbitai Bot reviewed May 7, 2026

View reviewed changes

Remove old dead code / leftover from old API

322576c

The metrics tagging system is a leftover from old API, and is dead code since that API was removed.

github-actions Bot removed the ok-to-test To set manually when a PR is safe to test. Triggers image build on PR. label May 7, 2026

coderabbitai Bot reviewed May 7, 2026

View reviewed changes

jotak added the ok-to-test To set manually when a PR is safe to test. Triggers image build on PR. label May 7, 2026

Amoghrd added ok-to-test To set manually when a PR is safe to test. Triggers image build on PR. and removed ok-to-test To set manually when a PR is safe to test. Triggers image build on PR. labels May 7, 2026

Conversation

jotak commented May 5, 2026 • edited by coderabbitai Bot Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Description

Dependencies

Checklist

Summary by CodeRabbit

Uh oh!

openshift-ci Bot commented May 5, 2026

Uh oh!

openshift-ci-robot commented May 5, 2026 • edited by openshift-ci Bot Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Description

Dependencies

Checklist

Uh oh!

openshift-ci Bot commented May 5, 2026

Uh oh!

coderabbitai Bot commented May 5, 2026 • edited by openshift-ci Bot Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Walkthrough

Changes

Estimated code review effort

Possibly related PRs

❌ Failed checks (1 warning)

Uh oh!

codecov Bot commented May 5, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Codecov Report

Uh oh!

Amoghrd commented May 6, 2026

Uh oh!

github-actions Bot commented May 6, 2026

Uh oh!

leandroberetta left a comment

Choose a reason for hiding this comment

Uh oh!

openshift-ci Bot commented May 7, 2026

Uh oh!

github-actions Bot commented May 7, 2026

Uh oh!

coderabbitai Bot left a comment

Choose a reason for hiding this comment

Uh oh!

coderabbitai Bot May 7, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Choose a reason for hiding this comment

Uh oh!

jotak May 7, 2026

Choose a reason for hiding this comment

Uh oh!

coderabbitai Bot May 7, 2026

Choose a reason for hiding this comment

Uh oh!

Uh oh!

coderabbitai Bot left a comment

Choose a reason for hiding this comment

Uh oh!

coderabbitai Bot May 7, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Choose a reason for hiding this comment

Uh oh!

jotak May 7, 2026

Choose a reason for hiding this comment

Uh oh!

coderabbitai Bot May 7, 2026

Choose a reason for hiding this comment

Uh oh!

github-actions Bot commented May 7, 2026

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

4 participants

jotak commented May 5, 2026 •

edited by coderabbitai Bot

Loading

openshift-ci-robot commented May 5, 2026 •

edited by openshift-ci Bot

Loading

coderabbitai Bot commented May 5, 2026 •

edited by openshift-ci Bot

Loading

codecov Bot commented May 5, 2026 •

edited

Loading

coderabbitai Bot May 7, 2026 •

edited

Loading

coderabbitai Bot May 7, 2026 •

edited

Loading