
libpcp_web, pmproxy: support for optional labels#2591

Open
lmchilton wants to merge 1 commit into performancecopilot:main from lmchilton:pmproxy-optional-labels

Conversation

@lmchilton
Contributor

@lmchilton lmchilton commented May 13, 2026

Added support for optional labels to be ingested into a key-value server by pmproxy. SHA1 hash calculations are not affected.

libpcp_web code additions include a new buffer that separates hash calculation from display calculation, and a new context hash function that distinguishes the context hash over identifying labels from the context hash over identifying plus optional labels.

Added qa test 1744 to test optional label ingestion and display.

  • Added support for optional labels in pmproxy/libpcp_web to be ingested into key-value servers
  • Separated hash calculation from display calculation in libpcp_web to allow optional labels in display while maintaining consistent SHA1 hashes for identity
  • Implemented context hash function distinguishing between identifying labels and identifying plus optional labels
  • Added QA test 1744 to validate optional label ingestion and display with OpenMetrics
  • Added OpenMetrics sample test data with labeled metrics for QA testing
  • Improved key server version detection in QA utilities for better redis/valkey compatibility

Review Change Stack

Added support for optional labels to be ingested into a key-value
server by pmproxy. SHA1 hash calculations are not affected.

Added qa test 1744 to test optional label ingestion and display.
@coderabbitai

coderabbitai Bot commented May 13, 2026

📝 Walkthrough

Walkthrough

This PR introduces optional label support for pmseries metrics by refactoring label handling in libpcp_web, fixing a key server detection bug in QA infrastructure, and adding a comprehensive QA test to validate the label storage behavior.

Changes

Label Storage and QA Validation

  • Label caching and hashing refactoring (src/libpcp_web/src/util.c): Adds an all_labels() callback to include optional labels in cached context->labels, metric->labels, indom->labels, and instance->labels, while using a separate identifying-label JSON for SHA1 hashing inputs in pmwebapi_context_hash(), pmwebapi_metric_hash(), and pmwebapi_instance_hash().
  • QA infrastructure key server detection fix (qa/common.check, qa/group): Updates _check_key_server_version() to use _get_pids_by_name for both valkey-server and redis-server detection, ensuring correct CLI tool selection when either server is running; adds a group list entry.
  • PCP QA Test 1744 for label storage validation (qa/1744, qa/openmetrics/samples/pmseries_label_test.txt): Implements a comprehensive test that configures a temporary pmseries setup with the OpenMetrics PMDA, captures labeled metric data via pmlogger, loads the archive into pmseries, and validates label storage by querying two metrics (test_metric1 and test_metric2) with identifying and optional labels through filtering.

🎯 3 (Moderate) | ⏱️ ~25 minutes

🐰 Optional labels now cache with pride,
While hashes use identifying identities,
Our QA test rides the pmseries tide,
Validating label fidelities! 🏷️

🚥 Pre-merge checks | ✅ 4 | ❌ 1

❌ Failed checks (1 warning)

  • Docstring Coverage — ⚠️ Warning — Docstring coverage is 20.00%, below the required threshold of 80.00%. Resolution: write docstrings for the functions missing them to satisfy the coverage threshold.
✅ Passed checks (4 passed)
  • Description Check — ✅ Passed — Check skipped: CodeRabbit's high-level summary is enabled.
  • Title Check — ✅ Passed — The PR title clearly and concisely summarizes the main change: adding support for optional labels to libpcp_web and pmproxy, which aligns with the core objective and all major file changes.
  • Linked Issues Check — ✅ Passed — Check skipped because no linked issues were found for this pull request.
  • Out of Scope Changes Check — ✅ Passed — Check skipped because no linked issues were found for this pull request.





@coderabbitai coderabbitai Bot left a comment


Actionable comments posted: 4

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)
qa/openmetrics/samples/pmseries_label_test.txt (1)

1-9: ⚠️ Potential issue | 🟡 Minor | ⚡ Quick win

Change # Type to # TYPE per the OpenMetrics/Prometheus exposition spec.

The PMDA parser uses case-sensitive matching for the TYPE directive (line 798 of pmdaopenmetrics.python: elif lp[1] == 'TYPE':). Using # Type instead of # TYPE causes the parser to silently ignore the type metadata, leaving the metric without a declared type. This impacts proper type handling in the archive.

Suggested fix
-# Type test_metric1 gauge
+# TYPE test_metric1 gauge
@@
-# Type test_metric2 gauge
+# TYPE test_metric2 gauge
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@qa/openmetrics/samples/pmseries_label_test.txt` around lines 1 - 9, The
sample exposition uses lowercase "# Type" which the PMDA parser's case-sensitive
check (see the token handling that inspects lp[1] == 'TYPE' in
pmdaopenmetrics.python) will ignore; update the sample lines in
pmseries_label_test.txt to use uppercase "# TYPE" for each metric definition
(e.g., change "# Type test_metric1 gauge" and "# Type test_metric2 gauge" to "#
TYPE test_metric1 gauge" and "# TYPE test_metric2 gauge") so the parser
correctly recognizes and applies metric type metadata.
🧹 Nitpick comments (4)
src/libpcp_web/src/util.c (2)

460-474: 💤 Low value

idlabels is recomputed on every call, even when context->labels is cached.

Previously the pmMergeLabelSets/sds construction was guarded by context->labels == NULL. Now context_labels_str_identifying() runs unconditionally on every invocation of pmwebapi_context_hash(). This is fine functionally (the SHA1 input is identical to the prior identifying-only JSON), but it does mean repeated label-set merges if this is ever called more than once per context. If callers may invoke this on a hot path, consider caching the identifying-label digest input alongside context->labels, or hoisting the computation to be guarded by an identity-cache flag.

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@src/libpcp_web/src/util.c` around lines 460 - 474, pmwebapi_context_hash
currently always calls context_labels_str_identifying() causing repeated merges;
change it to compute and cache the identifying-label string only when it is
first needed (same condition used for caching context->labels) by adding a
cached sds (e.g. context->idlabels or similar) or an identity-cache flag on the
context struct, populate that cache inside the block where context->labels is
set (or guarded by context->labels == NULL), and then pass the cached
identifying string and length to pmwebapi_source_hash instead of recomputing
each call; ensure the new cache is allocated with sdsnewlen and cleaned up where
context->labels is freed.

487-513: 💤 Low value

Dead len assignment on the idbuf fallback path.

idbuf is consumed via %s in sdscatfmt (null-terminated), so the len = pmsprintf(idbuf, sizeof(idbuf), "null") reassignment is never read. Either drop the assignment or rename the second variable to keep the intent obvious that len only tracks the cached metric->labels length. Also note that the same pattern is duplicated in pmwebapi_instance_hash() (lines 548-550) — extracting a small helper that returns an "identifying labels" C-string would remove the duplication across context_hash, metric_hash, and instance_hash.

♻️ Sketch of minimal cleanup
-    len = metric_labelsets(metric, idbuf, sizeof(idbuf), labels, NULL);
-	if (len <= 0)
-	    len = pmsprintf(idbuf, sizeof(idbuf), "null");
+    if (metric_labelsets(metric, idbuf, sizeof(idbuf), labels, NULL) <= 0)
+	(void)pmsprintf(idbuf, sizeof(idbuf), "null");
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@src/libpcp_web/src/util.c` around lines 487 - 513, The second pmsprintf
assigns to the same local `len` but that value is never used (idbuf is consumed
as a C-string), so either remove the redundant `len = pmsprintf(idbuf, ...)` or
use a separate variable name to make the intent clear; better yet, extract a
small helper (e.g. get_identifying_labels or similar) that encapsulates the
logic around `metric_labelsets(...)` / fallback `pmsprintf("null")` and returns
the C-string to use in sdscatfmt, then call that helper from
context_hash/metric_hash/instance_hash (and from pmwebapi_instance_hash) to
eliminate duplication and avoid clobbering `len`.
qa/1744 (2)

27-161: 💤 Low value

Inconsistent use of $seq.full vs $seq_full.

The script mixes $here/$seq.full (lines 27-35, 82, 143, 144, 147, 150) with the $seq_full shell variable (lines 133, 161, 190-193). Most other PCP QA tests pick one form consistently. If $seq_full ends up unset (depending on which common helper sources it), the tee -a $seq_full on line 161 will tee to a file literally named "" and the cat >>$seq_full lines will silently lose output. Pick one form to keep the test's debug log path predictable.

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@qa/1744` around lines 27 - 161, The test mixes two log variables ($seq.full
and $seq_full) causing silent loss of output; pick one name and replace all
occurrences for consistency (I suggest using $seq_full since other tests use it)
— update the echo/redirects that write to "$here/$seq.full" (and any uses of
$seq.full) to use "$here/$seq_full" and change the tee and any concatenations
that reference $seq_full if you instead choose $seq.full; ensure all references
in the key-server shutdown block, pmlogger start/stop messages, archive
listings, flush/verify logs, and the pmseries load tee (symbols: _cleanup,
_prepare_pmda, _pmdaopenmetrics_install, pmloggerpid, flush_result, pmseries
invocation) are updated so every write uses the same variable.

124-140: ⚡ Quick win

Race: archive load may run before pmlogger has actually flushed.

pmsleep 3 then kill -TERM then pmsleep 1 is a fragile way to ensure the archive is complete. kill -TERM is asynchronous and pmlogger needs time to flush its final volume — on slow/loaded CI, this can yield an incomplete archive and the later pmseries --load will then store fewer/no labels, causing intermittent ERROR: No series found for test_metricN failures rather than reproducible diffs. Consider $sudo $signal -s $pmloggerpid; wait $pmloggerpid (or use pmlogger -s 3 for a fixed-sample run instead of background + SIGTERM).

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@qa/1744` around lines 124 - 140, The test currently starts pmlogger in
background (pmlogger, pmloggerpid) and uses pmsleep + kill -TERM + pmsleep which
can cause a race where pmlogger hasn't flushed before pmseries --load; replace
the async stop with a synchronous shutdown: after starting pmlogger (the
pmlogger/pmloggerpid code) send the termination signal and then wait for
pmlogger to exit (use wait on pmloggerpid) OR alternatively run pmlogger in
fixed-sample mode (pmlogger -s 3) instead of background + SIGTERM so the archive
is guaranteed complete before continuing; update the stop sequence that
references kill -TERM, pmsleep, and pmloggerpid accordingly.

ℹ️ Review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro Plus

Run ID: ff2ebaba-8485-46fa-a894-b5aff961bcb3

📥 Commits

Reviewing files that changed from the base of the PR and between 6aa5eb1 and 10c68e2.

⛔ Files ignored due to path filters (1)
  • qa/1744.out is excluded by !**/*.out
📒 Files selected for processing (5)
  • qa/1744
  • qa/common.check
  • qa/group
  • qa/openmetrics/samples/pmseries_label_test.txt
  • src/libpcp_web/src/util.c

Comment thread qa/1744

_pmdaopenmetrics_check || _notrun "openmetrics pmda not installed"

test -x $PCP_BINADM_DIR/pmseries_import || _notrun "No pmseries_import script"


⚠️ Potential issue | 🟡 Minor | ⚡ Quick win

pmseries_import is checked here but never used.

The script loads archives directly via pmseries $options --load "{source.path: ...}" (line 160) and never invokes pmseries_import. This _notrun gate will spuriously skip the test on systems that ship pmseries but not pmseries_import (or where the script is renamed in a future packaging change), without any actual dependency on that tool. Drop the check, or replace it with whatever the test actually depends on (e.g., pmseries itself).

🧹 Suggested fix
-test -x $PCP_BINADM_DIR/pmseries_import || _notrun "No pmseries_import script"
-_check_series
+_check_series
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@qa/1744` at line 17, The test currently gates execution on presence of
pmseries_import (checked via the test invoking pmseries_import and calling
_notrun), but the test never calls pmseries_import; instead it uses pmseries to
load archives, so remove the pmseries_import existence check and either drop
that _notrun gate entirely or replace it with a check for pmseries (the binary
actually used by the test). Update the check that calls _notrun to reference
pmseries (or remove it) and keep the rest of the test logic unchanged.

Comment thread qa/1744
Comment on lines +22 to +43
_cleanup()
{
    cd $here
    [ -n "$pmloggerpid" ] && $sudo kill -TERM $pmloggerpid 2>/dev/null
    if [ -n "$key_server_port" ]; then
        echo "Shutting down key server on port $key_server_port..." >>$here/$seq.full
        $keys_cli -p $key_server_port shutdown >>$here/$seq.full 2>&1
        sleep 1
        # Verify it's actually dead
        if $keys_cli -p $key_server_port PING >>$here/$seq.full 2>&1; then
            echo "WARNING: Key server still running after shutdown!" >>$here/$seq.full
        else
            echo "Key server shutdown confirmed" >>$here/$seq.full
        fi
    fi
    $sudo rm -rf $PCP_ETC_DIR/pcp/labels/*
    _restore_config $PCP_ETC_DIR/pcp/labels
    _sighup_pmcd
    _pmdaopenmetrics_cleanup
    _restore_config $PCP_SYSCONF_DIR/pmseries
    $sudo rm -rf $tmp $tmp.*
}

⚠️ Potential issue | 🟡 Minor | ⚡ Quick win

Cleanup deletes $PCP_ETC_DIR/pcp/labels/* even if _save_config failed.

Line 37 unconditionally rm -rfs everything under $PCP_ETC_DIR/pcp/labels/ before line 38 attempts _restore_config. If _save_config on line 91 was never reached (e.g., earlier _notrun / failure), the cleanup will still wipe a system directory it never backed up. Guard the destructive rm with a flag set after _save_config succeeds, or move the rm inside the same conditional that runs the test body.

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@qa/1744` around lines 22 - 43, The cleanup function unconditionally removes
$PCP_ETC_DIR/pcp/labels/* which can delete real system files if _save_config
never ran; modify the test so that _save_config sets a guard flag (e.g.,
labels_saved=1) after it successfully completes (reference the _save_config call
and the _restore_config function) and change _cleanup to only perform the
destructive "$sudo rm -rf $PCP_ETC_DIR/pcp/labels/*" when that guard is set (or
alternatively move the rm into the same conditional block that runs the test
body), ensuring _restore_config is only relied on when a prior backup was made.

Comment thread qa/1744
Comment on lines +102 to +108
# add a bunch of label filters to exercise various filtering options
cat <<EOF >>$tmp.tmp

FILTER: OPTIONAL LABEL some_optional_label

EOF
$sudo cp $tmp.tmp $PCP_PMDAS_DIR/openmetrics/config.d/$urlbase.url

⚠️ Potential issue | 🟠 Major | ⚡ Quick win

🧩 Analysis chain

🏁 Script executed:

#!/bin/bash
# Locate openmetrics PMDA FILTER parsing to confirm whether the argument is a label name, value, or regex.
fd -t f 'pmda(openmetrics|.*\.python)' src/pmdas/openmetrics 2>/dev/null
rg -nP -C5 '\bFILTER\s*:?\s*OPTIONAL\b' --type=py --type=sh --type=md
rg -nP -C5 'OPTIONAL\s+LABEL' 
rg -nP -C3 'PM_LABEL_OPTIONAL' --type=c --type=py

Repository: performancecopilot/pcp

Length of output: 20928


🏁 Script executed:

# Find and examine the label test file to confirm label name and value
find . -name "*pmseries_label_test*" -o -name "*label_test*" | head -20

Repository: performancecopilot/pcp

Length of output: 115


🏁 Script executed:

# Search for the actual label name and value in openmetrics samples
rg -n "test_metric2|optional_label" --type=txt --type=py src/pmdas/openmetrics/ qa/

Repository: performancecopilot/pcp

Length of output: 447


🏁 Script executed:

# Look for the filter parsing code in the openmetrics PMDA
fd -t f "openmetrics" src/pmdas/openmetrics/ | grep -E "\.(py|c)$"

Repository: performancecopilot/pcp

Length of output: 48


🏁 Script executed:

# Find the openmetrics PMDA Python implementation
find . -path "*/pmdas/openmetrics*.python" -type f

Repository: performancecopilot/pcp

Length of output: 113


🏁 Script executed:

# Search for the OPTIONAL LABEL filter parsing in Python code
rg -n "OPTIONAL|LABEL" src/pmdas/openmetrics/pmdaopenmetrics.python -A 5 -B 5

Repository: performancecopilot/pcp

Length of output: 6815


🏁 Script executed:

# Check the entire config parsing section
rg -n "FILTER" src/pmdas/openmetrics/pmdaopenmetrics.python -A 3 -B 3 | head -100

Repository: performancecopilot/pcp

Length of output: 1636


Fix filter to use label name instead of value: change some_optional_label to optional_label

The filter incorrectly uses the label value instead of the label name. The label is defined as optional_label="some_optional_label", where optional_label is the name and some_optional_label is the value. The PMDA filter code (pmdaopenmetrics.python line 632) matches against label names, not values, so the filter FILTER: OPTIONAL LABEL some_optional_label will never match. It should be FILTER: OPTIONAL LABEL optional_label.

Note: The test files contain a comment stating "*** NOTE: optional labels are not implemented yet" — confirm whether this feature is active before merging.

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@qa/1744` around lines 102 - 108, The FILTER line in the test config uses the
label value instead of the label name so pmdaopenmetrics (pmdaopenmetrics.python
matching logic) will never match; update the here-doc content that adds "FILTER:
OPTIONAL LABEL some_optional_label" to use the label name "optional_label"
instead (i.e., replace some_optional_label with optional_label in the cat <<EOF
block that writes $tmp.tmp) and verify the optional-label feature is enabled
(the test comment "*** NOTE: optional labels are not implemented yet") before
merging so the test will exercise the behavior.

Comment thread qa/1744
Comment on lines +170 to +187
series1=`pmseries $options 'openmetrics.pmseries_label_test.test_metric1'`
if [ -z "$series1" ]; then
echo "ERROR: No series found for test_metric1"
status=1
else
pmseries $series1 | _filter_series | _filter
fi

# Query for the second metric (with optional label)
echo
echo "=== test_metric2 (with optional label) ==="
series2=`pmseries $options 'openmetrics.pmseries_label_test.test_metric2'`
if [ -z "$series2" ]; then
echo "ERROR: No series found for test_metric2"
status=1
else
pmseries $series2 | _filter_series | _filter
fi

⚠️ Potential issue | 🔴 Critical | ⚡ Quick win

Test sets status=1 on missing series but continues to next check.

If series1 is empty on line 171, the test prints ERROR: and sets status=1 but falls through to query test_metric2. That's fine for debugging, but the trailing status=0 on line 198 unconditionally clears the failure flag, so a missed-series error on test_metric1/test_metric2 is silently masked at exit time.

🐛 Suggested fix
-# success, all done
-status=0
-exit
+# success, all done — but only if no earlier ERROR set status=1
+[ "$status" = 1 ] || status=0
+exit

Or simpler: track failures with a separate flag and only clear it at the end if no failure occurred.

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@qa/1744` around lines 170 - 187, The script sets status=1 when a series is
missing (in the checks around series1 and series2) but later unconditionally
resets status=0, masking failures; fix by preserving failure state instead of
clearing it: remove the unconditional status=0 reset (or introduce a separate
boolean like had_failure that you set to true on any "ERROR" and only set
status=0 at the very end if had_failure is false), ensure the checks in the
pmseries blocks (series1, series2) only set status=1/had_failure and never
overwrite it with success later so the exit value reflects any missing-series
errors.

@lmchilton lmchilton requested review from natoscott and wcohen May 14, 2026 14:52
@natoscott
Member

@lmchilton please go through all the coderabbit reported issues and resolve/comment on each, then I'll take a look (so I don't double up on anything already reported). Thanks!

