feat: add pidstat post-processing for per-process CPU metrics #54

Merged: k-rister merged 1 commit into master from add-pidstat-post-processing on Mar 30, 2026

@k-rister
Contributor

Summary

Add pidstat parsing to sysstat-post-process that produces per-process CPU metrics following the same Busy-CPU/NonBusy-CPU pattern used by mpstat:

  • Metrics: %usr, %system, %guest → Busy-CPU; %wait → NonBusy-CPU
  • Names: cmd (process name), pid (process ID), type (usr/system/guest/wait)
  • Zero filtering: Two-pass approach — first pass identifies PIDs with any non-zero activity, second pass only processes those PIDs. Configurable via $skip_zero_pids variable. Filters ~90% of idle process data while preserving time-series continuity.
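
The field mapping and two-pass zero filtering listed above can be sketched roughly as follows. This is a Python illustration only, not the code in sysstat-post-process: the sample tuple layout, the function names `active_pids` and `build_metrics`, and the `skip_zero_pids` keyword are assumptions made for the example.

```python
from collections import defaultdict

# Field -> metric type mapping taken from the summary above.
FIELD_TO_METRIC = {
    "usr": "Busy-CPU",
    "system": "Busy-CPU",
    "guest": "Busy-CPU",
    "wait": "NonBusy-CPU",
}


def active_pids(samples):
    """First pass: collect PIDs that report any non-zero value."""
    active = set()
    for _ts, pid, _cmd, fields in samples:
        if any(fields.get(name, 0.0) != 0.0 for name in FIELD_TO_METRIC):
            active.add(pid)
    return active


def build_metrics(samples, skip_zero_pids=True):
    """Second pass: emit every sample (zeros included) for the kept PIDs."""
    keep = active_pids(samples) if skip_zero_pids else None
    metrics = defaultdict(list)
    for ts, pid, cmd, fields in samples:
        if keep is not None and pid not in keep:
            continue  # this PID never showed activity, drop it entirely
        for name, metric_type in FIELD_TO_METRIC.items():
            key = (metric_type, cmd, pid, name)
            metrics[key].append((ts, fields.get(name, 0.0)))
    return dict(metrics)


# Hypothetical samples: (timestamp, pid, cmd, {field: percent})
SAMPLES = [
    (1, 100, "fio", {"usr": 50.0, "system": 47.0, "guest": 0.0, "wait": 0.0}),
    (1, 200, "sleep", {"usr": 0.0, "system": 0.0, "guest": 0.0, "wait": 0.0}),
    (2, 100, "fio", {"usr": 0.0, "system": 0.0, "guest": 0.0, "wait": 0.0}),
    (2, 200, "sleep", {"usr": 0.0, "system": 0.0, "guest": 0.0, "wait": 0.0}),
]
```

In this sketch the all-zero sample for PID 100 at timestamp 2 is still emitted, which is one reading of "preserving time-series continuity": filtering happens per PID, not per sample.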

Test plan

  • Run crucible with pidstat subtool enabled and verify metric-data-pidstat.csv.xz and metric-data-pidstat.json.xz are created
  • Verify zero-only PIDs are filtered out
  • Verify crucible get metric --source pidstat --type Busy-CPU --breakout cstype,csid,cmd,pid,type returns data

🤖 Generated with Claude Code

@k-rister k-rister self-assigned this Mar 30, 2026
@k-rister k-rister requested a review from a team March 30, 2026 12:46
@project-crucible-tracking project-crucible-tracking Bot moved this to In Progress in Crucible Tracking Mar 30, 2026
@k-rister k-rister requested review from atheurer and removed request for a team March 30, 2026 12:53
Add pidstat parsing to sysstat-post-process that produces per-process
CPU metrics using the same Busy-CPU/NonBusy-CPU pattern as mpstat.
Each process is identified by command name and PID in the metric
names, with the CPU field type (usr, system, guest, wait) as a
breakout dimension.

Uses a two-pass approach: first scan identifies PIDs with any
non-zero activity, then only those PIDs are processed (configurable
via $skip_zero_pids). This filters out ~90% of idle process data
while preserving time-series continuity for active processes.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
@k-rister k-rister force-pushed the add-pidstat-post-processing branch from b0434da to 2c1145c March 30, 2026 13:04
@k-rister
Contributor Author

Here is an example get metric query from a test using the kube endpoint to run fio on an OCP cluster:

$ crucible get metric --source=pidstat --type=Busy-CPU --period=${PERIOD_ID} --breakout=hostname,cstype,csid,engine-type,engine-role,engine-id,endpoint-label,cmd,type --filter=gt:5
Checking for httpd...appears to be running
Checking for OpenSearch...appears to be running
Resolving cdmq dependencies...
up to date in 455ms

From Opensearch instance: localhost:9200 and cdm: v9dev

Available breakouts:  benchmark-name,benchmark-role,hosted-by,osruntime,pid,tool-name,userenv

                                                                                                                                             28-03-2026
  source     type           hostname   cstype             csid engine-type engine-role        engine-id endpoint-label            cmd   type   23:59:43
-------------------------------------------------------------------------------------------------------------------------------------------------------
 pidstat Busy-CPU crucible-master-01 profiler kube-1-sysstat-1    profiler    profiler kube-1-sysstat-1         kube-1           etcd    usr       6.13
 pidstat Busy-CPU crucible-master-01 profiler kube-1-sysstat-1    profiler    profiler kube-1-sysstat-1         kube-1 kube-apiserver    usr       8.83
 pidstat Busy-CPU crucible-master-01 profiler kube-1-sysstat-1    profiler    profiler kube-1-sysstat-1         kube-1        kubelet    usr      12.72
 pidstat Busy-CPU crucible-master-01 profiler kube-1-sysstat-1    profiler    profiler kube-1-sysstat-1         kube-1     prometheus    usr      33.70
 pidstat Busy-CPU crucible-master-01 profiler kube-1-sysstat-1    profiler    profiler kube-1-sysstat-1         kube-1 roadblocker.py    usr       5.92
 pidstat Busy-CPU crucible-master-02 profiler kube-1-sysstat-2    profiler    profiler kube-1-sysstat-2         kube-1           etcd    usr       5.58
 pidstat Busy-CPU crucible-master-02 profiler kube-1-sysstat-2    profiler    profiler kube-1-sysstat-2         kube-1 kube-apiserver    usr       9.92
 pidstat Busy-CPU crucible-master-02 profiler kube-1-sysstat-2    profiler    profiler kube-1-sysstat-2         kube-1        kubelet    usr       9.78
 pidstat Busy-CPU crucible-master-02 profiler kube-1-sysstat-2    profiler    profiler kube-1-sysstat-2         kube-1     prometheus    usr      32.96
 pidstat Busy-CPU crucible-master-02 profiler kube-1-sysstat-2    profiler    profiler kube-1-sysstat-2         kube-1 roadblocker.py    usr       5.91
 pidstat Busy-CPU crucible-master-03 profiler kube-1-sysstat-3    profiler    profiler kube-1-sysstat-3         kube-1           etcd    usr       5.55
 pidstat Busy-CPU crucible-master-03 profiler kube-1-sysstat-3    profiler    profiler kube-1-sysstat-3         kube-1            fio system      46.99
 pidstat Busy-CPU crucible-master-03 profiler kube-1-sysstat-3    profiler    profiler kube-1-sysstat-3         kube-1            fio    usr      50.09
 pidstat Busy-CPU crucible-master-03 profiler kube-1-sysstat-3    profiler    profiler kube-1-sysstat-3         kube-1 kube-apiserver    usr       9.45
 pidstat Busy-CPU crucible-master-03 profiler kube-1-sysstat-3    profiler    profiler kube-1-sysstat-3         kube-1        kubelet    usr       7.14
 pidstat Busy-CPU crucible-master-03 profiler kube-1-sysstat-3    profiler    profiler kube-1-sysstat-3         kube-1 roadblocker.py    usr       5.93
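
For readers unfamiliar with the `--filter=gt:5` argument in the query above, its effect on the result rows can be approximated as follows. This is a Python sketch under the assumption that `gt:5` keeps only rows whose value is greater than 5, which is consistent with the output shown; `parse_filter` and `apply_filter` are hypothetical names, and the last row of the sample data is invented for illustration.

```python
import operator

# Only "gt" appears in the query above, so it is the only comparison
# modeled here; the helper names are hypothetical.
OPS = {"gt": operator.gt}


def parse_filter(spec):
    """Split a spec such as 'gt:5' into a comparison function and threshold."""
    name, _, threshold = spec.partition(":")
    return OPS[name], float(threshold)


def apply_filter(rows, spec):
    """Keep rows whose last element (the metric value) passes the filter."""
    compare, threshold = parse_filter(spec)
    return [row for row in rows if compare(row[-1], threshold)]


# (hostname, cmd, type, value); the final row is made up for the example.
ROWS = [
    ("crucible-master-03", "fio", "usr", 50.09),
    ("crucible-master-03", "fio", "system", 46.99),
    ("crucible-master-03", "kubelet", "usr", 7.14),
    ("crucible-master-03", "idle-proc", "usr", 0.42),
]

for row in apply_filter(ROWS, "gt:5"):
    print(row)
```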

Contributor

@atheurer atheurer left a comment

If I recall, I think my main issue with not writing this sooner was that handling pidstat with threads present or not present was possibly going to be different. I don't recall exactly what the issue was, but it might have been the %busy in the PID vs children TIDs.

@k-rister
Contributor Author

> If I recall, I think my main issue with not writing this sooner was that handling pidstat with threads present or not present was possibly going to be different. I don't recall exactly what the issue was, but it might have been the %busy in the PID vs children TIDs.

Does the tool even support that (threads mode)? From a quick look, I don't think so:

https://github.com/perftool-incubator/tool-sysstat/blob/master/sysstat-start#L60-L66
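
If threads mode were ever enabled, one way a post-processor could notice it is by checking the pidstat column header. The Python sketch below assumes that pidstat's per-thread output replaces the `PID` column with `TGID` and `TID` columns; the function name and the trimmed header strings are illustrative and are not taken from tool-sysstat.

```python
def has_thread_columns(header_line):
    """Return True if a pidstat header line carries per-thread columns."""
    columns = header_line.split()
    return "TGID" in columns and "TID" in columns


# Illustrative header lines, trimmed to the columns that matter here.
PROCESS_HEADER = "UID PID %usr %system %guest %wait %CPU CPU Command"
THREAD_HEADER = "UID TGID TID %usr %system %guest %wait %CPU CPU Command"

for header in (PROCESS_HEADER, THREAD_HEADER):
    mode = "threads" if has_thread_columns(header) else "processes"
    print(f"{mode}: {header}")
```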

@k-rister k-rister merged commit 4499cb8 into master Mar 30, 2026
30 checks passed
@github-project-automation github-project-automation Bot moved this from In Progress to Done in Crucible Tracking Mar 30, 2026
@k-rister k-rister deleted the add-pidstat-post-processing branch March 30, 2026 14:07