Improve error reporting in final JSON results by cjonas9 · Pull Request #29 · stellar/stellar-rpc-blaster

cjonas9 · 2026-04-03T00:08:59Z

What

Following the stepped-pacer redesign (#23):

Adds a per-step-interval timeline of windows/snapshots to the final JSON results. Each entry captures target RPS, success/error counts, error rate (%), and latency percentiles (p50, p95, p99, p99.9) for that window only
Adds CLI argument --error-percent which specifies the tolerance for errors (e.g. --error-percent 60 would kill the test after receiving over 60% error/timeout in any 5s window).
Removes the redundant log file mirror (as it's now superseded by the richer JSON)

Visually, the results JSON now looks like:

{
  /*
  Old: cumulative summary data of the entire run
  */
  "start": ...,
  "end": ...,
  "duration_seconds": 123,
  "endpoints": {
    "getLedgers": {
       // summary for the entire run of performance, error breakdowns by type, etc.
      },

    /*
    NEW: data of how the RPC performed at each RPS phase for this endpoint
    */
      "timeline": [
        {
          "target_rps": 7.14, "success": 142, "errors": 0, "error_rate_pct": 0,
          "p50_ms": 165.631, "p95_ms": 537.087, "p99_ms": 900.095, "p99.9_ms": 935.423
        },
        {
          "target_rps": 14.29, "success": 277, "errors": 0, "error_rate_pct": 0,
          "p50_ms": 262.399, "p95_ms": 734.719, "p99_ms": 974.335, "p99.9_ms": 1098.751
        },
        ...,
        {
          "target_rps": 50, "success": 0, "errors": 116, "error_rate_pct": 100,
          "p50_ms": 15007.743, "p95_ms": 15007.743, "p99_ms": 15007.743, "p99.9_ms": 15007.743
        }
      ]
    }
  }
}

The timeline intervals default to 5s when no ramp-up is provided as there's no natural "RPS phase" notion without a ramp-up that steps through various RPS levels. Degradation still occurs with a fixed RPS level, so this allows for a timeline-d degradation curve without a ramp-up period.

Why

Conceptually, this is a minor rework of the aggregator: stats are accumulated in windows which are deposited into the aggregator and are reset thereafter. This allows for the aggregation of stats by their current RPS level, which provides useful/digestible instantaneous info (as opposed to relying on cumulative stats).

The richer JSON gives operators a clear view of how the RPC degraded as load increased. Previously, the results JSON flattened all metrics into a single summary bucket per endpoint without any notion of an RPS-timeline, so there was no way to tell at what RPS the RPC buckled. This gives an operator running a load test a concise/minimal degradation curve (e.g. when errors started, how latency + error rates changed at each step) without requiring them to dig through through logs.

Known limitations

N/A

…p up

…pped-pacer

…r into improve-error-reporting

socket-security · 2026-04-10T19:17:45Z

Review the following changes in direct dependencies. Learn more about Socket for GitHub.

Diff	Package	Supply Chain Security	Vulnerability	Quality	Maintenance	License
	github.com/stellar/go-stellar-sdk@v0.3.0 ⏵ v0.5.0	⁺¹

View full report

socket-security · 2026-04-10T19:17:47Z

Warning

Review the following alerts detected in dependencies.

According to your organization's Security Policy, it is recommended to resolve "Warn" alerts. Learn more about Socket for GitHub.

Action	Severity	Alert (click "▶" to expand/collapse)
Warn		License policy violation: golang `github.com/creachadair/mds` under BSD-3-Clause-HP License: BSD-3-Clause-HP - The applicable license policy does not permit this license (5) (LICENSE) From: `?` → `golang/github.com/creachadair/jrpc2@v1.3.5` → `golang/github.com/creachadair/mds@v0.26.2` ℹ Read more on: This package \| This alert \| What is a license policy violation? Next steps: Take a moment to review the security alert above. Review the linked package source code to understand the potential risk. Ensure the package is not malicious before proceeding. If you're unsure how to proceed, reach out to your security team or ask the Socket team for help at `support@socket.dev`. Suggestion: Find a package that does not violate your license policy or adjust your policy to allow this package's license. Mark the package as acceptable risk. To ignore this alert only in this pull request, reply with the comment `@SocketSecurity ignore golang/github.com/creachadair/mds@v0.26.2`. You can also ignore all packages with `@SocketSecurity ignore-all`. To ignore an alert for all future pull requests, use Socket's Dashboard to change the triage state of this alert.
Warn		Obfuscated code: golang `github.com/pelletier/go-toml/v2` is 98.0% likely obfuscated Confidence: 0.98 Location: Package overview From: `?` → `golang/github.com/spf13/viper@v1.21.0` → `golang/github.com/pelletier/go-toml/v2@v2.3.0` ℹ Read more on: This package \| This alert \| What is obfuscated code? Next steps: Take a moment to review the security alert above. Review the linked package source code to understand the potential risk. Ensure the package is not malicious before proceeding. If you're unsure how to proceed, reach out to your security team or ask the Socket team for help at `support@socket.dev`. Suggestion: Packages should not obfuscate their code. Consider not using packages with obfuscated code. Mark the package as acceptable risk. To ignore this alert only in this pull request, reply with the comment `@SocketSecurity ignore golang/github.com/pelletier/go-toml/v2@v2.3.0`. You can also ignore all packages with `@SocketSecurity ignore-all`. To ignore an alert for all future pull requests, use Socket's Dashboard to change the triage state of this alert.

View full report

internal/run/metrics/aggregator.go

Shaptic · 2026-04-10T19:56:44Z

internal/run/metrics/aggregator.go

+// mergedHistogram builds a combined histogram from all windows.
+func (e *EndpointStats) mergedHistogram() *hdrhistogram.Histogram {
+	merged := hdrhistogram.New(1, 60000000, 3)
+	for i := range e.windows {


If you’re using the counter to index into the array, the expended range loop is more concise, ie for _, elem := range e.windows

internal/run/metrics/results.go

Shaptic · 2026-04-10T20:11:10Z

internal/run/metrics/results.go

+			return nil, err
+		}
+		fmt.Fprintf(&buf, "\n    %q: ", name)
+		buf.Write(epJson)


I’m almost certain that if you give each endpoint (and timeline) structure a custom marshalling routine that doesn’t do indentation, you can eliminate most if not all of the byte-by-byte serialization here.

I'd love that, I was unsuccessful in trying this before but I'll give that approach a shot. Thanks!

Spent some time looking into this, and I'm not confident it's possible (or, at this point, worthwhile).

The main issue is that any notion of "marshal + indent" happens in those two, disjoint steps: marshal things to bytes, then write those bytes. The latter step is a standalone json function rather than something we can implement, and encoding/json's MarshalJSON interface only lets us control what gets serialized rather than how it's formatted. json.Indent, seemingly, is completely agnostic to types or any idea higher-level than "is this a { or [ character"?

Separately, I really wanted this to be true, as it's a bit baffling that this doesn't natively exist. The alternative to this Fprintf byte-by-byte approach is to make MarshalJSON do our marshaling, then have a custom indenter that does what we want at the byte level (i.e. not adding newlines + indents to things in an array). Which is still ugly + more work than it should be, but I do feel like digging my heels in on keeping timeline entries on one line. But if there's some way of doing this that I missed, let me know!

cjonas9 added 15 commits April 1, 2026 03:07

added start RPS functionality, ensured compatibility with modes + ram…

1e29b0b

…p up

fix unlikely race on endpointStats.isActive

30fc402

replace linear pacer with stepped pacer

c6fb224

parameterized step interval as optional CLI arg

11b4657

reduced state in aggregator

4b4a4a6

patch config validation bug when stepInterval = 0

42ac85b

add pacer tests

0c7b23e

simplified if statement to max()

5ccb083

reorganize run log + results output to be in one folder per run

cc3ec2d

added summary windows to final results json

21c06ce

removed logs being written to disk

e9a0278

tidied go mod

feadd23

Merge branch 'dev' of github.com:stellar/stellar-rpc-blaster into ste…

94170c8

…pped-pacer

Merge branch 'stepped-pacer' of github.com:stellar/stellar-rpc-blaste…

f956923

…r into improve-error-reporting

fixed default behavior when startRPS is omitted vs. explicitly 0

c593fdf

cjonas9 mentioned this pull request Apr 7, 2026

Replace linear pacer with stepped pacer #27

Merged

cjonas9 added 4 commits April 8, 2026 17:03

added CLI argument for error percent to tolerate

4cec909

made timeline fit on one json line

d5ddbb5

fix SteppedPacer.Pace() bug when startRPS=0

bcf445a

fixed minor bug with error percent not cancelling

76fbfc1

Base automatically changed from stepped-pacer to dev April 10, 2026 17:47

Merge branch 'dev' into improve-error-reporting

fad8658

cjonas9 force-pushed the improve-error-reporting branch from 169bb2e to fad8658 Compare April 10, 2026 18:47

cjonas9 marked this pull request as ready for review April 10, 2026 18:48

made output/ a part of the git repo

baf8826

Shaptic reviewed Apr 10, 2026

View reviewed changes

cjonas9 added 2 commits April 13, 2026 12:11

style fixes in aggregator.go

4924672

fix validation bug that occurs when no ramp up

9e680c7

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Improve error reporting in final JSON results#29

Improve error reporting in final JSON results#29
cjonas9 wants to merge 23 commits intodevfrom
improve-error-reporting

cjonas9 commented Apr 3, 2026 •

edited

Loading

Uh oh!

socket-security bot commented Apr 10, 2026 •

edited

Loading

Uh oh!

socket-security bot commented Apr 10, 2026 •

edited

Loading

Uh oh!

Uh oh!

Shaptic Apr 10, 2026

Uh oh!

Uh oh!

Shaptic Apr 10, 2026

Uh oh!

cjonas9 Apr 10, 2026

Uh oh!

cjonas9 Apr 13, 2026

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

Conversation

cjonas9 commented Apr 3, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

What

Why

Known limitations

Uh oh!

socket-security bot commented Apr 10, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

socket-security bot commented Apr 10, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

Uh oh!

Shaptic Apr 10, 2026

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Shaptic Apr 10, 2026

Choose a reason for hiding this comment

Uh oh!

cjonas9 Apr 10, 2026

Choose a reason for hiding this comment

Uh oh!

cjonas9 Apr 13, 2026

Choose a reason for hiding this comment

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

cjonas9 commented Apr 3, 2026 •

edited

Loading

socket-security bot commented Apr 10, 2026 •

edited

Loading

socket-security bot commented Apr 10, 2026 •

edited

Loading