Skip to content

fix(jfr): skip emitting empty constant pools#603

Merged
jbachorik merged 2 commits into
mainfrom
jb/jfr-skip-empty-pools
Jun 18, 2026
Merged

fix(jfr): skip emitting empty constant pools#603
jbachorik merged 2 commits into
mainfrom
jb/jfr-skip-empty-pools

Conversation

@jbachorik

@jbachorik jbachorik commented Jun 18, 2026

Copy link
Copy Markdown
Collaborator

What does this PR do?:

Stops the JFR checkpoint from emitting empty (0-element) constant-pool sections. Each variable pool writer (writeThreads, writeStackTraces, writeMethods, writeClasses, writePackages, and the three writeConstantPoolSection overloads) now skips emission and returns 0 when its pool is empty. writeCpool writes a 1-byte placeholder pool-count and sums the actually-emitted pools; finishChunk back-patches the real count via pwrite after flush, mirroring the existing cpool size patch. writeLogLevels is moved up alongside the other always-present constant pools.

Motivation:

OpenJDK jfr rejects a checkpoint that declares a constant pool with zero elements:

jfr print: unexpected internal error, Pool java.lang.String must contain at least one element
java.lang.InternalError: Pool java.lang.String must contain at least one element

The previous code hard-coded the pool count to 12 and always emitted every pool, so a chunk produced before any string label / context attribute / package existed contained an empty pool and was unreadable by these tools.

Additional Notes:

Stacked on #602; rebase to main after it merges. The diff here is only the empty-pool skipping + flush-safe count back-patch (Set B); the method-pool id allocator lives in #602.

The count cannot be precomputed: the Method/Class/Package/Symbol pools are only fully populated as a side effect of writeStackTraces/writeMethods, and empty variable pools are skipped. Hence the placeholder + back-patch.

How to test the change?:

Validated with OpenJDK jfr print on recordings produced by the MethodIdReuseTest dump driver.

Before (build without this change):

$ jfr print --events 'datadog.*' prof15130-reuse-p2-6-….jfr
jfr print: unexpected internal error, Pool java.lang.String must contain at least one element
java.lang.InternalError: Pool java.lang.String must contain at least one element
(exit 1)

After (this change):

$ jfr print --events 'datadog.*' prof15130-reuse-….jfr
(exit 0, no pool error)

Local results (debug build, macOS):

  • :ddprof-lib:buildDebug -Pskip-gtest → BUILD SUCCESSFUL
  • :ddprof-lib:gtestDebug_methodMapId_ut[ PASSED ] 5 tests (no regression)
  • jfr print --events 'datadog.*' on a produced recording → exit 0 (was exit 1 / must contain at least one element before)

For Datadog employees:

  • If this PR touches code that signs or publishes builds or packages, or handles
    credentials of any kind, I've requested a security review (run the dd:platform-security-review
    skill, or file a request via the PSEC review form).
    bewaire also runs automatically on every PR.

  • This PR doesn't touch any of that.

  • JIRA: PROF-15130

🤖 Generated with Claude Code

@datadog-official

This comment has been minimized.

Copilot AI left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

This PR updates the JFR checkpoint constant-pool (cpool) emission logic to avoid writing empty (0-element) constant-pool sections, which strict JFR readers reject. It does so by making each variable-pool writer return whether it actually emitted a pool section, writing a placeholder pool-count byte in writeCpool, and then back-patching the real pool count in finishChunk() using pwrite() after the buffer has been flushed.

Changes:

  • Make variable pool writers (writeThreads, writeStackTraces, writeMethods, writeClasses, writePackages, and writeConstantPoolSection overloads) return 0/1 depending on whether a pool section was emitted.
  • Emit a 1-byte placeholder for the constant pool count in writeCpool, compute the actual count as pools are emitted, and return it.
  • Back-patch the real constant pool count in finishChunk() flush-safe via pwrite(), mirroring the existing cpool size patch behavior.

Reviewed changes

Copilot reviewed 2 out of 2 changed files in this pull request and generated 1 comment.

File Description
ddprof-lib/src/main/cpp/flightRecorder.h Updates writer method signatures and documents new “skip empty pool + return emitted count” semantics.
ddprof-lib/src/main/cpp/flightRecorder.cpp Implements skipping empty pool sections, dynamic pool counting with placeholder + pwrite() back-patch, and reorders writeLogLevels among always-present pools.

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Comment thread ddprof-lib/src/main/cpp/flightRecorder.cpp Outdated
@dd-octo-sts

dd-octo-sts Bot commented Jun 18, 2026

Copy link
Copy Markdown
Contributor

Reliability & Chaos Results

24 failure(s) detected Pipeline: https://gitlab.ddbuild.io/DataDog/java-profiler/-/pipelines/119645822

❌ chaos: profiler gmalloc aarch64 21 0 3 temXchaos
ddprof jar unavailable
❌ chaos: profiler gmalloc aarch64 25 0 3 temXchaos
ddprof jar unavailable
❌ chaos: profiler gmalloc amd64 21 0 3 temXchaos
ddprof jar unavailable
❌ chaos: profiler gmalloc amd64 25 0 3 temXchaos
ddprof jar unavailable
❌ chaos: profiler jemalloc aarch64 21 0 3 temXchaos
ddprof jar unavailable
❌ chaos: profiler jemalloc aarch64 25 0 3 temXchaos
ddprof jar unavailable
❌ chaos: profiler jemalloc amd64 21 0 3 temXchaos
ddprof jar unavailable
❌ chaos: profiler jemalloc amd64 25 0 3 temXchaos
ddprof jar unavailable
❌ chaos: profiler tcmalloc aarch64 21 0 3 temXchaos
ddprof jar unavailable
❌ chaos: profiler tcmalloc aarch64 25 0 3 temXchaos
ddprof jar unavailable
❌ chaos: profiler tcmalloc amd64 21 0 3 temXchaos
ddprof jar unavailable
❌ chaos: profiler tcmalloc amd64 25 0 3 temXchaos
ddprof jar unavailable
❌ chaos: profiler tracer gmalloc aarch64 21 0 3 temXchaos
ddprof jar unavailable
❌ chaos: profiler tracer gmalloc aarch64 25 0 3 temXchaos
ddprof jar unavailable
❌ chaos: profiler tracer gmalloc amd64 21 0 3 temXchaos
ddprof jar unavailable
❌ chaos: profiler tracer gmalloc amd64 25 0 3 temXchaos
ddprof jar unavailable
❌ chaos: profiler tracer jemalloc aarch64 21 0 3 temXchaos
ddprof jar unavailable
❌ chaos: profiler tracer jemalloc aarch64 25 0 3 temXchaos
ddprof jar unavailable
❌ chaos: profiler tracer jemalloc amd64 21 0 3 temXchaos
ddprof jar unavailable
❌ chaos: profiler tracer jemalloc amd64 25 0 3 temXchaos
ddprof jar unavailable
❌ chaos: profiler tracer tcmalloc aarch64 21 0 3 temXchaos
ddprof jar unavailable
❌ chaos: profiler tracer tcmalloc aarch64 25 0 3 temXchaos
ddprof jar unavailable
❌ chaos: profiler tracer tcmalloc amd64 21 0 3 temXchaos
ddprof jar unavailable
❌ chaos: profiler tracer tcmalloc amd64 25 0 3 temXchaos
ddprof jar unavailable

@jbachorik

Copy link
Copy Markdown
Collaborator Author

@codex review

@chatgpt-codex-connector

Copy link
Copy Markdown

Codex Review: Didn't find any major issues. Delightful!

Reviewed commit: 73a14ef384

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".

@dd-octo-sts

dd-octo-sts Bot commented Jun 18, 2026

Copy link
Copy Markdown
Contributor

CI Test Results

Run: #27785068042 | Commit: 6d4229b | Duration: 3h 0m 16s (longest job)

All 32 test jobs passed

Status Overview

JDK glibc-aarch64/debug glibc-amd64/debug musl-aarch64/debug musl-amd64/debug
8 - - -
8-ibm - - -
8-j9 - -
8-librca - -
8-orcl - 🚫 - -
11 - - -
11-j9 - -
11-librca - -
17 - -
17-graal - -
17-j9 - -
17-librca - -
21 - -
21-graal 🚫 - -
21-librca - -
25 - -
25-graal - -
25-librca - -

Legend: ✅ passed | ❌ failed | ⚪ skipped | 🚫 cancelled

Summary: Total: 32 | Passed: 30 | Failed: 0 | Cancelled: 2


Updated: 2026-06-18 22:50:50 UTC

Base automatically changed from jb/prof-15130-method-pool-ids to main June 18, 2026 17:05
@jbachorik

Copy link
Copy Markdown
Collaborator Author

@copilot resolve the merge conflicts in this pull request

Copilot AI commented Jun 18, 2026

Copy link
Copy Markdown
Contributor

@copilot resolve the merge conflicts in this pull request

Merged origin/main into the branch. The only conflict was in MethodIdReuseTest.java (added ASM imports in main via PR #602); resolved by taking the main version. The diff vs main is now clean — only flightRecorder.cpp and flightRecorder.h remain changed.

@jbachorik jbachorik marked this pull request as ready for review June 18, 2026 17:21
@jbachorik jbachorik requested a review from a team as a code owner June 18, 2026 17:21

@rkennke rkennke left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

One minor thing, up to you to fix or not.
What might be useful (maybe as a follow-up?) is a regression test. A gtest that produces a chunk with empty variable pools and asserts (a) the back-patched count equals the number of emitted sections and (b) no empty section is present would be sufficient.

_thread_ids[i][old_index].clear();
}

if (threads.empty()) {

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is effectively dead code, right? Because we do threads.insert(_tid) just a few lines up. This also means we have 5 non-empty cpools, not 4.

String idStr = el.attrs.get("id");
if (idStr != null) {
ClassDef cd = new ClassDef();
cd.id = Long.parseLong(idStr);
if ("field".equals(child.name)) {
FieldDef fd = new FieldDef();
fd.name = child.attrs.get("name");
fd.typeId = Long.parseLong(child.attrs.get("class"));
fd.typeId = Long.parseLong(child.attrs.get("class"));
fd.constantPool = "true".equals(child.attrs.get("constantPool"));
String dim = child.attrs.get("dimension");
fd.dimension = dim != null ? Integer.parseInt(dim) : 0;
jbachorik and others added 2 commits June 18, 2026 21:45
Strict JFR readers (OpenJDK jfr, JMC, Datadog backend) reject a checkpoint
declaring a constant pool with zero elements ("Pool ... must contain at
least one element"). The writers always emitted every pool with a
hard-coded count of 12, so a chunk produced before any string label /
context attribute / package existed was unreadable.

Each variable pool writer now skips emission and returns 0 when empty.
writeCpool writes a 1-byte placeholder pool-count and sums the emitted
pools; finishChunk back-patches the real count via pwrite after flush,
mirroring the existing cpool size patch.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
…de in writeThreads

writeThreads always emits (_tid inserted unconditionally), so the
threads.empty() guard was dead code and the pool count baseline was
wrong (4 instead of 5). Move writeThreads to the constant-pool group
and start pool_count at 5.

Add EmptyConstantPoolTest: parses raw JFR binary and asserts no pool
section has zero elements and pool count >= 5 in both early and
active-profiling dumps.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
@jbachorik jbachorik force-pushed the jb/jfr-skip-empty-pools branch from 5cb40ce to 70e3534 Compare June 18, 2026 19:46
@jbachorik jbachorik merged commit 48d5522 into main Jun 18, 2026
12 of 18 checks passed
@jbachorik jbachorik deleted the jb/jfr-skip-empty-pools branch June 18, 2026 19:48
@github-actions github-actions Bot added this to the 1.45.0 milestone Jun 18, 2026
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants