Skip to content

chore(bench): add javac compilation step to Java fixture for javacg-static comparison#1332

Merged
carlos-alm merged 8 commits into
mainfrom
chore/bench-javacg-static-1307
Jun 6, 2026
Merged

chore(bench): add javac compilation step to Java fixture for javacg-static comparison#1332
carlos-alm merged 8 commits into
mainfrom
chore/bench-javacg-static-1307

Conversation

@carlos-alm
Copy link
Copy Markdown
Contributor

Summary

  • Adds tests/benchmarks/resolution/fixtures/java/Makefile — compiles all .java sources to fixture.jar via javac -d . *.java && jar cf fixture.jar benchmark/
  • Adds scripts/compare-javacg.mjs — runs javacg-static on the compiled JAR, parses its M:pkg.Class:method (T)pkg.Class:method output, maps to ClassName.method form, and computes precision/recall vs expected-edges.json
  • Updates docs/benchmarks/RESOLUTION-COMPARISON.md with the javacg-static methodology, a placeholder comparison table, and reproduction instructions

Reproducing

# 1. Compile the fixture
cd tests/benchmarks/resolution/fixtures/java && make && cd -

# 2. Download javacg-static (or build from source)
# https://github.com/gousiosg/java-callgraph/releases

# 3. Run the comparison
node scripts/compare-javacg.mjs --jar /path/to/javacg-0.1-SNAPSHOT.jar

Name mapping

javacg-static uses pkg.ClassName:method(JVM-descriptors) form. The script maps:

  • Source <init>ClassName.ClassName (constructor-as-method, matching expected-edges.json)
  • Target <init>ClassName (constructor target = class name only)
  • JDK calls (HashMap, String, System.out, …) are filtered out — only edges between fixture classes count

Closes #1307

@claude
Copy link
Copy Markdown

claude Bot commented Jun 5, 2026

Claude finished @carlos-alm's task —— View job


PR Review: chore(bench): add javac compilation step to Java fixture for javacg-static comparison

  • Read changed files
  • Phase 1: Root cause analysis
  • Phase 2: Configuration change detection
  • Phase 3: Test integrity
  • Phase 4: Code quality
  • Phase 5: Backlog compliance

Root Cause Analysis

Problem Being Solved: The Java fixture in tests/benchmarks/resolution/fixtures/java/ lacked compiled bytecode, making it impossible to run bytecode-based static analysis tools (WALA, Soot, javacg-static) against it for comparison. This PR adds a Makefile to compile the fixture and a comparison script modeled after the existing scripts/compare-tools.mjs.

Why This Approach: Consistent with the pattern established for JS/TS comparisons (ACG, Jelly) — a standalone comparison script, not integrated into CI or the test suite.

Risk Assessment: Low risk. No production code changed, no test assertions modified, no linting rules relaxed. Purely additive.


Phase 2: Configuration Changes

None. No biome.json, tsconfig.json, package.json, test configuration, or TypeScript settings were modified. ✓


Phase 3: Test Integrity

No test files modified. The new files are a standalone script and build tooling — not wired into vitest or the CI test matrix. No assertions weakened. ✓


Phase 4: Code Quality

scripts/compare-javacg.mjs

Significant concern — constructor target name mapping inconsistency:

scripts/compare-javacg.mjs:121-123toTargetName maps <init> to bare ClassName. But expected-edges.json uses two distinct target forms:

// Line 81-85: target is bare ClassName (class instantiation)
{ "name": "InMemoryUserRepository", "file": "InMemoryUserRepository.java" }

// Line 87-92: target is bare ClassName again
{ "name": "UserService", "file": "UserService.java" }

// Line 22-26: target is also bare ClassName
{ "name": "Validator", "file": "Validator.java" }

This matches toTargetName's behavior — ✓ the mapping is correct.

However, toSourceName at compare-javacg.mjs:116-118 maps source <init> to ClassName.ClassName:

// expected-edges.json line 20-26
{ "name": "UserService.UserService", ... }

That matches the ground truth convention ✓.

Potential false negative in parseJavacgOutput — self-edge filter too broad (compare-javacg.mjs:199-200):

if (sourceName === targetName && sourceFile === targetFile) continue;

This filters edges where the resolved name AND file match. But UserService.createDefault calls new UserService(repo) — here sourceName = "UserService.createDefault" and targetName = "UserService", which differ, so this edge is preserved. The filter is fine for the recursive case, but worth noting it only guards exact-name matches.

buildClassFileMap handles only top-level types (compare-javacg.mjs:73-95) — inner classes and anonymous classes will not be indexed, and edges involving them will be silently filtered as "JDK calls." The fixture has no inner classes currently, but this is a fragility to note if the fixture ever expands.

Makefile concern (Makefile:9-10):

javac -d . $(SRCS)
jar cf $(JAR) benchmark/

javac -d . *.java will fail if any .java file has a compilation error in a dependency. The source order matters (javac may need to see interface UserRepository before InMemoryUserRepository). In practice $(SRCS) from wildcard *.java provides alphabetical order, which works here (BaseService → InMemoryUserRepository → Main → UserRepository → UserService → Validator), but this is fragile for future fixture additions. A safer idiom is javac -d . -cp . $(SRCS) or explicitly listing in dependency order. Not a blocker for the existing fixture.

compare-javacg.mjs:150execFileSync with array args (no shell injection) ✓ — correctly avoids shell injection. javacgJar and fixtureJar are paths constructed from path.join over trusted constants.

Documentation placeholder in RESOLUTION-COMPARISON.md:220-225:

| javacg-static (CHA) ||||||

The table row is intentionally left empty pending a run. This is clearly communicated and consistent with how the JS table was initially populated. ✓


Phase 5: Backlog Compliance

Zero-dep: ✓ — No new dependencies in package.json. compare-javacg.mjs uses only Node built-ins (child_process, fs, path, url). javacg-static is a user-supplied external tool, not a package dependency.

Foundation-aligned: ✓

  • No cloud API calls
  • No new infrastructure requirements beyond java/javac (already required to analyze Java)
  • Does not generate code or make decisions — pure analysis/comparison
  • Programmatic API untouched

Problem-fit: 2/5 — This is benchmark instrumentation infrastructure. It helps quantify codegraph's Java call resolution recall gaps vs javacg-static, which indirectly surfaces actionable recall improvements (noted in the docs: same-file 0/2, static 0/2). But the script itself doesn't reduce token waste, prevent hallucinated deps, or catch blast-radius breakage for AI agents. It's useful tooling for developers working on the Java extractor.

Breaking: No — purely additive. No CLI output changes, no API changes, no DB schema changes.

Tier: 1 (zero-dep ✓ + foundation-aligned ✓)


Critical Concerns

None blocking. One minor issue worth considering before merge:

compare-javacg.mjs:200 — self-edge filter comment says "not in expected-edges" but self-edges (recursive calls) might legitimately appear in expected-edges.json for other fixtures if this script is ever generalized. The current fixture has none, so this is benign, but the comment is slightly misleading.

Makefile:9$(SRCS) compilation order — Works for the current fixture (alphabetical happens to be valid dependency order). Would silently fail if a new fixture class introduces a cross-file dependency that breaks alphabetical compilation order.


Final Assessment

## Root Cause Analysis
Problem Being Solved: Enable javacg-static (CHA) comparison for the Java fixture to
  quantify codegraph's Java call resolution gaps objectively.
Why This Approach: Consistent with the existing compare-tools.mjs pattern; no CI coupling,
  no runtime dep, manual reproduction via documented steps.
Risk Assessment: Very low — no production code, no test assertions, no config changes.

## Backlog Compliance
- Zero-dep: ✓ — no new npm runtime packages
- Foundation-aligned: ✓ — local-only, no cloud, no new infra, pure analysis
- Problem-fit: 2/5 — useful developer tooling; doesn't directly address AI-agent
  token waste or hallucination. Benchmark instrumentation, not a core feature.
- Breaking: No
- Tier: 1

## Critical Concerns
None blocking. Two minor robustness issues:
1. Makefile javac compilation order is fragile for future fixture expansion
2. `buildClassFileMap` silently drops inner/anonymous class edges

## Final Recommendation
- Rating: ⭐⭐⭐⭐ (4/5)
- Action: APPROVE
- Reasoning: Clean implementation, well-documented, follows established patterns
  (compare-tools.mjs), no risks introduced. The two fragility notes are pre-existing
  concerns for future fixture expansion, not current bugs. The doc placeholder is
  honestly communicated. Problem-fit is intentionally low (this is benchmarking
  infrastructure) — acceptable given the PR is scoped to `chore/bench`.

@github-actions
Copy link
Copy Markdown
Contributor

github-actions Bot commented Jun 5, 2026

Codegraph Impact Analysis

9 functions changed2 callers affected across 1 files

  • findJavacgJar in scripts/compare-javacg.mjs:56 (1 transitive callers)
  • buildClassFileMap in scripts/compare-javacg.mjs:78 (1 transitive callers)
  • parseMethodSpec in scripts/compare-javacg.mjs:108 (2 transitive callers)
  • toSourceName in scripts/compare-javacg.mjs:123 (2 transitive callers)
  • toTargetName in scripts/compare-javacg.mjs:128 (2 transitive callers)
  • loadGroundTruth in scripts/compare-javacg.mjs:134 (1 transitive callers)
  • runJavacg in scripts/compare-javacg.mjs:149 (1 transitive callers)
  • parseJavacgOutput in scripts/compare-javacg.mjs:178 (1 transitive callers)
  • computeMetrics in scripts/compare-javacg.mjs:215 (1 transitive callers)

@greptile-apps
Copy link
Copy Markdown
Contributor

greptile-apps Bot commented Jun 5, 2026

Greptile Summary

This PR completes the javacg-static comparison infrastructure for the Java call-graph benchmark: a Makefile compiles the Java fixture into a JAR, compare-javacg.mjs runs javacg-static on it and computes precision/recall vs expected-edges.json, and the benchmark doc is updated with the methodology and a placeholder results table.

  • Makefile — compiles all fixture .java sources via javac -d . *.java (outputting to benchmark/ due to package benchmark; declarations), bundles them into fixture.jar, with rm -rf benchmark/ before each build to avoid stale .class files.
  • compare-javacg.mjs — discovers the javacg-static JAR via --jar, JAVACG_JAR, or scripts/lib/; parses the M:… (T)… output; maps JVM <init> to the ClassName.ClassName / ClassName naming convention used by expected-edges.json; filters edges to fixture classes only; and reports precision/recall as a markdown table or --json blob.
  • Doc update — adds reproduction steps, name-mapping explanation, and a placeholder comparison table for javacg-static alongside the existing codegraph metrics.

Confidence Score: 5/5

Safe to merge — adds a self-contained benchmark tooling script and Makefile with no impact on production code paths.

All changes are additive benchmark infrastructure (a build script, a comparison script, a .gitignore, and documentation). The logic in compare-javacg.mjs is correct: the edge-key format matches between loadGroundTruth and parseJavacgOutput, the JVM name mapping handles constructors properly, the CLI validation covers the --jar-without-path case, and the clean-build rm -rf benchmark/ in the Makefile prevents stale artefacts. No production code is touched.

No files require special attention.

Important Files Changed

Filename Overview
scripts/compare-javacg.mjs New 297-line comparison script; CLI parsing, name mapping, metrics, and edge-key format are all correct and consistent with expected-edges.json. Previous thread issues (stale --jar path, self-edge filter) have been resolved.
tests/benchmarks/resolution/fixtures/java/Makefile Adds clean-build Makefile; rm -rf benchmark/ before each compile prevents stale .class files (previous thread concern already addressed).
tests/benchmarks/resolution/fixtures/java/.gitignore Correctly excludes generated fixture.jar and benchmark/ class output directory from version control.
docs/benchmarks/RESOLUTION-COMPARISON.md Documentation updated with compilation instructions, name-mapping explanation, placeholder comparison table, and reproduction steps.

Sequence Diagram

sequenceDiagram
    participant User
    participant Makefile
    participant javac
    participant jar
    participant compare as compare-javacg.mjs
    participant javacg as javacg-static JAR
    participant fs as expected-edges.json

    User->>Makefile: make
    Makefile->>Makefile: rm -rf benchmark/
    Makefile->>javac: "javac -d . *.java"
    javac-->>Makefile: "benchmark/*.class"
    Makefile->>jar: jar cf fixture.jar benchmark/
    jar-->>Makefile: fixture.jar

    User->>compare: node compare-javacg.mjs --jar javacg.jar
    compare->>compare: findJavacgJar()
    compare->>compare: buildClassFileMap(fixtureDir)
    compare->>fs: loadGroundTruth()
    fs-->>compare: Set of edge keys (17 edges)
    compare->>javacg: java -jar javacg.jar fixture.jar
    javacg-->>compare: M:class:method (T)class:method lines
    compare->>compare: parseJavacgOutput() → predicted edge Set
    compare->>compare: computeMetrics(predicted, groundTruth)
    compare-->>User: precision/recall table (or --json)
Loading

Reviews (8): Last reviewed commit: "Merge remote-tracking branch 'origin/cho..." | Re-trigger Greptile

Comment on lines +8 to +10
$(JAR): $(SRCS)
javac -d . $(SRCS)
jar cf $(JAR) benchmark/
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P2 Incremental rebuilds can silently include stale .class files from deleted Java sources. If a fixture .java file is removed and then make is run (without make clean), javac -d . *.java updates only the files it knows about, but the old .class file stays in benchmark/ and gets bundled into the JAR. Deleting the output directory before each compile guarantees a clean artefact.

Suggested change
$(JAR): $(SRCS)
javac -d . $(SRCS)
jar cf $(JAR) benchmark/
$(JAR): $(SRCS)
rm -rf benchmark/
javac -d . $(SRCS)
jar cf $(JAR) benchmark/

Fix in Claude Code

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed — added rm -rf benchmark/ before javac so each rebuild starts from a clean slate. Stale .class files from deleted sources can no longer slip into the JAR.

Comment thread scripts/compare-javacg.mjs Outdated
const args = process.argv.slice(2);
const jsonFlag = args.includes('--json');
const jarArgIdx = args.indexOf('--jar');
const jarArgPath = jarArgIdx !== -1 ? args[jarArgIdx + 1] : null;
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P2 When --jar is the last token on the command line (no path follows), args[jarArgIdx + 1] evaluates to undefined. Since undefined is falsy, findJavacgJar silently falls through to the env-var and scripts/lib/ lookups instead of surfacing the malformed invocation. The user ends up seeing "javacg-static JAR not found" with no indication that --jar was recognised but incomplete.

Suggested change
const jarArgPath = jarArgIdx !== -1 ? args[jarArgIdx + 1] : null;
const jarArgPath = jarArgIdx !== -1 ? (args[jarArgIdx + 1] ?? null) : null;
if (jarArgIdx !== -1 && !jarArgPath) {
console.error('Error: --jar requires a path argument');
process.exit(1);
}

Fix in Claude Code

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed — added explicit validation: if --jar is present but no path follows, the script now exits immediately with Error: --jar requires a path argument rather than silently falling through to the env-var/lib lookups.

Incremental make runs without a clean could silently bundle old .class
files from deleted .java sources into fixture.jar. Deleting benchmark/
before javac guarantees a clean artefact on every rebuild.
When --jar is the last argument, args[jarArgIdx + 1] is undefined and
findJavacgJar silently falls through to the env-var/lib lookups, giving
a confusing "JAR not found" message instead of flagging the bad
invocation. Fail fast with a clear diagnostic.
@carlos-alm
Copy link
Copy Markdown
Contributor Author

@greptileai

@carlos-alm
Copy link
Copy Markdown
Contributor Author

Fixed — anchored the buildClassFileMap regex to line start using ^…/m (multiline flag). Javadoc comments containing the word class before the actual type declaration can no longer produce a false match. See commit fab7780.

@carlos-alm
Copy link
Copy Markdown
Contributor Author

@greptileai

@carlos-alm
Copy link
Copy Markdown
Contributor Author

Fixed the flag-like string edge case raised in the Greptile summary: --jar --json now correctly errors with 'Error: --jar requires a path argument' rather than treating --json as the JAR path. Added a leading--- check before accepting the next argument as the path (commit 87136b9).

@carlos-alm
Copy link
Copy Markdown
Contributor Author

@greptileai

@carlos-alm carlos-alm merged commit 2c63d85 into main Jun 6, 2026
22 checks passed
@carlos-alm carlos-alm deleted the chore/bench-javacg-static-1307 branch June 6, 2026 05:28
@github-actions github-actions Bot locked and limited conversation to collaborators Jun 6, 2026
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

research(bench): add javac compilation step to Java fixture for javacg-static comparison

1 participant