feat(ci): self-contained integration summary comment (real-vs-infra classification)#112
gspivey wants to merge 1 commit into
The PR summary comment previously showed only per-tier 'N tests, M failures'. To see WHICH testcase failed, WHY, or whether it was a real code issue vs an SSM/ENI infra flake, you had to download the JUnit artifact or dig through collapsed log <details>.

generate_markdown_summary now emits a self-contained test-results/summary.md (posted via the existing post_pr_comment) with:
- a verdict separating REAL failures from INFRA flakes,
- a per-tier per-testcase table (result + time + reason),
- a Real-failures section with detail excerpts,
- an Infra-flakes section (SSM/ENI setup, labeled task #10).

Classification uses the JUnit failure type: real tier failures are type=AssertionError (junit_add_failure); synthetic setup failures are type=ExecutionError (generate_failure_xml), with a keyword fallback.

Report-only: exit-code behavior unchanged. Scripts-only, no workflow change. Verified locally against sample pass/real-fail/infra-fail XMLs.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
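As a rough illustration of the summary shape described in the commit message (a verdict line plus a per-tier per-testcase table), here is a minimal Python sketch. It is not the code in `generate_markdown_summary`; the function name `render_summary`, the tuple layout, and the `pass`/`real`/`infra` result labels are all assumptions made for this example.

```python
def render_summary(tiers):
    """Render a markdown summary from already-classified results.

    tiers: {tier_name: [(testcase, result, seconds, reason), ...]}
    result is 'pass', 'real', or 'infra' (hypothetical encoding).
    """
    rows = [row for cases in tiers.values() for row in cases]
    real = [row for row in rows if row[1] == "real"]
    infra = [row for row in rows if row[1] == "infra"]

    # The verdict keeps real failures and infra flakes apart.
    if real:
        verdict = f"{len(real)} real failure(s), {len(infra)} infra flake(s)"
    elif infra:
        verdict = f"no real failures, {len(infra)} infra flake(s)"
    else:
        verdict = "all tests passed"

    lines = [
        f"Verdict: {verdict}",
        "",
        "| Tier | Testcase | Result | Time (s) | Reason |",
        "| --- | --- | --- | --- | --- |",
    ]
    # One table row per testcase, grouped by tier.
    for tier, cases in tiers.items():
        for name, result, seconds, reason in cases:
            lines.append(
                f"| {tier} | {name} | {result} | {seconds:.1f} | {reason} |"
            )
    return "\n".join(lines)


# Made-up sample data, for illustration only.
example = {
    "tier1": [("udp_echo", "pass", 1.2, "")],
    "tier2": [("setup", "infra", 0.0, "SSM command timed out")],
}
print(render_summary(example))
```

Keeping the verdict separate from the table means a reader can tell at a glance whether a red run needs a code fix or only a re-run.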
Synthetic Performance Results (run)
Commit:
✅ synthetic UDP socket bound to 10.0.0.1:9000 (MAC: 02:00:00:00:00:01)
Synthetic UDP Performance Results
Measures framework overhead: sync
IPv4 Baseline
IPv6
IPv6 vs IPv4 Comparison (sync path)
IPv4 avg sync/async ratio: 0.9x, worst: 1.0x | IPv6 vs IPv4 worst ratio: 1.29x (OK)
Synthetic Performance Results - Graviton (run)
Commit:
✅ synthetic UDP socket bound to 10.0.0.1:9000 (MAC: 02:00:00:00:00:01)
Synthetic UDP Performance Results
Measures framework overhead: sync
IPv4 Baseline
IPv6
IPv6 vs IPv4 Comparison (sync path)
IPv4 avg sync/async ratio: 1.0x, worst: 1.1x | IPv6 vs IPv4 worst ratio: 1.23x (OK)
Integration Test Failure - Graviton (Run 28584885170)
Branch:
failure-summary.json
```json
{
  "failed_step": "deploy_infrastructure",
  "error": "Infrastructure deployment failed",
  "exit_code": 2,
  "timestamp": "2026-07-02T11:10:49.498635Z",
  "sender_instance_id": "i-07593469e6a77ec92",
  "receiver_instance_id": "i-015af845d6febe3dc",
  "commit": "fe4e43a1a487109a334c6ba23c6a8c735affd987",
  "run_url": "https://github.com/gspivey/dpdk-stdlib-rust/actions/runs/28584885170"
}
```
### receiver-console-output.log (65535 bytes, last 80 lines)
[ 2333.044723] Tainted: G U 6.18.35-68.129.amzn2023.aarch64 #2
[ 1276.809427] vfio-pci 0000:00:06.0: reset done
❌ Integration Tests Failed - Graviton (run)
Branch:
Test Results
Application Logs (last 20 lines)
Integration Test Failure (Run 28584885171)
Branch:
failure-summary.json
```json
{
  "failed_step": "deploy_infrastructure",
  "error": "Infrastructure deployment failed",
  "exit_code": 2,
  "timestamp": "2026-07-02T11:10:31.851420Z",
  "sender_instance_id": "i-0a2119f94e88164d4",
  "receiver_instance_id": "i-0f7c72af934d08f8c",
  "commit": "fe4e43a1a487109a334c6ba23c6a8c735affd987",
  "run_url": "https://github.com/gspivey/dpdk-stdlib-rust/actions/runs/28584885171"
}
```
### receiver-console-output.log (65535 bytes, last 80 lines)
[ 2332.706117] Tainted: G U 6.18.35-68.129.amzn2023.x86_64 #2
[ 1277.560705] vfio-pci 0000:00:06.0: reset done
total 132
❌ Integration Tests Failed (Run 28584885171)
Branch:
Test Results
Application Logs (last 20 lines)
Full Application Logs (last 200 lines each)
Consolidated into #111 (same squash-to-development). Folded the summary-report commit onto agent/tcp-smoke-tiers to avoid parallel open PRs.
You shouldn't need my access / artifact-downloads to analyze a run. Today the PR summary shows only per-tier `N tests, M failures`; to learn which testcase failed, why, or whether it was a real code failure vs an SSM/ENI infra flake, you have to pull the JUnit artifact or dig through collapsed log `<details>`.

The data already existed (`generate_json_summary` parses per-testcase messages into `summary.json`); it just never reached a comment. This surfaces it.

After

`generate_markdown_summary` posts a self-contained summary. On the run that prompted this (#111 Graviton) it would have read:

How

- Classification uses the JUnit failure type: real tier failures = `AssertionError` (`junit_add_failure`); synthetic setup failures = `ExecutionError` (`generate_failure_xml`), with a keyword fallback.
- Scripts-only (`run-integration-tests.sh`): no workflow YAML change (posts via existing `post_pr_comment`). Verified locally against sample pass/real-fail/infra-fail XMLs.

🤖 Generated with Claude Code
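The classification rule described above (failure type first, keyword fallback second) can be sketched in Python as follows. This is an illustration under stated assumptions, not the logic in `run-integration-tests.sh`: the helper names `classify_failure` and `classify_junit`, the keyword list, and the sample XML are all hypothetical.

```python
import xml.etree.ElementTree as ET

# Hypothetical fallback keywords; the script's real list is not shown in this PR.
INFRA_KEYWORDS = ("ssm", "eni")


def classify_failure(failure_type, message):
    """Return 'infra' or 'real' for one JUnit <failure> element."""
    if failure_type == "ExecutionError":
        return "infra"  # synthetic setup failure
    if failure_type == "AssertionError":
        return "real"  # failure reported by a test tier
    # Keyword fallback for a missing or unexpected type.
    # Plain substring matching is crude and is only meant as a sketch.
    text = (message or "").lower()
    return "infra" if any(word in text for word in INFRA_KEYWORDS) else "real"


def classify_junit(xml_text):
    """Map each failed testcase name to 'real' or 'infra'."""
    root = ET.fromstring(xml_text)
    results = {}
    for case in root.iter("testcase"):
        failure = case.find("failure")
        if failure is None:
            continue  # passing testcase
        results[case.get("name")] = classify_failure(
            failure.get("type"), failure.get("message")
        )
    return results


# Made-up sample report, for illustration only.
SAMPLE = """<testsuite name="tier1">
  <testcase name="udp_echo" time="1.2"/>
  <testcase name="tcp_smoke" time="3.4">
    <failure type="AssertionError" message="expected 10 packets, got 7"/>
  </testcase>
  <testcase name="setup" time="0.0">
    <failure type="ExecutionError" message="SSM command timed out"/>
  </testcase>
  <testcase name="attach" time="0.0">
    <failure message="ENI attachment failed"/>
  </testcase>
</testsuite>"""

print(classify_junit(SAMPLE))
# → {'tcp_smoke': 'real', 'setup': 'infra', 'attach': 'infra'}
```

Checking the type before the keywords means a real assertion whose message happens to mention SSM or ENI is still reported as a real failure.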