The CDash GitHub check-run can get permanently stuck in in_progress for a PR whose builds have actually finished (configure/build/test data all present, dashboard shows green), if any one of the contributing builds never receives a Done.xml submission. This blocks merge for projects that gate on the CDash check.
This affects ITK (https://open.cdash.org/index.php?project=Insight) on a regular basis — frequently enough that we've added a temporary GitHub Actions shadow check that posts a passing CDash row to unblock merges (InsightSoftwareConsortium/ITK#6146). I would much rather drop that workaround and fix the root cause here.
Root cause (current behaviour)
In app/cdash/app/Lib/Repository/GitHub.php, getCheckSummaryForBuildRow() increments $this->numPending for any build row whose done column is not 1:
// app/cdash/app/Lib/Repository/GitHub.php (around lines 443-460)
} else {
    if ((int) $row['done'] === 1) {
        // Build completed without problems.
        $icon = ':white_check_mark:';
        $msg = 'success';
        $this->numPassed++;
    } else {
        // Build hasn't finished reporting yet.
        $icon = ':hourglass_flowing_sand:';
        $msg = 'pending';
        $this->numPending++;
        PendingSubmissions::where('buildid', (int) $row['id'])->update([
            'recheck' => true,
        ]);
    }
}
Then in generateCheckPayloadFromBuildRows() (lines 345-348):
if ($this->numPending > 0) {
    // Some builds haven't finished yet.
    $output['title'] = 'Pending';
    $summary = 'Some builds have not yet finished submitting their results to CDash.';
}
The check-run payload's status therefore stays in_progress and conclusion is never set, regardless of how much time has passed.
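To make the consequence concrete, here is a simplified, self-contained model of how the counters map onto the check-run fields. It is a sketch rather than the CDash code: the function name `checkRunFields` and the `$numFailed` branch are assumptions made for illustration, and only the `$numPending > 0` branch mirrors the snippet above. The `status` and `conclusion` values are the ones defined by GitHub's Checks API.

```php
<?php
declare(strict_types=1);

/**
 * Simplified model (hypothetical, not CDash's implementation) of how
 * per-build counters decide the check-run's status and conclusion.
 */
function checkRunFields(int $numPending, int $numFailed): array
{
    if ($numPending > 0) {
        // A single build without Done.xml is enough to land here,
        // so no conclusion is ever sent to GitHub.
        return ['status' => 'in_progress'];
    }

    return [
        'status' => 'completed',
        'conclusion' => $numFailed > 0 ? 'failure' : 'success',
    ];
}

// Every other build is green, but one build never sent Done.xml:
// the result carries no 'conclusion' key at all.
$fields = checkRunFields(1, 0);
```

Because nothing in this path depends on elapsed time, the payload is identical whether the missing Done.xml is five minutes or five days late.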
The done = 1 flag is set only when app/Http/Submission/Handlers/DoneHandler.php processes a Done.xml submission. CTest's dashboard scripts submit Done as the last ctest_submit(PARTS …) call, which means any failure between the last data-bearing submission and the Done submission leaves the build effectively complete on the CDash side but eternally pending on the GitHub check-run side. Common triggers we see in ITK CI:
- The dashboard script's ci_completed_successfully helper (in itk_common.cmake) treats a non-zero compiler-warning count as a fatal error and exits 255 after test submission but before ctest_submit(PARTS Done). This affects all six Azure DevOps pipelines and the three ARMBUILD GHA runners.
- A network blip on the very last ctest_submit call.
- A runner timeout or out-of-disk condition between the test submission and the Done submission.
Reproduction
- On any CDash project with the GitHub App enabled, open a PR.
- Run a CTest dashboard against the PR head SHA, but kill the process after ctest_submit(PARTS Configure Build Test) and before ctest_submit(PARTS Done).
- Observe: CDash's web UI shows the build as fully green, while the GitHub CDash check on the PR stays in_progress indefinitely.
A live example from ITK (still in_progress hours after the build itself finished): InsightSoftwareConsortium/ITK#6147 (CDash row points at https://open.cdash.org/index.php?project=Insight&filtercount=1&showfilters=1&field1=revision&compare1=61&value1=da7d860c0c…). Many similar PRs over the past several months.
Proposed fix
Add a stale-build watchdog so the check-run is finalized after a configurable timeout even when Done.xml never arrives.
Option A — minimal change in getCheckSummaryForBuildRow() (preferred)
Treat a build whose submittime is older than a threshold as effectively complete for the purposes of the check-run, and reflect the actual data CDash has collected:
// app/cdash/app/Lib/Repository/GitHub.php
} else {
    $is_stale = $this->isBuildStale($row); // submittime older than threshold and has compile/test data
    if ((int) $row['done'] === 1 || $is_stale) {
        // Build completed without problems (or watchdog timed out
        // waiting for Done.xml, but we have all the data we need).
        $icon = ':white_check_mark:';
        $msg = $is_stale ? 'success (no Done.xml)' : 'success';
        $this->numPassed++;
    } else {
        $icon = ':hourglass_flowing_sand:';
        $msg = 'pending';
        $this->numPending++;
        PendingSubmissions::where('buildid', (int) $row['id'])->update([
            'recheck' => true,
        ]);
    }
}
isBuildStale($row) returns true when:
- submittime is older than config('cdash.github_check_stale_minutes') (default e.g. 60); and
- the build has at least one of configureerrors, builderrors, testfailed, testpassed populated (so we know it actually ran).
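The two conditions above could be sketched roughly as follows. This is a standalone illustration, not a drop-in patch: in CDash the helper would be a method that reads the threshold from config, whereas here the threshold and the current time are passed in as parameters so the logic can be exercised in isolation. It also assumes that submittime is a UTC datetime string and that a result column which was never reported is null or negative; both assumptions would need checking against the actual schema.

```php
<?php
declare(strict_types=1);

/**
 * Hypothetical standalone version of isBuildStale().
 *
 * Assumptions (not verified against the CDash schema): 'submittime' is a
 * UTC datetime string, and an unreported result column is null or negative.
 */
function isBuildStale(array $row, int $staleMinutes = 60, ?DateTimeImmutable $now = null): bool
{
    if ((int) ($row['done'] ?? 0) === 1 || empty($row['submittime'])) {
        return false; // Done.xml arrived, or there is no timestamp to judge by.
    }

    $utc = new DateTimeZone('UTC');
    $now = $now ?? new DateTimeImmutable('now', $utc);
    try {
        $submitted = new DateTimeImmutable((string) $row['submittime'], $utc);
    } catch (Exception $e) {
        return false; // Unparseable timestamp: keep the build pending.
    }

    // Condition 1: the submission is older than the threshold.
    $cutoff = $now->sub(new DateInterval('PT' . max(0, $staleMinutes) . 'M'));
    if ($submitted >= $cutoff) {
        return false;
    }

    // Condition 2: at least one result column shows the build actually ran.
    foreach (['configureerrors', 'builderrors', 'testfailed', 'testpassed'] as $column) {
        if (isset($row[$column]) && (int) $row[$column] >= 0) {
            return true;
        }
    }

    return false;
}
```

Returning false on a missing or unparseable timestamp keeps the current behaviour (pending) whenever the watchdog cannot be sure, which seems the safer default for a check that gates merges.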
Option B — Laravel scheduled task
Add an artisan command cdash:finalize-stale-checks registered in app/Console/Kernel.php that runs every 5–10 minutes. It looks for builds with done = 0, submittime < NOW() - INTERVAL and known head SHAs, and either:
- Sets done = 1 so the existing setStatus() path naturally completes them; or
- Calls setStatus() directly with the data CDash already has.
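The selection step of such a sweep might look like the sketch below. It works on plain arrays so it can run on its own; in CDash this would be a database query inside the command. The function name `selectStaleBuildIds` and the `revision` key used for the head SHA are assumptions for illustration, as is treating submittime as a UTC datetime string.

```php
<?php
declare(strict_types=1);

/**
 * Hypothetical selection step for a cdash:finalize-stale-checks sweep.
 * Returns the ids of builds that never received Done.xml, are older than
 * the threshold, and have a known head SHA.
 *
 * Assumed row keys: 'id', 'done', 'submittime' (UTC string), 'revision'.
 */
function selectStaleBuildIds(array $rows, int $staleMinutes, DateTimeImmutable $now): array
{
    $utc = new DateTimeZone('UTC');
    $cutoff = $now->sub(new DateInterval('PT' . max(0, $staleMinutes) . 'M'));

    $ids = [];
    foreach ($rows as $row) {
        if ((int) ($row['done'] ?? 0) !== 0) {
            continue; // Already finalized by Done.xml.
        }
        if (empty($row['revision']) || empty($row['submittime'])) {
            continue; // No head SHA to report against, or no timestamp.
        }
        // date_create_immutable() returns false instead of throwing on bad input.
        $submitted = date_create_immutable((string) $row['submittime'], $utc);
        if ($submitted === false || $submitted >= $cutoff) {
            continue; // Unparseable, or still within the grace period.
        }
        $ids[] = (int) $row['id'];
    }

    return $ids;
}
```

Either follow-up action (setting done = 1 or calling setStatus()) would then operate on the returned ids.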
Configuration knob
// config/cdash.php
'github_check_stale_minutes' => env('CDASH_GITHUB_CHECK_STALE_MINUTES', 60),
Defaulting to 60 minutes is conservative — well past any normal build duration but short enough to unblock human reviewers within the same workday.
Tests
Add a regression test in app/cdash/tests/case/CDash/Lib/Repository/GitHubTest.php that constructs a build row with done = 0 and submittime older than the configured threshold and asserts that generateCheckPayloadFromBuildRows() returns status=completed with conclusion=success (or whatever the actual collected data implies).
Why a watchdog rather than fixing the dashboard scripts
We can (and should) tighten ITK's itk_common.cmake so ctest_submit(PARTS Done) is always called even on warning failures. But:
- CDash should be robust to misbehaving submitters — many CI environments outside ITK will have similar bugs.
- A stuck in_progress row is a UX problem that no project-side fix can fully eliminate (network blips, runner termination, etc.).
- A 60-minute watchdog carries little false-positive risk as long as the threshold exceeds the project's longest normal build: a build that is still running has not finished in CDash's web UI either, so the watchdog is unlikely to mark a still-running build as complete.
I'd be happy to put up a PR if the maintainers agree with Option A. Let me know if there's a preferred direction or any history I'm missing — there may be an existing knob (e.g., cdash.github_always_pass, which I see at line 384, but that's an all-projects all-builds escape hatch and not what we want).
cc @bradlowekamp @thewtex @jcfr (frequent CDash + ITK reviewers).