CI: Capture rendered output and unified diff as pytest artifacts#3405
Open
a-v-popov wants to merge 1 commit into
Open
CI: Capture rendered output and unified diff as pytest artifacts#3405a-v-popov wants to merge 1 commit into
a-v-popov wants to merge 1 commit into
Conversation
…ailure
When a transformation test fails in CI, the GitHub Actions log shows
only pytest's truncated assertion diff -- there is no way to retrieve
the actual rendered YAML without re-running locally. _report_mismatch
in tests/test_transformation.py writes the rendered output and a
unified diff to the case's tmp_path on mismatch, routes a diff preview
through pytest.fail so the in-terminal failure message is useful on
its own (with a verbose-scaled cap so CI logs stay tight while -vv
gets the full diff), and CI workflows point pytest at
$RUNNER_TEMP/pytest-tmp and upload it on failure via
actions/upload-artifact@v7.
Two related but distinct choices in the helper: constructing the
failure message ourselves (rather than letting `assert result ==
expected` go through) skips pytest's assertion rewriter, which would
otherwise emit an ndiff truncated to ~8 lines of context at default
verbosity and flooded with matching lines at -vv. Either pytest.fail
or raise AssertionError with a preformatted string achieves that
much. pytest.fail is preferred over raise AssertionError because it
takes pytrace=False (here pytrace=verbose >= 3) -- a fixture mismatch
needs no helper-frame traceback at default verbosity, and -vvv turns
it back on for harness debugging.
The conftest.py reprcrash trim is paired with the above: pytest 9.x
reprints the full pytest.fail message in its short-summary block when
it detects CI (CI/BUILD_NUMBER env vars) or runs at -vv, which would
duplicate the diff body that _report_mismatch already embeds.
Trimming report.longrepr.reprcrash.message to its first line keeps the
FAILURES section intact while suppressing the duplicate.
The upload step gates on `!cancelled() && steps.pytest.outcome ==
'failure'` rather than `failure()` -- an earlier mypy/pip failure
would otherwise trigger a misleading prune+upload run on a basetemp
that doesn't exist.
$RUNNER_TEMP can't move into a jobs.tests.env block because
${{ runner.* }} isn't available outside step context (GitHub rejects
with "Unrecognized named-value: 'runner'"). A small "Compute pytest
tmp path" step exports PYTEST_TMP to $GITHUB_ENV instead; subsequent
steps see it as both $PYTEST_TMP and ${{ env.PYTEST_TMP }}.
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Follow-up to #3391 + #3402. After those two PRs each YAML case is its own
pytest node and per-case file writes are isolated to
tmp_path, but when atransformation or error-log fixture mismatches in CI the GitHub Actions log
only shows pytest's truncated assertion diff -- there is no way to retrieve
the actual rendered YAML without re-running locally. This PR plumbs the
artifact path end-to-end and makes the in-terminal failure message useful
in its own right.
Summary
_report_mismatchhelper intests/test_transformation.py, used by bothrun_transformation_testandrun_error_case. On mismatch it:tmp_path/<base>.{yml,actual},difflib.unified_diff(hunks + 3 lines of context) totmp_path/<base>.diff,pytest.fail(message, pytrace=...)with a diff preview embeddedin the failure message.
-vcount -- CI / defaultcap at 20 lines,
-vdoubles it,-vvremoves the cap,-vvvalsore-enables
pytrace. The full diff is on disk regardless.run_error_casegains atmp_pathparameter, threaded throughtest_error_cases/test_coverage_errorsso the helper dropserror_log.actual/error_log.diffper case.tests/conftest.py:pytest_runtest_makereporthookwrapper trimsreport.longrepr.reprcrash.messageto its first line so pytest 9.x'sshort summary doesn't duplicate the diff body that the helper already
embeds in the failure message.
t-push.yml,t-pull.yml): aCompute pytest tmp pathstep exports
PYTEST_TMP=$RUNNER_TEMP/pytest-tmpto$GITHUB_ENV,pytest gets
--basetemp="$PYTEST_TMP" --junit-xml="$PYTEST_TMP/junit.xml",an
actions/upload-artifact@v7step gated on!cancelled() && steps.pytest.outcome == 'failure'(AND'd with the existing fork-skipguard in
t-pull.yml), and a small prune step between them.Why
Constructing the failure message ourselves, and
pytest.failoverraise AssertionErrorTwo related but distinct choices in
_report_mismatch.Why not
assert actual == expected: pytest's assertion rewriterintrospects
assert a == band emits anndiffrepresentation. Twotruncation layers apply:
output is capped at ~8 lines with
use '-v'/'-vv' to showmessages.-vit would print the fist ~8 lines of the file potentially containing no diff-vvthe cap is removed -- butndiffemits every line of the file,so a one-line fixture mismatch surfaces as thousands of matching
lines with the single changed line lost in the middle.
difflib.unified_diff(hunks + 3 lines of context) is the right algorithmfor fixture comparisons -- a single-line YAML mismatch is ~7 readable lines.
Constructing the message ourselves (whether via
pytest.failorraise AssertionError) hands the rewriter a preformatted string with nointrospectable comparison, so it lands in the test output verbatim.
Why
pytest.failoverraise AssertionError:pytest.failtakespytrace=False(herepytrace=verbose >= 3). A fixture mismatch needsno helper-frame traceback at default verbosity -- the diff preview and
the on-disk pointer are the whole story.
-vvvre-enablespytraceforharness debugging.
Trimming
reprcrash.messagepytest 9.x prints the full
pytest.fail(...)message verbatim in theshort test summary infoblock whenever it detects CI (CI/BUILD_NUMBERenv vars) or runs at-vv+, which would duplicate thediff body the helper deliberately embeds in the failure message.
A
pytest_runtest_makereporthookwrapper trimsreport.longrepr.reprcrash.messageto its first line;longreprisuntouched, so the
FAILURESsection still carries the diff preview andthe
Full diff: ...pointer. Pytest's ownReprFileLocation.toterminalalready applies the same first-line trim at one display site; the hook
just enforces the invariant for every consumer (short summary, JUnit XML,
custom reporters).
$RUNNER_TEMPbasetempGitHub designates
$RUNNER_TEMPfor ephemeral writes on its runners --whatever they do with that mount today or tomorrow (tmpfs, separate
volume, faster device), the workflow inherits. The natural shape would
be a single
jobs.tests.enventry, but${{ runner.* }}is not availableoutside a step context (GitHub rejects with
Unrecognized named-value: 'runner'). ACompute pytest tmp pathstepwrites
PYTEST_TMP=$RUNNER_TEMP/pytest-tmpto$GITHUB_ENV, after which$PYTEST_TMPis in every subsequent step's shell env and${{ env.PYTEST_TMP }}resolves in every subsequent step's expressions.Single source of truth, no per-step
env:blocks.actions/upload-artifact@v7v4runs on Node 20, which GitHub forces off the runner on 2026-09-16(and switches to Node 24 by default on 2026-06-02).
v7is the lateststable, defaults to Node 24, and the inputs we use (
name,path,retention-days) are unchanged.Prune step before the artifact upload
--basetempcreates a directory per test invocation (~273 fortransformation + errors). Passing cases early-return from
_report_mismatchand leave their case dir empty, so without pruning theuploaded zip is dominated by them.
find -type d -empty -deleteclearsthose;
find -xtype l -deletethen catches pytest's<test>-currentsymlinks that go dangling after the empty-dir sweep, preventing
upload-artifact(which follows symlinks by default) from includingzero-byte placeholders.
junit.xmlsits at the basetemp root as a regularfile and is unaffected by either find expression.
find -deletereturns0 on empty match, so the step is idempotent.
!cancelled() && steps.pytest.outcome == 'failure'gatingThe earlier draft of this PR gated the prune + upload on
if: failure(),which fires on any prior step failure. If
pip installormypyfailedbefore pytest ran,
$PYTEST_TMPdid not exist, thefindin the prunestep exited nonzero, and CI showed a misleading second red step with no
diagnostic value.
!cancelled()overrides the implicitsuccess()thatGitHub prepends to every
if:, andsteps.pytest.outcome == 'failure'narrows the trigger to pytest itself -- the prune and upload steps now
run iff there is something to upload.
What changes
tests/test_transformation.py_report_mismatch(test_case, label, actual, expected, tmp_path, actual_name, verbose). Usespathlib.Path.with_suffix(".diff")toderive the diff filename from the actuals filename (so transformation
gets
result.yml+result.diff, errors geterror_log.actual+error_log.diff).run_transformation_testandrun_error_casebecome thin wrappersthat call their respective
*_resultshelper and then_report_mismatch.run_error_casegainstmp_path;test_error_casesandtest_coverage_errorspass it through.pytestconfigand forwardmax(0, pytestconfig.getoption("verbose"))so the helper can sizeits preview.
tests/conftest.pypytest_runtest_makereporthookwrapper that trimsreport.longrepr.reprcrash.messageto its first line when present..github/workflows/t-push.ymlandt-pull.ymlCompute pytest tmp pathstep exportsPYTEST_TMP=$RUNNER_TEMP/pytest-tmpto$GITHUB_ENV.pytest --basetemp="$PYTEST_TMP" --junit-xml="$PYTEST_TMP/junit.xml"and gains
id: pytest.Prune empty case dirsstep runs the twofindcalls.Upload pytest artifactsstep usesactions/upload-artifact@v7,named
pytest-tmp-py${{ matrix.python-version }}so the per-matrix-cellartifacts do not collide.
t-pull.ymlrepeats the existing fork-skip guard(
github.event.pull_request.head.repo.full_name != 'ipspace/netlab')on the compute, prune and upload steps.
Anticipated questions
.gitignoreentry forpytest-tmp/?" Nothing in the repopoints pytest at an in-tree basetemp anymore: CI writes to
$RUNNER_TEMP/pytest-tmpand pytest's local default is/tmp/pytest-of-<user>/. An entry would only fire on an explicit--basetemp=tests/pytest-tmpthat nothing in the repo does ordocuments -- dead weight in a config that should describe paths the
repo actually produces.
_report_mismatchhelper instead of inlining in both wrappers?"The two wrappers had ~8 lines of mechanical duplication (compute diff,
write actual, write diff, fail). One helper, two thin call sites.
files in
tmp_path?" Only deletes empty directories and brokensymlinks. A test that writes anything keeps its dir; the
inventory-output cases already do, and they survive the prune unchanged.
-q+Fatal...fortrailing lines without newlines." Python's
difflib.unified_diffdoesnot emit GNU diff's
\ No newline at end of filesentinel. The bytecontent matches what's on disk;
diff -uon the artifact files givesthe GNU-annotated view.
Test plan
cd tests && PYTHONPATH=../ pytest -q-- all green.topology/expected/6pe.yml(e.g. flipadvertise_loopback: true->false),pytest --basetemp=/tmp/pytest-tmp -k 6pe.Expect: failure message contains a
--- expected/+++ actualunified diff with hunk headers; no
Skipping N identical leading characters/use -vv to showmarkers; no helper-frame traceback;/tmp/pytest-tmp/<case>/result.ymlandresult.diffwritten;short summary holds only the first line. Revert.
echo q >> errors/<case>.log),rerun. Expect:
error-log mismatchmessage with diff body;error_log.actualanderror_log.diffin the casetmp_path.Revert.
find <basetemp> -type d -empty -delete && find <basetemp> -xtype l -deletereduces the layout to the failing case (with its.diffpair) plus the non-empty inventory-output cases plusjunit.xml.(introduced temporarily) uploads a
pytest-tmp-py3.Xartifactwith the expected contents.