CI: Capture rendered output and unified diff as pytest artifacts by a-v-popov · Pull Request #3405 · ipspace/netlab

a-v-popov · 2026-05-17T14:14:19Z

Follow-up to #3391 + #3402. After those two PRs each YAML case is its own
pytest node and per-case file writes are isolated to tmp_path, but when a
transformation or error-log fixture mismatches in CI the GitHub Actions log
only shows pytest's truncated assertion diff -- there is no way to retrieve
the actual rendered YAML without re-running locally. This PR plumbs the
artifact path end-to-end and makes the in-terminal failure message useful
in its own right.

Summary

_report_mismatch helper in tests/test_transformation.py, used by both
run_transformation_test and run_error_case. On mismatch it:
1. writes the produced content to tmp_path/<base>.{yml,actual},
2. writes a difflib.unified_diff (hunks + 3 lines of context) to
  tmp_path/<base>.diff,
3. calls pytest.fail(message, pytrace=...) with a diff preview embedded
  in the failure message.
Preview length and pytrace scale with pytest's -v count -- CI / default
cap at 20 lines, -v doubles it, -vv removes the cap, -vvv also
re-enables pytrace. The full diff is on disk regardless.
run_error_case gains a tmp_path parameter, threaded through
test_error_cases / test_coverage_errors so the helper drops
error_log.actual / error_log.diff per case.
tests/conftest.py: pytest_runtest_makereport hookwrapper trims
report.longrepr.reprcrash.message to its first line so pytest 9.x's
short summary doesn't duplicate the diff body that the helper already
embeds in the failure message.
CI workflows (t-push.yml, t-pull.yml): a Compute pytest tmp path
step exports PYTEST_TMP=$RUNNER_TEMP/pytest-tmp to $GITHUB_ENV,
pytest gets --basetemp="$PYTEST_TMP" --junit-xml="$PYTEST_TMP/junit.xml",
an actions/upload-artifact@v7 step gated on !cancelled() && steps.pytest.outcome == 'failure' (AND'd with the existing fork-skip
guard in t-pull.yml), and a small prune step between them.

Why

Constructing the failure message ourselves, and `pytest.fail` over `raise AssertionError`

Two related but distinct choices in _report_mismatch.

Why not assert actual == expected: pytest's assertion rewriter
introspects assert a == b and emits an ndiff representation. Two
truncation layers apply:

At default verbosity it strips the long identical prefix/suffix and the
output is capped at ~8 lines with use '-v'/'-vv' to show messages.
At -v it would print the fist ~8 lines of the file potentially containing no diff
At -vv the cap is removed -- but ndiff emits every line of the file,
so a one-line fixture mismatch surfaces as thousands of matching
lines with the single changed line lost in the middle.

difflib.unified_diff (hunks + 3 lines of context) is the right algorithm
for fixture comparisons -- a single-line YAML mismatch is ~7 readable lines.
Constructing the message ourselves (whether via pytest.fail or
raise AssertionError) hands the rewriter a preformatted string with no
introspectable comparison, so it lands in the test output verbatim.

Why pytest.fail over raise AssertionError: pytest.fail takes
pytrace=False (here pytrace=verbose >= 3). A fixture mismatch needs
no helper-frame traceback at default verbosity -- the diff preview and
the on-disk pointer are the whole story. -vvv re-enables pytrace for
harness debugging.

Trimming `reprcrash.message`

pytest 9.x prints the full pytest.fail(...) message verbatim in the
short test summary info block whenever it detects CI (CI /
BUILD_NUMBER env vars) or runs at -vv+, which would duplicate the
diff body the helper deliberately embeds in the failure message.
A pytest_runtest_makereport hookwrapper trims
report.longrepr.reprcrash.message to its first line; longrepr is
untouched, so the FAILURES section still carries the diff preview and
the Full diff: ... pointer. Pytest's own ReprFileLocation.toterminal
already applies the same first-line trim at one display site; the hook
just enforces the invariant for every consumer (short summary, JUnit XML,
custom reporters).

`$RUNNER_TEMP` basetemp

GitHub designates $RUNNER_TEMP for ephemeral writes on its runners --
whatever they do with that mount today or tomorrow (tmpfs, separate
volume, faster device), the workflow inherits. The natural shape would
be a single jobs.tests.env entry, but ${{ runner.* }} is not available
outside a step context (GitHub rejects with
Unrecognized named-value: 'runner'). A Compute pytest tmp path step
writes PYTEST_TMP=$RUNNER_TEMP/pytest-tmp to $GITHUB_ENV, after which
$PYTEST_TMP is in every subsequent step's shell env and
${{ env.PYTEST_TMP }} resolves in every subsequent step's expressions.
Single source of truth, no per-step env: blocks.

`actions/upload-artifact@v7`

v4 runs on Node 20, which GitHub forces off the runner on 2026-09-16
(and switches to Node 24 by default on 2026-06-02). v7 is the latest
stable, defaults to Node 24, and the inputs we use (name, path,
retention-days) are unchanged.

Prune step before the artifact upload

--basetemp creates a directory per test invocation (~273 for
transformation + errors). Passing cases early-return from
_report_mismatch and leave their case dir empty, so without pruning the
uploaded zip is dominated by them. find -type d -empty -delete clears
those; find -xtype l -delete then catches pytest's <test>-current
symlinks that go dangling after the empty-dir sweep, preventing
upload-artifact (which follows symlinks by default) from including
zero-byte placeholders. junit.xml sits at the basetemp root as a regular
file and is unaffected by either find expression. find -delete returns
0 on empty match, so the step is idempotent.

`!cancelled() && steps.pytest.outcome == 'failure'` gating

The earlier draft of this PR gated the prune + upload on if: failure(),
which fires on any prior step failure. If pip install or mypy failed
before pytest ran, $PYTEST_TMP did not exist, the find in the prune
step exited nonzero, and CI showed a misleading second red step with no
diagnostic value. !cancelled() overrides the implicit success() that
GitHub prepends to every if:, and steps.pytest.outcome == 'failure'
narrows the trigger to pytest itself -- the prune and upload steps now
run iff there is something to upload.

What changes

tests/test_transformation.py
- Adds _report_mismatch(test_case, label, actual, expected, tmp_path, actual_name, verbose). Uses pathlib.Path.with_suffix(".diff") to
  derive the diff filename from the actuals filename (so transformation
  gets result.yml + result.diff, errors get error_log.actual +
  error_log.diff).
- run_transformation_test and run_error_case become thin wrappers
  that call their respective *_results helper and then
  _report_mismatch.
- run_error_case gains tmp_path; test_error_cases and
  test_coverage_errors pass it through.
- All four parametrized test functions take pytestconfig and forward
  max(0, pytestconfig.getoption("verbose")) so the helper can size
  its preview.
tests/conftest.py
- pytest_runtest_makereport hookwrapper that trims
  report.longrepr.reprcrash.message to its first line when present.
.github/workflows/t-push.yml and t-pull.yml
- Compute pytest tmp path step exports
  PYTEST_TMP=$RUNNER_TEMP/pytest-tmp to $GITHUB_ENV.
- Pytest invocation becomes
  pytest --basetemp="$PYTEST_TMP" --junit-xml="$PYTEST_TMP/junit.xml"
  and gains id: pytest.
- Prune empty case dirs step runs the two find calls.
- Upload pytest artifacts step uses actions/upload-artifact@v7,
  named pytest-tmp-py${{ matrix.python-version }} so the per-matrix-cell
  artifacts do not collide.
- t-pull.yml repeats the existing fork-skip guard
  (github.event.pull_request.head.repo.full_name != 'ipspace/netlab')
  on the compute, prune and upload steps.

Anticipated questions

"Why no .gitignore entry for pytest-tmp/?" Nothing in the repo
points pytest at an in-tree basetemp anymore: CI writes to
$RUNNER_TEMP/pytest-tmp and pytest's local default is
/tmp/pytest-of-<user>/. An entry would only fire on an explicit
--basetemp=tests/pytest-tmp that nothing in the repo does or
documents -- dead weight in a config that should describe paths the
repo actually produces.
"Why a _report_mismatch helper instead of inlining in both wrappers?"
The two wrappers had ~8 lines of mechanical duplication (compute diff,
write actual, write diff, fail). One helper, two thin call sites.
"Will the prune step fight with a future test that legitimately leaves
files in tmp_path?" Only deletes empty directories and broken
symlinks. A test that writes anything keeps its dir; the
inventory-output cases already do, and they survive the prune unchanged.
"Cosmetic: unified-diff in pytest's terminal shows -q+Fatal... for
trailing lines without newlines." Python's difflib.unified_diff does
not emit GNU diff's \ No newline at end of file sentinel. The byte
content matches what's on disk; diff -u on the artifact files gives
the GNU-annotated view.

Test plan

cd tests && PYTHONPATH=../ pytest -q -- all green.
Corrupt topology/expected/6pe.yml (e.g. flip
advertise_loopback: true -> false),
pytest --basetemp=/tmp/pytest-tmp -k 6pe.
Expect: failure message contains a --- expected / +++ actual
unified diff with hunk headers; no Skipping N identical leading characters / use -vv to show markers; no helper-frame traceback;
/tmp/pytest-tmp/<case>/result.yml and result.diff written;
short summary holds only the first line. Revert.
Corrupt one error-log fixture (e.g. echo q >> errors/<case>.log),
rerun. Expect: error-log mismatch message with diff body;
error_log.actual and error_log.diff in the case tmp_path.
Revert.
Full run with one corruption in place: confirm
find <basetemp> -type d -empty -delete && find <basetemp> -xtype l -delete reduces the layout to the failing case (with its
.diff pair) plus the non-empty inventory-output cases plus
junit.xml.
CI on this branch: t-push and t-pull both green; failure case
(introduced temporarily) uploads a pytest-tmp-py3.X artifact
with the expected contents.

…ailure When a transformation test fails in CI, the GitHub Actions log shows only pytest's truncated assertion diff -- there is no way to retrieve the actual rendered YAML without re-running locally. _report_mismatch in tests/test_transformation.py writes the rendered output and a unified diff to the case's tmp_path on mismatch, routes a diff preview through pytest.fail so the in-terminal failure message is useful on its own (with a verbose-scaled cap so CI logs stay tight while -vv gets the full diff), and CI workflows point pytest at $RUNNER_TEMP/pytest-tmp and upload it on failure via actions/upload-artifact@v7. Two related but distinct choices in the helper: constructing the failure message ourselves (rather than letting `assert result == expected` go through) skips pytest's assertion rewriter, which would otherwise emit an ndiff truncated to ~8 lines of context at default verbosity and flooded with matching lines at -vv. Either pytest.fail or raise AssertionError with a preformatted string achieves that much. pytest.fail is preferred over raise AssertionError because it takes pytrace=False (here pytrace=verbose >= 3) -- a fixture mismatch needs no helper-frame traceback at default verbosity, and -vvv turns it back on for harness debugging. The conftest.py reprcrash trim is paired with the above: pytest 9.x reprints the full pytest.fail message in its short-summary block when it detects CI (CI/BUILD_NUMBER env vars) or runs at -vv, which would duplicate the diff body that _report_mismatch already embeds. Trimming report.longrepr.reprcrash.message to its first line keeps the FAILURES section intact while suppressing the duplicate. The upload step gates on `!cancelled() && steps.pytest.outcome == 'failure'` rather than `failure()` -- an earlier mypy/pip failure would otherwise trigger a misleading prune+upload run on a basetemp that doesn't exist. $RUNNER_TEMP can't move into a jobs.tests.env block because ${{ runner.* }} isn't available outside step context (GitHub rejects with "Unrecognized named-value: 'runner'"). A small "Compute pytest tmp path" step exports PYTEST_TMP to $GITHUB_ENV instead; subsequent steps see it as both $PYTEST_TMP and ${{ env.PYTEST_TMP }}. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

CI: Capture rendered output and unified diff as pytest artifacts#3405

CI: Capture rendered output and unified diff as pytest artifacts#3405
a-v-popov wants to merge 1 commit into
ipspace:devfrom
a-v-popov:test-harness-ci-artifacts-pr

a-v-popov commented May 17, 2026

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

Conversation

a-v-popov commented May 17, 2026

Summary

Why

Constructing the failure message ourselves, and pytest.fail over raise AssertionError

Trimming reprcrash.message

$RUNNER_TEMP basetemp

actions/upload-artifact@v7

Prune step before the artifact upload

!cancelled() && steps.pytest.outcome == 'failure' gating

What changes

Anticipated questions

Test plan

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

Constructing the failure message ourselves, and `pytest.fail` over `raise AssertionError`

Trimming `reprcrash.message`

`$RUNNER_TEMP` basetemp

`actions/upload-artifact@v7`

`!cancelled() && steps.pytest.outcome == 'failure'` gating