Reproducing CI Test Failures Locally

Author(s): Vibhoothi, Urvang Joshi, Krishna Rapaka, Leo Zhao

Table of Contents

Introduction
Prerequisites
Check Stage
- Style check
- PR Check
- Download Test Data
Build Stage
Test Stage
Large tests from Nightly CI
- Background
- Running Large Tests Locally in Parallel

Introduction

This wiki page explains how to reproduce failures seen on AVM GitHub CI locally.

Note that this wiki is intended as a guide, but GitHub workflows should always be considered the source of truth.

Prerequisites

The following tools are required:

[The exact versions used currently by GitHub CI are also noted below.]

cmake 4.0.3
git 2.43.0
perl 5.38.2
nasm 2.16.01
yasm 1.3.0
doxygen 1.9.8
python 3.12.3
cmake-format 0.6.13
clang, clang-tools and clang-format 18.1.8
gcc and g++ 14.2.0
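
To compare a local setup against the list above, it can help to print what is actually installed. The loop below is a minimal sketch: the check_tool helper is hypothetical (not part of AVM), the python3 binary name is an assumption, and the script only reports versions rather than enforcing them.

```shell
# Hypothetical helper: print the first version-looking line a tool reports,
# or a "not found" marker if the tool is not on PATH.
check_tool() {
  tool="$1"
  if ! command -v "$tool" >/dev/null 2>&1; then
    echo "$tool: not found"
    return 1
  fi
  # Some tools (e.g. perl) start their --version output with a blank line,
  # so take the first line that contains a digit.
  echo "$tool: $("$tool" --version 2>&1 | sed -n '/[0-9]/{p;q;}')"
}

# Report every tool from the prerequisites list; missing tools do not abort the loop.
for t in cmake git perl nasm yasm doxygen python3 cmake-format clang clang-format gcc g++; do
  check_tool "$t" || true
done
```

Any line that differs from the versions listed above is a candidate explanation for a failure that only reproduces on CI.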

Check Stage

Style check

This script checks the code style of files included in a PR using clang-format 18 and cmake-format 0.6.13.

[Versions used by CI are also noted in .clang-format and .cmake-format.py]

Ideally, automate code formatting with a pre-commit workflow.

If you need to format code manually, you can use the script below:

#!/usr/bin/env bash
# Usage: avm-fix-style.sh [branch-or-commit] [target-branch]
# Files are formatted in the working tree, so BRANCH_NAME should be checked out.
BRANCH_NAME="${1:-HEAD}"
CLANG_FORMAT_PATH=/usr/lib/llvm-18/bin
TARGET_BRANCH="${2:-upstream/main}"
echo "Checking $BRANCH_NAME against $TARGET_BRANCH"
echo "Applying clang-format ..."
# List files added, copied, modified or renamed in BRANCH_NAME relative to TARGET_BRANCH.
for file in $(git diff --diff-filter=ACMR --name-only "$TARGET_BRANCH" "$BRANCH_NAME" -- "*.[hc]pp" "*.cc" "*.[ch]") ; do
    "${CLANG_FORMAT_PATH}/clang-format" -i --style=file "${file}"
    git add "${file}"
    echo "Formatted file: $file"
done
echo "Done."

CMAKE_FORMAT_PATH=/usr/bin
echo "Applying cmake-format ..."
for file in $(git diff --diff-filter=ACMR --name-only "$TARGET_BRANCH" "$BRANCH_NAME" -- '*.cmake' CMakeLists.txt) ; do
    "${CMAKE_FORMAT_PATH}/cmake-format" -i "${file}"
    git add "${file}"
    echo "Formatted file: $file"
done
echo "Done."

Here:

The first argument (optional) is the branch or commit to format (default: HEAD).
The second argument (optional) is the target branch to compare against (default: upstream/main).

Examples:

bash avm-fix-style.sh research-erp: formats the research-erp changes against upstream/main.
bash avm-fix-style.sh 9dfe24c31b10d6b3295c51693878dd27115d51be: formats the changes of 9dfe24c31b10d6b3295c51693878dd27115d51be against upstream/main.
bash avm-fix-style.sh 9dfe24c31b10d6b3295c51693878dd27115d51be upstream/research-erp: formats the changes of 9dfe24c31b10d6b3295c51693878dd27115d51be against upstream/research-erp.
bash avm-fix-style.sh: formats the HEAD changes against upstream/main.

PR Check

This script performs the following check:

Executable file check: Fails if any of the non-script files in the PR have the executable bit set.
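
A local approximation of this check could look like the sketch below. Both function names are hypothetical, and the set of extensions treated as scripts (.sh, .py, .pl) is an assumption; the GitHub workflow defines the real rule.

```shell
# Assumption: files with these extensions count as scripts and may be executable.
is_script() {
  case "$1" in
    *.sh|*.py|*.pl) return 0 ;;
    *) return 1 ;;
  esac
}

# Print each non-script file that has the executable bit set.
# Returns non-zero if at least one such file was found.
check_exec_bits() {
  _status=0
  for _f in "$@"; do
    if [ -f "$_f" ] && [ -x "$_f" ] && ! is_script "$_f"; then
      echo "executable bit set: $_f"
      _status=1
    fi
  done
  return "$_status"
}
```

It could be run over the files touched by a branch, for example with check_exec_bits $(git diff --name-only upstream/main HEAD), and any reported file fixed with chmod -x.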

Download Test Data

Test data is downloaded as follows and is used in later stages for the unit tests and example tests.

export LIBAVM_TEST_DATA_PATH=/your/path/here
cmake -B avm_testdata -DCMAKE_POLICY_VERSION_MINIMUM=3.5
cmake --build avm_testdata --target testdata

This will take some time depending on your internet connection. avm_testdata is ~718MB.

Build Stage

Build AVM in Various Configurations

The CI compiles the AVM codebase in a number of different configurations.

To reproduce a particular compilation locally:

(1) Note the CMake Flags and Extra CMake Flags in the log of a particular job. Then set environment variables CMAKE_FLAGS and EXTRA_CMAKE_FLAGS accordingly.

For example, a log may have:

CMake Flags:       -DCMAKE_POLICY_VERSION_MINIMUM=3.5 -DENABLE_CCACHE=1 -DENABLE_WERROR=1 -DENABLE_DOCS=0 -DCONFIG_INSPECTION=1 -DCONFIG_ACCOUNTING=1
Extra CMake Flags: -DAVM_TARGET_CPU=generic

In that case, we set:

CMAKE_FLAGS="-DCMAKE_POLICY_VERSION_MINIMUM=3.5 -DENABLE_CCACHE=1 -DENABLE_WERROR=1 -DENABLE_DOCS=0 -DCONFIG_INSPECTION=1 -DCONFIG_ACCOUNTING=1"
EXTRA_CMAKE_FLAGS="-DAVM_TARGET_CPU=generic"

(2) Compile as follows:

cmake -B avm_build -S . -DCMAKE_BUILD_TYPE=Release $CMAKE_FLAGS $EXTRA_CMAKE_FLAGS
cmake --build avm_build -j 2
cmake --build avm_build --target dist
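
If the job log has been saved to a file, step (1) can be scripted. This is a minimal sketch: flags_from_log is a hypothetical helper and job.log a hypothetical file name, and it assumes each relevant line starts with the label exactly as in the excerpt above (raw downloaded logs may carry a timestamp prefix that would need stripping first).

```shell
# Hypothetical helper: print the text that follows "<label>:" on the first
# line of the log that starts with that label.
flags_from_log() {
  _label="$1"
  _logfile="$2"
  sed -n "s/^${_label}:[[:space:]]*//p" "$_logfile" | head -n 1
}

# Example usage (job.log is a hypothetical saved log):
# CMAKE_FLAGS=$(flags_from_log "CMake Flags" job.log)
# EXTRA_CMAKE_FLAGS=$(flags_from_log "Extra CMake Flags" job.log)
```

Because the pattern is anchored at the start of the line, the "CMake Flags" label does not accidentally match the "Extra CMake Flags" line.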

Build Documentation

# Force doxygen warnings to be treated as errors.
sed -i 's/WARN_AS_ERROR\s\+= NO/WARN_AS_ERROR        = YES/g'  libs.doxy_template
cmake -B avm_docs_build -DCMAKE_POLICY_VERSION_MINIMUM=3.5 -DENABLE_DOCS=1 -DENABLE_EXAMPLES=0 -DBUILD_SHARED_LIBS=0 -DCMAKE_BUILD_TYPE=Debug
cmake --build avm_docs_build --target docs

Build Examples

cmake -B avm_example_build -DCMAKE_POLICY_VERSION_MINIMUM=3.5 -DCMAKE_BUILD_TYPE=RelWithDebInfo -DENABLE_EXAMPLES=1 -DBUILD_SHARED_LIBS=0
cmake --build avm_example_build

Build Unit Tests with Sanitizers

In AVM CI, the build stage uses 5 types of sanitizers:

address
undefined
integer
thread
memory

Build with `address` / `undefined` / `integer` / `thread` Sanitizers:

Firstly, select the desired sanitizer:

AVM_SANITIZER_TYPE=address  # or other sanitizers above

Then, build AVM using the desired sanitizer:

export PATH="/usr/lib/llvm-18/bin:${PATH}"
SANITIZER_BLACKLIST_FLAG="-fsanitize-blacklist=${PWD}/.gitlab/SanitizerIgnores.txt"

CMAKE_FLAGS="-DCMAKE_POLICY_VERSION_MINIMUM=3.5 -DENABLE_CCACHE=1"
CMAKE_FLAGS="$CMAKE_FLAGS -DCMAKE_C_COMPILER=clang -DCMAKE_CXX_COMPILER=clang++"
CMAKE_FLAGS="$CMAKE_FLAGS -DSANITIZE=${AVM_SANITIZER_TYPE}"
CMAKE_FLAGS="$CMAKE_FLAGS -DAVM_EXTRA_C_FLAGS=${SANITIZER_BLACKLIST_FLAG}"
CMAKE_FLAGS="$CMAKE_FLAGS -DAVM_EXTRA_CXX_FLAGS=${SANITIZER_BLACKLIST_FLAG}"

cmake -B "avm_build/$AVM_SANITIZER_TYPE" -DCMAKE_BUILD_TYPE=RelWithDebInfo $CMAKE_FLAGS
cmake --build "avm_build/$AVM_SANITIZER_TYPE"

Build with `memory` Sanitizer (Special):

The memory sanitizer additionally requires that some dependencies (libc++ and libc++abi) are themselves built with the memory sanitizer. It also requires a C-only build (no assembly code).

Here is a script to build with the memory sanitizer:

# Assumes LLVM 18.1.8 is installed on the machine.
VERSION_MAJOR=18
VERSION="${VERSION_MAJOR}.1.8"
export PATH="/usr/lib/llvm-${VERSION_MAJOR}/bin:${PATH}"

# Build cxx and cxxabi with msan.
if [[ ! -d llvm-project ]]; then
  git clone --depth=1 --branch llvmorg-${VERSION}  https://github.com/llvm/llvm-project
fi
LLVM_MSAN_BUILD_DIR=${PWD}/llvm-project/msan_out
rm -rf "${LLVM_MSAN_BUILD_DIR}"
cmake -G Ninja -S llvm-project/runtimes -B "${LLVM_MSAN_BUILD_DIR}" \
        -DCMAKE_BUILD_TYPE=Release \
        -DLLVM_ENABLE_RUNTIMES="libcxx;libcxxabi;libunwind" \
        -DCMAKE_C_COMPILER=clang \
        -DCMAKE_CXX_COMPILER=clang++ \
        -DLLVM_USE_SANITIZER=MemoryWithOrigins \
        -DLIBCXXABI_USE_LLVM_UNWINDER=OFF
cmake --build "${LLVM_MSAN_BUILD_DIR}" --config Release -- cxx cxxabi unwind

# Build AVM with msan.
AVM_MSAN_BUILD_DIR=avm_build/memory
rm -rf "${AVM_MSAN_BUILD_DIR}"
MSAN_CFLAGS="-fsanitize=memory -fsanitize-memory-track-origins=2 -fno-omit-frame-pointer -g -O2 -I${LLVM_MSAN_BUILD_DIR}/include"
MSAN_CXXFLAGS="${MSAN_CFLAGS} -I${LLVM_MSAN_BUILD_DIR}/include/c++/v1 -stdlib=libc++"
MSAN_LD_FLAGS="${MSAN_CXXFLAGS} -L${LLVM_MSAN_BUILD_DIR}/lib -lc++abi"

export CFLAGS="${MSAN_CFLAGS}"
export CXXFLAGS="${MSAN_CXXFLAGS}"
export LDFLAGS="${MSAN_LD_FLAGS}"
cmake -B "${AVM_MSAN_BUILD_DIR}" -DAVM_TARGET_CPU=generic -DCMAKE_C_COMPILER=clang -DCMAKE_CXX_COMPILER=clang++
cmake --build "${AVM_MSAN_BUILD_DIR}" -j64

Test Stage

Run Unit Tests with Sanitizers

This stage runs the unit tests under different sanitizers to check for any issues.

(1) Build using the desired sanitizer as described in Build Unit Tests with Sanitizers.

(2) Set test data path if not set already:

export LIBAVM_TEST_DATA_PATH=/your/path/here

(3) Set common sanitizer options:

SANITIZER_OPTIONS=handle_segv=1:handle_abort=1:handle_sigfpe=1:fast_unwind_on_fatal=1:allocator_may_return_null=1

(4) Set specific options for desired sanitizer:

address sanitizer:

SANITIZER_OPTIONS="${SANITIZER_OPTIONS}:detect_stack_use_after_return=1"
SANITIZER_OPTIONS="${SANITIZER_OPTIONS}:max_uar_stack_size_log=17"
export ASAN_OPTIONS="${SANITIZER_OPTIONS}"

thread sanitizer:

TSAN_OPTIONS="handle_sigfpe=1"
TSAN_OPTIONS="$TSAN_OPTIONS handle_segv=1"
TSAN_OPTIONS="$TSAN_OPTIONS handle_abort=1"
export TSAN_OPTIONS

undefined / integer sanitizer:

SANITIZER_OPTIONS="${SANITIZER_OPTIONS}:print_stacktrace=1"
SANITIZER_OPTIONS="${SANITIZER_OPTIONS}:report_error_type=1"
SANITIZER_OPTIONS="${SANITIZER_OPTIONS}:suppressions=.gitlab/UBSan.supp"
export UBSAN_OPTIONS="${SANITIZER_OPTIONS}"

memory sanitizer:

export LD_LIBRARY_PATH="${LD_LIBRARY_PATH}:${LLVM_MSAN_BUILD_DIR}/lib"

(5) Set unit test filter: From GitHub CI log, find the unit test that failed / had sanitizer errors and set unit test filter to run that particular test.

filter=<name_of_failing_unit_test>

(6) Finally, run the unit tests under sanitizer as follows:

./avm_build/${AVM_SANITIZER_TYPE}/test_libavm --gtest_filter="${filter}" 2> >(tee /tmp/sanitizer.log >&2)

The errors/warnings can be found in the /tmp/sanitizer.log file.
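
Sanitizer logs can be long, so it may help to pull out only the headline lines. The sketch below uses a hypothetical sanitizer_summary helper; it assumes the usual report formats, namely "SUMMARY: <Name>Sanitizer: ..." lines from the sanitizer runtimes and "runtime error:" diagnostics from the undefined / integer sanitizers.

```shell
# Hypothetical helper: print the headline lines of sanitizer reports in a log.
# Exits non-zero (grep's status) when no such line is present.
sanitizer_summary() {
  grep -E 'SUMMARY: [A-Za-z]+Sanitizer:|runtime error:' "$1"
}

# Example usage:
# sanitizer_summary /tmp/sanitizer.log
```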

Example Tests

Unlike unit tests, this stage tests binaries (such as avmenc and avmdec) directly with different command-line options.

To reproduce:

(1) Get the test data as described in the Download Test Data section.

(2) Build the examples as described in the Build Examples section.

(3) Run the tests as follows:

cd avm_example_build
export LIBAVM_TEST_DATA_PATH=/your/path/here
TEST_NAME=failing_test_name_here

sh ../test/examples.sh \
        --bin-path examples \
        --verbose \
        --show-program-output \
        --filter "\b${TEST_NAME}\b"

Encoding Comparison

This stage compares MD5 of the encoded results between:

Previous Encode (base commit of the PR) and
Current Encode (last commit of the current PR)

to ensure that a PR doesn't unintentionally change the encoding output.

If there is a STATS_CHANGED keyword in any of the commit messages of the PR, then the script will assume that encoding output change is intentional, and will exit gracefully.
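
Whether a branch already carries the keyword can be checked before pushing. This is a minimal sketch with a hypothetical helper name; how CI matches the keyword exactly is not stated here, so a plain substring match is assumed.

```shell
# Hypothetical helper: read commit messages on stdin and succeed if any of
# them contains the STATS_CHANGED keyword (assumption: substring match).
stats_change_declared() {
  grep -q 'STATS_CHANGED'
}

# Example usage (upstream/main as the PR base is an assumption):
# git log --format=%B upstream/main..HEAD | stats_change_declared \
#   && echo "Encoding output change is declared as intentional"
```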

The previous and current encodes are performed on the clip Vertical_Bayshore_270x480_2997.y4m, which can be downloaded as below:

curl -s -S -f -L -O https://github.com/AOMediaCodec/aom-testing/raw/master/test-files/Vertical_Bayshore_270x480_2997.y4m.xz
unxz Vertical_Bayshore_270x480_2997.y4m.xz

The clip is encoded in 3 configurations as noted below, where ${AVMENC} is the path to the avmenc binary under test:

1. All-Intra:

${AVMENC} --debug --cpu-used=0 --passes=1 --end-usage=q --qp=210 --kf-min-dist=0 --kf-max-dist=0 --use-fixed-qp-offsets=1 --deltaq-mode=0 --enable-tpl-model=0 --enable-keyframe-filtering=0 --psnr --obu --limit=30 -o all-intra.obu Vertical_Bayshore_270x480_2997.y4m 2>&1 | tee all-intra.psnr.log

2. Random-Access:

${AVMENC} --debug --cpu-used=0 --passes=1 --lag-in-frames=19 --auto-alt-ref=1 --min-gf-interval=16 --max-gf-interval=16 --gf-min-pyr-height=4 --gf-max-pyr-height=4 --kf-min-dist=65 --kf-max-dist=65 --use-fixed-qp-offsets=1 --deltaq-mode=0 --enable-tpl-model=0 --end-usage=q --qp=210 --enable-keyframe-filtering=0 --obu --limit=30 -o random-access.obu Vertical_Bayshore_270x480_2997.y4m 2>&1 | tee "random-access.psnr.log"

3. Low-Delay:

${AVMENC} --debug --cpu-used=0 --passes=1 --lag-in-frames=0 --min-gf-interval=16 --max-gf-interval=16 --gf-min-pyr-height=4 --gf-max-pyr-height=4 --kf-min-dist=9999 --kf-max-dist=9999 --use-fixed-qp-offsets=1 --deltaq-mode=0 --enable-tpl-model=0 --end-usage=q --qp=210 --subgop-config-str=ld --enable-keyframe-filtering=0 --obu --limit=30 -o low-delay.obu Vertical_Bayshore_270x480_2997.y4m 2>&1 | tee low-delay.psnr.log
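
Once the same command has been run with the binary built from the base commit and with the binary built from the PR, the comparison itself can be sketched as below. The same_md5 helper and the prev/ and curr/ directory names are hypothetical; the CI script may organise its outputs differently.

```shell
# Hypothetical helper: succeed when two files have identical MD5 digests.
same_md5() {
  _a=$(md5sum < "$1" | cut -d ' ' -f 1)
  _b=$(md5sum < "$2" | cut -d ' ' -f 1)
  [ "$_a" = "$_b" ]
}

# Example usage (prev/ holds the base-commit encode, curr/ the PR encode):
# same_md5 prev/random-access.obu curr/random-access.obu \
#   || echo "MISMATCH: random-access"
```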

Static Analysis

Static analysis is performed using the scan-build tool from LLVM, in shallow and deep modes.

Reproducing locally

Static analysis can be reproduced locally as follows:

export ANALYZER_MODE=shallow # OR export ANALYZER_MODE=deep
export PATH="/usr/lib/llvm-18/bin:${PATH}"
mkdir -p "output-${ANALYZER_MODE}"
# Wrapper that applies the common excludes and output options, then forwards its arguments.
scan_build() {
  scan-build --exclude third_party --exclude _deps --exclude abseil-cpp --exclude benchmark \
    --exclude cpuinfo --exclude eigen --exclude farmhash --exclude fft2d --exclude flatbuffers \
    --exclude flatbuffers-flatc --exclude fp16 --exclude FP16 --exclude FXdiv --exclude psimd \
    --exclude ml_dtypes --exclude neon2sse --exclude protobuf --exclude pthreadpool \
    --exclude pthreadpool-source --exclude ruy --exclude xnnpack --exclude XNNPACK --exclude googletest \
    -o "output-${ANALYZER_MODE}" -analyzer-config mode=${ANALYZER_MODE} "$@"
}
scan_build cmake -B avm_build -GNinja -DCMAKE_POLICY_VERSION_MINIMUM=3.5 -DCMAKE_BUILD_TYPE=Debug
scan_build --status-bugs cmake --build avm_build
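
After the run, the number of findings can be counted from the output directory. The sketch below assumes scan-build's usual layout, in which each finding is written as a report-*.html file somewhere under the -o directory; count_reports is a hypothetical helper.

```shell
# Hypothetical helper: count scan-build HTML reports under a directory.
# Assumption: one report-*.html file is written per finding.
count_reports() {
  find "$1" -type f -name 'report-*.html' | wc -l | tr -d ' '
}

# Example usage:
# echo "Findings: $(count_reports "output-${ANALYZER_MODE}")"
```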

Large tests from Nightly CI

Background

AVM uses 2 types of CI jobs:

CI that runs on every PR: this only runs small tests that provide reasonable coverage but don't take too long to run. This is achieved by setting the gtest filter to -*Large*.
CI that runs nightly (on the tip-of-tree main branch): this runs all the unit tests, including the large tests. To run only the large tests locally, use the gtest filter *Large*.

Running Large Tests Locally in Parallel

Assuming you have a machine with many cores, you can run the large tests locally with sharding (parallel execution) as follows:

export GTEST_TOTAL_SHARDS=50  # parallel runs
seq 0 $(( GTEST_TOTAL_SHARDS - 1 )) | \
xargs -P 0 -I {} sh -c 'GTEST_SHARD_INDEX={} ./test_libavm --gtest_filter="*Large*" 2>&1 | tee /tmp/shard_{}.log'

Console log of each shard will be stored in: /tmp/shard_*.log
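
With many shards it is easy to miss a failure in the interleaved console output. A minimal sketch for finding the shards that failed is shown below; failed_shards is a hypothetical helper that relies on the "[  FAILED  ]" marker GoogleTest prints for failing tests.

```shell
# Hypothetical helper: print the names of the log files that contain at least
# one GoogleTest "[  FAILED  ]" line. Exits non-zero when none do.
failed_shards() {
  grep -l -F '[  FAILED  ]' "$@"
}

# Example usage:
# failed_shards /tmp/shard_*.log
```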
