Bash Coding Style Guide

This guide documents the bash coding conventions and patterns used throughout the libp2p test framework. All examples are derived from actual code in ./lib, ./perf, ./transport, and ./hole-punch.

File Structure and Shebang
Variable Naming Conventions
Function Definitions
Quoting and Safety
Array Operations with readarray
Command Substitution: Subshells vs Direct Calls
Name References (local -n)
Error Handling
Conditional Expressions
String Operations
File Locking
Parallelization

File Structure and Shebang

Standard Header

All bash scripts start with a shebang and a descriptive comment:

#!/bin/bash
# Brief description of what this script does

Example (from lib/lib-filter-engine.sh):

#!/bin/bash
# Common filter engine for test/baseline/relay/router filtering
# Provides recursive alias expansion with loop detection, proper inversion, and deduplication

Set Options

Use set to configure bash behavior at the top of scripts:

set -ueo pipefail

-u: Treat expansion of an unset variable as an error
-e: Exit when a command fails (use carefully; often omitted in main scripts)
-o pipefail: A pipeline fails if any command in it fails, not only the last one

Example (from perf/lib/generate-tests.sh):

#!/bin/bash
# Generate test matrix from ${IMAGES_YAML} with filtering

set -ueo pipefail

trap 'echo "ERROR in generate-tests.sh at line $LINENO: Command exited with status $?" >&2' ERR
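
The effect of -o pipefail is easiest to see side by side. The following is a minimal sketch, not taken from the framework; the variable names are illustrative only:

```bash
#!/bin/bash
# Without pipefail, a pipeline reports the status of its last command (cat).
set +o pipefail
false | cat
status_without=$?

# With pipefail, the failure of `false` becomes the pipeline's status.
set -o pipefail
false | cat
status_with=$?

echo "without pipefail: ${status_without}"
echo "with pipefail:    ${status_with}"
```

The sketch deliberately leaves -e off so that the failing pipeline does not end the script before the status can be inspected.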

Script Directory Detection

Get the directory containing the current script:

_this_script_dir="$(cd "$(dirname "${BASH_SOURCE[0]}")" && pwd)"

Example (from lib/lib-filter-engine.sh:7-8):

if ! type indent &>/dev/null; then
  _this_script_dir="$(cd "$(dirname "${BASH_SOURCE[0]}")" && pwd)"
  source "${_this_script_dir}/lib-output-formatting.sh"
fi

Variable Naming Conventions

Case Conventions

SCREAMING_SNAKE_CASE for:

Global variables
Environment variables
Constants
Configuration values

TEST_IGNORE="${TEST_IGNORE:-}"
IMAGES_YAML="${IMAGES_YAML:-./images.yaml}"
CACHE_DIR="${CACHE_DIR:-/srv/cache}"
WORKER_COUNT="${WORKER_COUNT:-1}"
DEBUG="${DEBUG:-false}"

Note: The test framework uses get_cpu_count() from lib/lib-host-os.sh for cross-platform CPU detection (macOS uses sysctl, Linux/WSL uses nproc).

snake_case for:

Local variables
Function parameters
Loop variables

local test_name="rust-v0.56 x rust-v0.56 (tcp, noise, yamux)"
local dialer_id="rust-v0.56"
local listener_id="rust-v0.56"

Underscore Prefixes

Use leading underscore for:

Private/internal functions
Internal/temporary variables

_resolve_alias() {
  local alias_name="${1}"
  # ...
}

_this_script_dir="$(cd "$(dirname "${BASH_SOURCE[0]}")" && pwd)"

Variable Initialization with Defaults

Use parameter expansion for defaults:

# Simple default
IMAGES_YAML="${IMAGES_YAML:-./images.yaml}"

# Command substitution default
WORKER_COUNT="${WORKER_COUNT:-$(get_cpu_count)}"

# Empty string default (variable may not be set)
TEST_IGNORE="${TEST_IGNORE:-}"

Example (from lib/lib-common-init.sh:23-33):

# Files
IMAGES_YAML="${IMAGES_YAML:-${TEST_ROOT}/images.yaml}"

# Paths
CACHE_DIR="${CACHE_DIR:-/srv/cache}"
TEST_RUN_DIR="${TEST_RUN_DIR:-${CACHE_DIR}/test-run}"

# Common filtering variables
TEST_IGNORE="${TEST_IGNORE:-}"
TRANSPORT_IGNORE="${TRANSPORT_IGNORE:-}"
SECURE_IGNORE="${SECURE_IGNORE:-}"
MUXER_IGNORE="${MUXER_IGNORE:-}"
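
The colon in ${VAR:-default} matters: with the colon, an empty value also triggers the default; without it, only an unset variable does. A short sketch using a made-up variable, DEMO_SETTING:

```bash
#!/bin/bash
unset DEMO_SETTING
from_unset="${DEMO_SETTING:-fallback}"   # unset -> default is used

DEMO_SETTING=""
from_empty="${DEMO_SETTING:-fallback}"   # set but empty -> default is used
dash_empty="${DEMO_SETTING-fallback}"    # no colon: empty value is kept

DEMO_SETTING="custom"
from_set="${DEMO_SETTING:-fallback}"     # non-empty -> value is kept

printf 'unset=%s empty=%s dash=%s set=%s\n' \
  "${from_unset}" "${from_empty}" "${dash_empty:-<empty>}" "${from_set}"
```

The ${VAR:-} form used for the *_IGNORE variables is the same mechanism with an empty default, which keeps set -u from failing on unset variables.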

Array Variables

Arrays use the same naming conventions. Declare empty arrays with () or populate them with readarray:

# Global arrays
readarray -t FAILED_TESTS < <(...)
all_image_ids=()
filtered_image_ids=()

# Local arrays
local -a result_parts=()
local parts=()

Function Definitions

Function Declaration Style

Define functions with the name followed by parentheses (no function keyword):

# Good
compute_test_key() {
  local test_name="$1"
  # ...
}

# Avoid
function compute_test_key {
  # ...
}

Function Documentation

Document complex functions with comments:

# Compute cache key for the test run from images.yaml + any other parameters
#
# Args:
#   $1: images_yaml - Path to images.yaml file
#   $@: Additional parameters to include in hash
# Returns:
#   8-character hexadecimal string
# Usage:
#   TEST_RUN_KEY=$(compute_test_run_key "$IMAGES_YAML" "$TEST_IGNORE" "$DEBUG")
compute_test_run_key() {
  local images_yaml="$1"
  shift
  # ...
}

Example (from lib/lib-test-caching.sh:11-34):

# Compute cache key for the test run from images.yaml + any other parameters
#
# Usage:
# compute_test_run_key "images.yaml"
compute_test_run_key() {
  local images_yaml="$1"
  shift

  # 1. Load contents of $images_yaml file
  local contents=$(<"${images_yaml}")

  # 2. Remaining arguments joined with '|'
  local args
  if (( $# == 0 )); then
    args=""
  else
    args=$(printf '%s\n' "$@" | paste -sd '|' -)
  fi

  # 3. Calculate the hash of both
  local hash=$(printf '%s' "${contents}${args}" | sha256sum | cut -d ' ' -f1)

  echo "${hash:0:8}"
}

Function Parameters

Always assign positional parameters to local variables:

my_function() {
  local param1="$1"
  local param2="$2"
  local optional_param="${3:-default_value}"

  # Function body
}

Use shift to handle variable arguments:

compute_test_run_key() {
  local images_yaml="$1"
  shift  # Remove first argument, $@ now contains remaining args

  # Process remaining arguments
  local args=$(printf '%s\n' "$@" | paste -sd '|' -)
}
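
To try the shift pattern in isolation, here is a small self-contained sketch. The function join_with_prefix is hypothetical and exists only for illustration; it is not part of the framework:

```bash
#!/bin/bash
# Hypothetical helper: first argument is a prefix, the rest are joined with '|'.
join_with_prefix() {
  local prefix="$1"
  shift  # $@ now holds only the remaining arguments

  local joined=""
  if (( $# > 0 )); then
    joined=$(printf '%s\n' "$@" | paste -sd '|' -)
  fi

  echo "${prefix}:${joined}"
}

result_many=$(join_with_prefix "key" "a" "b c" "d")
result_none=$(join_with_prefix "key")

echo "${result_many}"
echo "${result_none}"
```

Quoting "$@" keeps "b c" as a single argument, so it stays one field in the joined result.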

Quoting and Safety

Always Quote Variables

Rule: Quote all variable expansions unless you explicitly want word splitting:

# Good
docker build -t "${image_name}" "${build_path}"
if [ -f "${COMPOSE_FILE}" ]; then
  rm "${COMPOSE_FILE}"
fi

# Bad (can break with spaces)
docker build -t $image_name $build_path
if [ -f $COMPOSE_FILE ]; then
  rm $COMPOSE_FILE
fi

Special Cases: Don't Quote When Word Splitting Is Intended

Some cases require unquoted variables for word splitting:

# Intentionally unquoted for word splitting
for transport in ${common_transports}; do
  # common_transports is space-separated: "tcp ws quic-v1"
done

# Docker compose command (WARNING: must not be quoted!)
${DOCKER_COMPOSE_CMD} -f "${COMPOSE_FILE}" up
# DOCKER_COMPOSE_CMD might be "docker compose" or "podman-compose"

Example (from perf/lib/run-single-test.sh:142-145):

# WARNING: Do NOT put quotes around this because the command has two parts
if timeout "${TEST_TIMEOUT}" ${DOCKER_COMPOSE_CMD} -f "${COMPOSE_FILE}" up \
  --exit-code-from dialer --abort-on-container-exit >> "${LOG_FILE}" 2>&1; then
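
For comparison, a common alternative for multi-word commands is to store them in an array, which lets the expansion stay quoted. The sketch below is not how the framework stores DOCKER_COMPOSE_CMD; it uses printf as a stand-in for a two-word command:

```bash
#!/bin/bash
# String form: works only because the unquoted expansion is split into words.
cmd_string="printf %s"
from_string=$(${cmd_string} "hello world")

# Array form: each element is exactly one word, so the expansion can be quoted.
cmd_array=(printf '%s')
from_array=$("${cmd_array[@]}" "hello world")

echo "string form: ${from_string}"
echo "array form:  ${from_array}"
```

Both forms produce the same result here. The array form mainly helps when a command word itself could contain spaces or glob characters.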

Array Operations with readarray

What is readarray?

readarray (a synonym for mapfile, available in Bash 4.0 and later) reads lines from stdin into an array. It is the simplest way to capture command output into an array, one line per element.

Syntax:

readarray -t ARRAY_NAME < <(command)

-t: Remove trailing newlines from each line
< <(...): Process substitution (creates a file descriptor from command output)

Basic Usage

Example 1: Get all implementation IDs from YAML:

readarray -t all_image_ids < <(get_entity_ids "implementations")

This is equivalent to:

all_image_ids=()
while IFS= read -r line; do
  all_image_ids+=("$line")
done < <(get_entity_ids "implementations")

But much more concise!
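
readarray also keeps each line intact, which an unquoted array assignment does not. A small sketch, where list_names is a hypothetical stand-in for get_entity_ids:

```bash
#!/bin/bash
# Hypothetical stand-in that emits one entry per line; entries contain spaces.
list_names() {
  printf '%s\n' "rust v0.56" "go v0.45" "js v3.x"
}

# One array element per line.
readarray -t names < <(list_names)

# Unquoted command substitution splits on every space, not only on newlines.
split_names=( $(list_names) )

echo "readarray elements:     ${#names[@]}"
echo "word-splitting elements: ${#split_names[@]}"
```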

Example 2: Get failed test names (from perf/run.sh:738):

readarray -t FAILED_TESTS < <(
  yq eval '.tests[] | select(.status == "fail") | .name' "${TEST_PASS_DIR}/results.yaml" 2>/dev/null || true
)

# Now iterate through failed tests
for test_name in "${FAILED_TESTS[@]}"; do
  echo "  ✗ ${test_name}"
done

Multiple readarray Calls

Multiple readarray calls are often used to load separate data sets:

Example (from perf/run.sh:455-474):

# Load selected baseline tests
readarray -t selected_baseline_tests < <(
  get_entity_ids "baselines" "${TEST_PASS_DIR}/test-matrix.yaml"
)

# Load ignored baseline tests
readarray -t ignored_baseline_tests < <(
  get_entity_ids "ignoredBaselines" "${TEST_PASS_DIR}/test-matrix.yaml"
)

# Load selected main tests
readarray -t selected_main_tests < <(
  get_entity_ids "tests" "${TEST_PASS_DIR}/test-matrix.yaml"
)

# Load ignored main tests
readarray -t ignored_main_tests < <(
  get_entity_ids "ignoredTests" "${TEST_PASS_DIR}/test-matrix.yaml"
)

# Now we have 4 separate arrays we can work with

Checking Array Length

After using readarray, check whether the array is empty:

readarray -t selected_tests < <(get_entity_ids "tests")

if [ "${#selected_tests[@]}" -eq 0 ]; then
  echo "No tests selected"
  exit 0
fi

echo "Running ${#selected_tests[@]} tests..."
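
The emptiness check relies on process substitution. Feeding readarray from a here-string behaves differently when the command prints nothing, as this sketch shows (emit_nothing is a made-up function):

```bash
#!/bin/bash
# Hypothetical command that produces no output at all.
emit_nothing() { :; }

# Process substitution: no lines are read, so the array has zero elements.
readarray -t via_procsub < <(emit_nothing)

# Here-string: bash appends a newline, so one empty element is read.
readarray -t via_herestring <<< "$(emit_nothing)"

echo "process substitution: ${#via_procsub[@]} element(s)"
echo "here-string:          ${#via_herestring[@]} element(s)"
```

With the here-string form, a length check would wrongly report one selected test.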

Iterating Arrays from readarray

# By index
for ((i=0; i<${#selected_tests[@]}; i++)); do
  echo "Test $i: ${selected_tests[$i]}"
done

# By value
for test_name in "${selected_tests[@]}"; do
  echo "Running: ${test_name}"
done

Command Substitution: Subshells vs Direct Calls

Direct Function Call

When: Function modifies current shell state (variables, working directory, etc.)

Syntax: Just call the function

# Direct call - function runs in current shell
init_common_variables

# Variables set by the function are available
echo "${IMAGES_YAML}"  # Set by init_common_variables

Example (from perf/run.sh:93-94):

# Initialize common variables
init_common_variables

# Variables are now set in this shell
echo "${IMAGES_YAML}"  # Available!

Subshell with Command Substitution

When: Capture function output without affecting current shell

Syntax: VAR=$(function_name args)

# Subshell - function runs in child process
TEST_RUN_KEY=$(compute_test_run_key "$IMAGES_YAML" "$TEST_IGNORE")

# Everything the function writes to stdout is captured (trailing newlines are
# stripped); any variables it sets are lost when the subshell exits

Example (from lib/lib-test-caching.sh:15-34):

compute_test_run_key() {
  local images_yaml="$1"
  shift

  local contents=$(<"${images_yaml}")
  local args=$(printf '%s\n' "$@" | paste -sd '|' -)
  local hash=$(printf '%s' "${contents}${args}" | sha256sum | cut -d ' ' -f1)

  # This echo is what gets captured
  echo "${hash:0:8}"
}

# Usage - captures the echo output
TEST_RUN_KEY=$(compute_test_run_key "$IMAGES_YAML" "$TEST_IGNORE")
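
The difference between the two call styles shows up as soon as a function both prints output and changes a variable. A minimal sketch with a hypothetical counter function:

```bash
#!/bin/bash
COUNTER=0

# Hypothetical function that changes shell state AND prints a value.
next_id() {
  COUNTER=$((COUNTER + 1))
  echo "id-${COUNTER}"
}

# Command substitution: the increment happens in a subshell and is discarded.
captured=$(next_id)
after_subshell=${COUNTER}

# Direct call: the increment happens in the current shell and persists.
next_id > /dev/null
after_direct=${COUNTER}

echo "captured=${captured} after_subshell=${after_subshell} after_direct=${after_direct}"
```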

Nested Command Substitution

Command substitutions can be nested, and a captured value can feed later substitutions, as in this loop:

# Get count of tests
TEST_COUNT=$(yq eval '.tests | length' "${TEST_PASS_DIR}/test-matrix.yaml")

# Get test name using captured count
for ((i=0; i<TEST_COUNT; i++)); do
  test_name=$(yq eval ".tests[${i}].id" "${TEST_PASS_DIR}/test-matrix.yaml")
  echo "Test $i: ${test_name}"
done
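
For literal nesting, one $(...) can sit inside another, and quotes inside each level are independent of the outer ones. A short sketch with an illustrative path:

```bash
#!/bin/bash
# Illustrative path only; nothing is read from disk.
config_path="/srv/cache/test-run/results.yaml"

# Inner substitution yields the directory; outer one takes its last component.
parent_name=$(basename "$(dirname "${config_path}")")

echo "${parent_name}"
```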

Command Substitution in Here-Documents

Example (from transport/run.sh:594-599):

cat > "${TEST_PASS_DIR}/results.yaml" <<EOF
metadata:
  testPass: ${TEST_PASS_NAME}
  startedAt: $(date -d @"${TEST_START_TIME}" -u +%Y-%m-%dT%H:%M:%SZ 2>/dev/null || date -r "${TEST_START_TIME}" -u +%Y-%m-%dT%H:%M:%SZ)
  completedAt: $(date -u +%Y-%m-%dT%H:%M:%SZ)
  duration: ${TEST_DURATION}s
  platform: $(uname -m)
  os: $(uname -s)
  workerCount: ${WORKER_COUNT}
EOF

Process Substitution

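Process substitution differs from command substitution: instead of capturing output as a string, it exposes the command's output as a file-like path that another command can read: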

# Process substitution: <(command)
# Creates /dev/fd/N that reads from command output
readarray -t test_ids < <(get_entity_ids "tests")

# Useful for multiple inputs to a command
diff <(sort file1.txt) <(sort file2.txt)

Why use < <(...) instead of $(...) for readarray?

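# Wrong - readarray reads from stdin; here the command's output is word-split
# and passed to readarray as arguments instead
readarray -t tests $(get_entity_ids "tests")

# Right - process substitution gives readarray a file to read on stdin
readarray -t tests < <(get_entity_ids "tests")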
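
Piping into readarray is another form that looks plausible but fails in a script: with default settings each side of a pipeline runs in a subshell, so the array is filled there and then discarded. A sketch, with produce as a made-up command:

```bash
#!/bin/bash
# Hypothetical command that emits three lines.
produce() {
  printf '%s\n' one two three
}

# Pipe: readarray runs in a subshell, so the parent's array stays empty.
piped=()
produce | readarray -t piped
piped_count=${#piped[@]}

# Process substitution: readarray runs in the current shell.
readarray -t direct < <(produce)
direct_count=${#direct[@]}

echo "pipe: ${piped_count}, process substitution: ${direct_count}"
```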

Name References (local -n)

What are Name References?

local -n creates a name reference (similar to a pointer) to another variable. It is useful for passing arrays to functions and for avoiding global variables. Name references require Bash 4.3 or later.

Syntax:

local -n ref_name="variable_name"

Basic Example

Example (from lib/lib-filter-engine.sh:35-69):

_resolve_alias() {
  local alias_name="${1}"
  local -n processed_aliases_ref="${2}"  # Name reference to string (space-separated list)
  local -n value_ref="${3}"              # Name reference to string

  # Modify the referenced variable directly
  value_ref="${ALIASES[${alias_name}]:-}"

  # Append to the referenced string
  processed_aliases_ref="${processed_aliases_ref} ${alias_name}"

  return 0
}

# Usage
processed_aliases=""
resolved_value=""
_resolve_alias "rust" processed_aliases resolved_value

# processed_aliases and resolved_value are now modified
echo "Processed: ${processed_aliases}"
echo "Value: ${resolved_value}"

Array Name References

Pass arrays to functions by reference:

Example (from lib/lib-filter-engine.sh:71-75):

_expand_recursive() {
  local filter_string="${1}"
  local -n all_names_ref="${2}"        # Array reference
  local -n processed_aliases_ref="${3}"  # String reference
  local -n result_parts_ref="${4}"      # Array reference

  # Can read from referenced arrays
  for name in "${all_names_ref[@]}"; do
    # ...
  done

  # Can append to referenced arrays
  result_parts_ref+=("new_element")
}

# Usage
all_names=("rust-v0.56" "go-v0.45")
processed_aliases=""
result_parts=()

_expand_recursive "~rust" all_names processed_aliases result_parts

# result_parts array is now modified
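
A complete, runnable version of the same idea: the sketch below appends to the caller's array through a name reference, skipping duplicates. append_unique is a hypothetical helper, not a framework function. Note that the reference name (target_ref) differs from the caller's variable name (ids); using the same name on both sides produces a circular reference.

```bash
#!/bin/bash
# Hypothetical helper: append a value to the caller's array unless present.
append_unique() {
  local -n target_ref="$1"   # refers to the caller's array
  local candidate="$2"
  local existing

  for existing in "${target_ref[@]}"; do
    if [[ "${existing}" == "${candidate}" ]]; then
      return 0               # already present, nothing to do
    fi
  done

  target_ref+=("${candidate}")
}

ids=()
append_unique ids "rust-v0.56"
append_unique ids "go-v0.45"
append_unique ids "rust-v0.56"   # duplicate, ignored

echo "${#ids[@]} unique ids: ${ids[*]}"
```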

Why Use Name References?

Without name references - need global variables or complex return handling:

# Bad - uses global variable
RESULT=""

get_value() {
  RESULT="some value"  # Modifies global
}

get_value
echo "${RESULT}"

With name references - clean parameter passing:

# Good - uses name reference
get_value() {
  local -n result_ref="${1}"
  result_ref="some value"  # Modifies caller's variable
}

my_result=""
get_value my_result
echo "${my_result}"

Common Pattern: Multiple Return Values

Example:

get_test_info() {
  local test_index="${1}"
  local -n dialer_ref="${2}"
  local -n listener_ref="${3}"
  local -n transport_ref="${4}"

  dialer_ref=$(yq eval ".tests[${test_index}].dialer.id" test-matrix.yaml)
  listener_ref=$(yq eval ".tests[${test_index}].listener.id" test-matrix.yaml)
  transport_ref=$(yq eval ".tests[${test_index}].transport" test-matrix.yaml)
}

# Usage
dialer=""
listener=""
transport=""
get_test_info 0 dialer listener transport

echo "Test: ${dialer} x ${listener} (${transport})"
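
The same pattern can be tried without yq or a test matrix. This sketch assumes a simplified "dialer x listener" name format and a made-up function, split_test_name:

```bash
#!/bin/bash
# Hypothetical helper: split "A x B" into two caller-supplied variables.
split_test_name() {
  local test_name="$1"
  local -n dialer_out="$2"
  local -n listener_out="$3"

  dialer_out="${test_name%% x *}"     # everything before the first " x "
  listener_out="${test_name##* x }"   # everything after the last " x "
}

dialer=""
listener=""
split_test_name "rust-v0.56 x go-v0.45" dialer listener

echo "dialer=${dialer} listener=${listener}"
```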

Error Handling

Exit Codes

0 = Success
Non-zero = Error

if docker build -t "${image_name}" "${build_path}"; then
  echo "Build successful"
else
  echo "Build failed"
  return 1
fi

Safe Command Execution

Under set -e, append || true to commands whose non-zero exit status is expected:

# Don't exit if grep finds nothing
FAILED=$(grep -c "status: fail" results.yaml || true)

# Don't exit if file doesn't exist
readarray -t TESTS < <(yq eval '.tests[]' results.yaml 2>/dev/null || true)
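
The grep case deserves a closer look: grep -c prints 0 when nothing matches, but it also exits with status 1, which would end a script running under set -e. A self-contained sketch using a temporary file in place of results.yaml:

```bash
#!/bin/bash
set -e

# Temporary stand-in for results.yaml with no failing entries.
results_file=$(mktemp)
printf 'status: pass\nstatus: pass\n' > "${results_file}"

# grep -c prints "0" and exits 1 here; || true keeps the script alive.
fail_count=$(grep -c "status: fail" "${results_file}" || true)

rm -f "${results_file}"
echo "failures: ${fail_count}"
```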

Trap for Error Reporting

Example (from perf/lib/generate-tests.sh:10):

trap 'echo "ERROR in generate-tests.sh at line $LINENO: Command exited with status $?" >&2' ERR
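
The ERR trap above reports failures. For removing temporary files, an EXIT trap is the usual counterpart. The following is a minimal sketch, not taken from the framework. It runs inside a command substitution only so that the effect can be observed afterwards; in a real script the trap would sit at top level.

```bash
#!/bin/bash
# The command substitution is a subshell, so its EXIT trap fires when the
# subshell finishes, whether it succeeded or failed.
work_dir_path=$(
  work_dir=$(mktemp -d)
  trap 'rm -rf "${work_dir}"' EXIT
  touch "${work_dir}/scratch.txt"   # pretend to do some work
  echo "${work_dir}"
)

if [ -d "${work_dir_path}" ]; then
  cleanup_status="left behind"
else
  cleanup_status="removed"
fi
echo "temporary directory ${cleanup_status}"
```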

Conditional Expressions

File and String Tests

if [ -f "${file_path}" ]; then    # File exists
if [ -d "${dir_path}" ]; then     # Directory exists
if [ -r "${file_path}" ]; then    # File is readable
if [ -w "${file_path}" ]; then    # File is writable
if [ -z "${string}" ]; then       # String is empty
if [ -n "${string}" ]; then       # String is not empty

Numeric Comparisons

if [ "${count}" -eq 0 ]; then     # Equal
if [ "${count}" -ne 0 ]; then     # Not equal
if [ "${count}" -gt 0 ]; then     # Greater than
if [ "${count}" -lt 10 ]; then    # Less than
if [ "${count}" -ge 5 ]; then     # Greater or equal
if [ "${count}" -le 20 ]; then    # Less or equal

String Comparisons

if [ "${status}" = "pass" ]; then      # String equal
if [ "${status}" != "fail" ]; then     # String not equal
if [ "${status}" == "pass" ]; then     # Also works (bash-specific)

Pattern Matching

# Case statement
case "${source_type}" in
  local)
    build_from_local "$YAML_FILE"
    ;;
  github)
    build_from_github "$YAML_FILE"
    ;;
  browser)
    build_browser_image "$YAML_FILE"
    ;;
  *)
    echo "Unknown source type: ${source_type}"
    return 1
    ;;
esac

# Regex matching
if [[ "${part}" =~ ^!~(.+)$ ]]; then
  # Captured group in ${BASH_REMATCH[1]}
  local alias_name="${BASH_REMATCH[1]}"
fi
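
A runnable sketch of the regex form follows. classify_part and its output labels are hypothetical; only the !~ pattern comes from the example above. Storing each pattern in a variable is a common way to avoid quoting surprises inside [[ ... =~ ... ]].

```bash
#!/bin/bash
# Hypothetical helper: label a filter part by its prefix.
classify_part() {
  local part="$1"
  local inverted_re='^!~(.+)$'
  local alias_re='^~(.+)$'

  if [[ "${part}" =~ ${inverted_re} ]]; then
    echo "inverted-alias:${BASH_REMATCH[1]}"
  elif [[ "${part}" =~ ${alias_re} ]]; then
    echo "alias:${BASH_REMATCH[1]}"
  else
    echo "literal:${part}"
  fi
}

classify_part '!~rust'
classify_part '~rust'
classify_part 'go-v0.45'
```

The pattern variable must stay unquoted on the right-hand side of =~; quoting it would make bash match it as a literal string.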

Boolean Variables

Use strings "true" and "false", not 0/1:

DEBUG="${DEBUG:-false}"

if [ "${DEBUG}" = "true" ]; then
  echo "Debug mode enabled"
fi

String Operations

Concatenation

# Simple concatenation
full_name="${first_name} ${last_name}"

# Building paths
image_name="${TEST_TYPE}-implementations-${impl_id}"

Substring Extraction

# First 8 characters
short_hash="${hash:0:8}"

# Remove prefix
part="${part#\\}"  # Remove leading backslash

# Remove suffix
filename="${path%.*}"  # Remove extension

String Replacement

# Replace first occurrence
new_string="${old_string/search/replace}"

# Replace all occurrences
new_string="${old_string//search/replace}"

# Example: slug from test name
TEST_SLUG=$(echo "${TEST_NAME}" | sed 's/[^a-zA-Z0-9-]/_/g')
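
The slug can also be built with parameter expansion alone, which avoids starting a sed process. A sketch comparing both forms on a sample test name; it assumes ASCII input, since character ranges can behave differently in other locales:

```bash
#!/bin/bash
TEST_NAME="rust-v0.56 x rust-v0.56 (tcp, noise, yamux)"

# sed form, as used above
slug_sed=$(echo "${TEST_NAME}" | sed 's/[^a-zA-Z0-9-]/_/g')

# Pure-bash form: // replaces every character matching the bracket pattern
slug_bash="${TEST_NAME//[^a-zA-Z0-9-]/_}"

echo "${slug_sed}"
echo "${slug_bash}"
```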

Trimming

# Remove leading/trailing whitespace
# (note: xargs also collapses internal whitespace and interprets quotes and backslashes)
trimmed=$(echo "${string}" | xargs)
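
When internal whitespace must be preserved, trimming can be done with parameter expansion instead. The trim function below is a hypothetical helper shown as one possible approach, not something the framework provides:

```bash
#!/bin/bash
# Hypothetical helper: strip leading and trailing whitespace only.
trim() {
  local value="$1"
  value="${value#"${value%%[![:space:]]*}"}"   # drop leading whitespace
  value="${value%"${value##*[![:space:]]}"}"   # drop trailing whitespace
  printf '%s' "${value}"
}

string="  a   b  "
trimmed_bash=$(trim "${string}")
trimmed_xargs=$(echo "${string}" | xargs)

echo "bash:  [${trimmed_bash}]"
echo "xargs: [${trimmed_xargs}]"
```

The pure-bash version keeps the three spaces between a and b, while the xargs version collapses them to one.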

File Locking

Why File Locking?

When multiple processes write to the same file, use flock to prevent corruption.

Basic Pattern

(
  flock -x 200  # Exclusive lock on fd 200
  # Critical section - only one process at a time
  echo "data" >> shared_file.txt
) 200>/tmp/lockfile.lock
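
To see the basic pattern working, the sketch below starts four background writers that each append two lines under the lock, then checks that no pair was split by another writer. It assumes the flock utility is installed; the temporary files and the write_pair function are illustrative only.

```bash
#!/bin/bash
out_file=$(mktemp)
lock_file=$(mktemp)

# Hypothetical worker: writes two lines that must stay adjacent.
write_pair() {
  local id="$1"
  (
    flock -x 200
    echo "worker ${id} start" >> "${out_file}"
    echo "worker ${id} done" >> "${out_file}"
  ) 200>"${lock_file}"
}

for id in 1 2 3 4; do
  write_pair "${id}" &
done
wait

# Read the file two lines at a time and count pairs from the same worker.
intact_pairs=0
while read -r _ first_id _ && read -r _ second_id _; do
  if [[ "${first_id}" == "${second_id}" ]]; then
    intact_pairs=$((intact_pairs + 1))
  fi
done < "${out_file}"

line_count=$(( $(wc -l < "${out_file}") ))
rm -f "${out_file}" "${lock_file}"

echo "${line_count} lines, ${intact_pairs} intact pairs"
```

The order of the workers varies between runs, but each start line is always followed by the matching done line.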

Real Example: Parallel Test Result Writing

Example (from perf/lib/run-single-test.sh:196-207):

# Multiple tests running in parallel, all writing to same file
(
  flock -x 200
  cat >> "${RESULTS_FILE}" <<EOF
  - name: ${TEST_NAME}
    dialer: ${DIALER_ID}
    listener: ${LISTENER_ID}
    status: $([ "${EXIT_CODE}" -eq 0 ] && echo "pass" || echo "fail")
    duration: ${TEST_DURATION}s
EOF
) 200>/tmp/results.lock

File Lock Pattern: Message Printing

Example (from transport/run.sh:534-537):

# Serialize the message printing using flock (prevents interleaved output)
(
  flock -x 200
  print_message "[$((index + 1))/${TEST_COUNT}] ${name}...${result}"
) 200>/tmp/transport-test-output.lock

Why? Without locking, parallel processes would interleave output:

Test 1...Test 2...
[SUCCESS]Test 3...
[FAILED][SUCCESS]

With locking:

Test 1... [SUCCESS]
Test 2... [FAILED]
Test 3... [SUCCESS]

Parallelization

Pattern 1: Sequential Execution

Use case: Perf tests (accurate measurements require sequential execution)

Pattern:

WORKER_COUNT=1

for ((i=0; i<TEST_COUNT; i++)); do
  bash "${SCRIPT_DIR}/run-single-test.sh" "${i}"
done

Example (from perf/run.sh:607-625):

for ((i=0; i<TEST_COUNT; i++)); do
  # Check for shutdown
  if [ "${SHUTDOWN}" == "true" ]; then
    break
  fi

  # Get test name
  test_name=$(yq eval ".tests[${i}].id" "${TEST_PASS_DIR}/test-matrix.yaml")

  # Show progress
  if [ "${DEBUG:-false}" == "true" ]; then
    print_message "[$((i + 1))/${TEST_COUNT}] ${test_name}..."
  else
    echo_message "[$((i + 1))/${TEST_COUNT}] ${test_name}..."
  fi

  # Run test
  if bash "${SCRIPT_DIR}/run-single-test.sh" "${i}" "tests" "${TEST_RESULTS_FILE}"; then
    echo "[SUCCESS]"
  else
    echo "[FAILED]"
  fi
done

Pattern 2: Parallel with xargs

Use case: Transport tests (maximize throughput)