116 changes: 116 additions & 0 deletions .github/workflows/benchmarks.yml
@@ -0,0 +1,116 @@
name: Benchmarks

on:
workflow_dispatch:
inputs:
suite:
description: "Benchmark suite to run"
required: false
default: "all"
type: choice
options:
- all
- wal_benchmarks
- eventbus_benchmarks
- serialization_benchmarks
- system_benchmarks
- process_collector_benchmarks

# Restrict permissions to minimum required
permissions:
contents: read

defaults:
run:
shell: bash

env:
CARGO_TERM_COLOR: always
CI: true

jobs:
benchmarks:
runs-on: ubuntu-latest
timeout-minutes: 15
steps:
- uses: actions/checkout@de0fac2e4500dabe0009e67214ff5f5447ce83dd # v6.0.2

- uses: jdx/mise-action@5228313ee0372e111a38da051671ca30fc5a96db # v3.6.3
with:
install: true
cache: true
github_token: ${{ secrets.GITHUB_TOKEN }}

- name: Restore baseline benchmarks
uses: actions/cache/restore@0057852bfaa89a56745cba8c7296529d2fc39830 # v4.3.0
with:
path: target/criterion
key: criterion-baseline-${{ runner.os }}-${{ hashFiles('rust-toolchain.toml', 'Cargo.lock') }}

- name: Run benchmarks
env:
BENCH_SUITE: ${{ inputs.suite }}
run: |
set -o pipefail
if [ "$BENCH_SUITE" = "all" ]; then
mise x -- cargo bench --package procmond 2>&1 | tee bench-output.txt
else
mise x -- cargo bench --package procmond --bench "$BENCH_SUITE" 2>&1 | tee bench-output.txt
fi

- name: Check for performance regression
# CI runners have variable performance; use a generous 20% threshold
# to avoid false positives while still catching significant regressions.
run: |
if grep -q "Performance has regressed" bench-output.txt; then
echo "::warning::Performance regression detected in benchmarks"
grep -A2 "Performance has regressed" bench-output.txt
if grep -oP 'change: \+\K[0-9.]+' bench-output.txt | awk '{if ($1 > 20.0) exit 1}'; then
echo "All regressions within 20% threshold (CI runner noise tolerance)"
else
echo "::error::Benchmark regression exceeds 20% threshold"
exit 1
fi
else
echo "No performance regressions detected"
fi

- name: Save baseline benchmarks
uses: actions/cache/save@0057852bfaa89a56745cba8c7296529d2fc39830 # v4.3.0
if: github.ref == 'refs/heads/main'
with:
path: target/criterion
key: criterion-baseline-${{ runner.os }}-${{ hashFiles('rust-toolchain.toml', 'Cargo.lock') }}

- name: Upload benchmark results
uses: actions/upload-artifact@ea165f8d65b6e75b540449e92b4886f43607fa02 # v4.6.2
if: always()
with:
name: benchmark-results
path: bench-output.txt
retention-days: 30

load-tests:
runs-on: ubuntu-latest
timeout-minutes: 10
steps:
- uses: actions/checkout@de0fac2e4500dabe0009e67214ff5f5447ce83dd # v6.0.2

- uses: jdx/mise-action@5228313ee0372e111a38da051671ca30fc5a96db # v3.6.3
with:
install: true
cache: true
github_token: ${{ secrets.GITHUB_TOKEN }}

- name: Run load tests
run: |
set -o pipefail
NO_COLOR=1 TERM=dumb mise x -- cargo test --package procmond --test load_tests -- --ignored --nocapture 2>&1 | tee load-test-output.txt

- name: Upload load test results
uses: actions/upload-artifact@ea165f8d65b6e75b540449e92b4886f43607fa02 # v4.6.2
if: always()
with:
name: load-test-results
path: load-test-output.txt
retention-days: 30
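
Both `run` steps in this workflow pipe their output through `tee`, which is why the explicit `set -o pipefail` matters: without it, a pipeline reports only the status of its last command. The sketch below illustrates the difference. `failing_bench` is a hypothetical stand-in for `cargo bench`, not something from the repository.

```shell
#!/usr/bin/env bash
# Sketch: how `set -o pipefail` changes the exit status of `cmd | tee file`.
# `failing_bench` is a hypothetical stand-in for a failing `cargo bench` run.

failing_bench() {
  echo "benchmark output"
  return 3
}

out="$(mktemp)"

# Without pipefail, the pipeline reports the status of tee, which succeeds.
set +o pipefail
failing_bench 2>&1 | tee "$out" >/dev/null
status_without=$?

# With pipefail, the pipeline reports the rightmost non-zero status.
set -o pipefail
failing_bench 2>&1 | tee "$out" >/dev/null
status_with=$?

echo "without pipefail: $status_without"
echo "with pipefail: $status_with"
rm -f "$out"
```

With the option enabled, a failed benchmark or load test run fails the step instead of being masked by `tee`.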
60 changes: 0 additions & 60 deletions .github/workflows/ci.yml
@@ -129,63 +129,3 @@ jobs:
with:
token: ${{ secrets.QLTY_COVERAGE_TOKEN }}
files: target/lcov.info

benchmarks:
runs-on: ubuntu-latest
needs: test
steps:
- uses: actions/checkout@de0fac2e4500dabe0009e67214ff5f5447ce83dd # v6.0.2
with:
fetch-depth: 0

- uses: jdx/mise-action@5228313ee0372e111a38da051671ca30fc5a96db # v3.6.3
with:
install: true
cache: true
github_token: ${{ secrets.GITHUB_TOKEN }}

- name: Restore baseline benchmarks
uses: actions/cache/restore@0057852bfaa89a56745cba8c7296529d2fc39830 # v4.3.0
with:
path: target/criterion
key: criterion-baseline-${{ runner.os }}

- name: Run benchmarks
run: mise x -- cargo bench --package procmond 2>&1 | tee bench-output.txt

- name: Check for performance regression
run: |
# Criterion reports "regressed" when performance degrades beyond noise threshold.
# Fail CI if any benchmark regresses more than 10%.
if grep -q "Performance has regressed" bench-output.txt; then
echo "::warning::Performance regression detected in benchmarks"
grep -A2 "Performance has regressed" bench-output.txt
if grep -oP 'change: \+\K[0-9.]+' bench-output.txt | awk '{if ($1 > 10.0) exit 1}'; then
echo "All regressions within 10% threshold"
else
echo "::error::Benchmark regression exceeds 10% threshold"
exit 1
fi
else
echo "No performance regressions detected"
fi

- name: Save baseline benchmarks
uses: actions/cache/save@0057852bfaa89a56745cba8c7296529d2fc39830 # v4.3.0
if: github.ref == 'refs/heads/main'
with:
path: target/criterion
key: criterion-baseline-${{ runner.os }}

- name: Run load tests
run: NO_COLOR=1 TERM=dumb mise x -- cargo test --package procmond --test load_tests -- --ignored --nocapture 2>&1 | tee load-test-output.txt

- name: Upload benchmark results
uses: actions/upload-artifact@ea165f8d65b6e75b540449e92b4886f43607fa02 # v4.6.2
if: always()
with:
name: benchmark-results
path: |
bench-output.txt
load-test-output.txt
retention-days: 30
3 changes: 3 additions & 0 deletions docs/src/contributing.md
@@ -125,6 +125,9 @@ just fmt
# Run benchmarks
just bench

# Run procmond benchmarks
just bench-procmond

# Generate documentation
just docs

113 changes: 81 additions & 32 deletions docs/src/testing.md
@@ -336,17 +336,18 @@ async fn test_full_system_workflow() {

## Performance Testing

### Automated CI Benchmarks
### Automated Benchmarks

DaemonEye's CI pipeline includes automated performance benchmarking to detect regressions:
DaemonEye provides a dedicated benchmarking workflow for performance testing:

- **Automatic Execution**: Performance benchmarks run on every CI build using Criterion
- **Regression Detection**: Tests automatically detect performance regressions with a 10% threshold
- **Manual Trigger**: Performance benchmarks are triggered manually via workflow_dispatch
- **Configurable Suites**: Select which benchmark suite to run ("all", "performance_benchmarks", or "process_collector_benchmarks")
- **Regression Detection**: Tests detect performance regressions and log warnings for review
- **Baseline Comparison**: Benchmark results are cached and compared against baseline from the main branch
- **Load Testing**: Automated load tests validate system behavior under stress
- **Results Archival**: Benchmark results are uploaded as artifacts with 30-day retention
- **Load Testing**: Automated load tests validate system behavior under stress in a separate job
- **Results Archival**: Benchmark and load test results are uploaded as artifacts with 30-day retention

Developers can access benchmark results from the GitHub Actions workflow artifacts. If a performance regression exceeds the 10% threshold, the CI build will fail with a detailed error message showing which benchmarks regressed.
Developers can access benchmark results from the GitHub Actions workflow artifacts. Performance regressions are logged as warnings but do not fail the build, allowing for manual review and assessment.
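
The suite selection described above can be sketched as a small shell helper that maps a suite name to the command the "Run benchmarks" step issues. This is an illustration, not the workflow's own code: `bench_command` is a hypothetical name, the accepted suite names are taken from the `options` list in `.github/workflows/benchmarks.yml`, and the `mise x --` prefix is left out for brevity.

```shell
#!/usr/bin/env bash
# Sketch: map a suite name to the cargo command the workflow would run.
# `bench_command` is a hypothetical helper; suite names mirror the
# workflow_dispatch input options in .github/workflows/benchmarks.yml.

bench_command() {
  local suite="${1:-all}"
  case "$suite" in
    all)
      echo "cargo bench --package procmond"
      ;;
    wal_benchmarks|eventbus_benchmarks|serialization_benchmarks|system_benchmarks|process_collector_benchmarks)
      echo "cargo bench --package procmond --bench $suite"
      ;;
    *)
      echo "unknown suite: $suite" >&2
      return 1
      ;;
  esac
}

all_cmd="$(bench_command all)"
wal_cmd="$(bench_command wal_benchmarks)"
# A name that is not in the workflow's option list is rejected.
if bench_command performance_benchmarks 2>/dev/null; then removed_ok=yes; else removed_ok=no; fi

echo "$all_cmd"
echo "$wal_cmd"
echo "performance_benchmarks accepted: $removed_ok"
```

Assuming the GitHub CLI is installed and authenticated, the same choice can be passed when dispatching the workflow, for example `gh workflow run benchmarks.yml -f suite=wal_benchmarks`.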

### Load Testing

@@ -664,9 +665,13 @@ impl TestDataManager {

## Continuous Integration

### GitHub Actions Workflow
### GitHub Actions Workflows

The CI pipeline includes multiple jobs that run on every build:
DaemonEye uses two separate GitHub Actions workflows for testing:

#### Main CI Workflow (`.github/workflows/ci.yml`)

The main CI pipeline runs on every push and pull request:

```yaml
name: Tests
@@ -750,10 +755,32 @@ jobs:
files: lcov.info
fail_ci_if_error: false
token: ${{ secrets.CODECOV_TOKEN }}
```

#### Benchmarks Workflow (`.github/workflows/benchmarks.yml`)

The benchmarks workflow is triggered manually and runs independently:

```yaml
name: Benchmarks

on:
workflow_dispatch:
inputs:
suite:
description: "Benchmark suite to run"
required: false
default: "all"
type: choice
options:
- all
- performance_benchmarks
- process_collector_benchmarks
Comment on lines +764 to +778
Copilot AI Mar 9, 2026

The benchmarking workflow example in the docs still lists performance_benchmarks as a selectable suite, but that bench was removed and the actual workflow input options are now wal_benchmarks, eventbus_benchmarks, serialization_benchmarks, system_benchmarks, and process_collector_benchmarks. Please update the options list (and any surrounding text) to match the real .github/workflows/benchmarks.yml input choices so readers can run the documented suites successfully.

jobs:
benchmarks:
runs-on: ubuntu-latest
needs: test
timeout-minutes: 15
steps:
- uses: actions/checkout@v6
with:
@@ -772,22 +799,20 @@ jobs:
key: criterion-baseline-${{ runner.os }}

- name: Run benchmarks
run: mise x -- cargo bench --package procmond 2>&1 | tee
bench-output.txt
env:
BENCH_SUITE: ${{ inputs.suite }}
run: |
if [ "$BENCH_SUITE" = "all" ]; then
mise x -- cargo bench --package procmond 2>&1 | tee bench-output.txt
else
mise x -- cargo bench --package procmond --bench "$BENCH_SUITE" 2>&1 | tee bench-output.txt
fi

- name: Check for performance regression
run: |
# Criterion reports "regressed" when performance degrades beyond noise threshold.
# Fail CI if any benchmark regresses more than 10%.
if grep -q "Performance has regressed" bench-output.txt; then
echo "::warning::Performance regression detected in benchmarks"
grep -A2 "Performance has regressed" bench-output.txt
if grep -oP 'change: \+\K[0-9.]+' bench-output.txt | awk '{if ($1 > 10.0) exit 1}'; then
echo "All regressions within 10% threshold"
else
echo "::error::Benchmark regression exceeds 10% threshold"
exit 1
fi
else
echo "No performance regressions detected"
fi
@@ -799,40 +824,64 @@ jobs:
path: target/criterion
key: criterion-baseline-${{ runner.os }}

- name: Upload benchmark results
uses: actions/upload-artifact@v4
if: always()
with:
name: benchmark-results
path: bench-output.txt
retention-days: 30

load-tests:
runs-on: ubuntu-latest
timeout-minutes: 10
steps:
- uses: actions/checkout@v6

- uses: jdx/mise-action@v3
with:
install: true
cache: true
github_token: ${{ secrets.GITHUB_TOKEN }}

- name: Run load tests
run: NO_COLOR=1 TERM=dumb mise x -- cargo test --package procmond --test
load_tests -- --ignored --nocapture 2>&1 | tee load-test-output.txt

- name: Upload benchmark results
- name: Upload load test results
uses: actions/upload-artifact@v4
if: always()
with:
name: benchmark-results
path: |
bench-output.txt
load-test-output.txt
name: load-test-results
path: load-test-output.txt
retention-days: 30
```

### CI Jobs Overview

The CI pipeline includes the following jobs:
The main CI pipeline includes the following jobs:

1. **quality**: Runs code formatting and linting checks
2. **test**: Executes the full test suite with all features enabled
3. **test-cross-platform**: Tests on Ubuntu, macOS, and Windows
4. **coverage**: Generates and uploads code coverage reports
5. **benchmarks**: Runs performance benchmarks with regression detection

The benchmarks workflow includes two independent jobs:

1. **benchmarks**: Runs performance benchmarks with configurable suite selection (15-minute timeout)
2. **load-tests**: Runs load tests under stress conditions (10-minute timeout)

### Accessing Benchmark Results

Benchmark results are available in multiple ways:
Benchmark results are available through the dedicated benchmarks workflow:

- **Workflow Artifacts**: Download `benchmark-results` artifacts from the GitHub Actions workflow summary page
- **CI Logs**: View benchmark output directly in the workflow logs under the "Run benchmarks" step
- **Performance Alerts**: If a regression exceeds 10%, the CI build will fail with a warning annotation showing which benchmarks regressed
- **Manual Trigger**: Navigate to the Actions tab and select the "Benchmarks" workflow, then choose "Run workflow" to trigger manually
- **Suite Selection**: Choose which benchmark suite to run: "all" (default), "performance_benchmarks", or "process_collector_benchmarks"
- **Workflow Artifacts**: Download `benchmark-results` and `load-test-results` artifacts from the workflow summary page
- **CI Logs**: View benchmark output directly in the workflow logs
- **Performance Alerts**: Regressions are logged as warnings for manual review without failing the workflow
Copilot AI Mar 9, 2026

The docs say benchmark regressions are "logged as warnings but do not fail the build/workflow", but the actual .github/workflows/benchmarks.yml step exits non-zero when regressions exceed the threshold (currently 20%). Update the documentation to reflect the current behavior (or adjust the workflow to match the documented non-failing behavior), otherwise this will surprise people relying on the docs.

Suggested change
- **Performance Alerts**: Regressions are logged as warnings for manual review without failing the workflow
- **Performance Alerts**: Significant regressions (currently >20% slowdown versus the `main` baseline) cause the benchmarks job to fail, while smaller regressions are logged as warnings for manual review
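
The threshold behavior discussed in this comment can be sketched in shell as follows. This is a hedged illustration rather than the workflow's actual step: `max_regression` and `within_threshold` are hypothetical helpers, the sample report is made up, and the sketch assumes Criterion prints each change as a bracketed triple (`change: [low% mid% high%]`) followed by a verdict line. If that assumption holds, a pattern such as `change: \+\K[0-9.]+` would need to account for the `[` before the first number.

```shell
#!/usr/bin/env bash
# Sketch only: the sample text is hypothetical and assumes Criterion prints
# change lines as `change: [low% mid% high%]` before the verdict line.

# Print the largest midpoint change (percent) among benchmarks flagged with
# "Performance has regressed"; print 0.00 if none were flagged.
max_regression() {
  awk '
    /change:/ {
      line = $0
      sub(/.*\[/, "", line)      # drop everything up to the opening bracket
      sub(/\].*/, "", line)      # drop everything after the closing bracket
      gsub(/[+%]/, "", line)     # strip plus and percent signs
      split(line, parts, " ")
      mid = parts[2] + 0         # midpoint estimate of the change
    }
    /Performance has regressed/ { if (mid > max) max = mid }
    END { printf "%.2f\n", max + 0 }
  ' "$1"
}

# Exit status 0 when the worst flagged regression is within the limit.
within_threshold() {
  awk -v worst="$(max_regression "$1")" -v limit="$2" \
    'BEGIN { exit !(worst <= limit) }'
}

sample="$(mktemp)"
cat > "$sample" <<'EOF'
wal_append              time:   [1.0 us 1.1 us 1.2 us]
                        change: [+20.1% +25.5% +30.2%] (p = 0.00 < 0.05)
                        Performance has regressed.
eventbus_publish        time:   [2.0 us 2.1 us 2.2 us]
                        change: [-1.0% +0.5% +2.0%] (p = 0.40 > 0.05)
                        No change in performance detected.
EOF

worst="$(max_regression "$sample")"
echo "worst flagged regression: ${worst}%"
if within_threshold "$sample" 20; then verdict="pass"; else verdict="fail"; fi
echo "verdict at 20% threshold: $verdict"
rm -f "$sample"
```

Only changes that Criterion itself flags as regressions count toward the limit here, so a positive change inside the noise band does not fail the job. Whether the real step should behave this way is a design choice for the maintainers.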

The `benchmarks` job stores baseline results from the `main` branch and compares all subsequent runs against this baseline to detect performance regressions.
The benchmarks workflow stores baseline results from the `main` branch and compares all subsequent runs against this baseline to detect performance regressions.

### Test Reporting

4 changes: 4 additions & 0 deletions justfile
@@ -186,6 +186,10 @@ test-security:
bench:
@{{ mise_exec }} cargo bench --workspace

# Run procmond benchmarks (WAL, EventBus, process collection, serialization)
bench-procmond:
@{{ mise_exec }} cargo bench -p procmond

# Run specific benchmark suites
bench-process:
@{{ mise_exec }} cargo bench -p daemoneye-lib --bench process_collection