feat: add comprehensive load testing infrastructure #154
Open
sysid wants to merge 4 commits into
3cc3f76 to 21b0ed9
Add exhaustive load testing to detect memory leaks, validate watcher deduplication at scale, and prevent performance regressions.

Test coverage (15 tests):
- Memory stability: leak detection, baseline return, event set cleanup
- Watcher scale: single watcher verification, rapid connect/disconnect
- Throughput: single/multi-client, TTFE, inter-event latency
- Shutdown: graceful termination with active connections
- Backpressure: slow client isolation, connection churn, send_timeout

Infrastructure:
- Docker-based test server with /metrics endpoint (psutil)
- testcontainers fixtures with health check wait strategies
- httpx-sse + asyncio.gather() for concurrent SSE clients
- Manual GitHub Actions workflow (workflow_dispatch)

New dependencies in dev group: httpx-sse
New Makefile target: test-load
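The concurrent-client pattern named above (asyncio.gather() fanning out many SSE consumers and recording time to first event) can be sketched roughly as follows. This is not the PR's code: `fake_event_stream`, `run_client`, `run_load`, and the constant values are hypothetical, and the stream is a stdlib stand-in so the sketch runs without a server. The actual tests presumably read events through httpx-sse instead.

```python
import asyncio
import time

# Hypothetical values, not taken from the PR.
NUM_CLIENTS = 20
EVENTS_PER_CLIENT = 3


async def fake_event_stream(n_events: int, delay: float = 0.01):
    """Stand-in for an SSE connection: yields n_events with a small delay each."""
    for i in range(n_events):
        await asyncio.sleep(delay)
        yield f"event-{i}"


async def run_client(client_id: int) -> dict:
    """Consume one stream and record the time to first event (TTFE)."""
    start = time.monotonic()
    ttfe = None
    received = 0
    async for _event in fake_event_stream(EVENTS_PER_CLIENT):
        if ttfe is None:
            ttfe = time.monotonic() - start
        received += 1
    return {"client_id": client_id, "ttfe": ttfe, "received": received}


async def run_load(num_clients: int) -> list[dict]:
    # gather() schedules every client on the same event loop and
    # returns their results in submission order.
    return await asyncio.gather(*(run_client(i) for i in range(num_clients)))


results = asyncio.run(run_load(NUM_CLIENTS))
total_events = sum(r["received"] for r in results)
```

Because all clients share one event loop, a slow consumer in this sketch does not block the others, which is the property the backpressure tests are described as checking against the real server.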
Add structured metrics collection, baseline management, and HTML/JSON reporting to load tests. Tests now produce observable performance data instead of just pass/fail results.

New components:
- MetricsCollector: aggregates latency, memory, throughput samples
- BaselineManager: stores/loads per-test baselines, detects regressions
- ReportGenerator: produces JSON reports and self-contained HTML with inline SVG charts

All 15 load tests updated with:
- Metrics collection integration
- Structured docstrings explaining what/why/how for each test
- Baseline comparison and optional regression detection

CLI options added: --update-baseline, --fail-on-regression, --output-dir, --baselines-dir, --regression-threshold

GitHub workflow updated with baseline update and regression detection inputs, plus artifact upload for reports.
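One way to picture baseline comparison with a regression threshold is the sketch below. It is an illustration only: `load_baseline`, `find_regressions`, the metric names, and the JSON layout are assumptions, not BaselineManager's actual interface, and it treats every metric as "higher is worse" (a throughput metric would need the opposite sign).

```python
import json
import tempfile
from pathlib import Path


def load_baseline(path: Path) -> dict:
    """Read a per-test baseline file; an absent file means no baseline yet."""
    return json.loads(path.read_text()) if path.exists() else {}


def find_regressions(current: dict, baseline: dict, threshold: float = 0.10) -> dict:
    """Return metrics whose value grew by more than `threshold` (relative).

    Assumes higher values are worse (latency, memory).
    """
    regressions = {}
    for name, base in baseline.items():
        value = current.get(name)
        if value is None or base <= 0:
            continue  # nothing to compare against
        change = (value - base) / base
        if change > threshold:
            regressions[name] = round(change, 3)
    return regressions


# Hypothetical baseline file and metric names.
with tempfile.TemporaryDirectory() as tmp:
    baseline_file = Path(tmp) / "test_throughput.json"
    baseline_file.write_text(json.dumps({"p95_latency_ms": 100.0, "rss_mb": 200.0}))
    baseline = load_baseline(baseline_file)

current = {"p95_latency_ms": 125.0, "rss_mb": 205.0}
regressions = find_regressions(current, baseline, threshold=0.10)
```

With these numbers only the latency metric exceeds the 10% threshold; a flag such as --fail-on-regression would then decide whether a non-empty result fails the test run.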
Remove --scale and --duration CLI options from load tests. Each test now defines its own parameters as constants, allowing appropriate values per test type (e.g., shutdown tests need fewer connections).

Changes:
- conftest.py: remove --scale, --duration options and fixtures
- metrics.py: compute duration_minutes internally from actual duration
- test_*.py: add explicit NUM_CLIENTS, DURATION_SEC, etc. constants
- README.md: update CLI options documentation
- load-test.yml: remove scale/duration workflow inputs
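A minimal sketch of the idea, assuming module-level constants per test file and a duration derived from measured timestamps rather than from a CLI option. The constant values and the `duration_minutes` helper are hypothetical; the PR's metrics.py may compute this differently.

```python
import time

# Per-test parameters live next to the test instead of coming from CLI flags.
# Hypothetical values, shortened so the sketch finishes quickly.
NUM_CLIENTS = 10
DURATION_SEC = 0.05


def duration_minutes(started_at: float, finished_at: float) -> float:
    """Derive the run length in minutes from the measured timestamps."""
    return (finished_at - started_at) / 60.0


started = time.monotonic()
time.sleep(DURATION_SEC)  # stand-in for the actual load phase
finished = time.monotonic()
minutes = duration_minutes(started, finished)
```

Deriving the duration from what actually elapsed keeps per-minute rates honest when a test ends early or overruns its nominal DURATION_SEC.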
111c37f to 6ff6afd
Summary
Add exhaustive load testing infrastructure to detect memory leaks, validate Issue #152 watcher deduplication at scale, and prevent performance regressions.
Test Coverage (15 tests)
Infrastructure
- /metrics endpoint (psutil)
Changes
- tests/load/ directory with 15 load tests
- tests/Dockerfile.loadtest for test server image
- .github/workflows/load-test.yml (manual trigger)
- make test-load
Usage