
Conversation

@gargsaumya
Contributor

@gargsaumya gargsaumya commented Feb 13, 2026

Work Item / Issue Reference

AB#41877

GitHub Issue: #<ISSUE_NUMBER>


Summary

This pull request adds a comprehensive suite of multi-threaded stress tests for the mssql-python package. The new tests cover a wide range of concurrency scenarios, sustained load, pool lifecycle, and heavy query scaling to ensure robust performance and stability under high-load conditions. The tests are cross-platform, feature deterministic shutdowns, and include adaptive validation of error rates.

Concurrency and load testing:

  • Added a concurrency matrix of tests (test_concurrency_matrix) that runs queries with varying thread counts and connection pooling settings, validating error rates at different concurrency levels.
  • Introduced sustained load tests (test_sustained_load) that run multiple threads for extended durations to simulate real-world high-load situations.
  • Implemented heavy query scaling tests (test_heavy_query_scaling) with large result sets and high thread counts to stress test the connection pool.

Connection pool management:

  • Added a pool lifecycle test (test_pool_lifecycle_cycles) to repeatedly enable and disable connection pooling, verifying deterministic shutdown and resource cleanup.
  • Provided utilities for strict thread joining and pool shutdown to ensure clean test teardown and avoid resource leaks.

Test infrastructure and utilities:

  • Defined data models (ThreadResult, StressResult) and a runner class (MultiThreadedQueryRunner) for collecting detailed metrics and managing worker threads.
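For orientation, the metrics models described above could look roughly like this (field and method names here are illustrative assumptions, not the exact shapes in the PR):

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class ThreadResult:
    """Per-thread metrics collected by one worker (illustrative shape)."""
    thread_id: int
    queries: int = 0
    errors: int = 0
    error_samples: List[str] = field(default_factory=list)


@dataclass
class StressResult:
    """Aggregated metrics for one stress run."""
    total_queries: int = 0
    total_errors: int = 0
    duration_s: float = 0.0

    @property
    def error_rate(self) -> float:
        # Guard against division by zero when no queries ran
        return self.total_errors / self.total_queries if self.total_queries else 0.0


def aggregate(results: List[ThreadResult], duration_s: float) -> StressResult:
    """Roll per-thread results up into a single StressResult."""
    return StressResult(
        total_queries=sum(r.queries for r in results),
        total_errors=sum(r.errors for r in results),
        duration_s=duration_s,
    )
```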

Copilot AI review requested due to automatic review settings February 13, 2026 05:33
@github-actions github-actions bot added the pr-size: large Substantial code update label Feb 13, 2026
echo "Waiting for SQL Server to be ready..."
for i in {1..30}; do
if docker exec sqlserver-stress /opt/mssql-tools18/bin/sqlcmd \
-S localhost -U SA -P "$(DB_PASSWORD)" -C -Q "SELECT 1" >/dev/null 2>&1; then

Check notice — Code scanning / devskim: Accessing localhost could indicate debug code, or could hinder scaling. Do not leave debug code in production. (This notice repeats on each sqlcmd step below.)

echo "Creating database and user for stress tests..."
docker exec sqlserver-stress /opt/mssql-tools18/bin/sqlcmd \
-S localhost -U SA -P "$(DB_PASSWORD)" -C -Q "CREATE DATABASE StressTestDB"

docker exec sqlserver-stress /opt/mssql-tools18/bin/sqlcmd \
-S localhost -U SA -P "$(DB_PASSWORD)" -C -Q "CREATE LOGIN testuser WITH PASSWORD = '$(DB_PASSWORD)'"

docker exec sqlserver-stress /opt/mssql-tools18/bin/sqlcmd \
-S localhost -U SA -P "$(DB_PASSWORD)" -C -d StressTestDB -Q "CREATE USER testuser FOR LOGIN testuser"

docker exec sqlserver-stress /opt/mssql-tools18/bin/sqlcmd \
-S localhost -U SA -P "$(DB_PASSWORD)" -C -d StressTestDB -Q "ALTER ROLE db_owner ADD MEMBER testuser"

WRAPPER
displayName: 'Run Multi-Threaded Stress Tests'
env:
DB_CONNECTION_STRING: 'Server=localhost;Database=StressTestDB;Uid=testuser;Pwd=$(DB_PASSWORD);TrustServerCertificate=yes'

python -m pytest tests/test_011_performance_stress.py -v -m "stress" --junitxml=perf-stress-test-results-linux.xml --timeout=1800 --capture=tee-sys
displayName: 'Run Performance Stress Tests'
env:
DB_CONNECTION_STRING: 'Server=localhost;Database=StressTestDB;Uid=testuser;Pwd=$(DB_PASSWORD);TrustServerCertificate=yes'

@gargsaumya gargsaumya changed the title FEAT: Add stress test pipeline and multi-threaded stress tests FEAT: Add stress test pipeline and migrate it to use new 1ES Pool/Images Feb 13, 2026
@gargsaumya gargsaumya changed the title FEAT: Add stress test pipeline and migrate it to use new 1ES Pool/Images FEAT: Add stress test pipeline and migrate it to new 1ES Pool/Images Feb 13, 2026
Contributor

Copilot AI left a comment

Pull request overview

Adds a new scheduled/manual OneBranch pipeline to run stress testing and introduces a new multi-threaded stress test suite for mssql-python aimed at validating pooling and concurrency behavior under sustained and high-concurrency load.

Changes:

  • Added a new tests/test_020_multithreaded_stress.py module with multi-threaded stress tests and a helper runner/metrics models.
  • Added OneBranchPipelines/stress-test-pipeline.yml to run the stress suite on Windows (LocalDB) and Linux (Docker SQL Server) on a daily schedule or manually.

Reviewed changes

Copilot reviewed 2 out of 2 changed files in this pull request and generated 12 comments.

File Description
tests/test_020_multithreaded_stress.py New concurrency-focused stress tests with a multi-threaded runner and workload scenarios (matrix, sustained load, lifecycle, heavy queries).
OneBranchPipelines/stress-test-pipeline.yml New scheduled pipeline to provision DBs (LocalDB/Docker) and execute/publish stress + performance stress tests on Windows and Linux agents.


Comment on lines 174 to 177
@pytest.mark.stress_threading
@pytest.mark.parametrize("thread_count", THREAD_LEVELS)
@pytest.mark.parametrize("pooling", [True, False])
def test_concurrency_matrix(stress_conn_str, thread_count, pooling):
Copilot AI Feb 13, 2026

@pytest.mark.stress_threading is not registered in the repo’s pytest.ini (only stress is), and the default addopts = -m "not stress" means these long-running tests will run in normal pytest runs whenever DB_CONNECTION_STRING is set. Consider also marking these tests with @pytest.mark.stress (and/or registering stress_threading + excluding it by default) so they don’t slow or destabilize standard CI and local runs.
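One way to address this, assuming the repo's pytest.ini follows the usual marker-registration format (the exact existing contents are an assumption), would be:

```ini
[pytest]
markers =
    stress: long-running stress tests
    stress_threading: multi-threaded stress tests
addopts = -m "not stress and not stress_threading"
```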

Comment on lines 109 to 119
try:
    conn = connect(self.conn_str)
    cur = conn.cursor()
    cur.execute(self.query)
    cur.fetchall()
    cur.close()
    conn.close()
except Exception as e:
    tr.errors += 1
    if len(tr.error_samples) < 3:
        tr.error_samples.append(str(e))
Copilot AI Feb 13, 2026

If an exception occurs after creating conn/cur, the current code increments the error count but doesn’t reliably close the cursor/connection, which can leak resources and skew the stress results (and even exhaust the pool). Consider wrapping cursor/connection usage in a try/finally (or context managers) so they are always closed even on failures.

Suggested change
try:
    conn = connect(self.conn_str)
    cur = conn.cursor()
    cur.execute(self.query)
    cur.fetchall()
    cur.close()
    conn.close()
except Exception as e:
    tr.errors += 1
    if len(tr.error_samples) < 3:
        tr.error_samples.append(str(e))

becomes:

conn = None
cur = None
try:
    conn = connect(self.conn_str)
    cur = conn.cursor()
    cur.execute(self.query)
    cur.fetchall()
except Exception as e:
    tr.errors += 1
    if len(tr.error_samples) < 3:
        tr.error_samples.append(str(e))
finally:
    if cur is not None:
        try:
            cur.close()
        except Exception:
            pass
    if conn is not None:
        try:
            conn.close()
        except Exception:
            pass

Comment on lines 214 to 226
def worker():
    try:
        while not stop_event.is_set():
            try:
                conn = connect(stress_conn_str)
                cur = conn.cursor()
                cur.execute("SELECT TOP 20 name FROM sys.objects")
                cur.fetchall()
                cur.close()
                conn.close()
            except Exception:
                pass
            time.sleep(0.005)
Copilot AI Feb 13, 2026

test_sustained_load swallows all exceptions and has no assertions around success/error rate, so it can pass even if every connection/query fails. Consider tracking successes/errors (and ensuring connections/cursors are closed on exceptions) and asserting a minimum success count or a maximum error rate so the test actually validates sustained-load behavior.
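A hedged sketch of what the suggested fix could look like — `connect` and `conn_str` stand in for `mssql_python.connect` and the fixture's connection string, and the counter names and threshold are illustrative, not from the PR:

```python
import threading
import time


def run_sustained_load(connect, conn_str, num_threads=8, duration_s=2.0,
                       max_error_rate=0.01):
    """Track successes/errors during sustained load and assert on the rate."""
    stop_event = threading.Event()
    lock = threading.Lock()
    counts = {"ok": 0, "err": 0}

    def worker():
        while not stop_event.is_set():
            conn = None
            try:
                conn = connect(conn_str)
                cur = conn.cursor()
                try:
                    cur.execute("SELECT TOP 20 name FROM sys.objects")
                    cur.fetchall()
                finally:
                    cur.close()  # always close the cursor, even on failure
                with lock:
                    counts["ok"] += 1
            except Exception:
                with lock:
                    counts["err"] += 1
            finally:
                if conn is not None:
                    try:
                        conn.close()  # always close the connection
                    except Exception:
                        pass
            time.sleep(0.005)

    threads = [threading.Thread(target=worker) for _ in range(num_threads)]
    for t in threads:
        t.start()
    time.sleep(duration_s)
    stop_event.set()
    for t in threads:
        t.join(timeout=30)

    total = counts["ok"] + counts["err"]
    assert total > 0, "no queries completed under sustained load"
    assert counts["err"] / total <= max_error_rate, counts
    return counts
```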

# PYTHON-1ES-MMS2022 has LocalDB pre-installed (no full SQL Server instance)
- powershell: |
    Write-Host "Setting up LocalDB for stress tests..."
    sqllocaldb create MSSQLLocalDB
Copilot AI Feb 13, 2026

sqllocaldb create MSSQLLocalDB will fail if the instance already exists (possible on a reused agent). Consider making this idempotent (e.g., check existence first, or delete/recreate) to avoid flaky pipeline failures.

Suggested change
sqllocaldb create MSSQLLocalDB

becomes:

# Check if MSSQLLocalDB instance already exists; create only if missing
& sqllocaldb info MSSQLLocalDB > $null 2>&1
if ($LASTEXITCODE -ne 0) {
    Write-Host "MSSQLLocalDB instance not found. Creating..."
    sqllocaldb create MSSQLLocalDB
} else {
    Write-Host "MSSQLLocalDB instance already exists. Reusing existing instance."
}

Comment on lines +120 to +135
$gracePeriod = 30
$arguments = "-m pytest tests/test_020_multithreaded_stress.py -v -m stress_threading --junitxml=$xmlFile --timeout=3600 --capture=tee-sys"
$process = Start-Process -FilePath "python" -ArgumentList $arguments -NoNewWindow -PassThru

# Poll: wait for process to exit or XML file to appear
$xmlSeenAt = $null
while (-not $process.HasExited) {
    if ($null -eq $xmlSeenAt -and (Test-Path $xmlFile)) {
        $xmlSeenAt = Get-Date
        Write-Host "XML file detected, giving ${gracePeriod}s for clean shutdown..."
    }
    if ($null -ne $xmlSeenAt -and ((Get-Date) - $xmlSeenAt).TotalSeconds -gt $gracePeriod) {
        Write-Host "Process still running after grace period, killing..."
        $process | Stop-Process -Force
        break
    }
Copilot AI Feb 13, 2026

The process-kill logic starts the grace timer as soon as the JUnit XML file exists. Pytest’s --junitxml typically creates the file at session start, so this can kill long-running stress tests after ~30s even when they’re still running normally. Consider instead relying on pytest-timeout for hangs, or only starting the grace period after detecting a completed XML (e.g., parseable with closing tags and non-zero tests, and file mtime stable).

Suggested change
$gracePeriod = 30
$arguments = "-m pytest tests/test_020_multithreaded_stress.py -v -m stress_threading --junitxml=$xmlFile --timeout=3600 --capture=tee-sys"
$process = Start-Process -FilePath "python" -ArgumentList $arguments -NoNewWindow -PassThru
# Poll: wait for process to exit or XML file to appear
$xmlSeenAt = $null
while (-not $process.HasExited) {
    if ($null -eq $xmlSeenAt -and (Test-Path $xmlFile)) {
        $xmlSeenAt = Get-Date
        Write-Host "XML file detected, giving ${gracePeriod}s for clean shutdown..."
    }
    if ($null -ne $xmlSeenAt -and ((Get-Date) - $xmlSeenAt).TotalSeconds -gt $gracePeriod) {
        Write-Host "Process still running after grace period, killing..."
        $process | Stop-Process -Force
        break
    }

becomes:

$arguments = "-m pytest tests/test_020_multithreaded_stress.py -v -m stress_threading --junitxml=$xmlFile --timeout=3600 --capture=tee-sys"
$process = Start-Process -FilePath "python" -ArgumentList $arguments -NoNewWindow -PassThru
# Wait for the pytest process to exit; rely on --timeout for hangs
while (-not $process.HasExited) {

t = threading.Thread(
    target=self.worker,
    args=(i + 1, iterations, results),
    daemon=True,  # 🔑 critical
Copilot AI Feb 13, 2026

These worker threads are started as daemon=True. If join_threads_strict asserts (timeout) or another failure happens, daemon threads may continue running in the background and interfere with subsequent tests in the same pytest process. Consider using non-daemon threads and ensuring teardown always stops/joins them (or use a stop event so you can signal a clean shutdown on failure).

Suggested change
daemon=True, # 🔑 critical

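One possible shape for the suggested non-daemon approach — the helper names and signatures here are hypothetical, not from the PR:

```python
import threading


def start_workers(target, count):
    """Start non-daemon worker threads that watch a shared stop event.

    `target` must accept a stop_event and exit promptly once it is set.
    """
    stop_event = threading.Event()
    threads = [
        threading.Thread(target=target, args=(stop_event,))  # non-daemon by default
        for _ in range(count)
    ]
    for t in threads:
        t.start()
    return stop_event, threads


def stop_workers(stop_event, threads, join_timeout=60):
    """Signal a clean shutdown and join; call from teardown even on failure."""
    stop_event.set()
    for t in threads:
        t.join(timeout=join_timeout)
    stragglers = [t.name for t in threads if t.is_alive()]
    assert not stragglers, f"threads did not stop: {stragglers}"
```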
Comment on lines +105 to +108
sqlcmd -S "(localdb)\MSSQLLocalDB" -Q "CREATE DATABASE StressTestDB"
sqlcmd -S "(localdb)\MSSQLLocalDB" -Q "CREATE LOGIN testuser WITH PASSWORD = '$env:DB_PASSWORD'"
sqlcmd -S "(localdb)\MSSQLLocalDB" -d StressTestDB -Q "CREATE USER testuser FOR LOGIN testuser"
sqlcmd -S "(localdb)\MSSQLLocalDB" -d StressTestDB -Q "ALTER ROLE db_owner ADD MEMBER testuser"
Copilot AI Feb 13, 2026

The DB/user setup commands aren’t idempotent (CREATE DATABASE/LOGIN/USER will error if they already exist). On reused Windows agents this can break repeated scheduled runs. Consider guarding with IF DB_ID(...) IS NULL ... / IF NOT EXISTS (...) ... or dropping existing objects before creating them.

Suggested change
sqlcmd -S "(localdb)\MSSQLLocalDB" -Q "CREATE DATABASE StressTestDB"
sqlcmd -S "(localdb)\MSSQLLocalDB" -Q "CREATE LOGIN testuser WITH PASSWORD = '$env:DB_PASSWORD'"
sqlcmd -S "(localdb)\MSSQLLocalDB" -d StressTestDB -Q "CREATE USER testuser FOR LOGIN testuser"
sqlcmd -S "(localdb)\MSSQLLocalDB" -d StressTestDB -Q "ALTER ROLE db_owner ADD MEMBER testuser"
sqlcmd -S "(localdb)\MSSQLLocalDB" -Q "IF DB_ID('StressTestDB') IS NULL CREATE DATABASE StressTestDB;"
sqlcmd -S "(localdb)\MSSQLLocalDB" -Q "IF NOT EXISTS (SELECT 1 FROM sys.server_principals WHERE name = 'testuser') BEGIN CREATE LOGIN testuser WITH PASSWORD = '$env:DB_PASSWORD'; END"
sqlcmd -S "(localdb)\MSSQLLocalDB" -d StressTestDB -Q "IF NOT EXISTS (SELECT 1 FROM sys.database_principals WHERE name = 'testuser') BEGIN CREATE USER testuser FOR LOGIN testuser; END"
sqlcmd -S "(localdb)\MSSQLLocalDB" -d StressTestDB -Q "IF NOT EXISTS (SELECT 1 FROM sys.database_role_members drm JOIN sys.database_principals r ON drm.role_principal_id = r.principal_id JOIN sys.database_principals m ON drm.member_principal_id = m.principal_id WHERE r.name = 'db_owner' AND m.name = 'testuser') BEGIN ALTER ROLE db_owner ADD MEMBER testuser; END"

from dataclasses import dataclass, field
from typing import List

import mssql_python
Copilot AI Feb 13, 2026

Module 'mssql_python' is imported with both 'import' and 'import from'.

import os
import time
import threading
import psutil
Copilot AI Feb 13, 2026

Import of 'psutil' is not used.

Suggested change
import psutil

Comment on lines 34 to 57
# def wait_for_quiescence(timeout=15):
#     proc = psutil.Process()
#     start = time.time()
#     last = None
#     stable = 0

#     while time.time() - start < timeout:
#         current = proc.num_threads()
#         if current == last:
#             stable += 1
#         else:
#             stable = 0
#         if stable >= 5:
#             return
#         last = current
#         time.sleep(0.2)

#     raise AssertionError(f"Process threads did not stabilize: {proc.num_threads()}")


def shutdown_pool():
    mssql_python.pooling(enabled=False)
    time.sleep(0.2)
    # wait_for_quiescence()
Copilot AI Feb 13, 2026

This comment appears to contain commented-out code.

Suggested change
# def wait_for_quiescence(timeout=15):
#     proc = psutil.Process()
#     start = time.time()
#     last = None
#     stable = 0
#     while time.time() - start < timeout:
#         current = proc.num_threads()
#         if current == last:
#             stable += 1
#         else:
#             stable = 0
#         if stable >= 5:
#             return
#         last = current
#         time.sleep(0.2)
#     raise AssertionError(f"Process threads did not stabilize: {proc.num_threads()}")
def shutdown_pool():
    mssql_python.pooling(enabled=False)
    time.sleep(0.2)
    # wait_for_quiescence()

becomes:

def shutdown_pool():
    mssql_python.pooling(enabled=False)
    time.sleep(0.2)
    # Note: we intentionally do not wait for full process quiescence here to keep tests fast.

@gargsaumya gargsaumya force-pushed the saumya/stress-tests-gh branch from a62b99a to a7ac237 Compare February 13, 2026 06:34
@gargsaumya gargsaumya force-pushed the saumya/stress-tests-gh branch from a7ac237 to 96841c3 Compare February 13, 2026 06:37
This is an extreme stress test to find the breaking point.
Expects some failures but tests for graceful degradation.
"""
num_threads = 1000
Contributor

1000 threads is really large and extremely unrealistic.
Let's use something pragmatic, like 2 times the processor core count.
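The reviewer's suggestion could be sketched as follows (the function name, multiplier default, and fallback value are assumptions for illustration):

```python
import os


def pragmatic_thread_count(multiplier=2, fallback=8):
    """Derive a stress thread count from the core count instead of a fixed 1000.

    `multiplier=2` reflects the reviewer's suggestion of 2x processor cores;
    `fallback` covers platforms where os.cpu_count() returns None.
    """
    cores = os.cpu_count() or fallback
    return cores * multiplier
```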


assert not result.hung, f"Test hung: {num_threads} threads, pooling={pooling}"

# Adaptive expectations based on thread count
Contributor

An enhancement to the comment would be great: it should explain how the adaptive behavior works, and why it is the right thing to do.
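For illustration, an adaptive error-rate threshold might be computed like this — the constants and function name here are invented for the example, not the values used in the test suite. The idea is that transient pool/login timeouts become more likely as concurrency grows, so the acceptable error rate scales with the thread count, up to a hard cap:

```python
def adaptive_max_error_rate(thread_count, base=0.01, per_100_threads=0.02, cap=0.25):
    """Allow a higher error rate at higher concurrency, capped at `cap`.

    base:            tolerated error rate at low concurrency
    per_100_threads: extra tolerance added per 100 threads
    cap:             absolute ceiling regardless of thread count
    """
    return min(cap, base + (thread_count / 100.0) * per_100_threads)
```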

after = get_resource_usage()
print(f"After stress RSS: {after['rss_mb']} MB")

mem_delta = after["rss_mb"] - baseline["rss_mb"]
Contributor

A unit of measurement in the variable name is useful, e.g. mem_delta_in_mb.

Tests with JOINs, aggregations, and larger result sets.
"""
complex_query = """
SELECT TOP 50
Contributor

How about making this SELECT * instead? You will get more results and each thread will be busy for longer.
With 50 rows, even if SQL Server is busy, we won't be keeping the driver busy.


# Check for excessive memory growth (potential leak)
# Allow up to 100MB growth for long test
assert mem_growth < 100, f"Potential memory leak: {mem_growth:.1f}MB growth"
Contributor

A unit of measure in the variable name is going to be helpful.


This tests for memory leaks and resource exhaustion over time.
"""
num_threads = 20
Contributor

Can we have a test which uses a normal thread count, i.e. threads = processor cores? We can get the processor core count from the Python runtime (e.g. os.cpu_count()).

While we are stressing the driver with a high thread count, it would also be good to have a test which works under "normal" conditions, but keeps each thread busy for longer.
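A minimal sketch of such a "normal conditions" harness, where `task` is a hypothetical stand-in for one query round-trip (the helper name and parameters are assumptions, not from the PR):

```python
import os
from concurrent.futures import ThreadPoolExecutor


def run_normal_load(task, iterations_per_thread=50):
    """Run `task` many times on a pool sized to the processor core count.

    Each worker stays busy across repeated iterations rather than relying
    on a huge thread count to generate load.
    """
    workers = os.cpu_count() or 4  # fall back if cpu_count() is None
    with ThreadPoolExecutor(max_workers=workers) as pool:
        futures = [pool.submit(task)
                   for _ in range(workers * iterations_per_thread)]
        results = [f.result() for f in futures]  # re-raises any task error
    return workers, results
```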
