Skip to content

fix: spurious async client tests failures#1181

Open
charles-dyfis-net wants to merge 1 commit intoqdrant:devfrom
charles-dyfis-net:spurious-unit-test-failures
Open

fix: spurious async client tests failures#1181
charles-dyfis-net wants to merge 1 commit intoqdrant:devfrom
charles-dyfis-net:spurious-unit-test-failures

Conversation

@charles-dyfis-net
Copy link
Copy Markdown

Addresses spurious test failures:

  • skip cluster-only test when server is standalone
  • increase timeout for unit test performing multiple snapshot operations
  • clean up stale snapshots left by previous runs

All Submissions:

  • Contributions should target the dev branch. Did you create your branch from dev?
  • Have you followed the guidelines in our Contributing document?
  • Have you checked to ensure there aren't other open Pull Requests for the same update/change?

New Feature Submissions:

  1. Does your submission pass tests?
  2. Have you installed pre-commit with pip3 install pre-commit and set up hooks with pre-commit install?

* skip cluster-only test when server is standalone

* increase timeout for unit test performing multiple snapshot operations

* clean up stale snapshots left by previous runs
@netlify
Copy link
Copy Markdown

netlify bot commented Mar 27, 2026

Deploy Preview for poetic-froyo-8baba7 failed.

Name Link
🔨 Latest commit 776e5af
🔍 Latest deploy log https://app.netlify.com/projects/poetic-froyo-8baba7/deploys/69c69d2e6bd7690008cd1f06

@coderabbitai
Copy link
Copy Markdown

coderabbitai bot commented Mar 27, 2026

📝 Walkthrough

Walkthrough

This pull request modifies test code in tests/test_async_qdrant_client.py to enhance test reliability and coverage. Changes include: adding a distributed-mode guard to skip test_custom_sharding when clustering is disabled, increasing the AsyncQdrantClient timeout from 15 to 60 seconds, extending storage lock lifecycle assertions with multiple state validations, implementing snapshot cleanup logic to remove stale snapshots from prior test runs, and removing dead commented-out code paths.

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~20 minutes

Possibly related PRs

Suggested reviewers

  • tbung
🚥 Pre-merge checks | ✅ 2 | ❌ 1

❌ Failed checks (1 warning)

Check name Status Explanation Resolution
Docstring Coverage ⚠️ Warning Docstring coverage is 0.00% which is insufficient. The required threshold is 80.00%. Write docstrings for the functions missing them to satisfy the coverage threshold.
✅ Passed checks (2 passed)
Check name Status Explanation
Title check ✅ Passed The title 'fix: spurious async client tests failures' directly and concisely summarizes the main purpose of the changeset: addressing flaky test failures in async client tests.
Description check ✅ Passed The description is directly related to the changeset, explaining the three main fixes: skipping cluster-only tests in standalone mode, increasing timeout for snapshot operations, and cleaning up stale snapshots.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.

✨ Finishing Touches
🧪 Generate unit tests (beta)
  • Create PR with unit tests

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

Comment @coderabbitai help to get the list of available commands and usage tips.

Copy link
Copy Markdown

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 3

🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@tests/test_async_qdrant_client.py`:
- Line 24: The test contains an unused unpack of read_version(): the line
"major, minor, patch, dev = read_version()" should be removed or replaced to
intentionally ignore the return value; fix by either deleting the call entirely
from tests/test_async_qdrant_client.py or replacing it with a single underscore
assignment like "_ = read_version()" or prefix unused names with underscores
(e.g., "_major, _minor, _patch, _dev = read_version()") so the read_version()
invocation (and its return handling) no longer triggers the unused-variable
warning; target the read_version() call in that test file.
- Around line 184-194: Replace the blocking time.sleep calls used in async
cleanup with non-blocking await asyncio.sleep; specifically, in
tests/test_async_qdrant_client.py replace the time.sleep(0.5) calls that appear
after delete_shard_snapshot (and the other occurrence noted) with await
asyncio.sleep(0.5) and ensure asyncio is imported at the top of the test module
so the async test function can await the sleep without blocking the event loop;
keep the rest of the cleanup logic (client.list_shard_snapshots,
client.delete_shard_snapshot, etc.) unchanged.
- Around line 577-580: The sync call using httpx.get blocks the event loop and
doesn't handle connection errors; replace it with an async probe using
httpx.AsyncClient and await the GET (or run httpx.get in a thread via
anyio.to_thread.run_sync) and wrap the probe in a try/except that catches
httpx.RequestError and timeouts, calling pytest.skip("Requires distributed
mode") when the probe fails or when resp.json().get("result", {}).get("status")
== "disabled"; update the code around the existing httpx.get and pytest.skip
usage to use the async client/await and error-safe fallback.
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Run ID: db810e47-2240-4c61-82ea-64ea483c05e9

📥 Commits

Reviewing files that changed from the base of the PR and between c26c1e0 and 776e5af.

📒 Files selected for processing (1)
  • tests/test_async_qdrant_client.py

@pytest.mark.parametrize("prefer_grpc", [True, False])
async def test_async_qdrant_client(prefer_grpc):
client = AsyncQdrantClient(prefer_grpc=prefer_grpc, timeout=15)
major, minor, patch, dev = read_version()
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟡 Minor

Remove or intentionally ignore the unused read_version() unpack.

This introduces the Ruff RUF059 warnings and adds a no-op call in this test path.

💡 Proposed fix
-    major, minor, patch, dev = read_version()
+    _major, _minor, _patch, _dev = read_version()
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
major, minor, patch, dev = read_version()
_major, _minor, _patch, _dev = read_version()
🧰 Tools
🪛 Ruff (0.15.7)

[warning] 24-24: Unpacked variable major is never used

Prefix it with an underscore or any other dummy variable pattern

(RUF059)


[warning] 24-24: Unpacked variable minor is never used

Prefix it with an underscore or any other dummy variable pattern

(RUF059)


[warning] 24-24: Unpacked variable patch is never used

Prefix it with an underscore or any other dummy variable pattern

(RUF059)


[warning] 24-24: Unpacked variable dev is never used

Prefix it with an underscore or any other dummy variable pattern

(RUF059)

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@tests/test_async_qdrant_client.py` at line 24, The test contains an unused
unpack of read_version(): the line "major, minor, patch, dev = read_version()"
should be removed or replaced to intentionally ignore the return value; fix by
either deleting the call entirely from tests/test_async_qdrant_client.py or
replacing it with a single underscore assignment like "_ = read_version()" or
prefix unused names with underscores (e.g., "_major, _minor, _patch, _dev =
read_version()") so the read_version() invocation (and its return handling) no
longer triggers the unused-variable warning; target the read_version() call in
that test file.

Comment on lines +184 to +194
# Clean up any stale snapshots left by previous failed runs
for snap in await client.list_snapshots(COLLECTION_NAME):
await client.delete_snapshot(COLLECTION_NAME, snapshot_name=snap.name, wait=True)
for snap in await client.list_full_snapshots():
await client.delete_full_snapshot(snapshot_name=snap.name, wait=True)
for snap in await client.list_shard_snapshots(COLLECTION_NAME, shard_id=0):
await client.delete_shard_snapshot(
COLLECTION_NAME, snapshot_name=snap.name, shard_id=0
)
time.sleep(0.5) # https://github.com/qdrant/qdrant-client/issues/254

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟠 Major

🧩 Analysis chain

🏁 Script executed:

#!/bin/bash
# Verify blocking sleeps inside async functions in this test file.
python - <<'PY'
import ast
from pathlib import Path

path = Path("tests/test_async_qdrant_client.py")
tree = ast.parse(path.read_text())

for node in ast.walk(tree):
    if isinstance(node, ast.AsyncFunctionDef):
        for inner in ast.walk(node):
            if isinstance(inner, ast.Call) and isinstance(inner.func, ast.Attribute):
                if getattr(inner.func.value, "id", None) == "time" and inner.func.attr == "sleep":
                    print(f"{path}:{inner.lineno} -> time.sleep inside async function '{node.name}'")
PY

Repository: qdrant/qdrant-client

Length of output: 357


🏁 Script executed:

# Check imports in the test file
head -30 tests/test_async_qdrant_client.py

Repository: qdrant/qdrant-client

Length of output: 838


🏁 Script executed:

# Get context around line 193 and check the function scope
sed -n '180,230p' tests/test_async_qdrant_client.py | cat -n

Repository: qdrant/qdrant-client

Length of output: 2627


🏁 Script executed:

# Check if asyncio is imported anywhere in the file
rg "import asyncio" tests/test_async_qdrant_client.py

Repository: qdrant/qdrant-client

Length of output: 78


Replace blocking time.sleep with await asyncio.sleep in async cleanup code.

Using time.sleep in an async def blocks the event loop and can cause spurious async test failures. There are two instances in this function that need to be fixed:

  • Line 193: time.sleep(0.5) in shard snapshot deletion cleanup
  • Line 221: time.sleep(0.5) in shard snapshot deletion cleanup
💡 Proposed fixes
         await client.delete_shard_snapshot(
             COLLECTION_NAME, snapshot_name=snap.name, shard_id=0
         )
-        time.sleep(0.5)  # https://github.com/qdrant/qdrant-client/issues/254
+        await asyncio.sleep(0.5)  # https://github.com/qdrant/qdrant-client/issues/254
         await client.delete_shard_snapshot(
             COLLECTION_NAME, snapshot_name=snapshots[0].name, shard_id=0
         )
-    time.sleep(
-        0.5
-    )  # wait param is not propagated https://github.com/qdrant/qdrant-client/issues/254
+    await asyncio.sleep(
+        0.5
+    )  # wait param is not propagated https://github.com/qdrant/qdrant-client/issues/254
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
# Clean up any stale snapshots left by previous failed runs
for snap in await client.list_snapshots(COLLECTION_NAME):
await client.delete_snapshot(COLLECTION_NAME, snapshot_name=snap.name, wait=True)
for snap in await client.list_full_snapshots():
await client.delete_full_snapshot(snapshot_name=snap.name, wait=True)
for snap in await client.list_shard_snapshots(COLLECTION_NAME, shard_id=0):
await client.delete_shard_snapshot(
COLLECTION_NAME, snapshot_name=snap.name, shard_id=0
)
time.sleep(0.5) # https://github.com/qdrant/qdrant-client/issues/254
# Clean up any stale snapshots left by previous failed runs
for snap in await client.list_snapshots(COLLECTION_NAME):
await client.delete_snapshot(COLLECTION_NAME, snapshot_name=snap.name, wait=True)
for snap in await client.list_full_snapshots():
await client.delete_full_snapshot(snapshot_name=snap.name, wait=True)
for snap in await client.list_shard_snapshots(COLLECTION_NAME, shard_id=0):
await client.delete_shard_snapshot(
COLLECTION_NAME, snapshot_name=snap.name, shard_id=0
)
await asyncio.sleep(0.5) # https://github.com/qdrant/qdrant-client/issues/254
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@tests/test_async_qdrant_client.py` around lines 184 - 194, Replace the
blocking time.sleep calls used in async cleanup with non-blocking await
asyncio.sleep; specifically, in tests/test_async_qdrant_client.py replace the
time.sleep(0.5) calls that appear after delete_shard_snapshot (and the other
occurrence noted) with await asyncio.sleep(0.5) and ensure asyncio is imported
at the top of the test module so the async test function can await the sleep
without blocking the event loop; keep the rest of the cleanup logic
(client.list_shard_snapshots, client.delete_shard_snapshot, etc.) unchanged.

Comment on lines +577 to +580
resp = httpx.get("http://localhost:6333/cluster")
if resp.json().get("result", {}).get("status") == "disabled":
pytest.skip("Requires distributed mode")

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟠 Major

🧩 Analysis chain

🏁 Script executed:

#!/bin/bash
# Verify synchronous httpx.get usage inside async functions.
python - <<'PY'
import ast
from pathlib import Path

path = Path("tests/test_async_qdrant_client.py")
tree = ast.parse(path.read_text())

for node in ast.walk(tree):
    if isinstance(node, ast.AsyncFunctionDef):
        for inner in ast.walk(node):
            if isinstance(inner, ast.Call):
                f = inner.func
                if isinstance(f, ast.Attribute) and getattr(f.value, "id", None) == "httpx" and f.attr == "get":
                    print(f"{path}:{inner.lineno} -> sync httpx.get inside async function '{node.name}'")
PY

Repository: qdrant/qdrant-client

Length of output: 164


🏁 Script executed:

#!/bin/bash
# View the full context of the test function around the problematic lines
sed -n '565,595p' tests/test_async_qdrant_client.py

Repository: qdrant/qdrant-client

Length of output: 1179


🏁 Script executed:

#!/bin/bash
# Check imports and module-level error handling
head -50 tests/test_async_qdrant_client.py | grep -E "import|from.*import"

Repository: qdrant/qdrant-client

Length of output: 343


🏁 Script executed:

#!/bin/bash
# Check for other httpx usages in the test file to understand the pattern
rg "httpx\." tests/test_async_qdrant_client.py -n

Repository: qdrant/qdrant-client

Length of output: 121


🏁 Script executed:

#!/bin/bash
# Verify httpx.AsyncClient is available and check version
python - <<'PY'
import httpx
print(f"httpx version: {httpx.__version__}")
print(f"AsyncClient available: {hasattr(httpx, 'AsyncClient')}")
print(f"HTTPError available: {hasattr(httpx, 'HTTPError')}")
PY

Repository: qdrant/qdrant-client

Length of output: 139


🏁 Script executed:

#!/bin/bash
# Verify if there's any pytest fixture handling setup/teardown errors
rg "@pytest.fixture|@pytest.mark.asyncio" tests/test_async_qdrant_client.py -B2 -A2 | head -40

Repository: qdrant/qdrant-client

Length of output: 569


🏁 Script executed:

#!/bin/bash
# Check for conftest.py or any test setup that might ensure cluster availability
fd "conftest.py" tests/ -exec cat {} \; | head -100

Repository: qdrant/qdrant-client

Length of output: 210


🏁 Script executed:

#!/bin/bash
# Check for conftest.py or any test setup that might ensure cluster availability
find tests -name "conftest.py" -exec cat {} \;

Repository: qdrant/qdrant-client

Length of output: 651


🏁 Script executed:

#!/bin/bash
# Check if there's any CI/setup configuration that ensures cluster availability
find . -maxdepth 2 -name "*.yml" -o -name "*.yaml" -o -name "docker-compose*" | grep -E "(github|gitlab|ci|docker)" | head -10

Repository: qdrant/qdrant-client

Length of output: 46


🏁 Script executed:

#!/bin/bash
# Final verification: Check if there are any other async HTTP patterns in the codebase we should reference
rg "AsyncClient|async.*http" --type py | head -20

Repository: qdrant/qdrant-client

Length of output: 800


Use async HTTP probing (or error-safe fallback) for the distributed-mode skip guard.

httpx.get(...) is synchronous and blocks the event loop in this async test. Additionally, if the request fails—connection error, timeout, etc.—the exception is unhandled, causing the test to fail instead of skipping gracefully.

💡 Proposed fix
-    resp = httpx.get("http://localhost:6333/cluster")
-    if resp.json().get("result", {}).get("status") == "disabled":
-        pytest.skip("Requires distributed mode")
+    try:
+        async with httpx.AsyncClient(timeout=5.0) as http_client:
+            resp = await http_client.get("http://localhost:6333/cluster")
+        if resp.json().get("result", {}).get("status") == "disabled":
+            pytest.skip("Requires distributed mode")
+    except httpx.HTTPError:
+        pytest.skip("Cluster endpoint unavailable; requires distributed mode")
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
resp = httpx.get("http://localhost:6333/cluster")
if resp.json().get("result", {}).get("status") == "disabled":
pytest.skip("Requires distributed mode")
try:
async with httpx.AsyncClient(timeout=5.0) as http_client:
resp = await http_client.get("http://localhost:6333/cluster")
if resp.json().get("result", {}).get("status") == "disabled":
pytest.skip("Requires distributed mode")
except httpx.HTTPError:
pytest.skip("Cluster endpoint unavailable; requires distributed mode")
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@tests/test_async_qdrant_client.py` around lines 577 - 580, The sync call
using httpx.get blocks the event loop and doesn't handle connection errors;
replace it with an async probe using httpx.AsyncClient and await the GET (or run
httpx.get in a thread via anyio.to_thread.run_sync) and wrap the probe in a
try/except that catches httpx.RequestError and timeouts, calling
pytest.skip("Requires distributed mode") when the probe fails or when
resp.json().get("result", {}).get("status") == "disabled"; update the code
around the existing httpx.get and pytest.skip usage to use the async
client/await and error-safe fallback.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant