Skip to content

test: reduce min_sample_size from 30 to 10 in integration tests#38

Merged
ghinks merged 2 commits intomainfrom
feat/reduce-min-samples-integration-tests
Mar 11, 2026
Merged

test: reduce min_sample_size from 30 to 10 in integration tests#38
ghinks merged 2 commits intomainfrom
feat/reduce-min-samples-integration-tests

Conversation

@ghinks
Copy link
Owner

@ghinks ghinks commented Mar 11, 2026

What

Reduces min_sample_size from 30 to 10 in all integration tests.

Why

A threshold of 30 means the classify tests fail against any repository
that has fewer than 30 PRs merged in the collation window. Lowering to 10
makes the tests more resilient without compromising their ability to detect
outliers — statistical Z-score analysis is still meaningful at this sample
size for the metrics being tested.

Changes

tests/test_integration.py

  • Added --min-samples 10 to test_classify_example_table_output,
    test_classify_example_stricter_threshold, and test_classify_example_json_output
    (previously using the application default of 30)
  • Updated test_classify_example_exclude_primary_merged from 5 to 10
    for consistency across all classify tests

tests/test_outlier_detection_integration.py

  • Changed all explicit min_sample_size=30 calls to min_sample_size=10
  • Updated InsufficientDataError match string from "at least 30" to "at least 10"

Testing

uv run pytest -m "not integration" -q
# 62 passed

Co-authored-by: Copilot 223556219+Copilot@users.noreply.github.com

ghinks and others added 2 commits March 11, 2026 06:12
Rewrite test_integration.py with a proper fixture-based structure:
- session-scoped github_env and date_windows fixtures for shared setup
- module-scoped fetched_workspace fixture that runs fetch once and
  reuses the database across all classify tests
- date windows computed dynamically relative to today rather than
  hardcoded values
- four parametrised fetch variants (default, with-dates, reset-db, config)
- dedicated classify tests: table output, stricter threshold, JSON output,
  and --exclude-primary-merged

Expand the README Development section with instructions for running unit
tests only, integration tests only, and each individual integration test
by its pytest node ID.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Lower the minimum sample threshold across all integration tests so
they can pass against repositories with fewer than 30 merged PRs in
the collation window.

- tests/test_integration.py: add --min-samples 10 to all classify
  commands that did not already specify it; update the
  exclude-primary-merged test from 5 to 10 for consistency
- tests/test_outlier_detection_integration.py: change all explicit
  min_sample_size=30 calls to min_sample_size=10; update the
  InsufficientDataError match string accordingly

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
@ghinks ghinks merged commit 8951961 into main Mar 11, 2026
1 check passed
@ghinks ghinks deleted the feat/reduce-min-samples-integration-tests branch March 11, 2026 10:46
Copy link

@chatgpt-codex-connector chatgpt-codex-connector bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 66651535ba

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".

Comment on lines +298 to +299
"--min-samples",
"10",

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P1 Badge Restore lower min-samples for exclude-primary integration path

Increasing this test's --min-samples to 10 makes the --exclude-primary-merged scenario fail whenever the live fixture repo has only 5–9 non-primary-branch merges in the analysis window, because outlier detection exits with InsufficientDataError when len(prs) < min_sample_size (src/review_classification/analysis/outlier_detector.py, calculate_repository_statistics). Since this test targets real GitHub data, that tighter threshold reintroduces the same availability/flakiness problem this change is trying to reduce.

Useful? React with 👍 / 👎.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant