Refactor evaluation in tests to use evaluate_accuracy function #1402

Open
xin3he wants to merge 1 commit into main from xinhe/ut

@xin3he xin3he commented Feb 4, 2026

Description

Refactor evaluation in tests to use evaluate_accuracy function

  • Replaced instances of simple_evaluate_user_model with evaluate_accuracy across multiple test files.
  • Updated accuracy thresholds for various tasks in test cases to ensure consistency.
  • Removed redundant imports and functions related to accuracy evaluation.
  • Enhanced readability and maintainability of the test code by centralizing accuracy evaluation logic.
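
For reviewers who want a picture of the pattern without opening test/helpers.py, here is a minimal sketch of what a centralized accuracy helper could look like. It is not the code from this PR: the signature, the `eval_fn` parameter, the default task, and the `"acc,none"` metric key (modeled on an lm-eval style results dictionary) are all assumptions made for illustration, and the evaluator is a stub so the snippet runs on its own.

```python
from typing import Any, Callable, Dict, Optional

# Assumed shape of the evaluator's return value (hypothetical, lm-eval style):
#   {"results": {"<task>": {"<metric>": <float>}}}
EvalFn = Callable[..., Dict[str, Any]]


def evaluate_accuracy(
    model: Any,
    tokenizer: Optional[Any] = None,
    *,
    eval_fn: EvalFn,
    task: str = "lambada_openai",
    threshold: float = 0.0,
    metric: str = "acc,none",
    batch_size: int = 16,
) -> float:
    """Hypothetical sketch: run one task, read its accuracy, check a threshold.

    `eval_fn` stands in for whatever evaluation entry point the real helper
    wraps; it is injected here only to keep the example self-contained.
    """
    results = eval_fn(model, tokenizer, tasks=[task], batch_size=batch_size)
    accuracy = results["results"][task][metric]
    assert accuracy > threshold, (
        f"{task} accuracy {accuracy:.4f} is not above threshold {threshold:.4f}"
    )
    return accuracy


def fake_eval(model, tokenizer, tasks, batch_size):
    """Stub evaluator returning a fixed score for every requested task."""
    return {"results": {name: {"acc,none": 0.42} for name in tasks}}


# A test then needs a single call instead of its own get_accuracy copy.
acc = evaluate_accuracy(
    "dummy-model", task="lambada_openai", threshold=0.35, eval_fn=fake_eval
)
```

Keeping the threshold check inside the helper means each test states only the task and the expected floor, which is the maintainability gain the description refers to.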

Type of Change

  • Bug fix
  • New feature
  • Documentation update
  • Performance improvement
  • Code refactoring
  • Other (please specify):

Related Issues

Fixes or relates to #

Checklist Before Submitting

  • My code has been tested locally.
  • Documentation has been updated as needed.
  • New or updated tests are included where applicable.

Copilot AI review requested due to automatic review settings February 4, 2026 10:14

Copilot AI left a comment

Pull request overview

This PR refactors test files to use a centralized evaluate_accuracy helper function, replacing multiple instances of simple_evaluate_user_model and simple_evaluate calls. The changes improve code maintainability by consolidating evaluation logic and removing redundant helper functions.

Changes:

  • Replaced direct evaluation calls with the evaluate_accuracy helper function across test files
  • Removed duplicate helper functions (get_accuracy) from multiple test files
  • Updated import statements to include evaluate_accuracy from the helpers module
  • Cleaned up unused imports including re, simple_evaluate, and simple_evaluate_user_model

Reviewed changes

Copilot reviewed 22 out of 22 changed files in this pull request and generated 2 comments.

| File | Description |
| --- | --- |
| test/helpers.py | Added centralized evaluate_accuracy and generate_prompt helper functions |
| test/test_cuda/utils/test_alg_ext.py | Replaced evaluation call with evaluate_accuracy helper |
| test/test_cuda/schemes/test_auto_scheme.py | Updated to use evaluate_accuracy helper |
| test/test_cuda/quantization/test_mxfp_nvfp.py | Replaced evaluation with evaluate_accuracy helper |
| test/test_cuda/quantization/test_mix_bits.py | Updated to use evaluate_accuracy helper |
| test/test_cuda/quantization/test_asym.py | Removed unused imports |
| test/test_cuda/quantization/test_2_3bits.py | Replaced evaluation calls and removed duplicate get_accuracy function |
| test/test_cuda/export/test_gguf.py | Updated to use evaluate_accuracy helper |
| test/test_cuda/export/test_auto_round_format.py | Replaced evaluation calls with evaluate_accuracy helper |
| test/test_cuda/core/test_main_func.py | Updated evaluation calls and removed duplicate get_accuracy function |
| test/test_cuda/backends/test_triton_backend.py | Replaced evaluation calls with evaluate_accuracy helper |
| test/test_cuda/backends/test_torch_backend.py | Updated to use evaluate_accuracy helper |
| test/test_cuda/backends/test_marlin_backend.py | Replaced evaluation calls with evaluate_accuracy helper |
| test/test_cuda/backends/test_exllamav2_backend.py | Updated to use evaluate_accuracy helper |
| test/test_cuda/advanced/test_multiple_card_calib.py | Removed duplicate get_accuracy function |
| test/test_cuda/advanced/test_multiple_card.py | Removed duplicate get_accuracy function and updated evaluation calls |
| test/test_cuda/advanced/test_fp8_input.py | Moved helper functions to centralized location |
| test/test_cpu/quantization/test_mix_bits.py | Updated to use evaluate_accuracy helper |
| test/test_cpu/quantization/test_asym.py | Removed unused imports |
| test/test_cpu/core/test_autoround.py | Replaced evaluation calls with evaluate_accuracy helper |
| test/test_cpu/backends/test_torch_backend.py | Updated to use evaluate_accuracy helper |
| test/test_ark/test_model.py | Replaced evaluation call with evaluate_accuracy helper |

@xin3he xin3he requested review from n1ck-guo and wenhuach21 February 4, 2026 10:16

xin3he commented Feb 4, 2026

Let's target this for 1.0.0.

@xin3he xin3he added this to the 1.0.0 milestone Feb 4, 2026
@xin3he xin3he force-pushed the xinhe/ut branch 2 times, most recently from e82e580 to cb401e5 Compare February 5, 2026 14:44