Refactor evaluation in tests to use evaluate_accuracy function by xin3he · Pull Request #1402 · intel/auto-round

xin3he · 2026-02-04T10:14:41Z

Description

Refactor evaluation in tests to use evaluate_accuracy function

Replaced instances of simple_evaluate_user_model with evaluate_accuracy across multiple test files.
Updated accuracy thresholds for various tasks in test cases to ensure consistency.
Removed redundant imports and functions related to accuracy evaluation.
Enhanced readability and maintainability of the test code by centralizing accuracy evaluation logic.
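
A minimal sketch of what centralizing the accuracy check can look like, for illustration only. The name `evaluate_accuracy_sketch`, its parameters, the injected `eval_fn`, and the task names and scores are all assumptions; the actual `evaluate_accuracy` signature in `test/helpers.py` is not shown in this PR description.

```python
from typing import Callable, Dict, Mapping

# Hypothetical evaluator type: takes a model and a task name, returns accuracy.
# It stands in for whatever backend the real helper calls.
EvalFn = Callable[[object, str], float]


def evaluate_accuracy_sketch(
    model: object,
    thresholds: Mapping[str, float],
    eval_fn: EvalFn,
) -> Dict[str, float]:
    """Score `model` on each task and fail if any score does not exceed its threshold."""
    scores: Dict[str, float] = {}
    for task, minimum in thresholds.items():
        score = eval_fn(model, task)
        scores[task] = score
        if score <= minimum:
            raise AssertionError(
                f"{task}: accuracy {score:.4f} does not exceed threshold {minimum:.4f}"
            )
    return scores


def fake_eval(model: object, task: str) -> float:
    """Fake evaluator with fixed, made-up scores so the sketch runs without a model."""
    return {"task_a": 0.62, "task_b": 0.71}[task]


# Each test passes its own per-task thresholds instead of defining a local helper.
scores = evaluate_accuracy_sketch("dummy-model", {"task_a": 0.60, "task_b": 0.70}, fake_eval)
```

With one shared helper, a threshold change is a one-line edit at the call site, and the comparison logic lives in a single place.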

Type of Change

Related Issues

Fixes or relates to #

Checklist Before Submitting

My code has been tested locally.
Documentation has been updated as needed.
New or updated tests are included where applicable.

Copilot

Pull request overview

This PR refactors test files to use a centralized evaluate_accuracy helper function, replacing multiple instances of simple_evaluate_user_model and simple_evaluate calls. The changes improve code maintainability by consolidating evaluation logic and removing redundant helper functions.

Changes:

Replaced direct evaluation calls with the evaluate_accuracy helper function across test files
Removed duplicate helper functions (get_accuracy) from multiple test files
Updated import statements to include evaluate_accuracy from the helpers module
Cleaned up unused imports including re, simple_evaluate, and simple_evaluate_user_model

Reviewed changes

Copilot reviewed 22 out of 22 changed files in this pull request and generated 2 comments.

| File | Description |
| --- | --- |
| test/helpers.py | Added centralized `evaluate_accuracy` and `generate_prompt` helper functions |
| test/test_cuda/utils/test_alg_ext.py | Replaced evaluation call with `evaluate_accuracy` helper |
| test/test_cuda/schemes/test_auto_scheme.py | Updated to use `evaluate_accuracy` helper |
| test/test_cuda/quantization/test_mxfp_nvfp.py | Replaced evaluation with `evaluate_accuracy` helper |
| test/test_cuda/quantization/test_mix_bits.py | Updated to use `evaluate_accuracy` helper |
| test/test_cuda/quantization/test_asym.py | Removed unused imports |
| test/test_cuda/quantization/test_2_3bits.py | Replaced evaluation calls and removed duplicate `get_accuracy` function |
| test/test_cuda/export/test_gguf.py | Updated to use `evaluate_accuracy` helper |
| test/test_cuda/export/test_auto_round_format.py | Replaced evaluation calls with `evaluate_accuracy` helper |
| test/test_cuda/core/test_main_func.py | Updated evaluation calls and removed duplicate `get_accuracy` function |
| test/test_cuda/backends/test_triton_backend.py | Replaced evaluation calls with `evaluate_accuracy` helper |
| test/test_cuda/backends/test_torch_backend.py | Updated to use `evaluate_accuracy` helper |
| test/test_cuda/backends/test_marlin_backend.py | Replaced evaluation calls with `evaluate_accuracy` helper |
| test/test_cuda/backends/test_exllamav2_backend.py | Updated to use `evaluate_accuracy` helper |
| test/test_cuda/advanced/test_multiple_card_calib.py | Removed duplicate `get_accuracy` function |
| test/test_cuda/advanced/test_multiple_card.py | Removed duplicate `get_accuracy` function and updated evaluation calls |
| test/test_cuda/advanced/test_fp8_input.py | Moved helper functions to centralized location |
| test/test_cpu/quantization/test_mix_bits.py | Updated to use `evaluate_accuracy` helper |
| test/test_cpu/quantization/test_asym.py | Removed unused imports |
| test/test_cpu/core/test_autoround.py | Replaced evaluation calls with `evaluate_accuracy` helper |
| test/test_cpu/backends/test_torch_backend.py | Updated to use `evaluate_accuracy` helper |
| test/test_ark/test_model.py | Replaced evaluation call with `evaluate_accuracy` helper |

test/test_cuda/schemes/test_auto_scheme.py

xin3he · 2026-02-04T10:16:58Z

Let's target it for 1.0.0.

Refactor evaluation in tests to use evaluate_accuracy function

cb401e5

Copilot AI review requested due to automatic review settings February 4, 2026 10:14

Copilot AI reviewed Feb 4, 2026

Two review comments on test/test_cuda/schemes/test_auto_scheme.py, both resolved.

xin3he requested review from n1ck-guo and wenhuach21 February 4, 2026 10:16

xin3he added this to the 1.0.0 milestone Feb 4, 2026

xin3he force-pushed the xinhe/ut branch 2 times, most recently from e82e580 to cb401e5 on February 5, 2026 14:44

xin3he wants to merge 1 commit into main from xinhe/ut