Conversation
Pull request overview
This PR fixes the handling of top-level sample data in evaluation output conversion. The changes address how generated sample data is parsed and stored, and ensure token counts are properly normalized to integers instead of floats.
Changes:
- Added parsing logic for the `sample.generated_sample_data` field to extract and populate top-level sample information
- Modified token count fields (`prompt_tokens`, `completion_tokens`, `total_tokens`) to convert float values to integers
- Updated test data files to include new sample data fields and reflect integer token counts
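For orientation, here is a minimal sketch of the parsing step described above, assuming the names that appear in the review snippets below (`input_data`, `top_sample`, `top_sample_str`); the sample payload itself is hypothetical:

```python
import json

# Hypothetical flattened row from the old evaluation output format,
# where the generated sample arrives as a JSON string under a dotted key.
input_data = {
    "sample.generated_sample_data": '{"query": "hi", "response": "hello"}',
}

top_sample = {}
top_sample_str = input_data.get("sample.generated_sample_data")
if top_sample_str:
    try:
        # Populate the top-level sample from the serialized JSON payload.
        top_sample = json.loads(top_sample_str)
        input_data.pop("sample.generated_sample_data", None)
    except Exception:
        # Parse failures are logged; the review comments below discuss how
        # (and whether) the raw string should be preserved.
        pass

print(top_sample)  # {'query': 'hi', 'response': 'hello'}
```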
Reviewed changes
Copilot reviewed 3 out of 3 changed files in this pull request and generated 5 comments.
| File | Description |
|---|---|
| sdk/evaluation/azure-ai-evaluation/azure/ai/evaluation/_evaluate/_evaluate.py | Added logic to parse and extract generated_sample_data from input, and modified token count handling to convert values to integers |
| sdk/evaluation/azure-ai-evaluation/tests/unittests/data/evaluation_util_convert_old_output_test.jsonl | Updated test input data to include sample.generated_sample_data and output_status fields |
| sdk/evaluation/azure-ai-evaluation/tests/unittests/data/evaluation_util_convert_expected_output.json | Updated expected output to reflect integer token counts and new top-level sample data structure |
```python
except Exception as e:
    logger.error(
        f"Failed to parse generated_sample_data as JSON for row {row_idx}, eval_id: {eval_id}, eval_run_id: {eval_run_id}. Storing as string. Error: {e}"
```
The error log message may expose sensitive user content from generated_sample_data. According to the repository's security practices for log redaction, user-provided content should be redacted from log messages to prevent credentials or sensitive information from appearing in telemetry. The generated_sample_data can contain queries, responses, and other user inputs that should not be logged. Consider using logger.debug instead of logger.error, or ensure the exception message doesn't contain the actual data content.
Suggested change:

```diff
-except Exception as e:
-    logger.error(
-        f"Failed to parse generated_sample_data as JSON for row {row_idx}, eval_id: {eval_id}, eval_run_id: {eval_run_id}. Storing as string. Error: {e}"
+except Exception:
+    logger.error(
+        "Failed to parse generated_sample_data as JSON for row %s, eval_id: %s, eval_run_id: %s. "
+        "Storing as string.",
+        row_idx,
+        eval_id,
+        eval_run_id,
+        exc_info=True,
```
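Beyond redaction, the `%s` placeholders defer string formatting to the logging framework, and dropping `{e}` from the message keeps the exception text (which may echo user content) out of the log line itself; `exc_info=True` still records the traceback, which telemetry formatters can redact or drop independently.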
```python
metric_dict["sample"]["usage"]["total_tokens"] = (
    None if _is_none_or_nan(metric_value) else int(float(metric_value))
)
```
The `int(float(metric_value))` conversion can raise `ValueError` if `metric_value` is a string that cannot be converted to float. Consider adding exception handling or validation before conversion to handle edge cases gracefully. For example, if `metric_value` is an empty string or contains non-numeric characters, this will crash.
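One way to harden all three call sites (a hedged sketch; `_coerce_token_count` is a hypothetical helper name, and the `_is_none_or_nan` stand-in below approximates the SDK helper referenced in the snippets):

```python
import math

def _is_none_or_nan(value):
    # Stand-in for the SDK helper referenced in the snippets above.
    return value is None or (isinstance(value, float) and math.isnan(value))

def _coerce_token_count(metric_value):
    """Best-effort conversion of a token count to int; None on failure."""
    if _is_none_or_nan(metric_value):
        return None
    try:
        return int(float(metric_value))
    except (TypeError, ValueError):
        # Empty strings, non-numeric strings, and unexpected types
        # degrade to None instead of raising.
        return None

assert _coerce_token_count("42.0") == 42
assert _coerce_token_count("") is None
assert _coerce_token_count(None) is None
```

Each assignment would then reduce to, e.g., `metric_dict["sample"]["usage"]["total_tokens"] = _coerce_token_count(metric_value)`.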
```python
metric_dict["sample"]["usage"]["prompt_tokens"] = (
    None if _is_none_or_nan(metric_value) else int(float(metric_value))
)
```
The `int(float(metric_value))` conversion can raise `ValueError` if `metric_value` is a string that cannot be converted to float. Consider adding exception handling or validation before conversion to handle edge cases gracefully. For example, if `metric_value` is an empty string or contains non-numeric characters, this will crash.
```python
metric_dict["sample"]["usage"]["completion_tokens"] = (
    None if _is_none_or_nan(metric_value) else int(float(metric_value))
)
```
The `int(float(metric_value))` conversion can raise `ValueError` if `metric_value` is a string that cannot be converted to float. Consider adding exception handling or validation before conversion to handle edge cases gracefully. For example, if `metric_value` is an empty string or contains non-numeric characters, this will crash.
```python
except Exception as e:
    logger.error(
        f"Failed to parse generated_sample_data as JSON for row {row_idx}, eval_id: {eval_id}, eval_run_id: {eval_run_id}. Storing as string. Error: {e}"
    )
```
The error log message states "Storing as string" but the code doesn't actually store the `generated_sample_data` when JSON parsing fails. The `top_sample` remains an empty dict, and the `generated_sample_data` is not preserved in `input_data` either. This means the data is lost rather than stored as a string. Either remove the misleading "Storing as string" text from the error message, or actually implement fallback logic to store the raw string value.
Suggested change:

```diff
             )
         )
+        # Fallback: store the raw string value in the sample when JSON parsing fails
+        top_sample = {"generated_sample_data": top_sample_str}
+        # Remove the raw string from input_data to keep behavior consistent
+        input_data.pop("sample.generated_sample_data", None)
```
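Note that this fallback changes the output shape on parse failure: downstream consumers would see a sample dict whose only key is the raw `generated_sample_data` string, so anything reading the converted output should tolerate both shapes.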