[Enhancement] Metrics changes and Eval demo tasks addition #119

Merged
vipul-mittal merged 12 commits into main from scratch/eval_demo_tasks
Feb 3, 2026
@NiraliPopat NiraliPopat commented Jan 22, 2026

Collaborator

Summary

This PR adds demo tasks for evaluation, along with the changes needed to support them.

Features implemented:

  • question answering eval task added at tasks/eval/question_answering/simpleqa
  • classification eval task added at tasks/eval/classification/simpleqa
  • precision, recall, and f1_score now report the average score as well as per-class scores
  • unit_metric registry added
  • fixes to support eval tasks
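
To make the metric changes above concrete, here is a minimal sketch of per-class and averaged precision, recall, and F1, registered under a name in a simple registry. It is not the code from this PR: `METRIC_REGISTRY`, `register_metric`, and `precision_recall_f1` are hypothetical names, the output layout is illustrative, and the "average" is assumed to be an unweighted (macro) average over classes.

```python
from collections import Counter
from typing import Callable, Dict, Hashable, Sequence

# Hypothetical registry mapping a metric name to its function.
# This is an illustration only, not the unit_metric registry added in the PR.
METRIC_REGISTRY: Dict[str, Callable] = {}


def register_metric(name: str) -> Callable:
    """Decorator that stores the decorated function under `name`."""
    def decorator(fn: Callable) -> Callable:
        METRIC_REGISTRY[name] = fn
        return fn
    return decorator


def _safe_div(numerator: float, denominator: float) -> float:
    # Return 0.0 instead of raising when a class has no predictions or no samples.
    return numerator / denominator if denominator else 0.0


@register_metric("precision_recall_f1")
def precision_recall_f1(golden: Sequence[Hashable], predicted: Sequence[Hashable]) -> dict:
    """Compute per-class scores and their macro average (assumed averaging mode)."""
    if len(golden) != len(predicted):
        raise ValueError("golden and predicted must have the same length")

    tp, fp, fn = Counter(), Counter(), Counter()
    for gold, pred in zip(golden, predicted):
        if gold == pred:
            tp[gold] += 1
        else:
            fp[pred] += 1  # predicted class gains a false positive
            fn[gold] += 1  # true class gains a false negative

    per_class = {}
    for label in sorted(set(golden) | set(predicted), key=str):
        precision = _safe_div(tp[label], tp[label] + fp[label])
        recall = _safe_div(tp[label], tp[label] + fn[label])
        f1 = _safe_div(2 * precision * recall, precision + recall)
        per_class[label] = {"precision": precision, "recall": recall, "f1_score": f1}

    average = {
        key: _safe_div(sum(scores[key] for scores in per_class.values()), len(per_class))
        for key in ("precision", "recall", "f1_score")
    }
    return {"average": average, "per_class": per_class}
```

A caller could look the function up by name, for example `METRIC_REGISTRY["precision_recall_f1"](golden_labels, predicted_labels)`, and read the `average` and `per_class` entries from the returned dictionary.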

Performance impact (if any):

  • N/A

How to Test the feature

Steps for reviewers to verify functionality:

  1. Run the task tasks/eval/question_answering/simpleqa
  2. Check the results in the tasks/eval/question_answering/simpleqa/MetricCollatorPostProcessor_<timestamp>.json file
  3. Run the task tasks/eval/classification/simpleqa
  4. Check the results in the tasks/eval/classification/simpleqa/MetricCollatorPostProcessor_<timestamp>.json file
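
For the "observe" steps, a small helper like the one below can locate and load the most recent result file in a task directory. This is a sketch under assumptions: `load_latest_result` is a hypothetical helper, the result files are assumed to be valid JSON whose names start with `MetricCollatorPostProcessor_`, and nothing is assumed about the schema inside the file.

```python
import json
from pathlib import Path
from typing import Any, Optional


def load_latest_result(task_dir: str, prefix: str = "MetricCollatorPostProcessor_") -> Optional[Any]:
    """Return the parsed content of the newest matching JSON file, or None if there is none."""
    # Sort by modification time so the last element is the most recent run.
    candidates = sorted(
        Path(task_dir).glob(f"{prefix}*.json"),
        key=lambda path: path.stat().st_mtime,
    )
    if not candidates:
        return None
    with candidates[-1].open(encoding="utf-8") as handle:
        return json.load(handle)


if __name__ == "__main__":
    for task_dir in (
        "tasks/eval/question_answering/simpleqa",
        "tasks/eval/classification/simpleqa",
    ):
        result = load_latest_result(task_dir)
        if result is None:
            print(f"{task_dir}: no result file found")
        else:
            # Pretty-print the file as-is, since its schema is not assumed here.
            print(f"{task_dir}:")
            print(json.dumps(result, indent=2))
```

Run it from the repository root after both tasks have finished.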

Screenshots (if applicable)

NA

Checklist

  • Lint fixes and unit testing done
  • End to end task testing
  • Documentation updated

Result Files

Classification -> simpleqa

MetricCollatorPostProcessor_2026-01-27_14-30-59.json

Question_Answering -> simpleqa

MetricCollatorPostProcessor_2026-01-27_14-27-27.json

@NiraliPopat NiraliPopat requested a review from a team as a code owner January 22, 2026 08:28
@NiraliPopat NiraliPopat marked this pull request as draft January 22, 2026 08:29
@NiraliPopat NiraliPopat marked this pull request as ready for review January 22, 2026 08:55
@NiraliPopat NiraliPopat marked this pull request as draft January 22, 2026 09:20
@NiraliPopat NiraliPopat marked this pull request as ready for review January 22, 2026 09:41
@bidyapati-p

Collaborator

We need to update the documentation, since this is the first task on the sygra platform. Let's discuss and update the documentation as part of this PR.

@bidyapati-p

Collaborator

@NiraliPopat
Can you please attach the result file so we can review the output schema?

@NiraliPopat

Collaborator Author

Added the documentation README and the result files to the PR description.

@psriramsnc psriramsnc left a comment

Collaborator

LGTM 🚀

@psriramsnc psriramsnc added the enhancement New feature or request label Feb 3, 2026
@vipul-mittal vipul-mittal merged commit 083da7e into main Feb 3, 2026
4 checks passed
@vipul-mittal vipul-mittal deleted the scratch/eval_demo_tasks branch February 3, 2026 09:05
abhigyaverma02 pushed a commit that referenced this pull request Apr 17, 2026
* Metrics changes and Eval demo tasks addition

* Unit tests modification

* fix format

* fix lint

* fix tests

* fix tests

* add documentation

* fix documentation link

---------

Co-authored-by: Vipul Mittal <118464422+vipul-mittal@users.noreply.github.com>
Co-authored-by: Sriram Puttagunta <sriram.puttagunta@servicenow.com>
Labels

enhancement New feature or request

5 participants