test(benchmark): Define requirements for performance benchmarking #3

@christian-pinto

Testing Objective

The objective of this issue is to establish the core requirements for a flexible and usable benchmarking system that evaluates registered models and algorithms. The system must support both internally defined test cases and integration with existing, domain-specific benchmarking frameworks, and it must let users easily discover the available testing options.

To achieve this, we need to collaboratively define and agree upon the following aspects:

Scope of Requirements:

  • 1. Benchmark Registry and Discoverability: Define how users find and select available benchmarks.

  • 2. Benchmark Test Case Handling: Define how the system manages test cases, accommodating two distinct types:

    • A) Natively Defined Test Cases: Specify the standard format for simple, internal test cases created from scratch within our system.
    • B) Integration with External Frameworks: Define the strategy for incorporating existing, domain-specific frameworks (e.g., MLPerf, GLUE). This involves creating "adapters" that allow our system to invoke the framework and parse its results.
  • 3. Model Execution Script Requirements: Define the standard for the script that runs a model against a benchmark.

    • Interface: What is the standard interface the script must expose? It must be generic enough to work with both native and external test cases.
    • Responsibilities: The script is responsible for handling the model's specific logic (e.g., setting hyperparameters), receiving the benchmark definition, and executing the model.
    • Ownership: This script will be provided by the model contributor.
  • 4. Model/Algorithm Contributor Responsibilities: Articulate what contributors must provide when they add a model or algorithm.

  • 5. Benchmarking Orchestration: Determine the high-level operations we will support (e.g., single runs, hyperparameter sweeps, triggering external frameworks).

  • 6. Dataset Sourcing & Hosting: Specify requirements for datasets, noting they may be managed internally or be part of an external framework.

  • 7. Execution Environment: Outline infrastructure requirements, including a mechanism to handle dependencies for external frameworks (e.g., via containerization).

  • 8. Data Storage & Analysis: Define a unified format for storing results to ensure consistent analysis, regardless of the benchmark's source.
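
To make the discussion of items 1, 2, 3, and 8 more concrete, here is a minimal Python sketch of how a registry, the two test case types, a contributor-provided execution script, and a unified result record could fit together. Nothing here is an agreed design: every name (`BenchmarkRegistry`, `NativeTestCase`, `ExternalAdapter`, `BenchmarkResult`, `ModelScript`) is hypothetical, and the "external framework" is a stand-in lambda rather than a real integration with MLPerf or GLUE.

```python
from __future__ import annotations

from dataclasses import dataclass
from typing import Callable, Protocol

# Hypothetical script interface (item 3): the contributor's script receives a
# benchmark definition and returns metric names mapped to values.
ModelScript = Callable[[dict[str, object]], dict[str, float]]


@dataclass(frozen=True)
class BenchmarkResult:
    """Unified result record (item 8), same shape for every benchmark source."""
    benchmark: str
    model: str
    metrics: dict[str, float]
    source: str  # "native" or the name of the external framework


class Benchmark(Protocol):
    """Common interface shared by native test cases and adapters (item 2)."""
    name: str

    def run(self, model_script: ModelScript, model: str) -> BenchmarkResult: ...


@dataclass
class NativeTestCase:
    """Item 2A: a test case defined directly in our own format."""
    name: str
    definition: dict[str, object]

    def run(self, model_script: ModelScript, model: str) -> BenchmarkResult:
        metrics = model_script(self.definition)
        return BenchmarkResult(self.name, model, metrics, source="native")


@dataclass
class ExternalAdapter:
    """Item 2B: wraps an external framework behind the same interface."""
    name: str
    framework: str
    invoke: Callable[[ModelScript], str]      # assumed: runs the framework, returns raw output
    parse: Callable[[str], dict[str, float]]  # assumed: converts raw output to metrics

    def run(self, model_script: ModelScript, model: str) -> BenchmarkResult:
        raw_output = self.invoke(model_script)
        return BenchmarkResult(self.name, model, self.parse(raw_output), source=self.framework)


class BenchmarkRegistry:
    """Item 1: lets users discover and select benchmarks by name."""

    def __init__(self) -> None:
        self._benchmarks: dict[str, Benchmark] = {}

    def register(self, benchmark: Benchmark) -> None:
        if benchmark.name in self._benchmarks:
            raise ValueError(f"benchmark already registered: {benchmark.name}")
        self._benchmarks[benchmark.name] = benchmark

    def available(self) -> list[str]:
        return sorted(self._benchmarks)

    def get(self, name: str) -> Benchmark:
        return self._benchmarks[name]


# Usage with a toy model script and a fake external framework.
def demo_model_script(definition: dict[str, object]) -> dict[str, float]:
    return {"accuracy": 0.9}  # placeholder value, not a real measurement


registry = BenchmarkRegistry()
registry.register(NativeTestCase("toy-accuracy", {"samples": 100}))
registry.register(ExternalAdapter(
    name="fake-suite",
    framework="fake-framework",
    invoke=lambda script: "score=0.5",
    parse=lambda raw: {"score": float(raw.split("=")[1])},
))

results = [registry.get(name).run(demo_model_script, model="demo-model")
           for name in registry.available()]
```

The point of the sketch is that orchestration and analysis code only ever sees `Benchmark` and `BenchmarkResult`, so adding a new external framework would mean writing one adapter rather than changing the storage format. Whether the execution script should be a Python callable, a CLI entry point, or a container command remains an open question for this issue.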


✅ Expected Outcome:

Upon successful completion of this issue, a concise document detailing the agreed-upon benchmarking requirements will be committed to the project repository. This document will serve as the foundation for the subsequent design and implementation of the benchmarking protocol.

Labels

documentation
