BenchmarkResults Table – Data Model & Access Patterns

This document explains the DynamoDB schema used to store test results and analyze test runs.

Use Case

The BenchmarkResults DynamoDB table stores results for the Lambda ARM vs x86 benchmark suite.

This is a single-table, multi-entity database that is optimized for:

Fast, cheap queries over aggregated statistics (avoids excessive reads).
Occasional deep dives into raw invocation results.
Tracking and comparing multiple benchmark runs over time.

Because this table is for test result data and isn't mission-critical or production data, we don't have backups enabled for this table.

Core Access Patterns

The table is designed around these primary access patterns:

AP1 – Get test run summary + all aggregates for a run
“Show me metadata and stats for test run X.”
AP2 – Get a single aggregate for a specific config in a run
“Give me cold-start stats for python3.13-arm64-cpu-512 in run X.”
AP3 – Get all raw results for a test run
“Fetch all samples for run X so I can re-aggregate / debug.”
AP4 – Get all runs for a specific configuration
“Show how python3.13-arm64-cpu-512 has changed over time.”
AP5 – List all test runs
“Show me all benchmark runs with their status, notes, and timestamps.”
AP6 (optional / advanced) – Detect missing aggregates for a run
“Given the test matrix for run X, which aggregates are missing?”

The schema and GSIs exist to satisfy these patterns directly.

Table Design Overview

Single-Table, Multi-Entity

All entities share one table:

result – individual invocation data
aggregate – pre-calculated statistics per configuration
test-run – metadata and test matrix for a run

Every item includes:

itemType {"result","aggregate","test-run"} – required entity type declaration
testRunId – UUID that ties all data for a run together

Primary Key (Base Table)

The base table uses a generic primary key:

Partition key: pk
Sort key: sk

Item key patterns:

Test run items
- pk = "TESTRUN#{testRunId}"
- sk = "TESTRUN#{testRunId}"
Aggregate items
- pk = "TESTRUN#{testRunId}"
- sk = "AGGREGATE#{configId}#{invocationType}"
Result items
- pk = "{testRunId}#{configId}"
- sk = "{invocationType}#{invocationNumber}"

Note
Aggregates and the test-run item for a run share the same pk.
This allows a single query on pk = "TESTRUN#{testRunId}" to return:

One test-run item (metadata)

Many aggregate items (stats)

Entity Definitions

Test Run Item (`itemType = "test-run"`)

Key pattern

pk = "TESTRUN#{testRunId}"
sk = "TESTRUN#{testRunId}"

Purpose

One item per benchmark run.
Stores metadata and the test matrix.

Typical attributes

Identity & timestamps:
- testRunId
- timestamp (start time)
- startTime, endTime (optional)
Status:
- status – "in_progress", "completed", or "failed"
- failedInvocations (optional)
- errorSummary (optional)
Run configuration:
- mode – "test" or "production"
- totalConfigurations
- totalInvocations
- coldStartsPerConfig
- warmStartsPerConfig
Documentation:
- notes – human-readable description of what changed / why
Test matrix (recommended):
- High-level lists:
  - runtimes
  - architectures
  - workloadTypes
- Detailed:
  - configurations[] with:
    - runtime
    - architecture
    - workloadType
    - memorySizes[]

The test matrix enables reconstruction of expected (configId, invocationType) aggregates and detection of missing data, and it supports partial runs (for example, filtered by runtime or memory).

Aggregate Item (`itemType = "aggregate"`)

Key pattern

pk = "TESTRUN#{testRunId}"
sk = "AGGREGATE#{configId}#{invocationType}"

Purpose

One item per (testRunId, configId, invocationType).
Stores pre-calculated statistics for fast, low-cost analysis.

Typical attributes

Identity:
- testRunId
- configId
- invocationType – "cold" | "warm"
- timestamp (when the aggregate was created)
Configuration dimensions:
- runtime
- architecture
- workloadType
- memorySizeMB
Sample metadata:
- sampleCount
- failedCount
- allSuccessful
Statistics objects (names can evolve):
- durationMsStats
- billedDurationMsStats
- memoryMBStats
- initDurationMsStats (cold only)

Each stats object typically includes a combination of: mean, median, min, max, percentiles (p90, p95, p99), sampleCount, and optional flags such as outliersRemoved.

Result Item (`itemType = "result"`)

Key pattern

pk = "{testRunId}#{configId}"
sk = "{invocationType}#{invocationNumber}"

Purpose

Raw per-invocation data used for deep debugging or re-computing stats.

Typical attributes

Identity:
- testRunId
- configId
- invocationType – "cold" | "warm"
- invocationNumber – integer
- timestamp
Metrics:
- Duration metrics (for example, durationMs, billedDurationMs)
- Memory usage (for example, maxMemoryUsedMB)
- Optional cold-start metric: initDurationMs
- Any additional metrics captured (for example, client latency)
Lambda metadata:
- functionName
- lambdaRequestId (or equivalent request ID)

Global Secondary Indexes

ConfigIndex (GSI1)

Schema

Partition key: configId
Sort key: timestamp

Supports

AP4 – Get all runs for a specific configuration

Typical questions

“Show all runs where python3.13-arm64-cpu-512 was tested.”
“Get the most recent N runs for this configuration.”

TestRunIndex (GSI2)

Schema

Partition key: testRunId
Sort key: timestamp

Supports

AP3 – Get all raw results for a test run

Typical questions

“Fetch all samples for run abc-123.”
“Recompute stats for run abc-123 using custom logic.”

Access Patterns and How to Query

The table below summarizes which part of the model to use for each access pattern.

Access pattern	Target	Key condition / usage
AP1 – Test run summary + all aggregates for a run	Base table	`pk = "TESTRUN#{testRunId}"`. Filter `itemType` = `test-run` / `aggregate`.
AP2 – Single aggregate for `(testRunId, configId, type)`	Base table (GetItem)	`pk = "TESTRUN#{testRunId}"`, `sk = "AGGREGATE#{configId}#{invocationType}"`.
AP3 – All raw results for a test run	TestRunIndex (GSI2)	`testRunId = {testRunId}`; use `timestamp` sort key as needed.
AP4 – All runs for a given configuration	ConfigIndex (GSI1)	`configId = {configId}`; sort / filter by `timestamp`.
AP5 – List all test runs	Base table (Scan)	Filter on `itemType = "test-run"`, then sort by `timestamp`.
AP6 – Detect missing aggregates for a run	Base table + primary-key gets	1) Read `test-run` item; 2) for each expected `(configId, type)` perform GetItem on aggregate.

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

BenchmarkResults Table – Data Model & Access Patterns

Use Case

Core Access Patterns

Table Design Overview

Single-Table, Multi-Entity

Primary Key (Base Table)

Entity Definitions

Test Run Item (`itemType = "test-run"`)

Aggregate Item (`itemType = "aggregate"`)

Result Item (`itemType = "result"`)

Global Secondary Indexes

ConfigIndex (GSI1)

TestRunIndex (GSI2)

Access Patterns and How to Query

FilesExpand file tree

dynamodb-schema.md

Latest commit

History

dynamodb-schema.md

File metadata and controls

BenchmarkResults Table – Data Model & Access Patterns

Use Case

Core Access Patterns

Table Design Overview

Single-Table, Multi-Entity

Primary Key (Base Table)

Entity Definitions

Test Run Item (itemType = "test-run")

Aggregate Item (itemType = "aggregate")

Result Item (itemType = "result")

Global Secondary Indexes

ConfigIndex (GSI1)

TestRunIndex (GSI2)

Access Patterns and How to Query

Test Run Item (`itemType = "test-run"`)

Aggregate Item (`itemType = "aggregate"`)

Result Item (`itemType = "result"`)