This document explains how benchmark timings are collected using only CloudWatch log output, without modifying workload code.
- The orchestrator invokes functions with `LogType="Tail"`
- Lambda returns the last 4 KB of logs, base64 encoded, in `LogResult`
- The CloudWatch `REPORT` line contains the timing fields we need
- We parse the `REPORT` line locally and write metrics to DynamoDB
- Handlers contain only workload logic
No Telemetry API is required since we can just get this data from the logs.
- `Duration` in milliseconds (stored as `durationMs`)
- `Billed Duration` in milliseconds (stored as `billedDurationMs`)
- `Max Memory Used` in MB (stored as `maxMemoryUsedMB`)
- `Init Duration` in milliseconds (stored as `initDurationMs`, present only on cold starts)
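A minimal parser for these fields might look like the sketch below. It assumes the tab-delimited layout CloudWatch emits for `REPORT` lines; labels outside the map (such as `Memory Size`) are simply ignored, and `Init Duration` is absent on warm starts.

```python
# Map REPORT field labels to the storage keys listed above.
_FIELDS = {
    "Duration": "durationMs",
    "Billed Duration": "billedDurationMs",
    "Max Memory Used": "maxMemoryUsedMB",
    "Init Duration": "initDurationMs",
}

def parse_report_line(line):
    """Extract timing fields from a CloudWatch REPORT line.

    Fields are tab-separated, e.g. "Duration: 102.50 ms"; unknown
    fields are skipped, and missing fields simply stay absent.
    """
    metrics = {}
    for part in line.strip().split("\t"):
        for label, key in _FIELDS.items():
            prefix = label + ": "
            if part.startswith(prefix):
                # Value is the first token after the label; unit (ms/MB) follows
                metrics[key] = float(part[len(prefix):].split(" ")[0])
    return metrics

sample = (
    "REPORT RequestId: 8f5-example\tDuration: 102.50 ms\t"
    "Billed Duration: 103 ms\tMemory Size: 128 MB\t"
    "Max Memory Used: 64 MB\tInit Duration: 201.33 ms"
)
parsed = parse_report_line(sample)
```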
- Cold starts are produced by changing a function environment variable between invocations
- Warm starts reuse the same execution environment
`Init Duration` appears only on cold starts, which gives a second signal that a sample was cold.
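The environment-variable technique could be implemented roughly like this. The variable name `COLD_START_NONCE` is an arbitrary choice here, and note that `update_function_configuration` replaces the whole variable map, so real code should merge with the function's existing environment.

```python
import uuid

def fresh_nonce():
    """A value that differs on every call, used only to force a config change."""
    return uuid.uuid4().hex

def force_cold_start(client, function_name):
    """Touch an env var so the next invocation lands on a fresh sandbox.

    `client` is assumed to be a boto3 Lambda client. This overwrites the
    function's environment wholesale; merge with existing variables in
    real use.
    """
    client.update_function_configuration(
        FunctionName=function_name,
        Environment={"Variables": {"COLD_START_NONCE": fresh_nonce()}},
    )
    # The configuration update is asynchronous; wait for it to complete,
    # or the next invoke may still land on a warm sandbox.
    client.get_waiter("function_updated_v2").wait(FunctionName=function_name)
```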
The cold start technique is credited to AJ Stuyvenberg. See the repo linked below.
- Invoke with `LogType="Tail"`
- Base64 decode `LogResult`
- Find the single line that starts with `REPORT` and parse key values
- Store parsed values with the associated `testRunId`, `configId`, and `invocationType`
Parsing should be tolerant of minor formatting differences across runtimes.
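The final storage step might shape the item like this. The attribute names follow the list above; the key schema and the `Decimal` conversion through `str` are assumptions for illustration (DynamoDB rejects Python floats), and the actual `put_item` call is left to the orchestrator.

```python
from decimal import Decimal

def build_result_item(test_run_id, config_id, invocation_type, metrics):
    """Combine run identifiers with parsed REPORT metrics into one item.

    `metrics` is the dict produced by the REPORT parser, e.g.
    {"durationMs": 102.5, "billedDurationMs": 103.0}.
    """
    item = {
        "testRunId": test_run_id,
        "configId": config_id,
        "invocationType": invocation_type,
    }
    # DynamoDB does not accept float; converting via str avoids
    # binary-float artifacts like Decimal(0.1) != Decimal("0.1").
    item.update({k: Decimal(str(v)) for k, v in metrics.items()})
    return item

item = build_result_item("run-1", "cfg-128", "cold", {"durationMs": 102.5})
# e.g. table.put_item(Item=item), where table is a boto3 DynamoDB Table resource
```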
- If the handler returns an error variant, still record the `REPORT` metrics and attach the error string to the result item
- If the invocation fails before a `REPORT` line is produced, record a result item with an explicit failure flag and no timing fields
- Do not retry failed invocations inside the metrics module; retries are orchestrator policy
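The no-`REPORT` failure case can be sketched as below. The attribute names `failed` and `errorMessage` are illustrative choices, not a fixed schema.

```python
def failure_item(test_run_id, config_id, invocation_type, error):
    """Result item for an invocation that died before a REPORT line appeared.

    No timing fields are set; the explicit flag lets queries separate
    failures from genuinely missing data.
    """
    return {
        "testRunId": test_run_id,
        "configId": config_id,
        "invocationType": invocation_type,
        "failed": True,
        "errorMessage": str(error),
    }

item = failure_item("run-1", "cfg-128", "warm", TimeoutError("no response"))
```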
- `Init Duration` is only present on cold starts
- `Billed Duration` is rounded up, which is expected
- The `REPORT` line may include extra fields in some runtimes; unknown fields can be ignored
- Do not log large payloads; the `LogResult` limit is 4 KB