Target: <50ms end-to-end for FastPath mode
- Expected: 1-5ms
- Measured: TBD
- Expected: <1ms
- Measured: ~1.3µs (1312 ns/op from benchmarks)
- Sub-operations (100 fastlets):
  - Candidate filtering: <0.5ms
  - Scoring: <0.5ms
  - Selection: <0.5ms
- Large pool (1000 fastlets): ~14.6µs (14613 ns/op from benchmarks)
- Expected: 5-20ms
- Measured: TBD
- Expected: 10-30ms
- Measured: TBD
- Sub-operations:
  - Image pull (cached): 0ms
  - Container create: TBD
  - Container start: TBD
Test Environment:
- CPU: Apple M4 Pro
- OS: darwin/arm64
- Date: 2026-01-26
| Benchmark | ns/op | B/op | allocs/op | Notes |
|---|---|---|---|---|
| BenchmarkRegistryAllocate | 1312 | 993 | 4 | Standard allocation (100 fastlets) |
| BenchmarkRegistryAllocateWithPorts | 1469 | 993 | 4 | With port constraints |
| BenchmarkRegistryAllocateNoImageMatch | 1349 | 993 | 4 | No pre-image match |
| BenchmarkRegistryAllocateLargePool | 14613 | 8297 | 4 | Large pool (1000 fastlets) |
| BenchmarkRegistryRegisterOrUpdate | 127.9 | 91 | 4 | Fastlet registration |
| BenchmarkRegistryGetAllFastlets | 4810 | 19328 | 2 | Get all fastlets (100) |
| BenchmarkRegistryGetAllFastletsLargePool | 51290 | 188419 | 2 | Get all fastlets (1000) |
| BenchmarkRegistryGetFastletByID | 27.06 | 0 | 0 | Map lookup - zero alloc |
| BenchmarkRegistryRelease | 14.49 | 0 | 0 | Fastlet release - zero alloc |
| BenchmarkRegistryCleanupStaleFastlets | 38.58 | 0 | 0 | Stale cleanup - zero alloc |
| BenchmarkParallelAllocate | 1513 | 1016 | 6 | Concurrent allocation |
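The columns in the table above come from Go's `testing` package: `ns/op` is wall time per iteration, while `B/op` and `allocs/op` are reported when allocation tracking is enabled (`-benchmem` or `b.ReportAllocs()`). The sketch below shows how such numbers are produced. `toyRegistry` and its methods are hypothetical stand-ins, not the real `fastletpool` registry, and `testing.Benchmark` is used only so the example runs as a plain program; in the repository the equivalent would presumably be a `BenchmarkXxx(b *testing.B)` function in a `_test.go` file.

```go
package main

import (
	"fmt"
	"testing"
)

// toyFastlet and toyRegistry are hypothetical stand-ins for illustration;
// they are not the real fastletpool types.
type toyFastlet struct {
	id   int
	busy bool
}

type toyRegistry struct {
	fastlets []toyFastlet
}

func newToyRegistry(n int) *toyRegistry {
	r := &toyRegistry{fastlets: make([]toyFastlet, n)}
	for i := range r.fastlets {
		r.fastlets[i].id = i
	}
	return r
}

// allocate scans linearly for the first idle fastlet and marks it busy.
// It returns -1 when every fastlet is in use.
func (r *toyRegistry) allocate() int {
	for i := range r.fastlets {
		if !r.fastlets[i].busy {
			r.fastlets[i].busy = true
			return r.fastlets[i].id
		}
	}
	return -1
}

// release marks the fastlet with the given id as idle again.
func (r *toyRegistry) release(id int) {
	r.fastlets[id].busy = false
}

// benchAllocate measures one allocate/release pair per iteration.
func benchAllocate(poolSize int) testing.BenchmarkResult {
	return testing.Benchmark(func(b *testing.B) {
		r := newToyRegistry(poolSize)
		b.ReportAllocs() // enables B/op and allocs/op, like -benchmem
		b.ResetTimer()   // exclude registry setup from the measurement
		for i := 0; i < b.N; i++ {
			r.release(r.allocate())
		}
	})
}

func main() {
	for _, n := range []int{100, 1000} {
		res := benchAllocate(n)
		fmt.Printf("pool=%d  %d ns/op  %d B/op  %d allocs/op\n",
			n, res.NsPerOp(), res.AllocedBytesPerOp(), res.AllocsPerOp())
	}
}
```

The absolute numbers from this toy say nothing about the real registry; the point is only how the three columns are derived.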
Run benchmarks with:
go test ./internal/controller/fastletpool/ -bench=. -benchmem
Start controller with profiling:
./scripts/profile.sh
Capture 30-second profile:
curl -s -o /tmp/controller_cpu.prof 'http://localhost:6060/debug/pprof/profile?seconds=30'
View profile:
go tool pprof -http=:8080 /tmp/controller_cpu.prof
Generate flamegraph from captured profile:
./scripts/flamegraph.sh
Prometheus metrics are available for FastPath operations:
- `fastpath_create_sandbox_duration_seconds`: Histogram of CreateSandbox RPC duration
  - Labels: `mode` (fast/strong), `success` (true/false)
  - Buckets: 1ms, 5ms, 10ms, 25ms, 50ms, 100ms, 250ms, 500ms, 1s
Enable detailed timing logs with verbosity level 2:
# Controller
./bin/controller -v=2
# Fastlet
./bin/fastlet -v=2
This will show:
- Registry allocation timing breakdown
- Fastlet RPC call timing
- containerd Runtime timing breakdown
- Registry allocation - minimize lock contention (currently ~1.3µs for 100 fastlets)
- Fastlet RPC - consider connection pooling (gRPC connection reuse)
- containerd - ensure image cache hit (zero-pull goal)
- Controller reconcile - optimize periodic sync interval
- FastPath gRPC server - measure actual gRPC call overhead
| Operation | Target | Current | Status |
|---|---|---|---|
| FastPath CreateSandbox (e2e) | <50ms | TBD | 🔍 To Measure |
| Registry.Allocate (100 fastlets) | <2ms | ~1.3µs | ✅ Pass |
| Registry.Allocate (1000 fastlets) | <20ms | ~14.6µs | ✅ Pass |
| Fastlet.CreateSandbox RPC | <20ms | TBD | 🔍 To Measure |
| containerd container start | <30ms | TBD | 🔍 To Measure |
If performance degrades:
- Run benchmarks to detect regression:
  `go test ./internal/controller/fastletpool/ -bench=. -benchmem`
- Capture a CPU profile to identify hotspots:
  `curl -s -o /tmp/cpu.prof 'http://localhost:6060/debug/pprof/profile?seconds=30'`, then `go tool pprof -list Allocate /tmp/cpu.prof`
- Check logs with `-v=2` to see the timing breakdown
- Check metrics at `:9091/metrics` for Prometheus histograms