Techniques to optimize RAGLeakLab runs.
Enable caching with `--cache` to skip redundant retrieval/generation when inputs are identical:

```bash
ragleaklab run --corpus data/corpus --attacks data/attacks --out results --cache
```

- Cache key: `sha256(corpus_hash + target_hash + query + retrieval_params)`
- On a cache miss: runs the pipeline/target and stores the result
- On a cache hit: returns the cached result and skips execution
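The key scheme above can be sketched as follows. The helper name and the exact concatenation and serialization are assumptions for illustration, not RAGLeakLab's actual implementation:

```python
import hashlib
import json

def cache_key(corpus_hash: str, target_hash: str, query: str, retrieval_params: dict) -> str:
    """Hypothetical sketch of the documented sha256 cache-key scheme."""
    payload = "|".join([
        corpus_hash,
        target_hash,
        query,
        json.dumps(retrieval_params, sort_keys=True),  # stable serialization
    ])
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()

# Identical inputs yield identical keys (cache hit); any change invalidates.
k1 = cache_key("c1", "t1", "find the canary", {"top_k": 5, "strategy": "dense"})
k2 = cache_key("c1", "t1", "find the canary", {"strategy": "dense", "top_k": 5})
print(k1 == k2)  # True: param dict order does not matter after sorting keys
```

Serializing the retrieval parameters with `sort_keys=True` is what makes the key stable across equivalent inputs.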
Cache files are stored in `<output_dir>/.ragleaklab_cache/`.
The cache automatically invalidates when:

- The corpus changes (different `corpus_hash`)
- The target changes (different `target_hash`)
- The query text changes
- Retrieval parameters change (`top_k`, `strategy`)
To clear the cache manually:

```bash
rm -rf results/.ragleaklab_cache/
```

Each cache entry stores:

- `retrieved`: retrieval hits with chunk data and scores
- `context`: the context passed to the generator
- `answer`: the generated answer
When the cache is enabled, `runs.jsonl` includes `cache_hit: true/false` in each case's `meta` field.
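As a sketch of consuming that field, assuming the `meta.cache_hit` layout described above (the sample lines here are made up, not real tool output):

```python
import json

# Made-up runs.jsonl lines; real files are produced by `ragleaklab run --cache`.
lines = [
    '{"test_id": "t1", "meta": {"cache_hit": true}}',
    '{"test_id": "t2", "meta": {"cache_hit": false}}',
    '{"test_id": "t3", "meta": {"cache_hit": true}}',
]
hits = sum(1 for ln in lines if json.loads(ln)["meta"]["cache_hit"])
print(f"cache hit rate: {hits / len(lines):.2f}")  # cache hit rate: 0.67
```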
| Scenario | Speedup |
|---|---|
| Identical corpus + queries | ~10-100x (skips retrieval+generation) |
| Different queries | No speedup (cache miss) |
| Modified corpus | No speedup (invalidated by hash) |
- Enable `--cache` for CI regression runs with frozen corpora
- Disable the cache when actively developing attack queries
- Clear the cache after corpus updates to ensure fresh results
Run attack cases in parallel with `--jobs N`:

```bash
ragleaklab run --corpus data/corpus --attacks data/attacks --out results --jobs 4
```

Results are always sorted by `test_id` regardless of parallel execution order, ensuring `report.json` and `runs.jsonl` are reproducible across runs.
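A minimal sketch of that pattern, run cases concurrently and then sort by `test_id` before writing, with a stand-in `run_case` (the real tool's executor is not shown here):

```python
from concurrent.futures import ThreadPoolExecutor, as_completed

def run_case(case: dict) -> dict:
    # Stand-in for executing one attack case against the target.
    return {"test_id": case["test_id"], "answer": case["query"].upper()}

cases = [{"test_id": f"t{i}", "query": f"q{i}"} for i in range(4)]
with ThreadPoolExecutor(max_workers=4) as pool:
    futures = [pool.submit(run_case, c) for c in cases]
    results = [f.result() for f in as_completed(futures)]  # completion order varies

# Sorting restores a deterministic order regardless of which case finished first.
results.sort(key=lambda r: r["test_id"])
print([r["test_id"] for r in results])  # always ['t0', 't1', 't2', 't3']
```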
- Cache disabled: when `jobs > 1`, the disk cache is disabled (it is not process-safe)
- HTTP targets: use with caution against HTTP targets without rate limiting

For HTTP targets without rate limiting, consider using `--jobs 1` to avoid overwhelming the target service. Future versions may add configurable rate limiting for parallel HTTP requests.
| Scenario | Speedup |
|---|---|
| CPU-bound cases, jobs=N | Up to Nx (limited by core count) |
| I/O-bound HTTP targets | Significant (parallel requests) |
| Single case | No speedup |
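As a rough rule of thumb for the CPU-bound row above, attainable speedup caps at the smaller of worker count and core count; a hypothetical estimator:

```python
def expected_speedup(jobs: int, cores: int) -> int:
    """Ideal-case speedup for CPU-bound cases: limited by both jobs and cores."""
    return min(jobs, cores)

print(expected_speedup(4, 8))   # 4: worker count is the bottleneck
print(expected_speedup(16, 8))  # 8: core count is the bottleneck
```

Real speedups fall below this ideal once per-case overhead or I/O contention enters the picture.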
Use `ragleaklab bench` to measure the performance of attack packs:

```bash
ragleaklab bench --pack canary-basic --runs 3 --out out/bench.json
```

| Option | Default | Description |
|---|---|---|
| `--pack`, `-p` | Required | Attack pack to benchmark |
| `--runs`, `-r` | 3 | Number of benchmark runs |
| `--out`, `-o` | Required | Output JSON file path |
| `--cache` | false | Enable disk cache |
| `--jobs`, `-j` | 1 | Parallel workers |
```json
{
  "pack": "canary-basic",
  "runs": 3,
  "cases_per_run": 10,
  "jobs": 1,
  "cache_enabled": false,
  "total_runtime_sec": 0.156,
  "run_times_sec": [0.052, 0.051, 0.053],
  "median_per_case_sec": 0.00512,
  "median_per_case_ms": 5.12,
  "cache_hit_rate": 0.0,
  "generated_at": "2026-02-04T00:00:00+00:00"
}
```

| Metric | Description |
|---|---|
| `total_runtime_sec` | Sum of all run times |
| `median_per_case_sec` | Median time per case (reliable estimate) |
| `cache_hit_rate` | Fraction of cached results (0.0 - 1.0) |
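The aggregate metrics can be related as in this sketch, which derives them from per-run times. The arithmetic is an assumption for illustration; the real tool likely times individual cases, which is why its sample median (5.12 ms) differs slightly from a per-run estimate:

```python
import statistics

# Example numbers matching the sample output's run_times_sec.
run_times_sec = [0.052, 0.051, 0.053]
cases_per_run = 10

total_runtime_sec = sum(run_times_sec)
per_case_sec = [t / cases_per_run for t in run_times_sec]
median_per_case_sec = statistics.median(per_case_sec)

print(round(total_runtime_sec, 3))           # 0.156
print(round(median_per_case_sec * 1000, 2))  # 5.2 (ms)
```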
| Scenario | Budget | Notes |
|---|---|---|
| Laptop (in-memory pipeline) | 200 cases < 2s | ~10ms/case |
| CI pipeline (in-memory) | 200 cases < 5s | Account for VM overhead |
| HTTP target | 50 cases < 30s | Network latency dependent |
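One way to enforce a budget from the table above in CI is to gate on the bench JSON. The file path, the inlined stand-in report, and the threshold below are assumptions, not part of the tool:

```python
# Stand-in for json.load(open("out/bench.json")); the path is an assumption.
report = {"pack": "canary-basic", "median_per_case_ms": 5.12}

BUDGET_MS = 10.0  # laptop budget from the table above: ~10 ms/case
if report["median_per_case_ms"] >= BUDGET_MS:
    raise SystemExit(f"{report['pack']} over budget: {report['median_per_case_ms']} ms/case")
print("within budget")
```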
- Baseline first: run `bench` once to establish baseline timing
- Multiple runs: use `--runs 5` for stable median estimates
- Compare jobs: test `--jobs 1` vs `--jobs 4` to measure parallelism gains
- Cache impact: compare with and without `--cache` to measure cache speedup