diff --git a/README.md b/README.md
index 992af0b..1c3b4a2 100644
--- a/README.md
+++ b/README.md
@@ -180,7 +180,7 @@ The criteria spell out what each score means for each dimension. Here is the ful
### Deep dive — does a public skill, and tuning it, actually help?
-A separate study took a public DDD skill (the `tactical-ddd` skill from `ntcoding/claude-skillz`) and its repo-tuned version across four configurations on two deliberately different tasks — a feature on a clean DDD codebase and a legacy anemic→rich refactor. The headline: **the effect is task-dependent** — tuning the skill to the repo adds **+0.13** quality on the clean-feature task but only **+0.06** on the legacy refactor (increment over the bare model; absolute scores across tasks aren't comparable). Two lessons that generalize: judge *per dimension*, not one aggregate; and a skill present on disk is not a skill used — verify it activated.
+A separate study took a public DDD skill (the `tactical-ddd` skill from `ntcoding/claude-skillz`) and its repo-tuned version across four configurations on two deliberately different tasks — a feature on a clean DDD codebase and a legacy anemic→rich refactor. The headline: **a repo-tuned skill measurably beats the bare model on both tasks** (+0.12 on the clean feature, +0.05 on the legacy refactor — increment over vanilla, both clearing our significance bar), and it also beats hand-written DDD hints. But an **off-the-shelf public skill helps only on the greenfield feature** (+0.07) — on the legacy refactor it doesn't beat the bare model at all. Two lessons that generalize: judge *per dimension*, not one aggregate (a real architecture gain can hide inside a flat average); and a skill present on disk is not a skill used — verify it activated.
→ Full tables, per-dimension radars, and token/time charts in **[Benchmark Results](docs/benchmark-results.md#deep-dive--tactical-ddd-skill-public-vs-repo-tuned-claude-code)**.
diff --git a/docs/benchmark-results.md b/docs/benchmark-results.md
index a79c830..f31dd3b 100644
--- a/docs/benchmark-results.md
+++ b/docs/benchmark-results.md
@@ -32,22 +32,22 @@ Results from the three example benchmarks included in `examples/`. All scores ar
### Deep dive — tactical-ddd skill: public vs repo-tuned (Claude Code)
-A focused follow-up on the same benchmark family: we took a public DDD skill ([`tactical-ddd` from `ntcoding/claude-skillz`](https://github.com/NTCoding/claude-skillz)) and a repo-tuned version, and measured four configurations of Claude Code on two deliberately different tasks — a **feature on a clean DDD codebase** (`ddd-weather-discount`) and a **legacy anemic→rich refactor** (`csharp-movie-rental-anemic`). Each configuration was run repeatedly and each run scored repeatedly; the numbers below are medians (normalized 0–1). Skill activation was verified per run — a mounted skill the agent never invokes scores like no skill at all.
+A focused follow-up on the same benchmark family: we took a public DDD skill ([`tactical-ddd` from `ntcoding/claude-skillz`](https://github.com/NTCoding/claude-skillz)) and a repo-tuned version, and measured four configurations of Claude Code on two deliberately different tasks — a **feature on a clean DDD codebase** (`ddd-weather-discount`) and a **legacy anemic→rich refactor** (`csharp-movie-rental-anemic`). Each configuration was run repeatedly and each run scored repeatedly; the numbers below are averages (normalized 0–1). Skill activation was verified per run — a mounted skill the agent never invokes scores like no skill at all.
| Configuration | Weather (feature) | Movie (legacy) |
|---|:---:|:---:|
| vanilla (no skill) | 0.79 | 0.56 |
-| guided (manual DDD hints, no skill) | 0.84 | 0.58 |
-| public skill | 0.85 | 0.60 |
-| repo-tuned skill | **0.92** | **0.62** |
+| guided (manual DDD hints, no skill) | 0.80 | 0.57 |
+| public skill | 0.86 | 0.54 |
+| repo-tuned skill | **0.91** | **0.61** |
-**The effect is task-dependent.** Absolute scores across tasks aren't comparable — task difficulty sets the baseline (movie starts lower because it's harder). What *is* comparable is the **increment over vanilla** on each task: how much each step lifts quality above the bare model.
+**The effect is task-dependent — and the comparison that matters is against the bare model, not between skills.** Absolute scores across tasks aren't comparable (task difficulty sets the baseline; movie starts lower because it's harder), so we compare the **increment over vanilla** on each task and only call a gap real when its 95% confidence interval (percentile bootstrap on per-attempt means) excludes zero. On both tasks the **repo-tuned skill significantly beats the bare model** (+0.12 weather, +0.05 movie) and beats hand-written hints. The **public skill helps only on the clean feature** (+0.07, significant); on the legacy refactor it doesn't beat vanilla at all. Hand-written hints (`guided`) never clear the bar — about the same as no skill.
-The same repo-tuned skill adds **+0.13** on the clean-feature task but only **+0.06** on the legacy refactor — twice the payoff where the design space is open.
+On the clean feature each skill step lifts quality (public +0.07, repo-tuned +0.12), while hand-written hints stay flat (+0.004, noise). On the legacy refactor only the repo-tuned skill clears the line (+0.05); the public skill dips just below vanilla (−0.02, within noise). Off-the-shelf doesn't help where the task fixes the design shape, and tuning is what recovers a gain.
Per-dimension radars show *where* the gains land (test quality stays flat everywhere — the skill teaches modeling, not testing):
@@ -56,6 +56,8 @@ Per-dimension radars show *where* the gains land (test quality stays flat everyw
+This is where the per-dimension view earns its keep, and it changes the Movie story. The aggregate says the repo-tuned skill barely moved on Movie (+0.05), but per dimension it significantly lifts **four of five**: architecture (+0.12), encapsulation (+0.08), domain modeling (+0.04), extensibility (+0.05). What flattens the headline is the fifth dimension: test quality actually drops (−0.035, significant but borderline, with the bootstrap interval's upper bound landing on zero) because the skill teaches domain design, not testing, and that one axis drags the average back down. So "barely moved" is an artifact of averaging: on the legacy refactor the skill *does* rebuild the code, it just doesn't improve the tests. (Full per-dimension significance, both methods, in the tables linked below.)
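The averaging effect can be checked from the per-dimension deltas reported for the repo-tuned skill on Movie. This is a minimal sketch that assumes, purely for illustration, that the overall score is an unweighted mean of the five dimensions; the weighting actually used is not stated here, and the variable names are hypothetical.

```python
from statistics import fmean

# Repo-tuned vs vanilla on Movie: per-dimension deltas from the significance tables.
movie_repo_tuned_deltas = {
    "architecture": 0.120,
    "encapsulation": 0.078,
    "domain_modeling": 0.037,
    "extensibility": 0.049,
    "test_quality": -0.035,
}

# Assumption: equal weights across dimensions (not confirmed by the source).
mean_all_five = fmean(list(movie_repo_tuned_deltas.values()))

# The same mean with the one falling dimension left out.
design_only = [
    delta for name, delta in movie_repo_tuned_deltas.items() if name != "test_quality"
]
mean_design_only = fmean(design_only)

print(f"all five dimensions:    {mean_all_five:+.3f}")
print(f"four design dimensions: {mean_design_only:+.3f}")
```

Under that assumption the five-dimension mean comes out near +0.05, in line with the reported overall gap, while the four design dimensions alone average about +0.07.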
+
What does the gain cost? Token usage and run time per configuration — and the answer is **not** the simple "better costs more" you might expect:
@@ -65,10 +67,14 @@ What does the gain cost? Token usage and run time per configuration — and the
-Cost doesn't track quality. On weather the top-scoring repo-tuned skill spends *fewer* tokens than guided or public; on movie the public skill is the cheapest of all four. The real overhead is run time on the messy refactor, where the skill arms take roughly twice as long as the bare model.
+Cost doesn't track quality. On weather the top-scoring repo-tuned skill spends *fewer* tokens than guided or public; on movie the public skill is the cheapest of all four in tokens. The real overhead is run time on the messy refactor, where the skill arms run noticeably longer than the bare model (~1.5–1.6×). Dollar cost is deliberately not estimated.
**Two lessons that generalize:** (1) judge one aggregate number and you miss the story — a real per-dimension gain hides inside an averaged score, which is why we show radars, not a single bar; (2) a skill present on disk is not a skill used — always verify it activated before trusting the result.
+**Significance.** Gaps are called real only when a 95% confidence interval (percentile bootstrap on per-attempt means) excludes zero. Full per-dimension tables are here:
+- [Per-dimension significance — bootstrap](../examples/ddd-architectural-challenges/SIGNIFICANCE-per-dimension-bootstrap.md)
+- [Per-dimension significance — Bayesian bootstrap vs bootstrap](../examples/ddd-architectural-challenges/SIGNIFICANCE-per-dimension-bayes-vs-bootstrap.md) (the two methods agree on every aggregate verdict)
+
## UC1: Project-Specific Setup — NASDE Dev Skill
1 task: Add multi-attempt support to the nasde-toolkit itself. Claude only (project-specific skill, cross-agent comparison not applicable).
diff --git a/examples/ddd-architectural-challenges/SIGNIFICANCE-per-dimension-bayes-vs-bootstrap.md b/examples/ddd-architectural-challenges/SIGNIFICANCE-per-dimension-bayes-vs-bootstrap.md
new file mode 100644
index 0000000..92e7ef7
--- /dev/null
+++ b/examples/ddd-architectural-challenges/SIGNIFICANCE-per-dimension-bayes-vs-bootstrap.md
@@ -0,0 +1,109 @@
+# Per-dimension significance — Bayesian bootstrap vs bootstrap (95% CI)
+
+
+
+Same comparisons (each configuration **vs the bare model**), now placing the **Bayesian bootstrap (Rubin)**, which draws Dirichlet(1,…,1) weights instead of resampling with replacement, beside the frequentist bootstrap. For small samples Wolfe's article recommends Bayesian methods; we show both to demonstrate the verdicts don't depend on the choice. On every **Overall** comparison the two agree; per dimension they agree on all but two rows (public on Weather Architecture, guided on Movie Extensibility), and both disagreements are borderline cases with an interval bound sitting on zero.
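For readers unfamiliar with the Bayesian bootstrap, the idea can be sketched as follows. This is an illustrative sketch, not the script that produced the tables: the function names, the number of draws, the independent weighting of the two arms, and the sample data are all assumptions.

```python
import random


def dirichlet_flat(rng, n):
    """Draw Dirichlet(1, ..., 1) weights by normalizing Exp(1) variates."""
    draws = [rng.expovariate(1.0) for _ in range(n)]
    total = sum(draws)
    return [d / total for d in draws]


def weighted_mean(weights, values):
    return sum(w * v for w, v in zip(weights, values))


def bayes_bootstrap_gap_ci(config_means, vanilla_means, n_draws=40_000, seed=0):
    """95% interval for mean(config) - mean(vanilla) under the Bayesian bootstrap."""
    rng = random.Random(seed)
    gaps = []
    for _ in range(n_draws):
        # Reweight the observed attempts instead of resampling them.
        w_config = dirichlet_flat(rng, len(config_means))
        w_vanilla = dirichlet_flat(rng, len(vanilla_means))
        gaps.append(
            weighted_mean(w_config, config_means)
            - weighted_mean(w_vanilla, vanilla_means)
        )
    gaps.sort()
    lo = gaps[round(0.025 * (n_draws - 1))]
    hi = gaps[round(0.975 * (n_draws - 1))]
    return lo, hi


# Hypothetical per-attempt means, not taken from the benchmark.
repo_tuned = [0.90, 0.92, 0.88, 0.93, 0.91]
vanilla = [0.78, 0.80, 0.79, 0.77, 0.81]

lo, hi = bayes_bootstrap_gap_ci(repo_tuned, vanilla)
is_real = lo > 0 or hi < 0
print(f"Bayes 95% interval: [{lo:+.3f}, {hi:+.3f}] -> {'real' if is_real else 'noise'}")
```

Because every draw keeps all observed attempts and only changes their weights, the interval tends to be slightly smoother than the resampling version on small samples, which is consistent with the marginally narrower Bayes intervals in the tables.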
+
+## Weather
+
+### Weather — Overall
+
+| vs vanilla | Δ | Bootstrap 95% CI | Bayes 95% CI | Agree? |
+|---|---:|---|---|:---:|
+| guided | +0.004 | [-0.057, +0.063] noise | [-0.052, +0.059] noise | ✓ |
+| public | +0.069 | [+0.019, +0.120] **real** | [+0.023, +0.117] **real** | ✓ |
+| repo-tuned | +0.115 | [+0.073, +0.161] **real** | [+0.076, +0.158] **real** | ✓ |
+
+### Weather — Domain Modeling
+
+| vs vanilla | Δ | Bootstrap 95% CI | Bayes 95% CI | Agree? |
+|---|---:|---|---|:---:|
+| guided | -0.001 | [-0.075, +0.069] noise | [-0.069, +0.062] noise | ✓ |
+| public | +0.120 | [+0.067, +0.176] **real** | [+0.073, +0.172] **real** | ✓ |
+| repo-tuned | +0.155 | [+0.109, +0.205] **real** | [+0.115, +0.202] **real** | ✓ |
+
+### Weather — Encapsulation
+
+| vs vanilla | Δ | Bootstrap 95% CI | Bayes 95% CI | Agree? |
+|---|---:|---|---|:---:|
+| guided | -0.003 | [-0.093, +0.075] noise | [-0.088, +0.065] noise | ✓ |
+| public | +0.097 | [+0.000, +0.190] **real** | [+0.008, +0.182] **real** | ✓ |
+| repo-tuned | +0.180 | [+0.100, +0.250] **real** | [+0.104, +0.239] **real** | ✓ |
+
+### Weather — Architecture
+
+| vs vanilla | Δ | Bootstrap 95% CI | Bayes 95% CI | Agree? |
+|---|---:|---|---|:---:|
+| guided | +0.002 | [-0.037, +0.040] noise | [-0.034, +0.036] noise | ✓ |
+| public | +0.030 | [-0.003, +0.067] noise | [+0.000, +0.064] **real** | ✗ |
+| repo-tuned | +0.050 | [-0.010, +0.097] noise | [-0.011, +0.089] noise | ✓ |
+
+### Weather — Extensibility
+
+| vs vanilla | Δ | Bootstrap 95% CI | Bayes 95% CI | Agree? |
+|---|---:|---|---|:---:|
+| guided | +0.005 | [-0.048, +0.050] noise | [-0.046, +0.044] noise | ✓ |
+| public | +0.102 | [+0.067, +0.133] **real** | [+0.067, +0.130] **real** | ✓ |
+| repo-tuned | +0.076 | [+0.004, +0.147] **real** | [+0.010, +0.140] **real** | ✓ |
+
+### Weather — Test Quality
+
+| vs vanilla | Δ | Bootstrap 95% CI | Bayes 95% CI | Agree? |
+|---|---:|---|---|:---:|
+| guided | +0.020 | [-0.164, +0.168] noise | [-0.160, +0.151] noise | ✓ |
+| public | -0.007 | [-0.123, +0.103] noise | [-0.117, +0.094] noise | ✓ |
+| repo-tuned | +0.097 | [+0.000, +0.200] **real** | [+0.010, +0.193] **real** | ✓ |
+
+## Movie
+
+### Movie — Overall
+
+| vs vanilla | Δ | Bootstrap 95% CI | Bayes 95% CI | Agree? |
+|---|---:|---|---|:---:|
+| guided | +0.002 | [-0.011, +0.015] noise | [-0.010, +0.014] noise | ✓ |
+| public | -0.020 | [-0.056, +0.021] noise | [-0.052, +0.020] noise | ✓ |
+| repo-tuned | +0.049 | [+0.034, +0.067] **real** | [+0.036, +0.066] **real** | ✓ |
+
+### Movie — Domain Modeling
+
+| vs vanilla | Δ | Bootstrap 95% CI | Bayes 95% CI | Agree? |
+|---|---:|---|---|:---:|
+| guided | +0.012 | [-0.013, +0.036] noise | [-0.012, +0.033] noise | ✓ |
+| public | -0.007 | [-0.045, +0.035] noise | [-0.042, +0.032] noise | ✓ |
+| repo-tuned | +0.037 | [+0.017, +0.060] **real** | [+0.019, +0.059] **real** | ✓ |
+
+### Movie — Encapsulation
+
+| vs vanilla | Δ | Bootstrap 95% CI | Bayes 95% CI | Agree? |
+|---|---:|---|---|:---:|
+| guided | -0.005 | [-0.038, +0.027] noise | [-0.035, +0.024] noise | ✓ |
+| public | -0.063 | [-0.148, +0.025] noise | [-0.140, +0.019] noise | ✓ |
+| repo-tuned | +0.078 | [+0.037, +0.127] **real** | [+0.042, +0.124] **real** | ✓ |
+
+### Movie — Architecture
+
+| vs vanilla | Δ | Bootstrap 95% CI | Bayes 95% CI | Agree? |
+|---|---:|---|---|:---:|
+| guided | +0.017 | [-0.033, +0.060] noise | [-0.030, +0.054] noise | ✓ |
+| public | -0.003 | [-0.037, +0.033] noise | [-0.034, +0.031] noise | ✓ |
+| repo-tuned | +0.120 | [+0.098, +0.138] **real** | [+0.100, +0.137] **real** | ✓ |
+
+### Movie — Extensibility
+
+| vs vanilla | Δ | Bootstrap 95% CI | Bayes 95% CI | Agree? |
+|---|---:|---|---|:---:|
+| guided | +0.018 | [-0.000, +0.038] noise | [+0.001, +0.036] **real** | ✗ |
+| public | -0.009 | [-0.042, +0.033] noise | [-0.039, +0.032] noise | ✓ |
+| repo-tuned | +0.049 | [+0.027, +0.069] **real** | [+0.026, +0.067] **real** | ✓ |
+
+### Movie — Test Quality
+
+| vs vanilla | Δ | Bootstrap 95% CI | Bayes 95% CI | Agree? |
+|---|---:|---|---|:---:|
+| guided | -0.030 | [-0.068, +0.015] noise | [-0.063, +0.012] noise | ✓ |
+| public | -0.020 | [-0.052, +0.017] noise | [-0.048, +0.015] noise | ✓ |
+| repo-tuned | -0.035 | [-0.067, +0.000] **real** | [-0.063, -0.002] **real** | ✓ |
diff --git a/examples/ddd-architectural-challenges/SIGNIFICANCE-per-dimension-bootstrap.md b/examples/ddd-architectural-challenges/SIGNIFICANCE-per-dimension-bootstrap.md
new file mode 100644
index 0000000..dd14b86
--- /dev/null
+++ b/examples/ddd-architectural-challenges/SIGNIFICANCE-per-dimension-bootstrap.md
@@ -0,0 +1,133 @@
+# Per-dimension significance — bootstrap (95% CI)
+
+
+
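+Each configuration compared **against the bare model (vanilla)**, per quality dimension, split by task. A gap is **real** when its 95% CI (percentile bootstrap, 40,000 resamples on per-attempt means) excludes zero; **noise** when it straddles zero. Scores normalized 0–1. Bounds are shown to three decimals, so an interval with a bound printed as ±0.000 sits right on the threshold and should be read as borderline.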
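The decision rule above can be sketched in a few lines. This is a hedged illustration rather than the analysis script behind the tables: the function names, the independent resampling of the two arms, the nearest-rank quantile, and the sample data are assumptions; only the 40,000 resamples, the 95% level, and the use of per-attempt means come from the text.

```python
import random
from statistics import fmean


def quantile(sorted_values, q):
    """Nearest-rank style quantile on an already sorted list."""
    return sorted_values[round(q * (len(sorted_values) - 1))]


def bootstrap_gap_ci(config_means, vanilla_means, n_resamples=40_000, seed=0):
    """Return (delta, lo, hi): observed gap and its 95% percentile bootstrap CI."""
    rng = random.Random(seed)
    gaps = []
    for _ in range(n_resamples):
        # Resample attempts with replacement, independently within each arm.
        config_sample = rng.choices(config_means, k=len(config_means))
        vanilla_sample = rng.choices(vanilla_means, k=len(vanilla_means))
        gaps.append(fmean(config_sample) - fmean(vanilla_sample))
    gaps.sort()
    delta = fmean(config_means) - fmean(vanilla_means)
    return delta, quantile(gaps, 0.025), quantile(gaps, 0.975)


def verdict(lo, hi):
    """'real' when the interval excludes zero, otherwise 'noise'."""
    return "real" if lo > 0 or hi < 0 else "noise"


# Hypothetical per-attempt means (each already averaged over repeated scorings).
repo_tuned = [0.90, 0.92, 0.88, 0.93, 0.91]
vanilla = [0.78, 0.80, 0.79, 0.77, 0.81]

delta, lo, hi = bootstrap_gap_ci(repo_tuned, vanilla)
print(f"delta={delta:+.3f} CI=[{lo:+.3f}, {hi:+.3f}] -> {verdict(lo, hi)}")
```

Resampling whole attempts, rather than individual scorings, keeps repeated scorings of the same run from being treated as independent evidence.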
+
+## Weather
+
+### Weather — Overall
+
+
+
+| vs vanilla | Δ | 95% CI | Verdict |
+|---|---:|---|---|
+| guided | +0.004 | [-0.057, +0.063] | noise |
+| public | +0.069 | [+0.019, +0.120] | **real** |
+| repo-tuned | +0.115 | [+0.073, +0.161] | **real** |
+
+### Weather — Domain Modeling
+
+
+
+| vs vanilla | Δ | 95% CI | Verdict |
+|---|---:|---|---|
+| guided | -0.001 | [-0.075, +0.069] | noise |
+| public | +0.120 | [+0.067, +0.176] | **real** |
+| repo-tuned | +0.155 | [+0.109, +0.205] | **real** |
+
+### Weather — Encapsulation
+
+
+
+| vs vanilla | Δ | 95% CI | Verdict |
+|---|---:|---|---|
+| guided | -0.003 | [-0.093, +0.075] | noise |
+| public | +0.097 | [+0.000, +0.190] | **real** |
+| repo-tuned | +0.180 | [+0.100, +0.250] | **real** |
+
+### Weather — Architecture
+
+
+
+| vs vanilla | Δ | 95% CI | Verdict |
+|---|---:|---|---|
+| guided | +0.002 | [-0.037, +0.040] | noise |
+| public | +0.030 | [-0.003, +0.067] | noise |
+| repo-tuned | +0.050 | [-0.010, +0.097] | noise |
+
+### Weather — Extensibility
+
+
+
+| vs vanilla | Δ | 95% CI | Verdict |
+|---|---:|---|---|
+| guided | +0.005 | [-0.048, +0.050] | noise |
+| public | +0.102 | [+0.067, +0.133] | **real** |
+| repo-tuned | +0.076 | [+0.004, +0.147] | **real** |
+
+### Weather — Test Quality
+
+
+
+| vs vanilla | Δ | 95% CI | Verdict |
+|---|---:|---|---|
+| guided | +0.020 | [-0.164, +0.168] | noise |
+| public | -0.007 | [-0.123, +0.103] | noise |
+| repo-tuned | +0.097 | [+0.000, +0.200] | **real** |
+
+## Movie
+
+### Movie — Overall
+
+
+
+| vs vanilla | Δ | 95% CI | Verdict |
+|---|---:|---|---|
+| guided | +0.002 | [-0.011, +0.015] | noise |
+| public | -0.020 | [-0.056, +0.021] | noise |
+| repo-tuned | +0.049 | [+0.034, +0.067] | **real** |
+
+### Movie — Domain Modeling
+
+
+
+| vs vanilla | Δ | 95% CI | Verdict |
+|---|---:|---|---|
+| guided | +0.012 | [-0.013, +0.036] | noise |
+| public | -0.007 | [-0.045, +0.035] | noise |
+| repo-tuned | +0.037 | [+0.017, +0.060] | **real** |
+
+### Movie — Encapsulation
+
+
+
+| vs vanilla | Δ | 95% CI | Verdict |
+|---|---:|---|---|
+| guided | -0.005 | [-0.038, +0.027] | noise |
+| public | -0.063 | [-0.148, +0.025] | noise |
+| repo-tuned | +0.078 | [+0.037, +0.127] | **real** |
+
+### Movie — Architecture
+
+
+
+| vs vanilla | Δ | 95% CI | Verdict |
+|---|---:|---|---|
+| guided | +0.017 | [-0.033, +0.060] | noise |
+| public | -0.003 | [-0.037, +0.033] | noise |
+| repo-tuned | +0.120 | [+0.098, +0.138] | **real** |
+
+### Movie — Extensibility
+
+
+
+| vs vanilla | Δ | 95% CI | Verdict |
+|---|---:|---|---|
+| guided | +0.018 | [-0.000, +0.038] | noise |
+| public | -0.009 | [-0.042, +0.033] | noise |
+| repo-tuned | +0.049 | [+0.027, +0.069] | **real** |
+
+### Movie — Test Quality
+
+
+
+| vs vanilla | Δ | 95% CI | Verdict |
+|---|---:|---|---|
+| guided | -0.030 | [-0.068, +0.015] | noise |
+| public | -0.020 | [-0.052, +0.017] | noise |
+| repo-tuned | -0.035 | [-0.067, +0.000] | **real** |
diff --git a/examples/ddd-architectural-challenges/assets/forest_movie_architecture_compliance.png b/examples/ddd-architectural-challenges/assets/forest_movie_architecture_compliance.png
new file mode 100644
index 0000000..ef268a9
Binary files /dev/null and b/examples/ddd-architectural-challenges/assets/forest_movie_architecture_compliance.png differ
diff --git a/examples/ddd-architectural-challenges/assets/forest_movie_domain_modeling.png b/examples/ddd-architectural-challenges/assets/forest_movie_domain_modeling.png
new file mode 100644
index 0000000..3074da3
Binary files /dev/null and b/examples/ddd-architectural-challenges/assets/forest_movie_domain_modeling.png differ
diff --git a/examples/ddd-architectural-challenges/assets/forest_movie_encapsulation.png b/examples/ddd-architectural-challenges/assets/forest_movie_encapsulation.png
new file mode 100644
index 0000000..8f66c4a
Binary files /dev/null and b/examples/ddd-architectural-challenges/assets/forest_movie_encapsulation.png differ
diff --git a/examples/ddd-architectural-challenges/assets/forest_movie_extensibility.png b/examples/ddd-architectural-challenges/assets/forest_movie_extensibility.png
new file mode 100644
index 0000000..af1ca12
Binary files /dev/null and b/examples/ddd-architectural-challenges/assets/forest_movie_extensibility.png differ
diff --git a/examples/ddd-architectural-challenges/assets/forest_movie_overall.png b/examples/ddd-architectural-challenges/assets/forest_movie_overall.png
new file mode 100644
index 0000000..2a7f0e5
Binary files /dev/null and b/examples/ddd-architectural-challenges/assets/forest_movie_overall.png differ
diff --git a/examples/ddd-architectural-challenges/assets/forest_movie_test_quality.png b/examples/ddd-architectural-challenges/assets/forest_movie_test_quality.png
new file mode 100644
index 0000000..ad22065
Binary files /dev/null and b/examples/ddd-architectural-challenges/assets/forest_movie_test_quality.png differ
diff --git a/examples/ddd-architectural-challenges/assets/forest_weather_architecture_compliance.png b/examples/ddd-architectural-challenges/assets/forest_weather_architecture_compliance.png
new file mode 100644
index 0000000..23759c4
Binary files /dev/null and b/examples/ddd-architectural-challenges/assets/forest_weather_architecture_compliance.png differ
diff --git a/examples/ddd-architectural-challenges/assets/forest_weather_domain_modeling.png b/examples/ddd-architectural-challenges/assets/forest_weather_domain_modeling.png
new file mode 100644
index 0000000..5b0a7eb
Binary files /dev/null and b/examples/ddd-architectural-challenges/assets/forest_weather_domain_modeling.png differ
diff --git a/examples/ddd-architectural-challenges/assets/forest_weather_encapsulation.png b/examples/ddd-architectural-challenges/assets/forest_weather_encapsulation.png
new file mode 100644
index 0000000..d2f6c92
Binary files /dev/null and b/examples/ddd-architectural-challenges/assets/forest_weather_encapsulation.png differ
diff --git a/examples/ddd-architectural-challenges/assets/forest_weather_extensibility.png b/examples/ddd-architectural-challenges/assets/forest_weather_extensibility.png
new file mode 100644
index 0000000..8b4bdc1
Binary files /dev/null and b/examples/ddd-architectural-challenges/assets/forest_weather_extensibility.png differ
diff --git a/examples/ddd-architectural-challenges/assets/forest_weather_overall.png b/examples/ddd-architectural-challenges/assets/forest_weather_overall.png
new file mode 100644
index 0000000..8d601cc
Binary files /dev/null and b/examples/ddd-architectural-challenges/assets/forest_weather_overall.png differ
diff --git a/examples/ddd-architectural-challenges/assets/forest_weather_test_quality.png b/examples/ddd-architectural-challenges/assets/forest_weather_test_quality.png
new file mode 100644
index 0000000..c5edebc
Binary files /dev/null and b/examples/ddd-architectural-challenges/assets/forest_weather_test_quality.png differ
diff --git a/examples/ddd-architectural-challenges/assets/increment_vs_vanilla.png b/examples/ddd-architectural-challenges/assets/increment_vs_vanilla.png
index 3e0a73b..1026055 100644
Binary files a/examples/ddd-architectural-challenges/assets/increment_vs_vanilla.png and b/examples/ddd-architectural-challenges/assets/increment_vs_vanilla.png differ
diff --git a/examples/ddd-architectural-challenges/assets/ops_time_movie.png b/examples/ddd-architectural-challenges/assets/ops_time_movie.png
index 9f73a8e..87f6a90 100644
Binary files a/examples/ddd-architectural-challenges/assets/ops_time_movie.png and b/examples/ddd-architectural-challenges/assets/ops_time_movie.png differ
diff --git a/examples/ddd-architectural-challenges/assets/ops_time_weather.png b/examples/ddd-architectural-challenges/assets/ops_time_weather.png
index 0c6cf10..0de845e 100644
Binary files a/examples/ddd-architectural-challenges/assets/ops_time_weather.png and b/examples/ddd-architectural-challenges/assets/ops_time_weather.png differ
diff --git a/examples/ddd-architectural-challenges/assets/ops_tokens_movie.png b/examples/ddd-architectural-challenges/assets/ops_tokens_movie.png
index d89e504..76dc89f 100644
Binary files a/examples/ddd-architectural-challenges/assets/ops_tokens_movie.png and b/examples/ddd-architectural-challenges/assets/ops_tokens_movie.png differ
diff --git a/examples/ddd-architectural-challenges/assets/ops_tokens_weather.png b/examples/ddd-architectural-challenges/assets/ops_tokens_weather.png
index f9112bb..3f211e0 100644
Binary files a/examples/ddd-architectural-challenges/assets/ops_tokens_weather.png and b/examples/ddd-architectural-challenges/assets/ops_tokens_weather.png differ
diff --git a/examples/ddd-architectural-challenges/assets/radar_movie.png b/examples/ddd-architectural-challenges/assets/radar_movie.png
index 0f462e3..37819d8 100644
Binary files a/examples/ddd-architectural-challenges/assets/radar_movie.png and b/examples/ddd-architectural-challenges/assets/radar_movie.png differ
diff --git a/examples/ddd-architectural-challenges/assets/radar_weather.png b/examples/ddd-architectural-challenges/assets/radar_weather.png
index f915246..93e79dd 100644
Binary files a/examples/ddd-architectural-challenges/assets/radar_weather.png and b/examples/ddd-architectural-challenges/assets/radar_weather.png differ