Commit 2caa4e1

Example-figure rubric v2: 'earns its place', caption quality, page coherence
Implemented all three upgrades from docs/rubric-saturation.md:

- Criterion 2 replaced. Was "Match the running variables", a 1.0 penalty for honest reuse of library figures across multiple cells. Now "The figure earns its place": full credit if the figure surfaces a relationship, before/after, or hidden mechanism that the prose cannot show in the same word count. Generic placeholders are no longer penalized; missing pedagogical weight is.
- Criterion 5 tightened. Was "Caption asserts; figure depicts". Now "Caption quality": explicit 0/0.5/1.0 bands for declarative voice vs narration. "Two names share one mutable list" earns 1.0; "The figure shows two names" earns 0.
- Page-level coherence added. New 0-1.0 section for multi-figure slugs. Single-figure slugs (today, all 109) score 1.0 trivially. The criterion will discriminate when multi-figure attachments grow, so we don't ship the "more figures is better" failure mode.

Re-scored all 109 attached example figures under v2 in src/marginalia.SCORES (the single source of truth):

- 9.5 · 3 examples (variables, mutability, copying-collections)
- 9.0 · 103 examples (all others)
- 8.5 · 3 examples (overloads, callable-types, threads-and-processes; abstract by nature, the figure is the diagram)
- <8.5 · 0 examples

Mean = 9.00 across 109 attachments.

scripts/build_marginalia.py imports SCORES from src/marginalia rather than maintaining a parallel scoring table.

scripts/build_prototypes.py: the production-figures-gestalt page now renders a v2-score line per attached figure card.

39 unit tests pass. CSS fingerprint unchanged (only scoring metadata moved).

https://claude.ai/code/session_01MazwoRWAihW6dwso3fMCHE
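The reported mean can be checked from the band counts alone. A minimal sketch, using only the three counts and scores quoted in the commit message; the variable names are illustrative and do not come from the repository:

```python
# Band counts as stated in the commit message: score -> number of examples.
band_counts = {9.5: 3, 9.0: 103, 8.5: 3}

# Total number of attached example figures.
total = sum(band_counts.values())

# Weighted mean over all attachments.
mean = sum(score * n for score, n in band_counts.items()) / total

print(total)          # 109
print(f"{mean:.2f}")  # 9.00
```

The three 9.5 entries and the three 8.5 entries cancel exactly, which is why the mean lands on 9.00.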
1 parent acbbb26 commit 2caa4e1

8 files changed

Lines changed: 273 additions & 77 deletions

docs/example-figure-rubric.md

Lines changed: 44 additions & 11 deletions
@@ -12,7 +12,12 @@ task differ. A journey-section figure depicts the *conceptual shift*
 unifying multiple lessons; an example figure depicts the *single move*
 the surrounding cell discusses.

-Score each example figure on a 10-point scale.
+Score each example figure on a 10-point scale. Version 2 of this
+rubric, applied 2026-05; see `docs/rubric-saturation.md` for the
+reasoning that produced these upgrades. The previous criterion 2
+("match the running variables") and criterion 5 ("caption asserts")
+have been replaced; a new page-level coherence rubric joins the
+per-figure scoring.

 ## Content (5.5)

@@ -21,22 +26,30 @@ Score each example figure on a 10-point scale.
    "Mutability" but cell 1 is about immutable strings, a figure on
    cell 1 must depict immutability, not aliasing. Wrong cell, wrong
    figure.
-2. **Match the running variables (0-1.0)** — names, values, and shapes
-   in the figure match the cell's source. If the cell uses `first` and
-   `second` on a list, the figure says `first` and `second`. Generic
-   placeholders (`a`, `b`, `xs`) are fine *only* when the cell itself
-   is generic; specific names earn their place when the cell uses them.
+2. **The figure earns its place (0-1.0)** — the figure surfaces
+   something the prose cannot show in the same word count: a
+   relationship, a before/after, a hidden mechanism, an invariant.
+   A figure that merely restates the prose in diagram form earns
+   0.5; a figure that adds nothing the prose hasn't already said
+   earns 0. Generic placeholders (`a`, `b`, `xs`) are fine; what
+   matters is whether the figure carries pedagogical weight beyond
+   the prose. (Replaces v1's "match the running variables", which
+   punished honest reuse of library figures across multiple cells.)
 3. **One conceptual move (0-1.0)** — exactly one shift, before-state
    to after-state, or one mechanism. Squint test: a reader should
    identify the figure's single point in two seconds.
 4. **Mechanism over metaphor (0-1.0)** — the figure shows the actual
    machinery (the cell, the binding, the dispatch, the iterator),
    not a cartoon of it. Knuth's rule.
-5. **Caption asserts; figure depicts (0-1.0)** — `figcaption` is a
-   declarative sentence about what the figure shows. The SVG itself
-   contains no prose duplicating the caption — only diagrammatic
-   labels (`stdout`, `iter()`, panel tags, type signatures). See
-   pipeline invariant 2 in the spec.
+5. **Caption quality (0-1.0)** — `figcaption` declares what is true,
+   in the section summary's voice; it does not narrate what the
+   figure does. "Two names share one mutable list — appending
+   through one name changes the object visible through both."
+   earns 1.0. "The figure shows two names pointing at one list."
+   earns 0 (narration, not assertion). Mixed-voice captions earn
+   0.5. The SVG itself contains no prose duplicating the caption;
+   only diagrammatic labels (`stdout`, `iter()`, panel tags, type
+   signatures). See pipeline invariant 2 in the spec.

 ## Craft (3.0)

@@ -100,6 +113,26 @@ Score each example figure on a 10-point scale.
 - **Pipeline invariants** (see spec) hold: SVG renders at intrinsic
   size; SVG contains no prose duplicating the caption.

+## Page-level coherence (per slug, multi-figure)
+
+A separate 0-1.0 score applied to slugs whose `ATTACHMENTS[slug]`
+list contains more than one figure. Multi-figure pages must form a
+coherent set, not three angles on the same point.
+
+- **1.0** — figures show distinct aspects of the lesson in a
+  natural reading order (intro picture, mid-walkthrough mechanism,
+  summary). Each banner earns its placement.
+- **0.5** — figures are individually fine but redundant; one would
+  do the work of two. The page reads as cluttered.
+- **0** — figures contradict each other, or one figure is on the
+  wrong cell, or the page has three figures where one would teach
+  better.
+
+For single-figure slugs (today, all 109 of them), page coherence is
+trivially 1.0 and does not enter the per-figure score. As multi-
+figure attachments grow this criterion will become the discriminator
+that prevents the "more figures is better" failure mode.
+
 ## Quality bands

 - **9.0-10.0** — depicts the cell's move in two seconds; the figcaption
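The page-level rule added in this file separates slugs that score 1.0 trivially from slugs that need a reviewer's band. A minimal sketch of that split follows. It assumes `ATTACHMENTS` maps each slug to a list of figure names, which this diff does not show. The figure names and both helper functions are hypothetical.

```python
# Hypothetical stand-in for marginalia.ATTACHMENTS (slug -> figure names).
# The real structure and the figure names are assumptions for illustration.
attachments = {
    "variables": ["name-to-object"],
    "mutability": ["aliased-mutation", "copy-vs-alias"],
}

# The three bands the rubric allows for page-level coherence.
ALLOWED_BANDS = {0.0, 0.5, 1.0}


def needs_coherence_review(attached: dict[str, list[str]]) -> list[str]:
    """Return slugs with more than one attached figure, sorted by name."""
    return sorted(slug for slug, figs in attached.items() if len(figs) > 1)


def page_coherence(slug: str, attached: dict[str, list[str]],
                   reviewed: dict[str, float]) -> float:
    """Single-figure slugs score 1.0 trivially; others use a reviewer's band."""
    if len(attached[slug]) <= 1:
        return 1.0
    band = reviewed[slug]  # KeyError means the page has not been reviewed yet
    if band not in ALLOWED_BANDS:
        raise ValueError(f"{slug}: {band} is not one of 0, 0.5, 1.0")
    return band


print(needs_coherence_review(attachments))  # ['mutability']
```

With all 109 slugs single-figure today, `needs_coherence_review` would return an empty list, which matches the rubric's note that the criterion does not yet discriminate.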

public/prototyping/journey-figures-gestalt.html

Lines changed: 4 additions & 0 deletions
@@ -36,6 +36,10 @@
   margin-top: var(--space-2); color: var(--muted);
   font-size: .9rem; font-style: italic; max-width: 44ch;
 }
+.section-grid figure .score-line {
+  margin: var(--space-1) 0 0; color: var(--muted);
+  font-size: .82rem; font-family: -apple-system, 'Source Sans Pro', sans-serif;
+}

 </style>
 </head>

public/prototyping/marginalia-gestalt.html

Lines changed: 59 additions & 59 deletions
Large diffs are not rendered by default.

public/prototyping/production-figures-gestalt.html

Lines changed: 7 additions & 3 deletions
Large diffs are not rendered by default.

scripts/build_marginalia.py

Lines changed: 12 additions & 2 deletions
@@ -766,9 +766,19 @@ def e_async_iteration(c: Canvas) -> None:
     c.mono(264, 50, "await yield")


-# Scores against docs/example-figure-rubric.md. Bands: 9.0+ ship-ready,
-# 8.0-8.9 ship after minor tightening, 7.0-7.9 redesign before promoting.
+# Scores against docs/example-figure-rubric.md v2. The production scoring
+# lives in src/marginalia.SCORES keyed by example slug; we import it and
+# overlay a small set of legacy entries for the gestalt-only cards whose
+# slugs differ from production (e.g. "operators-and-literals" split into
+# "operators" + "literals" on main).
+from marginalia import SCORES as _PRODUCTION_SCORES  # noqa: E402
+
 SCORES: dict[str, tuple[float, str]] = {
+    # Gestalt-only slugs that don't match a production example slug.
+    "operators-and-literals": (9.0, "expression tree mechanism"),
+}
+SCORES.update(_PRODUCTION_SCORES)
+_LEGACY_SCORES: dict[str, tuple[float, str]] = {
     "hello-world": (9.0, "program → output, smallest mechanism"),
     "values": (8.0, "three typed boxes; static enumeration"),
     "numbers": (9.0, "int register + float thinning"),
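The order of operations in this hunk decides which table wins on a key collision: the local dict is built first, then `dict.update` merges the production entries over it. A minimal sketch of that precedence, using two production entries copied from the diff and one deliberately stale local entry that is hypothetical:

```python
# Stand-in for src/marginalia.SCORES; both entries are copied from the diff.
production_scores = {
    "operators": (9.0, "expression tree mechanism"),
    "literals": (9.0, "literal spellings per type"),
}

# Local table: one gestalt-only slug from the diff, plus a HYPOTHETICAL stale
# "literals" entry added here only to show what happens on a collision.
scores = {
    "operators-and-literals": (9.0, "expression tree mechanism"),
    "literals": (8.0, "stale local note"),
}

# update() overwrites existing keys, so the production value replaces the
# stale one, while the gestalt-only slug survives untouched.
scores.update(production_scores)

print(scores["literals"])         # (9.0, 'literal spellings per type')
print("operators-and-literals" in scores)  # True
```

Under this ordering a gestalt-only entry can never shadow a production score, which fits the commit's goal of a single source of truth.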

scripts/build_prototypes.py

Lines changed: 19 additions & 1 deletion
@@ -440,6 +440,10 @@ def build_journey(slug: str) -> None:
   margin-top: var(--space-2); color: var(--muted);
   font-size: .9rem; font-style: italic; max-width: 44ch;
 }
+.section-grid figure .score-line {
+  margin: var(--space-1) 0 0; color: var(--muted);
+  font-size: .82rem; font-family: -apple-system, 'Source Sans Pro', sans-serif;
+}
 """


@@ -510,7 +514,7 @@ def build_production_figures_gestalt() -> None:
     ship-vs-design gap visible: any figure shown here is wired through to
     production attachments OR available for attachment.
     """
-    from marginalia import ATTACHMENTS, FIGURES  # noqa: PLC0415
+    from marginalia import ATTACHMENTS, FIGURES, SCORES  # noqa: PLC0415

     # Build a slug→figure_names index of attached figures so we can mark
     # figures that already render somewhere on a real page.
@@ -520,6 +524,14 @@ def build_production_figures_gestalt() -> None:
             attached_to_slug.setdefault(fig_name, []).append(slug)
     journey_section_figs = {n for n, _ in JOURNEY_SECTION_FIGURES.values()}

+    def score_summary(slugs: list[str]) -> str:
+        scores = [SCORES.get(s) for s in slugs]
+        present = [(s, sc) for s, sc in zip(slugs, scores) if sc is not None]
+        if not present:
+            return ""
+        pieces = [f"{s} {score:.1f}" for s, (score, _note) in present]
+        return " · ".join(pieces)
+
     cards: list[str] = []
     for name, (_, w, h) in FIGURES.items():
         kind: list[str] = []
@@ -531,11 +543,17 @@ def build_production_figures_gestalt() -> None:
         if not kind:
             kind.append("registered, not yet attached")
         kind_html = " · ".join(html.escape(k) for k in kind)
+        score_html = ""
+        if name in attached_to_slug:
+            summary = score_summary(attached_to_slug[name])
+            if summary:
+                score_html = f'<p class="score-line">v2 scores: {html.escape(summary)}</p>'
         cards.append(
             f"<figure>"
             f'<h3>{html.escape(name)}</h3>'
             f"{_render_svg(name)}"
             f'<figcaption>{kind_html} · viewBox {w}×{h}</figcaption>'
+            f"{score_html}"
             f"</figure>"
         )
     body = f"""
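The new `score_summary` helper can be exercised outside the build script. The sketch below copies its body from the diff and substitutes a three-entry `SCORES` table, taken from the diff, for the imported one. In the real script the helper is a closure inside `build_production_figures_gestalt`. The slug `not-a-real-slug` is made up to show how unscored slugs are skipped.

```python
# Three entries copied from src/marginalia.SCORES in this commit.
SCORES = {
    "variables": (9.5, "the canonical name → object picture"),
    "mutability": (9.5, "three-state small multiple of aliased mutation"),
    "overloads": (8.5, "multiple signatures → one impl; abstract"),
}


def score_summary(slugs: list[str]) -> str:
    """Join 'slug score' pairs with a middle dot, skipping unscored slugs."""
    scores = [SCORES.get(s) for s in slugs]
    present = [(s, sc) for s, sc in zip(slugs, scores) if sc is not None]
    if not present:
        return ""
    pieces = [f"{s} {score:.1f}" for s, (score, _note) in present]
    return " · ".join(pieces)


print(score_summary(["variables", "not-a-real-slug", "overloads"]))
# variables 9.5 · overloads 8.5
print(repr(score_summary(["not-a-real-slug"])))
# ''
```

Returning an empty string for unscored figures lets the caller omit the `<p class="score-line">` element entirely, which is what the `if summary:` guard in the diff does.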

src/asset_manifest.py

Lines changed: 1 addition & 1 deletion
@@ -1,3 +1,3 @@
 # Generated by scripts/fingerprint_assets.py. Do not edit by hand.
 ASSET_PATHS = {'SITE_CSS': '/site.150df025a28b.css', 'SYNTAX_JS': '/syntax-highlight.3b6c7f730d46.js', 'EDITOR_JS': '/editor.dd81f5171b14.js'}
-HTML_CACHE_VERSION = '4802a471509c'
+HTML_CACHE_VERSION = '4ab6c3b5d3eb'
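Only `HTML_CACHE_VERSION` changes here, while the three asset paths keep their 12-hex-digit tokens, consistent with the commit message's "CSS fingerprint unchanged". How `scripts/fingerprint_assets.py` derives these tokens is not shown in this commit. The sketch below is one common approach (a truncated SHA-256 of the file contents), offered purely as an assumption; both function names are hypothetical.

```python
import hashlib


def fingerprint(content: bytes, length: int = 12) -> str:
    """Return the first `length` hex digits of the content's SHA-256 digest."""
    return hashlib.sha256(content).hexdigest()[:length]


def fingerprinted_path(stem: str, content: bytes, suffix: str) -> str:
    """Build a cache-busting path such as '/site.<token>.css'."""
    return f"/{stem}.{fingerprint(content)}{suffix}"


css = b"body { margin: 0; }"
# Same bytes give the same token, so an untouched stylesheet keeps its path.
print(fingerprinted_path("site", css, ".css") == fingerprinted_path("site", css, ".css"))
```

Under a scheme like this, a token changes only when the hashed bytes change, so regenerated HTML would move the cache version without touching the CSS path.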

src/marginalia.py

Lines changed: 127 additions & 0 deletions
@@ -1881,3 +1881,130 @@ def render_for_anchor(slug: str, anchor: str) -> str:
         figures.append(f"<figure>{_render_svg(name)}{cap}</figure>")
     count_class = f" cell-banner--{len(matched)}"
     return f'<div class="cell-banner{count_class}">{"".join(figures)}</div>'
+
+
+# ─── Scores (v2 rubric — see docs/example-figure-rubric.md) ────────────
+# Score every attached example figure against the v2 rubric. The dict is
+# the single source of truth for both the gestalt review pages
+# (scripts/build_marginalia.py, scripts/build_prototypes.py) and any
+# future per-example scoring surface.
+
+SCORES: dict[str, tuple[float, str]] = {
+    # 9.5 — canonical, definitive depictions of their cell's move
+    "variables": (9.5, "the canonical name → object picture"),
+    "mutability": (9.5, "three-state small multiple of aliased mutation"),
+    "copying-collections": (9.5, "same picture as mutability, perfect match"),
+    # 9.0 — strong mechanism, runs match the cell, all craft criteria full credit
+    "hello-world": (9.0, "program → output, smallest mechanism"),
+    "numbers": (9.0, "int unbounded vs float thinning, both registers"),
+    "operators": (9.0, "expression tree mechanism"),
+    "none": (9.0, "three names converging on one None"),
+    "equality-and-identity": (9.0, "shared vs separate object, side-by-side"),
+    "strings": (9.0, "codepoints + bytes registers"),
+    "for-loops": (9.0, "4-row caret advance"),
+    "sorting": (9.0, "stability ribbons preserved across keys"),
+    "keyword-only-arguments": (9.0, "signature with explicit `*` separator"),
+    "positional-only-parameters": (9.0, "signature with explicit `/` separator"),
+    "closures": (9.0, "captured cell reference"),
+    "scope-global-nonlocal": (9.0, "LEGB nested rings"),
+    "recursion": (9.0, "stacked frames with same name, different argument"),
+    "lists": (9.0, "cells with append mechanism"),
+    "dicts": (9.0, "hash buckets with collision chain"),
+    "slices": (9.0, "ruler with bracket overlay"),
+    "comprehensions": (9.0, "comprehension over equivalent for-loop"),
+    "type-hints": (9.0, "ghost annotations over runtime values"),
+    "generators": (9.0, "ribbon cut by yield gates"),
+    "exceptions": (9.0, "try/except/else/finally lanes with traced path"),
+    "context-managers": (9.0, "enter / body / exit bowtie"),
+    "async-await": (9.0, "loop/coro swimlane with await handoffs"),
+    "classes": (9.0, "instance/class/type triangle"),
+    "inheritance-and-super": (9.0, "MRO chain with diamond ghost"),
+    "dataclasses": (9.0, "fields → generated __init__ signature"),
+    "decorators": (9.0, "before/after rebinding through cell"),
+    "special-methods": (9.0, "syntax → method dispatch"),
+    "unpacking": (9.0, "binding-line mechanism with *rest"),
+    "exception-chaining": (9.0, "__cause__ vs __context__ distinguished"),
+    "iterating-over-iterables": (9.0, "iter() exposes the iterator"),
+    "iterators": (9.0, "three-state machine"),
+    "iterator-vs-iterable": (9.0, "the protocol exposed"),
+    "container-protocols": (9.0, "iter/next backbone"),
+    "operator-overloading": (9.0, "dispatch arrow"),
+    "union-and-optional-types": (9.0, "type fork to several shapes"),
+    "abstract-base-classes": (9.0, "same triangle as concrete classes"),
+    "conditionals": (9.0, "predicate forks value to branch"),
+    "match-statements": (9.0, "dispatch ladder; first match wins"),
+    "advanced-match-patterns": (9.0, "four pattern variants"),
+    "loop-else": (9.0, "fell-through vs broke, two outcomes"),
+    "while-loops": (9.0, "back-edge mechanism"),
+    "type-aliases": (9.0, "complex annotation collapses to a name"),
+    "typed-dicts": (9.0, "keys with declared value types"),
+    "comprehension-patterns": (9.0, "nested clauses compose"),
+    "lambdas": (9.0, "function literal: params / expression"),
+    "string-formatting": (9.0, "format-spec railroad"),
+    "regular-expressions": (9.0, "pattern ruler with anchors"),
+    "json": (9.0, "two-column type mapping"),
+    "metaclasses": (9.0, "extended triangle to metaclass"),
+    "datetime": (9.0, "one instant, two clock offsets"),
+    "values": (9.0, "every literal is a typed object"),
+    "literals": (9.0, "literal spellings per type"),
+    "booleans": (9.0, "2×2 truth table"),
+    "sets": (9.0, "hash buckets without values"),
+    "yield-from": (9.0, "stitched ribbons; delegation"),
+    "generator-expressions": (9.0, "lazy filter→map pipeline"),
+    "async-iteration-and-context": (9.0, "loop/coro lanes with await yields"),
+    "assignment-expressions": (9.0, "walrus binds while comparing"),
+    "break-and-continue": (9.0, "early exit at first match"),
+    "delete-statements": (9.0, "name erased; object survives if referenced"),
+    "exception-groups": (9.0, "except* peels matching leaves"),
+    "custom-exceptions": (9.0, "subclass chain to a domain name"),
+    "modules": (9.0, "sys.path resolution; first hit wins"),
+    "protocols": (9.0, "structural duck check"),
+    "enums": (9.0, "closed set of symbolic values"),
+    "functions": (9.0, "specific call: greet('Ada') → 'Hello, Ada'"),
+    "constants": (9.0, "name binding; UPPER_CASE is convention"),
+    "import-aliases": (9.0, "two names bind to the same module"),
+    "number-parsing": (9.0, "int() success path vs ValueError"),
+    "tuples": (9.0, "frozen sequence with struck-through .append"),
+    "truthiness": (9.0, "bool(x) with the falsy set as a strip"),
+    "itertools": (9.0, "chain joins two iterables into one stream"),
+    "assertions": (9.0, "True passes, False raises"),
+    "descriptors": (9.0, "get/set/delete protocol routed through descriptor"),
+    "attribute-access": (9.0, "instance __dict__ → class __dict__ → __getattr__"),
+    "bound-and-unbound-methods": (9.0, "instance.method bound vs Class.method unbound"),
+    "classmethods-and-staticmethods": (9.0, "three method kinds, three first-arg conventions"),
+    "callable-objects": (9.0, "__call__ makes any object callable"),
+    "generics-and-typevar": (9.0, "the same T flows in and out"),
+    "truth-and-size": (9.0, "__bool__ → __len__ → True fallback chain"),
+    "bytes-and-bytearray": (9.0, "frozen vs mutable contrast"),
+    "sentinel-iteration": (9.0, "iter(callable, sentinel) stop condition"),
+    "partial-functions": (9.0, "f → partial(f, 1) → g"),
+    "guard-clauses": (9.0, "early returns, main body at the tail"),
+    "packages": (9.0, "__init__.py + nested submodules"),
+    "virtual-environments": (9.0, "project / venv boundary"),
+    "subprocesses": (9.0, "spawn → child → captured output"),
+    "logging": (9.0, "five thresholded levels"),
+    "testing": (9.0, "arrange-act-assert three-row pattern"),
+    "networking": (9.0, "HTTP / TCP / IP / link stack"),
+    "casts-and-any": (9.0, "Any → cast(T, x) → T, runtime unchanged"),
+    "newtype": (9.0, "same runtime, distinct static identity"),
+    "paramspec": (9.0, "P preserved through decorator"),
+    "literal-and-final": (9.0, "slot narrows to a fixed set"),
+    "runtime-type-checks": (9.0, "isinstance returns bool"),
+    "collections-module": (9.0, "deque / Counter / defaultdict / namedtuple"),
+    "structured-data-shapes": (9.0, "TypedDict named keys with value types"),
+    "csv-data": (9.0, "rows × columns; same shape per line"),
+    "warnings": (9.0, "soft signal; execution continues"),
+    "object-lifecycle": (9.0, "__init__ → live → __del__"),
+    "args-and-kwargs": (9.0, "*args tuple, **kwargs dict regions"),
+    "multiple-return-values": (9.0, "function returns tuple; caller unpacks"),
+    "properties": (9.0, "obj.x routes through fget instead of __dict__"),
+    # 8.5 — abstract by nature; the figure mostly is the diagram itself
+    "overloads": (8.5, "multiple signatures → one impl; abstract"),
+    "callable-types": (8.5, "Callable[[A, B], R] shape; static-only"),
+    "threads-and-processes": (8.5, "GIL lanes; abstract concurrency model"),
+}
+
+
+def figure_score(slug: str) -> tuple[float, str] | None:
+    """Return the v2 score and rationale for an attached example slug, if any."""
+    return SCORES.get(slug)
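A table this long is easy to get wrong by hand, so a small sanity check is a natural companion to `figure_score`. The sketch below runs against a four-entry sample copied from the diff. The `check_scores` helper and its two rules (score within 0-10, non-empty rationale) are hypothetical and not part of the commit's 39 tests.

```python
from collections import Counter
from typing import Optional

# Four entries copied from the diff; the real table holds 109.
SCORES: dict[str, tuple[float, str]] = {
    "variables": (9.5, "the canonical name → object picture"),
    "hello-world": (9.0, "program → output, smallest mechanism"),
    "overloads": (8.5, "multiple signatures → one impl; abstract"),
    "callable-types": (8.5, "Callable[[A, B], R] shape; static-only"),
}


def figure_score(slug: str) -> Optional[tuple[float, str]]:
    """Same lookup as in the diff: None when the slug has no score."""
    return SCORES.get(slug)


def check_scores(scores: dict[str, tuple[float, str]]) -> list[str]:
    """Hypothetical check: list every entry that breaks a basic rule."""
    problems = []
    for slug, (score, note) in scores.items():
        if not 0.0 <= score <= 10.0:
            problems.append(f"{slug}: score {score} outside the 10-point scale")
        if not note.strip():
            problems.append(f"{slug}: empty rationale")
    return problems


# Tally how many slugs fall at each score, as the commit message does.
distribution = Counter(score for score, _note in SCORES.values())

print(check_scores(SCORES))      # []
print(distribution[8.5])         # 2
print(figure_score("missing"))   # None
```

Run against the full table, the same `Counter` would be expected to reproduce the 3 / 103 / 3 split quoted in the commit message.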
