Decided (2026-06-05): one canonical lowercase slug per language everywhere (cpp/csharp/tsx/…) for Display/FromStr/CLI-JSON/web/py; drop c/c++/c# and the Python override. See the resolved-decisions comment.
Summary
The REST /metrics response language field emits the raw LANG::name()
display string — "c/c++", "c#" — which is not a valid language= lookup
token, and which the Python bindings deliberately override. The two published
surfaces disagree on the canonical name for the same language.
Evidence
- WebMetricsResponse.language (big-code-analysis-web/src/web/metrics.rs:34),
populated from guess_language's second tuple element
(big-code-analysis-web/src/web/server.rs:311), which is LANG::name()
(src/tools.rs). LANG::name() returns "c/c++" for C++ and "c#" for C#.
- The Python bindings override exactly these tokens
(big-code-analysis-py/src/language.rs:46, lang_to_name): Cpp → "cpp",
Csharp → "csharp", Tsx → "tsx", precisely because "c/c++" / "c#"
are not valid lookup tokens.
Why 2.0-worthy
A client cannot round-trip the language string from a /metrics response back
into any language-name parameter, and web vs py disagree on canonical names. The
web response shape is a frozen API; changing "c/c++" → "cpp" is breaking → 2.0.
Proposed change
Decide a single canonical language-name vocabulary across web + Python (+ the CLI
JSON language field) for 2.0. Route the web language field through the same
lang_to_name-style override the bindings already use. This pairs with #508's
Display/FromStr work — the canonical name should be the FromStr-parseable
token, not the human display name.
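
A minimal sketch of the proposed routing, under stated assumptions: `Lang` is a hypothetical three-variant stand-in for the crate's LANG enum, `display_name` mimics the display strings quoted above, the "rust" entry is a made-up pass-through case, and `lookup` is a hypothetical stand-in for whatever resolves a language= token. Only the Cpp → "cpp" and Csharp → "csharp" mappings come from this issue.

```rust
// Hypothetical stand-in for the crate's LANG enum; only three variants shown.
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
enum Lang {
    Cpp,
    Csharp,
    Rust,
}

impl Lang {
    // Mimics the human display names described in this issue.
    // The "rust" entry is an assumption used as a pass-through example.
    fn display_name(self) -> &'static str {
        match self {
            Lang::Cpp => "c/c++",
            Lang::Csharp => "c#",
            Lang::Rust => "rust",
        }
    }
}

// lang_to_name-style override: replace the display names that are not
// parseable tokens, and pass every other name through unchanged.
fn lang_to_name(lang: Lang) -> &'static str {
    match lang {
        Lang::Cpp => "cpp",
        Lang::Csharp => "csharp",
        other => other.display_name(),
    }
}

// Hypothetical lookup standing in for a language= parameter:
// it accepts canonical slugs only.
fn lookup(token: &str) -> Option<Lang> {
    match token {
        "cpp" => Some(Lang::Cpp),
        "csharp" => Some(Lang::Csharp),
        "rust" => Some(Lang::Rust),
        _ => None,
    }
}
```

With this shape, `lookup(Lang::Cpp.display_name())` returns `None`, while `lookup(lang_to_name(Lang::Cpp))` returns `Some(Lang::Cpp)`, which is the round-trip a /metrics client needs.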
Acceptance
/metrics language is a valid lookup token consistent with the Python
surface and LANG::FromStr.
Part of the pre-2.0 review (#505).
Resolution
Fixed in fix/issue-540 (commit 57e056d9). One canonical lowercase
slug per language is now used on every surface (LANG::name /
Display / FromStr, CLI JSON, web /metrics, Python). Cpp → cpp,
Csharp → csharp, Tsx → tsx; the pretty c/c++ / c# forms are
dropped. Display is injective and round-trips exactly through
FromStr for every variant (enforced by a new all-variants round-trip
test plus a no-punctuation guard). fake::get_true reports cpp for
Objective-C/C++ instead of the unparseable obj-c/c++. The Python
lang_to_name override is removed as redundant. This changes the serialized
language value, a breaking change deferred to 2.0; the version stays 1.x. See the
resolution comment for the full slug table and decision rationale.
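
The round-trip guarantee described in the resolution can be sketched as follows. This is an illustration only, not the code from commit 57e056d9: `Lang`, `ALL`, `slug`, `round_trips`, and `slugs_are_plain` are hypothetical names, and only the three slugs named in this issue are modeled.

```rust
use std::fmt;
use std::str::FromStr;

// Hypothetical stand-in for the crate's LANG enum.
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
enum Lang {
    Cpp,
    Csharp,
    Tsx,
}

impl Lang {
    // Every variant, so the checks below can iterate exhaustively.
    const ALL: [Lang; 3] = [Lang::Cpp, Lang::Csharp, Lang::Tsx];

    // One canonical lowercase slug per variant.
    fn slug(self) -> &'static str {
        match self {
            Lang::Cpp => "cpp",
            Lang::Csharp => "csharp",
            Lang::Tsx => "tsx",
        }
    }
}

impl fmt::Display for Lang {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        f.write_str(self.slug())
    }
}

impl FromStr for Lang {
    type Err = String;

    // Accepts exactly the strings that Display produces.
    fn from_str(s: &str) -> Result<Self, Self::Err> {
        Lang::ALL
            .iter()
            .copied()
            .find(|lang| lang.slug() == s)
            .ok_or_else(|| format!("unknown language: {s}"))
    }
}

// True when Display followed by FromStr returns the same variant
// for every variant (this also implies Display is injective).
fn round_trips() -> bool {
    Lang::ALL
        .iter()
        .all(|&lang| lang.to_string().parse::<Lang>() == Ok(lang))
}

// No-punctuation guard: every slug is lowercase ASCII letters or digits.
fn slugs_are_plain() -> bool {
    Lang::ALL.iter().all(|lang| {
        lang.slug()
            .chars()
            .all(|c| c.is_ascii_lowercase() || c.is_ascii_digit())
    })
}
```

In a real test suite, `round_trips()` and `slugs_are_plain()` would each sit behind an `assert!` in a `#[test]` function; deriving `FromStr` from the same `slug` table as `Display` keeps the two from drifting apart.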