fix(web): /metrics language field emits non-lookup display names (c/c++, c#) #540

@dekobon

Description

Decided (2026-06-05): one canonical lowercase slug per language everywhere (cpp/csharp/tsx/…) for Display/FromStr/CLI-JSON/web/py; drop c/c++/c# and the Python override. See the resolved-decisions comment.

Summary

The REST /metrics response language field emits the raw LANG::name()
display string — "c/c++", "c#" — which is not a valid language= lookup
token, and which the Python bindings deliberately override. The two published
surfaces disagree on the canonical name for the same language.

Evidence

  • WebMetricsResponse.language (big-code-analysis-web/src/web/metrics.rs:34),
    populated from guess_language's second tuple element
    (big-code-analysis-web/src/web/server.rs:311), which is LANG::name()
    (src/tools.rs). LANG::name() returns "c/c++" for C++ and "c#" for C#.
  • The Python bindings override exactly these tokens
    (big-code-analysis-py/src/language.rs:46, lang_to_name): Cpp → "cpp",
    Csharp → "csharp", Tsx → "tsx", precisely because "c/c++" / "c#"
    are not valid lookup tokens.

Why 2.0-worthy

A client cannot round-trip the language string from a /metrics response back
into any language-name parameter, and web vs py disagree on canonical names. The
web response shape is a frozen API; changing "c/c++" → "cpp" is breaking → 2.0.

Proposed change

Decide a single canonical language-name vocabulary across web + Python (+ the CLI
JSON language field) for 2.0. Route the web language field through the same
lang_to_name-style override the bindings already use. This pairs with #508's
Display/FromStr work — the canonical name should be the FromStr-parseable
token, not the human display name.
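
A minimal sketch of what a single slug vocabulary could look like, where Display and FromStr share one table. Lang, ALL, slug, and ParseLangError are hypothetical names used for illustration only; the real LANG enum has more variants and its implementation may differ.

```rust
use std::fmt;
use std::str::FromStr;

/// Hypothetical stand-in for the crate's LANG enum (three variants only).
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Lang {
    Cpp,
    Csharp,
    Tsx,
}

impl Lang {
    /// Every variant, so the vocabulary can be iterated.
    pub const ALL: [Lang; 3] = [Lang::Cpp, Lang::Csharp, Lang::Tsx];

    /// The one canonical lowercase slug, shared by every surface.
    pub fn slug(self) -> &'static str {
        match self {
            Lang::Cpp => "cpp",
            Lang::Csharp => "csharp",
            Lang::Tsx => "tsx",
        }
    }
}

impl fmt::Display for Lang {
    // Display writes the slug, not a human-readable name.
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        f.write_str(self.slug())
    }
}

/// Error carrying the token that was not a known slug.
#[derive(Debug, PartialEq, Eq)]
pub struct ParseLangError(pub String);

impl FromStr for Lang {
    type Err = ParseLangError;

    // Parsing accepts exactly the strings that Display produces.
    fn from_str(s: &str) -> Result<Self, Self::Err> {
        Lang::ALL
            .iter()
            .copied()
            .find(|lang| lang.slug() == s)
            .ok_or_else(|| ParseLangError(s.to_string()))
    }
}
```

Because both directions read from the same slug function, a value serialized by one surface parses on any other, while a display form such as "c/c++" is rejected.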

Acceptance

  • /metrics language is a valid lookup token consistent with the Python
    surface and LANG::FromStr.

Part of the pre-2.0 review (#505).

Resolution

Fixed in fix/issue-540 (commit 57e056d9). One canonical lowercase
slug per language is now used on every surface (LANG::name /
Display / FromStr, CLI JSON, web /metrics, Python): Cpp → cpp,
Csharp → csharp, Tsx → tsx; the pretty c/c++ / c# forms are
dropped. Display is injective and round-trips exactly through
FromStr for every variant (enforced by a new all-variants round-trip
test plus a no-punctuation guard). fake::get_true reports cpp for
Objective-C/C++ instead of the unparseable obj-c/c++. The Python
lang_to_name override is removed as redundant. The serialized
language value changes, which is breaking, so it is deferred to 2.0;
the version stays 1.x. See the resolution comment for the full slug
table and decision rationale.
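
The round-trip, injectivity, and no-punctuation guards could be checked along these lines. This is a sketch under assumptions, not the test from the commit: Lang, ALL, slug, parse_slug, and the three check functions are hypothetical stand-ins, and only the three variants named in this issue are listed.

```rust
use std::collections::HashSet;

/// Hypothetical stand-in for the crate's LANG enum.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum Lang {
    Cpp,
    Csharp,
    Tsx,
}

const ALL: [Lang; 3] = [Lang::Cpp, Lang::Csharp, Lang::Tsx];

/// Canonical slug for a variant (what Display would emit).
fn slug(lang: Lang) -> &'static str {
    match lang {
        Lang::Cpp => "cpp",
        Lang::Csharp => "csharp",
        Lang::Tsx => "tsx",
    }
}

/// Inverse lookup (what FromStr would accept).
fn parse_slug(s: &str) -> Option<Lang> {
    ALL.iter().copied().find(|&lang| slug(lang) == s)
}

/// True when every variant parses back to itself from its own slug.
fn round_trips() -> bool {
    ALL.iter().all(|&lang| parse_slug(slug(lang)) == Some(lang))
}

/// True when no two variants share a slug.
fn slugs_are_injective() -> bool {
    let unique: HashSet<&str> = ALL.iter().map(|&lang| slug(lang)).collect();
    unique.len() == ALL.len()
}

/// True when every slug is non-empty and contains only lowercase ASCII
/// letters or digits, which rules out forms like "c/c++" and "c#".
fn slugs_have_no_punctuation() -> bool {
    ALL.iter().all(|&lang| {
        let s = slug(lang);
        !s.is_empty() && s.chars().all(|c| c.is_ascii_lowercase() || c.is_ascii_digit())
    })
}
```

Iterating over a list of all variants means a newly added language is covered by the guards as soon as it is added to that list.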

Labels: enhancement (New feature or request)
