Skip to content

Add support for json.Number to dataconv.Marshal#128

Closed
phughes-scwx wants to merge 1 commit into
1set:masterfrom
phughes-scwx:phughes-add-json-number
Closed

Add support for json.Number to dataconv.Marshal#128
phughes-scwx wants to merge 1 commit into
1set:masterfrom
phughes-scwx:phughes-add-json-number

Conversation

@phughes-scwx

@phughes-scwx phughes-scwx commented Aug 1, 2025

Copy link
Copy Markdown

What

Adds a case to the dataconv.Marshal type switch that retains values passed as json.Number as their underlying string value.

Why

In instances where a user is consuming raw JSON without a predetermined struct for unmarshaling, they often unmarshal into a map[string]interface{}. When doing this all numbers are converted to floats, even if they have no fractional part. Eg:

var v map[string]any
json.Unmarshal([]byte(`{"value":6}`), &v)
fmt.Printf("%T", v["value"])
// float64

Therefore a user, in a situation where they are unmarshaling deeply-nested JSON into a map[string]interface{} and then marshaling using dataconv.Marshal for use in Starlark, can expect integers to be misidentified as floats. Checks inside the script will fail to distinguish between integers and floats.

To simplify this situation, we ask the Go JSON decoder to use json.Number values for all numerics.

var v map[string]any
dec := json.NewDecoder(bytes.NewReader([]byte(`{"value":6}`)))
dec.UseNumber()
dec.Decode(&v)
fmt.Printf("%T", v["value"])
// json.Number

If dataconv.Marshal converts the value to a string representation, then we can handle the values differently based on str.isdigit() and str.isdecimal() within the script.

vt128 added a commit that referenced this pull request Jun 13, 2026
…167)

Two related JSON-number gaps, both surfaced by the diagnosis in #128
(thanks @phughes-scwx):

1. Marshal had no json.Number case, so a json.Number fell through to
   "unrecognized type" and errored. A caller decoding raw JSON into
   map[string]interface{} with dec.UseNumber() (the way to keep the
   int/float distinction that a plain json.Unmarshal collapses to
   float64) could not then Marshal it. Add the case, mapping by literal
   form: an integer literal -> Int (big.Int beyond int64), otherwise
   -> Float. This matches json.decode and serial; mapping to a string
   (as proposed in #128) would make Marshal the only converter that
   turns a number into a non-number and break value == 6. Also accept
   *big.Int (the inverse of Unmarshal, so the round-trip stays exact).

2. UnmarshalStarlarkJSON decoded WITHOUT UseNumber and re-derived
   int-ness from float64, silently saturating integers past 2^53 (it is
   on the lib/http request-body path, so real JSON ids were affected).
   Decode with UseNumber and the same number-preserving logic; impose
   single-document strictness the previous json.Unmarshal had.

interface.go doc updated: numbers now map by literal form (whole floats
like 6.0 become Float, not Int — no test pinned the old heuristic).

Test-first: TestMarshal gains json.Number/*big.Int cases and
TestUnmarshalStarlarkJSONNumberFidelity pins exact big-int decoding.
Requirement: LET-28. Refs #128.

Co-authored-by: Claude Opus 4.8 <noreply@anthropic.com>
@vt128

vt128 commented Jun 13, 2026

Copy link
Copy Markdown
Member

Thanks for this, @phughes-scwx — and sorry for the slow look. The gap you're pointing at is real: today dataconv.Marshal has no json.Number case, so a json.Number falls through to the default: arm and errors with unrecognized type. The underlying pain is real too — json.Unmarshal into map[string]interface{} collapses every number to float64, so integers reach Starlark as Float and a script can't tell 6 from 6.0. Decoding with dec.UseNumber() is the right way to keep that, so Marshal should accept json.Number.

The one thing we wanted to settle before merging is the target type. Mapping json.Number to a Starlark string makes Marshal the only converter in the stack that turns a number into a non-number — json.decode (go.starlark.net), serial.loads, and dataconv's own GoToStarlarkViaJSON all map a JSON number to a Starlark Int/Float. A string result also breaks the natural script-side checks (value == 6 becomes False, arithmetic fails) and brings back the isdigit()/int() dance the int/float distinction was meant to remove.

So rather than merge this PR as-is, we've taken your diagnosis and landed the fix in #167 (credited to you), mapping json.NumberInt for an integer literal (with *big.Int beyond int64, so large ids stay exact) and → Float otherwise — the same int-vs-float rule json.decode and serial already use. With UseNumber() set: {"value":6}Int(6), {"value":6.5}Float(6.5), {"id":12345678901234567890} stays exact.

Verifying this also surfaced a sibling bug: dataconv.UnmarshalStarlarkJSON was decoding without UseNumber() and re-deriving int-ness from float64, which silently saturated integers past 2^53 — and it's on the lib/http request-body path, so real JSON ids were affected. #167 routes that through UseNumber() too, so the module now handles JSON numbers consistently.

Does the implementation in #167 work for your use case? If it covers what you needed we'll close this PR in favor of it; if anything's missing or you'd prefer a different shape, very happy to adjust. Thanks again — the report and repro pinpointed a real inconsistency in how we handle JSON numbers.

@phughes-scwx

Copy link
Copy Markdown
Author

The gap you're pointing at is real

Ah, my good friend Claude. Thank you for your support.

Does the implementation in #167 work for your use case?

Yeah sure! Looks good.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants