Encode telemetry IDs as UUID strings to prevent PostHog truncation #6611
distinct_id and distinct_app_id were reported as 128-bit integers. PostHog coerces large JSON numbers to float64, discarding all but ~16 significant digits, so two distinct users or apps could collapse onto the same truncated value and have their events incorrectly correlated. Encode both identifiers as canonical UUID hex strings before sending. A UUID holds the same 128 bits, so str(UUID(int=existing_id)) is a lossless re-encoding: the value is unchanged, only its wire form differs. Existing installs derive their UUID from the stored integer (never regenerated), so post-migration events stay linkable to their pre-migration history. New installs now generate a real uuid4, persisted as its integer form to keep the installation_id and reflex.json files readable by older Reflex versions. https://claude.ai/code/session_0162Wc1GmkskbgCRs7fjg9Cy
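The collision and the lossless re-encoding described above can be illustrated with a minimal sketch. The helper name `encode_distinct_id` is a stand-in for the PR's `_encode_distinct_id()`, and the sample IDs are made up for illustration; only `str(uuid.UUID(int=value))` comes from the PR text.

```python
import uuid


def encode_distinct_id(value: int) -> str:
    """Render a 128-bit integer as a canonical UUID string (stand-in helper)."""
    return str(uuid.UUID(int=value))


# Two made-up 128-bit IDs that differ only in their lowest bit.
id_a = 0x0123456789ABCDEF0123456789ABCDEF
id_b = id_a + 1

# float64 has a 53-bit mantissa, so both IDs round to the same float.
assert float(id_a) == float(id_b)

# As UUID strings they stay distinct and convert back to the exact integer.
enc_a = encode_distinct_id(id_a)
enc_b = encode_distinct_id(id_b)
assert enc_a != enc_b
assert uuid.UUID(enc_a).int == id_a
assert uuid.UUID(enc_b).int == id_b

# Small integers are zero-padded to the full 32 hex digits.
assert encode_distinct_id(1) == "00000000-0000-0000-0000-000000000001"
```

Because the string is derived from the stored integer rather than regenerated, an existing install keeps the same 128-bit value and only the wire form changes.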
Greptile Summary
Fixes a telemetry bug where 128-bit installation and project identifiers were being sent as raw JSON integers, which PostHog silently coerces to floats, discarding all but ~16 significant digits and collapsing distinct installs/apps onto the same truncated ID.
Confidence Score: 5/5
Safe to merge: the change is narrowly scoped to telemetry ID encoding, all error paths are wrapped by the existing `suppress(Exception)` in `_send`, and new tests verify the round-trip fidelity. The encoding logic is minimal (`str(uuid.UUID(int=value))`), the TypedDict update is consistent, generated IDs remain backward-compatible on disk, and the test suite directly covers the regression scenario. No application logic outside telemetry is affected. No files require special attention.
Reviews (1): Last reviewed commit: "fix(telemetry): send distinct ids as UUI..."
Merging this PR will improve performance by 3.68%
| | Benchmark | BASE | HEAD | Efficiency |
|---|---|---|---|---|
| ⚡ | `test_var_access[mutable_dataclass_list]` | 218.2 ms | 210.4 ms | +3.68% |
Comparing claude/compassionate-planck-wjwQy (097f506) with main (00fdeaf)
Footnotes
- 8 benchmarks were skipped, so the baseline results were used instead.
Re-encoding distinct_id as a UUID string makes PostHog treat the new UUID identity and the old (float-truncated) numeric identity as separate persons, breaking continuity with pre-migration events. On the first telemetry send of a process, emit a one-time PostHog $create_alias event linking the new UUID distinct_id to the legacy numeric id. The legacy id is sent as a JSON number so PostHog coerces it to the same lossy float as the historic events, merging the two persons. The attempt is best-effort and runs exactly once: a flag in reflex.json records that it ran (set even when the alias does not match, since the lossy legacy id may not), and it is written with a merging update so it survives Reflex downgrades and upgrades. Brand-new UUID-native projects preset the flag during init since they have no legacy numeric telemetry to alias. https://claude.ai/code/session_0162Wc1GmkskbgCRs7fjg9Cy
…per-app
The alias links the per-machine installation distinct_id, so gating it on a per-app reflex.json flag was the wrong scope. Replace that flag with a marker file next to the installation id in the Reflex dir, recording that the install uses v0.9.5 UUID distinct_id semantics.
- New installs write the marker when the id is first generated (ensure_reflex_installation_id), so they never attempt a pointless alias.
- Legacy installs (id present, marker absent) attempt the one-time $create_alias, then write the marker regardless of outcome so it is not retried on every run.
The marker lives in the per-user Reflex dir, which no Reflex version clears, so it persists across downgrades and upgrades. reflex.json is no longer touched for telemetry. https://claude.ai/code/session_0162Wc1GmkskbgCRs7fjg9Cy
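The marker-file gating in this commit message might look roughly like the sketch below. This is not the PR's code: the marker file name, the decimal-text storage format of the id, the function names, and the `send_alias` callback are all assumptions made for illustration.

```python
import uuid
from pathlib import Path
from typing import Callable

ID_FILE = "installation_id"           # file name mentioned in the PR text
MARKER_FILE = "uuid_distinct_id"      # hypothetical marker file name


def ensure_installation_id(reflex_dir: Path) -> int:
    """Return the stored id, generating one (plus the marker) for new installs."""
    id_path = reflex_dir / ID_FILE
    if id_path.exists():
        # Assumption: the id is stored as decimal text.
        return int(id_path.read_text().strip())
    reflex_dir.mkdir(parents=True, exist_ok=True)
    new_id = uuid.uuid4().int
    id_path.write_text(str(new_id))
    # New installs have no legacy numeric telemetry, so mark them immediately.
    (reflex_dir / MARKER_FILE).touch()
    return new_id


def maybe_alias_legacy_id(
    reflex_dir: Path, send_alias: Callable[[int], None]
) -> bool:
    """Attempt the one-time alias for legacy installs; return True if attempted."""
    id_path = reflex_dir / ID_FILE
    marker = reflex_dir / MARKER_FILE
    if marker.exists() or not id_path.exists():
        return False
    legacy_id = int(id_path.read_text().strip())
    try:
        send_alias(legacy_id)  # hypothetical callback that emits the alias event
    except Exception:
        pass  # best-effort: a failed alias is not retried
    finally:
        marker.touch()  # written regardless of outcome
    return True
```

Keying the check on a file in the per-user directory, rather than on a per-app setting, matches the scope of the identifier being aliased: one attempt per installation, not one per project.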
The "Update branch" merge of main combined `import importlib.metadata` (#6610) with this branch's `import uuid` but left a blank line splitting the stdlib import group, which ruff's isort rejected in CI pre-commit. Remove it. https://claude.ai/code/session_0162Wc1GmkskbgCRs7fjg9Cy
Type of change
Description
Fixes a critical telemetry bug where 128-bit installation and project identifiers were being silently truncated by PostHog when sent as raw JSON integers. PostHog coerces large numbers to floats, discarding all but ~16 significant digits, causing distinct installations and apps to collapse onto the same identifier and have their events incorrectly correlated.
Solution: Encode the 128-bit identifiers as UUID strings before sending. A UUID carries the same 128 bits losslessly while remaining mathematically equivalent to the original integer (`uuid.UUID(int=value).int == value`). This preserves full fidelity while keeping new events linkable to pre-migration history.
Changes
reflex/utils/telemetry.py
- `_encode_distinct_id()` function to convert 128-bit integers to canonical UUID hex strings
- `_Properties` TypedDict to declare `distinct_id` and `distinct_app_id` as `str` instead of `int`
- `_get_event_defaults()` to encode both identifiers via `_encode_distinct_id()`
reflex/utils/prerequisites.py
- `ensure_reflex_installation_id()` to generate `uuid.uuid4().int` instead of `random.getrandbits(128)`
reflex/utils/frontend_skeleton.py
- `init_reflex_json()` to generate `uuid.uuid4().int` instead of `random.getrandbits(128)`
tests/units/test_telemetry.py
- `event_defaults` fixture to expect UUID strings instead of integers
- `test_encode_distinct_id_round_trips_losslessly()` to verify 128-bit precision is preserved
- `test_encode_distinct_id_handles_uuid4_int_form()` to verify uuid4 round-tripping
- `test_encode_distinct_id_pads_small_values()` to verify small integers are zero-padded correctly
- `stub_event_default_sources` fixture to mock slow/host-specific telemetry inputs
- `test_get_event_defaults_encodes_ids_as_uuid_strings()` regression test
- `test_get_event_defaults_omits_distinct_app_id_without_project_hash()` to verify conditional behavior
- `test_get_event_defaults_returns_none_without_installation_id()` to verify existing contract
Test Plan
All new unit tests pass and cover:
- … `_get_event_defaults()`
- … `distinct_app_id` when project hash is unavailable
Existing telemetry tests updated to expect UUID string format.
https://claude.ai/code/session_0162Wc1GmkskbgCRs7fjg9Cy