
Encode telemetry IDs as UUID strings to prevent PostHog truncation#6611

Merged
masenf merged 6 commits into main from claude/compassionate-planck-wjwQy
Jun 5, 2026

@masenf masenf commented Jun 4, 2026

Collaborator

Type of change

  • Bug fix (non-breaking change which fixes an issue)

Description

Fixes a critical telemetry bug where 128-bit installation and project identifiers were silently truncated by PostHog when sent as raw JSON integers. PostHog coerces large JSON numbers to float64, which keeps only about 16 significant decimal digits, so distinct installations and apps could collapse onto the same identifier and have their events incorrectly correlated.
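The collision is easy to reproduce locally. The following is a minimal sketch, assuming the receiving side parses large JSON numbers as IEEE 754 doubles (as the description reports for PostHog); the two sample identifiers are made up for illustration.

```python
# Two distinct 128-bit identifiers that differ only in their lowest bit.
# (Illustrative values, not real installation ids.)
id_a = (1 << 127) | 0xDEADBEEF
id_b = id_a + 1

# A float64 has a 53-bit significand, so the low bits of a 128-bit
# integer cannot survive the conversion: both ids map to the same float.
float_a = float(id_a)
float_b = float(id_b)

# Converting back does not recover the original value either.
recovered = int(float_a)
```

Here `float_a` and `float_b` compare equal even though `id_a` and `id_b` do not, which is the kind of collapse the fix is meant to prevent.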

Solution: Encode the 128-bit identifiers as UUID strings before sending. A UUID carries the same 128 bits, so the conversion is lossless and reversible (uuid.UUID(int=value).int == value). This preserves full fidelity while keeping new events linkable to pre-migration history.
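A minimal sketch of the re-encoding, using only the standard `uuid` module. The helper name `encode_distinct_id` is a stand-in for illustration; the actual `_encode_distinct_id()` in the PR may differ in details such as validation.

```python
import uuid


def encode_distinct_id(value: int) -> str:
    """Re-encode a 128-bit integer as a canonical UUID string (illustrative helper)."""
    return str(uuid.UUID(int=value))


# Illustrative stored identifier, written in hex so the mapping is easy to see.
legacy_id = 0x0123456789ABCDEF0123456789ABCDEF
encoded = encode_distinct_id(legacy_id)  # "01234567-89ab-cdef-0123-456789abcdef"

# Parsing the string recovers the exact integer that was stored.
decoded = uuid.UUID(encoded).int
```

Because the string is just another spelling of the same 128 bits, nothing has to change on disk for existing installs.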

Changes

  1. reflex/utils/telemetry.py
    • Added _encode_distinct_id() function to convert 128-bit integers to canonical UUID hex strings
    • Updated _Properties TypedDict to declare distinct_id and distinct_app_id as str instead of int
    • Modified _get_event_defaults() to encode both identifiers via _encode_distinct_id()
  2. reflex/utils/prerequisites.py
    • Changed ensure_reflex_installation_id() to generate uuid.uuid4().int instead of random.getrandbits(128)
    • Added comment explaining that the integer form is stored for backward compatibility with older Reflex versions
  3. reflex/utils/frontend_skeleton.py
    • Changed init_reflex_json() to generate uuid.uuid4().int instead of random.getrandbits(128)
    • Added comment explaining the encoding strategy
  4. tests/units/test_telemetry.py
    • Updated event_defaults fixture to expect UUID strings instead of integers
    • Added test_encode_distinct_id_round_trips_losslessly() to verify 128-bit precision is preserved
    • Added test_encode_distinct_id_handles_uuid4_int_form() to verify uuid4 round-tripping
    • Added test_encode_distinct_id_pads_small_values() to verify small integers are zero-padded correctly
    • Added stub_event_default_sources fixture to mock slow/host-specific telemetry inputs
    • Added test_get_event_defaults_encodes_ids_as_uuid_strings() regression test
    • Added test_get_event_defaults_omits_distinct_app_id_without_project_hash() to verify conditional behavior
    • Added test_get_event_defaults_returns_none_without_installation_id() to verify existing contract

Test Plan

All new unit tests pass and cover:

  • Lossless round-trip encoding/decoding of 128-bit identifiers
  • UUID4 integer form handling
  • Zero-padding of small values
  • Integration with _get_event_defaults()
  • Conditional omission of distinct_app_id when project hash is unavailable
  • Preservation of existing behavior when installation ID is missing

Existing telemetry tests updated to expect UUID string format.
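The encoding properties in the test plan could be checked roughly as follows. This is a sketch only: `encode_distinct_id` is a local stand-in for `_encode_distinct_id()`, and the test names and bodies are illustrative rather than copied from tests/units/test_telemetry.py.

```python
import uuid


def encode_distinct_id(value: int) -> str:
    """Local stand-in for the PR's helper (assumed to wrap uuid.UUID(int=...))."""
    return str(uuid.UUID(int=value))


def test_round_trip_preserves_all_128_bits():
    # The largest 128-bit value exercises every bit position.
    value = (1 << 128) - 1
    assert uuid.UUID(encode_distinct_id(value)).int == value


def test_uuid4_int_form_round_trips():
    # New installs store uuid4().int; encoding it must give back the same UUID.
    original = uuid.uuid4()
    assert encode_distinct_id(original.int) == str(original)


def test_small_values_are_zero_padded():
    # Small legacy integers still produce a full-width canonical string.
    assert encode_distinct_id(1) == "00000000-0000-0000-0000-000000000001"
```

Run under pytest, each function is collected as a separate test; they can also be called directly.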

https://claude.ai/code/session_0162Wc1GmkskbgCRs7fjg9Cy

distinct_id and distinct_app_id were reported as 128-bit integers. PostHog
coerces large JSON numbers to float64, discarding all but ~16 significant
digits, so two distinct users or apps could collapse onto the same truncated
value and have their events incorrectly correlated.

Encode both identifiers as canonical UUID hex strings before sending. A UUID
holds the same 128 bits, so str(UUID(int=existing_id)) is a lossless
re-encoding: the value is unchanged, only its wire form differs. Existing
installs derive their UUID from the stored integer (never regenerated), so
post-migration events stay linkable to their pre-migration history. New
installs now generate a real uuid4, persisted as its integer form to keep the
installation_id and reflex.json files readable by older Reflex versions.
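The storage strategy described above can be sketched like this. The function name and the plain decimal text file are assumptions for illustration; the real `ensure_reflex_installation_id()` may store and validate the value differently.

```python
import tempfile
import uuid
from pathlib import Path


def load_or_create_installation_id(path: Path) -> int:
    """Return the stored integer id, generating one from uuid4 if absent (hypothetical helper)."""
    if path.exists():
        # Existing installs: reuse the stored integer, never regenerate it.
        return int(path.read_text().strip())
    # New installs: a real uuid4, persisted in its integer form so that
    # older versions expecting an integer can still read the file.
    new_id = uuid.uuid4().int
    path.write_text(str(new_id))
    return new_id


with tempfile.TemporaryDirectory() as tmp:
    id_file = Path(tmp) / "installation_id"
    first = load_or_create_installation_id(id_file)   # generates and stores
    second = load_or_create_installation_id(id_file)  # reads the stored value
    wire_id = str(uuid.UUID(int=first))               # form sent in telemetry
```

The on-disk format stays an integer in both cases; only the value sent over the wire changes shape.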

https://claude.ai/code/session_0162Wc1GmkskbgCRs7fjg9Cy
@masenf masenf requested a review from a team as a code owner June 4, 2026 23:15

greptile-apps Bot commented Jun 4, 2026

Contributor

Greptile Summary

Fixes a telemetry bug where 128-bit installation and project identifiers were being sent as raw JSON integers, which PostHog silently coerces to floats, discarding all but ~16 significant digits and collapsing distinct installs/apps onto the same truncated ID.

  • Adds _encode_distinct_id() in telemetry.py to encode stored 128-bit integers as canonical UUID hex strings before they leave the process, updating _Properties and _get_event_defaults() accordingly.
  • Replaces random.getrandbits(128) with uuid.uuid4().int in both prerequisites.py and frontend_skeleton.py; the integer form is still persisted to disk for backward compatibility with older Reflex versions.
  • Six new unit tests cover lossless round-trip encoding, zero-padding of small values, and regression coverage of _get_event_defaults() emitting string IDs.

Confidence Score: 5/5

Safe to merge — the change is narrowly scoped to telemetry ID encoding, all error paths are wrapped by the existing suppress(Exception) in _send, and new tests verify the round-trip fidelity.

The encoding logic is minimal (str(uuid.UUID(int=value))), the TypedDict update is consistent, generated IDs remain backward-compatible on disk, and the test suite directly covers the regression scenario. No application logic outside telemetry is affected.

No files require special attention.

Important Files Changed

| Filename | Overview |
| --- | --- |
| reflex/utils/telemetry.py | Adds _encode_distinct_id() to convert 128-bit ints to UUID strings; updates _Properties TypedDict and _get_event_defaults() to use it. |
| reflex/utils/prerequisites.py | Swaps random.getrandbits(128) for uuid.uuid4().int when generating a new installation ID, preserving integer storage for backward compatibility. |
| reflex/utils/frontend_skeleton.py | Same random.getrandbits(128) → uuid.uuid4().int change for project hash generation; comment explains encoding strategy. |
| tests/units/test_telemetry.py | Updates existing fixture to expect UUID strings; adds six new targeted tests covering round-trip fidelity, zero-padding, and _get_event_defaults() integration. |

Reviews (1): Last reviewed commit: "fix(telemetry): send distinct ids as UUI..."


codspeed-hq Bot commented Jun 4, 2026


Merging this PR will improve performance by 3.68%

⚠️ Different runtime environments detected

Some benchmarks with significant performance changes were compared across different runtime environments,
which may affect the accuracy of the results.


⚡ 1 improved benchmark
✅ 25 untouched benchmarks
⏩ 8 skipped benchmarks [1]

Performance Changes

| Benchmark | BASE | HEAD | Efficiency |
| --- | --- | --- | --- |
| test_var_access[mutable_dataclass_list] | 218.2 ms | 210.4 ms | +3.68% |



Comparing claude/compassionate-planck-wjwQy (097f506) with main (00fdeaf)


Footnotes

  1. 8 benchmarks were skipped, so the baseline results were used instead. If they were deleted from the codebase, click here and archive them to remove them from the performance reports.

claude added 3 commits June 5, 2026 00:48
Re-encoding distinct_id as a UUID string makes PostHog treat the new UUID
identity and the old (float-truncated) numeric identity as separate persons,
breaking continuity with pre-migration events.

On the first telemetry send of a process, emit a one-time PostHog
$create_alias event linking the new UUID distinct_id to the legacy numeric id.
The legacy id is sent as a JSON number so PostHog coerces it to the same lossy
float as the historic events, merging the two persons. The attempt is
best-effort and runs exactly once: a flag in reflex.json records that it ran
(set even when the alias does not match, since the lossy legacy id may not),
and it is written with a merging update so it survives Reflex downgrades and
upgrades. Brand-new UUID-native projects preset the flag during init since
they have no legacy numeric telemetry to alias.
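A rough sketch of what such an alias event could look like when serialized. The payload layout (an `event` name with `distinct_id` and `alias` properties, and which id goes in which field) is an assumption for illustration and is not taken from this PR. The point is only that the new id travels as a string while the legacy id travels as a JSON number, so a receiver that parses numbers as floats sees the same lossy value as in the historic events.

```python
import json
import uuid


def build_alias_event(installation_id: int) -> dict:
    """Build an illustrative alias payload (field layout is assumed, not from the PR)."""
    return {
        "event": "$create_alias",
        "properties": {
            # New identity: lossless UUID string.
            "distinct_id": str(uuid.UUID(int=installation_id)),
            # Legacy identity: deliberately left as a number.
            "alias": installation_id,
        },
    }


installation_id = (1 << 127) | 0xDEADBEEF  # made-up example id
payload = json.dumps(build_alias_event(installation_id))

# Python itself round-trips the big integer exactly ...
exact = json.loads(payload)["properties"]["alias"]
# ... while a float-based parser sees the truncated value.
lossy = json.loads(payload, parse_int=float)["properties"]["alias"]
```

Parsing with `parse_int=float` here only simulates a float-based receiver; it says nothing about how PostHog is actually implemented.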

https://claude.ai/code/session_0162Wc1GmkskbgCRs7fjg9Cy
…per-app

The alias links the per-machine installation distinct_id, so gating it on a
per-app reflex.json flag was the wrong scope. Replace that flag with a marker
file next to the installation id in the Reflex dir, recording that the install
uses v0.9.5 UUID distinct_id semantics.

- New installs write the marker when the id is first generated
  (ensure_reflex_installation_id), so they never attempt a pointless alias.
- Legacy installs (id present, marker absent) attempt the one-time
  $create_alias, then write the marker regardless of outcome so it is not
  retried on every run.

The marker lives in the per-user Reflex dir, which no Reflex version clears,
so it persists across downgrades and upgrades. reflex.json is no longer
touched for telemetry.
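The marker-file gating described in this commit can be sketched as follows. The file names, the function name, and the `send_alias` callback are hypothetical; only the control flow (attempt once for legacy installs, then write the marker regardless of outcome) follows the commit message.

```python
import tempfile
from pathlib import Path
from typing import Callable


def maybe_alias_legacy_install(reflex_dir: Path, send_alias: Callable[[int], None]) -> bool:
    """Attempt the one-time alias for a legacy install; return True if attempted."""
    id_file = reflex_dir / "installation_id"         # assumed file name
    marker = reflex_dir / "uuid_distinct_id_marker"  # hypothetical marker name
    if not id_file.exists() or marker.exists():
        # No id yet, or already using UUID semantics: nothing to alias.
        return False
    try:
        send_alias(int(id_file.read_text().strip()))
    except Exception:
        pass  # best-effort: a failed alias must not break telemetry
    finally:
        marker.touch()  # written regardless of outcome, so it is not retried
    return True


calls: list[int] = []
with tempfile.TemporaryDirectory() as tmp:
    reflex_dir = Path(tmp)
    (reflex_dir / "installation_id").write_text("12345")  # simulate a legacy install
    attempts = [
        maybe_alias_legacy_install(reflex_dir, calls.append),  # first run: attempts alias
        maybe_alias_legacy_install(reflex_dir, calls.append),  # later runs: skipped
    ]
```

A new install would write the marker at the moment the id is generated, so the first branch returns early and no alias is ever attempted.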

https://claude.ai/code/session_0162Wc1GmkskbgCRs7fjg9Cy
adhami3310 previously approved these changes Jun 5, 2026
The "Update branch" merge of main combined `import importlib.metadata` (#6610)
with this branch's `import uuid` but left a blank line splitting the stdlib
import group, which ruff's isort rejected in CI pre-commit. Remove it.

https://claude.ai/code/session_0162Wc1GmkskbgCRs7fjg9Cy
@masenf masenf merged commit 3794655 into main Jun 5, 2026
105 of 106 checks passed
@masenf masenf deleted the claude/compassionate-planck-wjwQy branch June 5, 2026 19:15