fix: normalize importer message/reaction dates to UTC #399
Merged
processBatch tracked the oldest message date via time.UnixMilli without .UTC(), leaving it in the local zone, so code reading its calendar day (and TestProcessBatch_OldestDatePropagation) was off by one in zones east of UTC. Match the .UTC() normalization already used for the stored message date and parsed.Date.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Claude-Session: https://claude.ai/code/session_01UVTtwc4MNS8L4ztnJ87QMj
Same class as the sync oldest-date bug: WhatsApp message SentAt (mapping.go) and reaction createdAt (importer.go) were built with time.Unix without .UTC(), leaving them in the local zone while every other importer stores UTC. This caused an off-by-one calendar day and the wrong Parquet year partition near day and year boundaries in zones east of UTC. Adds TestMapMessageSentAtIsUTC.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Claude-Session: https://claude.ai/code/session_01UVTtwc4MNS8L4ztnJ87QMj
Member
Great, thank you!
wesm approved these changes on Jun 22, 2026
Member
fixing CI
The WhatsApp UTC regression test used enough direct testify package calls to trip the repo's custom assertion-helper lint in CI, even though the behavior test itself passed. Use a local assert helper in that assertion-heavy test so the regression coverage stays in place while matching the enforced test style.

Generated with Codex (GPT-5)
Co-authored-by: Codex <codex@openai.com>
What

Several importers built time.Time values from epoch timestamps with time.Unix / time.UnixMilli but without .UTC(), leaving them in the runner's local zone, while the rest of each importer stores dates in UTC. Any code reading the calendar day (or the Parquet year partition) is then off by one in zones east of UTC.

Fixes:
- internal/sync/sync.go: processBatch oldest-message date (progress tracking).
- internal/whatsapp/mapping.go: message SentAt.
- internal/whatsapp/importer.go: reaction createdAt.

Why it matters

TestProcessBatch_OldestDatePropagation fails on any machine east of UTC (e.g. NZ): the fixture 2024-01-10T12:00:00Z reads back as Jan 11 local. The tests are correct; the production code was the bug. Adds TestMapMessageSentAtIsUTC (asserts the stored zone is UTC, machine-independent).

Possible later fixes (out of scope here)

The same time.Unix(...)-without-.UTC() pattern also appears in the embedding-generation status timestamps, but these are operator-facing status values round-tripped from unix-int columns (not message dates), so they don't affect partitioning/dedup/cross-system date semantics. Local-time display is arguably fine; normalizing them to UTC would be a consistency-only follow-up. Sites:
- cmd/msgvault/cmd/embeddings_manage.go: StartedAt, SeededAt, CompletedAt, ActivatedAt.
- internal/vector/pgvector/backend.go: StartedAt, CompletedAt, ActivatedAt.
- internal/vector/sqlitevec/backend.go: StartedAt, CompletedAt, ActivatedAt.

Left unchanged here to avoid churning working code on a style call; documented so a future pass can decide.

Scope

Independent of the Teams PR (#398): branched from main, touches only internal/sync and internal/whatsapp.