Planned (deferred) work: survivor metadata-quality tie-break and identity discovery signals #397

Description

@jesserobbins

Context

The AIC/dedup design spec describes two capabilities that the initial implementation deliberately did not ship. These were planned from the start — not drift, not accidental omissions, and not errors in the spec. They're future work that's still worth doing. Filing this so the design-vs-shipped gap is tracked and can be picked up.

The user docs were aligned to the shipped behavior (kenn-io/msgvault-docs#30, #31) so they don't describe unbuilt features. This issue is the home for the planned remainder.

Planned, not yet implemented

1. Survivor selection — "source metadata quality" tie-break

The spec's priority list includes step 3 — source metadata quality (provider IDs, threading info, Message-ID presence). The shipped comparator Engine.isBetter implements the other five steps (source priority → raw MIME → label count → earliest archived_at → row ID) but not this one.

Planned: add the metadata-quality comparison as a tie-break between raw-MIME presence and label count, per the spec ordering.
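
A minimal sketch of where the new tie-break would sit, assuming a simplified comparator. The `candidate` type, its field names, and the "count the signals present" scoring are all hypothetical and not taken from the msgvault code; the direction of the source-priority, label-count, and row-ID comparisons is also an assumption. Only the ordering of the steps comes from the spec as summarized above.

```go
package main

import (
	"fmt"
	"time"
)

// candidate is a hypothetical stand-in for one row in a duplicate group.
// Field names are illustrative, not msgvault's actual schema.
type candidate struct {
	ID             int64
	SourcePriority int // assumption: lower value means a more trusted source
	HasRawMIME     bool
	HasProviderID  bool
	HasThreadInfo  bool
	HasMessageID   bool
	LabelCount     int
	ArchivedAt     time.Time
}

// metadataQuality counts how many of the three metadata signals named in the
// spec are present. Equal weighting is an assumption made for this sketch.
func metadataQuality(c candidate) int {
	n := 0
	if c.HasProviderID {
		n++
	}
	if c.HasThreadInfo {
		n++
	}
	if c.HasMessageID {
		n++
	}
	return n
}

// isBetter reports whether a should survive over b, walking the six steps
// in spec order and falling through to the next step only on a tie.
func isBetter(a, b candidate) bool {
	if a.SourcePriority != b.SourcePriority { // 1. source priority
		return a.SourcePriority < b.SourcePriority
	}
	if a.HasRawMIME != b.HasRawMIME { // 2. raw MIME presence
		return a.HasRawMIME
	}
	if qa, qb := metadataQuality(a), metadataQuality(b); qa != qb { // 3. planned step
		return qa > qb
	}
	if a.LabelCount != b.LabelCount { // 4. label count (assumption: more wins)
		return a.LabelCount > b.LabelCount
	}
	if !a.ArchivedAt.Equal(b.ArchivedAt) { // 5. earliest archived_at
		return a.ArchivedAt.Before(b.ArchivedAt)
	}
	return a.ID < b.ID // 6. row ID (assumption: lower wins)
}

func main() {
	a := candidate{ID: 1, HasRawMIME: true, HasMessageID: true}
	b := candidate{ID: 2, HasRawMIME: true, LabelCount: 5}
	// a and b tie on steps 1 and 2, so step 3 decides before label count is consulted.
	fmt.Println(isBetter(a, b))
}
```

Placing the check before label count means a copy with more labels can no longer outrank a copy that carries better provider metadata, which is the behavior change this item would introduce.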

2. Identity discovery signals

The spec's Discovery signals section describes deriving identities from observed message data, using three signals: is_from_me (ingest metadata), sent-folder/sent-label (sent-mail placement), and oauth (provider account metadata). The shipped code confirms identities only at account-setup time, and the only signal values it writes are account-identifier, phone-e164, config_migration, and manual.

Planned: implement the discovery signals so that identities are populated from observed mail, not only confirmed at account setup.
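
As a rough illustration of what discovery from observed mail could look like, here is a sketch that derives identities from two of the three signals. The `observedMessage` and `identity` types and the `discover` function are hypothetical, and matching on a label named "SENT" is an assumption about how sent-mail placement would be detected. The oauth signal is left out because it would come from provider account metadata rather than from messages.

```go
package main

import (
	"fmt"
	"strings"
)

// observedMessage is a hypothetical view of an ingested message.
type observedMessage struct {
	From     string
	IsFromMe bool     // ingest metadata flag
	Labels   []string // provider labels or folder names
}

// identity pairs an address with the signal that produced it.
type identity struct {
	Address string
	Signal  string
}

// discover returns each (address, signal) pair once, in first-seen order.
func discover(msgs []observedMessage) []identity {
	seen := map[identity]bool{}
	var out []identity
	add := func(addr, sig string) {
		id := identity{addr, sig}
		if addr == "" || seen[id] {
			return
		}
		seen[id] = true
		out = append(out, id)
	}
	for _, m := range msgs {
		if m.IsFromMe {
			add(m.From, "is_from_me")
		}
		for _, l := range m.Labels {
			// Assumption: sent-mail placement shows up as a label named "SENT".
			if strings.EqualFold(l, "SENT") {
				add(m.From, "sent-label")
				break
			}
		}
	}
	return out
}

func main() {
	msgs := []observedMessage{
		{From: "me@example.com", IsFromMe: true, Labels: []string{"SENT"}},
		{From: "friend@example.com", Labels: []string{"INBOX"}},
	}
	for _, id := range discover(msgs) {
		fmt.Println(id.Address, id.Signal)
	}
}
```

Recording the signal alongside the address keeps discovered identities distinguishable from the setup-time values the shipped code already writes.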

Separate small doc fix: the spec's signal list omits phone-e164, which the code does write (import.go, import_gvoice.go) — worth adding so the spec enumerates the shipped set too.

Why it matters

These complete the originally-designed survivor-selection and identity model. Until they land, the shipped behavior is a deliberate subset of the design; this issue tracks finishing the planned work.
