Skip to content

feat(contacts): apply duplicate merge plans#817

Merged
steipete merged 2 commits into
mainfrom
codex/contacts-dedupe-apply
Jun 15, 2026
Merged

feat(contacts): apply duplicate merge plans#817
steipete merged 2 commits into
mainfrom
codex/contacts-dedupe-apply

Conversation

@steipete

@steipete steipete commented Jun 15, 2026

Copy link
Copy Markdown
Collaborator

Closes #815

Summary

  • add guarded contacts dedupe --apply / --merge
  • merge all supported People API fields into the selected primary contact before deleting redundant contacts
  • add repeatable --resource people/... scoping for exact reviewed automation targets
  • reject ambiguous singleton fields and unsupported secondary photo/cover-photo/skill data
  • re-read etags before each destructive mutation and preserve undeleted contacts on partial failure
  • add command docs, changelog entry, regression coverage, and permanent Contacts live-test coverage

Validation

  • autoreview --mode local --parallel-tests 'make ci': clean, no actionable findings
  • make ci: pass
  • focused Contacts apply/resource and live-harness tests: pass
  • live E2E with clawdbot@gmail.com: created two disposable duplicate contacts, verified exact-scoped dry-run, applied one merge/delete, read back both phone values from the primary, then cleaned up

@clawsweeper

clawsweeper Bot commented Jun 15, 2026

Copy link
Copy Markdown
Contributor

Codex review: needs changes before merge. Reviewed June 15, 2026, 8:56 AM ET / 12:56 UTC.

Summary
This PR adds opt-in contacts dedupe --apply/--merge, resource-scoped dry-run/apply planning, People API update/delete logic, docs, unit tests, and live-test harness coverage.

Reproducibility: yes. for the review finding: source inspection shows same normalized email/phone values are de-duplicated by value only, and the new test already expects a same-value email collapse. For the feature request itself, reproduction is not a bug path because current main is documented as preview-only.

Review metrics: 3 noteworthy metrics.

  • Files changed: 18 files, +1,231/-26. The feature spans command behavior, generated docs, manual docs, unit tests, and live-test harness paths.
  • People API mutations: 2 mutation calls added. UpdateContact and DeleteContact make this a destructive remote-data path that needs stronger merge safety than preview-only dedupe.
  • Release-owned file touched: 1 file. The branch edits CHANGELOG.md, which OpenClaw PR review guidance reserves for the release flow rather than normal feature branches.

Merge readiness
Overall: 🧂 unranked krab
Proof: 🌊 off-meta tidepool
Patch quality: 🧂 unranked krab
Result: blocked by patch quality or review findings.

Overall follows the weaker of proof and patch quality, so missing proof can cap an otherwise strong patch.

Rank-up moves:

  • [P2] Add refusal or lossless handling for same normalized email/phone values with differing mutable attributes.
  • Remove the normal-PR changelog edit and keep release-note context in the PR body or commit message.
  • [P2] Post or record the pending disposable real-account E2E result before merge, with private data redacted.

Risk before merge

  • [P1] --apply can silently discard user-entered email/phone attributes such as labels or types when two contacts share the same normalized value but differ in the rest of the field payload.
  • [P1] The PR body says disposable real-account E2E is still pending, which matters because this path updates one contact and deletes others through the People API.

Maintainer options:

  1. Refuse same-value field conflicts (recommended)
    Treat matching email/phone values with different cleaned payloads as unmergeable unless the code can preserve all mutable attributes, and add regression coverage before merge.
  2. Accept documented metadata loss
    Maintainers could intentionally decide labels/types on duplicate same-value fields are not preserved, but the docs and PR body should make that destructive tradeoff explicit.
  3. Pause until live proof is posted
    Because the PR itself says real-account E2E is pending, maintainers can pause merge until the live destructive workflow is proven on disposable contacts.
Copy recommended automerge instruction
@clawsweeper automerge

Special instructions:
In internal/cmd/contacts_dedupe_apply.go, update mergeContactsDedupeKeyedItems so duplicate normalized email/phone values are accepted only when their cleaned non-metadata payloads are identical; otherwise return a clear refusal error. Add tests covering same email/phone with different type/label/user-entered attributes, and keep metadata/formattedType stripping behavior.

Next step before merge

  • [P2] There is a narrow repair path in this PR branch: make keyed email/phone merges refuse differing cleaned payloads and remove the normal-PR changelog edit.

Security
Cleared: No dependency, workflow, credential, or supply-chain change was found; the destructive contact-data concern is tracked as a functional merge risk.

Review findings

  • [P1] Refuse conflicting same-value email and phone fields — internal/cmd/contacts_dedupe_apply.go:285-286
  • [P3] Leave the changelog entry to the release flow — CHANGELOG.md:7
Review details

Best possible solution:

Keep the guarded apply design, but refuse or deliberately merge same-value email/phone conflicts before deletion and leave release-owned changelog updates to the release flow.

Do we have a high-confidence way to reproduce the issue?

Yes for the review finding: source inspection shows same normalized email/phone values are de-duplicated by value only, and the new test already expects a same-value email collapse. For the feature request itself, reproduction is not a bug path because current main is documented as preview-only.

Is this the best way to solve the issue?

No. The implementation is directionally aligned, but the narrow safe fix is to refuse or preserve same-value email/phone fields with differing mutable attributes before deleting the redundant contact.

Full review comments:

  • [P1] Refuse conflicting same-value email and phone fields — internal/cmd/contacts_dedupe_apply.go:285-286
    When a keyed field uses the normalized email/phone value as key, every later item with that same value is skipped. If the redundant contact has the same address/number but a different mutable label/type or other user-entered attribute, --apply updates the primary without that data and then deletes the redundant contact; same-value fields need to be merged losslessly or treated as unmergeable.
    Confidence: 0.9
  • [P3] Leave the changelog entry to the release flow — CHANGELOG.md:7
    OpenClaw release-note guidance keeps CHANGELOG.md release-owned for normal PR review. This feature branch adds a release entry directly, so the PR should move that context to the PR body or commit message and let the release flow update the changelog.
    Confidence: 0.78

Overall correctness: patch is incorrect
Overall confidence: 0.88

AGENTS.md: found and applied where relevant.

Codex review notes: model internal, reasoning high; reviewed against f212bc5f8970.

Label changes

Label changes:

  • add P2: This is a normal-priority feature PR with a concrete destructive contact-data correctness risk but no urgent current-main regression.
  • add merge-risk: 🚨 other: Merging this PR could cause opt-in contact merges to drop mutable field attributes before deleting redundant contacts.
  • add rating: 🧂 unranked krab: Overall readiness is 🧂 unranked krab; proof is 🌊 off-meta tidepool and patch quality is 🧂 unranked krab.
  • add status: ⏳ waiting on author: ClawSweeper has contributor-facing work open and is waiting for author action. Not applicable: The external-contributor proof gate is not applied to this collaborator-authored PR, though the PR body still notes disposable real-account E2E is pending.

Label justifications:

  • P2: This is a normal-priority feature PR with a concrete destructive contact-data correctness risk but no urgent current-main regression.
  • merge-risk: 🚨 other: Merging this PR could cause opt-in contact merges to drop mutable field attributes before deleting redundant contacts.
  • rating: 🧂 unranked krab: Overall readiness is 🧂 unranked krab; proof is 🌊 off-meta tidepool and patch quality is 🧂 unranked krab.
  • status: ⏳ waiting on author: ClawSweeper has contributor-facing work open and is waiting for author action. Not applicable: The external-contributor proof gate is not applied to this collaborator-authored PR, though the PR body still notes disposable real-account E2E is pending.
Evidence reviewed

Acceptance criteria:

  • [P1] go test ./internal/cmd -run 'TestBuildContactsDedupeApplyPlan|TestContactsDedupeApply|TestNormalizeContactsDedupeResources'.
  • [P1] go test ./scripts -run TestContactsLiveDedupe.
  • [P1] make ci.

What I checked:

  • Repository policy read: AGENTS.md was read fully and its PR-link review-mode guidance was applied by inspecting the PR without switching branches or editing files. (AGENTS.md:35, f212bc5f8970)
  • Current main remains preview-only: Current main still advertises contacts dedupe as preview-only, so the central feature is not already implemented there. (internal/cmd/contacts.go:19, f212bc5f8970)
  • Docs confirm no apply flag on main: The current docs say contacts dedupe is read-only and has no apply flag. (docs/contacts-dedupe.md:47, f212bc5f8970)
  • Same-key items are skipped: The PR's keyed merge path replaces the structural item key with normalized email/phone value and skips later items with the same key, which drops differing labels/types or other mutable attributes before deletion. (internal/cmd/contacts_dedupe_apply.go:285, 330dbd55befd)
  • Existing test codifies value-only email dedupe: The new merge-fields test expects duplicate normalized email values to collapse to one address but does not cover differing mutable attributes on the same value. (internal/cmd/contacts_dedupe_apply_test.go:38, 330dbd55befd)
  • PR body notes live E2E is pending: The PR validation section lists make ci and HTTP-level tests, but says disposable real-account E2E is still pending before merge. (330dbd55befd)

Likely related people:

  • steipete: Introduced the current contacts dedupe preview and has the recent contacts/live-test history around this command surface. (role: feature owner and recent area contributor; confidence: high; commits: 62a7257aba85, 7d03521337c6, db31aae92b41; files: internal/cmd/contacts_dedupe.go, internal/cmd/contacts.go, scripts/live-tests/contacts.sh)
  • Andy: Added contacts address-field support in the adjacent People API contact field surface that this merge code now tries to preserve. (role: adjacent contacts field contributor; confidence: medium; commits: f1645917a796; files: internal/cmd/contacts.go, internal/cmd/contacts_crud.go, internal/cmd/contacts_update_more_test.go)
What the crustacean ranks mean
  • 🦀 challenger crab: rare, exceptional readiness with strong proof, clean implementation, and convincing validation.
  • 🦞 diamond lobster: very strong readiness with only minor maintainer review expected.
  • 🐚 platinum hermit: good normal PR, likely mergeable with ordinary maintainer review.
  • 🦐 gold shrimp: useful signal, but proof or patch confidence is still limited.
  • 🦪 silver shellfish: thin signal; proof, validation, or implementation needs work.
  • 🧂 unranked krab: not merge-ready because proof is missing/unusable or there are serious correctness or safety concerns.
  • 🌊 off-meta tidepool: rating does not apply to this item.

Shiny media proof means a screenshot, video, or linked artifact directly shows the changed behavior. Runtime, network, CSP, and security claims still need visible diagnostics.

How this review workflow works
  • ClawSweeper keeps one durable marker-backed review comment per issue or PR.
  • Re-runs edit this comment so the latest verdict, findings, and automation markers stay together instead of adding duplicate bot comments.
  • A fresh review can be triggered by eligible @clawsweeper re-review comments, exact-item GitHub events, scheduled/background review runs, or manual workflow dispatch.
  • PR/issue authors and users with repository write access can comment @clawsweeper re-review or @clawsweeper re-run on an open PR or issue to request a fresh review only.
  • Maintainers can also comment @clawsweeper review to request a fresh review only.
  • Fresh-review commands do not start repair, autofix, rebase, CI repair, or automerge.
  • Maintainer-only repair and merge flows require explicit commands such as @clawsweeper autofix, @clawsweeper automerge, @clawsweeper fix ci, or @clawsweeper address review.
  • Maintainers can comment @clawsweeper explain to ask for more context, or @clawsweeper stop to stop active automation.

@chatgpt-codex-connector chatgpt-codex-connector Bot left a comment

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 90aed08591

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".

Comment on lines +285 to +286
if key == "" || seen[key] {
continue

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P2 Badge Preserve differing labels on duplicate email/phone values

When keyFn is used for emailAddresses or phoneNumbers, this branch drops every later item with the same normalized value. If a redundant contact has the same email/number but a different mutable label/type or other user-entered attributes, --apply will keep only the primary's item and then delete the redundant contact, silently losing that metadata; conflicting same-key items should be merged deliberately or cause the group to be refused.

Useful? React with 👍 / 👎.

@chatgpt-codex-connector chatgpt-codex-connector Bot left a comment

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 330dbd55be

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".

if decodeErr := json.Unmarshal(data, &object); decodeErr != nil {
return nil, "", fmt.Errorf("decode contact field: %w", decodeErr)
}
delete(object, "metadata")

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P2 Badge Preserve primary field metadata during merges

When an applied group contains a multi-valued field with a user-selected source primary (for example a non-first primary email/phone), this deletes the entire field metadata object before updateContact replaces that field, so the PATCH no longer carries the People API metadata.sourcePrimary marker documented for source-level primary fields (https://developers.google.com/people/api/rest/v1/people#FieldMetadata). The merge can therefore silently change which email/phone/URL is primary even though the values were copied; strip only the read-only/source bits while retaining one appropriate sourcePrimary marker.

Useful? React with 👍 / 👎.

@clawsweeper clawsweeper Bot added rating: 🧂 unranked krab Not merge-ready due to missing proof or serious correctness/safety concerns. status: ⏳ waiting on author ClawSweeper has contributor-facing work open and is waiting for author action. labels Jun 15, 2026
@steipete steipete merged commit 02c893f into main Jun 15, 2026
9 checks passed
@steipete steipete deleted the codex/contacts-dedupe-apply branch June 15, 2026 13:07
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

rating: 🧂 unranked krab Not merge-ready due to missing proof or serious correctness/safety concerns. status: ⏳ waiting on author ClawSweeper has contributor-facing work open and is waiting for author action.

Projects

None yet

Development

Successfully merging this pull request may close these issues.

feat(contacts): add an apply/merge step to dedupe (currently preview-only)

1 participant