Fix CSV re-import dedup to use raw symbol for stable keys #763

Closed
afadil wants to merge 1 commit into main from feature/competent-mirzakhani

@afadil afadil commented Mar 19, 2026

Problem

The CSV import step computed idempotency keys from resolved asset UUIDs, while the review step computed them from the raw CSV symbol. The keys therefore differed between the review step and the import step, which broke deduplication on re-import.

Solution

  • Extract symbol-based asset identifier logic into symbol_based_asset_id() helper that uses raw symbol + exchange MIC instead of UUID
  • Update build_import_idempotency_key() to use the new helper for consistent key generation
  • Update CSV import deduplication in repository to call the same helper, ensuring keys match across steps
  • Handle duplicates explicitly: catch UniqueViolation errors on insert and log each skipped row at debug level
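
The helper described above could look roughly like the sketch below. This is not the code from the PR: the signature, the trimming of whitespace, and the `import_key` function with its field list are assumptions made for illustration, and `DefaultHasher` is only a stand-in for whatever hash the real `build_import_idempotency_key()` uses.

```rust
use std::collections::hash_map::DefaultHasher;
use std::hash::{Hash, Hasher};

/// Builds "SYMBOL" or "SYMBOL@MIC" from raw CSV values.
/// Hypothetical signature; trimming whitespace is an assumption.
fn symbol_based_asset_id(symbol: Option<&str>, exchange_mic: Option<&str>) -> Option<String> {
    // No usable symbol means no symbol-based identifier.
    let symbol = symbol.map(str::trim).filter(|s| !s.is_empty())?;
    match exchange_mic.map(str::trim).filter(|m| !m.is_empty()) {
        Some(mic) => Some(format!("{symbol}@{mic}")),
        None => Some(symbol.to_string()),
    }
}

/// Stand-in for the idempotency key builder (hypothetical field list).
/// Both the review path and the import path would call this with the
/// output of `symbol_based_asset_id`, never with a resolved UUID.
fn import_key(account_id: &str, date: &str, activity_type: &str, asset_id: Option<&str>) -> u64 {
    let mut hasher = DefaultHasher::new();
    (account_id, date, activity_type, asset_id).hash(&mut hasher);
    hasher.finish()
}
```

Because both steps derive the asset part of the key from the same raw inputs, a row reviewed as `AAPL` on `XNAS` and later imported as `AAPL` on `XNAS` yields the same key regardless of which asset UUID resolution produced.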

Changes

  • activities_service.rs: Add symbol_based_asset_id() helper and refactor key building logic
  • repository.rs: Switch from bulk insert to per-activity insert with duplicate detection and logging

…dempotency keys

The review step (check_activities_import) computed idempotency keys using
raw CSV symbol text (e.g. 'AAPL'), while the import step used the resolved
asset UUID. Since the DB stored UUID-based keys, the review step never
found duplicates on re-import, and the import step could also mismatch
if asset resolution varied between imports.

Fix:
- Extract symbol_based_asset_id() helper that builds 'SYMBOL' or
  'SYMBOL@MIC' consistently for both paths
- Import step now uses get_symbol_code()/get_exchange_mic() instead of
  get_symbol_id() for key computation
- create_activities inserts row-by-row and gracefully skips unique
  constraint violations as a safety net
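
A minimal sketch of the row-by-row safety net, assuming an in-memory store in place of the real repository. `InMemoryStore`, `InsertError`, and the `(inserted, skipped)` return value are hypothetical names for illustration; the `HashSet` plays the role of the unique index on the idempotency key, and `eprintln!` stands in for the debug log.

```rust
use std::collections::HashSet;

#[derive(Debug, PartialEq)]
enum InsertError {
    UniqueViolation,
    Other(String),
}

/// Hypothetical stand-in for the activities table and its unique index.
#[derive(Default)]
struct InMemoryStore {
    keys: HashSet<String>,
}

impl InMemoryStore {
    fn insert(&mut self, idempotency_key: &str) -> Result<(), InsertError> {
        if idempotency_key.is_empty() {
            return Err(InsertError::Other("empty idempotency key".into()));
        }
        // HashSet::insert returns false when the key is already present.
        if self.keys.insert(idempotency_key.to_string()) {
            Ok(())
        } else {
            Err(InsertError::UniqueViolation)
        }
    }
}

/// Inserts one row at a time; duplicates are skipped, other errors abort.
/// Returns (inserted, skipped).
fn create_activities(store: &mut InMemoryStore, keys: &[&str]) -> Result<(usize, usize), InsertError> {
    let (mut inserted, mut skipped) = (0, 0);
    for &key in keys {
        match store.insert(key) {
            Ok(()) => inserted += 1,
            Err(InsertError::UniqueViolation) => {
                eprintln!("skipping duplicate activity with key {key}");
                skipped += 1;
            }
            Err(other) => return Err(other),
        }
    }
    Ok((inserted, skipped))
}
```

The trade-off against a bulk insert is one statement per row, in exchange for a duplicate no longer failing the whole batch.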

Closes #754

afadil commented Mar 19, 2026

@codex review

@chatgpt-codex-connector chatgpt-codex-connector bot left a comment

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 8a9b6ce85c

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".

Comment on lines +2635 to +2638
let asset_id_for_key = Self::symbol_based_asset_id(
    activity.get_symbol_code(),
    activity.get_exchange_mic(),
);

P1: Match legacy idempotency keys during CSV dedupe

This change computes import keys from symbol@MIC only, but previously imported activities in existing databases were keyed with the old UUID-based asset_id path. On the first re-import after this upgrade, check_existing_duplicates and the unique index compare against the new hash and will miss those legacy rows, allowing the same historical activity to be inserted again instead of being flagged/skipped. Keep backward compatibility by checking both key formats (or backfilling old rows) before switching to symbol-only keys.
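
One way to read the "check both key formats" suggestion is sketched below. It assumes the set of existing keys has already been loaded and that the caller can still compute the legacy UUID-based key when the asset resolves. `is_duplicate` and its parameters are hypothetical names, not functions from the repository.

```rust
use std::collections::HashSet;

/// Treats a row as a duplicate if either the new symbol-based key or the
/// legacy UUID-based key is already stored. Hypothetical helper.
fn is_duplicate(
    existing_keys: &HashSet<String>,
    symbol_key: &str,
    legacy_uuid_key: Option<&str>,
) -> bool {
    existing_keys.contains(symbol_key)
        || legacy_uuid_key.map_or(false, |key| existing_keys.contains(key))
}
```

The alternative mentioned in the review, backfilling old rows with symbol-based keys, would make the second lookup unnecessary but requires a data migration.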

@afadil afadil closed this Mar 19, 2026
@afadil afadil deleted the feature/competent-mirzakhani branch March 19, 2026 17:00