Skip to content

Data Migration Tool inserts massive duplicate gallery rows in catalog_product_entity_media_gallery_value causing major slowdown #934

@ioweb-gr

Description

@ioweb-gr

Bug Description

After migrating from Magento 1 to Magento 2 using the Data Migration Tool, we identified a severe performance issue in the Magento\Catalog\Model\Product\Gallery\ReadHandler::sortMediaEntriesByPosition method. Product image galleries contain massive duplicate entries in the catalog_product_entity_media_gallery_value table.

Detailed Findings

  • Example: Product SKU BC-1-GOLD (entity_id 3477) only has 3 valid gallery value_ids, but each value_id appears 198 times with the same (value_id, store_id=0, entity_id) tuple in catalog_product_entity_media_gallery_value.
  • As a result, gallery queries return hundreds of thousands of rows per product (e.g., 198×3 value_ids = 594, joined twice yields 117,612 rows for one product), causing severe slowdowns in catalog operations.
  • The Magento 1 database did not contain these duplicates — the issue only arose post-migration.

Root Cause Analysis

  • The source repository code at src/Migration/Handler/Gallery/InsertValueToEntity.php is responsible for inserting gallery value-to-entity records. This code does not appear to check for pre-existing (value_id, entity_id) records before insertion, potentially allowing for many duplicates during migration.
  • Migration config XML only defines document keys, but does not enforce idempotency or uniqueness at the DB/module level for this case.
  • There is no unique constraint enforced for (value_id, store_id, entity_id) in the target table, increasing the risk.
  • No raw direct SQL for inserts is found in this repo, but any repeated/looped execution of the handler or scripts against the same data would amplify duplication.

Magento 2 runs this query

SELECT `main`.`value_id`,
       `main`.`value`                                                           AS `file`,
       `main`.`media_type`,
       `entity`.`entity_id`,
       IFNULL(`value`.`label`, `default_value`.`label`)                         AS `label`,
       IFNULL(`value`.`position`, `default_value`.`position`)                   AS `position`,
       IFNULL(`value`.`disabled`, `default_value`.`disabled`)                   AS `disabled`,
       `default_value`.`label`                                                  AS `label_default`,
       `default_value`.`position`                                               AS `position_default`,
       `default_value`.`disabled`                                               AS `disabled_default`,
       IFNULL(`value_video`.`provider`, `default_value_video`.`provider`)       AS `video_provider`,
       IFNULL(`value_video`.`url`, `default_value_video`.`url`)                 AS `video_url`,
       IFNULL(`value_video`.`title`, `default_value_video`.`title`)             AS `video_title`,
       IFNULL(`value_video`.`description`, `default_value_video`.`description`) AS `video_description`,
       IFNULL(`value_video`.`metadata`, `default_value_video`.`metadata`)       AS `video_metadata`,
       `default_value_video`.`provider`                                         AS `video_provider_default`,
       `default_value_video`.`url`                                              AS `video_url_default`,
       `default_value_video`.`title`                                            AS `video_title_default`,
       `default_value_video`.`description`                                      AS `video_description_default`,
       `default_value_video`.`metadata`                                         AS `video_metadata_default`
FROM `catalog_product_entity_media_gallery` AS `main`
         INNER JOIN `catalog_product_entity_media_gallery_value_to_entity` AS `entity`
                    ON main.value_id = entity.value_id
         LEFT JOIN `catalog_product_entity_media_gallery_value` AS `value`
                   ON main.value_id = value.value_id AND value.store_id = 0 AND value.entity_id = entity.entity_id
         LEFT JOIN `catalog_product_entity_media_gallery_value` AS `default_value`
                   ON main.value_id = default_value.value_id AND default_value.store_id = 0 AND
                      default_value.entity_id = entity.entity_id
         LEFT JOIN `catalog_product_entity_media_gallery_value_video` AS `value_video`
                   ON value.value_id = value_video.value_id AND value.store_id = value_video.store_id
         LEFT JOIN `catalog_product_entity_media_gallery_value_video` AS `default_value_video`
                   ON default_value.value_id = default_value_video.value_id AND
                      default_value.store_id = default_value_video.store_id
WHERE (main.attribute_id = '88')
  AND (main.disabled = 0)
  AND (entity.entity_id = '3477')
ORDER BY IF(value.position IS NULL, default_value.position, value.position) ASC

Which in our case yielded 117k results to sort by usort

17272 /s/x/sxara-autokinitou-trion-theseon-bc-1-gold_1.jpg image 3477 Σχάρα αυτοκινήτου για 3 ποδήλατα | BC-1 GOLD | example.com 1 0 Σχάρα αυτοκινήτου για 3 ποδήλατα | BC-1 GOLD | example.com 1 0                    
17272 /s/x/sxara-autokinitou-trion-theseon-bc-1-gold_1.jpg image 3477 Σχάρα αυτοκινήτου για 3 ποδήλατα | BC-1 GOLD | example.com 1 0 Σχάρα αυτοκινήτου για 3 ποδήλατα | BC-1 GOLD | example.com 1 0                    
17272 /s/x/sxara-autokinitou-trion-theseon-bc-1-gold_1.jpg image 3477 Σχάρα αυτοκινήτου για 3 ποδήλατα | BC-1 GOLD | example.com 1 0 Σχάρα αυτοκινήτου για 3 ποδήλατα | BC-1 GOLD | example.com 1 0                    
17272 /s/x/sxara-autokinitou-trion-theseon-bc-1-gold_1.jpg image 3477 Σχάρα αυτοκινήτου για 3 ποδήλατα | BC-1 GOLD | example.com 1 0 Σχάρα αυτοκινήτου για 3 ποδήλατα | BC-1 GOLD | example.com 1 0                    
17272 /s/x/sxara-autokinitou-trion-theseon-bc-1-gold_1.jpg image 3477 Σχάρα αυτοκινήτου για 3 ποδήλατα | BC-1 GOLD | example.com 1 0 Σχάρα αυτοκινήτου για 3 ποδήλατα | BC-1 GOLD | example.com 1 0                    
17272 /s/x/sxara-autokinitou-trion-theseon-bc-1-gold_1.jpg image 3477 Σχάρα αυτοκινήτου για 3 ποδήλατα | BC-1 GOLD | example.com 1 0 Σχάρα αυτοκινήτου για 3 ποδήλατα | BC-1 GOLD | example.com 1 0                    
17272 /s/x/sxara-autokinitou-trion-theseon-bc-1-gold_1.jpg image 3477 Σχάρα αυτοκινήτου για 3 ποδήλατα | BC-1 GOLD | example.com 1 0 Σχάρα αυτοκινήτου για 3 ποδήλατα | BC-1 GOLD | example.com 1 0                    
17272 /s/x/sxara-autokinitou-trion-theseon-bc-1-gold_1.jpg image 3477 Σχάρα αυτοκινήτου για 3 ποδήλατα | BC-1 GOLD | example.com 1 0 Σχάρα αυτοκινήτου για 3 ποδήλατα | BC-1 GOLD | example.com 1 0                    
17272 /s/x/sxara-autokinitou-trion-theseon-bc-1-gold_1.jpg image 3477 Σχάρα αυτοκινήτου για 3 ποδήλατα | BC-1 GOLD | example.com 1 0 Σχάρα αυτοκινήτου για 3 ποδήλατα | BC-1 GOLD | example.com 1 0                    
17272 /s/x/sxara-autokinitou-trion-theseon-bc-1-gold_1.jpg image 3477 Σχάρα αυτοκινήτου για 3 ποδήλατα | BC-1 GOLD | example.com 1 0 Σχάρα αυτοκινήτου για 3 ποδήλατα | BC-1 GOLD | example.com 1 0                    

Proposed Solution

  • Add logic in the migration handler and/or resource model to check for existing records before inserting into catalog_product_entity_media_gallery_value.

Additional Information

  • No duplicates existed in source (Magento 1) DB before migration.

Metadata

Metadata

Assignees

No one assigned

    Labels

    No labels
    No labels

    Type

    No type

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions