MCHPM

Official implementation of:

Lim, H., Park, S., Li, Q., Li, X., & Kim, J. (2026). What makes a review helpful? A multimodal prediction model in e-commerce. Electronic Commerce Research and Applications, 76, 101586.

Overview

This repository is the official implementation of MCHPM (Multimodal Cue-based Helpfulness Prediction Model), published in Electronic Commerce Research and Applications (2026).

Most multimodal review helpfulness prediction (MRHP) models rely on deep semantic representations of text and images and overlook surface-level cues such as readability, sentiment intensity, and image quality. MCHPM addresses this gap by drawing on the Elaboration Likelihood Model (ELM) from consumer psychology, which describes how readers process information through two parallel routes — a central route based on careful cognitive engagement, and a peripheral route based on superficial heuristics.

For each modality (text and image), MCHPM extracts both central cues (deep semantic representations from BERT and VGG-16) and peripheral cues (surface-level features like readability and image clarity). Within each modality, central and peripheral cues are integrated through co-attention; the resulting text and image representations are then fused via a Gated Multimodal Unit (GMU) that adaptively weights the two modalities.

The model predicts a continuous review-helpfulness score, defined as log(1 + helpful_vote), as a regression target. Quantitative comparisons against unimodal and multimodal baselines on large-scale Amazon datasets are reported in Experimental Results.
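The label transform can be sketched in a few lines. This is a minimal illustration rather than the repository's code: the function names are hypothetical, and the natural logarithm is assumed because the text does not state the base.

```python
import math


def helpfulness_label(helpful_vote: int) -> float:
    """Regression target log(1 + helpful_vote); natural log is an assumption."""
    if helpful_vote < 0:
        raise ValueError("helpful_vote must be non-negative")
    return math.log1p(helpful_vote)


def votes_from_label(label: float) -> float:
    """Invert the transform to read a prediction back as a vote count."""
    return math.expm1(label)
```

Compressing the vote count this way reduces the influence of the few reviews with very large vote totals on a squared-error loss.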

Repository Structure

├── data/
│   ├── raw/                        # Source datasets — place {fname}.jsonl.gz here
│   ├── processed/                  # Pipeline parquet caches (labeled / cued)
│   └── review_images/              # Downloaded review images, grouped by dataset name
│
├── model/
│   ├── mchpm.py                    # MCHPM architecture, trainer, tester
│   ├── mchpm_architecture.png      # Architecture diagram
│   └── save/                       # Best checkpoint per dataset (best.pth)
│
├── src/
│   ├── config.yaml                 # Single source of truth for all hyperparameters
│   ├── data_processing.py          # DataProcessor pipeline + DataLoader factory
│   ├── text_cue_extractor.py       # BERT central + peripheral text cues
│   ├── image_cue_extractor.py      # VGG-16 central + peripheral image cues
│   ├── review_image_downloader.py  # Parallel review image downloader (cache-aware)
│   ├── text_processing.py          # Review-text cleaning and row filters
│   ├── path.py                     # Project path constants (auto-creates runtime folders)
│   └── utils.py                    # Generic helpers — I/O, metrics, seeding
│
├── main.py                         # Entry point: data preparation → train → test
├── requirements.txt
└── README.md

Model Description

MCHPM consists of three sequential modules. Cue extraction runs in src/text_cue_extractor.py and src/image_cue_extractor.py; the integration and fusion network is in model/mchpm.py. The full architecture is illustrated in model/mchpm_architecture.png.

1. Multi-Cue Extraction Module

Extracts central and peripheral cues from review text and images in parallel.

Central cues (deep semantic representations):

Text: BERT [CLS] embedding
Image: VGG-16 fc2 activation

Peripheral cues (surface-level features):

Text: polarity, subjectivity, readability, extremity
Image: brightness, contrast, saturation, edge intensity
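The four image peripheral cues can be approximated with plain NumPy. The formulas below are common definitions chosen for illustration (luminance mean and standard deviation, HSV saturation, mean gradient magnitude); this document does not give the exact formulas used in src/image_cue_extractor.py, and the function name is hypothetical.

```python
import numpy as np


def image_peripheral_cues(rgb: np.ndarray) -> dict:
    """Surface-level cues for an H x W x 3 image with values in [0, 1].

    All four definitions are assumptions made for this sketch.
    """
    rgb = np.asarray(rgb, dtype=np.float64)
    # Luminance using the ITU-R BT.601 channel weights.
    gray = rgb @ np.array([0.299, 0.587, 0.114])
    mx = rgb.max(axis=2)
    mn = rgb.min(axis=2)
    # HSV saturation: (max - min) / max, set to 0 for black pixels.
    sat = np.where(mx > 0, (mx - mn) / np.where(mx > 0, mx, 1.0), 0.0)
    # Finite-difference gradients along rows (gy) and columns (gx).
    gy, gx = np.gradient(gray)
    return {
        "brightness": float(gray.mean()),
        "contrast": float(gray.std()),
        "saturation": float(sat.mean()),
        "edge_intensity": float(np.hypot(gx, gy).mean()),
    }
```

A flat gray image yields zero contrast, saturation, and edge intensity, which makes the function easy to sanity-check before running it on real review photos.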

2. Cue-Integration Module

Within each modality, central and peripheral representations attend to each other through co-attention (CoAttentionBlock): central queries peripheral, peripheral queries central, and the two attended outputs are combined via element-wise multiplication. The same pattern is applied independently to the text and image sides, yielding modality-specific integrated vectors.
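The co-attention pattern described above can be sketched in NumPy. This is an illustration under stated assumptions, not the CoAttentionBlock in model/mchpm.py: it uses scaled dot-product attention without learned projections, treats each cue as a small set of same-width token vectors, and mean-pools each attended output so the two can be multiplied element-wise.

```python
import numpy as np


def softmax(x: np.ndarray, axis: int = -1) -> np.ndarray:
    x = x - x.max(axis=axis, keepdims=True)  # subtract the max for stability
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)


def attend(query: np.ndarray, context: np.ndarray) -> np.ndarray:
    """Scaled dot-product attention: each query row mixes the context rows."""
    scores = query @ context.T / np.sqrt(query.shape[-1])
    return softmax(scores, axis=-1) @ context


def co_attention(central: np.ndarray, peripheral: np.ndarray) -> np.ndarray:
    # Central queries peripheral, then pool over the query rows (assumption).
    c2p = attend(central, peripheral).mean(axis=0)
    # Peripheral queries central, pooled the same way.
    p2c = attend(peripheral, central).mean(axis=0)
    # Combine the two attended outputs by element-wise multiplication.
    return c2p * p2c


rng = np.random.default_rng(0)
central = rng.normal(size=(4, 8))      # hypothetical: 4 central tokens, width 8
peripheral = rng.normal(size=(3, 8))   # hypothetical: 3 peripheral tokens
integrated = co_attention(central, peripheral)
```

The same function would be called once for the text side and once for the image side, giving one integrated vector per modality.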

3. Multimodal Fusion Module

The integrated text and image vectors are passed through tanh projections, then fused by a Gated Multimodal Unit (MCHPM.gate_layer). A sigmoid gate, computed from the concatenated representations, adaptively weights the contribution of each modality. The fused vector is forwarded to an MLP regressor (MCHPM.regressor) that outputs the predicted helpfulness score.
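A gated fusion step of this kind can be sketched as follows. The sketch follows the commonly used GMU formulation, in which the gate is computed from the concatenated input vectors; whether MCHPM.gate_layer concatenates the inputs or the tanh projections is not stated here, so that choice, the weight shapes, and the omission of bias terms are all assumptions.

```python
import numpy as np


def sigmoid(x: np.ndarray) -> np.ndarray:
    return 1.0 / (1.0 + np.exp(-x))


def gmu(x_text, x_image, W_t, W_v, W_z):
    """Gated fusion of two modality vectors (biases omitted for brevity)."""
    h_t = np.tanh(W_t @ x_text)    # tanh projection of the text vector
    h_v = np.tanh(W_v @ x_image)   # tanh projection of the image vector
    # The gate sees both modalities and returns one weight per output unit.
    z = sigmoid(W_z @ np.concatenate([x_text, x_image]))
    # z weights the text side and (1 - z) weights the image side.
    return z * h_t + (1.0 - z) * h_v, z


rng = np.random.default_rng(0)
d_in, d_out = 6, 4                 # hypothetical sizes
x_text = rng.normal(size=d_in)
x_image = rng.normal(size=d_in)
W_t = rng.normal(size=(d_out, d_in))
W_v = rng.normal(size=(d_out, d_in))
W_z = rng.normal(size=(d_out, 2 * d_in))
fused, gate = gmu(x_text, x_image, W_t, W_v, W_z)
```

Because the gate is a vector rather than a scalar, each output dimension can lean on a different modality; inspecting the gate values shows which one dominated for a given review.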

How to Run

Configuration

All hyperparameters live in src/config.yaml — it is the single source of truth. Defaults reproduce the paper experiments.

A CUDA-capable GPU is recommended; main.py falls back to CPU with a warning if CUDA is unavailable. See requirements.txt for the GPU wheel and CPU-only setup.

End-to-end run:

conda create -n mchpm python=3.11
conda activate mchpm
pip install -r requirements.txt
python main.py

Data Preparation

Place the dataset as data/raw/{fname}.jsonl.gz where {fname} matches data.fname in config.yaml. The file is read as gzipped JSON-lines (one review object per line) — each line must carry the columns below, or the run aborts at load with a KeyError.

Column	Role
`user_id`	Reviewer id (non-null; also disambiguates downloaded image filenames).
`parent_asin`	Product id (non-null).
`timestamp`	Epoch-millisecond review time → `review_date`.
`text`	Review body → `raw_review`; cleaned and fed to BERT.
`images`	List of review-image URLs → `review_images`; rows with no image are dropped.
`helpful_vote`	Helpful-vote count; the label is `log(1 + helpful_vote)`, and zero / missing-vote rows are dropped.
`verified_purchase`	Boolean flag; only verified-purchase reviews are kept.
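The loading rules in the table can be sketched with the standard library alone. This is an illustration of the described behaviour, not the DataProcessor code: the function name is hypothetical, and the natural logarithm is assumed for the label.

```python
import gzip
import json
import math

REQUIRED = ("user_id", "parent_asin", "timestamp", "text",
            "images", "helpful_vote", "verified_purchase")


def iter_labeled_rows(path):
    """Yield reviews that pass the row filters, with the label attached."""
    with gzip.open(path, "rt", encoding="utf-8") as fh:
        for line in fh:
            if not line.strip():
                continue
            row = json.loads(line)
            missing = [k for k in REQUIRED if k not in row]
            if missing:
                # Mirrors the documented abort on a missing column.
                raise KeyError(f"missing columns: {missing}")
            if row["user_id"] is None or row["parent_asin"] is None:
                continue
            votes = row["helpful_vote"]
            # Keep verified purchases with an image and a non-zero vote count.
            if not row["verified_purchase"] or not row["images"] or not votes:
                continue
            yield {**row, "label": math.log1p(votes)}
```

Running it over a small hand-made file is a quick way to confirm that a new dataset carries every required field before starting the full pipeline.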

Any other columns are ignored. The pipeline writes two cache layers under data/processed/:

- {fname}_labeled.parquet: written after the row filters, text cleaning, and label construction.
  - Columns: user_id, parent_asin, timestamp, review_date, raw_review, clean_review, review_images, helpful_vote, label (the regression target log(1 + helpful_vote)).
- {fname}_cued.parquet: adds the downloaded image paths and the extracted cues.
  - Columns: the labeled columns + review_image_paths, review_text_central / review_text_peripheral (BERT semantic + readability/sentiment text cues), review_image_central / review_image_peripheral (VGG-16 semantic + clarity image cues).

To reuse externally extracted BERT/VGG features, save the data as {fname}_labeled.parquet with the review_text_central and/or review_image_central columns pre-populated. The pipeline will then skip BERT/VGG extraction and compute only the peripheral cues.

Re-runs and caching

On every python main.py, the pipeline resumes from the most-complete cache on disk, checking newest-first (cued → labeled → image folder) and falling through to the next-earliest stage. The train/test split is rebuilt fresh in memory each run, so changes to test_size, seed, or val_ratio take effect on the next run. To re-trigger an upstream stage, delete its parquet (or the data/review_images/{fname}/ folder for image re-downloads).
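The newest-first cache check can be pictured as a short chain of existence tests. The sketch below only illustrates the lookup order described above; the function name and the returned stage labels are hypothetical and are not taken from src/data_processing.py.

```python
from pathlib import Path


def resume_stage(fname: str, data_dir: Path) -> str:
    """Return the most complete cache found, checking newest first."""
    processed = data_dir / "processed"
    image_dir = data_dir / "review_images" / fname
    if (processed / f"{fname}_cued.parquet").exists():
        return "cued"        # cues already extracted
    if (processed / f"{fname}_labeled.parquet").exists():
        return "labeled"     # filters and labels done
    if image_dir.is_dir() and any(image_dir.iterdir()):
        return "images"      # only downloaded images are cached
    return "raw"             # nothing cached: start from data/raw/
```

Under this scheme, deleting {fname}_cued.parquet makes the next run fall back to the labeled cache, which matches the re-trigger advice above.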

Experimental Results

MCHPM was evaluated on two large-scale Amazon review datasets: Cell Phones & Accessories and Electronics. The results demonstrate that MCHPM consistently outperforms strong unimodal and multimodal baselines across all evaluation metrics, achieving average improvements of 3.864% in MAE, 4.061% in MSE, 2.172% in RMSE, and 6.349% in MAPE compared with the strongest benchmark model.

	Cell Phones & Accessories				Electronics
Model	MAE	MSE	RMSE	MAPE (%)	MAE	MSE	RMSE	MAPE (%)
LSTM	0.647	0.821	0.849	56.702	0.711	0.896	0.946	57.678
TNN	0.643	0.714	0.845	56.650	0.722	0.904	0.851	59.556
DMAF	0.625	0.691	0.836	53.139	0.697	0.880	0.939	55.198
CS-IMD	0.615	0.681	0.825	52.392	0.687	0.831	0.912	56.032
MFRHP	0.625	0.695	0.837	53.116	0.695	0.840	0.916	57.488
MCHPM (Proposed)	0.607	0.679	0.824	50.706	0.674	0.825	0.908	53.712
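For reference, the four metrics in the table can be computed as in the sketch below. It uses the standard definitions, with MAPE expressed in percent; the function name is hypothetical, and the repository's own metric helpers in src/utils.py may differ in detail.

```python
import math


def regression_metrics(y_true, y_pred):
    """MAE, MSE, RMSE, and MAPE (in percent) for paired sequences."""
    n = len(y_true)
    errors = [p - t for t, p in zip(y_true, y_pred)]
    mae = sum(abs(e) for e in errors) / n
    mse = sum(e * e for e in errors) / n
    # MAPE divides by the true value, so every label must be non-zero.
    mape = 100.0 * sum(abs(e) / abs(t) for e, t in zip(errors, y_true)) / n
    return {"MAE": mae, "MSE": mse, "RMSE": math.sqrt(mse), "MAPE": mape}
```

MAPE is well defined here because rows with zero helpful votes are dropped, so every label is at least log(2).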

Citation

If you use this repository in your research, please cite:

@article{LIM2026101586,
  title = {What makes a review helpful? A multimodal prediction model in e-commerce},
  author = {Heena Lim and Seonu Park and Qinglong Li and Xinzhe Li and Jaekyeong Kim},
  journal = {Electronic Commerce Research and Applications},
  volume = {76},
  pages = {101586},
  year = {2026},
  doi = {10.1016/j.elerap.2026.101586}  
}

Contact

For research inquiries or collaborations, please contact:

Seonu Park
Ph.D. Student, Department of Big Data Analytics
Kyung Hee University
Email: sunu0087@khu.ac.kr

Qinglong Li
Assistant Professor, Division of Computer Engineering
Hansung University
Email: leecy@hansung.ac.kr

Last updated: June 2026
