werdavpapeno/trustpilot

Trustpilot Scraper

Trustpilot Scraper collects structured company profiles and customer reviews from Trustpilot so you can analyze brand reputation at scale. It turns messy review pages into clean, ready-to-use data for reporting, dashboards, and automation. Use this Trustpilot scraper to monitor feedback trends, benchmark competitors, and stay ahead of customer sentiment.

Created by Bitbash, built to showcase our approach to Scraping and Automation!
If you are looking for a Trustpilot scraper, you've just found your team. Let's chat.

Introduction

Trustpilot Scraper is a data collection tool that fetches detailed company information and customer reviews from Trustpilot. It replaces the manual, time-consuming task of copying reviews one by one and delivers the results in a structured format you can plug into your analytics stack. This project is suited to marketers, data analysts, agencies, e-commerce brands, SaaS businesses, and anyone who needs ongoing visibility into public customer feedback.

Trustpilot Review Intelligence

  • Collects company profile metadata such as name, domain, category, location, and TrustScore.
  • Extracts individual reviews with ratings, titles, text, dates, and reviewer details.
  • Supports scraping multiple companies in a single run based on a configurable input list.
  • Normalizes timestamps, ratings, and counts for consistent downstream analysis.
  • Outputs data in structured formats suitable for databases, BI tools, spreadsheets, and automation workflows.
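
The normalization mentioned in the list above can be sketched as follows. This is a minimal illustration rather than the project's actual code: the helper names `parse_count`, `to_timestamp_ms`, and `clamp_rating` are hypothetical, and the sketch assumes that dates arrive as ISO 8601 strings and that counts use only thousands separators.

```python
from datetime import datetime, timezone


def parse_count(raw: str) -> int:
    """Turn a display count such as '1,523' into an integer (hypothetical helper)."""
    digits = "".join(ch for ch in raw if ch.isdigit())
    return int(digits) if digits else 0


def to_timestamp_ms(iso_text: str) -> int:
    """Convert an ISO 8601 date or datetime string to epoch milliseconds in UTC."""
    parsed = datetime.fromisoformat(iso_text.replace("Z", "+00:00"))
    if parsed.tzinfo is None:
        # Assumption: values without an explicit offset are treated as UTC.
        parsed = parsed.replace(tzinfo=timezone.utc)
    return int(parsed.timestamp() * 1000)


def clamp_rating(raw) -> int:
    """Coerce a rating to an integer and keep it inside the 1-5 star range."""
    return max(1, min(5, int(float(raw))))
```

Keeping every timestamp in one unit and every rating in one range means downstream queries can compare records from different companies without per-source special cases.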

Features

| Feature | Description |
| --- | --- |
| Company profile extraction | Captures company name, website, categories, TrustScore, review count, and location metadata. |
| Detailed review harvesting | Scrapes review rating, title, text, date, language, and reviewer information for deeper sentiment analysis. |
| Multi-company input | Accepts a list of company profile URLs or identifiers to process many brands in one batch. |
| Pagination handling | Automatically follows review pages to gather as many reviews as allowed by your configuration. |
| Flexible filtering | Configure limits by number of reviews, minimum rating, or date range to match your use case. |
| Clean structured output | Produces standardized JSON/CSV-friendly data for analytics, warehousing, or further automation. |
| Robust error handling | Skips broken entries gracefully and logs failures for later inspection. |
| Configurable delays | Adjustable request pacing to reduce the risk of being rate-limited and to improve stability. |
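
The filtering options listed above (review limit, minimum rating, date range) could be applied to already-parsed records roughly like this. The function name `filter_reviews` and its parameters are assumptions made for illustration; only the `reviewRating` and `reviewDate` field names come from this README.

```python
from datetime import date
from itertools import islice
from typing import Iterable, Iterator, Optional


def filter_reviews(
    reviews: Iterable[dict],
    min_rating: int = 1,
    since: Optional[date] = None,
    until: Optional[date] = None,
    limit: Optional[int] = None,
) -> Iterator[dict]:
    """Lazily yield reviews that pass the rating and date filters, up to `limit`."""

    def keep(review: dict) -> bool:
        posted = date.fromisoformat(review["reviewDate"])
        if review["reviewRating"] < min_rating:
            return False
        if since is not None and posted < since:
            return False
        if until is not None and posted > until:
            return False
        return True

    matching = (review for review in reviews if keep(review))
    # islice with limit=None passes everything through.
    return islice(matching, limit)


sample = [
    {"reviewRating": 5, "reviewDate": "2025-01-18"},
    {"reviewRating": 2, "reviewDate": "2025-01-20"},
    {"reviewRating": 4, "reviewDate": "2024-11-02"},
]
recent_positive = list(filter_reviews(sample, min_rating=4, since=date(2025, 1, 1)))
```

Here `recent_positive` contains only the first sample record, because the second fails the rating filter and the third falls before the start date.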

What Data This Scraper Extracts

| Field Name | Field Description |
| --- | --- |
| companyName | Display name of the company as shown on Trustpilot. |
| companySlug | URL slug or identifier used in the company profile URL. |
| companyUrl | Official website URL of the company, if available. |
| trustpilotUrl | Full Trustpilot company profile URL that was scraped. |
| category | Primary business category or sector associated with the company. |
| country | Country or region the company profile is associated with. |
| trustScore | Overall TrustScore rating for the company (e.g., 4.3). |
| ratingValue | Average star rating for the company on Trustpilot. |
| reviewCount | Total number of reviews listed for the company at the time of scraping. |
| claimedProfile | Boolean flag indicating whether the company claims its Trustpilot profile. |
| reviewId | Unique identifier or hash for the individual review. |
| reviewUrl | Direct URL to the specific review, when available. |
| reviewTitle | Short title or headline of the review. |
| reviewText | Full text content of the customer review. |
| reviewRating | Star rating given in the review (1–5). |
| reviewLanguage | Language code of the review content (e.g., en, de, fr). |
| reviewDate | Human-readable date when the review was posted. |
| reviewTimestamp | Machine-friendly timestamp representation of the review date. |
| reviewerName | Display name or alias of the reviewer. |
| reviewerCountry | Country or region associated with the reviewer, if shown. |
| reviewerProfileUrl | URL of the reviewer’s public profile page, where available. |
| verifiedOrder | Boolean indicating whether the review is marked as “Verified” for a genuine purchase. |
| companyReplyText | Text of the company’s public reply to the review, if present. |
| companyReplyDate | Date when the company responded to the review. |
| scrapedAt | Timestamp indicating when this record was scraped. |
| source | Static identifier for the data source (e.g., "trustpilot"). |
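
Because each record mixes company-level and review-level fields, a warehouse load may want them in separate tables. The sketch below shows one way to split a flat record. The helper `split_record` and the choice of `companySlug` as the join key are assumptions, not something this README prescribes. The field set is listed explicitly because prefixes alone are misleading: `reviewCount` describes the company, while `companyReplyText` belongs to the review.

```python
from typing import Dict, Tuple

# Company-level fields, taken from the field list above.
COMPANY_FIELDS = (
    "companyName", "companySlug", "companyUrl", "trustpilotUrl", "category",
    "country", "trustScore", "ratingValue", "reviewCount", "claimedProfile",
)


def split_record(record: Dict) -> Tuple[Dict, Dict]:
    """Split one flat record into a (company, review) pair of dictionaries."""
    company = {key: record[key] for key in COMPANY_FIELDS if key in record}
    review = {key: value for key, value in record.items() if key not in COMPANY_FIELDS}
    # Assumption: companySlug is unique per company, so it can serve as a join key.
    review["companySlug"] = record.get("companySlug")
    return company, review


company, review = split_record({
    "companyName": "Example Electronics",
    "companySlug": "example-electronics",
    "reviewCount": 1523,
    "reviewId": "63f9b749e2afec0012a9c0c1",
    "companyReplyText": "Thank you for your feedback!",
})
```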

Example Output

[
  {
    "companyName": "Example Electronics",
    "companySlug": "example-electronics",
    "companyUrl": "https://www.example-electronics.com",
    "trustpilotUrl": "https://www.trustpilot.com/review/example-electronics.com",
    "category": "Electronics Store",
    "country": "United States",
    "trustScore": 4.6,
    "ratingValue": 4.5,
    "reviewCount": 1523,
    "claimedProfile": true,
    "reviewId": "63f9b749e2afec0012a9c0c1",
    "reviewUrl": "https://www.trustpilot.com/reviews/63f9b749e2afec0012a9c0c1",
    "reviewTitle": "Fast delivery and great service",
    "reviewText": "Very happy with my order. Delivery was faster than expected and support was responsive.",
    "reviewRating": 5,
    "reviewLanguage": "en",
    "reviewDate": "2025-01-18",
    "reviewTimestamp": 1737168000000,
    "reviewerName": "Sarah M",
    "reviewerCountry": "US",
    "reviewerProfileUrl": "https://www.trustpilot.com/users/abcd1234",
    "verifiedOrder": true,
    "companyReplyText": "Thank you for your feedback! We are glad you enjoyed your experience.",
    "companyReplyDate": "2025-01-19",
    "scrapedAt": "2025-02-01T12:04:53Z",
    "source": "trustpilot"
  }
]

Directory Structure Tree

Trustpilot/
├── src/
│   ├── runner.py
│   ├── client/
│   │   ├── trustpilot_client.py
│   │   └── http_utils.py
│   ├── extractors/
│   │   ├── company_parser.py
│   │   └── review_parser.py
│   ├── outputs/
│   │   └── exporters.py
│   └── config/
│       └── settings.example.json
├── data/
│   ├── companies.sample.txt
│   └── sample_output.json
├── tests/
│   ├── __init__.py
│   └── test_parsers.py
├── requirements.txt
└── README.md

Use Cases

  • Agencies use it to monitor multiple client brands on Trustpilot, so they can generate regular reputation reports and track improvements over time.
  • E-commerce teams use it to centralize reviews from Trustpilot into their BI stack, so they can correlate customer sentiment with sales and product changes.
  • Product managers use it to identify recurring complaints or feature requests, so they can prioritize roadmap decisions with real customer data.
  • Customer success teams use it to detect negative reviews quickly, so they can intervene, respond, and prevent churn.
  • Competitive intelligence analysts use it to benchmark competitors’ TrustScores and review patterns, so they can position their brand more effectively.

FAQs

Q: What input does the scraper require?
A: You typically provide one or more Trustpilot company profile URLs or slugs. Optionally, you can configure limits on how many reviews to fetch per company and whether to include only recent reviews.
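
As an illustration of accepting either form of input, the sketch below normalizes each entry to a profile URL. It assumes the `https://www.trustpilot.com/review/<identifier>` pattern seen in the example output and a one-entry-per-line input list. The names `to_profile_url` and `load_targets` are hypothetical.

```python
from typing import Iterable, List
from urllib.parse import urlparse

# Assumption: profile URLs follow the pattern shown in the example output.
PROFILE_PREFIX = "https://www.trustpilot.com/review/"


def to_profile_url(entry: str) -> str:
    """Accept a full profile URL or a bare identifier and return a profile URL."""
    entry = entry.strip()
    if entry.startswith(("http://", "https://")):
        path = urlparse(entry).path  # e.g. "/review/example-electronics.com"
        identifier = path.rstrip("/").rsplit("/", 1)[-1]
    else:
        identifier = entry
    if not identifier:
        raise ValueError(f"cannot derive a company identifier from {entry!r}")
    return PROFILE_PREFIX + identifier


def load_targets(lines: Iterable[str]) -> List[str]:
    """Skip blank lines and '#' comments, normalize entries, and drop duplicates."""
    targets: List[str] = []
    for line in lines:
        text = line.strip()
        if not text or text.startswith("#"):
            continue
        url = to_profile_url(text)
        if url not in targets:
            targets.append(url)
    return targets
```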

Q: In what format is the data exported?
A: The scraper is designed to output structured JSON that can easily be converted into CSV, loaded into a database, or fed into analytics tools. Each record contains both company-level and review-level fields.
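
A conversion from the JSON records to CSV needs only the standard library. The following is a sketch under the assumption that records are flat dictionaries like the one in the example output. `records_to_csv` is a hypothetical helper, not part of the project.

```python
import csv
import io
from typing import Dict, List


def records_to_csv(records: List[Dict]) -> str:
    """Render flat records as CSV text, using the union of keys as the header."""
    if not records:
        return ""
    fieldnames: List[str] = []
    for record in records:
        for key in record:
            if key not in fieldnames:
                fieldnames.append(key)
    buffer = io.StringIO()
    # restval fills in fields that a particular record does not carry.
    writer = csv.DictWriter(buffer, fieldnames=fieldnames, restval="")
    writer.writeheader()
    writer.writerows(records)
    return buffer.getvalue()


csv_text = records_to_csv([
    {"companyName": "Example Electronics", "reviewRating": 5},
    {"companyName": "Example Electronics", "reviewRating": 4, "reviewLanguage": "en"},
])
```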

Q: Can I limit how many reviews are collected per company?
A: Yes. You can set configuration values for maximum pages or maximum review count per company, allowing you to run quick samples or full historical scrapes depending on your needs.
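
How page and review limits interact can be sketched with a simple pagination loop. The parameter names `max_pages` and `max_reviews` and the `fetch_page` callable are illustrative assumptions. The actual setting names in `settings.example.json` are not documented here.

```python
from typing import Callable, List, Optional


def collect_reviews(
    fetch_page: Callable[[int], List[dict]],
    max_pages: Optional[int] = None,
    max_reviews: Optional[int] = None,
) -> List[dict]:
    """Walk pages from 1 upward until a limit is hit or a page comes back empty."""
    collected: List[dict] = []
    page = 1
    while max_pages is None or page <= max_pages:
        batch = fetch_page(page)
        if not batch:
            break  # No more reviews for this company.
        collected.extend(batch)
        if max_reviews is not None and len(collected) >= max_reviews:
            return collected[:max_reviews]
        page += 1
    return collected


def fake_fetch(page: int) -> List[dict]:
    """Stand-in for a real page fetch: three pages of 20 reviews each."""
    if page > 3:
        return []
    return [{"reviewId": f"{page}-{i}"} for i in range(20)]


quick_sample = collect_reviews(fake_fetch, max_reviews=25)
```

With the stand-in fetcher, `quick_sample` stops partway through the second page, while a call without limits would walk all three pages.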

Q: Does this handle companies with a large number of reviews?
A: The scraper supports pagination and processes reviews in batches. For very large profiles, you can tune concurrency and delay settings to balance speed, stability, and resource usage.
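
The trade-off between concurrency and delay can be illustrated with a small thread pool. This sketch is not the project's implementation: `fetch_pages`, `concurrency`, and `delay_seconds` are assumed names, and the fetcher is passed in as a callable so that no particular HTTP library is implied. Failed pages are logged and skipped, in the spirit of the error handling described in the feature list.

```python
import logging
import time
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, Iterable, List

logger = logging.getLogger("trustpilot_sketch")


def fetch_pages(
    page_numbers: Iterable[int],
    fetch_page: Callable[[int], List[dict]],
    concurrency: int = 4,
    delay_seconds: float = 0.5,
) -> List[dict]:
    """Fetch pages with a bounded worker pool, pausing before each request."""

    def paced_fetch(page: int) -> List[dict]:
        time.sleep(delay_seconds)  # Simple pacing: each worker waits before fetching.
        try:
            return fetch_page(page)
        except Exception as error:
            # Log and skip the broken page instead of aborting the whole run.
            logger.warning("page %s failed: %s", page, error)
            return []

    reviews: List[dict] = []
    with ThreadPoolExecutor(max_workers=concurrency) as pool:
        # pool.map yields results in the order of page_numbers.
        for batch in pool.map(paced_fetch, page_numbers):
            reviews.extend(batch)
    return reviews
```

Raising `concurrency` or lowering `delay_seconds` increases throughput but also the chance of being rate-limited, so conservative values are the safer starting point.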


Performance Benchmarks and Results

Primary Metric: On a typical broadband connection, the scraper can process around 200–400 reviews per minute for a single company when using moderate concurrency and delays.

Reliability Metric: With conservative configuration, successful page fetch rates above 95% are achievable, even for larger company profiles with many review pages.

Efficiency Metric: Memory usage remains low because reviews are processed in streams or batches, allowing long runs across dozens of companies without exhausting system resources.
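
The batch processing behind this claim can be sketched as a generator that writes records in fixed-size chunks, so only one batch is held in memory at a time. The JSON Lines output and the helper names `batched` and `write_jsonl` are assumptions chosen for illustration. The project's own exporter may work differently.

```python
import io
import json
from itertools import islice
from typing import IO, Iterable, Iterator, List


def batched(items: Iterable, size: int) -> Iterator[List]:
    """Yield consecutive lists of at most `size` items from any iterable."""
    iterator = iter(items)
    while True:
        batch = list(islice(iterator, size))
        if not batch:
            return
        yield batch


def write_jsonl(records: Iterable[dict], handle: IO[str], batch_size: int = 100) -> int:
    """Write records as JSON Lines, one batch per write call, and return the count."""
    written = 0
    for batch in batched(records, batch_size):
        handle.write("".join(json.dumps(r, ensure_ascii=False) + "\n" for r in batch))
        written += len(batch)
    return written


buffer = io.StringIO()
count = write_jsonl(({"reviewId": str(i)} for i in range(5)), buffer, batch_size=2)
```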

Quality Metric: Under stable conditions, field completeness for core attributes (ratings, dates, text, company metadata) routinely exceeds 98%, making the dataset suitable for dashboards, machine learning models, and detailed reporting.
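
One way to measure field completeness on your own output is the share of core fields that are present and non-empty across all records. The sketch below assumes that definition and a particular choice of core fields. The README does not state how its figure was computed.

```python
from typing import Iterable, Sequence

# Assumption: these stand in for "ratings, dates, text, company metadata".
CORE_FIELDS = ("reviewRating", "reviewDate", "reviewText", "companyName")


def field_completeness(records: Iterable[dict], fields: Sequence[str] = CORE_FIELDS) -> float:
    """Return the fraction of (record, field) pairs that hold a non-empty value."""
    rows = list(records)
    if not rows or not fields:
        return 0.0
    filled = sum(
        1 for row in rows for name in fields if row.get(name) not in (None, "")
    )
    return filled / (len(rows) * len(fields))


score = field_completeness([
    {"reviewRating": 5, "reviewDate": "2025-01-18", "reviewText": "Great", "companyName": "Example Electronics"},
    {"reviewRating": 4, "reviewDate": "2025-01-19", "reviewText": "", "companyName": "Example Electronics"},
])
```

In this example seven of the eight core values are filled, so `score` is 0.875.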
