Trustpilot Scraper collects structured company profiles and customer reviews from Trustpilot so you can analyze brand reputation at scale. It turns messy review pages into clean, ready-to-use data for reporting, dashboards, and automation. Use this Trustpilot scraper to monitor feedback trends, benchmark competitors, and stay ahead of customer sentiment.
Created by Bitbash to showcase our approach to scraping and automation.
Trustpilot Scraper is a data collection tool designed to fetch detailed company information and customer reviews from Trustpilot. It solves the manual, time-consuming task of copying reviews one by one and converts them into a structured format you can plug into your analytics stack. This project is ideal for marketers, data analysts, agencies, e-commerce brands, SaaS businesses, and anyone who needs ongoing visibility into public customer feedback.
- Collects company profile metadata such as name, domain, category, location, and TrustScore.
- Extracts individual reviews with ratings, titles, text, dates, and reviewer details.
- Supports scraping multiple companies in a single run based on a configurable input list.
- Normalizes timestamps, ratings, and counts for consistent downstream analysis.
- Outputs data in structured formats suitable for databases, BI tools, spreadsheets, and automation workflows.
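
The normalization step listed above can be sketched roughly as follows. This is a minimal illustration, not the project's actual code: the helper names (`normalize_timestamp`, `normalize_rating`, `normalize_count`) and the raw input shapes (a label such as "Rated 4 out of 5 stars", a display count such as "1,523") are assumptions made for the example.

```python
import re
from datetime import datetime, timezone


def normalize_timestamp(value: str) -> int:
    """Convert an ISO 8601 date or datetime string to Unix epoch milliseconds."""
    # datetime.fromisoformat rejects a trailing "Z" before Python 3.11.
    dt = datetime.fromisoformat(value.replace("Z", "+00:00"))
    if dt.tzinfo is None:
        # Assumption: values without an offset are treated as UTC.
        dt = dt.replace(tzinfo=timezone.utc)
    return int(dt.timestamp() * 1000)


def normalize_rating(value) -> int:
    """Return the first digit 1-5 found in an int or a label string."""
    match = re.search(r"[1-5]", str(value))
    if match is None:
        raise ValueError(f"no rating found in {value!r}")
    return int(match.group())


def normalize_count(value) -> int:
    """Turn a display count such as '1,523' into an integer."""
    digits = re.sub(r"\D", "", str(value))
    if not digits:
        raise ValueError(f"no digits found in {value!r}")
    return int(digits)
```

With these helpers, `normalize_count("1,523")` yields `1523` and a date-only string such as `"2025-01-18"` becomes midnight UTC expressed in milliseconds.
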
| Feature | Description |
|---|---|
| Company profile extraction | Captures company name, website, categories, TrustScore, review count, and location metadata. |
| Detailed review harvesting | Scrapes review rating, title, text, date, language, and reviewer information for deeper sentiment analysis. |
| Multi-company input | Accepts a list of company profile URLs or identifiers to process many brands in one batch. |
| Pagination handling | Automatically follows review pages to gather as many reviews as allowed by your configuration. |
| Flexible filtering | Configure limits by number of reviews, minimum rating, or date range to match your use case. |
| Clean structured output | Produces standardized JSON/CSV-friendly data for analytics, warehousing, or further automation. |
| Robust error handling | Skips broken entries gracefully and logs failures for later inspection. |
| Configurable delays | Adjustable request pacing to reduce the risk of being rate-limited and to improve stability. |
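
To make the filtering options concrete, here is a hedged sketch of how limits by review count, minimum rating, and date range might be applied to already-parsed records. The `ReviewFilter` class and its option names are hypothetical; the real settings keys may differ. Only the record fields `reviewRating` and `reviewDate` come from the output schema.

```python
from dataclasses import dataclass
from datetime import date
from typing import Iterable, Iterator, Optional


@dataclass
class ReviewFilter:
    # Hypothetical option names, chosen for this example only.
    max_reviews: Optional[int] = None
    min_rating: int = 1
    date_from: Optional[date] = None
    date_to: Optional[date] = None


def apply_filter(reviews: Iterable[dict], flt: ReviewFilter) -> Iterator[dict]:
    """Yield reviews that pass the filter, stopping once max_reviews is reached."""
    kept = 0
    for review in reviews:
        if flt.max_reviews is not None and kept >= flt.max_reviews:
            return
        if review["reviewRating"] < flt.min_rating:
            continue
        # Assumes reviewDate is an ISO date string such as "2025-01-18".
        posted = date.fromisoformat(review["reviewDate"])
        if flt.date_from is not None and posted < flt.date_from:
            continue
        if flt.date_to is not None and posted > flt.date_to:
            continue
        kept += 1
        yield review
```

Because `apply_filter` is a generator, it can sit between the parser and the exporter without holding every review in memory.
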
| Field Name | Field Description |
|---|---|
| companyName | Display name of the company as shown on Trustpilot. |
| companySlug | URL slug or identifier used in the company profile URL. |
| companyUrl | Official website URL of the company, if available. |
| trustpilotUrl | Full Trustpilot company profile URL that was scraped. |
| category | Primary business category or sector associated with the company. |
| country | Country or region the company profile is associated with. |
| trustScore | Overall TrustScore rating for the company (e.g., 4.3). |
| ratingValue | Average star rating for the company on Trustpilot. |
| reviewCount | Total number of reviews listed for the company at the time of scraping. |
| claimedProfile | Boolean flag indicating whether the company has claimed its Trustpilot profile. |
| reviewId | Unique identifier or hash for the individual review. |
| reviewUrl | Direct URL to the specific review, when available. |
| reviewTitle | Short title or headline of the review. |
| reviewText | Full text content of the customer review. |
| reviewRating | Star rating given in the review (1–5). |
| reviewLanguage | Language code of the review content (e.g., en, de, fr). |
| reviewDate | Date the review was posted, as an ISO 8601 date string (e.g., 2025-01-18). |
| reviewTimestamp | Review date as a Unix epoch timestamp in milliseconds (e.g., 1737168000000). |
| reviewerName | Display name or alias of the reviewer. |
| reviewerCountry | Country or region associated with the reviewer, if shown. |
| reviewerProfileUrl | URL of the reviewer’s public profile page, where available. |
| verifiedOrder | Boolean indicating whether the review is marked as “Verified” for a genuine purchase. |
| companyReplyText | Text of the company’s public reply to the review, if present. |
| companyReplyDate | Date when the company responded to the review. |
| scrapedAt | Timestamp indicating when this record was scraped. |
| source | Static identifier for the data source (e.g., "trustpilot"). |

[
  {
    "companyName": "Example Electronics",
    "companySlug": "example-electronics",
    "companyUrl": "https://www.example-electronics.com",
    "trustpilotUrl": "https://www.trustpilot.com/review/example-electronics.com",
    "category": "Electronics Store",
    "country": "United States",
    "trustScore": 4.6,
    "ratingValue": 4.5,
    "reviewCount": 1523,
    "claimedProfile": true,
    "reviewId": "63f9b749e2afec0012a9c0c1",
    "reviewUrl": "https://www.trustpilot.com/reviews/63f9b749e2afec0012a9c0c1",
    "reviewTitle": "Fast delivery and great service",
    "reviewText": "Very happy with my order. Delivery was faster than expected and support was responsive.",
    "reviewRating": 5,
    "reviewLanguage": "en",
    "reviewDate": "2025-01-18",
    "reviewTimestamp": 1737168000000,
    "reviewerName": "Sarah M",
    "reviewerCountry": "US",
    "reviewerProfileUrl": "https://www.trustpilot.com/users/abcd1234",
    "verifiedOrder": true,
    "companyReplyText": "Thank you for your feedback! We are glad you enjoyed your experience.",
    "companyReplyDate": "2025-01-19",
    "scrapedAt": "2025-02-01T12:04:53Z",
    "source": "trustpilot"
  }
]
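
Since each record is a flat JSON object, converting the output to CSV needs only the standard library. The sketch below is one possible approach, not the project's own exporter; the function name `records_to_csv` is an assumption. It builds the header from the union of keys so that records with missing optional fields (for example, no company reply) still line up.

```python
import csv
import io
from typing import Dict, List


def records_to_csv(records: List[Dict]) -> str:
    """Serialize flat review records to CSV text, with a header row."""
    if not records:
        return ""
    # Collect column names in first-seen order across all records.
    fieldnames: List[str] = []
    for record in records:
        for key in record:
            if key not in fieldnames:
                fieldnames.append(key)
    buffer = io.StringIO()
    # Keys absent from a record are written as empty cells (restval default).
    writer = csv.DictWriter(buffer, fieldnames=fieldnames)
    writer.writeheader()
    writer.writerows(records)
    return buffer.getvalue()
```

To use it on a saved run, load the file with `json.load` and write the returned string to a `.csv` file opened with `newline=""`.
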

Trustpilot/
├── src/
│   ├── runner.py
│   ├── client/
│   │   ├── trustpilot_client.py
│   │   └── http_utils.py
│   ├── extractors/
│   │   ├── company_parser.py
│   │   └── review_parser.py
│   ├── outputs/
│   │   └── exporters.py
│   └── config/
│       └── settings.example.json
├── data/
│   ├── companies.sample.txt
│   └── sample_output.json
├── tests/
│   ├── __init__.py
│   └── test_parsers.py
├── requirements.txt
└── README.md
- Agencies use it to monitor multiple client brands on Trustpilot, so they can generate regular reputation reports and track improvements over time.
- E-commerce teams use it to centralize reviews from Trustpilot into their BI stack, so they can correlate customer sentiment with sales and product changes.
- Product managers use it to identify recurring complaints or feature requests, so they can prioritize roadmap decisions with real customer data.
- Customer success teams use it to detect negative reviews quickly, so they can intervene, respond, and prevent churn.
- Competitive intelligence analysts use it to benchmark competitors’ TrustScores and review patterns, so they can position their brand more effectively.
Q: What input does the scraper require? A: You typically provide one or more Trustpilot company profile URLs or slugs. Optionally, you can configure limits on how many reviews to fetch per company and whether to include only recent reviews.
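
As an illustration of accepting either form of input, the sketch below turns a profile URL or a bare slug into a canonical profile URL. The function names, the one-entry-per-line input format, and the `#` comment convention are assumptions for this example; the URL pattern follows the `trustpilotUrl` value shown in the sample output.

```python
from typing import Iterable, List
from urllib.parse import urlparse

BASE_URL = "https://www.trustpilot.com/review/"


def to_profile_url(entry: str) -> str:
    """Accept a full profile URL or a bare slug and return a canonical URL."""
    entry = entry.strip()
    if not entry:
        raise ValueError("empty input entry")
    if "://" in entry:
        # Expect a path such as /review/example-electronics.com
        parts = [p for p in urlparse(entry).path.split("/") if p]
        if len(parts) < 2 or parts[0] != "review":
            raise ValueError(f"not a company profile URL: {entry!r}")
        slug = parts[1]
    else:
        slug = entry.strip("/")
    return BASE_URL + slug.lower()


def load_inputs(lines: Iterable[str]) -> List[str]:
    """Skip blank lines and '#' comments; de-duplicate while keeping order."""
    seen = {}
    for line in lines:
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        seen.setdefault(to_profile_url(line), None)
    return list(seen)
```

Normalizing up front means a company listed once as a slug and once as a URL is only scraped a single time.
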
Q: In what format is the data exported? A: The scraper is designed to output structured JSON that can easily be converted into CSV, loaded into a database, or fed into analytics tools. Each record contains both company-level and review-level fields.
Q: Can I limit how many reviews are collected per company? A: Yes. You can set configuration values for maximum pages or maximum review count per company, allowing you to run quick samples or full historical scrapes depending on your needs.
Q: Does this handle companies with a large number of reviews? A: The scraper supports pagination and processes reviews in batches. For very large profiles, you can tune concurrency and delay settings to balance speed, stability, and resource usage.
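
The page and review caps described in the answers above, together with request pacing, could be combined in a loop like the following. This is a sketch under stated assumptions: `fetch_page` is a hypothetical callable that returns the parsed reviews for a given page number (an empty list when there are no more pages), and the parameter names are illustrative rather than the scraper's real settings. Concurrency is left out to keep the example short.

```python
import time
from typing import Callable, Iterator, List, Optional


def iter_reviews(
    fetch_page: Callable[[int], List[dict]],
    max_pages: Optional[int] = None,
    max_reviews: Optional[int] = None,
    delay_seconds: float = 1.0,
    sleep: Callable[[float], None] = time.sleep,
) -> Iterator[dict]:
    """Yield reviews page by page until a cap is hit or pages run out."""
    page = 1
    yielded = 0
    while True:
        if max_pages is not None and page > max_pages:
            return
        if max_reviews is not None and yielded >= max_reviews:
            return
        reviews = fetch_page(page)
        if not reviews:
            return  # an empty page is treated as the end of the listing
        for review in reviews:
            yield review
            yielded += 1
            if max_reviews is not None and yielded >= max_reviews:
                return  # stop before requesting another page
        page += 1
        sleep(delay_seconds)  # pause between page requests
```

Passing `sleep` as a parameter keeps the loop easy to test: a no-op can be injected so tests run without real delays.
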
Primary Metric: On a typical broadband connection, the scraper can process around 200–400 reviews per minute for a single company when using moderate concurrency and delays.
Reliability Metric: With conservative configuration, successful page fetch rates above 95% are achievable, even for larger company profiles with many review pages.
Efficiency Metric: Memory usage remains low because reviews are processed in streams or batches, allowing long runs across dozens of companies without exhausting system resources.
Quality Metric: Under stable conditions, field completeness for core attributes (ratings, dates, text, company metadata) routinely exceeds 98%, making the dataset suitable for dashboards, machine learning models, and detailed reporting.
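
The low memory footprint mentioned under the efficiency metric comes from never holding a full run in memory. One way to get that behavior is sketched below: reviews are consumed from any iterable in fixed-size batches and appended to a JSON Lines stream. The JSON Lines format and the names `batched` and `write_jsonl` are assumptions made for this example, not a description of the project's exporter.

```python
import io
import json
from itertools import islice
from typing import IO, Iterable, Iterator, List


def batched(items: Iterable, size: int) -> Iterator[List]:
    """Yield consecutive lists of at most `size` items from an iterable."""
    if size < 1:
        raise ValueError("size must be at least 1")
    iterator = iter(items)
    while True:
        batch = list(islice(iterator, size))
        if not batch:
            return
        yield batch


def write_jsonl(reviews: Iterable[dict], out: IO[str], batch_size: int = 100) -> int:
    """Write reviews as one JSON object per line; return the number written."""
    written = 0
    for batch in batched(reviews, batch_size):
        out.write("".join(json.dumps(r, ensure_ascii=False) + "\n" for r in batch))
        written += len(batch)
    return written


# Usage: only one batch is held in memory at a time.
buffer = io.StringIO()
count = write_jsonl(({"reviewId": str(i)} for i in range(5)), buffer, batch_size=2)
```

In a real run, `out` would be a file opened in append mode, so a long scrape across many companies can be interrupted without losing the records already written.
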
