This repository contains the code and data for the paper "The European Film Industry as a Market for Lemons".
This paper looks at the impact of film criticism on predicting the commercial success of European films. We focus on the EU-funded film journal Cineuropa and compare it against the three largest industry press outlets. Using sentiment analysis to evaluate film reviews and a binary classification algorithm to assess commercial success, we find that Cineuropa has a positive bias. This bias leads the prediction model to overestimate the likelihood of commercial success of films regardless of their actual market performance. We interpret this as evidence that Cineuropa functions more as an advertisement platform than as a source of film criticism, creating misleading signals independent of a film's artistic or commercial merit. This paper makes three contributions: (1) it presents the first predictive study on the European film industry as a whole, (2) it introduces a novel method for classifying commercial success with an emphasis on model explainability, and (3) it adds to empirical research on signaling and asymmetric information.
1. Data
- This folder contains datasets used for the analysis.
- master0402024.xlsx: Master file with market data and sentiment scores for European films.
- classification_24072024.xlsx: Admission thresholds for classifying commercial success.
- master_2022report_24072024.xlsx: Master file with market data according to the method of the EAO 2023 Report and sentiment scores for European films.
- classification_2022report_24072024.xlsx: Admission thresholds for classifying commercial success according to the method of the 2023 Report of the European Audiovisual Observatory.
- films_withimdb09092024.xlsx: Market and festival data for European films retrieved from LUMIERE (lumiere.obs.coe.int) and IMDb (imdb.com).
- moviesOBS09092024.xlsx: Market data for all European films with theatrical release retrieved from LUMIERE (lumiere.obs.coe.int).
- ticketspercountry2021.xlsx: Average movie theater ticket prices per country retrieved from the 2022 report of the European Audiovisual Observatory.
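To illustrate how an admission threshold can turn market data into a binary success label, here is a minimal sketch. The threshold values, the assumption that thresholds vary by country, the fallback value, and the function name `is_commercial_success` are all hypothetical; the actual thresholds are defined in the classification files listed above.

```python
# Hypothetical admission thresholds (NOT the values from the repository's files).
# Assumption: thresholds are keyed by country of origin.
HYPOTHETICAL_THRESHOLDS = {"FR": 100_000, "DE": 80_000}
DEFAULT_THRESHOLD = 50_000  # assumed fallback for countries without an entry


def is_commercial_success(country: str, admissions: int) -> bool:
    """Return True if admissions reach the threshold assumed for the country."""
    threshold = HYPOTHETICAL_THRESHOLDS.get(country, DEFAULT_THRESHOLD)
    return admissions >= threshold


# Made-up example films, for illustration only.
films = [
    {"title": "Film A", "country": "FR", "admissions": 120_000},
    {"title": "Film B", "country": "DE", "admissions": 30_000},
]
labels = {f["title"]: is_commercial_success(f["country"], f["admissions"]) for f in films}
print(labels)
```

A binary label of this kind is what a classifier would then try to predict from the sentiment scores and market features.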
2. Scrapers
- This folder contains the scripts for data retrieval. The Jupyter notebook Scrape_and_Analyse_Market_Data.ipynb includes the script we used to retrieve data from LUMIERE and IMDb and to structure it for further analysis, i.e. classifying commercial success.
- The script for analyzing the sentiment of film reviews with SiEBERT and the script for explaining the sentiment analysis with Captum and Layer Integrated Gradients.
- The Jupyter notebook Master_File.ipynb, used for the economic analysis, including the XGBoost model and SHAP.
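The attribution idea behind the Layer Integrated Gradients step mentioned above can be illustrated without Captum or SiEBERT. The sketch below is a toy: it applies plain integrated gradients to a hypothetical two-feature logistic model with made-up weights, approximating the path integral with a midpoint Riemann sum. It does not reproduce the repository's script or Captum's API; it only shows that the attributions approximately sum to the difference between the model output at the input and at the baseline.

```python
import math

# Hypothetical toy model: logistic regression with made-up weights.
WEIGHTS = [2.0, -1.0]
BIAS = 0.5


def model(x):
    """Sigmoid of a linear score."""
    z = sum(w * xi for w, xi in zip(WEIGHTS, x)) + BIAS
    return 1.0 / (1.0 + math.exp(-z))


def gradient(x):
    """Analytic gradient of the sigmoid output with respect to each input."""
    p = model(x)
    return [p * (1.0 - p) * w for w in WEIGHTS]


def integrated_gradients(x, baseline, steps=1000):
    """Approximate integrated gradients along the straight path from baseline to x."""
    totals = [0.0] * len(x)
    for k in range(steps):
        alpha = (k + 0.5) / steps  # midpoint of the k-th sub-interval
        point = [b + alpha * (xi - b) for xi, b in zip(x, baseline)]
        for i, g in enumerate(gradient(point)):
            totals[i] += g
    # Scale the averaged gradients by the input-minus-baseline difference.
    return [(xi - b) * t / steps for xi, b, t in zip(x, baseline, totals)]


x = [1.0, 0.5]
baseline = [0.0, 0.0]
attributions = integrated_gradients(x, baseline)
# Completeness check: attributions should sum to model(x) - model(baseline).
gap = abs(sum(attributions) - (model(x) - model(baseline)))
print(attributions, gap)
```

In the layer variant, the same path integral is taken with respect to the output of a chosen layer (for text models, typically the embedding layer) rather than the raw input.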