Wharton Customer Analytics

Research Paper Series

Large-Scale Cross-Category Analysis of Consumer Review Content and Sales Conversion Leveraging Deep Learning

How consumers use review content in their decision making has remained a black box due to the labor-intensive nature of extracting content from review text and the lack of data on review-reading behavior. In this study, we overcome this challenge by applying deep-learning-based natural language processing to a comprehensive dataset that tracks individual-level review reading, searching, and purchasing behaviors on an e-commerce site to investigate how consumers use review content. We extract quality and price content from more than 500,000 reviews spanning nearly 600 product categories. We achieve two objectives. First, we describe consumers’ review content reading behaviors. We find that although consumers do not read review content all the time, they do rely on review content for products that are expensive or of uncertain quality. Second, we quantify the causal impact of the content of read reviews on sales. We use a regression discontinuity in time design and leverage the variation in the review content seen by consumers due to newly added reviews. To extract content, we use supervised deep learning techniques that identify six theory-driven content dimensions. We find that aesthetics and price content in the reviews significantly affect conversion across almost all product categories. Review content has a higher impact on sales when the average rating is higher and the variance of ratings is lower. Consumers depend more on review content when the market is more competitive or immature, or when brand information is not easily accessible. A counterfactual simulation suggests that reordering reviews based on content can have the same effect as a 1.6% price cut for boosting conversion.

Keywords: Consumer Purchase Journey, Product Reviews, Review Content, Deep Learning, Content Engineering, Economic Impact of Text