Turning Online Review Data Into Consumer Insights

3 types of insights you can uncover by analyzing reviews

This is a guest post from Rob Sullivan of the social media marketing agency Social Chain. Follow Rob on LinkedIn for more social insights.

Reviews are everywhere on the web. From Amazon to TripAdvisor, you can find reviews on just about any category of product or service, in nearly any language. They can be a useful additional source of consumer insights alongside Twitter, Facebook, Instagram, blogs, forums, news articles, survey responses, call transcripts and other alternative datasets.

But customer reviews are still a frequently overlooked source of consumer insights, and they pose their own set of opportunities and challenges for brands that want a fuller picture of their customers’ opinions and preferences.

In this post, we’ll look at the unique value of online customer reviews, show how brands can use them to enhance consumer understanding, and walk through several real-world use cases for adding them to your consumer insights mix.

What to consider when analyzing reviews

Analyzing review data can help uncover powerful consumer insights, but there are a few caveats to be aware of. Academics have noted some of these issues in various pieces of research, most prominently a study by researchers at NYU, MIT and the Hebrew University of Jerusalem (Muchnik et al., 2013), which identified a powerful “social influence” bias in review data that creates a positive ratings bubble. Essentially, on most review sites you will find many people leaving 4- and 5-star reviews, far fewer leaving 2- and 3-star reviews, and an uptick in the 1-star range: the so-called J-curve. Max Woolf, a freelance data consultant, has visualised this well on his blog.

Image credit: Max Woolf

 

The researchers designed a simple randomised experiment on a social news-aggregation website: they manipulated comment and article votes by upvoting some far more than others, and discovered that positively manipulated scores were 30% more likely to reach the maximum score than those in the control group. Selection bias plays a role too: purchasers are less likely to admit they made a mistake when reviewing products they have bought, which helps produce the J-curve.
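The J-curve shape is easy to check for any set of star ratings. Below is a minimal sketch in Python with made-up counts (illustrative only, not data from the study or any real site):

```python
from collections import Counter

# Hypothetical star ratings for one product, shaped to illustrate
# the J-curve: many 5s, a trough at 2-3 stars, an uptick at 1 star.
ratings = [5] * 220 + [4] * 90 + [3] * 35 + [2] * 25 + [1] * 60

def rating_distribution(ratings):
    """Return the share of reviews at each star level, 1 through 5."""
    counts = Counter(ratings)
    total = len(ratings)
    return {star: counts.get(star, 0) / total for star in range(1, 6)}

dist = rating_distribution(ratings)
for star in range(1, 6):
    print(f"{star} stars: {dist[star]:.0%}")
```

If the distribution you compute shows 5 > 4 > 3 with a bump back up at 1 star, you are looking at the J-curve the researchers describe.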

Clearly, looking at ratings alone across sites would be misguided. Reviews do not capture the full scope of your audience and should never be used in isolation. Like most sources of social and web data, review data is biased in which audiences and opinions it captures, so it is best used in conjunction with other online consumer data to reach more accurate conclusions.

Understand the review source first

Different review sites have different purposes and what I refer to as sentiment or emotion “baselines”: the aggregate NPS and emotion of each review site, often computed across millions of reviews. We can see this below: whilst the Chinese site Jingdong elicits 70% positive sentiment, Trustpilot is considerably more neutral (+10%) and negative (+8%) on aggregate. The sentiment can even differ markedly between product categories on the same site.

This brings us to a crucial lesson on using reviews: compare products in the same or similar categories on the same website. Comparing reviews for Giorgio Armani on Macy’s with Calvin Klein on Sephora, for example, would not be useful. Likewise, comparing one perfume brand reviewed mostly on Amazon with another reviewed mostly on Sephora would not work well because of the different baselines involved.
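One simple way to reason about differing site baselines is to express a brand’s score relative to its site’s aggregate. The sketch below is an illustrative approach with hypothetical numbers, not a feature of any particular platform:

```python
# Hypothetical site-level NPS baselines (illustrative numbers only).
site_baseline_nps = {"amazon": 45.0, "sephora": 60.0}

def baseline_adjusted_nps(brand_nps, site):
    """Express a brand's NPS relative to its site's aggregate baseline."""
    return brand_nps - site_baseline_nps[site]

# A brand at 55 on Sephora sits below that site's baseline, while a
# brand at 50 on Amazon sits above its own site's baseline, even
# though the raw scores suggest the opposite ranking.
print(baseline_adjusted_nps(55.0, "sephora"))  # -5.0
print(baseline_adjusted_nps(50.0, "amazon"))   # 5.0
```

The adjusted numbers make the cross-site comparison meaningful, which the raw scores alone do not.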

Caveats aside, reviews contain many data points: the star rating, the date the review was left, on some sites further metrics (e.g. Nordstrom offers a slider scale for rating the quality of a clothing item or accessory) and, of course, the text of the review itself. Most of the interesting insights from reviews come from this text data.

Below are three examples of how you can get creative with this. In many ways, getting review data to shine is a combination of setting up good keyword strings in Crimson Hexagon’s ForSight platform to filter within monitors, and knowledge of text mining processes (document-term matrices, n-grams, latent Dirichlet allocation, etc.). I won’t delve too extensively into the latter here, but a basic understanding will help a researcher immensely.
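To make the document-term matrix idea concrete, here is a minimal, self-contained sketch in Python (review texts are invented for illustration):

```python
from collections import Counter

def document_term_matrix(docs):
    """Build a simple document-term matrix: one word-count vector per review."""
    vocab = sorted({word for doc in docs for word in doc.lower().split()})
    index = {word: i for i, word in enumerate(vocab)}
    matrix = []
    for doc in docs:
        row = [0] * len(vocab)
        for word in doc.lower().split():
            row[index[word]] += 1
        matrix.append(row)
    return vocab, matrix

reviews = ["great scent great price", "scent faded fast"]
vocab, dtm = document_term_matrix(reviews)
print(vocab)  # alphabetical vocabulary
print(dtm)    # one count vector per review
```

Each row counts how often each vocabulary word appears in one review; this matrix is the starting point for most of the text mining methods mentioned above, including topic models like latent Dirichlet allocation.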

Here are three examples that highlight the types of insights you can uncover by analyzing online review data:

NPS and emotion in Macy’s reviews

Crimson Hexagon has access to about 1.3 million reviews on Macy’s site. I decided to look at reviews in the beauty category on the Macy’s website, which included about fifteen different brands.

A core functionality of Crimson is its emotion measurement, a sophisticated classification of text content based on psychologist Paul Ekman’s categories of emotion. Emotion is a crucial metric for marketers to track in text, image and video because it is the heartbeat of selling. If people do not emote sufficiently about your product, if they don’t feel joyful, sad, surprised or even angry but are simply neutral, it will be harder to convert them into a new or repeat customer.

An interesting way to visualise the review data is in a bubble chart, with NPS (the percentage of positive reviews minus the percentage of negative reviews) on the x axis, emotion on the y axis and the volume of reviews for the brand as the size of the bubble. This is shown below.
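The inputs for such a chart can be derived from simple review counts. A minimal sketch with hypothetical per-brand numbers (not the Macy’s dataset):

```python
def nps(positive, negative, total):
    """Net score: percentage of positive reviews minus percentage of negative."""
    return 100 * (positive - negative) / total

# Hypothetical per-brand counts: (positive, negative, emotive, total reviews).
brands = {
    "Brand A": (700, 100, 450, 1000),
    "Brand B": (500, 250, 200, 1000),
}

for name, (pos, neg, emotive, total) in brands.items():
    print(name,
          "NPS:", nps(pos, neg, total),          # x axis
          "emotion %:", 100 * emotive / total,   # y axis
          "bubble size:", total)                 # review volume
```

Each brand then maps to one bubble: x position from NPS, y position from the share of emotive reviews, and bubble area from review volume.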

Here, the relationship between NPS and emotion is less important than the differences between brands. We can see that brands like Chanel and Marc Jacobs do well in this dataset, skewing high and to the right.

The two brands which stand out are Gucci and Clarisonic. The former has a modest NPS but relatively low emotion, meaning some of the brand’s products on Macy’s are not causing the demographic (majority US women over 25 years old) to feel as strongly about their purchases as other brands do. Meanwhile, Clarisonic has certain products which attract much more negativity than the other brands, hence the lower NPS (over 15% lower than Clinique’s).

So both of these brands may have marketing or positioning lessons to take from these reviews, especially Clarisonic, where the percentage difference is starker. Naturally, this could be a starting point for the brands to research whether other social and web sources exhibit the same patterns.

Deeper nuance in Amazon review analysis

Another example of using reviews is looking at the occurrence of what are referred to as “n-grams.” N-grams are contiguous sequences of a fixed number of words; for example, “excellent material” is a bi-gram, whilst “I adore this” is a tri-gram.

Choosing two major sports and athleisure brands, I retrieved about 32,000 reviews for one brand and 41,000 for the other from Amazon using Crimson. The reviews were then processed in R, with a fair amount of cleaning just to get at the raw text. Using a few R packages, I set up a function to find bi-grams for each brand. Given the difference in review volumes, it was then important to normalise the data. With a little plotting magic in R, the graph below was produced.
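The core of that pipeline, extracting bi-grams and normalising by review volume, is straightforward. Here it is sketched in Python rather than R, with invented review texts for illustration:

```python
import re
from collections import Counter

def bigrams(text):
    """Split review text into lowercase word pairs (bi-grams)."""
    words = re.findall(r"[a-z']+", text.lower())
    return list(zip(words, words[1:]))

def normalised_bigram_counts(reviews):
    """Bi-gram occurrences per 1,000 reviews, so brands with very
    different review volumes can be compared on the same scale."""
    counts = Counter()
    for review in reviews:
        counts.update(bigrams(review))
    return {bg: 1000 * n / len(reviews) for bg, n in counts.items()}

brand_x = ["excellent material and fast shipping", "fast shipping again"]
freq = normalised_bigram_counts(brand_x)
print(freq[("fast", "shipping")])  # occurrences per 1,000 reviews
```

Running this over each brand’s corpus gives two comparable frequency tables, which is exactly the kind of data the scatter graph below plots against each other.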

The graph shows the normalised occurrence of the bi-grams found in each set of reviews. Bi-grams which skew right and downwards are relatively more common for brand X, while those which skew left and upwards are more so for brand Y. So brand Y may look at these reviews, notice that people are buying its products on Amazon for husbands and sons, and let that inform some of its marketing approaches on the platform.

Fast shipping and fast delivery appear with roughly equal frequency for both brands. The approach can be extended to three- and four-word phrases, and stemming can help group phrases which are very similar (“fit good”, “fit perfect”). Using n-grams gets past simplistic approaches like word counts, which can miss a lot of the nuance across reviews, while still providing a quantifiable metric from unstructured text.

Places and moments of consumption in Influenster reviews

Focusing on the beer brands Corona and Blue Moon, I wanted to compare where people are claiming to consume the brands’ products. Using Crimson Hexagon’s opinion monitor I trained the algorithm to recognise about six different locations where people were claiming to drink either brand.

We can see that Corona is much more of a beach and poolside kind of beer, while Blue Moon is paired in consumers’ minds with bars and restaurants. Both brands see people consuming them on their patios and porches to the same extent. Corona’s big marketing effort in the US over the past few years has centered around lying on the beach, kicking back and having fun, as a BusinessWeek article from 2015 highlighted – “Corona isn’t selling a beer, it is selling the idea of having a beer on a beachside vacation.”

Thus, if similar patterns were spotted through image analytics, Google Trends, surveys and other text sources, as well as shipments and company financial data, the review data would seem to validate the success of the brand’s positioning in the US.

Conclusion

In sum, using reviews is really a question of a researcher’s imagination more than anything else. The above examples are just short snippets of what can be done, and there are no doubt countless other ways to use the data. The same text mining methods can be applied to forums, blogs and other longer-form content to equal effect.

For more on using a variety of sources in your consumer insights analysis, check out this guide: Guide to Social Data Sources for Brands and Analysts
