Five months ago, I was just a run-of-the-mill social media user, below average in posting frequency and above average in using the word “haggard.” I used Facebook for updating my cover photo to screenshots of tweets from Bernie Sanders, Jaden Smith, and Kanye West; Twitter for venting about politics to my few followers; and Instagram for posting photos that are not of trash.
Since joining Crimson Hexagon, I have found myself on the other side of the glass, observing social media from the outside and deriving insights from those observations. Having worked with structured datasets such as the World Bank’s global development data, I was not new to data analysis, but what could have prepared me to analyze social media content like the posts I shared?
To approach social media data analysis, I adopted a new framework of thinking and adapted skills gained from prior analytical experience, in addition to learning new skills like Boolean logic to retrieve relevant social media posts.
Social media is oftentimes blasted as a convenient tool used by brunchstagram-obsessed millennials for self-aggrandizement, but its users span a wide range of demographics, and people use it for a variety of reasons: discussing what they want to buy, reviewing recently purchased products, interacting with brands, reacting to current events, sharing memes, complaining about life, and more. You can probably find any offline conversation topic on social media. Because social media is essentially unfiltered conversation condensed for the web, it is a valuable research tool.
The world of social media data is expansive, but it is not unnavigable. Here are a few things I’ve learned.
Approach social media data with flexibility and curiosity
The inanities of social media never end and sifting through a high volume of unstructured posts can raise blood pressure. Looking at social media data can be similar to dissecting rogue, free-form responses on a survey. How does an analyst make sense of it? Approaching social media data like a researcher approaching survey data can help you effectively cut through the clutter.
One of the most popular datasets for socioeconomic research, the National Longitudinal Survey of Youth 1979, follows a nationally representative sample of American youth born between 1957 and 1964. A dataset like this contains multiple variables that fall into the following categories: education, employment, geography, parents, marriage, income, health, attitudes, and crime. Once you have selected the variables of interest and cleaned the data, the analysis options are endless. You can run summary statistics, build regression models, and visualize the data in charts.
Analyzing social media data takes a similar form. Start with a research question and structure your analysis to answer it. As with analyzing more conventional datasets, social media analysis is not always clear-cut. It takes exploring and testing to develop topics and variables of interest. It takes flexibility in switching directions based on what the data shows. Let’s say you are unearthing insights about peer-to-peer carsharing compared to ridesharing and you discover that those discussing ridesharing also tend to discuss late-night activities, such as dancing or going to bars. This may lead to researching something simple like the demographic differences between people discussing peer-to-peer carsharing and ridesharing on social media. The variables of interest are the age and gender mix of the two groups. Once you have the demographic data for the two groups, you can analyze the differences. Inquisitive and investigative traits are necessary for any data analysis, but especially for social media data.
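To make the carsharing versus ridesharing comparison concrete, here is a minimal Python sketch of how the age mix of the two groups could be compared. The records, field names, and helper functions are all hypothetical and made up for illustration; they are not real findings and not the output of any particular tool.

```python
from collections import Counter

# Hypothetical, hand-made records: each post is tagged with the topic it
# discusses and the author's age group and gender. Not real data.
posts = [
    {"topic": "ridesharing", "age_group": "18-24", "gender": "female"},
    {"topic": "ridesharing", "age_group": "18-24", "gender": "male"},
    {"topic": "ridesharing", "age_group": "18-24", "gender": "female"},
    {"topic": "ridesharing", "age_group": "35-44", "gender": "male"},
    {"topic": "carsharing", "age_group": "18-24", "gender": "male"},
    {"topic": "carsharing", "age_group": "35-44", "gender": "male"},
    {"topic": "carsharing", "age_group": "35-44", "gender": "female"},
    {"topic": "carsharing", "age_group": "35-44", "gender": "male"},
]


def demographic_mix(records, topic, field):
    """Return the share of each value of `field` among posts on `topic`."""
    values = [r[field] for r in records if r["topic"] == topic]
    if not values:
        return {}
    counts = Counter(values)
    return {value: count / len(values) for value, count in counts.items()}


def mix_difference(mix_a, mix_b):
    """Return share in mix_a minus share in mix_b for every category."""
    categories = set(mix_a) | set(mix_b)
    return {c: mix_a.get(c, 0.0) - mix_b.get(c, 0.0) for c in categories}


ridesharing_age = demographic_mix(posts, "ridesharing", "age_group")
carsharing_age = demographic_mix(posts, "carsharing", "age_group")
age_gap = mix_difference(ridesharing_age, carsharing_age)

for age_group in sorted(age_gap):
    print(f"{age_group}: {age_gap[age_group]:+.0%} (ridesharing vs. carsharing)")
```

Passing "gender" instead of "age_group" gives the gender mix of each group in the same way. With real data, the same comparison would usually be followed by a check that the gap is larger than what sampling noise alone would produce.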
Connect social media data to non-social media data
Social media data can unveil many insights, but those insights do not exist in a vacuum. Since people use social media to discuss subjects relevant to their current existence on earth, their discussions are connected to data that shows the results of their actions — examples include Super Bowl analysis incorporating ad revenue, Oscars and cult classic analyses with box office data, and Zika analysis with news developments.
Whether findings from social media data align or contrast with findings from non-social media data, incorporating social media data unlocks a new dimension.
For example, when looking at people discussing expensive fares for taxis, Uber, and Lyft, it is clear that in the major cities we looked at, people discuss high taxi fares more often than high Uber or Lyft fares. Incorporating actual fare data for taxis, Uber, and Lyft validates consumer concerns.
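As a rough illustration of lining up social data with non-social data, the Python sketch below puts the share of fare complaints for each service next to an average fare and checks whether the two rankings agree. Every number is an invented placeholder and every name is hypothetical; none of it comes from the analysis described above.

```python
# All numbers below are invented placeholders for illustration only.
# Social data: posts complaining about high fares, and all posts, per service.
complaint_posts = {"taxi": 120, "uber": 80, "lyft": 40}
total_posts = {"taxi": 400, "uber": 800, "lyft": 500}

# Non-social data: hypothetical average fare for the same trip, in USD.
average_fare = {"taxi": 18.0, "uber": 14.0, "lyft": 13.0}


def complaint_share(service):
    """Share of a service's posts that complain about high fares."""
    return complaint_posts[service] / total_posts[service]


# Join the two sources on the service name.
combined = [
    {
        "service": service,
        "complaint_share": complaint_share(service),
        "average_fare": average_fare[service],
    }
    for service in sorted(average_fare)
]

# Rank services from highest to lowest on each measure.
by_complaints = sorted(average_fare, key=complaint_share, reverse=True)
by_fare = sorted(average_fare, key=average_fare.get, reverse=True)
rankings_agree = by_complaints == by_fare

for row in combined:
    print(
        f"{row['service']}: {row['complaint_share']:.0%} of posts mention "
        f"high fares, average fare ${row['average_fare']:.2f}"
    )
print("Complaint ranking matches fare ranking:", rankings_agree)
```

When the rankings agree, the fare data supports what people say online; when they do not, the mismatch is itself a finding worth investigating.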
The objective of data visualization is to present findings clearly and cohesively, not to overwhelm with crowded graphics that detract from the data. For example, visualizing the most popular Pokemon by state with a map can be more engaging than providing a table containing the top six Pokemon and the names of the states in which they are most popular.
With so much interesting data at your disposal, it can be tempting to try to consolidate all of it into one visualization. The tricky part is figuring out a focal point for the visualization and eliminating supplemental metrics. For example, the Pokemon popularity by state map could have included additional information about each state’s sentiment about Pokemon, displayed as another layer on the map with positive and negative sentiment. However, the main question we wanted to answer was whether there are differences in Pokemon preferences by state, not whether there are differences in sentiment about Pokemon. This may seem rather intuitive, but understanding the story you want to tell with your data can guide how you create a visualization.
Understand that social media is a matter of humanity
The data we observe are not simply numbers. They tell a story about why humans behave the way they do. Why are consumers afraid of self-driving cars? How come millennials don’t shop at Whole Foods? Why is The Babadook likely to become the next cult classic?
Social media is an extension of how people perceive their place in the world. If you posted a photo of the “I Voted” sticker after voting in the 2016 presidential election, it indicates that you cared about the fact that you voted. If you tweeted to your local public transportation provider to gripe about crumbling infrastructure, it means you cared about how local government affects your daily commute. Because social media is a way to connect with others and reflect on topics, it has seamlessly become a part of our culture. Social media is the digital extension of our lives, and brands should incorporate this data into their analysis to better understand their customers. Traditional surveys, for example, may ask customers whether or not they’re satisfied with a product. But social media provides so much more: it helps you understand the why behind the satisfaction or dissatisfaction with more honest, unsolicited opinions that can be analyzed instantly. Getting to those insights can be challenging, though, especially if you are new to social media data.
Starting to conduct social media analysis is akin to entering search terms or questions in Google and then finding what you’re looking for, except that the results you receive originate on platforms like Facebook, Twitter, Instagram, Tumblr, blogs, and forums. With the approaches outlined above, deriving insights from social media data hopefully feels less cumbersome and more intuitive.
To read more about the analyses we’re conducting at Crimson Hexagon, see our blog post on The Story Behind the Cult Classic Ratio.