Social Media Data Reveals the True Influencers in Sochi

Athleticism, national pride, weird little collectable pins, and tweeting - all part of the Olympic experience. Following athletes at the games can help us feel we're actually there with our teams, celebrating every victory and getting a behind-the-scenes view of what goes on in the Village. But there's a larger opportunity with social data - to go beyond the easy-to-find metrics and discover the unknown influencers in the Winter Games community. Today we'll luge through Twitter data to find which athletes and sporting events are earning social gold at the Sochi Winter Games.

Say 'Social Sochi' Ten Times Fast

Over the next few weeks we'll all see who the winners are on the podium, but I wanted to uncover who the big competitors are this year via social data analysis. To do this, I created Twitter lists of almost 500 athletes across 12 countries and 14 sports that are participating in the Winter Games. I then wrote a series of Python scripts to pull data via Twitter’s public API (collected the week before the start of the Winter Games) and analyzed influence, subnetworks, and conversation patterns to give some color and insight to the upcoming games.

Who is really the most influential social athlete, beyond just the celebrities? What can relationships between athletes tell us about countries and sports? Who's feeling lucky? Today I’ll use data to attempt to answer most of these questions.

A mapping of 479 Winter Games athletes on Twitter and the 7,262 connections between them. Colors represent subgroups found within the network.


Athletes on Twitter - By Country, By Sport

There are lots of ways we can slice this data set, but let’s kick off with a country by country view.

The United States leads the pack with 97% of the athletes I reviewed having an active Twitter presence, followed by Australia and Canada. The Scandinavian countries (Norway, Sweden, and Finland) follow with around 60% of each team’s athletes on Twitter. These groups have produced above-average athletes who are also above-average socialites: Twitter usage over-indexes vs. each country's overall adoption rate. Takeaway: these athletes are social people.

But how does that break out by sport? Let's find out.

If you guessed that women’s hockey and bobsledding are the most social sports of the Winter Games...well, you didn’t, so let’s not even get into it. What’s going on here? We’ll dive into those two groups in a minute.

Athletes on Twitter - Individual Rankings

Let's try another angle and look at data points for each individual athlete. We'll start with the most obvious metric: followers.

Shaun White (1.3M+), Yuna Kim (700K+), Alex Ovechkin (650K+), Patrick Kane (400K+), and Lolo Jones (380K+) lead this popularity contest with the general public. But these metrics aren't that surprising (or really that interesting). Celebrities from past games and NHL players have the most followers...makes sense. We can do better than this.

To find influence at the games, let's examine only the network of Winter Games athletes and find out who is connected to the most other athletes. Championing this search is none other than the social gold medalist and indicator of influence and authority: Eigenvector Centrality.

Most Influential 2014 Winter Games Athletes

Kaillie Humphries leads all athletes by this measure: she is followed by 68 other Winter Games athletes, which gives her an eigenvector centrality of .202. Next is Jesse Lumsden, with 64 other athletes following him (eigenvector centrality of .190). So what’s so interesting about this? Well, eight of the ten most influential athletes at the Winter Games are bobsledders. The other two are skeleton or luge athletes.
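For anyone curious how a score like this can be calculated, here is a minimal sketch in plain Python. It is not the script used for this post: the account names and follow edges are made up, and the tool and normalization behind the scores reported above aren't stated, so the absolute values here won't be comparable. The idea is the same, though: an athlete scores highly when followed by other athletes who themselves score highly.

```python
import math


def eigenvector_centrality(follows, iterations=200, tol=1e-10):
    """Power-iteration sketch of eigenvector centrality on a follow graph.

    `follows` maps each account to the set of accounts it follows.
    An account's score is fed by the scores of the accounts following it.
    """
    nodes = sorted(set(follows) | {t for targets in follows.values() for t in targets})
    score = {node: 1.0 / len(nodes) for node in nodes}

    for _ in range(iterations):
        # Start from the previous scores (an identity shift that keeps the
        # iteration from oscillating), then add each follower's score.
        new = dict(score)
        for follower, targets in follows.items():
            for target in targets:
                new[target] += score[follower]

        # Rescale so the squared scores sum to 1.
        norm = math.sqrt(sum(value * value for value in new.values()))
        new = {node: value / norm for node, value in new.items()}

        delta = sum(abs(new[node] - score[node]) for node in nodes)
        score = new
        if delta < tol:
            break
    return score


# Hypothetical toy network: athlete_a is followed by everyone else.
follows = {
    "athlete_a": {"athlete_b"},
    "athlete_b": {"athlete_a", "athlete_c"},
    "athlete_c": {"athlete_a", "athlete_d"},
    "athlete_d": {"athlete_a"},
}

scores = eigenvector_centrality(follows)
for name, value in sorted(scores.items(), key=lambda item: -item[1]):
    print(f"{name}: {value:.3f}")
```

In this toy network athlete_a comes out on top, and athlete_b ranks second because the one account following athlete_b is the top scorer. That second effect is what separates this measure from a raw follower count.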

Wait, Did You Say 'Bobsled'? What Just Happened?

It turns out that bobsledders (both men and women) sure do love Twitter. And not only do the bobsledders connect to each other in large numbers, lots of other athletes connect to bobsledders as well. Kaillie Humphries is followed by 68 other athletes, 30 of whom are not bobsledders. Celebrity snowboarder Shaun White is only followed by 41 other Winter Games athletes total.

If we step back and look at the other data points we’ve seen today, this actually starts to make sense. As I showed above, bobsledding is one of the two top winter sports represented on Twitter by percentage of athletes, so it’s not crazy to think that athletes in those sports would have more authority than other groups.

So all the top influencers are bobsledders, but the medal for largest group of athletes on Twitter belongs to women’s hockey. So why aren't any athletes from women’s hockey on the top influencer list? Funny, I wondered the same thing.

Women's Hockey and the Impact of High Sub-Network Density

To demonstrate why, let’s look at the social networks for bobsled/luge/skeleton and women’s hockey. We’ll start with bobsled/luge/skeleton.

Twitter network map of 2014 Bobsled, Luge, and Skeleton Winter Games athletes

Contrary to what you might think, the image above is not my preschooler’s art project; it’s a visual representation of the Twitter connections within the bobsled/luge/skeleton community. The algorithm I used to analyze the subnetworks found two major groups: USA (blue) and Canada (red), with a few other countries lumped in with each. But the two groups, while distinct, have a ton of connections between them. Overall, bobsledders connect to a diverse group of athletes, both inside and outside their community. Bobsledders are connected heavily throughout skiing, speed skating, and figure skating communities.

So what does women’s hockey look like?

Twitter network map of 2014 Women’s Hockey Winter Games athletes

Just a quick glance at the women’s hockey subnetwork tells us that their community is very different. Although the number of athletes on Twitter is very high, the data shows that women's hockey players connect tightly with their own teammates, but not so much with others. In fact, the algorithm I used correctly lumped 100% of these athletes into their respective countries (green is Germany, purple is Sweden, etc.) based only on their Twitter connections. This is the only sport out of the fourteen that I analyzed that shows this exaggerated pattern of connectedness within each country’s team, vs connecting to other athletes.
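One way to put a number on "connects tightly with their own teammates" is sub-network density: the share of all possible follow relationships inside a group that actually exist. The sketch below uses the standard density formula for a directed graph, edges / (n * (n - 1)). It is an illustration only, with made-up team rosters and follow edges, and is not necessarily the calculation behind the maps in this post.

```python
def density(members, follows):
    """Directed density of the sub-network formed by `members`.

    `follows` maps each account to the set of accounts it follows.
    Only follow edges with both ends inside `members` are counted.
    """
    members = set(members)
    n = len(members)
    if n < 2:
        return 0.0
    edges = sum(
        1
        for source in members
        for target in follows.get(source, ())
        if target in members and target != source
    )
    return edges / (n * (n - 1))


# Hypothetical rosters: everyone follows every teammate,
# and only one follow edge crosses between the two teams.
team_x = {"x1", "x2", "x3"}
team_y = {"y1", "y2", "y3"}
everyone = team_x | team_y

follows = {
    "x1": {"x2", "x3"},
    "x2": {"x1", "x3"},
    "x3": {"x1", "x2", "y1"},
    "y1": {"y2", "y3"},
    "y2": {"y1", "y3"},
    "y3": {"y1", "y2"},
}

print("team_x density:", density(team_x, follows))
print("team_y density:", density(team_y, follows))
print("overall density:", round(density(everyone, follows), 3))
```

Each toy team is fully connected internally while the combined network is much sparser, which is the same shape as the women's hockey pattern described above.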

This explains why they aren't as influential as bobsledders - other influential athletes aren't connecting to them. That’s not to say that women’s hockey players aren’t active on Twitter - women's hockey players actually follow a higher percentage of athletes outside their sport vs. bobsledders, but the other athletes just aren't returning the favor. Bobsled/luge/skeleton athletes get 25% of their followers from other sports. Women's hockey? 14%. And because the authority calculations I use only count people following an individual (vs the other way around), bobsledders get the advantage.
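The "followers from other sports" figure is simple to compute once each athlete is tagged with a sport. Here is a rough sketch under a few assumptions: the handles, sports, and follow edges are invented, and every account in the data is assumed to be an athlete in the network. It counts each follow edge pointing at a given sport and reports the fraction that comes from outside it.

```python
def outside_follower_share(sport_of, follows, sport):
    """Fraction of follow edges into `sport` that come from other sports.

    `sport_of` maps each account to its sport.
    `follows` maps each account to the set of accounts it follows.
    """
    inside = outside = 0
    for follower, targets in follows.items():
        for target in targets:
            if sport_of.get(target) != sport:
                continue  # edge does not point at this sport
            if sport_of.get(follower) == sport:
                inside += 1
            else:
                outside += 1
    total = inside + outside
    return outside / total if total else 0.0


# Hypothetical accounts and follow edges, for illustration only.
sport_of = {
    "b1": "bobsled",
    "b2": "bobsled",
    "h1": "hockey",
    "h2": "hockey",
    "s1": "skiing",
}
follows = {
    "b1": {"b2", "h1"},
    "b2": {"b1"},
    "h1": {"h2", "b1", "b2"},
    "h2": {"h1", "b1"},
    "s1": {"b1"},
}

bobsled_share = outside_follower_share(sport_of, follows, "bobsled")
hockey_share = outside_follower_share(sport_of, follows, "hockey")
print(f"bobsled: {bobsled_share:.0%} of followers from other sports")
print(f"hockey: {hockey_share:.0%} of followers from other sports")
```

In the toy data the hockey accounts follow plenty of bobsledders, but few outsiders follow them back, so their outside share is lower.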

See how much more fun this is than just looking at who has the most followers? (Don’t answer that.)

So What Did We Learn?

The rewarding (and valuable) part of data analysis is in the exploration. If I had stopped with a high-level, easy-to-get metric this would have been a pretty boring post. But by bringing in additional data sources and views, we can uncover patterns and insights that would otherwise go unseen. When you’re looking to find patterns in your data, don’t be afraid to do some hard work and build out your data across multiple dimensions to expand your view of the world. It takes time, but sometimes it pays off in finding insights that no one else has seen before.

And for any brand managers out there with a few extra sponsorship dollars lying around, there’s still time to play some Moneyball this winter. Four of the most influential Winter Games athletes currently have no individual sponsorships. 

Based on their authority in the Winter Games social community, they might be worth the investment. All I ask in return is a trip to the games and a hotel room in Sochi (maybe with a working toilet?)

Chris Kerns

Chris Kerns has spent more than a decade defining digital strategy and is at the forefront of finding insights from digital data. He currently leads Analytics and Research at Spredfast. His research has appeared in The New York Times, Forbes, USA Today and AdWeek, among other publications.