Speaking the Same Social Language: World Cup Conversation Shows a Changing World

World Cup fever is sweeping the globe, and Spredfast is knee-deep in the action. From building out war rooms to helping to curate world-wide conversation around the games, social media is playing a huge part in this year’s World Cup.

As different cultures celebrate, commiserate, and type “GOOOAAALLL” on their mobile devices, they are leaving behind a trail of valuable social data - revealing micro and macro patterns for data-driven marketers. Today we’ll look at two different dimensions of social data that can help tell the international story of the World Cup - language and country.

Who’s excited? You’re excited.


Twitter World Cup Language: 2010 vs. 2014

Last week we worked with AdWeek to produce some numbers around World Cup language growth this year, and patterns surrounding the week before the World Cup vs. the first week of matches. Now we’ll take a macro view of the data and look at how language patterns have shifted from the last World Cup in 2010 to this year’s big, global event. Counting mentions of an event across multiple languages and years can be a trickier thing than it might seem at first, but we’ll walk through a couple of ways to look at the data and learn something along the way.


Overall World Cup Twitter Conversation Growth: 2010 vs. 2014

The first thing we encounter when gathering a conversation from a global audience is that not everyone says “World Cup” the same way. Sure, in English that works fine, but Twitter users speaking Russian (“Кубок мира”), Korean (“월드컵”), French (“Coupe du Monde”) and many other languages might beg to differ. So what we’ll need to do is track, for each language, mentions of both the native translation of “World Cup” and the English phrase “World Cup” itself. This doesn’t cover all the possible hashtags that might live out there as well, but it gives us a good base for tracking overall trends across cultures.
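To make the tracking idea concrete, here is a minimal Python sketch. It is an illustration only, not the tooling used for this analysis: the `NATIVE_TERMS` lookup, the function names, and the `(lang, text)` tweet format are all assumptions, and the lookup holds just the three translations quoted above.

```python
# Hypothetical lookup of native phrases for "World Cup".
# Only the three translations quoted in the post are included.
NATIVE_TERMS = {
    "ru": "Кубок мира",
    "ko": "월드컵",
    "fr": "Coupe du Monde",
}
ENGLISH_TERM = "World Cup"


def tracked_terms(lang):
    """Return the phrases to track for a language: native phrase plus the English one."""
    terms = [ENGLISH_TERM]
    native = NATIVE_TERMS.get(lang)
    if native and native != ENGLISH_TERM:
        terms.insert(0, native)
    return terms


def mentions_world_cup(text, lang):
    """True if the text contains any tracked phrase, ignoring case."""
    lowered = text.casefold()
    return any(term.casefold() in lowered for term in tracked_terms(lang))


def count_mentions(tweets):
    """Count matching tweets per language.

    tweets: iterable of (lang, text) pairs (an assumed input format).
    """
    counts = {}
    for lang, text in tweets:
        if mentions_world_cup(text, lang):
            counts[lang] = counts.get(lang, 0) + 1
    return counts
```

Simple substring matching like this will miss hashtags and misspellings, which is the same limitation noted above.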

The second piece we’ll need to address is that, well, the World Cup is still happening as I write this post. Because we can’t look at the conversation across the entire event yet, what we’ll look at is the lead-up conversation to the World Cup in both years (the two months prior to the event’s first match in both 2010 and 2014). This will help us gauge excitement for the event vs. fans reacting to the outcome of each individual match.

By looking at raw mentions across the top ten languages on Twitter from 2010 to 2014, we see that World Cup mentions have grown 614%. That’s great, but we can do much better than that, right?
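For anyone who wants to check the arithmetic behind a figure like that, percent growth is just the change divided by the starting value. The sketch below uses made-up mention counts (the raw counts are not published in this post) chosen so the result lands on 614%.

```python
def percent_growth(old, new):
    """Percent change from old to new, e.g. 100 -> 150 is 50.0."""
    if old <= 0:
        raise ValueError("old must be positive")
    return (new - old) * 100 / old


# Hypothetical counts for illustration only: 614% growth means
# the new total is 7.14 times the old one.
mentions_2010 = 1000
mentions_2014 = 7140
growth = percent_growth(mentions_2010, mentions_2014)
```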


World Cup Twitter Conversation Growth by Language: 2010 vs. 2014

If we start adding dimensions to our data analysis, we can get more insights than just the obvious “Twitter conversation is up since the last World Cup.” So let’s do that.  


While we’re seeing impressive increases in mentions across the majority of languages, Turkish and French have really taken off in 2014 World Cup conversation vs. the 2010 World Cup, with Russian and Spanish hot on their heels.

We could just stop here and call it a day. But there’s a problem with our analysis, and that has to do with the four-year timeline between the two events we are measuring. We can’t just assume that Twitter in 2010 was the same as in 2014 - if you think about it, a lot has changed in the past four years. Back in 2010, you probably thought that Miley Cyrus seemed like a nice girl and could come over to babysit anytime. Not so much anymore.

Things in 2014 are different, and our analysis needs to reflect that new reality.

World Cup Twitter Conversation Growth by Language, Normalized: 2010 vs. 2014

To get past that, we’ll normalize our data. Wait, don’t go away - don’t let the word “normalize” scare you, we’ll walk through this together. To understand the growth of conversation across languages, we also need to know how many more people are speaking each language on Twitter this year vs. 2010, and then apply those numbers to our growth analysis. We need to understand how the world of Twitter in 2010 compares to 2014, and how much Twitter has grown internationally in the past four years.

Luckily, this great article by the NYTimes (with data supplied from the good people at Gnip) does the hard work for us and tells us how each of our languages has grown in the past few years. So while we found above that Turkish, French, Spanish and Russian conversation around the World Cup were growing the most, those were actually four of the highest growth languages on Twitter in the past four years as well. So counting just the raw number of mentions is kind of cheating - what we need to do is show the percentage increase based on our new 2014 population.
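One way to read “percentage increase based on our new 2014 population” is to compare mentions per user in each year rather than raw mentions. The sketch below follows that reading, which is an assumption about the method rather than a description of the exact calculation used here. The language labels and all of the numbers are invented for illustration.

```python
def normalized_growth(mentions_old, mentions_new, users_old, users_new):
    """Percent growth in mentions per user between two periods."""
    rate_old = mentions_old / users_old
    rate_new = mentions_new / users_new
    return (rate_new - rate_old) * 100 / rate_old


# Invented figures: (mentions_2010, mentions_2014, users_2010, users_2014)
sample = {
    "language_a": (100, 1000, 10, 50),  # raw mentions up 900%, users up 5x
    "language_b": (100, 600, 10, 20),   # raw mentions up 500%, users up 2x
}

normalized = {lang: normalized_growth(*vals) for lang, vals in sample.items()}

# Rank languages from highest to lowest normalized growth.
ranking = sorted(normalized, key=normalized.get, reverse=True)
```

In this toy data, language_a wins on raw mentions but language_b comes out ahead once user growth is taken into account, which is the kind of reshuffle normalization can produce.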

When we normalize for Twitter user growth across languages, we see a new ranking.

So English is our new growth title-holder, but what does that mean? It means that not only are English-speaking Twitter users talking a lot more about the World Cup in 2014, those mentions are outpacing the English-speaking Twitter user growth more than any other language.

Even with the huge increase in international users on Twitter - making English a smaller percent of Twitter's overall user base - English conversation about the World Cup continues to be a contender. Despite these shifting demographics on the social network, English kept a steady share of World Cup language from 2010 to 2014 (holding the lead with around 70% of conversation in each year). That means English is now over-indexed for World Cup conversation, or the new “Man of the Match” when it comes to social World Cup conversation.

So What’s Up With Portuguese?

The normalized numbers are interesting, but in looking at these results something seems off. As you probably know, the World Cup is being held in Brazil this year, and you probably also know that Brazil’s official language is Portuguese. So why isn’t Portuguese on this list?

Because most of the Twitter conversation growth in Brazil wasn’t in Portuguese - it was in English.

Strange. That would make sense if this were data pulled while the World Cup matches were happening, as tourists headed to Brazil might skew the language patterns there, but this is data pulled in the two months leading up to each World Cup, so that shouldn’t have a huge impact on our numbers. Portuguese mentions of the World Cup in Brazil are up over 200% in 2014 vs. 2010, but the growth of English mentions in Brazil has been much more aggressive - almost 1,000% - between the two World Cups. Or said another way - in 2010, 10% of World Cup mentions in Brazil were in English. In 2014, that number is close to 30%.
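The growth rates and the share figures above fit together, which a quick back-of-the-envelope check shows. This sketch makes two simplifying assumptions that are not in the data: that English and Portuguese are the only two languages in the Brazilian conversation, and that the growth rates are exactly 200% and 1,000%.

```python
def share_after_growth(shares, growth_pct):
    """Apply a percent growth to each starting share and re-compute shares.

    shares: {lang: starting share or count}
    growth_pct: {lang: percent growth}, e.g. 200 means the value triples.
    """
    grown = {lang: shares[lang] * (1 + growth_pct[lang] / 100) for lang in shares}
    total = sum(grown.values())
    return {lang: value / total for lang, value in grown.items()}


# Simplified 2010 split in Brazil: 10% English, 90% Portuguese (assumed two-language split).
shares_2010 = {"en": 10, "pt": 90}
growth = {"en": 1000, "pt": 200}

shares_2014 = share_after_growth(shares_2010, growth)
# English goes from 10 to 110 units, Portuguese from 90 to 270,
# so English ends up at 110 / 380 of the total, or roughly 29%.
```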

Why is this? It could be that the preparations and marketing materials for a global World Cup have conditioned the Brazilian Twitter audience to become more English-friendly, or that the efforts by the Brazilian government to spread knowledge about the English language to prepare for a global community have worked. Either way, by not settling for a high-level metric and doing a deeper investigation, we’ve learned something about the local audience that we didn’t know before diving into the data. That means, in data terms, we get to advance to the next round.

Learn the Language of Data

As a marketer, high-level data points are a good start for a general view of what’s happening in your world. Coming into any conversation armed with at least a basic level of understanding around what’s happening with your audience or campaigns will make you smarter. But diving deeper into the data - looking at what different dimensions of data can tell you, and which direction the numbers are trending over time - will help tell the story of what’s going on with your campaign, event, or social audience. With data on your side you’ll find insights and, most importantly, make sure that everyone is speaking the same language.

Chris Kerns

Chris Kerns has spent more than a decade defining digital strategy and is at the forefront of finding insights from digital data. He currently leads Analytics and Research at Spredfast. His research has appeared in The New York Times, Forbes, USA Today and AdWeek, among other publications.