The classic "soda" vs. "pop" vs. "coke" debate has long drawn invisible borders across the United States. The geographic variation inspired Edwin Chen, a data scientist at Twitter, to use information from the popular social media site to create a map showing regional differences in how we refer to carbonated beverages.
"I've always had a big interest in language and linguistics, and the Twitter dataset is pure gold for anyone who wants to study both what people are saying and how they're saying it," Chen told us.
The math maven describes how he created the visualization on his personal blog:
To make this map, I sampled geo-tagged tweets containing the words “soda”, “pop”, or “coke”, performed some state-of-the-art NLP technology to ensure the tweets were soft drink related (e.g., the tweets had to contain “drink soda” or “drink a pop”), and tried to filter out coke tweets that were specifically about the Coke brand (e.g., Coke Zero).
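For readers curious what that kind of filtering might look like in practice, here is a minimal Python sketch. It is not Chen's actual code: the regular expressions, the function names `classify` and `dominant_term_by_state`, and the `(state, text)` input format are all assumptions made for illustration, and simple pattern matching stands in for whatever NLP he really used.

```python
import re
from collections import Counter, defaultdict

# Hypothetical pattern: the term must follow a form of "drink",
# e.g. "drink soda" or "drink a pop". This is an assumption, not Chen's rule set.
DRINK_CONTEXT = re.compile(
    r"\bdrink(?:ing|s)?\s+(?:an?\s+|some\s+|my\s+)?(soda|pop|coke)\b",
    re.IGNORECASE,
)

# Hypothetical brand filter: drop tweets that name a specific Coke product.
BRAND_ONLY = re.compile(r"\bcoke\s+(?:zero|classic)\b|\bdiet\s+coke\b", re.IGNORECASE)


def classify(text):
    """Return 'soda', 'pop', or 'coke' for a soft-drink tweet, else None."""
    if BRAND_ONLY.search(text):
        return None
    match = DRINK_CONTEXT.search(text)
    return match.group(1).lower() if match else None


def dominant_term_by_state(tweets):
    """Given (state, text) pairs, return the most common term per state."""
    counts = defaultdict(Counter)
    for state, text in tweets:
        term = classify(text)
        if term is not None:
            counts[state][term] += 1
    return {state: c.most_common(1)[0][0] for state, c in counts.items()}
```

Under these assumptions, a tweet like "I could drink a pop right now" counts toward "pop", while "pop music is great" and "Gonna drink a Coke Zero" are both discarded. Coloring each state by its dominant term would then give a rough version of the map.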
The findings are consistent with previous "soft drink" maps:
- Coke (red): the brand name dominates the South (the soft drink was originally manufactured in Atlanta)
- Pop (green): rules most of the Midwest
- Soda (blue): dominates in the Northeast and California