What it means to be a data-driven city in a big data world
“More data has been created in the past two years than in the entire previous history of the human race.” On social media platforms, websites, forums, streaming services and the like, the amount of data keeps growing each year. The opportunity in all this data is that so much of the content is user-generated, and in the case of cities, citizen-generated. Cities are realizing that citizen feedback happens everywhere on the web: in Facebook groups dedicated to a specific neighborhood, issue or protest; on local news outlets’ social media channels and websites; on the social media channels of local organizations, businesses and NGOs; on public Twitter feeds; and, of course, on city-managed social media channels. This abundance of data introduces a huge challenge: sorting out the “ham” from the “spam” is no longer a trivial task. Moreover, just monitoring city-managed social media channels, a feat of its own, is no longer enough. So how do cities find the relevant data in such a huge haystack? In this blog post we’ll explore some ideas.
Choose your data sources wisely
Take great care in selecting which data sources to pay attention to. A city can have dozens or even hundreds of data sources, and this is exactly where the opportunity for cities lies and why there is so much potential in the data out there. While citizens may talk a lot on social media, they won’t always be talking about the core city issues important for decision making. But the same social media data can bring real gold: concerns about public safety, complaints about clogged roads, recommendations about schools and after-school activities for kids, community service projects, and the list goes on. You have to know which sources might contain the gold.
“80% of the data collected on social media is relevant to decision makers in the city.”
Understand what you’re not interested in
Once you’ve identified relevant sources, sorting out the “ham” requires understanding what you’re interested in as a city and, even more so, what you’re not interested in.
Cities are used to categorizing 311 reports in order to assign each issue to the responsible service department. With other inputs, like social media, the same logic applies: is this post about garbage disposal or about parking? By contrast, posts about selling a car or offering babysitting services shouldn’t waste valuable city time. Technology can help cities both aggregate massive amounts of data and then systematically categorize it, so that it’s easier to put all the irrelevant topics in the “irrelevant” bin and never look at them again.
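To make the idea concrete, here is a minimal sketch of that kind of categorization. It is an illustration only, not a description of how ZenCity does it: the topic names, keyword lists and sample posts are all hypothetical, and a real system would more likely rely on a trained text classifier than on keyword matching.

```python
import re
from collections import Counter

# Hypothetical topics and keywords, chosen only for illustration.
TOPIC_KEYWORDS = {
    "garbage disposal": {"garbage", "trash", "recycling"},
    "parking": {"parking", "meter"},
    "public safety": {"crime", "police", "unsafe"},
}

def categorize(post: str) -> str:
    """Return the first topic whose keywords appear in the post, else 'irrelevant'."""
    words = set(re.findall(r"[a-z]+", post.lower()))
    for topic, keywords in TOPIC_KEYWORDS.items():
        if words & keywords:
            return topic
    return "irrelevant"

# Made-up sample posts.
posts = [
    "The garbage on Main Street was not collected again",
    "Selling my 2010 sedan, low mileage",
    "No parking spots near the library",
    "Babysitting services available on weekends",
]

# Count posts per topic and compute the share that is relevant to the city.
counts = Counter(categorize(p) for p in posts)
relevant_share = 1 - counts["irrelevant"] / len(posts)
```

In this toy sample the car sale and the babysitting offer land in the “irrelevant” bin, while the other two posts are routed to a city topic, much like a 311 report would be routed to a service department.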
At ZenCity, out of the millions of data items we collect every month from different cities across all of these channels, about 80% actually contain relevant feedback for decision makers, which means only 20% are irrelevant.
Data categorized by city-centric topics
Find out what citizens are most interested in
Sometimes data only becomes relevant once it piles up. One of our cities experienced a heated debate about schools. Because schools were the responsibility of the School District and not the city directly, citizens’ concerns received little attention at first. But as conversations about the issue grew in volume, it became more evident that the city needed to listen in and should probably respond. The city decided to step in and get the School District involved to prevent a controversial activity city-wide.
In other cases, simply excluding irrelevant data from the analysis makes it possible to evaluate the meaningful data accurately and to better understand which issues matter most to citizens.
Aggregating many data points from various sources (e.g., Twitter, Facebook, 311, surveys) gives a broad perspective on the topics that matter most to both citizens and decision makers. But just counting the number of posts is not sufficient, because citizens are not always posting original content on social media. Many are what we call “passive” users, who engage by liking, sharing, retweeting, commenting, or adding emoticons to others’ posts. It turns out that listening to this “silent majority” of citizens can add up to meaningful insights and actions. This is still a pretty big haystack though.
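As a rough sketch of why engagement matters more than raw post counts, the example below ranks topics by total engagement rather than by number of posts. Everything in it is an assumption made for illustration: the `Item` fields, the sample numbers, and the choice to weight a like, a share and a comment equally are not taken from any real system.

```python
from collections import defaultdict
from dataclasses import dataclass

@dataclass
class Item:
    source: str        # e.g. "twitter", "facebook", "311"
    topic: str         # assumed to come from an earlier categorization step
    likes: int = 0
    shares: int = 0
    comments: int = 0

def rank_topics(items):
    """Rank topics by engagement: each item counts once, plus its interactions."""
    totals = defaultdict(int)
    for item in items:
        # Equal weights are an assumption; a city could weight interactions differently.
        totals[item.topic] += 1 + item.likes + item.shares + item.comments
    return sorted(totals.items(), key=lambda kv: kv[1], reverse=True)

# Made-up items from several sources.
items = [
    Item("facebook", "schools", likes=40, shares=5, comments=12),
    Item("twitter", "parking", likes=2),
    Item("311", "parking"),
    Item("twitter", "parking", shares=1),
]

ranking = rank_topics(items)
```

Here parking has three posts and schools only one, yet schools ranks first once the “passive” interactions are counted, which is the kind of signal a simple post count would miss.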
How to make decisions based on all this data?
Undoubtedly, irrelevant data on social media exists. On average, it can amount to 20% of all data collected from a city, even after carefully selecting relevant data sources and collecting only from these. But let’s not forget the remaining 80% of potential “gold.” Yes, that amount of data is overwhelming, but innovative technology can help with that. Using technology to sort out “spam” from “ham,” covering a variety of data sources, and prioritizing topics by volume of citizen engagement are key to making more informed decisions.
A data-driven approach in the information age is first and foremost a citizen-driven approach, and what makes a smart city really smart is the ability to make both strategic and real-time decisions in an innovative way. We at ZenCity do just that.