Troy Anderson is the CEO of KnowledgePlex, Inc. In his article, The Problem With Data, Troy discusses the importance of data:
Data doesn't kill people, people kill people. And yet, more lives are affected by data than guns: data determines how many Congress people represent you (unless you live in DC) and often how much money your state gets; data is one of the primary things (some say the only thing) that determines your mortgage interest rate; and data is often the last refuge of a specious argument.
With data so important these days, you'd better have some or you'll get left out, competed away, or find yourself unable to prove anything to anyone. Miss providing data and you'll miss out on money or opportunities for you, your organization, or your community.
The problem with data, as it currently exists in federal agencies and web sites, is that it's very difficult to use despite being very relevant, down to a neighborhood level.
You can read the rest of the article here.
The good news is the number of new initiatives: http://www.dataplace.org/, of course, but also Swivel, Freebase, Numberpedia, and a host of other efforts trying to make sense of data and make it more easily available.
The bad news is that the problem isn't a shortage of numbers to play with, any more than it's a shortage of online news sources. As data become more accessible, the problem shifts from availability to integrity: sure, we can get the data, but is it good data? Dataplace has been really good about metadata, but not every data site has been as diligent.
Check out this discussion of Geocommons -- how good is a tool if you're unsure about the data? Metadata standards -- data about the data -- are becoming increasingly important.
One more warning from The Problem With Data:
But ease of use and democratization of data are not the only problems faced by people who need data. The other problem with data, and with wanting data, is that data usually comes with a bias. People who gather and use data often already have an answer in mind: “Let’s disprove this hypothesis.” “Let’s see what’s well correlated with default risk” “What’s the income of this neighborhood?” The answers to their questions often depend on what can be measured, how often, when, and where. The usual answer to “What data can we get on this?” is “This is the data that’s available.” Anyone familiar with the story of the drunk looking for his keys under the lamppost because that’s where the best light was should see the problems here.
Community indicators efforts are too often forced to rely on what's available instead of what's important to measure, and the hope is that an increasing range of accessible, good data will solve that problem as well.