Readers may know that I started my work in data quality while at Bell Labs. After we had a couple of practical successes with AT&T units, we decided it was time to do some fundamental thinking about data (Anany Levitin and Chris Fox, now at Villanova and James Madison, respectively, led most of the work described here). The first thing we needed to do was get a good definition of “data.”
There have been dozens of attempts to define data throughout history, and we worked hard to figure out which was best suited to quality. We selected an approach that called out the data model, the data value, and presentation as separate and distinct. This is critical – each arises from different processes, with distinct “quality dimensions.”
As we finished speculating on those distinctions, we had an epiphany: our definition completely missed the point! We had defined data as “static,” sitting in a database (be it paper or electronic). From a quality perspective, data in a database are completely uninteresting. It is when they come into existence, get moved from place to place, are combined with other data, are put to use by a customer, and so on, that data are interesting. That realization led us to more fundamental and important thinking about the “organic nature of data.”
I bring this up as a means of introducing Brown and Duguid’s The Social Life of Information. I see no evidence that the authors were aware of our thinking, but never mind. They advance our thinking at least one level, probably more. My favorite example of theirs involves a medical historian who is tracing the spread of cholera by sniffing packets of letters from that period in the archives of a centuries-old business. The researcher is sniffing for traces of vinegar. During that time, when the disease occurred in a town, all letters were disinfected with the stuff to prevent the spread of disease.
A long and unexpected lifetime for a (not-yet-digitized) bit of data indeed!
The Social Life of Information is brilliantly written, accessible to all. If you want to help your management chain understand the importance of data and data quality, please put a cover letter on my latest, Data Driven, and send it along. But if you can send them two, add this one! That’s how good it is.
Up next: Uncertain. I’ve got several more favorites, but I’m itching for something new, at least for a bit.