As with many complex challenges, data quality often feels as if it is caught within the eternal struggle between theory and practice.
I refer to the theory of data quality as The Big Q.
I refer to the practice of data quality as the little q.
Therefore, and with apologies to Charles Dickens and his A Tale of Two Cities, I refer to data quality’s struggle between theory and practice as A Tale of Two Q’s:
“It was the best of times, it was the worst of times.
It was the age of wisdom, it was the age of foolishness. It was the epoch of belief, it was the epoch of incredulity. It was the season of Procrastination, it was the season of Perfection. It was the spring of Maturity, it was the winter of Reality. We had everything before us, we had nothing before us, we were all going direct to High Quality Data, we were all going direct the other way.
In short, the period was so far like the present period, that some of its noisiest authorities insisted on its being received, for Theory or for Practice, in the superlative degree of comparison only.”
The Big Q, Defect Prevention, and “Best Theory”
The primary trait of The Big Q is defect prevention, which I refer to as “Best Theory.”
The Big Q is the proactive approach to data quality.
Advocating root cause analysis and business process improvement, defect prevention is essentially the cure for the quality issues that ail your data—by preventing data quality problems before they happen.
This is undeniably the Best Theory of Data Quality.
Data Quality Theoreticians usually play the Maturity card—as in, does your organization possess the necessary maturity for proactive data quality?
Numerous capability and maturity models are available, providing stages ranging from initial or undisciplined, through tactical or reactive, then strategic or proactive, up to optimized or governed.
The bottom line is that a data governance framework is necessary, as is considerable patience, understanding, and dedication—because it will require a strategic organizational transformation that doesn’t happen overnight.
the little q, data cleansing, and “actual practice”
The primary trait of the little q is data cleansing, which I refer to as “actual practice.”
Yes, the little q is the reactive approach to data quality.
The common (and deserved) criticism is that it essentially treats the symptoms without curing the disease—by correcting data quality problems after they have been created—and without correcting their root cause (and sometimes even ignoring it).
However, this is undeniably the actual practice of data quality.
Data quality practitioners usually play the Reality card—as in, the unavoidable reality is that data cleansing is used to correct the data problems plaguing critical business decisions every day.
In fact, many would argue that although it only alleviates the symptoms without curing the disease, reactive data cleansing is triage, where the priority is to stabilize the patient—since a cure for the underlying condition is worthless if the patient dies before it can be administered.
Doing data quality well is a far, far better thing to do . . .
But how exactly—do you—do DQ?
Are you a Data Quality Theoretician or a data quality practitioner?
In A Tale of Two Q’s, which Q are you?
Or is this apparent struggle all just Much Ado About Nothing?
(And yes, I realize that I just mixed my literary metaphors.)
Perhaps data cleansing should be used to correct your critical business problems today, while defect prevention is busy building a better tomorrow for your organization?
Maybe theory and practice merge, combining data cleansing and defect prevention into your hybrid discipline for enterprise-wide data quality?
What say you?