Data Quality and The Middle Way

Jan 06, 2010 in Data Quality

“In seeing things

To be or not to be

Fools fail to see

A world at ease.”

—Nagarjuna, “Verses from the Center”

(as translated by Stephen Batchelor)

 

Phil Simon recently channeled the wisdom of Confucius in arguing for an all-or-nothing approach to data quality.

Channeling another source of ancient Chinese wisdom, Ch’an (Zen in Japanese) Buddhism, I offer an alternative, which I will refer to as:

Data Quality and The Middle Way

 

The Pair of Perilous P’s

As with many complex challenges, data quality can be viewed as a binary problem, where at first it appears we must choose between two polar opposites.

The pair of perilous p’s in data quality are procrastination and perfection.

However, data quality is not an either/or problem, but instead a neither/nor problem. 

In other words, it is not a choice to either procrastinate or perfect. You must choose to neither procrastinate nor perfect, and instead follow The Middle Way.

Dylan Jones previously posted about the perennial perils and pillars of data quality procrastination, providing me the pleasure of proceeding to the pursuit of perfection.

 

The Pursuit of Perfection

First, I definitely agree with both Phil Simon and Confucius—whenever possible, organizations should launch a full data quality initiative, which should begin with an initial phase of performing a data quality assessment, some preliminary data cleansing, and “baking in” data quality monitoring functionality into the enterprise architecture.

However, it is simply unrealistic to expect to either identify or resolve every data quality problem, and attempting to do so is a surefire way to guarantee failure.

Data quality initiatives are easy to get started, even easier to end in failure, and often lack the decency of at least failing quickly.  Just like any complex problem, there is no fast and easy solution for data quality.  In order to be successful, data quality must always be understood as an iterative process.

On my own blog, I have discussed the need to watch out for the “Goldilocks Zone” on data quality initiatives, which is the time when the efforts of the current iteration, although not perfect, are “just right” for implementation.

Evan Levy has reminded us that data quality isn’t the same as data perfection, and David Loshin previously posted about the Pareto principle and the point of diminishing returns in incremental data quality improvements, both of which provide additional insights into why you should not attempt the pursuit of perfection.

 

Data Quality and The Middle Way

I am not suggesting that unresolved data quality problems should simply be ignored.  Improving the quality of your data is the whole point of your data quality initiative.

However, chasing perfection can undermine the best of intentions.

Data quality practitioners must learn to strive for continuous improvement, but without losing themselves in an ideal such as data perfection.

Today, simply try to do the best that you are currently capable of doing. 

Tomorrow, try to do a little better.

Incremental data quality improvements build momentum to larger success over time.

Instead of focusing on unresolved problems, focus on continuous improvement.

7 Responses to “Data Quality and The Middle Way”

  1. Phil Simon

    Jan 06, 2010

    Good stuff, Jim. I’ve been accused of not being terribly moderate before.

    At the risk of moving away from ancient philosophers, allow me to use a golf analogy.

    Aiming for par is a great way to start each hole but plans often go awry. Because of many factors (trees, “the drink”, “the beach”, and poor shot-making), golfers often take a bogey or worse. Foolish are those who think that they can pull off miracle shots when faced with adversity. If you put the ball in the woods, then just chip it out and take the quadruple bogey out of play.

    I’d argue that setting the bar high is a really good idea and, if things happen, it makes sense to lower that bar a bit.

    In my view, organizations ought to aim relatively high on DQ initiatives and only make compromises as needed. Especially on new system implementations, it’s easier to go the Confucius route.

  2. Jill Wanless

    Jan 06, 2010

    Success will never be a big step in the future, success is a small step taken just now. ~Jonatan Mårtensson

    Great post Jim!

  3. Jim Harris

    Jan 06, 2010

    Thanks for sharing a great quote, Jill!

    I really like your golf analogy, Phil. It basically sums up why I am so bad at golf – I always try to crush it off the tee, and always opt for trying the miracle shots – they make it look so easy on TV!

    I was simply offering my opinion as an alternative to yours.

    Returning to ancient philosophy, The Middle Way should never be confused with either The Right Way or The Only Way.

    As the Buddha taught:

    “Put no head above your own, but also cease to cherish opinions – especially your own.”
    :-)

  4. Dylan Jones

    Jan 06, 2010

    Great post Jim.

    I think the Pareto concept has to rule here, and it’s also important to start from where you’re at. For most companies starting at a low level of maturity, there is a vast amount of low-hanging fruit. Picking this wins you all kinds of leverage with sponsors and co-workers, whereas endlessly iterating for smaller and smaller gains can kill off the motivation quite quickly.

    I know there are purists who say we should reach “five nines” (99.999%) quality and that nothing else is acceptable, but I think that must come when you’re higher up the maturity scale and the culture of data quality is fully bedded in.

    The key, of course, is to measure the impact of the improvements you choose to leave undone. I see this a lot on migration projects: “…ah well, the budget for DQ is blown so that’s that, we’ll just have to fix any dirty data in the target…” – cue project failure.

  5. Phil Simon

    Jan 06, 2010

    Don’t make me start quoting Carl Spackler.

  6. Jim Harris

    Jan 06, 2010

    Thanks for sharing your perspective, Dylan – and for writing the previous posts about data quality procrastination, which covered 99.999% of The Middle Way :-)

    And for Phil:

    In the immortal words of Carl Spackler, echoing the immortal words of Jean-Paul Sartre:

    “Au revoir, gopher”
