Tag Archives: data cleansing
Aug 16, 2011 by David Loshin
Errors do not appear in data randomly. In fact, the concept of “error” might itself be flawed. The data is not an error; rather, it is the result of a scenario in which one or more events caused the data to become inconsistent with the real-world entity being modeled. This could be a user error (such as typing the wrong address), or it could be a sequence of other types of events with which the data did not “track.” For an inconsistent address, that might mean that the individual moved, or the house was destroyed in a hurricane, or the Post Office modified the ZIP codes.
Aug 09, 2011 by David Loshin
In my last entry I started to ask about the difference between the current view of “proactive data quality,” by which we really mean “early reactivity to known errors,” and truly proactive data quality, in which we anticipate, find, and eliminate errors latent within our existing contexts. I gave two examples of error detection.
Aug 02, 2011 by David Loshin
A few weeks ago I posted a blog entry about analyzing the existence of errors in a data set as a way of anticipating data quality flaws and failures before they incur any business impact. More to the point, we always talk about being “proactive” when it comes to data quality inspection, but what is normally meant is the continuous monitoring of compliance with data quality rules of which we are already aware. If we really think about that, are we being proactive or just arranging to be reactive earlier in the process?
May 16, 2011 by Joyce Norris-Montanari
While on Santa Cruz Island in the Galapagos, we stayed at a great bed and breakfast called The Solymar. It has a swimming pool, which was one of the reasons I picked this establishment. After a long day of hiking across broken-up lava to get to the “special” swimming hole of the locals, I went swimming at a great little beach, then chose to come back to the hotel instead of ocean kayaking. I was a bit tired from all that hiking, and I swear to you, I could hear the bartender at the pool calling me to come back for a cold refreshing beverage.
Oct 12, 2010 by David Loshin
Once we have come to grips with the fact that it is not a cardinal sin to correct a data error when necessary, the next set of questions centers on the pragmatic aspects of data cleansing, such as:
- What kinds of data cleansing will we do? There is a broad range of corrections that can be applied, ranging from…
Oct 05, 2010 by David Loshin
Data quality consultants consistently beat the drums about the difference between data quality management and data cleansing. In fact, there may even be a perceived level of condescension. Data cleansing? Hah, of course you would never change the data when you could eliminate the root cause of the introduction of errors. Why, that is preposterous!
Or is it? Actually, let’s…