Have you ever wondered why bottled water has an expiration date?
Well, in the United States at least, it’s mostly New Jersey’s fault. A 1987 state law required all food products sold in New Jersey to display an expiration date of two years or less from their manufacturing date. So, in order to standardize interstate distribution, most bottled water manufacturers gave every bottle a two-year expiration date.
Even after New Jersey amended the law a few years ago, the bottled water expiration date had become something of an industry standard, so many manufacturers still print one today.
Data is the opposite of bottled water: it really does expire, but it is hardly ever given an expiration date.
The era of big data seems to be fostering the false notion that we have an obligation to retain any data that we come across because of its potential usefulness. Instead of a “use it or lose it” attitude toward data, we have a “retain it and maintain it” attitude, which is making data hoarders of us all.
Some data is retained to support historical analysis, so we can learn from the past in order to predict probable futures – especially the near future, with real-time analytics. But there are limitations to historical analysis. Velocity is one of big data’s 3Vs (volume, velocity, and variety), but nowadays the world is changing just as fast as the data is moving, so the future resembles the past less and less.
Instead of mountains of data that are managed just because they’re there, we need to acknowledge that all data has an expiration date, after which the data should at least be archived, or possibly even deleted.
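As a rough illustration of what giving data an expiration date could look like in practice, here is a minimal Python sketch. The function name `retention_action` and both retention windows (`ARCHIVE_AFTER`, `DELETE_AFTER`) are hypothetical, chosen only for the example. Real thresholds would depend on business needs and legal requirements.

```python
from datetime import date, timedelta

# Hypothetical retention windows, for illustration only.
ARCHIVE_AFTER = timedelta(days=365)      # archive data older than one year
DELETE_AFTER = timedelta(days=365 * 7)   # delete data older than roughly seven years


def retention_action(created: date, today: date) -> str:
    """Return 'keep', 'archive', or 'delete' based on the age of the data."""
    age = today - created
    if age >= DELETE_AFTER:
        return "delete"
    if age >= ARCHIVE_AFTER:
        return "archive"
    return "keep"
```

The point of the sketch is simply that the decision is made from a recorded creation date and an explicit policy, rather than by retaining everything by default.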