It’s been about ten years since the band Coldplay released their song “The Scientist,” and I have to admit I really enjoy listening to it. I listen to music quite often while working, and whenever this song plays it reminds me of my data management journey since arriving at my current shop. Let’s take a peek at the lyrics and see why.
“Come up to meet you, tell you I’m sorry, You don’t know how lovely you are”
Like any good data geek, upon my arrival at my current organization I spent quite a bit of time meeting my new data, finding beauty and value in it, and trying not to say I’m sorry too often when I knew the data we had wasn’t perfect. While nowadays I consider myself somewhat seasoned, rarely does a day pass when I don’t meet a new and different situation with the data we manage.
“Tell me your secrets and ask me your questions, oh, let’s go back to the start”
While learning about our data, we spent an inordinate amount of time just finding the secrets by issuing SQL SELECT statements against the database, most of which typically included things like aggregate and analytic functions. Once we had DataFlux in house we began profiling the data and doing cool things like profiling database views where we denormalized some of the data and created sample sets. During this time, many folks came to us with all kinds of questions about the data; most times the team and I would have to go back to the start and determine how the records were created, who and what updated them, when (and if) they would be archived, and why some of the data might be incomplete or have issues.
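For anyone who hasn’t gone secret-hunting with plain SELECT statements, here is a minimal sketch of the kind of aggregate queries I mean. It is an illustration only: it uses Python’s built-in sqlite3 module with an in-memory database as a stand-in, and the `customers` table, its columns, and the helper names `profile_column` and `duplicate_values` are all hypothetical, not our actual schema or tooling. It sticks to aggregate functions and leaves analytic (window) functions out to keep things short.

```python
import sqlite3


def profile_column(conn, table, column):
    """Return row, null, and distinct counts for one column.

    The table and column names are interpolated into the SQL text,
    so pass only trusted identifiers (hypothetical helper).
    """
    sql = f"""
        SELECT COUNT(*)                 AS row_count,
               COUNT({column})          AS non_null_count,
               COUNT(DISTINCT {column}) AS distinct_count
        FROM {table}
    """
    row_count, non_null, distinct = conn.execute(sql).fetchone()
    return {"rows": row_count, "nulls": row_count - non_null, "distinct": distinct}


def duplicate_values(conn, table, column):
    """List non-null values that occur more than once, most frequent first."""
    sql = f"""
        SELECT {column}, COUNT(*) AS n
        FROM {table}
        WHERE {column} IS NOT NULL
        GROUP BY {column}
        HAVING COUNT(*) > 1
        ORDER BY n DESC, {column}
    """
    return conn.execute(sql).fetchall()


# Made-up sample data in an in-memory database (assumption for illustration).
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE customers (id INTEGER PRIMARY KEY, email TEXT, state TEXT)"
)
conn.executemany(
    "INSERT INTO customers (email, state) VALUES (?, ?)",
    [
        ("a@example.com", "NY"),
        ("a@example.com", "NY"),
        ("b@example.com", None),
        (None, "CA"),
    ],
)

email_profile = profile_column(conn, "customers", "email")
email_dupes = duplicate_values(conn, "customers", "email")
state_profile = profile_column(conn, "customers", "state")
```

Counts like these will not explain why a record is incomplete, but they tend to show where to start asking questions.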
“I was just guessing at numbers and figures, pulling the puzzles apart”
It takes time to pull the puzzles apart with data found anywhere, and we all have to do this to move beyond guessing at numbers and figures. For me, this particular line is one of the two quintessential lyrics of the song, because it draws a parallel between the song’s true meaning (a person being powerless in the face of love) and the life of a data geek. We’re continuously pulling apart the data, trying to move beyond guessing at numbers and figures; in the end we do the best we can and hope our passion and determination allow us and our data to persevere.
“Running in circles, chasing our tails, coming back as we are”
If there ever was a better metaphor for the way data needs to be managed (particularly data quality), I don’t know what it would be. Most data management professionals spend quite a bit of time running in circles, chasing their tails. If you take a peek at the SAS DataFlux Data Management Methodology, you’ll notice it runs in a circular motion, yet it has absolutely nothing to do with “chasing tails.” I can tell you from experience that if you adopt a methodology for managing data correctly, like the one illustrated by DataFlux, you will find yourself chasing your tail much less often than you would with nothing in place.
“Nobody said it was easy, no one ever said it would be so hard”
When we started out we were so naive. We had data issues that were clearly visible in our applications and reports, and we thought we could just buy a data quality tool and start “cleaning.” Years later we’ve learned so much and have applied real resources (people, technology and process changes) to our data quality issues. We’ve founded a data governance committee and learned how to use data as an advantage over our competition (both pre-sales and at client renewal time). This “nobody said it was easy” is hands-down the second quintessential line of the song for me, and believe it or not the difficult part wasn’t the technology; the amount of negotiation and hand-holding throughout this journey has been significant. Some people understood the value of managing data correctly without much discussion, but most didn’t. Some folks took years to “get it” and a select few might never really come around. Only time will tell.
Well, “Nobody said it was easy, oh, it’s such a shame for us to part”; but for me, I’ll stick to what I know best: “I’m going back to the start.”
Until next time…Rich