Harmonizing Hard-Coded Metadata

Apr 24, 2012 in Data Governance, Data Quality, Metadata

I have been exploring the discrepancy between the presumed logical naming of data elements and the way those elements are actually used in application code, along with the dependencies (and eventual data failures) that can erupt when reference data values are hard-coded into application code. The problem can’t be solved by globally searching for uses of the data element and changing its name, because the embedded dependencies will still exist.

The solution, unfortunately, requires some real effort. If you want to adjust data element names, it is not enough to find where those data elements are used. You also have to read the code to see how the data element’s values are used, determine whether those uses are consistent with the data element’s real meaning, and make any changes in a controlled manner with proper testing and validation.

It sounds pretty labor intensive. Can this be automated? Actually, I believe that some of it can be, but it would require some relatively sophisticated techniques. For example, one can use a compiler to parse out the structure of the program and then flag each place where the data element is accessed. Within the surrounding code fragments, look for places where the data element’s value is loaded into a variable, and then look for certain types of “signatures” in the code that would need to be examined. For example, fetching a record and storing a value into a variable that then becomes the subject of an if-then-else or a case statement might be a signal that deeper examination is needed. So we might be able to adapt some automation techniques, but to get it right you still have to actually look at the code.
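To make the idea concrete, here is a minimal sketch of that kind of scan, written in Python and using the standard library's `ast` module as a stand-in for the compiler. It is only an illustration under several assumptions: the application code is itself Python (3.9 or later), the data element is read as `record["name"]` or `record.name`, and the names `flag_hardcoded_uses`, `cust_status`, and the sample program are all hypothetical. It does no real data-flow analysis; it simply notes which variables are loaded from the data element and then flags if/elif tests that compare those variables to literal values.

```python
import ast


def _reads_element(node, element):
    """True if the expression reads the element as record["x"] or record.x."""
    for sub in ast.walk(node):
        if isinstance(sub, ast.Attribute) and sub.attr == element:
            return True
        if isinstance(sub, ast.Subscript):
            key = sub.slice  # Python 3.9+: the subscript expression itself
            if isinstance(key, ast.Constant) and key.value == element:
                return True
    return False


def flag_hardcoded_uses(source, element):
    """Return (line, literals) pairs for if/elif tests that compare the
    data element's value against hard-coded literal values."""
    tree = ast.parse(source)

    # Step 1: collect variables that are loaded with the element's value.
    loaded = set()
    for node in ast.walk(tree):
        if isinstance(node, ast.Assign) and _reads_element(node.value, element):
            for target in node.targets:
                if isinstance(target, ast.Name):
                    loaded.add(target.id)

    # Step 2: look for the "signature": a conditional test that compares
    # one of those variables (or the element itself) to a literal.
    findings = []
    for node in ast.walk(tree):
        if not isinstance(node, ast.If):
            continue
        for cmp in ast.walk(node.test):
            if not isinstance(cmp, ast.Compare):
                continue
            operands = [cmp.left, *cmp.comparators]
            uses_element = any(
                (isinstance(op, ast.Name) and op.id in loaded)
                or _reads_element(op, element)
                for op in operands
            )
            literals = [op.value for op in operands
                        if isinstance(op, ast.Constant)]
            if uses_element and literals:
                findings.append((cmp.lineno, literals))
    return sorted(findings, key=lambda finding: finding[0])


# Hypothetical application code with reference values hard-coded in it.
SAMPLE = '''
record = fetch_customer(42)
status = record["cust_status"]
if status == "A":
    activate(record)
elif status == "X":
    archive(record)
if record["name"] == "Bob":
    greet()
'''

if __name__ == "__main__":
    for line, literals in flag_hardcoded_uses(SAMPLE, "cust_status"):
        print(f"line {line}: compared against hard-coded {literals}")
```

On the sample, the scan flags the two tests against "A" and "X" and ignores the comparison on the unrelated name field. Each flagged line is only a candidate: someone still has to read it and decide whether the hard-coded value agrees with what the data element really means.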

I am curious as to the extent of this potential problem – email or post a comment if you have had a similar experience!

Read David Loshin’s previous post, “Hard-Coded Metadata.”
