The Semantic Future of MDM

Mar 24, 2010 in Master Data Management

“Master data objects are those core business objects used in the different applications across the organization, along with their associated metadata, attributes, definitions, roles, connections, and taxonomies.” 

“Common examples of master data include customers, employees, vendors, suppliers, parts, products, locations, contact mechanisms, profiles, accounting items, contracts, and policies.”

“Master Data Management (MDM) incorporates business applications, information management methods, and data management tools to implement the policies, procedures, and infrastructures that support the capture, integration, and subsequent shared use of accurate, timely, consistent, and complete master data.”

All of those quotes are from the book Master Data Management by David Loshin, which remains my favorite resource for understanding MDM as we know it today.

 

Let’s simplify things for the purposes of discussion

First, let’s simplify the definitions in order to differentiate master and transaction data:

Master data is an abstract description of real-world entities.  Transaction data is an abstract description of real-world interactions involving two or more of these entities.

Now, let’s use a simple (and fictional) example of MDM as we know it today:

Michelle Davis purchases a life insurance policy from Vitality Insurance. 

In this example, Michelle Davis (customer), the life insurance policy (product), and Vitality Insurance (vendor) are all master data objects, and the premium payments that Michelle Davis sends to Vitality Insurance exemplify the transaction data involved.
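As a rough illustration of the distinction, here is a minimal Python sketch in which master records describe entities and a transaction record only refers to them. The class names, identifiers, premium amount, and payment date are all invented for the example; nothing here reflects how a real MDM product models its data.

```python
from dataclasses import dataclass
from datetime import date
from decimal import Decimal


@dataclass(frozen=True)
class MasterRecord:
    """An abstract description of a real-world entity."""
    entity_id: str  # hypothetical identifier scheme
    role: str       # e.g. "customer", "product", "vendor"
    name: str


@dataclass(frozen=True)
class Transaction:
    """An abstract description of an interaction between entities."""
    payer_id: str
    payee_id: str
    product_id: str
    amount: Decimal
    paid_on: date


customer = MasterRecord("cust-001", "customer", "Michelle Davis")
product = MasterRecord("prod-001", "product", "Life insurance policy")
vendor = MasterRecord("vend-001", "vendor", "Vitality Insurance")

# A premium payment refers to master records by identifier only.
premium = Transaction(
    payer_id=customer.entity_id,
    payee_id=vendor.entity_id,
    product_id=product.entity_id,
    amount=Decimal("100.00"),   # invented amount
    paid_on=date(2010, 3, 24),  # invented date
)


def entities_involved(tx: Transaction) -> set:
    """Return the identifiers of the master records a transaction touches."""
    return {tx.payer_id, tx.payee_id, tx.product_id}
```

The transaction carries no descriptive attributes of its own; everything it knows about Michelle Davis lives in the customer master record.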

Currently, both master and transaction data management is focused entirely on the perspective of Vitality Insurance, which, for the most part, does make sense.

Both the vendor and product master data objects used in this example are owned by Vitality Insurance since they are the vendor and they make the product being sold. 

Vitality Insurance also owns the transaction data because that is how the company makes money—especially if Michelle Davis lives a relatively long life.

However, Vitality Insurance doesn’t own the other master data object in this example, namely their customer Michelle Davis. 

But they would claim (no pun intended) to own the master data that describes her.

It is this particular aspect I will focus on in this discussion about the future of MDM. 

I believe this aspect is not only the most significant challenge facing MDM today, but also the fundamental flaw that the future of MDM must resolve.

 

How many (copies of your) customers do you have?

It can be easily argued that achieving a single view of your customers is one of the fundamental goals of an MDM implementation.

A single customer view allows your organization to understand how many customers you actually have and what the most “accurate, timely, consistent, and complete” data available to describe those customers actually is.

However, attempting to achieve this goal is fraught with complexities. 

The larger your organization and the longer it has been in business, the greater the likelihood you have many disparate systems for storing and managing master data.

Therefore, your organization probably suffers from inconsistent (or a total lack of) standards and ownership for all of your master data objects—and not just customer.

The reality that so many (and potentially conflicting) customer definitions as well as so many (and potentially redundant) copies of customer master data exist throughout the enterprise is what makes MDM such a daunting challenge.    

Achieving a single customer view is the holy grail of today’s MDM and is often referred to as creating the “golden copy” of each unique customer.
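To make the idea concrete, here is a small Python sketch of one possible way to build a golden copy: for each attribute, keep the most recently updated non-empty value. This rule is only an assumption chosen for illustration, and the sketch assumes the hard part (deciding which records describe the same customer) has already been done. The function name, field names, and sample values are all hypothetical.

```python
from datetime import date


def build_golden_copy(records):
    """Merge duplicate records for one customer into a single record.

    For each attribute, the most recently updated non-empty value wins.
    """
    golden = {}
    # Process oldest first, so newer records overwrite older values.
    for record in sorted(records, key=lambda r: r["updated"]):
        for field, value in record.items():
            if field == "updated" or value in (None, ""):
                continue  # skip bookkeeping and empty values
            golden[field] = value
    return golden


# Two copies of the same customer held in different (hypothetical) systems.
copies = [
    {"name": "Michelle Davis", "email": "", "phone": "555-0100",
     "updated": date(2008, 5, 1)},
    {"name": "Michelle Davis", "email": "michelle.davis@example.com",
     "phone": None, "updated": date(2009, 11, 15)},
]

golden = build_golden_copy(copies)
```

Here the golden copy ends up with the phone number from the older record and the e-mail address from the newer one, which is more complete than either source on its own.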

 

But even the “Golden Copy” is still just a copy

All data (master and transaction) is an abstraction.  Creating golden copies is an attempt to perfect the abstraction, which remains disconnected from reality. 

Even the best maintained golden copies still suffer from the digital distance that exists between these internal abstract descriptions and the external real-world entities that they are attempting to describe. 

Nothing can change the fact that the text string “Michelle Davis” is only an abstract description of the human being whose name is currently Michelle Davis. 

Even if near real-time updates change the text string to “Michelle Davis-Donovan” after Michelle Davis marries Michael Donovan in a beautiful seaside wedding, the same digital distance remains as a fundamental flaw in our current MDM world-view.

The inconvenient truth is this real-world event was not simply a partial concatenation of two text strings swimming in a beautifully maintained digital sea of information within the world-class MDM system of Vitality Insurance.

 

Attack of the (Digital) Clones

Let’s switch the perspective of this discussion—to your perspective. 

No, I don’t mean your perspective either as someone working on an MDM solution for your organization, or as someone working for a vendor selling MDM solutions.

I mean your own individual and personal perspective.

How many companies currently view you as a customer?  How many companies have previously viewed you as a customer at one or more times in the past?

How many copies of your personal information (i.e., your master data) do you think exist within the databases and file systems of all of the companies that you have ever done business with in your entire life? 

Don’t forget to count all of the companies that obtained your personal information indirectly from the companies to which you directly provided it.

(In our example, imagine all of the third party companies to which Vitality Insurance sends Michelle Davis’ personal information in order to assess her insurance risk.)

I refer to all of these copies of your master data as your digital clones.

Now imagine how many of your digital clones still look like you.  In other words, how many have your current postal addresses?  E-mail addresses?  Telephone numbers? 

How many of your digital clones have all of your relevant personal information? 

How many of them don’t know how old you are?  How many of them don’t know how many times you have been married or how many children you currently have? 

Would you even recognize all of your digital clones if you saw them today?

Now imagine you are a customer of Vitality Insurance.  They have implemented a world-class MDM system.  Therefore, they are maintaining an accurate, timely, consistent, and complete golden copy of your customer master data.

Great—now what about all of the other companies you do business with?

This is the fundamental flaw of MDM today: the current focus is entirely on companies (e.g., Vitality Insurance) and not on individuals (e.g., Michelle Davis).

Why is your personal information being managed by anyone other than who owns it?

 

The Personal Data Locker

In his excellent book Pull: The Power of the Semantic Web to Transform Your Business, David Siegel discusses the concept of the personal data locker, which will be your secure online account that stores all of your personal information, where it will be managed by who truly owns it—you.

You will grant permission to access the relevant aspects of your personal information to the vendors and other service providers with which you conduct business.

In most cases, the only personal information released will be your unique identifier (e.g., your OpenID or i-name). Please note: these are only examples.

You will view all the transaction data connected to your master data, as well as any transaction data awaiting your verification before it is connected.  For those familiar with online banking, imagine something similar to (but more advanced than) how e-bills work in online bill pay.

You will run your own personal MDM system.  You will maintain the accurate, timely, consistent, and complete single view of your master data.

There will be no copies (golden or otherwise) of your personal information.

All of your digital clones will be deleted.
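The mechanics described above can be sketched in a few lines of Python. This is only a toy model under my own assumptions, not an implementation from David Siegel’s book or of any real service: the class, its methods, the identifier, and the sample attribute values are all hypothetical, and for simplicity it releases the unique identifier to any requester.

```python
from dataclasses import dataclass, field


@dataclass
class PersonalDataLocker:
    """Toy model: one record, managed by its owner, shared by permission."""
    identifier: str  # stands in for something like an OpenID or i-name
    attributes: dict = field(default_factory=dict)
    grants: dict = field(default_factory=dict)  # requester -> attribute names

    def update(self, name, value):
        """The owner changes a value in the one place it is stored."""
        self.attributes[name] = value

    def grant(self, requester, *names):
        """The owner allows a requester to read the named attributes."""
        self.grants.setdefault(requester, set()).update(names)

    def revoke(self, requester):
        """The owner withdraws all access from a requester."""
        self.grants.pop(requester, None)

    def read(self, requester):
        """Return the identifier plus only what this requester may see."""
        allowed = self.grants.get(requester, set())
        view = {"identifier": self.identifier}
        view.update({n: self.attributes[n]
                     for n in allowed if n in self.attributes})
        return view


locker = PersonalDataLocker("michelle-davis-id")  # invented identifier
locker.update("name", "Michelle Davis")
locker.update("date_of_birth", "1975-01-01")      # invented value
locker.grant("Vitality Insurance", "name")

# After the wedding, one update is visible to every authorized reader.
locker.update("name", "Michelle Davis-Donovan")
insurer_view = locker.read("Vitality Insurance")
stranger_view = locker.read("Some Other Company")
```

Because Vitality Insurance reads the data rather than copying it, there is no clone to fall out of date: the insurer sees the new name immediately, while a company without a grant sees only the identifier.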

 

The Semantic Future of MDM

The semantic web is a disruptive paradigm shift, which will impact more than just the future of MDM. 

However, the semantic web is still in its nascent phase.  Although it is rapidly evolving, it will take more time not only for everything necessary to be in place, but also for the defenders of the status quo to stop trying to fight the future.

The semantic web is also about much more than simply cloud computing and software as a service (SaaS), both of which generate a lot of industry buzz today.

There are even some vendors already beginning to offer new MDM “solutions” where master data moves into the cloud.  However, these apparent early adopters are still missing the fundamental flaw underlying MDM today. 

It is not simply that master data needs to move into the cloud.

The most important aspect of the future of MDM is transitioning the management of master data to the real-world entities that actually own the data, thereby virtually eliminating both the abstraction and the digital distance undermining MDM today.

When this transition finally happens, organizations will be able to focus on managing the data they truly own—transaction data and only the master data that describes the organization and the core business objects it actually owns (e.g., its products).

In the semantic future of MDM, organizations will stop wasting time and money attempting to manage data they do not own.

30 Responses to “The Semantic Future of MDM”

  1. Phil Simon

    Mar 24, 2010

    Jim

    Great post. This is my favorite of the series so far. I can see why the semantic web would be so disruptive to so many people.

    I’d argue a few things:

    1. I should be responsible for updating my own information in the fewest places possible.
    2. When (not if) the semantic web reaches critical mass, it will make many data-related things much more efficient.
    3. MDM in the cloud had better be really secure. I can only imagine how many hackers are licking their lips at the thought of being able to get at this information.

  2. Jill Wanless

    Mar 24, 2010

    I can’t wait for this to happen! How much money would organizations save by not having to manage their customers’ master data?

    Can I add to Phil’s additional comments:
    1/ This will require ‘the individuals’ to have access to update this information, and not everyone who is a customer of someone has access to a computer or the internet.
    2/ Although I have no problem providing my suppliers with up to date information when forced to (over the phone or email), if asked to do this on my own I’d probably add it to the ‘when I get around to it’ pile :)

    I’m hoping that all of these outstanding questions do not cause undue delay. I personally think it’s a brilliant idea! Thanks for a great post,

  3. Jim Harris

    Mar 24, 2010

    Thanks for your feedback, Phil.

    To your points:

    (1) Yes, you should be responsible for updating your own information in the fewest number of places — and ultimately one place should be the ultimate goal. The semantic web is about getting the data to stop moving — the data stays in one place, on the web and shared with appropriate levels of privacy and security.

    (2) The semantic web is definitely a “when” and not an “if” and it will definitely bring enormous efficiencies to data-related disciplines.

    (3) The security issue is the most common question/objection to “the cloud” or the future fully semantic web. My response is “do you think your data is secure today?” I don’t mean just you personally. Hackers can get your personal information much more easily today because so many copies exist in so many systems all around the world, many with little or no real security. As someone who has been the victim of identity theft, I speak from experience: my data was stolen from a third party company involved with another company that I hadn’t done business with in over 10 years. I would rather be hacked from my personal account than some old copy of my personal information that I wasn’t even aware existed.

    Best Regards,

    Jim

  4. Phil Simon

    Mar 24, 2010

    Really good points. Better to keep your data locked down in one place than dispersed in many.

  5. Charles Blyth

    Mar 24, 2010

    Great post Jim, very creative and definitely food for thought.

    Here’s a question though, what about non-customer data? i.e. product, part, campaign, sales agent, policy, the type of data that has no general standard. In the world you describe, how would these be

  6. Ken O'Connor

    Mar 24, 2010

    Jim,

    Stunning article – thank you, and congrats on writing this.

    I agree with you – the semantic web is definitely a “when” and not an “if”.

    Like Phil and Jill, I foresee many challenges – none insurmountable.

    One example:
    There are many people who prefer to live “outside the law”. They will deliberately seek to hide their true identities, and provide misleading information. (Such people pose challenges today to Anti Money Laundering, Anti Fraud, and Anti Terrorist Financing systems). Perhaps the semantic web and the “personal data locker” will prevent people hiding their identities?

    I have a personal vision starting in the area of Health. When a baby is born, he/she will get a unique identifier, and the details of the pregnancy, birth etc. will be recorded (I believe our medical details should belong to us as individuals). Initially the baby’s parents would control access to the information, probably sharing it with the hospital, and the GP looking after the baby. One central “Golden Record” of our health record… Nirvana?

    Thanks again for this brilliant post,

    Ken

  7. Brave post Jim.

    I wonder if the emerging semantic web will lead to this data quality manifesto:

    “While we value that data are of high quality if they are fit for the intended use we value more that data correctly represent the real-world construct to which they refer in order to be fit for current and future multiple purposes”.

  8. Jim Harris

    Mar 24, 2010

    Thanks for your comment, Jill.

    Yes, I believe an extremely significant amount of wasteful spending will be eliminated from MDM efforts by re-focusing on only the master data they truly own.

    To your points:

    (1) Yes, customers without access to a computer or the Internet would need to be taken into account. However, even aggressive estimates point to a fully semantic web needing at least another 10 years to become completely viable. More realistic estimates predict at least another 20 years. Therefore, this particular concern will become less of an issue than it is today — but still an issue for sure.

    (2) Good point, but you still need to change your mindset a little. For example, today if you changed your telephone number, you would have to (or to your point, are supposed to) contact all of your service providers/bill collectors and inform them of your new number. In the semantic future, you will simply update your personal data locker. No updates will be sent to anyone. Companies (and people, even your family and friends) will never know your telephone number; they will place a “call” to your unique identifier. If you have authorized the caller to be allowed to contact you this way, then your personal data locker will indicate which telephone number you want the call routed to (mobile, home, work, etc.) or perhaps you will have the call sent directly to voicemail. You customize your contact settings within your personal data locker, either as a global setting or on a case-by-case basis. So for example, when you go on vacation to Fiji, you tell your personal data locker that the hotel room telephone number is the only active phone “line” you can be reached at, most likely only for select friends and family; all other calls can be routed to a temporary “Jill is on vacation” voicemail message.

    Best Regards,

    Jim

  9. Jim Harris

    Mar 24, 2010

    Thanks for your comment, Charles.

    The master data that describes the organization and the core business objects it actually owns — such as your examples of product, part, policy — would still require MDM similar to what we know today.

    However, “standards” are another huge aspect of the semantic web. To quote David Siegel:

    “Another word for semantic is unambiguous. In the semantic web, we declare what we mean in precise, standardized terms. Data that is semantic means exactly the same thing to any system or person who uses it.”

    Therefore, even the master data objects managed by the organization in the semantic future will be leveraging universal and/or vertical industry specific open data standards. Gone will be the days of proprietary software or database vendor formats or proprietary bespoke solutions.

    If this sounds like a pipe-dream, please note that some of these types of data standards exist today.

    One example is the Extensible Business Reporting Language (XBRL) for the electronic communication of business and financial data, which is revolutionizing business reporting around the world and is currently being used in more than 110 countries.

    Cheers,

    Jim

  10. Jim Harris

    Mar 24, 2010

    Thanks for your feedback, Ken.

    Yes, there will definitely still be people deliberately hiding their true identities and providing misleading information for the purposes of fraudulent and other criminal activities. After all, human beings will still exist in the semantic future :-)

    However, I believe that our identities will be much more secure in the fully semantic web.

    In most cases, the only personal information released will be your unique identifier. Very little personal information, especially sensitive (financial, health, etc.) information will need to be made available — and you will decide whether or not such information “needs” to be released and to whom.

    David Siegel’s book also discusses the concept of a digital birth certificate (and not just for people, but also parts and products too) and I completely agree with you about our medical records belonging to us as individuals.

    The personal data locker would be the one (and only) “Golden Record” of everything about us, including identity, finances, health, legal, communication, employment, etc.

    Best Regards,

    Jim

  11. Jim Harris

    Mar 24, 2010

    Thanks for your feedback, Henrik.

    Yes, I believe the fully semantic web will fulfill your stated data quality manifesto.

    Fully semantic data available on the web and following open standards will resolve many (but obviously not all) of today’s data quality issues. The “point of view paradox” where data quality is in the eye of the beholder will obviously persist.

    However, the greatest data quality advantage of the semantic web is that it will be built on a foundation of shared, unambiguous, and therefore reusable data.

    Although reusable data can be used in different ways, semantic data will be fit to serve as at least the basis for each and every purpose — otherwise, it would not be semantic.

    Once again, to use David Siegel’s words:

    “Another word for semantic is unambiguous. In the semantic web, we declare what we mean in precise, standardized terms. Data that is semantic means exactly the same thing to any system or person who uses it.”

    Best Regards,

    Jim

  12. Eric Franzon

    Mar 24, 2010

    Well said Jim! Thanks for this thoughtful, insightful piece.

    To Jill, I’d say that I have that ‘when I get around to it’ pile today — of systems with outdated information about me that I keep meaning to update, but rarely do. Usually, these systems have unique interfaces and processes. (In some cases, I can use a form interface. In others, I am asked to send an email with updated information…) It’s largely because of this pile that I love the idea of having a single place in future to update my own master data.

    Some of the other criticisms above are cultural rather than technological, but I believe that structuring data in this way allows for stronger security, proof, and trust than we have available today.

    Thanks again,
    Eric

  13. David Siegel

    Mar 24, 2010

    Thanks Jim! A few of us can see it coming, and our numbers grow every day. Let’s hope the Singularity doesn’t obfuscate human cognition.

  14. Julian Schwarzenbach

    Mar 25, 2010

    Jim,

    I like the general concept of what you are proposing. Placing a responsibility on customers for maintaining their own master data sounds like a good overall concept which should reduce data clones and should keep data more up to date. I agree with Ken that there could be risks around people who may be less law abiding and may have more than one identity, and none of which agrees with the real person.

    There is probably a bigger problem here, namely apathy. I have come across a number of situations where organisations have set up ‘self service’ facilities on their HR databases to allow employees to keep their records up to date. This sounds like a similar concept, however, even in such situations where an employer arguably has more power to compel people to comply, staff still tend not to keep their records up to date. This problem is only likely to be worse where there is less compulsion for people to keep data up to date.

    An additional factor to consider is the ‘digitally excluded’ – people who either have not got computer access or do not have the skills to use computers. For those who have not got access, but are willing and able to use computers, it should be possible to create suitable arrangements to allow this. However, this still leaves a fair slice of the population who cannot use computers (due to age, disability etc.) and who need to be covered by such an approach. For example, my mother-in-law struggles with technology and has poor eyesight. When she goes to her doctor’s she is forced to use a touch screen based system to register that she is in the building and ready for her appointment; this has to be done for her as she is not able to register herself. We need to ensure that any future proposals for personal master data recognise the needs of all parts of society.

    Picking up on Charles’ point, the ‘owner’ of a physical item (e.g. part, equipment, vehicle) should be responsible for keeping the data on these entities up to date in a similar manner. This would be reliant on a commonly agreed set of data standards, which should be technically achievable once there is agreement on the approach to adopt.

    Julian

  15. Charles Blyth

    Mar 25, 2010

    Great debate going on here, no surprise really off the back of a very thought provoking post.

    Do you see data entities in the semantic web being available in a ‘shopping list’ format? I’m not saying that the data will be put out on show for commercial gain, more that the entity type be available to be drawn down and used, i.e. a developer might be able to browse the entity contents and then choose an entity that suits his/her requirements? “I like that entity, it matches what I am looking for, let’s add it to my domain!”

    This would drive standardisation further, the more enterprises use the same entities the more standard they become.

  16. Phil Simon

    Mar 25, 2010

    Along these lines, I recently read:

    http://gadgetwise.blogs.nytimes.com/2010/03/25/should-you-pay-to-search-on-the-web/?partner=rss&emc=rss

    Interesting post. If we all owned our own records via some type of “personal MDM”, then this problem would be ameliorated.

  17. Jim Harris

    Mar 25, 2010

    Thanks for your extensive feedback, Julian.

    The apathy you describe may be a challenge — Jill, Eric, and I discussed the non-employee aspects of it in our comments above (which I realize are getting too numerous to read all of them, especially after reading a long blog post :-) ).

    Jill also brought up the “digitally excluded.” I can relate to the situation with your mother since my father is legally blind and using a computer continues to be a significant challenge for him. Provisions would obviously need to be made, especially for the elderly who even today remain the most vulnerable to fraud and identity theft (even without technology being a factor).

    In my response above to Charles’ first comment, I noted that Product MDM may continue to be implemented in a way similar to current approaches. But as we all have noted, standards are the key. Open semantic web based standards and the elimination of software/database proprietary formats are a crucial (and disruptive) change necessary to make this happen. I am waiting to see startup companies offering legitimate semantic web based solutions for Product MDM in the near-future.

    Cheers,

    Jim

  18. Jim Harris

    Mar 25, 2010

    Thanks for your follow-up question, Charles.

    Yes, fully semantic data available on the web and following open standards will provide shared, unambiguous, and therefore reusable data.

    This is the major data paradigm shift of the semantic web:

    Stop replicating and needlessly “customizing” data and start re-using data.

    Replication and customization have historically had two causes:

    (1) Limitations in technology (storage, access speed, processing speed, and a truly sharable infrastructure like the Internet) meant that the only option was to create and maintain an internal copy of all data.

    (2) Proprietary formats and customized (and also proprietary) versions of common data were viewed as a competitive differentiator, even before the recent (non-semantic web related) dawn of realization that data is a corporate asset.

    Hoarding common data in a proprietary format and viewing “our private knowledge is our power” needs to be replaced with shared data in an open format and viewing “our shared knowledge empowers us all.”

    Sorry for the mini-rant — but to your point, yes the more enterprises use the same shared data the more standard they become — and that is good for everyone.

    Cheers,

    Jim

  19. Jim Harris

    Mar 25, 2010

    Thanks for your additional comment, Phil.

    Advances in semantic search engines, combined with future concepts such as the personal data locker, will definitely revolutionize the way we search the web.

    As with all semantic web concepts, we are not quite there yet.

    However, here are a few examples of promising developments working towards (but still far from) a semantic search engine (which, by the way, is also discussed in David Siegel’s book):

    True Knowledge

    Google Squared

    WolframAlpha

    Best Regards,

    Jim

  20. Andy Hayler

    Mar 25, 2010

    Excellent, thought-provoking article. It is perhaps interesting to consider the barriers to such a future, in order that they can be understood and solutions found. There are technical things I can think of, but I suspect that the trickiest may be economic. One issue is the sheer weight of existing software applications deployed in companies that maintain master data; an ex-colleague mentioned to me the other day that a system that I wrote in 1986 was still in operational use at Exxon, and in general it is scary just how much application code is deployed, and how tough it is to justify modifying or replacing such systems. It is to be hoped that as more companies actually start counting the cost of maintaining multiple copies of data, and addressing the quality of this data, that such a switch may be economically justified. The industry needs to make it plain what the benefits are compared to the cost of migration; if it does a compelling job then there is some hope of a better future.

  21. Jim Harris

    Mar 26, 2010

    Thanks for your feedback, Andy.

    Yes, it is a challenging paradox.

    Although an economic justification is easy to make due to the tremendous cost reduction that could be achieved by eliminating the redundant maintenance of multiple copies of (especially customer) master data, the reality (as you so nicely phrased it) of the sheer weight of existing software applications deployed makes it no easy (and far from operationally risk-free) task to engineer this transition.

    I believe this barrier is what prevents many large organizations from even attempting MDM implementations using the technology and methodology that is currently available today.

    We all talk about the need to make the business justification for MDM (as well as any other enterprise initiative) as if having this justification will convince companies to implement a “solution.”

    However, the harsh reality is that even a legitimate business-justified solution, which in the long run will reduce costs, mitigate risks, and increase revenues, will in the immediate future only increase costs, increase risks, and decrease revenues.

    This is a difficult sell to an organization’s shareholders — we will lose (and spend more) money this year so that we can make (and spend less) money next year.

    Although it is easy for me to criticize short-term thinking in organizations, I must admit that I too am hesitant to personally sacrifice in the short-term for the hope of a better future.

    This paradox maintains the status quo perhaps more than anything else.

    Cheers,

    Jim

  22. Kelly Lautt

    Mar 28, 2010

    I won’t jump into the fray because at this point the blog post itself plus the comments have pretty much covered it. But I can’t help but completely geek out for a minute and suggest that this type of MDM that is single-owner, unified, in the cloud and covers everything from medical to hobbies to finances to religious beliefs (it could… why not?) will bring us the kind of information used on the TV show Caprica to recreate an entire human consciousness using their data.
    Super cool!

  23. Jim Harris

    Mar 29, 2010

    Thanks for your comment, Kelly.

    Zoe Graystone would be proud.

    So say we all!
    :-)

  24. Daragh O Brien

    Mar 30, 2010

    Wonderful vision of the future Jim. There is just one teeny tiny problem that I need to raise. It’s only a small one, but it might be worth a mention.

    That problem is that, almost without exception, the current wave of Cloud Computing hosting platforms are not compliant with EU Data Protection principles, which focus on the protection of exactly the type of personal “master data” which you describe (name, address, medical data, hobbies, religious beliefs etc.). The non-compliance stems from the terms and conditions which underpin the hosting services, which affects the ability of any company building a service on those platforms to be fully compliant with Data Protection rules and principles and is worth a blog post in and of itself (I’m doing sessions at @CloudCamp in Dublin and Cork next month on it). Among the issues are the question of where your data actually is at any time.

    There is some work ongoing in the EU to look at whether or how the underlying legislation needs to change to embrace cloud computing, but the underlying principles are unlikely to change because they are based on the UN Charter of Human Rights and the EU Charter of Human Rights and, in the case of Europe, on fairly recent and painful memories of what happens when governments or organisations have access to personal data and abuse it.

    However, if we assume that someone bites the bullet and actually builds a cloud environment that complies with the Data Protection principles then your vision would most definitely be very cool indeed.

  25. Jim Harris

    Apr 01, 2010

    Thanks for raising an excellent point, Daragh.

    As one example, let’s look at the seven principles governing the recommendations of the Organisation for Economic Co-operation and Development (OECD) for the protection of personal data:

    (1) Notice — people should be given notice when their data is being collected.

    (2) Purpose — collected data should only be used for the purpose stated.

    (3) Consent — data should not be disclosed without personal consent.

    (4) Security — collected data should be kept secure from any potential abuses.

    (5) Disclosure — people should be informed as to who is collecting their data.

    (6) Access — people should be allowed to access their data and correct any inaccuracies.

    (7) Accountability — people should be able to hold data collectors accountable.

    I realize that these principles are not universally accepted or enforced, and I am not trying to oversimplify this discussion.

    However, if the cloud computing (or future fully semantic web) provider was only providing a platform that people used to manage their own personal data, then wouldn’t all seven of these principles be fulfilled?

    In most cases, the only personal information any company or individual would have about you would be your unique identifier. Under this admittedly far futuristic vision, there would no longer be any “collection” of your personal data, and the only “access” to your personal data would come from your own authorization.

    Best Regards,

    Jim

  26. Mark Montgomery

    Apr 03, 2012

    Hi Jim,

    Just getting around to this fine post. The functionality you envision here is quite similar to what I envisioned, as evidenced by our Kyield individual module, which among other functions allows for the management of relationships between the individual and groups and organizations in the digital workplace. Since I filed the patent application (4/2006), it has matured considerably, especially in terms of automation and analytics.

    Just for the record, I did test the individual module (as well as other parts of our system) beginning almost a decade ago, so most of the basic functionality has been around for a while. It’s just that no partners existed back then to exchange data with in any similar structure.

    Of course, as our efforts in 2010 and 2011 with our semantic healthcare platform have clearly demonstrated, it requires the participation of the other ‘stakeholders’, or perhaps more accurately ‘stewards’, of data. There was essentially no regulation on privacy and control of data until very recently, and I would argue there is confusion over who owns the data and who doesn’t (it seems clear to me and others here, but is conveniently confusing to those with misaligned business models).

    I actually came across very senior execs in healthcare, for example, who laughed at the idea of allowing patients to self-manage their own data–and this after regulation. Personally I think this path is inevitable simply due to the impossible costs to the economy otherwise, but I also have strong evidence that a substantial portion of incumbents, not just in IT but in other industries, are threatened by any such functional system. Even many government agencies that one would think would support citizens and consumers managing their own data, and that often give lip service to the idea, have actually been among the most difficult to deal with — the reason is pretty simple — smart systems that work are also accountable, and those cultures that are accustomed to power over others generally don’t voluntarily give it up.

    It will happen anyway of course.

  27. Kimmo Kontra

    Jun 05, 2012

    Great post and follow-up discussion! A few thoughts & comments.

    Data.com is (sort of) attempting to create a “data locker” for business party data. There are elements of crowdsourcing included, to add one hype term to the soup.

    There are a number of organizations attempting to create a “locker” for product data: the real golden record for product data that multiple parties can and should access for real-time product data. One example is the GS1-operated “Sinfos” data pool, which has gained some popularity in a number of Northern European countries, especially in the retail and wholesale sectors.

    The underlying challenge in any of the “lockers”, be it for personal or business information, is that the data are – as you wrote – an abstraction of reality and thus always imperfect. The abstraction of reality needed by a furniture-moving company might be different from that needed by an express courier (the furniture mover may want to know whether there is an elevator at an address and what kind, while the courier is happy with the street address only).

    The same applies for product data – and I’d assume for personal data also, very much so.

    The answer to cover the differences in abstraction levels is complexity: having semantics that cover just about everything. But complexity is rarely a real answer…

    A simpler approach, of course, is to claim that only the “core” data is stored centrally, and if an organization is interested in something more exotic, then that should be stored outside of the locker, within the company itself. At least the “core” would be self-managed by the true owners. Drawing the lines between the “core” and the rest remains a challenge, and even within the core there may be surprising differences, e.g. among countries.

    No, I am not pessimistic about semantic data and common “lockers” / “pools”. I believe they will come. We need them: having dozens of “digital clones” attacking, even within a single organization, is just not working.

    How to arrive there? Jim, any ideas to overcome differences in abstraction?

    • Jim Harris

      Jun 07, 2012

      Thanks for your comment, Kimmo.

      Data ownership is the central challenge for the data locker concept, whether or not, as you said, the locker is for personal or product (or any other kind of) data.

      I agree with you that only core data could be self-managed by the true owners, but establishing the true owners first requires that others relinquish their claims of ownership. As I mentioned in the post, this is what I see as the fundamental flaw of Customer MDM — customers, who are the true owners of their own master data, are not allowed by the companies they do business with to establish ownership over their own data.

      Your point about the challenges with overcoming the differences in abstraction is an excellent one, but from my perspective, the biggest abstraction that the enterprise data management industry has to overcome is the abstraction of enterprises assuming ownership of their customers’ data. Now, of course, the Single Version of the Truth is that too much money is being made by too many companies (both data management vendors and their clients) off of the perpetuation of that abstraction for such a seismic paradigm shift to occur anytime soon — the Status Quo will always Fight the Future :-(

      Best Regards,

      Jim

