Wednesday, August 31, 2011

Master Data Management : Open Source Solutions???

The movement towards digital records is generating exponentially growing amounts of tremendously valuable data. But building a system to manage that data and extract value from it requires accepting a paradox: the system needs to be flexible and tolerant while simultaneously enforcing structure and standards. This is hard.

Many of us have felt the pain of Master Data Management (MDM). In large enough enterprises, even something as simple as keeping addresses current across multiple lines of business is a herculean task. Furthermore, it’s a problem that’s difficult to ignore because poor data management can drive substantial expense and leave invaluable opportunities on the table.

Since I joined Health Market Science (HMS), a company focused on MDM for the healthcare space, that sentiment has been reinforced tenfold. Two things became immediately apparent: the complexity of a complete solution and the value of the same.

The problem is complex because there is a temporal aspect to the data. A complete MDM solution doesn't just provide a current view of entities, but a historical perspective as well. It provides that perspective across all the entities and all the relationships between those entities. Then, add in the fact that each source system may have a different schema representing each entity and that the schema itself may evolve over time. Then, pile on all the processing needed to analyze the data: entities need to be standardized, matched using fuzzy matching, and consolidated based on precedence rules. What a fantastic recipe for a fun problem to solve.
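
To make those moving parts concrete, here is a minimal sketch in Python of standardization, fuzzy matching, and precedence-based consolidation with an "as of" date. Everything in it is an assumption for illustration: the source names, the precedence order, the 0.85 threshold, and the Record fields are hypothetical, and a real MDM system would be far more involved.

```python
from dataclasses import dataclass
from datetime import date
from difflib import SequenceMatcher
from typing import List, Optional

# Hypothetical precedence: a lower number wins when sources disagree.
SOURCE_PRECEDENCE = {"licensing_board": 0, "claims_feed": 1, "web_form": 2}


@dataclass(frozen=True)
class Record:
    source: str
    name: str
    address: str
    as_of: date  # date on which the source asserted this version


def standardize(text: str) -> str:
    """Lowercase, drop periods, collapse whitespace (a stand-in for real rules)."""
    return " ".join(text.lower().replace(".", "").split())


def is_match(a: Record, b: Record, threshold: float = 0.85) -> bool:
    """Fuzzy-match two records on their standardized names."""
    ratio = SequenceMatcher(None, standardize(a.name), standardize(b.name)).ratio()
    return ratio >= threshold


def consolidate(records: List[Record], on: date) -> Optional[Record]:
    """Pick the winning record for one entity as it was known on a given date.

    Only versions asserted on or before `on` are considered. Among those,
    the highest-precedence source wins, and ties go to the newest version.
    """
    visible = [r for r in records if r.as_of <= on]
    if not visible:
        return None
    return min(
        visible,
        key=lambda r: (SOURCE_PRECEDENCE[r.source], -r.as_of.toordinal()),
    )


history = [
    Record("web_form", "Dr. John  Smith", "1 Main St", date(2011, 1, 1)),
    Record("licensing_board", "dr john smith", "1 Main Street", date(2010, 6, 1)),
    Record("licensing_board", "John Smith", "9 Oak Ave", date(2011, 8, 1)),
]
current_view = consolidate(history, date(2011, 8, 31))
historical_view = consolidate(history, date(2011, 3, 1))
```

Asking for the view on two different dates returns two different winners, which is the temporal aspect in miniature. Note what the sketch leaves out: per-source schemas, schema evolution, and relationships between entities, which is where most of the real difficulty lives.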

But given the complexity, it isn't difficult to see why companies shy away from an "in-house" MDM solution. Solving these problems isn't easy. You may think that if you get yourself a good Data Architect and a massive Oracle instance, you can crank something out. You’ll soon find that standard relational structures quickly become unwieldy, and you end up in a “meta-meta” world, building out schemas to manage schemas. This can be extremely painful.

I'm wondering if people have seen any open source solutions capable of tackling this problem. From what I can find, the open source "MDM" solutions only tackle the Extract-Transform-Load (ETL) portion of the problem. Unfortunately, that is the easy part. Does anyone know of any open source communities focused on delivering a complete solution, from ETL all the way down into storage?

8 comments:

Skaldo said...

Hi there, I'm just doing research on an MDM solution/tool that we need for our project. Have you found anything in the meantime? Can you provide any interesting links? Thanks a lot, Peter

Ruby said...

These are great points you have here, Brian. Indeed, a mediocre MDM solution can lead an enterprise to hit rock bottom. I believe an efficient MDM considers the importance of understanding how the data got to the current state, and the system must be capable of maintaining MDM hierarchies.

-Ruby Badcoe

france pope said...

This is very good information. I think it's useful advice. Really nice blog. Keep it up!!!

The Dyches said...

Have you looked at Talend? Seems they are touting a full data integration product that includes...an Enterprise Service Bus (ESB), Data Synchronization, Data Governance, Big Data, Data Quality in CRM, Data Quality in Data Warehouse, ETL and MDM

peterjohn said...

I appreciate you sharing this article. Really thank you! Much obliged.
This is one awesome blog article. Much thanks again.



peterjohn said...

I really enjoy the blog. Much thanks again. Really great.
Very informative article post. Really looking forward to reading more. Will read on…



Nasreen Basu said...

Really cool post, highly informative and professionally written. I am glad to be a visitor of this perfect blog. Thank you for this rare info! Regards

Sarah MacAdams said...

When looking for data management solutions for storing, organizing and managing your important data, there are many things you need to take care of.