Sunday
Sep282008
Consolidating different table structures into one model structure
Sunday, September 28 I have to migrate data from one data model to another data model in the same schema.
Actually
in our current data model we are dealing with 4 different regions
America, Asia, Middle East and Australia to deal with different leagal
systems.
Now, we are changing the data model to deal all the 4 regions into one table instead of 4 different tables.
Can you tell me the different problems i could face in this migration process.
I
am planning to implement pipelined table function, object types and
partitioning logic for extracting the data, changing the data format
and loading into tables.
Is this approach fine to get a better
performance?
Thanks in advance.
2 Comments | | in
Consolidation 
Reader Comments (2)
In order to provide the guidance you are asking for, a lot more detail would be required. In general, the most basic approach is the best, and given the information you have provided, pipelining does sound like an efficient methodology.
The problems you might face are inherently tied to the variances between the given structures. Depending on the relative age of each of the four different systems and the entities who designed and maintained them, it is possible the quality of the data could vary considerably between them, which would increase the complexity. Not knowing how the combined structure differs from the existing structures makes it difficult to estimate any problems you might encounter.
If you could provide more detail on what you're up against, we could likely give more insight on the risks you might encounter.
I will second Jay's comment. In data consolidations the greatest and most commonly overlooked challenge is data quality.
Of course, if the data in your 4 sources has no overlapping objects than you could think of it as 4 separate data migration exercises. Even in that case though there is a risk that what apperas to be the same attribute in different sources really has different meanings. The only bulletproof way to deal with it is to perform extensive profiling of data sources before performing actual data movement.
If, as is common, there is an overlap between data sources, the problem becomes much more difficult. The common winners-losers approach inevitably fails. The right approach is to start by identifying all object overlaps between sources (ie. if this is customer data, are there same customers; if this is employee data, are there same people). Next you would need to implement a set of data quality rules that would check consistency of the data across sources as only then work on the strategy for consolidation.
I have written a short article on the topic that would provide more details if you are interested - "Ensuring Data Quality in Data Consolidations". You can find it at www.dataqualitygroup.com/Resources.htm or in the library of www.dataqualitypro.com.