Formulate a high-level plan for how you will proceed with the task of determining the quality of the redundant data elements that will become the master data
American Express is one of the largest multinational companies providing services that help merchants buy and sell goods and services. As a large organization that serves large numbers of clients and runs many transactions, the company stores large volumes of data, including information about clients, information about employees, and transaction records. Effective management of these large volumes of data is essential for ensuring data security and privacy, as well as easier retrieval and manipulation of the data (Dubov, 2007).
The most appropriate approach for managing these large volumes of data about the company's employees, clients, and transactions is master data management. In master data management, the authoritative data is maintained as a single record set referred to as the master copy. The master copy is closely monitored to ensure the security and accuracy of the data (Alex, 2005). As a large multinational organization, American Express will need to ensure the effective management and security of its data. Master data management enhances the accuracy and quality of data by establishing a single trusted source instead of relying on scattered, inconsistent copies.
One of the main reasons for implementing master data management is to eliminate excessive data in the organization, much of which is repetitive; it is therefore important to assess the quality of the data that will be used as the master data (Dyche & Levy, 2011). Examining the large volumes of data at American Express will be supported by a data profiling tool, which evaluates the data, generates interpretable results that help users prioritize the data, and determines which data should be used as the master data. The company will need to put in place a well-designed, effective data profiling tool. The tool will be used during the migration of data and during the data integration process (Dubov, 2007), and it will also be important in determining the accuracy of the data.
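A data profiling tool of the kind described above could, in its simplest form, report per-field completeness, distinct values, and duplicate counts, which is enough to flag candidate fields for the master copy. The sketch below uses illustrative field names and records, not American Express data:

```python
from collections import Counter

def profile(records):
    """Profile a list of record dicts: completeness, distinct values,
    and duplicated values per field (a toy stand-in for a profiling tool)."""
    fields = sorted({f for r in records for f in r})
    report = {}
    for f in fields:
        values = [r.get(f) for r in records]
        non_null = [v for v in values if v not in (None, "")]
        counts = Counter(non_null)
        report[f] = {
            "completeness": len(non_null) / len(records),
            "distinct": len(counts),
            "duplicated": sum(c for c in counts.values() if c > 1),
        }
    return report

# Hypothetical client records drawn from one source system.
clients = [
    {"id": "C1", "name": "Acme Ltd", "city": "London"},
    {"id": "C2", "name": "Acme Ltd", "city": ""},
    {"id": "C3", "name": "Byte Inc", "city": "Leeds"},
]
report = profile(clients)
print(report["name"])  # 'Acme Ltd' appears twice, so duplicated == 2
```

A real profiling tool would add type inference, pattern checks, and cross-field rules, but the same per-field statistics drive the prioritization described above.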
Develop a set of factors that might influence the quality of the data, for which you will devise tests. For example, when dealing with multiple systems that each contain customer data, the challenge might be to match data for a single unique customer
American Express needs to examine the quality of the data to be used as the master copy. One of the most important factors to consider is the behavior of the data: the data selected as master data should relate meaningfully to the other data in the load. In transactional data, for example, the master data about the product, the employee, and the client should all be related. The data should also follow a hierarchical arrangement consistent with the company's policies.
Another factor that should be considered while selecting the data to be used as the master data is its life cycle: how the data is created, read, updated, and deleted. This is referred to as the CRUD cycle of the data, and it should be governed by the company's rules and policies.
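The CRUD cycle can be sketched as a minimal store exposing the four operations; the class and field names below are illustrative assumptions, not part of any real master data product:

```python
class MasterRecordStore:
    """Minimal illustration of the CRUD life cycle for master records."""

    def __init__(self):
        self._records = {}

    def create(self, key, data):
        if key in self._records:
            raise KeyError(f"{key} already exists")  # masters must be unique
        self._records[key] = dict(data)

    def read(self, key):
        return dict(self._records[key])  # return a copy, not the original

    def update(self, key, changes):
        self._records[key].update(changes)

    def delete(self, key):
        del self._records[key]

store = MasterRecordStore()
store.create("C1", {"name": "Acme Ltd"})
store.update("C1", {"city": "London"})
print(store.read("C1"))  # {'name': 'Acme Ltd', 'city': 'London'}
```

In a real deployment each operation would also be checked against the governance rules mentioned above (who may create or delete a master record, and when).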
Another factor that should be considered while determining the master data is the lifetime of the data. Master data should be less volatile than transactional data and should have a longer lifespan; data that is not likely to have a long lifespan is better classified as transactional data.
The other factor that American Express should consider in determining the master data is the complexity of the data. The company should select data that is not likely to be changed or manipulated in the short term (Sarsfield, 2009); asset records are an example of such complex, stable data. The value of the data should also be considered: the data should be of value to the company, and less volatile data is typically of greater long-term value.
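The factors discussed in this section (lifespan, volatility, and reuse across systems) could be combined into a simple scoring rule for deciding whether a data element belongs in the master copy. The thresholds and weights below are illustrative assumptions, not a published standard:

```python
def classify(element):
    """Score a data element on the factors above and classify it as
    'master' or 'transactional'. All thresholds are illustrative."""
    score = 0
    if element["lifespan_years"] >= 5:      # long-lived data leans master
        score += 1
    if element["updates_per_year"] <= 12:   # low volatility leans master
        score += 1
    if element["shared_by_systems"] >= 2:   # reused across systems leans master
        score += 1
    return "master" if score >= 2 else "transactional"

# A client record: long-lived, rarely updated, used by several systems.
print(classify({"lifespan_years": 10,
                "updates_per_year": 2,
                "shared_by_systems": 4}))  # master
```

A real assessment would weight these factors per the company's policies rather than score them equally.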
Develop a high-level plan for resolving differences in data between systems. Among those differences, there will be some data that has data synchronization errors. For example, you are successful in integrating unique client data across systems but discover that the address listed for some of these clients is not the same across all systems
The process of master data management involves several activities, one of them being data synchronization. The synchronization process is likely to encounter errors where consistency has not been maintained between the systems (Alex, 2005). Among the errors likely to occur during synchronization are data repetition and the loss of relationships between data.
One of the most appropriate solutions to errors encountered during data synchronization is data transformation, whereby values are converted from the format of the source system to the format of the destination system. Data transformation involves two parts: data mapping, which specifies how each source field corresponds to a destination field, and code generation, which produces the program that executes the transformation (Sarsfield, 2009). Another approach to handling synchronization errors is normalization, where all the fields in the database are structured and organized to reduce data repetition and dependency. During normalization, the tables in the database are subdivided into smaller, manageable tables (Sarsfield, 2009), which reduces data redundancy and defines the relationships between the tables. Other remedies for errors encountered during the data management process include rule administration, error detection and correction, data distribution, and data classification.
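Data mapping of this kind can be expressed as a declarative table of source-to-destination field names, from which the transformation code is generated (or, as in this sketch, written by hand). Every field name below is hypothetical:

```python
# Declarative mapping from a hypothetical source system's schema
# to the master schema; not American Express's real field names.
FIELD_MAP = {
    "cust_name": "name",
    "cust_addr": "address",
    "acct_no": "account_id",
}

def transform(source_record, field_map=FIELD_MAP):
    """Rename source fields onto the destination schema and tidy
    string values by collapsing runs of whitespace."""
    out = {}
    for src, dst in field_map.items():
        value = source_record.get(src)
        if isinstance(value, str):
            value = " ".join(value.split())
        out[dst] = value
    return out

out = transform({"cust_name": "  Jane  Doe ",
                 "cust_addr": "1 Main St",
                 "acct_no": "A42"})
print(out)  # {'name': 'Jane Doe', 'address': '1 Main St', 'account_id': 'A42'}
```

Keeping the mapping as data rather than code is what lets a code generator produce and regenerate the full transformation when either schema changes.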
Develop a high-level plan for how you will address missing data. For example, some of the systems that contain customer data do not require input, so those fields have been left blank in those systems
The data management process may encounter errors that, in turn, cause the loss of some data. Data considered lost may have been misplaced in the wrong systems or stored in records that cannot easily be retrieved. One method of handling this issue is data consolidation, where all the records from the source systems are consolidated into one record, referred to as the master record. The consolidated record can then be easily identified in the system.
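Consolidation also answers the missing-data scenario in the prompt: where one system left a field blank, the master record can take the first non-blank value found in any other system's record for the same client. The records and field names below are hypothetical:

```python
def fill_missing(records):
    """Merge records describing the same client: for each field, keep
    the first non-blank value seen across the source systems.
    A simple sketch of consolidation, not a production merge rule."""
    merged = {}
    for rec in records:
        for field, value in rec.items():
            if value not in (None, "") and field not in merged:
                merged[field] = value
    return merged

# The same client as seen by two systems, each with a different blank field.
sources = [
    {"id": "C7", "name": "Jane Doe", "email": ""},
    {"id": "C7", "name": "", "email": "jane@example.com"},
]
print(fill_missing(sources))
# {'id': 'C7', 'name': 'Jane Doe', 'email': 'jane@example.com'}
```

"First non-blank wins" is only one possible precedence rule; in practice the governance policy would rank source systems by trustworthiness per field.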
The other method of handling the loss of data is data deduplication, where records stored separately across the company's systems are matched and identified as one master record. Error detection should be used to identify incomplete and inconsistent data and to ensure it is sent back to the source before being loaded into the master data. Data correction requires the user to manually review the records and fix the errors identified (Dyche & Levy, 2011). One of the most important safeguards against data loss during master data management is data governance and cleansing, processes that involve defining the rules the data has to follow (Alex, 2005). Data governance is important because poor master data quality can affect the performance of the company.
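Deduplication plus a survivorship rule is also how the mismatched-address scenario from the earlier prompt gets resolved. A minimal sketch, assuming records share a client id and differ only in address formatting, with "most common normalised address wins" as an illustrative survivorship choice:

```python
from collections import defaultdict

def normalise(addr):
    """Crude address normalisation: lowercase, drop commas, collapse spaces."""
    return " ".join(addr.lower().replace(",", " ").split())

def consolidate(records):
    """Group records by client id and pick a surviving address:
    the most common normalised value wins (ties go to the first seen)."""
    groups = defaultdict(list)
    for rec in records:
        groups[rec["client_id"]].append(rec)
    master = {}
    for cid, recs in groups.items():
        addrs = [normalise(r["address"]) for r in recs if r.get("address")]
        survivor = max(addrs, key=addrs.count) if addrs else None
        master[cid] = {"client_id": cid, "address": survivor}
    return master

# One client as seen by three systems; two addresses agree after normalisation.
feeds = [
    {"client_id": "C1", "address": "12 High St, Leeds"},
    {"client_id": "C1", "address": "12 high st  leeds"},
    {"client_id": "C1", "address": "14 Low Rd, Leeds"},
]
print(consolidate(feeds)["C1"]["address"])  # 12 high st leeds
```

Records whose addresses still disagree after normalisation would be routed to the manual data correction step described above rather than silently merged.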
Alex, B. (2005). Master Data Management & Data Governance. New Delhi: McGraw-Hill.
Dubov, L. (2007). Master Data Management and Customer Data Integration for a Global Enterprise. New York: Morgan Kaufmann.
Dyche, D., & Levy, E. (2011). Customer Data Integration: Reaching a Single Version of the Truth. New Jersey: John Wiley & Sons.
Sarsfield, S. (2009). Data Governance Imperative. London: IT Governance.