Profiling and Cleansing
Data Profiling identifies key data elements and examines them for issues. It entails automated and manual review of production data to establish metadata (information about the data) for each critical data entity. The analysis covers every data element and is organized into logical categories based on how the data is used.
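As a rough illustration of the automated side of profiling, the sketch below gathers simple metadata for one field: how many records there are, how many values are missing, and how many distinct values appear. The function name `profile_column`, the `postcode` field, and the sample records are all hypothetical and chosen only for the example; a real profiling exercise would typically cover many more measures (formats, ranges, referential checks).

```python
from collections import Counter

def profile_column(records, field):
    """Collect basic metadata about one field across a list of dict records."""
    values = [rec.get(field) for rec in records]
    # Assumption: None and blank strings both count as missing.
    present = [v for v in values if v is not None and str(v).strip() != ""]
    counts = Counter(present)
    return {
        "field": field,
        "total": len(values),
        "missing": len(values) - len(present),
        "distinct": len(counts),
        "most_common": counts.most_common(1)[0] if counts else None,
    }

# Hypothetical sample of production records.
sample = [
    {"id": 1, "postcode": "2000"},
    {"id": 2, "postcode": ""},
    {"id": 3, "postcode": "2000"},
    {"id": 4, "postcode": None},
    {"id": 5, "postcode": "3000 "},
]

report = profile_column(sample, "postcode")
print(report)
```

Note that the raw values are counted as found, so a value with trailing whitespace is reported as distinct from its trimmed form. Surfacing that kind of inconsistency is the point of profiling; fixing it belongs to cleansing.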
Data Cleansing is conducted once data integrity issues have been identified. Issues can be remedied by manual or automated means in one of three places: in situ (in the source database), as part of the data migration design, or, rarely, in the new system post-migration. Many public system data issues are very complex and may require a mix of these approaches.
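The mix of automated and manual remediation might look something like the following sketch, written as a transformation step that could sit inside a migration design. The validity rule (a postcode is exactly four digits), the field name, and the function names are assumptions made for the example, not rules taken from any particular system. Values that can be fixed mechanically are fixed; everything else is set aside for manual review.

```python
def cleanse_postcode(raw):
    """Return (cleaned_value, needs_manual_review) for one raw value."""
    if raw is None:
        return None, True
    value = str(raw).strip()  # automated fix: trim stray whitespace
    # Assumed rule for illustration only: a valid postcode is four digits.
    if value.isdigit() and len(value) == 4:
        return value, False
    return value or None, True

def cleanse_records(records):
    """Split records into automatically cleansed and manual-review lists."""
    clean, review = [], []
    for rec in records:
        value, flagged = cleanse_postcode(rec.get("postcode"))
        fixed = {**rec, "postcode": value}  # copy; the source record is untouched
        (review if flagged else clean).append(fixed)
    return clean, review

# Hypothetical source records with typical integrity issues.
source = [
    {"id": 1, "postcode": "2000 "},
    {"id": 2, "postcode": ""},
    {"id": 3, "postcode": "20A0"},
    {"id": 4, "postcode": None},
    {"id": 5, "postcode": "3000"},
]

clean, review = cleanse_records(source)
print(len(clean), "cleansed automatically;", len(review), "need manual review")
```

Because the function returns new records rather than changing the source in place, the same logic could in principle be run against the source database, inside the migration, or post-migration, depending on where remediation is chosen to happen.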


