Data Cleansing
Knowledge and ability to fix or remove incorrect, corrupted, incorrectly formatted, duplicate or incomplete data within a dataset. It includes fixing structural errors, filtering unwanted outliers, handling missing data and validating data.
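The activities named in this definition can be illustrated with a minimal Python sketch. It is an illustration only, not a prescribed method: the record layout (`name`, `age`), the `cleanse` function and the `max_age` threshold are all hypothetical, and median imputation is just one of several reasonable ways to handle missing values.

```python
from statistics import median

def cleanse(records, max_age=120):
    """Return cleansed copies of `records` (hypothetical dicts with 'name' and 'age')."""
    seen = set()
    cleaned = []
    for rec in records:
        # Structural fix: trim whitespace and normalise capitalisation.
        name = (rec.get("name") or "").strip().title()
        if not name:
            continue  # incomplete record with no usable key: drop it
        # Format fix: ages may arrive as strings; unparseable values become missing.
        try:
            age = int(rec.get("age"))
        except (TypeError, ValueError):
            age = None
        # Outlier filter: treat implausible ages as missing (assumed threshold).
        if age is not None and not (0 <= age <= max_age):
            age = None
        # Duplicate removal, keyed on the normalised name (first occurrence wins).
        if name in seen:
            continue
        seen.add(name)
        cleaned.append({"name": name, "age": age})
    # Missing data: impute with the median of the observed ages, if any exist.
    known = [r["age"] for r in cleaned if r["age"] is not None]
    if known:
        fill = median(known)
        for r in cleaned:
            if r["age"] is None:
                r["age"] = fill
    # Validation: every surviving age must be missing or within the allowed range.
    for r in cleaned:
        if r["age"] is not None and not (0 <= r["age"] <= max_age):
            raise ValueError(f"validation failed for {r['name']}")
    return cleaned
```

In practice the order of these steps and the choice of imputation rule depend on the dataset, so a sketch like this would normally be adapted rather than reused as is.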
Proficiency Level
Level 1 (Follow)
N/A
Level 2 (Assist)
N/A
Level 3 (Apply)
- Experience in utilising a range of data cleansing techniques and approaches, such as data wrangling, batch processing, data mining, data enhancement, data harmonisation and data standardisation, for structured and unstructured data.
- Conduct data cleansing of noisy, incomplete data or data with established data quality issues using experience of relevant tools and programming languages.
- Utilise knowledge of how the interaction of multiple data issues, such as missing data, outliers, multiple values and ambiguous meaning of data, impacts analysis, and identify an appropriate cleansing approach.
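As an illustration of how data issues can interact, the following Python sketch shows an unhandled outlier distorting the values used to fill missing data. The `impute` helper and the sample `readings` are hypothetical, and the comparison of mean against median imputation is only one example of such an interaction.

```python
from statistics import mean, median

def impute(values, strategy):
    """Fill None entries using the mean or median of the observed values."""
    observed = [v for v in values if v is not None]
    fill = mean(observed) if strategy == "mean" else median(observed)
    return [fill if v is None else v for v in values]

# Hypothetical readings: 990 stands in for a data-entry error (an outlier).
readings = [10, 12, None, 11, 990, None, 13]

mean_filled = impute(readings, "mean")      # gaps filled with a value pulled up by 990
median_filled = impute(readings, "median")  # gaps filled with a typical reading
```

Here the mean-based fill is dominated by the single outlier, while the median-based fill stays close to the typical readings, which is one reason the order and choice of cleansing steps matter.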
Level 4 (Ensure)
- Extensive and/or in-depth knowledge of best-practice data cleansing techniques and approaches, such as data wrangling, batch processing, data mining, data enhancement, data harmonisation and data standardisation, for a variety of data types.
- Extensive experience in utilising these techniques and approaches for cleansing complex, large, incomplete data or data with established quality issues.
- Ability to design and implement a data cleansing approach for complex data and projects.
Level 5 (Strategise)
N/A