Data Cleansing

Knowledge and ability to fix or remove incorrect, corrupted, incorrectly formatted, duplicate or incomplete data within a dataset. It includes fixing structural errors, filtering unwanted outliers, handling missing data and validating data.
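As an illustration only, the activities named above can be sketched in a few lines of Python. The records, field names and the plausible age range of 0 to 120 are invented for this example. Treating out-of-range values as missing and filling gaps with the median are just one possible set of choices, not a prescribed method.

```python
from statistics import median

# Hypothetical raw records; field names and values are invented for illustration.
raw = [
    {"id": 1, "city": " London ", "age": "34"},
    {"id": 2, "city": "london", "age": ""},      # missing age
    {"id": 2, "city": "london", "age": ""},      # duplicate record
    {"id": 3, "city": "Leeds", "age": "290"},    # implausible age
    {"id": 4, "city": "LEEDS", "age": "41"},
]

def clean(records, min_age=0, max_age=120):
    """Return cleaned copies of the records (assumes at least one valid age)."""
    seen, cleaned = set(), []
    for rec in records:
        # Remove duplicates: keep only the first record for each id.
        if rec["id"] in seen:
            continue
        seen.add(rec["id"])
        # Fix structural errors: trim whitespace, normalise capitalisation.
        city = rec["city"].strip().title()
        # Parse the age; an empty string is treated as missing (None).
        age = int(rec["age"]) if rec["age"].strip() else None
        # Filter outliers: values outside the assumed range become missing.
        if age is not None and not (min_age <= age <= max_age):
            age = None
        cleaned.append({"id": rec["id"], "city": city, "age": age})
    # Handle missing data: fill gaps with the median of the known ages.
    fill = median(r["age"] for r in cleaned if r["age"] is not None)
    for r in cleaned:
        if r["age"] is None:
            r["age"] = fill
    # Validate: every record must now satisfy the range rule.
    assert all(min_age <= r["age"] <= max_age for r in cleaned)
    return cleaned

cleaned = clean(raw)
```

In practice the choice between dropping, flagging or imputing a value depends on the dataset and on the analysis it feeds.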

Proficiency Level

Level 1 (Follow)

N/A

Level 2 (Assist)

N/A

Level 3 (Apply)

  • Experience in utilising a range of data cleansing techniques and approaches, such as data wrangling, batch processing, data mining, data enhancement, data harmonisation and data standardisation, for structured and unstructured data.
  • Conduct data cleansing of noisy or incomplete data, or data with established data quality issues, using relevant tools and programming languages.
  • Utilise knowledge of how the interaction of multiple data issues, such as missing data, outliers, multiple values and meaning of data, impacts analysis, and identify an appropriate cleansing approach.
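
The interaction of data issues mentioned in the last point can be shown with a small sketch. The readings, the value 9999 standing in for an outlier and the threshold of 100 are all assumptions made for this example. It compares filling a missing value with the mean before and after the outlier has been dealt with.

```python
from statistics import mean

# Hypothetical readings: None marks a missing value, 9999 is an assumed outlier.
readings = [10, 12, None, 11, 9999, 13]

def impute_mean(values):
    """Replace missing values with the mean of the known values."""
    fill = mean(v for v in values if v is not None)
    return [fill if v is None else v for v in values]

def mask_outliers(values, upper=100):
    """Mark values above the assumed threshold as missing."""
    return [None if (v is not None and v > upper) else v for v in values]

# Imputing first lets the outlier distort the filled-in value.
naive = impute_mean(readings)
# Masking the outlier first gives a fill value close to the other readings.
careful = impute_mean(mask_outliers(readings))

print(naive[2], careful[2])
```

Here the order of the two steps changes the imputed value from 2009 to 11.5, which is why the issues need to be considered together rather than one at a time.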

Level 4 (Ensure)

  • Extensive and/or in-depth knowledge of best-practice data cleansing techniques and approaches, such as data wrangling, batch processing, data mining, data enhancement, data harmonisation and data standardisation, for a variety of data types.
  • Extensive experience in utilising these techniques and approaches for cleansing large, complex or incomplete data, or data with established quality issues.
  • Ability to design and implement a data cleansing approach for complex data and projects.

Level 5 (Strategise)

N/A