progrexor/SainsburysTest

SainsburysTest

DPP Technical Assignment

Data Analyst

  1. Use a tool of your choice to create a test data set containing at least 1000 tuples of email addresses, phone numbers, and names. The data should show evidence of at least 5 distinct data quality problems per column, which you will address with a data transformation script.

  2. Write a transformation script (e.g. Python, R, SAS) to clean up your test data. Each of the 15 problems you identified should be clearly identifiable in the code. The output should be a CSV file.
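As an illustration of the second task, here is a minimal sketch of per-column cleaning rules in Python, one of the languages the assignment names. It is not the code in this repository, which is run through sbt. The function names, the specific problems chosen, and the assumption that phone numbers are 11-digit UK numbers starting with 0 are all hypothetical. Each numbered comment marks one data quality problem so it stays identifiable in the code.

```python
import re


def clean_name(raw):
    """Return a normalised name, or None if nothing usable remains."""
    name = re.sub(r"[^A-Za-z'\s-]", "", raw)   # N1: stray digits and punctuation
    name = re.sub(r"\s+", " ", name)           # N2: repeated internal whitespace
    name = name.strip()                        # N3: leading/trailing whitespace
    name = name.title()                        # N4: inconsistent casing
    return name or None                        # N5: empty value -> reject


def clean_email(raw):
    """Return a normalised email address, or None if it is not repairable."""
    email = raw.strip()                        # E1: surrounding whitespace
    email = email.lower()                      # E2: inconsistent casing
    email = re.sub(r"\s+", "", email)          # E3: embedded whitespace
    email = email.replace("(at)", "@")         # E4: obfuscated "@"
    # E5: structurally invalid address (no "@", no domain, ...) -> reject
    if re.fullmatch(r"[a-z0-9._%+-]+@[a-z0-9-]+(\.[a-z0-9-]+)+", email):
        return email
    return None


def clean_phone(raw):
    """Return an 11-digit number starting with 0 (assumed format), or None."""
    digits = re.sub(r"\D", "", raw)            # P1: spaces, dashes, brackets, "+"
    if digits.startswith("0044"):              # P2: "0044" international prefix
        digits = "0" + digits[4:]
    elif digits.startswith("44") and len(digits) == 12:
        digits = "0" + digits[2:]              # P3: "+44" international prefix
    elif len(digits) == 10 and not digits.startswith("0"):
        digits = "0" + digits                  # P4: missing leading zero
    if len(digits) == 11 and digits.startswith("0"):
        return digits
    return None                                # P5: wrong length -> reject
```

A value that cannot be repaired comes back as None, which lets the caller decide whether to reject the whole row.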

==================================================================================================================

Test data

The test data is located in the data folder.
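A data set of the kind the assignment describes could be produced along the following lines. This is a hedged Python sketch, not the tool that generated the files in this repository. The name lists, the example.com domain, the phone number pattern, and the particular defects are all placeholders chosen for illustration.

```python
import random

# Hypothetical seed values; a real generator would use a larger pool.
FIRST = ["alice", "bob", "carol", "dave"]
LAST = ["smith", "jones", "o'neil", "brown"]

# Each entry injects one kind of data quality problem (or none).
DEFECTS = [
    lambda v: v,                        # leave the value clean
    lambda v: "  " + v + " ",           # surrounding whitespace
    lambda v: v.upper(),                # inconsistent casing
    lambda v: v.replace("@", "(at)"),   # obfuscated "@" (emails only)
    lambda v: v + "!!",                 # stray punctuation
    lambda v: "",                       # missing value
]


def make_rows(n=1000, seed=0):
    """Build n (name, email, phone) tuples with randomly injected defects."""
    rng = random.Random(seed)           # seeded so runs are reproducible
    rows = []
    for _ in range(n):
        first, last = rng.choice(FIRST), rng.choice(LAST)
        local = first + "." + last.replace("'", "")
        name = first + " " + last
        email = local + "@example.com"
        phone = "07700" + str(rng.randrange(900000, 901000))
        rows.append(tuple(rng.choice(DEFECTS)(v) for v in (name, email, phone)))
    return rows
```

Seeding the generator keeps the test data stable between runs, which makes it easier to check that a given cleaning rule still catches the rows it was written for.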

To run the code

sbt run

Results

The results of the run appear in the data folder as two files: one containing the cleaned data and one containing the rejected rows.
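The split into a cleaned file and a rejected file can be sketched as follows. This is an illustrative Python outline and not the repository's implementation. The `clean_row` callback, the column header, and any output file names are assumptions; the only convention relied on is that `clean_row` returns None for a row it cannot repair.

```python
import csv


def split_rows(rows, clean_row):
    """Apply clean_row to every row; collect repaired rows and rejects."""
    cleaned, rejected = [], []
    for row in rows:
        result = clean_row(row)
        if result is None:
            rejected.append(row)        # keep the original values for review
        else:
            cleaned.append(result)
    return cleaned, rejected


def write_csv(path, rows, header=("name", "email", "phone")):
    """Write rows to a CSV file with a header line (header is assumed)."""
    with open(path, "w", newline="", encoding="utf-8") as fh:
        writer = csv.writer(fh)
        writer.writerow(header)
        writer.writerows(rows)
```

Calling `split_rows` once and then `write_csv` twice, for example with hypothetical paths such as data/cleaned.csv and data/rejected.csv, yields the two output files. Writing the rejected rows unchanged makes it possible to see exactly which input values failed.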

http://progrexor.github.io/SainsburysTest/
