15 February 2012

Giving students dirty data

Dirty data is real data as it is collected, before someone gets hold of it and takes out the tricky bits. You won’t find dirty data in textbooks, but it is what real researchers have to deal with. Even amateur researchers and students doing real-life projects will have to deal with it. Yet not much is said about dirty data or what to do with it.

Elements of dirty data

Mistakes – people put down the current year for their year of birth, give their weight in the wrong unit, or add an extra decimal point.

Missing data – […]
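For students who meet dirty data in a spreadsheet or a script, it can help to see how the mistakes described above might be flagged rather than silently deleted. The following is only a minimal sketch in Python. The field names (`birth_year`, `weight_kg`), the function name, and the plausible weight range of 20 to 250 kg are all assumptions made for illustration, not part of any real dataset.

```python
# Hypothetical plausibility limits for a weight recorded in kilograms.
# These cut-offs are illustrative assumptions, not established standards.
MIN_WEIGHT_KG = 20
MAX_WEIGHT_KG = 250


def flag_suspect_record(record, survey_year):
    """Return a list of reasons why a survey record looks suspect.

    `record` is a dict with the hypothetical keys `birth_year` and
    `weight_kg`. An empty list means no check was triggered, which is
    not the same as the record being correct.
    """
    flags = []

    # A birth year equal to the survey year suggests the respondent
    # wrote down the current year out of habit.
    if record.get("birth_year") == survey_year:
        flags.append("birth year equals survey year")

    # A weight far outside the plausible range may be in the wrong
    # unit or may have a misplaced decimal point.
    weight = record.get("weight_kg")
    if weight is not None and not (MIN_WEIGHT_KG <= weight <= MAX_WEIGHT_KG):
        flags.append("weight outside plausible range")

    return flags


if __name__ == "__main__":
    responses = [
        {"birth_year": 1995, "weight_kg": 62.0},
        {"birth_year": 2012, "weight_kg": 6.5},
    ]
    for response in responses:
        print(response, flag_suspect_record(response, survey_year=2012))
```

A range check like this only raises questions. A weight given in pounds can still fall inside the plausible range, so flagged and unflagged records alike need human judgement about what the respondent probably meant.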