Cleaning up dirty data

Just came across this: Google Refine, a nice example of a product for cleaning up inconsistencies in data.  Unfortunately, part of the linked open data movement is dealing with the reality of inconsistent data.

There are lots of products out there to assist in data cleansing efforts.  I thought this video gave a nice, practical example of the types of issues that come up and how they can be addressed.  (Brought to my attention by @BarbaraStarr on Twitter.)
