Good Quality Open Data - Be Consistent

Be consistent

The golden rule for useful open data is consistency.

Consistent filenames

Consistency means picking a naming strategy for your files, then sticking to it. This makes it easy to spot files that are missing or out of place.
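To show why a fixed naming strategy helps, here is a minimal Python sketch. It assumes a made-up convention of one file per month called spending-YYYY-MM.csv; the pattern and the function name check_filenames are illustrative, not something your publisher will necessarily use.

```python
import re

# Hypothetical naming strategy: spending-YYYY-MM.csv, one file per month.
PATTERN = re.compile(r"^spending-(\d{4})-(\d{2})\.csv$")


def check_filenames(filenames, year):
    """Return (missing_months, out_of_place) for one year of monthly files."""
    found = set()
    out_of_place = []
    for name in filenames:
        match = PATTERN.match(name)
        if match and int(match.group(1)) == year and 1 <= int(match.group(2)) <= 12:
            found.add(int(match.group(2)))
        else:
            # Anything that does not follow the convention is flagged.
            out_of_place.append(name)
    missing = [month for month in range(1, 13) if month not in found]
    return missing, out_of_place
```

Because every name follows one pattern, a gap in the months or a stray file shows up straight away. With ad hoc names, a check like this is much harder to write.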

Consistent headers

You’ll also want to keep your table headers the same for each new file so that anyone using your data, for example combining files, can do that easily. Changes to your headers break code and make your files harder to use.
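As a rough illustration, the sketch below compares the header row of each file against the first file's header. It uses Python's standard csv module; the function names read_header and find_header_mismatches, and the example column names, are assumptions made up for this example.

```python
import csv


def read_header(lines):
    """Return the first row of a CSV, given an iterable of lines (an open file works)."""
    return next(csv.reader(lines), [])


def find_header_mismatches(headers_by_file):
    """Compare every file's header with the first file's header.

    Returns {filename: header} for the files that differ.
    """
    names = list(headers_by_file)
    if not names:
        return {}
    expected = headers_by_file[names[0]]
    return {
        name: headers_by_file[name]
        for name in names[1:]
        if headers_by_file[name] != expected
    }


# Example with two hypothetical monthly files, given here as lists of lines.
headers = {
    "jan.csv": read_header(["date,supplier,amount\n", "2016-01-04,Acme,12\n"]),
    "feb.csv": read_header(["Date,Supplier,Amount (GBP)\n"]),
}
mismatches = find_header_mismatches(headers)
```

Here feb.csv is reported because its header changed case and wording. Code that looks up the "amount" column would fail on that file.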

Consistent content

Finally, keep your contents the same. ‘12’ and ‘twelve’ aren’t the same thing to a computer, and mixing them in one column makes the information harder to use for analysis [see Tidy Data by Hadley Wickham].

Tip: If you can’t do maths on it, it’s text not a number.
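One way to test a column against that tip is to try converting each value to a number and list the ones that fail. This is a minimal sketch, assuming the values arrive as text, as they do when read from a CSV; the function name non_numeric_values is made up for illustration.

```python
def non_numeric_values(values):
    """Return the values that Python cannot read as a number."""
    bad = []
    for value in values:
        try:
            float(value)
        except ValueError:
            # 'twelve' or 'n/a' end up here; '12' and '7.5' do not.
            bad.append(value)
    return bad


# A hypothetical 'amount' column with mixed content.
column = ["12", "twelve", "7.5", "n/a"]
problems = non_numeric_values(column)
```

In this example, problems contains 'twelve' and 'n/a', the two entries you would want to fix before publishing.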

See all the tips in one place: Good Quality Open Data

Published by

ekoner

Consultant: Data Science & Data Analysis || Data confusion → Organisation insight || I make the complex, simple
