Good Quality Open Data - Be Consistent

Be consistent

The golden rule for useful open data is consistency.

Consistent filenames

Consistency means picking a naming strategy for your files then sticking to it. This makes it easy to spot that files are missing or out of place.
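To show how a fixed naming strategy makes gaps easy to spot, here is a minimal Python sketch. The naming pattern (`spending_YYYY-MM.csv`) and the function name `check_filenames` are made up for illustration; swap in whatever strategy you have chosen.

```python
import re

# Hypothetical naming strategy: spending_<YYYY>-<MM>.csv, e.g. spending_2023-04.csv
PATTERN = re.compile(r"spending_(\d{4})-(0[1-9]|1[0-2])\.csv")

def check_filenames(names):
    """Return (out_of_place, missing) for a list of file names."""
    # Anything that does not follow the pattern is out of place.
    out_of_place = [n for n in names if not PATTERN.fullmatch(n)]

    # Collect the (year, month) pairs of the files that do follow it.
    months = sorted(
        (int(m.group(1)), int(m.group(2)))
        for m in map(PATTERN.fullmatch, names)
        if m
    )

    # Walk month by month from the earliest to the latest file and
    # record every month that has no file.
    missing = []
    if months:
        seen = set(months)
        year, month = months[0]
        while (year, month) <= months[-1]:
            if (year, month) not in seen:
                missing.append(f"spending_{year:04d}-{month:02d}.csv")
            year, month = (year, month + 1) if month < 12 else (year + 1, 1)
    return out_of_place, missing
```

Given the January, February and April files plus one called `Spending March 2023.csv`, this would report the oddly named file as out of place and `spending_2023-03.csv` as missing.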

Consistent headers

You’ll also want to keep your table headers the same for each new file so that anyone using your data, for example combining files, can do that easily. Changes to your headers break code and make your files harder to use.
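One way to catch a changed header before you publish is to compare each new file against the first one. The sketch below uses Python's standard `csv` module; the function names and the example column names are assumptions for illustration, and the files are passed in as text to keep the example self-contained.

```python
import csv
import io

def header_of(csv_text):
    """Return the first row of a CSV document as a list of column names."""
    return next(csv.reader(io.StringIO(csv_text)), [])

def header_mismatches(files):
    """Compare every file's header with the first file's header.

    `files` maps a file name to its CSV text. Returns a dict holding
    each file whose header differs, along with the header it has.
    """
    names = list(files)
    if not names:
        return {}
    expected = header_of(files[names[0]])
    mismatches = {}
    for name in names[1:]:
        header = header_of(files[name])
        if header != expected:
            mismatches[name] = header
    return mismatches
```

The comparison is deliberately strict: a change in capitalisation or an added unit such as `Amount (GBP)` counts as a different header, because either would break code that looks columns up by name.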

Consistent content

Finally, keep your contents consistent. ‘12’ and ‘twelve’ aren’t the same thing to a computer, and mixing the two in one column makes it harder to use the information for analysis [see Tidy Data by Hadley Wickham].

Tip: If you can’t do maths on it, it’s text not a number.
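A quick way to apply that tip is to try turning every value in a column into a number and list the ones that fail. This is a minimal sketch, assuming the values arrive as strings (as they do when read from a CSV file); the function name `non_numeric` is made up for illustration.

```python
def non_numeric(values):
    """Return the values that Python cannot parse as a number."""
    bad = []
    for value in values:
        try:
            float(value)
        except ValueError:
            # You can't do maths on this one, so it's text.
            bad.append(value)
    return bad
```

For a column containing `"12"`, `"twelve"`, `"7.5"` and `"N/A"`, this would flag `"twelve"` and `"N/A"`. Note that `float` is fairly forgiving (it accepts surrounding spaces and strings such as `"nan"`), so treat the result as a first pass rather than a full validation.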

See all the tips in one place: Good Quality Open Data

Published by


Consultant: Data Science & Data Analysis – Making the complex, simple
