How do you tame the cost and risk of hoarding data? What does hoarding data even mean? And what does data hoarding have to do with data strategy? Join me for a recap of #CDOSummerSchool #DataLiteracy.
We’re at the penultimate class of Chief Data Officer Summer School with Caroline Carruthers and Peter Jackson in collaboration with Collibra. This free masterclass helps data leaders make change happen and deliver the value their organisation needs around data.
While I’m sad I once again missed the live class, the recording and community have made this recap possible and, dare I say, delightful. Not only are the LinkedIn group and Data Citizens forum buzzing with questions, tips and ideas, but the topic also aligns closely with my love of being organised. It’s a match made in KonMari heaven! So yes, this recap will be about sparking joy as much as it is about delivering value. Buckle up.
This week, we learned about Caroline’s ritual of frankly alarming Sunday television and gained insights into why hoarding data costs your business time and money and increases risk, a no-no for a data mature organisation. Data hoarding comes in many flavours. It manifests when we keep all of the data forever or have lots of duplicate files hanging around with the same data. It’s also clear when we don’t know why we store data in the first place, or what value it brings to our organisation.
We keep hold of data longer than we need to (data retention) for reasons that include:
- Not having a data retention policy or not applying it
- Being afraid of needing the data at some point in the future
- Not monitoring who is using data and what for
- Not understanding what data we actually hold and documenting that
There are lots of reasons, and a data therapist, much like a decluttering expert, can help you pinpoint why. There may even be more than one reason, especially in a large organisation. How we tackle data hoarding will very much depend on why it happens. We want to keep only the data that’s useful, usable and actually in use, while being careful not to introduce more risk by getting rid of data we really should keep.
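For the "not applying it" case, here is a minimal sketch of what checking files against a retention policy could look like. The retention periods and the `past_retention` helper are my own made-up illustrations, not something from the class, and the script only flags files for review rather than deleting anything, so we don't bin data we really should keep.

```python
from datetime import datetime, timedelta
from pathlib import Path

# Hypothetical retention periods per file type (assumption for illustration).
# Real values would come from your organisation's data retention policy.
RETENTION = {
    ".xlsx": timedelta(days=365),
    ".csv": timedelta(days=730),
}


def past_retention(root, now=None):
    """Return files under root whose last-modified time exceeds their retention period."""
    now = now or datetime.now()
    flagged = []
    for path in Path(root).rglob("*"):
        limit = RETENTION.get(path.suffix.lower())
        # Skip folders and file types the policy does not cover.
        if limit is None or not path.is_file():
            continue
        modified = datetime.fromtimestamp(path.stat().st_mtime)
        if now - modified > limit:
            flagged.append(path)  # flag for review only, never delete here
    return sorted(flagged)
```

Last-modified time is only a rough stand-in for "still in use", so the flagged list is a conversation starter with the data owner, not a verdict.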
So why is data hoarding a bad thing? Firstly, it costs money to store data, especially if you have a ton of duplicate Excel files. Remember I mentioned “data smells”, little red flags that tell you there are problems with how data is used? Data hoarding and duplicates are a particularly smelly problem.
So not only are businesses spending money storing data they don’t need to, they might not know what data is most valuable. This happens when data is slightly changed and worked on, then stored in shared drives, attached to emails, stored on laptops, and in lots of other places.
Then there’s risk. GDPR compliance is a common one. Do you have personal information sitting where it shouldn’t be? Has Dave stored an Excel file with personal data somewhere in case he might need it? Risk also comes in the shape of opportunity cost: when it’s hard to find the right data, that wasted time and effort is a risk. There’s also a risk of someone using the wrong data because nothing is written down about the data (no metadata) or because it’s stored in multiple places (there’s no single source of truth).
Fixing data hoarding, like other changes a Chief Data Officer will focus on, needs to be done as part of your data strategy. It could be you start with a quick win, identify data owners, put together a broad policy or even implement a “good enough” solution once you understand the root cause. The data maturity assessment and a survey of your storage, emails and other places people share files (Slack? Teams?) should help unveil the true cost of sitting on data like Smaug on a pile of gold. Whatever you do, work with the organisation and help them understand the true cost of data hoarding. Remember, data is a risk as well as an asset.
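If you want a quick win for the storage survey, a small script can give a first estimate of how much space exact duplicates take up on a shared drive. This is a rough sketch under my own assumptions: the function names `find_duplicates` and `wasted_bytes` are made up for illustration, and it only catches byte-for-byte copies, not the "slightly changed and worked on" versions, which need a human (or a proper tool) to untangle.

```python
import hashlib
from collections import defaultdict
from pathlib import Path


def find_duplicates(root):
    """Group files under root by the SHA-256 hash of their contents.

    Returns only the groups that contain more than one file.
    """
    by_hash = defaultdict(list)
    for path in Path(root).rglob("*"):
        if path.is_file():
            # Reads each file fully into memory; fine for a sketch,
            # but very large files would need chunked hashing.
            digest = hashlib.sha256(path.read_bytes()).hexdigest()
            by_hash[digest].append(path)
    return [sorted(paths) for paths in by_hash.values() if len(paths) > 1]


def wasted_bytes(groups):
    """Bytes taken up by the extra copies in each group (all but one copy)."""
    return sum(paths[0].stat().st_size * (len(paths) - 1) for paths in groups)
```

Pointing `find_duplicates` at a shared folder and passing the result to `wasted_bytes` gives a number you can put in front of the organisation when talking about the cost of hoarding.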
So, we learned that data hoarding does not spark joy. Organised data reduces risks, saves you time and money, and like a well-organised space, sparks joy. I love Caroline’s call to focus your resources on the “crown jewels” of data and not “yesterday’s newspaper”.
Finally, remember that you may need to think about where your data lives. Caroline mentioned having a well organised “garage” with shelves and labels for data you don’t use frequently while having easy-to-reach shelving and storage in your “house” for data you use a lot. Peter mentioned his love of open data and how data for the public good should be available in this way to break down silos and reduce data hoarding within organisations.
In the next section, Peter covered a real life data strategy he’d developed earlier. The data strategy was based on centralising to increase speed and assurance. The key features I loved included:
- Clarity, clarity, clarity: Always being clear about what is being pitched and who it is aimed at.
- Simplicity: Peter did the hard work to make the strategy clear, simple and concise. All the implementation details were saved for appendices.
- Direction of travel: Key milestones and concepts were shared so overall, everyone knows the direction of travel without needing a Gantt chart. The direction of travel also aligned with the business strategy.
- Capability and responsibility: It was useful to see the capabilities for using data explored. Here Peter split the DIKW pyramid by who is responsible for what — the business for Knowledge and Wisdom, the data specialists for Data and Information. Roles and responsibilities of the key data leaders were also visualised.
Seeing a real life data strategy and understanding what Peter would do differently today was helpful in understanding how data strategies evolve, change and align with the business strategy.
In the final part of the session, we had a demonstration of how Collibra supports data intelligence. In my previous role at a large debt charity back in 2013, I’d evaluated Collibra and found it valuable for managing the data dictionary — our vocabulary for data which should reflect the business language.
From finding that 69% of organisations say they struggle to turn data into useful insight, to understanding how Collibra provides a unified platform, the demonstration revealed how much Collibra has grown in the 7 years since I’d seen it. Poor data quality and inconsistent practices are a real challenge for many organisations, so levelling up to tools (other than Excel!) can speed up delivering value and changing behaviour around data.
Once again, a rich week of learning and support that’s invaluable to the work we’re doing right now. Join me next week (actually this weekend!) for my recap of the final class.