Continuous Feedback: A generic data science pipeline

A few years ago, I worked with an organisation that sells automotive intelligence to streamline the way they got insight from data. I came up with a generic data pipeline to explain to the board how their new data science process could work. It was a hit!

Visuals are a great way to explore a concept and explain a process that could otherwise lose folks along the way.

The key to a good data pipeline is that it’s part of an overall process (not shown here) in which you know what the problem is, why it’s important to solve it, and that data is definitely going to help.

The pipeline focuses on continuous feedback – feedback at every key stage of the process. This could be to the problem owner, other teams, or any other stakeholder to keep them informed and fold their feedback back into the pipeline.
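For anyone who thinks better in code than in diagrams, here is a minimal Python sketch of the "feedback at every key stage" idea. It is not the pipeline from the diagram: the stage names (`clean`, `analyse`), the `run_pipeline` function and the `notify` callback are all made up for illustration, and in practice the feedback would go to a person rather than a list.

```python
from typing import Callable, List, Tuple

# A stage is a (name, function) pair; both are hypothetical placeholders.
Stage = Tuple[str, Callable[[list], list]]


def run_pipeline(data: list, stages: List[Stage],
                 notify: Callable[[str, list], None]) -> list:
    """Run each stage in order and report its output before moving on."""
    for name, step in stages:
        data = step(data)
        # Feedback point: share the stage output with the problem owner,
        # another team, or any other stakeholder.
        notify(name, data)
    return data


# Illustrative stages only: drop missing values, then take the mean.
stages: List[Stage] = [
    ("clean", lambda rows: [r for r in rows if r is not None]),
    ("analyse", lambda rows: [sum(rows) / len(rows)]),
]

# Here the "stakeholder" is just a list that records what was shared.
feedback_log = []
result = run_pipeline(
    [4, None, 8],
    stages,
    lambda stage, output: feedback_log.append((stage, output)),
)
```

The point of passing `notify` in as a parameter is that the pipeline itself doesn’t need to know who the feedback goes to; you can swap an email, a dashboard update or a meeting note in for the list without touching the stages.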

So, here’s my blast from the past – feel free to swap out the Domain Data Science step for other processes that make sense, or drop it altogether; whatever works for your situation.

Generic data pipeline v1

How do I manage information better? 50 tips for humanitarian information

50 humanitarian information management tips that apply to just about every human.

Things I love:

  • The colour: Makes the slides feel instantly less “techie”.
  • The layout: A layered approach to onboarding.
  • The concept: Being specific to the humanitarian domain means it can focus on what’s needed there the most.
    • Generic or broad guides can suffer from not being specific enough to be really useful.

Hat tip to my colleague Rory Scott.

 

What do people need from open data in the health sector

Open data in the health sector: Users, stories, products and recommendations is a new report from Giuseppe Sollazzo and David Miller. It asks “What do people need from open data in the health sector?” and sets out some clear recommendations for NHS England.

In it, I describe the confusion involved in finding out how many hospitals there are in the UK. So many public bodies publish their own, slightly different lists. As someone who supports people sharing who they’ve given money to, I’d like to see one single list with each hospital’s identifying number. I’d like that list to be complete, accurate and kept up to date so I can recommend it to people preparing open data.

Read the two key recommendations and thoughts from other people who use open data in the health sector.

Grazing the Open Data Skills Framework

Where are you on your Open Data journey?

From novice to expert, the Open Data Institute’s Open Data Skills Framework has evolved to help guide your Open Data learning experience. With everyone starting at the Explorer stage, learning is balanced so you gain skills and experiences without the fatigue of too much information.

As a trainer and foodie, this subtle tension was familiar; it whetted my appetite to explore a foodie approach to getting the best out of the Open Data Skills Framework.

Sitting comfortably? Let’s begin.

Explorers: an Open Data Explorer has a basic understanding of open data. They can define it, point to examples or case studies and explain how it can be used to create change.

Serving Suggestion

The Amuse Bouche

Focus on mini case studies

Explorers have just started on their open data journey. They may be enthusiastic or apprehensive, or somewhere in-between. New information and ideas may need to be integrated and mulled over.

For explorers, I recommend bite-sized case studies to entice them to learn more, with clear signposting to where they can get more information.

Suggestions
  • The 24 of open data – how open data is changing how we live, work and learn
  • Open data in numbers – a look at open data adoption
  • Crouching tech, hidden data – the open data you use every day

Strategists: an Open Data Strategist is someone who integrates open data into a strategy or manages an open data project. They have the planning and management techniques to drive forward an open data initiative, and they understand the challenges inherent in this process.

Serving Suggestion

The Starter

Focus on Methodology

Strategists know the drill; now they want to deploy it. For strategists, I recommend tips on how to determine what will work for their strategy or project, and what won’t.

This is less about open data itself and more about managing the people, projects, processes and pitfalls that come with introducing new ways of thinking.

Suggestions
  • Open Data Policies and how to get them right
  • Black-box thinking with open data – experimenting your way to smoother adoption
  • Start with why – is Open Data really what you need?

Practitioners: Open Data Practitioners have the practical skills necessary to conduct basic operations on an open dataset. They get hands-on with the data, and are familiar with the tools and techniques necessary to manage and publish an open dataset.

Serving Suggestion

The Taster Plate

Focus on tooling and techniques

Practitioners may range from reluctant to enthusiastic adopters of Open Data, but they want to get the job done.

For practitioners, I recommend revealing what tooling and techniques are out there and what they are for, including what’s new, what’s hot and what’s not.

Suggestions
  • From understanding to deployment – getting to useful open data using CRISP-DM
  • Automate, Improve, Optimise – how to work smarter with open data
  • Quick and dirty – rapid techniques for open data insight

Pioneers: Open Data Pioneers apply their data knowledge to their sector to solve challenges. They can point to sector-specific case studies, identify future trends in the sector and understand the data challenges specific to their sector.

Serving Suggestion

The Pot Luck

Focus on future trends and sharing knowledge

Pioneers are veterans who’ve tackled the challenges of open data, so they are ideally placed to look at where new challenges and opportunities lie.

For pioneers, I recommend a cross-pollination of ideas, challenges and opportunities from other sectors. Here, a focused conversation and guided workshop around where open data challenges lie may encourage contributions from experts and build a shared understanding of challenges.

Suggestions

From the provocative:

  • What has open data ever done for us?
  • What is your open data return on investment?
  • Open data – has it failed?

To the exploratory:

  • What next for open data after Brexit?
  • What lessons can open data learn from open science?

The Open Data Skills Framework provides an ideal opportunity for learners to assess where they are and where they want to be on their open data journey. It also provides a landscape for trainers to adapt, create and innovate around sharing open data skills and techniques.

I hope to deliver one or more of these sessions at the ODI summit and look forward to continuing my own open data journey. Where are you on your Open Data journey?

A Generic Common/Core Data Template – In Trello!

Generic Common/Core Data Template in Trello

You know how sometimes you get a brainwave so clever you could put a tail on it and call it a weasel? [Apologies to Blackadder!]

After this piece on Taming Common/Core Data – in 3 simple steps with Artefact Cards, I wanted to make the concept even more grounded by taking away some of the abstraction that tends to come with the territory. A conversation with a colleague later and I’d developed an ideal visual prototype using Trello to organise core data.

The concept is similar to the Artefact cards except: the board stands in for the box, the lists stand in for the packets and the cards stand in for … the cards. And if you’re wondering what on earth this is all about, go read the piece above.

The Generic Common/Core Data Template is my attempt to wrangle the typical data an organisation captures or uses that is essential for survival: information that it must have to operate, comply with regulation or legislation, and report on its activities, and that is used throughout the organisation.

How do you use it? Copy the board and see what works for your organisation. Remember to ask of each field you consider:

  • What is it for?
  • Is it essential – why?
  • Is it used everywhere in the organisation?
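If you’d rather run those three questions over a spreadsheet of fields than a Trello board, a minimal Python sketch might look like the one below. The `Field` class, the `is_core` rule and the example fields are my own illustration, not part of the template, and treating a field as core only when all three answers are positive is an assumption you may want to relax.

```python
from dataclasses import dataclass


@dataclass
class Field:
    """One candidate field, with answers to the three questions."""
    name: str
    purpose: str            # What is it for?
    essential_reason: str   # Is it essential, and why?
    used_everywhere: bool   # Is it used everywhere in the organisation?


def is_core(field: Field) -> bool:
    """Assumed rule: a field is core only if all three answers hold up."""
    has_purpose = bool(field.purpose.strip())
    has_reason = bool(field.essential_reason.strip())
    return has_purpose and has_reason and field.used_everywhere


# Hypothetical example fields, purely for illustration.
candidates = [
    Field("organisation_id", "Identify each partner",
          "Needed for regulatory reporting", True),
    Field("favourite_colour", "Icebreaker at events", "", False),
]

core_fields = [f.name for f in candidates if is_core(f)]
```

Fields that fail the check aren’t necessarily useless; they just may not belong in the common/core set.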