Data, Design and Uncertainty

Originally published on GitHub: Data, Design and Uncertainty

Something that’s been nagging at me is how much of the material I’m reading (on data and design or data and storytelling) focuses on working in an environment of certainty: goals are clear, missions are understood, time pressures seem nonexistent.

Data should serve, support and enable, so how do you navigate that minefield of uncertainty?

Let’s take a look at some approaches from people working in uncertain environments.

Understand the obstacles, yours and theirs

“Your competition is any and every obstacle your customers encounter along their journeys to solving the human, high-level problems your company exists to solve” says Tara-Nicholle Nelson in Obsess Over Your Customers, Not Your Rivals.

Nelson, who has a background in marketing, is reflecting on lessons learned from their approach at MyFitnessPal. The article is about rethinking competitor analysis but resonates for data people working with organisations where uncertainty is the main certainty.

Tara-Nicholle recommends using information (and data!) to discover roadblocks including:

  • user data
  • surveys
  • ethnographic research
  • online listening
  • subject matter experts
  • third-party data

Handily, the article includes some examples from MyFitnessPal to give some context on how to successfully discover and remove roadblocks.

In my role as a consultant, I have some preliminary (but untested) thoughts on roadblocks. The competition here for my customers is that they know my services (data + storytelling + design) are important but they can’t always articulate:

  • their goals for the project
  • how they’ll know the project has been successful
  • how the project supports their mission
  • their own data ecosystem, because they lack the means and framework for doing so (thanks to Ade for highlighting this)

Time (I believe) is another obstacle my customers face. Getting the right people in the room and giving the project enough resources and support to thrive can be a challenge. These are assumptions to test to help me understand my customers better and better help them get what they want out of our relationship.

It’s been useful to learn about obstacles, but I think there are more tricks of the trade, so let’s move on.

Ask the right questions to frame the problem

Asking the right questions to frame the problem is Ben Holliday’s recommendation. Ben is a Chief Design Officer, so the focus of his article is on designers. However, “framing the problem is something that teams really struggle with” resonates for anyone working with uncertainty.

Ben Holliday’s advice is to ask these questions:

  • Why are we doing this work? or What is our motivation for building this product or service?
  • Who are our users? or Who do we think would need to use this product or service?
  • What outcome will users get from this service? or What problem will it solve for people?
  • What outcome are we looking for? or What problem will it solve for our organisation?
  • What are our key metrics? or What do we need to measure against these outcomes?

Ben also covers how to ask the questions, starting from motivations then digging deeper into alignment and finally measuring success.

The questions could be even more useful with some examples and context: why am I asking this and what can we do once we know? I think of these situations like having a giant puzzle — answering each question gives you a piece to build up the picture so you can solve the puzzle. The questions are variations on those found in books on agile, design and user experience (UX), which may be why they are summarised.

The client is the expert, put them front and centre

“It’s not as sexy because it now places the client as the expert instead of the consultant” says Zeroth Labs. The Singapore-based start-up blends anthropology, data, and design with experimentation as a new approach to development consulting.

As with Tara-Nicholle Nelson, research is high on the agenda. Zeroth Labs describe their approach in Using anthropology, data and design thinking to disrupt development consulting as:

  • Conducting their own ethnographic research
  • Applying behavioral science
  • Testing, testing, testing

Essentially Zeroth Labs is understanding their customer’s customer using these methods to “build the capability for developing countries to make life better for citizens”. Another approach that could be useful for people working with developing companies.

Help your customers help themselves, by giving you advice

“Teach a man to fish and he’ll know how to fish — but get him to teach others how to fish, and he might actually get on with some damned fishing.” is Oliver Burkeman’s analysis of psychology research, Advice versus choice.

In Why it’s wise to give people advice, Oliver reflects that “giving advice reacquaints us with the knowledge we possess, which instills confidence, which motivates action”. Unfortunately, as the original research is behind a paywall, I can’t be certain if the interpretation is justified, but let’s go with it.

If it’s the case that asking our customers for advice about their areas of expertise will make them more bought into the project, build confidence and skills, then this sounds a lot like co-creation. One thing I can tentatively acknowledge is that, although I’m a specialist in data, my customers have the expertise, or can at least point me to the people who do.

In my projects, like the CRM and digital transformation project for Freedom Studios, I’ve encouraged a collaborative approach. We started with me leading the way, transitioned to me facilitating, then ended with my customers telling me what they needed. By the end of the project, we had clarity about the goals and capabilities of the CRM, new skills in the organisation and confidence in the system.

Co-creation works when customers are willing to put the time in and use the outputs outside the project. If whatever lessons you’ve learned are put aside after the project, it’s unlikely any changes will stick.

And in conclusion

Here’s what I learned today about approaches to getting clarity as a data professional working in uncertain environments:

  1. Understand the obstacles, yours and theirs
  2. Ask the right questions to frame the problem
  3. The client is the expert, put them front and centre
  4. Facilitate them to advise you
  5. Collaborate and experiment to find the right approach

How do you deal with uncertainty? Comment below or tweet me.

How can open data help our city?

Cities are important urban areas where many of us live, work, study and raise families. There’s a push to make cities smarter.

Why? A smart city is a connected city – a place where Anna can get around easily because she can see her bus and train schedules in real time. A smart city is effective, integrated and innovative – a city that embraces Gina’s startup and makes it easy for Ali to find the best school for his daughter.

A smart city uses open data. 

Open data is knowledge for everyone; it’s information that can be shared with anyone for any purpose without restriction. We aren’t talking sensitive or personal information, we’re talking about the data that drives decisions and is the lifeblood of a city’s civic landscape.

Open data helps cities connect people and organisations, share information, create new tools, products and services. Using open data well means smarter cities. This series covers everything you need to know about open data for smarter cities. First, let’s get you started.

Where do I start with open data for my city?

It’s hard to know where to start with open data when you’re a city. With many moving parts, stakeholders and a huge wish list, getting started can be daunting. Here are the key things to consider:

  1. Start with why: Have some idea of why you’ll need open data and how you’ll measure the success of your initiative. It doesn’t have to be perfect at the start but it will keep you on track.
  2. Think about delivery: You’ll need a way to deliver open data to the people and organisations that will use it. There are several platforms and tools available. The right one will play nicely with your existing platforms. Remember, don’t reinvent the wheel!
  3. Find your audience: Get to know who could use your open data so you can work out their needs. Check FOIA requests – that tells you what people want to know!
  4. Work with your community: You’ll need a community engaged around data, including developers, citizens & businesses – back to those FOIA requests and materials you already publish. What are people interested in? Who are these people?
  5. Make sure your open data all plays well together: Open data doesn’t have to be perfect. Start where you are and keep improving. Think about what’s needed to connect data together from the start: how would you connect data on parks to data on air quality?
  6. Think value, value, value: Think about the benefit of sharing and connecting data. Local government departments will probably be the biggest users and benefit the most from connected open data, so keep them onside.
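To make point 5 a little more concrete, here is a minimal sketch of connecting two datasets through a shared identifier. Everything in it is an assumption for illustration: the field names (area_code, park, reading), the helper join_on and the sample values are made up, and a real city would use whatever identifier its datasets already share.

```python
# Hypothetical sample data: both datasets carry the same made-up "area_code".
parks = [
    {"area_code": "A1", "park": "Park One"},
    {"area_code": "A2", "park": "Park Two"},
    {"area_code": "A3", "park": "Park Three"},
]
air_quality = [
    {"area_code": "A1", "reading": 21.0},
    {"area_code": "A2", "reading": 34.5},
]


def join_on(left, right, key):
    """Attach the matching row from `right` to every row in `left` (a left join)."""
    lookup = {row[key]: row for row in right}
    joined = []
    for row in left:
        match = lookup.get(row[key], {})  # empty dict when nothing matches
        joined.append({**match, **row})
    return joined


parks_with_air = join_on(parks, air_quality, "area_code")

# Parks with no air quality reading show where the two datasets don't connect yet.
unmatched = [row["park"] for row in parks_with_air if "reading" not in row]
print(unmatched)
```

The join only works because both datasets were given the same identifier, which is why it pays to think about connections before publishing.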

For more resources, the Open Data Institute is a good place to start.

3 data wrangling lessons from Arts Council England national portfolio

What questions can we ask? Will this data help solve our problem? Can we use this algorithm or that one?

Welcome to data wrangling 101. Exploring our data before we dive in and start playing with it or reshaping it means more productive data science or data analysis. If you’re lucky, you know enough about the domain to understand the quirks a dataset throws your way or you have someone to badger. On your own with an unfamiliar dataset? That happens too. So here are 3 lessons from wrangling the Arts Council England 2018-2022 national portfolio dataset.

First, a little bit about Arts Council England:

Arts Council England is a public body supporting arts and culture in England. It is funded by public funds from the UK government and the National Lottery. Between 2015 and 2018, it will invest £1.8 billion in arts, museums and libraries. The funds will support art and culture experiences including theatre, digital art, reading, dance, music, literature, crafts and collections.

Why on earth are we interested in the national portfolio dataset?

The National Portfolio programme supports organisations considered by Arts Council England to represent the best of global arts practice. Funding is given over multiple years, currently 3. Between 2015 and 2018, £1 billion will be invested in 663 organisations.

That’s a lot of money and a lot of prestige! I’m still exploring the dataset but here’s what I’ve learned so far.

Lesson 1: Test your assumptions

My first assumption was a bust. One thing it’s useful to know is “Which fields make the data unique?”. This helps us report on stuff like “How many grants were issued by the Arts Council?” and “To how many organisations?”. It was easy to jump in at first glance and say the organisation’s name, the Applicant Name. Unfortunately, an organisation can be awarded under multiple funds.

Ah OK, so maybe Applicant Name and the type of fund, the Funding Band? At first that worked great but then 1 rogue entry popped up… It turns out that most of the time, an organisation gets 1 grant, sometimes 2 but Tyne & Wear Archives & Museums got 3!

Arts Council England – Anomalies

The upshot? Test your assumptions. This might be an anomaly or it might be legitimate. We can’t always tell, so we’re going to have to ask.
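One way to test a “these fields make the data unique” assumption is to count how often each combination of values appears. The sketch below is illustrative only: the column names Applicant Name and Funding Band come from the dataset, but the rows, the organisation names, the band labels and the helper find_duplicate_keys are all made up.

```python
from collections import Counter


def find_duplicate_keys(rows, key_fields):
    """Return each combination of key_fields that appears more than once."""
    counts = Counter(tuple(row[field] for field in key_fields) for row in rows)
    return {key: n for key, n in counts.items() if n > 1}


# Made-up sample rows; only the column names come from the real dataset.
rows = [
    {"Applicant Name": "Org A", "Funding Band": "Band 1"},
    {"Applicant Name": "Org A", "Funding Band": "Band 2"},
    {"Applicant Name": "Org B", "Funding Band": "Band 1"},
    {"Applicant Name": "Org C", "Funding Band": "Band 1"},
    {"Applicant Name": "Org C", "Funding Band": "Band 1"},
]

# Assumption 1: the name alone is unique.
by_name = find_duplicate_keys(rows, ["Applicant Name"])

# Assumption 2: the name plus the funding band is unique.
by_name_and_band = find_duplicate_keys(rows, ["Applicant Name", "Funding Band"])

print(by_name)
print(by_name_and_band)
```

An empty result means the assumption holds for this data. Anything left over is a rogue entry to investigate or ask about.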

Lesson 2: Don’t be afraid to ask

📢 Data isn’t a perfect reflection of the real world.

When we collect, share or use data, we curate it. We make decisions about what and how much detail to include. We can’t assume that data is perfect, so sometimes we have to ask the hard questions like “Why was Tyne & Wear Archives & Museums awarded 3 grants?”

Other oddities cropped up in the data that needed that human touch. Arts Council England share a lot of geographic information. Check out what you can find:

  • Local Authority
  • ACE Region
  • Area
  • ONS Region

They’re all slightly different. Some are clearly internal like ACE Region and others are official geographies like ONS Region. But what about Area? I was stumped, so I asked the very friendly Arts Council England support team.

Here’s what I heard back:

Dear Edafe,

I have heard back from our Digital Team and they advised that the ‘area’ column on the sheet attached by the person making the enquiry refers to Arts Council areas, these are:

  • London – comprising NUTS 1 region of London
  • Midlands – comprising NUTS 1 regions of East Midlands and West Midlands
  • North – comprising NUTS 1 regions of North East, North West and Yorkshire and the Humber
  • South East – comprising NUTS 1 regions of East of England and South East (excluding the county of Hampshire, and Unitary authorities of Isle of Wight, Portsmouth and Southampton)
  • South West – comprising NUTS 1 regions of South West plus the county of Hampshire, and Unitary authorities of Isle of Wight, Portsmouth and Southampton

More information on the areas can be found here: http://www.artscouncil.org.uk/about-us/your-area

The organisations labelled National are certain Sector Support Organisations with a national remit.

The NUTS 1 region which each organisation is located in can be found in the column headed ‘ONS region.’

Hope this helps.

Ah, that’s really handy to know. If we need to, we can map Area to Nomenclature of Units for Territorial Statistics (NUTS) regions or decide if we know enough about geography from other columns and can ignore Area.
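As a rough sketch, the reply can be turned into a lookup that checks whether a row’s Area is consistent with its ONS Region. The mapping follows the email above; the function name area_matches_region and the exact spelling of region names in the dataset are assumptions. Because Hampshire, Isle of Wight, Portsmouth and Southampton sit in the South East NUTS 1 region but belong to the South West area, the check cannot tell those cases apart from the ONS Region alone.

```python
# Arts Council area -> NUTS 1 regions, following the support team's reply.
# Region spellings are assumed and may differ from the dataset's own labels.
AREA_TO_NUTS1 = {
    "London": {"London"},
    "Midlands": {"East Midlands", "West Midlands"},
    "North": {"North East", "North West", "Yorkshire and the Humber"},
    "South East": {"East of England", "South East"},
    # Hampshire, Isle of Wight, Portsmouth and Southampton are in the
    # South East NUTS 1 region but are part of the South West area.
    "South West": {"South West", "South East"},
}


def area_matches_region(area, ons_region):
    """Return True/False for mapped areas, or None when the area has no mapping."""
    allowed = AREA_TO_NUTS1.get(area)
    if allowed is None:
        return None  # e.g. "National", which has no NUTS 1 region
    return ons_region in allowed


print(area_matches_region("North", "North West"))
print(area_matches_region("London", "South West"))
print(area_matches_region("National", "London"))
```

Rows that come back False are worth a second look, and rows that come back None (such as those labelled National) need a human decision.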

The upshot? Don’t be afraid to ask. Making assumptions can come back to bite you. If you can, ask someone who knows so you understand their design choices. You don’t have to do this for every single column; focus on the ones that are most likely to solve your problem. You can also come back as you iterate. Remember, it’s a cycle.

Lesson 3: Remember it’s a cycle

There are a few methodologies, good practices and guidelines that help you punch through the worst bits of data wrangling so you can get to the good bits. You might be data mining or predicting or deep learning. No matter your intended application, you’ll most likely be iterating – going around in a cycle of try, test, understand till you have a good enough answer.

When you first start working with data it can seem overwhelming. Remembering it’s a cycle will keep you sane. You might miss things the first time, that’s OK. That’s why we test and iterate.

In conclusion

I started exploring the Arts Council England 2018-2022 national portfolio dataset to answer a friend’s question and then to streamline my practice. Along the way I made assumptions, backtracked, tried data visualisations that didn’t work and rolled my eyes – a lot. Each iteration, I learned something new and useful about the story of national portfolio funding for the next 3 years. I hope you have too.

Arts Council England - National Portfolio Dataset - Column Count
Visualising Column Count

Featured image: Arts Council England – Sign on the door by Howard Lake (CC BY-SA 2.0)

What’s there? What’s missing? Quick guide to understanding data completeness

When we talk about coverage or completeness, we want to know a couple of things. First, what’s there? and second, what’s missing? We want to survey the land and get a short but complete overview. How do we do this? We look at our data from more than one angle.

A map is not the territory…

Data is a tool

It represents something we’re interested in. That thing could be cars, loans, flowers, or cups. Whatever it is, we want to record or review information about it. Knowing can help us sell the right cars, guide our clients to the right loans, report on the state of the flower industry or manufacture more Instagrammable cups.

A white cup in focus on a table with a blue tablecloth in the background
Reality

Data describes concepts

It represents ideas we’re sharing. There are many styles and shapes of cups in the world, but the icon of a cup is pretty much universally understood. I may not know the style or shape of your cup but I understand “cup-ness”.

Cup Icon by Design Revision
Concept

How does this help us understand completeness?

Let’s take a step back. We’re unlikely to be interested in every cup that ever existed, so we have a scope. Let’s say we’re interested in cups we make and sell. Our universe of cups is limited to just those cups.

We want to know a few things about our cups: the materials we used, how large or small they are, that sort of thing. So we decide on headings or columns for each of the attributes (information about cups) that we’re interested in.

cup schema
Schema

This list of things about cups is the schema. It’s a template that describes what we want to know about cups. It isn’t our data on cups (we’ll add that under the headings) but it gives us some direction about what to record.

Unlike the concept of a cup, the schema of a cup isn’t intuitive. We’d struggle to instantly recognise “cup-ness” by looking over this list. We’ve taken reality, abstracted it to a concept then made that into a schema which is the container for our data.

So back to completeness. When we talk about completeness, we could be talking about the concept or the schema. These are different questions but together they give us insight into the state of our cup data.

  • Concept – How many cups are we reporting?
  • Schema – How many cup attributes are we reporting?

Concepts & Schemas: How are they different?

In general, when we talk about a concept of a cup, we have a list of information we need to understand “cup-ness”. So we may agree it’s not a “cup” unless we have these things: cup#, name and type. That’s close enough to our concept of a cup that we can ask questions about the number of cups. This is the sort of information we use to plan campaigns, make strategic decisions and launch new cups to the market.

In reality, we don’t record everything diligently. We miss things out for a host of reasons. This is even more obvious when we aren’t recording the data ourselves.

Data has gaps

Understanding where those gaps are is important. Gaps affect how we report on concepts. If we’re missing cup names, that reduces the number of cups we report. We use information about gaps to improve our data collection so that we can make better strategic and planning decisions.
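Here is a small sketch of counting both ways. The fields cup#, name and type are the ones named above as necessary for “cup-ness”; the extra columns material and size, the sample rows and the function names are assumptions made up for illustration.

```python
# Fields a row needs before we count it as a "cup" (from the text above).
REQUIRED_FOR_CONCEPT = ["cup#", "name", "type"]
# Hypothetical full schema: the required fields plus two assumed attributes.
SCHEMA = ["cup#", "name", "type", "material", "size"]


def is_filled(value):
    """Treat None and blank strings as gaps."""
    return value is not None and str(value).strip() != ""


def concept_count(rows):
    """Concept view: how many rows have everything needed to be a cup?"""
    return sum(
        all(is_filled(row.get(field)) for field in REQUIRED_FOR_CONCEPT)
        for row in rows
    )


def schema_completeness(rows):
    """Schema view: how many rows have a value for each attribute?"""
    return {
        field: sum(is_filled(row.get(field)) for row in rows)
        for field in SCHEMA
    }


# Made-up sample data with a few gaps.
rows = [
    {"cup#": 1, "name": "Classic", "type": "mug", "material": "ceramic", "size": "350ml"},
    {"cup#": 2, "name": "", "type": "espresso", "material": "glass", "size": None},
    {"cup#": 3, "name": "Tall", "type": "mug", "material": None, "size": None},
]

print(concept_count(rows))
print(schema_completeness(rows))
```

In this made-up data, the row with the missing name drops out of the concept count even though most of its attributes are present, which is exactly how gaps reduce the number of cups we report.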

Takeaway

The upshot? To understand how complete our data is, we survey our data landscape in two ways: by concepts and by schema. We can count conceptual cups or count cup attributes to find what’s there and what’s missing. The two strands help us understand what’s going on in our data.

  • Some things (or attributes) are more important than others, they map to concepts;
  • Some things are conceptual (“cup-ness”) others are schematic (the cup attributes);
  • Some things are more useful for planning and strategy (concept) and others for improving data quality (schema).

Legacy Code Rocks: Open Data with Edafe Onerhime

I just loved chatting with Andrea on the Legacy Code Rocks! podcast. Listen: Open Data with Edafe Onerhime

Edafe Onerhime is a consultant on Data Science and Data Analysis who has over 20 years of experience answering difficult questions about open data. She has helped governments, charities and businesses make better decisions and build stronger relationships by understanding, using and sharing their data. In this episode, we discuss the history of open data, its importance in building communities and its similarities to open source and open science.

Visualisation as collaboration

How can we use visualisation as a tool for collaboration? Insight is best when shared: when every stakeholder not only understands the end result but is also informed about the context and impact. In a nutshell, they understand “What does this mean?”.

This is a proposal I submitted to Joining The Dots, a symposium to share data visualisation knowledge and techniques.

Visualisation is communication. Making communication clear, concise and unambiguous promotes collaboration and discussion around complex and nuanced data. In my talk, I cover visualising data to promote collaboration and as an antidote to reams of text.

I focus on examples such as visualising metadata like the 360Giving data standard to help adopters understand how the standard fits together and the story they can tell with their data.

360Giving Data Schema Visualised