How to successfully work with large data sets.

By Franco Havenga

“Journalists already know how to ask the right kind of questions when approaching data,” said Lailah Ryklief at a workshop on How to Mine Data for Storytelling at this year’s Global Investigative Journalism Conference (GIJC) in Johannesburg.

Ryklief, a curriculum developer, trainer and mentor at OpenUp’s Data Literacy Programme, facilitated a methodology workshop on Saturday morning to show journalists how to successfully mine data. Data mining is the process of discovering patterns in large data sets.

“Data storytelling is about being hyper aware of every single action we perform and asking ourselves why we are doing it,” said Ryklief.

“Data is abundant and the minute you dive into it, you find a million things you could do.”

Julia Renouprez, a project manager and special data analyst at OpenUp, emphasised this point. “When you are working with data it is so easy to get side-tracked. You can get lost and lose days and weeks processing large data sets.”

Emma Nawanje, a data journalist from Uganda, said she was attending the workshop because there was very little data available in Uganda. “I wanted to find out how to access data and work with it to create a story.”

Janato Jenato, a journalist from Mozambique, pointed to the language barrier: journalists there speak Portuguese, but most of the documents are in English. “We focus on searching for data on the internet, especially working with other databases beyond Google.”

Part of data mining is making the work accessible to the user by building a dashboard to filter through the information.

This is a skill that most journalists do not have, and Ryklief feels internet tutorials are the easiest way to learn it. “In South Africa, journalists tend to use YouTube, not Google,” she added.

“Most of the information and help is sitting in blogs and forums, because that is the preferred mode of sharing in the developer culture.”

OpenUp is a civic technology organisation based in Cape Town. It liberates data as part of the open data movement, making it accessible to citizens.

PHOTO: Lailah Ryklief answers a question about how to mine data for storytelling. Photo: Kayla De Jesus Freitas