August 16 2012

08:27

Hyperlocal Voices: Matt Brown, Londonist

The fifth in our new series of Hyperlocal Voices explores the work done by the team behind the Londonist. Despite having a large geographic footprint – Londonist covers the whole of Greater London – the site is full of ultra-local content, as well as featuring stories and themes which span the whole of the capital.

The site is run by two members of staff and a raft of volunteers. Editor Matt Brown gave Damian Radcliffe an insight into its breadth and depth.

1. Who were the people behind the blog?

Everyone in London! We’re a very open site, involving our readers in the creation of many articles, especially the imagery. But more prosaically, we have an editorial team of 5 or 6 people, plus another 20 or so regular contributors. I act as the main content editor for the site.

We’re more than a website, though, with a weekly podcast (Londonist Out Loud, ably presented and produced by N Quentin Woolf), a separate Facebook presence, a daily e-newsletter, 80,000 Twitter followers, the largest Foursquare following in London (I think), a Flickr pool with 200,000 images, several e-books, occasional exhibitions and live events every few weeks. The website is just one facet of what we do.

2. What made you decide to set up the blog?

I actually inherited it off someone else, but it was originally set up as a London equivalent of certain sites in the US like Gothamist and Chicagoist, which were riding the early blogging wave, providing news and event tips for citizens. There was nothing quite like it in London, so my predecessor wanted to jump into the gap and have some fun.

3. When did you set up the blog and how did you go about it?

It dates back to 2004, when it was originally called the Big Smoker. Before too long, it joined the Gothamist network, changing its name to Londonist.

We now operate independently of that network, but retain the name. It was originally set up on the Movable Type publishing platform, but we moved to WordPress a couple of years ago.

4. What other blogs, bloggers or websites influenced you?

Obviously, the Gothamist sites originally. But we’re now more influenced by the wonderful ecosystem of London blogs out there, all offering their own take on life in the capital.

The best include Diamond Geezer (an incisive and often acerbic look at London), Ian Visits (a mix of unusual site visits and geeky observation) and Spitalfields Life (a daily interview with a local character). These are just three of the dozens of excellent London sites in my RSS reader.

5. How did – and do – you see yourself in relation to a traditional news operation?

Complementary rather than competitors. We cover three or four news stories a day, sometimes journalistically, but our forte in this area is more in commentary, features and reader involvement around the news.

And news is just a small part of what we do – most of the site is event recommendation, unusual historical insights, street art, food and drink, theatre reviews and the like. As an example of our diversity, a few months back we ran a 3,000-word essay on the construction of Hammersmith flyover by an engineering PhD candidate, and the very next item was about a beauty pageant for chubby people in Vauxhall.

6. What have been the key moments in the blog’s development editorially?

I think most of these would be technologically driven. For example, when Google mapping became possible, our free wifi hotspots and V2 rocket maps greatly increased site traffic.

Once Twitter reached critical mass we were able to reach out to tens of thousands of people, both for sourcing information for articles and pushing our finished content.

The other big thing was turning the site into a business a couple of years ago, so we were able to bring a little bit of money in to reinvest in the site. The extra editorial time the money pays for means our output is now bigger and better.

7. What sort of traffic do you get and how has that changed over time?

We’re now seeing about 1.4 million page views a month. It’s pretty much doubling year on year.

8. What is / has been your biggest challenge to date?

Transforming from an amateur site into a business.

We started taking different types of advertising, including advertorial content, and had to make sure we didn’t alienate our readers. It was a tricky tightrope, but I’d hope we’ve done a fairly good job of selecting paid-for content only if it’s of interest to a meaningful portion of our readers, and then making sure we’re open and clear about what is sponsored content and what is editorially driven.

9. What story, feature or series are you most proud of? 

I’m rather enjoying our A-Z pubcrawl at the moment, and not just because of the booze.

Basically, we pick an area of town each month beginning with the next letter of the alphabet (so, Angel, Brixton, City, Dalston, etc.). We then ask our readers to nominate their favourite pubs and bars in the area, via Twitter, Facebook or comments.

We then build a Google map of all the suggestions and arrange a pub crawl around the top 4.

Everyone’s a winner because (a) we get a Google-friendly article called, for example, ‘What’s the best pub in Farringdon?‘, with a map of all the suggestions; (b) we get the chance to use our strong social media channels to involve a large number of people – hundreds of votes every time; (c) the chance to meet some of our readers, who are invited along on the pub crawl, and who get a Londonistbooze badge as a memento; (d) a really fun night out round some very good pubs.

The next part (G for Greenwich) will be announced in early September.

10. What are your plans for the future?

We’re playing around with ebooks at the moment, as a way to sustain the business directly through content. We’ve published a book of London pub crawls (spotting a theme here?), and a history of the London Olympics by noted London author David Long. Our next ebook will be a collection of quiz questions about the capital, drawn from the numerous pub quizzes we’ve run over the years.

Basically, we’re looking to be the best organisation for finding out about London in any and every medium we can get our hands on.

August 07 2012

11:51

A case study in online journalism part 3: ebooks (investigating the Olympic torch relay)

8000 Holes - book cover

In part one I outlined some of the data journalism processes involved in the Olympic torch relay investigation; in part two I explained how verification, SEO and ‘passive aggressive newsgathering’ played a role. This final part looks at how ebooks offered a new opportunity to tell the story in depth – and publish while the story was still topical.

Ebooks – publishing before the event has even finished

After a number of stories from a variety of angles I reached a fork in the road. It felt like we had been looking at this story from every angle. More than one editor, when presented with an update, said that they’d already ‘done the torch story’. I would have done the same.

But I thought of a quote on persistence from Ian Hislop that I’d published on the Help Me Investigate blog previously. “It is saying the same true thing again and again and again and again until the penny drops.”

Although it sometimes felt like we might be boring people with our insistence on continuing to dig, we needed, I felt, to say the same thing again. Not the story of ‘Executive carries the torch’ but how that executive and so many others came to carry it, why that mattered, and what the impact was. A longform report.

Traditionally there would have been no space for this story. It would be too long for a newspaper or magazine, far too short for a book – where the production timescale would have missed any topicality anyway.

But we didn’t have to worry about that – because we had e-publishing.

It still seems incredible to me that we could write up and publish a book on the missed promises of the Olympic torch relay before the relay had even finished. Indeed: to also publish the day before the book’s main case study was likely to run.

But if we wanted to do that, we had about a week to hit that deadline, with important holes in our narrative, and working largely in our spare time.

First, we needed a case study to represent the human impact of the corporate torchbearers and open our book. Quite a few had been mentioned in local newspapers when they discovered that less-than-inspirational individuals had taken their place, but HMI contributor Carol Miers found one who couldn’t have been more deserving: Jack Binstead had received the maximum number of nominations; he was just 15 (half of torchbearer places were supposed to go to young people – they didn’t); and he was tipped to go to the next Paralympics.

We also needed to find out if there was an impact on the genuinely inspirational people who did get to carry the torch – I had been chasing a couple when Geoff Holt came through the site’s comments (see above). That was our ending.

For the middle we needed to pin down some of the numbers around the relay. Comments from earlier stories had indicated that some people didn’t see why it was important that executives were carrying the torches – unaware, perhaps, that promises had been made about where places would go, and what sort of stories torchbearers should have.

In particular, the organisers had promised that 90% of places would be available to the general public and that 50% of places would go to young people aged 12-24. I had to nail down where each chunk of tickets had gone – and at how many points they had been taken away from availability to the ‘general public’. Ultimately, the middle of the book would describe how that 90% got chipped away until it was more like 75%.
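To make the scale of that gap concrete, here is a minimal sketch of the arithmetic. It assumes a round total of 8,000 torchbearer places (the figure in the book's title); the constant and function names are illustrative only, and the exact totals used in the investigation may differ.

```python
# Assumption: a round total of 8,000 torchbearer places, taken from the
# book's title. The real total may differ slightly.
TOTAL_PLACES = 8000


def places_for(percent: int, total: int = TOTAL_PLACES) -> int:
    """Convert a whole-number percentage of the total into a count of places."""
    return total * percent // 100


promised_public = places_for(90)   # places promised to the general public
actual_public = places_for(75)     # roughly what remained after the chipping away
lost_to_public = promised_public - actual_public
youth_target = places_for(50)      # places promised to 12-24 year olds

print(f"Promised to the public: {promised_public}")
print(f"Closer to reality:      {actual_public}")
print(f"Difference:             {lost_to_public}")
print(f"Youth target:           {youth_target}")
```

On these assumed figures, the drop from 90% to 75% amounts to about 1,200 places.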

That middle would then be fleshed out with the themes around what happened to the other 25%: essentially some of the stories we’d already told, plus some others that filled out the picture.

Writing in this way allowed us to go beyond the normal way of writing – shock at a revelation – to identifying where things went wrong and how. For all the anger at corporate sponsors for their allocation of torch relay places, it was ultimately LOCOG’s responsibility to approve nominations, to publish 8,000 “inspirational” nomination stories, and to meet the promises that they had made about how they would be allocated. The buck stopped there.

Thanks to the iterative way we had worked so far – publishing each story as it came, asking questions in public, building an online ‘footprint’ that others could find, establishing collaborative relationships and bookmarking to create an archive – we met our deadline.

It was a timescale which allowed us to tap into interest in the relay while it was still topical, and while executive torchbearers were still carrying the torch.

8,000 Holes: How the 2012 Olympic Torch Relay Lost its Way was published on day 66 of the 70-day Olympic torch relay. All proceeds went to the Brittle Bone Society, of which Jack is an ambassador. The publishers – Leanpub – agreed to give their commission on the book to the charity as well. This was all organised over email in 24 hours a couple of days before the book went live.

We organised an interview with Jack Binstead which was published in The Guardian the day after – the day that the torch was to go through his home town and the day that he would be flying out of the country to avoid it. An interview with Journalism.co.uk on the ebook itself – Help Me Investigate’s first – was published the same day.

We published data on where torchbearer places went in The Guardian’s datablog two days after that, and serialised the book throughout the week, along with some additional pieces – for example, on how LOCOG missed their target of 50% of places going to young people by over 1,000 places. A lengthier interview with Jack and his mother was published at the end of the week.

In theory this should have captured interest in the torch relay at just the right time – but I think we misjudged two factors.

The first was beyond our control: the weather changed.

Until now, the weather had been awful. When it changed, the mood of the country changed, and there was less interest in the missed promises of the Olympic torch relay. But it also coincided with another change: the final week of the torch relay was also the last few days before the opening ceremony – and as the weather changed, attention shifted to the Olympic Games itself.

The torch relay, which had been squeezed dry of every possible angle for nine weeks, was – finally – yesterday’s news. It was no longer about who was carrying the torch, but about where that torch was going, and who might carry the last one.

Still, the book raised money for a deserving charity, and its story is not over. When the next torch relay comes around, I wonder, will it benefit from a resurgence of interest?

Get the free ebook for the full story: 8,000 Holes: How the 2012 Olympic Torch Relay Lost its Way - Leanpub.com/8000holes

 

August 06 2012

07:38

A case study in online journalism part 2: verification, SEO and collaboration (investigating the Olympic torch relay)

corporate Olympic torchbearers image

Having outlined some of the data journalism processes involved in the Olympic torch relay investigation, in part 2 I want to touch on how verification and ‘passive aggressive newsgathering’ played a role.

Verification: who’s who

Data in this story not only provided leads which needed verifying, but also helped verify leads from outside the data.

In one example, an anonymous tip-off suggested that both children of one particular executive were carrying the Olympic torch on different legs of the relay. A quick check against his name in the data suggested this was so: two girls with the same unusual surname were indeed carrying the torch. Neither mentioned the company or their father. But how could we confirm it?

The answer involved checking planning applications, Google Streetview, and a number of other sources, including newsletters from the private school that they both attended which identified the father.

In another example, I noticed that one torchbearer had mentioned running alongside two employees of Aggreko, who were paying for their torches. I searched for other employees, and found a cake shop which had created a celebratory cake for three of them. Having seen how some corporate sponsors used their places, I went on a hunch and looked up the board of directors, searching in the data first for the CEO Rupert Soames. His name turned up – with no nomination story. A search for other directors found that more than half the executive board were carrying torches – which turned out to be our story. The final step: a call to the company to get a reaction and confirmation.
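The board check described above can be sketched in a few lines of Python. This is a hedged illustration rather than the method actually used: the names, field names (`name`, `story`) and records below are all hypothetical placeholders, and a match is only a lead to verify, not a confirmation.

```python
import unicodedata


def normalise(name: str) -> str:
    """Lower-case a name, strip accents and collapse extra whitespace."""
    decomposed = unicodedata.normalize("NFKD", name)
    without_accents = "".join(
        ch for ch in decomposed if not unicodedata.combining(ch)
    )
    return " ".join(without_accents.lower().split())


def find_board_matches(directors, torchbearers):
    """Return torchbearer rows whose name matches a director's name.

    Each result records whether the row has a nomination story, since a
    missing story was one of the warning signs in the investigation.
    """
    wanted = {normalise(d) for d in directors}
    matches = []
    for row in torchbearers:
        if normalise(row["name"]) in wanted:
            story = (row.get("story") or "").strip()
            matches.append({"name": row["name"], "has_story": bool(story)})
    return matches


# Hypothetical data: none of these are real people or real records.
directors = ["Alex Example", "Sam Placeholder", "Jo Sample"]
torchbearers = [
    {"name": "alex  example", "story": ""},
    {"name": "Chris Other", "story": "Nominated for years of volunteering."},
    {"name": "Jo Sample", "story": ""},
]

matches = find_board_matches(directors, torchbearers)
share_of_board = len(matches) / len(directors)
print(matches)
print(f"{share_of_board:.0%} of the board appear in the data")
```

Exact name matching like this will produce both false positives (common names) and misses (nicknames, middle names), which is why each hit still needs checking against other sources.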

The more that we knew about how torch relay places had been used, the easier it was to verify other torchbearers. As a pattern emerged of many coming from the telecoms industry, that helped focus the search – but we had to be aware that having suspicions ‘confirmed’ didn’t mean that the name itself was confirmed – it was simply that you were more likely to hit a match that you could verify.

Scepticism was important: at various times names seemed to match with individuals but you had to ask ‘Would that person not use his title? Why would he be nominated? Would he be that age now?’

Images helped – sometimes people used the same image that had been used elsewhere (you could match this with the Google Images ‘match image’ feature, then refine the search). At other times you could match with public photos of the person as they carried the torch.

This post on identifying mystery torchbearers gives more detail.

Passive aggressive newsgathering

Alerts proved key to the investigation. Early on I signed up for daily alerts on any mention of the Olympic torch. 95% of stories were formulaic ‘local town/school/hero excited about torch’ reports, but occasionally key details would emerge in other pieces – particularly those from news organisations overseas.

Google Alerts for Olympic torch

It was from these that I learned how many places exactly Dow, Omega, Visa and others had, and how many were nominated. It was how I learned about torchbearers who were not even listed on the official site, about the ‘criteria’ that were supposed to be adhered to by some organisations, about public announcements of places which suggested a change from previous numbers, and more besides.

As I came across anything that looked interesting, I bookmarked and tagged it. Some would come in useful immediately, but most would only come in useful later when I came to write up the full story. Essentially, they were pieces of a jigsaw I was yet to put together. (For example, this report mentioned that 2,500 employees were nominated within Dow for just 10 places. How must those employees feel when they find the company’s VP of Olympic operations took up one of the few places? Likewise, he fit a broader pattern of sponsorship managers carrying the torch.)
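The bookmark-and-tag habit is easy to picture as a small data structure. The sketch below is an assumption about how such an archive could be modelled, not a description of the bookmarking service actually used; the class name, URLs and tags are hypothetical.

```python
from collections import defaultdict


class BookmarkArchive:
    """A tiny in-memory archive of bookmarks, indexed by tag."""

    def __init__(self):
        self._by_tag = defaultdict(list)

    def add(self, url, note, tags):
        """Store one bookmark under every tag it carries."""
        entry = {"url": url, "note": note, "tags": sorted(tags)}
        for tag in entry["tags"]:
            self._by_tag[tag].append(entry)
        return entry

    def with_tags(self, *tags):
        """Return bookmarks carrying all of the given tags, in insertion order."""
        if not tags:
            return []
        first, *rest = tags
        candidates = self._by_tag.get(first, [])
        return [e for e in candidates if all(t in e["tags"] for t in rest)]


# Hypothetical entries; the URLs are placeholders.
archive = BookmarkArchive()
archive.add("https://example.org/report-a",
            "2,500 employees nominated for 10 places", {"dow", "numbers"})
archive.add("https://example.org/report-b",
            "Sponsorship manager carries torch", {"dow", "executives"})
archive.add("https://example.org/report-c",
            "Places announced by sponsor", {"numbers"})

for entry in archive.with_tags("dow", "numbers"):
    print(entry["url"], "-", entry["note"])
```

Tagging each item as it arrives is what makes the later write-up quick: pulling everything tagged with a sponsor or a theme gathers the jigsaw pieces in one place.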

I also subscribed to any mention of the torch relay in Parliament, and any mention in FOI requests.

SEO – making yourself findable

One of the things I always emphasise to my students is the importance of publishing early and often on a subject to maximise the opportunities for others in the field to find out – and get in touch. This story was no exception. From the earliest stages through to the last week of the relay, users stumbled across the site as they looked for information on the relay – and passed on their concerns and leads.

It was particularly important with a big public event like the Olympic torch relay, which generated a lot of interest among local people. In the first week of the investigation one photographer stumbled across the site because he was searching for the name of one of the torchbearers we had identified as coming from adidas. He passed on his photographs – but more importantly, made me aware that there may be photographs of other executives who had already carried the torch.

That led to the strongest image of the investigation – two executives exchanging a ‘torch kiss’ (shown at the top of this post) – which was in turn picked up by The Daily Mail.

Other leads kept coming. The tip-off about the executive’s daughters mentioned above; someone mentioning two more Aggreko directors – one of whom had never been published on the official site, while the other had been listed and then removed. Questions about a Polish torchbearer who was not listed on the official site or, indeed, anywhere on the web other than the BBC’s torch relay liveblog. Challenges to one story we linkblogged, which led to further background that helped flesh out the processes behind the nominations given to universities.

When we published the ‘mystery torchbearers’ with The Guardian some got in touch to tell us who they were. In one case, that contact led to an interview which closed the book: Geoff Holt, the first quadriplegic to sail single-handed across the Atlantic Ocean.

Collaboration

I could have done this story the old-fashioned way: kept it to myself, done all the digging alone, and published one big story at the end.

It wouldn’t have been half as good. It wouldn’t have had the impact, it wouldn’t have had the range, and it would have missed key ingredients.

Collaboration was at the heart of this process. As soon as I started to unearth the adidas torchbearers I got in touch with The Guardian’s James Ball. His report the week after added reactions from some of the companies involved, and other torchbearers we’d simultaneously spotted. But James also noticed that one of Coca-Cola’s torchbearers was a woman “who among other roles sits on a committee of the US’s Food and Drug Administration”.

It was collaborating with contacts in Staffordshire which helped point me to the ‘torch kiss’ image. They in turn followed up the story behind it (a credit for Help Me Investigate was taken out of the piece – it seems old habits die hard), and The Daily Mail followed up on that to get some further reaction and response (and no, they didn’t credit the Stoke Sentinel either). In Bournemouth and Sussex local journalists took up the baton (sorry), and the Times Higher did their angle.

We passed on leads to Ventnor Blog, whose users helped dig into a curious torchbearer running through the area. And we published a list of torchbearers missing stories in The Guardian, where users helped identify them.

Collaborating with an international mailing list for investigative journalists, I generated datasets of local torchbearers in Hungary, Italy, India, the Middle East, Germany, and Romania. German daily newspaper Der Tagesspiegel got in touch and helped trace some of the Germans.

And of course, within the Help Me Investigate network people were identifying mystery torchbearers, getting responses from sponsors, visualising data, and chasing interviews. One contributor in particular – Carol Miers – came on board halfway through and contributed some of the key elements of the final longform report – in particular the interview that opens the book, which I’ll talk about in the final part tomorrow.

07:38

A case study in online journalism part 2: verification, SEO and collaboration (investigating the Olympic torch relay)

corporate Olympic torchbearers image

Having outlined some of the data journalism processes involved in the Olympic torch relay investigation, in part 2 I want to touch on how verification and ‘passive aggressive newsgathering’ played a role.

Verification: who’s who

Data in this story not only provided leads which needed verifying, but also helped verify leads from outside the data.

In one example, an anonymous tip-off suggested that both children of one particular executive were carrying the Olympic torch on different legs of the relay. A quick check against his name in the data suggested this was so: two girls with the same unusual surname were indeed carrying the torch. Neither mentioned the company or their father. But how could we confirm it?

The answer involved checking planning applications, Google Streetview, and a number of other sources, including newsletters from the private school that they both attended which identified the father.

In another example, I noticed that one torchbearer had mentioned running alongside two employees of Aggreko, who were paying for their torches. I searched for other employees, and found a cake shop which had created a celebratory cake for three of them. Having seen how some corporate sponsors used their places, I went on a hunch and looked up the board of directors, searching in the data first for the CEO Rupert Soames. His name turned up – with no nomination story. A search for other directors found that more than half the executive board were carrying torches – which turned out to be our story. The final step: a call to the company to get a reaction and confirmation.

The more that we knew about how torch relay places had been used, the easier it was to verify other torchbearers. As a pattern emerged of many coming from the telecomms industry, that helped focus the search – but we had to be aware that having suspicions ‘confirmed’ didn’t mean that the name itself was confirmed – it was simply that you were more likely to hit a match that you could verify.

Scepticism was important: at various times names seemed to match with individuals but you had to ask ‘Would that person not use his title? Why would he be nominated? Would he be that age now?’

Images helped – sometimes people used the same image that had been used elsewhere (you could match this with Google Images ‘match image’ feature, then refine the search). At other times you could match with public photos of the person as they carried the torch.

This post on identifying mystery torchbearers gives more detail.

Passive aggressive newsgathering

Alerts proved key to the investigation. Early on I signed up for daily alerts on any mention of the Olympic torch. 95% of stories were formulaic ‘local town/school/hero excited about torch’ reports, but occasionally key details would emerge in other pieces – particularly those from news organisations overseas.

Google Alerts for Olympic torch

It was from these that I learned how many places exactly Dow, Omega, Visa and others had, and how many were nominated. It was how I learned about torchbearers who were not even listed on the official site, about the ‘criteria’ that were supposed to be adhered to by some organisations, about public announcements of places which suggested a change from previous numbers, and more besides.

As I came across anything that looked interesting, I bookmarked and tagged it. Some would come in useful immediately, but most would only come in useful later when I came to write up the full story. Essentially, they were pieces of a jigsaw I was yet to put together.  (For example, this report mentioned that 2,500 employees were nominated within Dow for just 10 places. How must those employees feel when they find the company’s VP of Olympic operations took up one of the few places? Likewise, he fit a broader pattern of sponsorship managers carrying the torch)

I also subscribed to any mention of the torch relay in Parliament, and any mention in FOI requests.

SEO – making yourself findable

One of the things I always emphasise to my students is the importance of publishing early and often on a subject to maximise the opportunities for others in the field to find out – and get in touch. This story was no exception. From the earliest stages through to the last week of the relay, users stumbled across the site as they looked for information on the relay – and passed on their concerns and leads.

It was particularly important with a big public event like the Olympic torch relay, which generated a lot of interest among local people. In the first week of the investigation one photographer stumbled across the site because he was searching for the name of one of the torchbearers we had identified as coming from adidas. He passed on his photographs – but more importantly, made me aware that there may be photographs of other executives who had already carried the torch.

That led to the strongest image of the investigation – two executives exchanging a ‘torch kiss’ (shown at the top of this post) – which was in turn picked up by The Daily Mail.

Other leads kept coming. The tip-off about the executive’s daughters mentioned above; someone mentioning two more Aggreko directors – one of whom had never been published on the official site, while the other had been listed and then removed. Questions about a Polish torchbearer who was not listed on the official site or, indeed, anywhere on the web other than the BBC’s torch relay liveblog. Challenges to one story we linkblogged, which led to further background that helped flesh out the processes behind the nominations given to universities.

When we published the ‘mystery torchbearers’ with The Guardian some got in touch to tell us who they were. In one case, that contact led to an interview which closed the book: Geoff Holt, the first quadriplegic to sail single-handed across the Atlantic Ocean.

Collaboration

I could have done this story the old-fashioned way: kept it to myself, done all the digging alone, and published one big story at the end.

It wouldn’t have been half as good. It wouldn’t have had the impact, it wouldn’t have had the range, and it would have missed key ingredients.

Collaboration was at the heart of this process. As soon as I started to unearth the adidas torchbearers I got in touch with The Guardian’s James Ball. His report the week after added reactions from some of the companies involved, and other torchbearers we’d simultaneously spotted. But James also noticed that one of Coca-Cola’s torchbearers was a woman “who among other roles sits on a committee of the US’s Food and Drug Administration”.

It was collaborating with contacts in Staffordshire which helped point me to the ‘torch kiss’ image. They in turn followed up the story behind it (a credit for Help Me Investigate was taken out of the piece – it seems old habits die hard), and The Daily Mail followed up on that to get some further reaction and response (and no, they didn’t credit the Stoke Sentinel either). In Bournemouth and Sussex local journalists took up the baton (sorry), and the Times Higher did their angle.

We passed on leads to Ventnor Blog, whose users helped dig into a curious torchbearer running through the area. And we published a list of torchbearers missing stories in The Guardian, where users helped identify them.

Collaborating with an international mailing list for investigative journalists, I generated datasets of local torchbearers in Hungary, Italy, India, the Middle East, Germany, and Romania. German daily newspaper Der Tagesspiegel got in touch and helped trace some of the Germans.

And of course, within the Help Me Investigate network people were identifying mystery torchbearers, getting responses from sponsors, visualising data, and chasing interviews. One contributor in particular – Carol Miers – came on board halfway through and contributed some of the key elements of the final longform report – in particular the interview that opens the book, which I’ll talk about in the final part tomorrow.

July 26 2012

13:43

Interview: the team behind the Transcribe audio transcription app

After test-driving the audio transcription app Transcribe, Antoinette Siu interviewed Jason and Kishore of Wreally Studios, the team behind the tool.

What do you hope to do with this project?

We want to make journalists’ lives easier through software. From what we’ve heard, transcription is one of their pain points and while Transcribe can’t do the transcription automatically for them (at least, not yet) we could make the transcription process a little easier for them through our tool.

To that end we’ve been talking to several users and getting their feedback to improve the product. Our goal is to build a great product that genuinely addresses the transcription needs of our users.

How is the tool being received?

Transcribe is actually spreading pretty virally at the moment, purely through word-of-mouth advertising. We think it’s a testament to the simplicity and effectiveness of the app.

The very first version we put out was pretty rough and it was dormant for a few months. However, we should thank our early users for giving us lots of valuable feedback that has helped get Transcribe to where it is today. We get lots of fan emails every week, which is always heartening.

Is there a team developing this tool or is this something you are doing full-time?

We are both software engineers by profession and entrepreneurs at heart. We have been working on Transcribe part-time over the past year.

Why a journalism tool?

We have always believed that one can’t build great software without putting oneself in the shoes of the end users. As such, this tool was actually born out of a personal frustration. We built a tool which helped us do transcriptions faster, and we wanted to put it out in the wild for others to use it and also give us feedback on how to make it better.

In the past, we have worked on other highly focused tools, with one of them (Scribble), having more than 35,000 users in the Chrome Web Store.

Currently, we are focusing on Transcribe, and recently launched a “Pro” version which builds upon the free version. We have a bunch of really exciting features lined up for Transcribe Pro!

What was your favourite part about creating this tool?

Definitely the part about letting the users drive the development of features that are important to them. We have been able to quickly iterate on the product by responding to user suggestions and that has made the application better for everyone.

How do you see users using it?

While interacting with the users of Transcribe, we found out that a lot of people performed transcription using just an audio player and a text editor! This process is painful for a number of reasons.

Firstly, you cannot easily pause, rewind or slow-down the audio without constantly switching between the editor and the audio player.

In addition, your fingers have to toggle between the keyboard and the mouse in order to perform various actions.

In contrast, Transcribe puts both the audio player and the text editor on the same screen, and provides you with powerful shortcuts for controlling both the audio playback and the editor.

With Transcribe Pro, you can manage multiple transcriptions at the same time and it will even remember where you stopped the audio so that you can resume from the same place next time. It also supports various audio formats directly.

Are you marketing the tool to any news outlets?

As you have correctly pointed out, transcription is a huge time sink for journalists. So far, we have relied primarily on word-of-mouth advertising and that by itself has proved very effective: our users love the product and recommend it to their colleagues. So we haven’t yet had the need to market the tool ourselves, which is great.

But we would love to discuss Transcribe with news outlets to get more feedback and explore opportunities to integrate Transcribe into their workflows.

With the launch of Transcribe Pro last month, we are currently working on some exciting features that would make transcription even less painful. Anyone interested in talking to us can email contact@wreally.com.

July 25 2012

09:16

Hyperlocal Voices: Richard Gurner, Caerphilly Observer

For the fourth in our new series of Hyperlocal Voices we head back to Wales. Launched by Richard Gurner in July 2009, the Caerphilly Observer acts as a local news and information website for Caerphilly County Borough.

The site is one of a small, but growing, number of financially viable hyperlocal websites. Richard, who remains the Editor of the site, told Damian Radcliffe a little bit about his journey over the last three years.

 

1.  Who were the people behind the blog?

People tend to be a bit surprised when I reveal that it’s only me behind Caerphilly Observer. We do have guest bloggers (local politicians and business leaders) and we have some sports reports sent in from local teams, but apart from that I do most of the editorial on the site and our weekly newsletter.

2.  What made you decide to set up the blog?

Believe it or not, I originally set up Caerphilly Observer while I was living in Brighton – some 200 miles away from the area.

I was working for daily newspaper The Argus at the time as a reporter and simply wanted to keep up with what was going on back home. I also wanted to improve my digital skills and thought setting up a news website would kill two birds with one stone.

It has always been a dream of mine to own a newspaper and I thought that if the website took off with the readers, then maybe one day I could do it as a full-time job. I never thought that would become a reality until it happened in August 2011.

3.  When did you set up the blog and how did you go about it?

With the intention of this maybe becoming a business one day, I purposely set about choosing a name with a “newspaper” feel. If the website was to be taken seriously then it needed to have a strong brand. After several alternatives, Caerphilly Observer was finally chosen by my wife.

I registered the domain name and went about setting-up a self-hosted WordPress site. With next to no technical knowledge of DNS, PHP, Apache and loads of other things that sounded like they were from Star Trek, I ploughed on.

The learning curve has been steep – especially with implementing a custom WordPress theme – but the knowledge gained has been immensely valuable.

I’m very much a hands-on learning person, so I know a lot of it has stuck and it won’t be forgotten.

4.  What other blogs, bloggers or websites influenced you?

I drew a lot of inspiration from several news websites, in what not to do, and loads of other blogs in what to do correctly.

Lichfield Live (Or Lichfield Blog as it was then called) was a big inspiration as was Bristol 24/7.

5.  How did – and do – you see yourself in relation to a traditional news operation?

I definitely see Caerphilly Observer as part of the local media and I’m very pleased to say the community we cover also sees us in the same light.

Quite often people mistake us for a newspaper and think we’re bigger and more established than we actually are – not a bad thing. Obviously, I can’t cover everything and there have been court cases I would have loved to have covered but couldn’t. I used to beat myself up about not being everywhere but more recently I’ve come to terms with the fact that it’s me against the big media trying to create something sustainable.

There are other aspects of the site that equally need taking care of such as business admin and the small matter of selling advertising to fund what I do.

6.  What have been the key moments in the blog’s development editorially?

You know you’re being taken seriously when people contact you to complain. I won’t go into specifics but during last year’s Welsh Assembly elections we were threatened with legal action. We eventually sorted it out without the need for solicitors but it did go to show that we had arrived. If we were irrelevant then I wouldn’t have had that phone call.

7.  What sort of traffic do you get and how has that changed over time?

Our monthly average over the last six months (Jan 2012 to June 2012) is 37,000 page impressions and 13,340 unique visitors. That’s roughly double what we did in the first half of 2011.

8.  What has been your biggest challenge to date?

Creating revenue is an absolutely huge challenge and fundamental to the sustainable future of Caerphilly Observer.

One of our selling points is that we’re local and independent, but if we’re not getting the numbers for local businesses to themselves get business, they’re not going to advertise and we’re not going to make any money.

Paid-for editorial spots and display advertising make up the bulk of my income, but I still do freelance copywriting and journalism to create my wage. It’s nowhere near where it was when I was working for a big media company but the difference is I’m doing what I think serves our readers and advertisers the best. There is also an unrivalled sense of job satisfaction.

Many in hyperlocal circles and the wider media industry state that creating a paying website is impossible – I love proving them wrong.

9.  What story, feature or series are you most proud of?

Without doubt it was our liveblog during the local election count in May this year. It was a fantastic night grabbing interviews and updating the website and we had a record number of visitors and page views for a single day.

The reaction from and interaction with our readers was what kept me going into the small hours.

10.  What are your plans for the future?

To keep growing. I want to have at least one other member of staff and an office in Caerphilly town centre, but that will take a lot of hard work and dedication.

Most of all, I want Caerphilly Observer to be the primary source for local news in the area and have the mind and market share in the local community that traditional media has.

May 01 2012

17:21

April 23 2012

16:16

Step by step: how to start in a data journalist role

Following my previous posts on the network journalist and community manager roles as part of an investigation team, this post expands on the first steps a student journalist can take in filling the data journalist role.

1. Brainstorm data that might be relevant to your investigation or field

Before you begin digging for data, it’s worth mapping out the territory you’re working in. Some key questions to ask include:

  • Who measures or monitors your field? For example:
  • Where is spending recorded? This might be at both a local and national level.
  • What are the key things that might be measured in your field? For example, in prisons they might be interested in reoffending, or overcrowding, or staffing.
  • Can you find historical data?
  • What data do you need to provide basic context? e.g.
    • Where – addresses for all institutions in your field (e.g. schools, prisons, etc.)
    • Codes – often these are used instead of institution or area names
    • Who – names of those responsible for particular aspects of your field
    • Demographics – the distribution of age, gender, ethnicity, industries, wealth, property or other elements may be important to your work
    • Politics – who is in charge in each area (local authority and local MP)
  • How could you collate data that doesn’t exist? E.g. public awareness of something; or how the policies of different bodies compare, etc.

Sometimes the simplest and quickest way to find out these things is to pick up the phone and speak to someone in a relevant organisation and ask them: what information is collected about your field, and by whom?

You can also make content from this process of research: post a guide to how your field is regulated and measured (and what information isn’t); who’s who in your field - the regulators, monitors, politicians and bodies that all have a hand in keeping it on track.

2. Learn advanced techniques to obtain that data

Once you’ve mapped it all out you can start to prioritise the datasets that are most relevant to your particular investigation. You may need to use different techniques to get hold of these, including:

Again, you can make content from this process, for example: “How we found…” or “Why we’re asking the MoJ for…” (with a link to the FOI request) or “Get the data” (here’s how to publish data online)

The flow chart below (from this previous post) helps guide you to the relevant techniques for your data:

Gathering data: a flow chart for data journalist

3. Pull out the parts of data relevant to your field/investigation

For example:

4. Add value to the data

Here are just some suggestions. You can use one or many:

Any of these provide useful opportunities for posting new content with the new contextual information (e.g. “How the data on X was gathered“) or new combined data (“Now with QOF data“) or the issues that they raise (“Why schools data may be worthless“).

5. Communicate the story in the data

I’ve written separately about the different ways of communicating data stories, so you can read that here. In short, human case studies are helpful, and visualisation is often useful.

And it’s at this point that you can also link to the further detail provided in all the content you’ve written in the previous 4 steps: How you got the data, the wider context, the specific data that’s of interest, the more detailed expert analysis or background, and so on.


Filed under: online journalism Tagged: data blogging, finding data, gathering data

April 20 2012

12:19

April 19 2012

09:08

When data goes bad

Incorrect-statistics

Image by Lauren York

Data is so central to the decision-making that shapes our countries, jobs and even personal lives that an increasing amount of data journalism involves scrutinising the problems with the very data itself. Here’s an illustrative list of when bad data becomes the story – and the lessons they can teach data journalists:

Deaths in police custody unrecorded

This investigation by the Bureau of Investigative Journalism demonstrates an important question to ask about data: who decides what gets recorded?

In this case, the BIJ identified “a number of cases not included in the official tally of 16 ‘restraint-related’ deaths in the decade to 2009 … Some cases were not included because the person has not been officially arrested or detained.”

As they explain:

“It turns out the IPCC has a very tight definition of ‘in custody’ –  defined only as when someone has been formally arrested or detained under the mental health act. This does not include people who have died after being in contact with the police.

“There are in fact two lists. The one which includes the widely quoted list of sixteen deaths in custody only records the cases where the person has been arrested or detained under the mental health act. So, an individual who comes into contact with the police – is never arrested or detained – but nonetheless dies after being restrained, is not included in the figures.

“… But even using the IPCC’s tightly drawn definition, the Bureau has identified cases that are still missing.”

Cross-checking the official statistics against wider reports was a key technique, as was using the Freedom of Information Act to request the details behind them and the details of those “who died in circumstances where restraint was used but was not necessarily a direct cause of death”.
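As a rough illustration of that kind of cross-check, here is a minimal Python sketch. It assumes each case can be reduced to a comparable identifier (a name, a date, or a reference number); the function names and the identifiers in the sample lists are hypothetical and not taken from the BIJ investigation.

```python
def normalise(identifier):
    """Lower-case and collapse whitespace so trivial differences don't hide a match."""
    return " ".join(identifier.lower().split())


def find_unrecorded(official_ids, reported_ids):
    """Return identifiers found in wider reports but absent from the official tally."""
    official = {normalise(i) for i in official_ids}
    reported = {normalise(i) for i in reported_ids}
    return sorted(reported - official)


# Hypothetical identifiers, for illustration only
official = ["Case 01", "Case 02", "Case 03"]
reported = ["case 01", "Case  02", "Case 03", "Case 04", "Case 05"]

missing = find_unrecorded(official, reported)
print(missing)
```

Anything the function returns is only a lead: each apparent omission still needs checking by hand, because a case may have been recorded under a different name or excluded by definition.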

Cooking the books on drug-related murders

Drug related murders in Mexico

Cross-checking statistics against reports was also used in this investigation by Diego Valle-Jones into Mexican drug deaths:

“The Acteal massacre committed by paramilitary units with government backing against 45 Tzotzil Indians is missing from the vital statistics database. According to the INEGI there were only 2 deaths during December 1997 in the municipality of Chenalho, where the massacre occurred. What a silly way to avoid recording homicides! Now it is just a question of which data is less corrupt.”

Diego also used Benford’s Law to identify potentially fraudulent data, a technique that has also been used to highlight relationships between dodgy company data and real-world events such as the dotcom bubble and deregulation.
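Benford’s Law says that in many naturally occurring sets of numbers the leading digit d appears with probability log10(1 + 1/d), so roughly 30% of values start with 1 and fewer than 5% start with 9. The Python sketch below shows one simple way to compare a dataset against that expectation. It is not Diego’s code: the function names are hypothetical, it assumes non-zero integers, and the sample data is simply the first 100 powers of two, a sequence known to follow the law closely.

```python
import math
from collections import Counter


def benford_expected(digit):
    """Expected share of values whose leading digit is `digit` (1-9)."""
    return math.log10(1 + 1 / digit)


def first_digit_shares(values):
    """Observed share of each leading digit among non-zero integers."""
    digits = [int(str(abs(v))[0]) for v in values if v != 0]
    if not digits:
        raise ValueError("need at least one non-zero value")
    counts = Counter(digits)
    return {d: counts.get(d, 0) / len(digits) for d in range(1, 10)}


def benford_gaps(values):
    """Observed minus expected share for each leading digit."""
    observed = first_digit_shares(values)
    return {d: observed[d] - benford_expected(d) for d in range(1, 10)}


# Powers of two are a standard example of Benford-like data
sample = [2 ** n for n in range(100)]
gaps = benford_gaps(sample)
for digit, gap in gaps.items():
    print(digit, round(gap, 3))
```

Large gaps are a prompt for further questions rather than proof of fraud: small datasets, or figures confined to a narrow range, can depart from Benford’s Law for entirely innocent reasons.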

Poor records mean no checks

Detective Inspector Philip Shakesheff exposed a “gap between [local authority] records and police data”, reported The Sunday Times in a story headlined ‘Care home loses child 130 times‘:

“The true scale of the problem was revealed after a check of records on police computers. For every child officially recorded by local authorities as missing in 2010, another seven were unaccounted for without their absence being noted.”

Why is it important?

“The number who go missing is one of the indicators on which Ofsted judges how well children’s homes are performing and the homes have a legal duty to keep accurate records.

“However, there is evidence some homes are failing to do so. In one case, Ofsted gave a good report to a private children’s home in Worcestershire when police records showed 1,630 missing person reports in five years. Police stationed an officer at the home and pressed Ofsted to look closer. The home was downgraded to inadequate and it later closed.

“The risks of being missing from care are demonstrated by Zoe Thomsett, 17, who was Westminster council’s responsibility. It sent her to a care home in Herefordshire, where she went missing several times, the final time for three days. She had earlier been found at an address in Hereford, but because no record was kept, nobody checked the address. She died there of a drugs overdose.

“The troubled life of Dane Edgar, 14, ended with a drugs overdose at a friend’s house after he repeatedly went missing from a children’s home in Northumberland. Another 14-year-old, James Jordan, was killed when he absconded from care and was the passenger in a stolen car.”

Interests not registered

When there are no formal checks on declarations of interest, how can we rely on them? In Chile, the Ciudadano Inteligente Fundacion decided to check the Chilean MPs’ register of assets and interests by building a database:

“No-one was analysing this data, so it was incomplete,” explained Felipe Heusser, executive president of the Fundacion. “We used technology to build a database, using a wide range of open data and mapped all the MPs’ interests. From that, we found that nearly 40% of MPs were not disclosing their assets fully.”

The organisation has now launched a database that “enables members of the public to find potential conflicts of interest by analysing the data disclosed through the members’ register of assets.”

Data laundering

Tony Hirst’s post about how dodgy data was “laundered” by Facebook in a consultant’s report is a good illustration of the need to ‘follow the data’.

“We have some dodgy evidence, about which we’re biased, so we give it to an “independent” consultant who re-reports it, albeit with caveats, that we can then report, minus the caveats. Lovely, clean evidence. Our lobbyists can then go to a lazy policy researcher and take this scrubbed evidence, referencing it as finding in the Deloitte report, so that it can make its way into a policy briefing.”

“Things just don’t add up”

In the video below Ellen Miller of the Sunlight Foundation takes the US government to task over the inconsistencies in its transparency agenda, and the flawed data published on its USAspending.gov site – so flawed that the foundation launched the Clearspending website to automate and highlight the discrepancies between two sources of the same data:

Key budget decisions made on useless data

Sometimes data might appear to tell an astonishing story, but this turns out to be a mistake – and that mistake itself leads you to something much more newsworthy, as Channel 4’s FactCheck found when it started trying to find out if councils had been cutting spending on Sure Start children’s centres:

“That ought to be fairly straightforward, as all councils by law have to fill in something called a Section 251 workbook detailing how much they are spending on various services for young people.

“… Brent Council in north London appeared to have slashed its funding by nearly 90 per cent, something that seemed strange, as we hadn’t heard an outcry from local parents.

“The council swiftly admitted making an accounting error – to the tune of a staggering £6m.”

And they weren’t the only ones. In fact, the Department for Education admitted the numbers were “not very accurate”:

“So to recap, these spending figures don’t actually reflect the real amount of money spent; figures from different councils are not comparable with each other; spending in one year can’t be compared usefully with other years; and the government doesn’t propose to audit the figures or correct them when they’re wrong.”

This was particularly important because the S251 form “is the document the government uses to reallocate funding from council-run schools to its flagship academies”:

“The Local Government Association (LGA) says less than £250m should be swiped from council budgets and given to academies, while the government wants to cut more than £1bn, prompting accusations that it is overfunding its favoured schools to the detriment of thousands of other children.

“Many councils’ complaints, made plain in responses to an ongoing government consultation, hinge on DfE’s use of S251, a document it has variously described as “unaudited”, “flawed” and “not fit for purpose”.”
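The Brent figure stood out because of the sheer size of the year-on-year change. A simple sanity check of that kind can be sketched in Python as below. This is only an illustration: the council names, the figures and the 50% threshold are all hypothetical assumptions, and the function is not based on how FactCheck actually worked.

```python
def flag_large_changes(previous, current, threshold=0.5):
    """Return {name: proportional change} where spending moved by at least `threshold`.

    `previous` and `current` map a council name to its reported spending.
    Councils missing from `current`, or with zero previous spending, are skipped.
    """
    flagged = {}
    for name, before in previous.items():
        after = current.get(name)
        if after is None or before == 0:
            continue
        change = (after - before) / before
        if abs(change) >= threshold:
            flagged[name] = change
    return flagged


# Hypothetical figures, for illustration only
last_year = {"Council A": 1000, "Council B": 2000, "Council C": 500}
this_year = {"Council A": 950, "Council B": 240, "Council C": 520}

flagged = flag_large_changes(last_year, this_year)
for name, change in flagged.items():
    print(f"{name}: {change:+.0%}")
```

A flagged change is a question to put to the council, not a finding: as the Brent case shows, the answer may be an accounting error rather than a cut.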

No data is still a story

Sticking with education, the TES reports on the outcome of an FOI request on the experience of Ofsted inspectors:

“[Stephen] Ball submitted a Freedom of Information request, asking how many HMIs had experience of being a secondary head, and how many of those had led an outstanding school. The answer? Ofsted “does not hold the details”.

““Secondary heads and academy principals need to be reassured that their work is judged by people who understand its complexity,” Mr Ball said. “Training as a good head of department or a primary school leader on the framework is no longer adequate. Secondary heads don’t fear judgement, but they expect to be judged by people who have experience as well as a theoretical training. After all, a working knowledge of the highway code doesn’t qualify you to become a driving examiner.”

“… Sir Michael Wilshaw, Ofsted’s new chief inspector, has already argued publicly that raw data are a key factor in assessing a school’s performance. By not providing the facts to back up its boasts about the expertise of its inspectors, many heads will remain sceptical of the watchdog’s claims.”

Men aren’t as tall as they say they are

To round off, here’s a quirky piece of data journalism by dating site OkCupid, which looked at the height of its members and found an interesting pattern:

Male height distribution on OKCupid

“The male heights on OkCupid very nearly follow the expected normal distribution—except the whole thing is shifted to the right of where it should be.

“Almost universally guys like to add a couple inches. You can also see a more subtle vanity at work: starting at roughly 5′ 8″, the top of the dotted curve tilts even further rightward. This means that guys as they get closer to six feet round up a bit more than usual, stretching for that coveted psychological benchmark.”

Do you know of any other examples of bad data forming the basis of a story? Please post a comment – I’m collecting examples.

UPDATE (April 20 2012): A useful addition from Simon Rogers: Named and shamed: the worst government annual reports explains why government department spending reports fail to support the Government’s claimed desire for an “army of armchair auditors”, with a list of the worst offenders at the end.



Filed under: online journalism Tagged: bad data, benford's law, BIJ, bureau of investigative journalism, Channel 4, Chile, Ciudadano Inteligente Fundacion, Clearspending, data laundering, dating, Deaths in custody, ellen miller, FactCheck, Felipe Heusser, height, IPCC, Lauren York, missing children, OKCupid, Philip Shakesheff, register of interests, S251, sex trafficking, simon rogers, sunday times, sunlight foundation, tony hirst