August 02 2012

14:18

A case study in online journalism: investigating the Olympic torch relay

Torch relay places infographic by Caroline Beavon

For the last two months I’ve been involved in an investigation which has used almost every technique in the online journalism toolbox. From its beginnings in data journalism, through collaboration, community management and SEO to ‘passive-aggressive’ newsgathering, verification and ebook publishing, it’s been a fascinating case study in such a range of ways that I’m going to struggle to get them all down.

But I’m going to try.

Data journalism: scraping the Olympic torch relay

The investigation began with the scraping of the official torchbearer website. It’s important to emphasise that this piece of data journalism didn’t take place in isolation – in fact, it was while working with Help Me Investigate the Olympics’s Jennifer Jones (coordinator for #media2012, the first citizen media network for the Olympic Games) and others that I stumbled across the torchbearer data. So networks and community are important here (more later).

Indeed, it turned out that the site couldn’t be scraped through a ‘normal’ scraper, and it was the community of the Scraperwiki site – specifically Zarino Zappia – who helped solve the problem and get a scraper working. Without both of those sets of relationships – with the citizen media network and with the developer community on Scraperwiki – this might never have got off the ground.

But it was also important to see the potential newsworthiness in that particular part of the site. Human stories were at the heart of the torch relay – not numbers. Local pride and curiosity were here – a key ingredient of any local newspaper. There were the promises made by its organisers – had they been kept?

The hunch proved correct – this dataset would just keep on giving stories.

The scraper grabbed details on around 6,000 torchbearers. I was curious why more weren’t listed: 8,000 people were due to carry the torch and, yes, around 800 of those places were supposed to be invitations to high-profile torchbearers including celebrities, who might reasonably be expected to be omitted at least until they carried the torch – but that still left over 1,000 unaccounted for.

I’ve written a bit more about the scraping and data analysis process for The Guardian and the Telegraph data blog. In a nutshell, here are some of the processes used:

  • Overview (pivot table): where do most come from? What’s the age distribution?
  • Focus on details in the overview: what’s the most surprising hometown in the top 5 or 10? Who’s oldest and youngest? What about the biggest source outside the UK?
  • Start asking questions of the data based on what we know it should look like – and hunches
  • Don’t get distracted – pick a focus and build around it.
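For readers who want to try the overview step without a spreadsheet, here is a minimal sketch of the same pivot-table style counts in Python. The records and field names (`hometown`, `age`) are invented for illustration and are not the torchbearer site's real schema.

```python
from collections import Counter

# Hypothetical rows standing in for scraped torchbearer records;
# the field names are assumptions, not the site's real schema.
torchbearers = [
    {"name": "A", "hometown": "Leeds", "age": 17},
    {"name": "B", "hometown": "Leeds", "age": 64},
    {"name": "C", "hometown": "Moscow", "age": 45},
    {"name": "D", "hometown": "Cardiff", "age": 12},
]


def top_values(rows, field, n=10):
    """Count how often each value of `field` occurs, most common first."""
    return Counter(row[field] for row in rows).most_common(n)


def age_band(age, width=10):
    """Label an age with its band, e.g. 17 becomes '10-19'."""
    low = (age // width) * width
    return f"{low}-{low + width - 1}"


# Overview: where do most come from, and what is the age distribution?
hometown_counts = top_values(torchbearers, "hometown")
age_counts = Counter(age_band(row["age"]) for row in torchbearers)

# Details: who is oldest and youngest?
oldest = max(torchbearers, key=lambda row: row["age"])
youngest = min(torchbearers, key=lambda row: row["age"])
```

Scanning the top of `hometown_counts` for a surprising entry is the code equivalent of looking for the odd one out in a pivot table's top 5 or 10.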

That last point – about not getting distracted – is notable. As I looked for mentions of Olympic sponsors in nomination stories, I started to build up subsets of the data: a dozen people who mentioned BP, two who mentioned ArcelorMittal (the CEO and his son), and so on. Each was interesting in its own way – but where should you invest your efforts?
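Building sponsor subsets like these can be sketched as a keyword search over the nomination stories. This is an illustration under stated assumptions rather than the method actually used: the IDs and story texts are made up, and only the sponsor names come from the text above.

```python
import re

# Sponsor names mentioned in the text above; a real list would be longer.
SPONSORS = ["BP", "ArcelorMittal", "adidas"]

# Hypothetical torchbearer IDs and nomination stories.
stories = {
    "t1": "Nominated by colleagues at BP for her charity work.",
    "t2": "I have been engaged in the business of sport with adidas.",
    "t3": "A volunteer coach for twenty years.",
}


def sponsor_subsets(stories, sponsors):
    """Map each sponsor to the IDs whose story mentions it as a whole word."""
    subsets = {}
    for sponsor in sponsors:
        pattern = re.compile(r"\b" + re.escape(sponsor) + r"\b", re.IGNORECASE)
        subsets[sponsor] = sorted(
            tid for tid, story in stories.items() if pattern.search(story)
        )
    return subsets


subsets = sponsor_subsets(stories, SPONSORS)
```

Matching whole words avoids hits inside longer words, but short names such as BP can still produce false positives, so each subset needs reading by eye before it is treated as a lead.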

One story had already caught my eye: it was written in the first person and talked about having been “engaged in the business of sport”. It was hardly inspirational. As it mentioned adidas, I focused on the adidas subset, and found that the same story was used by a further six people – a third of all of those who mentioned the company.

Clearly, all seven people hadn’t written the same story individually, so something was odd here. And that made this not just a ‘rotten apple’ story, but something potentially systemic.
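Spotting a story shared by several torchbearers does not have to rely on one catching the eye. A rough sketch, assuming the stories are held as a mapping from a hypothetical ID to story text, is to group people by a normalised version of their story:

```python
from collections import defaultdict

# Hypothetical IDs and nomination stories for illustration only.
stories = {
    "t1": "Engaged in the business of sport.",
    "t2": "engaged in  the business of sport.",
    "t3": "Ran a youth club for a decade.",
    "t4": "",
}


def normalise(story):
    """Lower-case and collapse whitespace so trivial differences don't hide a match."""
    return " ".join(story.lower().split())


def shared_stories(stories, min_people=2):
    """Return each story text used by at least `min_people` torchbearers."""
    groups = defaultdict(list)
    for tid, story in stories.items():
        key = normalise(story)
        if key:  # blank stories are a separate signal, so skip them here
            groups[key].append(tid)
    return {text: ids for text, ids in groups.items() if len(ids) >= min_people}


shared = shared_stories(stories)
```

This only catches exact matches after normalisation; lightly reworded copies would need fuzzy matching, which is beyond this sketch.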

Signals

While the data was interesting in itself, it was important to treat it as a set of signals to potentially more interesting exploration. Seven torchbearers having the same story was one of those signals. Mentions of corporate sponsors were another.

But there were many others too.

That initial scouring of the data had identified a number of people carrying the torch who held executive positions at sponsors and their commercial partners. The Guardian, The Independent and The Daily Mail were among the first to report on the story.

I wondered if the details of any of those corporate torchbearers might have been taken off the site afterwards. And indeed they had: seven disappeared entirely (many still had a profile if you typed in the URL directly – but could not be found through search or browsing), and a further two had had their stories removed.

Now, every time I scraped details from the site I looked for those who had disappeared since the last scrape, and those that had been added late.
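Comparing one scrape with the next comes down to two set differences. The sketch below assumes each torchbearer has a stable identifier (for example a profile URL); the IDs shown are invented.

```python
def compare_scrapes(previous_ids, current_ids):
    """Report who vanished since the previous scrape and who is new in this one."""
    previous, current = set(previous_ids), set(current_ids)
    return {
        "disappeared": sorted(previous - current),
        "added": sorted(current - previous),
    }


# Hypothetical IDs from two consecutive scrapes.
last_week = ["t1", "t2", "t3"]
this_week = ["t2", "t3", "t4"]

changes = compare_scrapes(last_week, this_week)
```

Keeping every scrape rather than overwriting the previous one is what makes this possible: someone who appears once and is then removed only shows up in the comparison.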

One, for example – who shared a name with a very senior figure at one of the sponsors – appeared just once before disappearing four days later. I wouldn’t have spotted them if they – or someone else – hadn’t been so keen on removing their name.

Another time, I noticed that a new torchbearer had been added to the list with the same story as the 7 adidas torchbearers. He turned out to be the Group Chief Executive of the country’s largest catalogue retailer, providing “continuing evidence that adidas ignored LOCOG guidance not to nominate executives.”

Meanwhile, the proportion of torchbearers running without any nomination story went from just 2.7% in the first scrape of 6,056 torchbearers to 7.2% of 6,891 torchbearers in the last week. Across all torchbearers who had appeared at any point between the two dates – including those who had appeared and then disappeared – it was 8.1%.
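As a worked illustration of how figures like these can be calculated, the sketch below computes the share of blank stories in a single scrape and across the union of two scrapes. The sample IDs and stories are invented; the only real numbers are the percentages and totals quoted above, from which the approximate counts are derived.

```python
def pct_missing_story(stories):
    """Percentage of torchbearers (ID -> story text) with a blank story."""
    if not stories:
        return 0.0
    missing = sum(1 for story in stories.values() if not (story or "").strip())
    return 100 * missing / len(stories)


# Hypothetical first and last scrapes.
first_scrape = {"t1": "Coach", "t2": "Fundraiser", "t3": ""}
last_scrape = {"t1": "Coach", "t3": "", "t4": "", "t5": "Volunteer"}

# Everyone seen in either scrape; where both have an entry, the later one wins.
everyone = {**first_scrape, **last_scrape}

first_pct = pct_missing_story(first_scrape)
last_pct = pct_missing_story(last_scrape)
overall_pct = pct_missing_story(everyone)

# Approximate counts implied by the published percentages (rounded).
first_missing = round(0.027 * 6056)
last_missing = round(0.072 * 6891)
```

On the published figures this works out at roughly 164 torchbearers without a story in the first scrape and roughly 496 in the last, both approximate because the percentages are themselves rounded.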

Many were celebrities or sportspeople where perhaps someone had taken the decision that they ‘needed no introduction’. But many also turned out to be corporate torchbearers.

By early July the number of these ‘mystery torchbearers’ had reached 500 and, having only identified a fifth, we published them through The Guardian datablog.

There were other signals, too, where knowing the way the torch relay operated helped.

For example, logistics meant that overseas torchbearers often carried the torch in the same location. This led to a cluster of Chinese torchbearers in Stansted, Hungarians in Dorset, Germans in Brighton, Americans in Oxford and Russians in North Wales.

As many corporate torchbearers were also based overseas, this helped narrow the search, with Germany’s corporate torchbearers in particular leading to an article in Der Tagesspiegel.

I also had the idea (thanks to Adrian Short) of totalling up how many torchbearers appeared each day, to identify days when details on unusually high numbers of torchbearers were missing – but it became apparent that variation due to other factors, such as weekends and the Jubilee, made this worthless.

However, the percentage of torchbearers missing stories each day did help (visualised below by Caroline Beavon), as this also helped identify days when large numbers of overseas torchbearers were carrying the torch. I cross-referenced this with the ‘mystery torchbearer’ spreadsheet to see how many had already been checked, and which days still needed attention.

Daily totals - bar chart
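The per-day percentage described above can be sketched as follows. Using a percentage rather than a raw total removes the effect of some days simply having more torchbearers than others. The dates, field names and records here are all hypothetical.

```python
from collections import defaultdict

# Hypothetical records: the date each torchbearer ran and their nomination story.
records = [
    {"date": "2012-06-01", "story": "Youth worker"},
    {"date": "2012-06-01", "story": ""},
    {"date": "2012-06-02", "story": ""},
    {"date": "2012-06-02", "story": ""},
    {"date": "2012-06-02", "story": "  "},
    {"date": "2012-06-02", "story": "Lifeboat volunteer"},
]


def pct_missing_by_day(records):
    """Percentage of each day's torchbearers whose story is blank."""
    totals = defaultdict(int)
    missing = defaultdict(int)
    for record in records:
        totals[record["date"]] += 1
        if not record["story"].strip():
            missing[record["date"]] += 1
    return {day: 100 * missing[day] / totals[day] for day in sorted(totals)}


daily = pct_missing_by_day(records)

# Days with the highest share of blank stories are the ones to check first.
priority_days = sorted(daily, key=daily.get, reverse=True)
```

Sorting the days by this percentage gives a simple to-do list that can then be compared against whichever days have already been checked.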

But the data was just the beginning. In the second part of this case study, I’ll talk about the verification process.

January 19 2012

10:52

20 free ebooks on journalism (for your Xmas Kindle) {updated to 38}

As many readers of this blog will have received a Kindle for Christmas I thought I should share my list of the free ebooks that I recommend stocking up on.

Online journalism and multimedia ebooks

Starting with more general books, Mark Briggs’s book Journalism 2.0 (PDF*) is now 4 years old but still provides a good overview of online journalism to have by your side. Mindy McAdams’s 42-page Reporter’s Guide to Multimedia Proficiency (PDF) adds some more on that front, and Adam Westbrook’s Ideas on Digital Storytelling and Publishing (PDF) provides a larger focus on narrative, editing and other elements.

After the first version of this post, MA Online Journalism student Franzi Baehrle suggested this free book on DSLR Cinematography, as well as Adam Westbrook on multimedia production (PDF). And Guy Degen recommends the free ebook on news and documentary filmmaking from ImageJunkies.com.

A free ebook on blogging can be downloaded from Guardian Students when you register with the site, and Swedish Radio have produced this guide to Social Media for Journalists (in English).

Computer assisted reporting ebooks

The Society of Professional Journalists’ Digital Media Handbook Part 1 (PDF) and Part 2 cover more multimedia, but also provide a pot-pourri of extra bits and pieces including computer assisted reporting (CAR).

For more on CAR, the first edition of Philip Meyer’s classic The New Precision Journalism is available in full online, although you’ll have to download each chapter in Word format and email it to your Kindle for conversion. It’s worth it: 20 years on, his advice is still excellent.

You’ll also have to download each chapter of the Data Journalism Handbook separately, or you can pay for a single-download ebook or physical version.

For a walkthrough on using some data techniques in the health field, this ebook on reporting health gives some excellent advice. Although it uses US data which is rather more accessible and structured than in most other countries, the principles are illustrative for readers anywhere.

If you want to explore statistics or programming further, Think Stats (via Adrian Short) covers both. The Bastards Book of Regular Expressions is a useful introduction to more programming – it’s free if you choose a zero price, but you can also pay whatever you want.

On visualisation, here’s Chapter 1 and Chapter 2 from a book by Alberto Cairo (from a free course at the Knight Center).

On advanced search, Untangling The Web: A Guide to Internet Research is a whopping 643-page document released by the US National Security Agency following an FOIA request (thanks Neurobonkers). Sadly it’s scanned so you won’t be able to convert this to another format.

Community management ebooks

Jono Bacon’s The Art of Community (PDF) comes in at over 360 pages and is a thorough exploration – told largely through his own experiences – of an area that too few journalists understand.

The Proven Path (PDF) by Richard Millington is a more concise overview by one of the field’s leading voices (via Jan Kampmann).

A useful complement to these is Yochai Benkler’s landmark book on how networked individuals operate, The Wealth of Networks, which is available to download in full or part online from his page at Harvard University’s Berkman Center. And each chapter of Dan Gillmor’s We The Media is available in PDF format on O’Reilly’s site.

More recently, New Forms of Collaborative Innovation and Production on the Internet (PDF) is a free ebook from the University of Gottingen with a collection of chapters covering practices such as consumer co-creation, trust management in online communities, and “coordination and motivation of consumer contribution”.

Staying savvy in the information war

Simply dealing with the flood of information and work deserves a book itself – and one free option is SmarterEveryday: Design Your Day – Adam Tinworth is among the contributors.

If you’re reporting on health issues – or ever expect to deal with a press release from a health company – Testing Treatments (PDF) is well worth a read, providing an insight into how medicines and treatments are tested, and popular misconceptions to avoid. It’s littered with examples from reporting on health in the media, and well written. And if you need persuading why you should care, read this post (all of it) by Dr Petra Boynton on what happens when journalists fail to scrutinise press releases from health companies.

More broadly on the subject of keeping your wits about you, Dan Gillmor’s latest book on media literacy, Mediactive, is published under a Creative Commons licence as a PDF. And The American Copy Editors Society has published a 50-page ebook on attribution and plagiarism which includes social media and other emerging platforms.

Ebooks on culture, copyright and code

Lawrence Lessig has written quite a few books about law and how it relates to the media when content becomes digitised, as well as code more generally. Most of his work is available online for free download, including The Future of Ideas (PDF), Code 2.0 (PDF), Remix, and Free Culture.

Matt Mason’s book on how media culture is changed by “pirates” gives you a choice: you can download The Pirate’s Dilemma for whatever price you choose to pay, including nothing.

Investigative Journalism

Mark Lee Hunter has written 2 great free ebooks which strip away the mystique that surrounds investigative journalism and persuades so many journalists that it’s something ‘other people do’.

The first, Story-Based Inquiry (PDF), is an extremely useful guide to organising and focusing an investigation, demonstrating that investigative journalism is more about being systematic than about meeting strangers in underground car parks.

The second, The Global Casebook (PDF), is brilliant: a collection of investigative journalism – but with added commentary by each journalist explaining their methods and techniques. Where Story-Based Inquiry provides an over-arching framework, The Global Casebook demonstrates how different approaches can work for different stories and contexts.

He’s also worked with Luuk Sengers to produce Nine Steps from Idea to Story (PDF), which puts the story-based method into step-by-step form.

For more tips on investigative journalism, the Investigative Journalism Manual (you’ll have to download each chapter separately) provides guidance from an African perspective which still applies whatever country you practise journalism in.

And if you’re particularly interested in corruption you may also want to download Paul Radu’s 50-page ebook Follow The Money: A Digital Guide for Tracking Corruption (PDF).

The CPJ have also published the Journalist Security Guide, a free ebook for anyone who needs to protect sources or work in dangerous environments. Scroll down to the bottom to find links to PDF, Kindle, ePub and iPad versions.

Related subjects: design, programming

That’s so many books I’m losing count, but if you want to explore design or programming there are dozens more out there. In particular, How to Think Like a Computer Scientist is an HTML ebook, but the Kindle deals with HTML pages too. Also in HTML are Probabilistic Programming and Bayesian Methods for Hackers (more statistics), and Digital Foundations: Introduction to Media Design (h/t Jon Hickman).

Have I missed anything?

Those are just the books that spring to mind or that I’ve previously bookmarked. Are there others I’ve missed?

*Some commenters have suggested I should point out that these are mostly PDFs, which some people don’t like. You can, however, convert a PDF to Kindle’s own mobi format by emailing it to your Kindle email address with ‘convert’ as the subject line (via Leonie in the comments). Christian Payne also recommends the free tool calibre for converting PDFs into the more Kindle-friendly .mobi and other formats.

Alternatively, if you change the orientation to landscape the original PDF can be read with formatting and images intact.

UPDATES [12 Jan 2012]: Now translated into Catalan by Alvaro Martinez. [20 Jan 2012]: Dan Gillmor’s We The Media added to make a round 20. [22 March 2012]: A book on DSLR, another on multimedia, and a third on news and documentary filmmaking added. [27 April 2012]: A book on security for journalists added. [29 April]: the Data Journalism Handbook added. [3 July 2012]: Mark Lee Hunter’s 3rd book added. [4 October 2012]: Adam Westbrook’s book on multimedia added. [5 February 2013]: ebooks on health data journalism and statistics added. [3 April 2013]: Guardian Students’ How to Blog ebook and The Bastards Book of Regular Expressions added. [2 May 2013]: book on plagiarism added. [10 May]: books on productivity and advanced search added. [2 June]: book on social media for journalists added, and Bayesian methods. [12 June]: book added on collaboration and innovation in online publishing.

 

Filed under: online journalism Tagged: adam tinworth, adam westbrook, adrian short, bayesian methods, Code 2.0, community management, CPJ, dan gillmor, Data Journalism Handbook, documentary, ebooks, Franzi Baerhle, free culture, global casebook, Guardian Students, Guy Degan, how to blog, imagejunkies, investigative journalism manual, jono bacon, Journalism 2.0, kindle, lawrence lessig, Mark Briggs, Mark Lee Hunter, matt mason, New Forms of Collaborative Innovation and Production on the Internet, nokia, paul radu, philip meyer, productivity, Proven Path, Remix, richard millington, security, SmarterEveryday: Design Your Day, story-based inquiry, Testing Treatments, the art of community, The Future of Ideas, The New Precision Journalism, The Pirate's Dilemma, University of Gottingen
10:52

20 free ebooks on journalism (for your Xmas Kindle) {updated to 38}

As many readers of this blog will have received a Kindle for Christmas I thought I should share my list of the free ebooks that I recommend stocking up on.

Online journalism and multimedia ebooks

Starting with more general books, Mark Briggs’s book Journalism 2.0 (PDF*) is now 4 years old but still provides a good overview of online journalism to have by your side. Mindy McAdams’s 42-page Reporter’s Guide to Multimedia Proficiency (PDF) adds some more on that front, and Adam Westbrook’s Ideas on Digital Storytelling and Publishing (PDF) provides a larger focus on narrative, editing and other elements.

After the first version of this post, MA Online Journalism student Franzi Baehrle suggested this free book on DSLR Cinematography, as well as Adam Westbrook on multimedia production (PDF). And Guy Degen recommends the free ebook on news and documentary filmmaking from ImageJunkies.com.

A free ebook on blogging can be downloaded from Guardian Students when you register with the site, and Swedish Radio have produced this guide to Social Media for Journalists (in English).

Computer assisted reporting ebooks

The Society of Professional Journalists’ Digital Media Handbook Part 1 (PDF) and Part 2 cover more multimedia, but also provide a pot-pourri of extra bits and pieces including computer assisted reporting (CAR).

For more on CAR, the first edition of Philip Meyer’s classic The New Precision Journalism is available in full online, although you’ll have to download each chapter in Word format and email it to your Kindle for conversion. It’s worth it: 20 years on, his advice is still excellent.

You’ll also have to download each chapter of the Data Journalism Handbook separately, or you can pay for a single-download ebook or physical version.

For a walkthrough on using some data techniques in the health field, this ebook on reporting health gives some excellent advice. Although it uses US data which is rather more accessible and structured than in most other countries, the principles are illustrative for readers anywhere.

If you want to explore statistics or programming further, Think Stats (via Adrian Short) covers both. The Bastards Book of Regular Expressions is a useful introduction to more programming – it’s free if you choose a zero price, but you can also pay whatever you want.

On visualisation, here’s Chapter 1 and Chapter 2 from a book by Alberto Cairo (from a free course at the Knight Center).

On advanced search, Untangling The Web: A Guide to Internet Research is a whopping 643-page document released by the US National Security Agency following an FOIA request (thanks Neurobonkers). Sadly it’s scanned so you won’t be able to convert this to another format.

Community management ebooks

Jono Bacon’s The Art of Community (PDF) comes in at over 360 pages and is a thorough exploration – told largely through his own experiences – of an area that too few journalists understand.

The Proven Path (PDF) by Richard Millington is a more concise overview by one of the field’s leading voices (via Jan Kampmann).

A useful complement to these is Yochai Benkler’s landmark book on how networked individuals operate, The Wealth of Networks, which is available to download in full or part online from his page at Harvard University’s Berkman Center. And each chapter of Dan Gillmor’s We The Media is available in PDF format on O’Reilly’s site.

More recently, New Forms of Collaborative Innovation and Production on the Internet (PDF) is a free ebook from the University of Gottingen with a collection of chapters covering practices such as consumer co-creation, trust management in online communities, and “coordination and motivation of consumer contribution”.

Staying savvy in the information war

Simply dealing with the flood of information and work deserves a book itself – and one free option is SmarterEveryday: Design Your Day - Adam Tinworth is among the contributors.

If you’re reporting on health issues – or ever expect to deal with a press release from a health company – Testing Treatments (PDF) is well worth a read, providing an insight into how medicines and treatments are tested, and popular misconceptions to avoid. It’s littered with examples from reporting on health in the media, and well written. And if you need persuading why you should care, read this post (all of it) by Dr Petra Boynton on what happens when journalists fail to scrutinise press releases from health companies.

More broadly on the subject of keeping your wits about you, Dan Gillmor’s latest book on media literacy, Mediactive, is published under a Creative Commons licence as a PDF. And The American Copy Editors Society has published a 50-page ebook on attribution and plagiarism which includes social media and other emerging platforms.

Ebooks on culture, copyright and code

Lawrence Lessig has written quite a few books about law and how it relates to the media when content becomes digitised, as well as code more generally. Most of his work is available online for free download, including The Future of Ideas (PDF), Code 2.0 (PDF), Remix, and Free Culture.

Matt Mason’s book on how media culture is changed by “pirates” gives you a choice: you can download The Pirate’s Dilemma for whatever price you choose to pay, including nothing.

Investigative Journalism

Mark Lee Hunter has written 2 great free ebooks which strip away the mystique that surrounds investigative journalism and persuades so many journalists that it’s something ‘other people do’.

The first, Story-Based Inquiry (PDF), is an extremely useful guide to organising and focusing an investigation, demonstrating that investigative journalism is more about being systematic than about meeting strangers in underground car parks.

The second, The Global Casebook (PDF), is brilliant: a collection of investigative journalism – but with added commentary by each journalist explaining their methods and techniques. Where Story-Based Inquiry provides an over-arching framework, The Global Casebook demonstrates how different approaches can work for different stories and contexts.

He’s also worked with Luuk Sengers to produce Nine Steps from Idea to Story (PDF), which puts the story-based method into step-by-step form.

For more tips on investigative journalism the Investigative Journalism Manual (you’ll have to download each chapter separately) provides guidance from an African perspective which still applies whatever country you practise journalism in.

And if you’re particularly interested in corruption you may also want to download Paul Radu’s 50-page ebook Follow The Money: A Digital Guide for Tracking Corruption (PDF).

The CPJ have also published the Journalist Security Guide, a free ebook for anyone who needs to protect sources or work in dangerous environments. Scroll down to the bottom to find links to PDF, Kindle, ePub and iPad versions.

Related subjects: design, programming

That’s so many books I’m losing count, but if you want to explore design or programming there are dozens more out there. In particular, How to Think Like a Computer Scientist is an HTML ebook, but the Kindle deals with HTML pages too. Also in HTML are Probabilistic Programming and Bayesian Methods for Hackers (more statistics) and Digital Foundations: Introduction to Media Design (h/t Jon Hickman).

Have I missed anything?

Those are just the books that spring to mind or that I’ve previously bookmarked. Are there others I’ve missed?

*Some commenters have suggested I should point out that these are mostly PDFs, which some people don’t like. You can, however, convert a PDF to Kindle’s own mobi format by emailing it to your Kindle email address with ‘convert’ as the subject line (via Leonie in the comments). Christian Payne also recommends the free tool calibre for converting PDFs into the more Kindle-friendly .mobi and other formats.
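If you have a whole folder of these PDFs, the calibre route can be scripted rather than clicked through. What follows is only a rough sketch: it assumes calibre is installed and that its ebook-convert command-line tool is on your PATH, and the function names and the example filename are made up for illustration.

```python
import shutil
import subprocess
from pathlib import Path


def build_convert_command(pdf_path, target_suffix=".mobi"):
    """Return the argument list for calibre's ebook-convert tool.

    The function name and defaults are illustrative, not part of calibre.
    """
    source = Path(pdf_path)
    # ebook-convert picks the output format from the output file's extension
    target = source.with_suffix(target_suffix)
    return ["ebook-convert", str(source), str(target)]


def convert_pdf(pdf_path):
    """Convert one PDF to .mobi, assuming calibre's tools are installed."""
    if shutil.which("ebook-convert") is None:
        raise RuntimeError("ebook-convert not found: is calibre installed?")
    command = build_convert_command(pdf_path)
    # check=True raises an error if the conversion fails
    subprocess.run(command, check=True)
    return command[-1]
```

Calling convert_pdf("journalism20.pdf") (a hypothetical filename) should leave a journalism20.mobi next to the original. How readable the result is will depend on how the PDF was laid out, and scanned PDFs will not convert into usable text this way.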

Alternatively, if you change the orientation to landscape the original PDF can be read with formatting and images intact.

UPDATES [12 Jan 2012]: Now translated into Catalan by Alvaro Martinez. [20 Jan 2012]: Dan Gillmor’s We The Media added to make a round 20. [22 March 2012]: A book on DSLR, another on multimedia, and a third on news and documentary filmmaking added. [27 April 2012]: A book on security for journalists added. [29 April]: the Data Journalism Handbook added. [3 July 2012]: Mark Lee Hunter’s 3rd book added. [4 October 2012]: Adam Westbrook’s book on multimedia added. [5 February 2013]: ebooks on health data journalism and statistics added. [3 April 2013]: Guardian Students’ How to Blog ebook and The Bastards Book of Regular Expressions added. [2 May 2013]: book on plagiarism added. [10 May]: books on productivity and advanced search added. [2 June]: book on social media for journalists added, and Bayesian methods. [12 June]: book added on collaboration and innovation in online publishing.

 


Filed under: online journalism Tagged: adam tinworth, adam westbrook, adrian short, bayesian methods, Code 2.0, community management, CPJ, dan gillmor, Data Journalism Handbook, documentary, ebooks, Franzi Baerhle, free culture, global casebook, Guardian Students, Guy Degan, how to blog, imagejunkies, investigative journalism manual, jono bacon, Journalism 2.0, kindle, lawrence lessig, Mark Briggs, Mark Lee Hunter, matt mason, New Forms of Collaborative Innovation and Production on the Internet, nokia, paul radu, philip meyer, productivity, Proven Path, Remix, richard millington, security, SmarterEveryday: Design Your Day, story-based inquiry, Testing Treatments, the art of community, The Future of Ideas, The New Precision Journalism, The Pirate's Dilemma, University of Gottingen

February 08 2011

11:43

How private is a tweet?

The PCC has made its first rulings on a complaint over newspapers republishing a person’s tweets. The background to this is the publication in The Daily Mail and the Independent on Sunday of tweets by civil servant Sarah Baskerville. Adrian Short sums up the stories pretty nicely: “We could be forgiven for thinking you’re trying to make the news rather than report it.”

The complaint came under the headings of privacy and accuracy. In a nutshell, the PCC have not upheld the complaints and, in the process, decided that a public Twitter account is not private. That seems fair enough. However, it is noted that “her Twitter account and her blog [which the Independent quoted from, along with her Flickr account] both included clear disclaimers that the views expressed were personal opinions and were not representative of her employer.”

The wider issue is of course about privacy as a whole, and about the relationship between our professional and private lives. The stories – as Adrian Short outlines so well – are strangely self-contained. ‘It is terrible that this civil servant has opinions and drinks occasionally, because someone like me might say that it is terrible…’

Next they’ll be saying that journalists have opinions and drink too…

July 06 2010

10:05

Don’t stop us digging into public spending data

A very disturbing discovery by Chris Taggart last week: a number of councils in the UK are handing over their ‘open’ data to a company which only allows it to be downloaded for “personal” use.

As Chris himself points out, this runs completely against the spirit of the push to release public data in a number of ways:

  • Data cannot be used for “commercial gain”. This includes publishers wanting to present the information in ways that make most sense to the reader, and startups wanting to find innovative ways to involve people in their local area. Oh, and that whole ‘Big Society’ stuff.
  • The way the sites are built means you couldn’t scrape this information with a computer anyway
  • It’s only a part of the data. “Download the data from SpotlightOnSpend and it’s rather different from the published data [on the Windsor & Maidenhead site]. Different in that it is missing core data that is in W&M published data (e.g. categories), and that includes data that isn’t in the published data (e.g. data from 2008).”

It’s a very worrying path indeed. As Chris sums it up: “Councils hand over all their valuable financial data to a company which aggregates for its own purposes, and, er, doesn’t open up the data, shooting down all those goals of mashing up the data, using the community to analyse and undermining much of the good work that’s been done.”

The Transparency Board quickly issued a statement about this issue saying that “urgent” measures are being taken to rectify the problem.

And Spikes Cavell, who make the software, responded in Information Age, pointing out that “it is first and foremost a spend analysis software and consultancy supplier, and that it publishes data through SpotlightOnSpend as a free, optional and supplementary service for its local government customers. The hope is that this might help the company to win business, he explains, but it is not a money-spinner in itself.”

They are now promising to make the data available for download in its “raw form”, although it’s not clear what that will be. Adrian Short’s comment to the piece is worth reading.

Nevertheless, this is an issue that anyone interested in holding power to account should keep a close eye on. And to that aim, Chris has started an investigation on Help Me Investigate to find out how and why councils are giving access to their spending data. Please join it and help here.

(Comment or email me on paul at helpmeinvestigate.com if you want an invitation.)
