
December 29 2011

08:27

2 guest posts: 2012 predictions and “Social media and the evolution of the fourth estate”

I’ve written a couple of guest posts for Nieman Journalism Lab and the tech news site Memeburn. The Nieman post is part of a series looking forward to 2012. I’m never a fan of futurology, so I’ve cheated a little and talked about developments already in progress: new interface conventions in news websites; the rise of collaboration; and the skilling up of journalists in data.

Memeburn asked me a few months ago to write about social media’s impact on journalism’s role as the Fourth Estate, and it took me until this month to find the time to do so. Here’s the salient passage:

“But the power of the former audience is a power that needs to be held to account too, and the rise of liveblogging is teaching reporters how to do that: reacting not just to events on the ground, but to the reporting of those events by the people taking part: demonstrators and police, parents and politicians all publishing their own version of events — leaving journalists to go beyond documenting what is happening, and instead confirming or debunking the rumours surrounding that.

“So the role of journalist is moving away from that of gatekeeper and — as Axel Bruns argues — towards that of gatewatcher: amplifying the voices that need to be heard, factchecking the MPs whose blogs are 70% fiction or the Facebook users scaremongering about paedophiles.

“But while we are still adapting to this power shift, we should also recognise that that power is still being fiercely fought over. Old laws are being used in new ways; new laws are being proposed to reaffirm previous relationships. Some of these may benefit journalists — but ultimately not journalism, nor its fourth estate role. The journalists most keenly aware of this — Heather Brooke in her pursuit of freedom of information; Charles Arthur in his campaign to ‘Free Our Data’ — recognise that journalists’ biggest role as part of the fourth estate may well be to ensure that everyone has access to information that is of public interest, that we are free to discuss it and what it means, and that — in the words of Eric S. Raymond — “Given enough eyeballs, all bugs are shallow”.”

Comments, as always, very welcome.

October 04 2010

07:41

Where should an aspiring data journalist start?

In writing last week’s Guardian Data Blog piece on How to be a data journalist I asked various people involved in data journalism where they would recommend starting. The answers are so useful that I thought I’d publish them in full here.

The Telegraph’s Conrad Quilty-Harper:

Start reading:

http://www.google.com/reader/bundle/user%2F06076274130681848419%2Fbundle%2Fdatavizfeeds

Keep adding to your knowledge and follow other data journalists/people who work with data on Twitter.

Look for sources of data:

The ONS stats release calendar is a good start (http://www.statistics.gov.uk/hub/release-calendar/index.html). Look at the government data stores (Data.gov, Data.gov.uk, Data.london.gov.uk, etc.).

Check out WhatDoTheyKnow, Freebase, WikiLeaks, Many Eyes and Google Fusion Tables.

Find out where hidden data is and try to get hold of it: private companies looking for publicity, under-appreciated research departments, public bodies that release data but not in a granular form (e.g. the Met Office).

Test out cleaning/visualisation tools:

You want to be able to collect data, clean it, visualise it and map it.

Obviously you need to know basic Excel skills (pivot tables are how journalists efficiently get headline numbers from big spreadsheets).
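
The same pivot-table idea also works in code. Here is a minimal sketch in Python with pandas; the file and column names (spending.csv, department, year, amount) are invented for illustration, not taken from any real dataset.

```python
# A sketch of the Excel pivot-table idea in Python/pandas. The file and
# column names ("spending.csv", "department", "year", "amount") are
# invented -- substitute whatever your spreadsheet actually contains.
import pandas as pd

spending = pd.read_csv("spending.csv")

# Headline numbers: total spend per department, broken down by year.
headline = spending.pivot_table(
    index="department",
    columns="year",
    values="amount",
    aggfunc="sum",
)

print(headline.head(10))
```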

For publishing, just use Google Spreadsheets graphs, ManyEyes or Timetric. Google MyMaps coupled with http://batchgeo.com is a great beginner mapping combo.

Further on from that, you want to try out Google Spreadsheets’ importURL service, Yahoo Pipes for cleaning data, Freebase Gridworks and Dabble DB.
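
The “import a table straight from a URL” idea behind importURL and Yahoo Pipes can also be done in a few lines of Python. A small sketch using pandas.read_html; the URL is a placeholder, and the page is assumed to contain at least one HTML table.

```python
# Sketch of importing an HTML table straight from a web page into a
# dataframe. The URL is a placeholder; the page is assumed to contain
# at least one <table> element. (pandas.read_html needs lxml or
# html5lib installed to parse the page.)
import pandas as pd

url = "https://example.org/some-statistics-page"
tables = pd.read_html(url)   # one dataframe per <table> on the page
df = tables[0]               # pick the table you actually want

df.to_csv("imported_table.csv", index=False)
print(df.head())
```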

For more advanced stuff, you want to figure out a query language and be able to work with relational databases, Google BigQuery, the Google Visualisation API (http://code.google.com/apis/charttools/), the Google Code playgrounds (http://code.google.com/apis/ajax/playground/?type=visualization#org_chart) and other JavaScript tools. The advanced mapping equivalents are ArcGIS or GeoConcept, which let you query geographical data and find stories.

You could also learn some Ruby for building your own scrapers, or Python for ScraperWiki.
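
To make that concrete, here is a bare-bones sketch of the kind of scraper you might write for ScraperWiki, in Python with requests and BeautifulSoup; the URL and the table structure it assumes are hypothetical.

```python
# A bare-bones scraper: fetch a page, pull the rows out of a table,
# save them as CSV. The URL and the HTML structure are hypothetical.
import csv
import requests
from bs4 import BeautifulSoup

response = requests.get("https://example.org/council-spending")
response.raise_for_status()
soup = BeautifulSoup(response.text, "html.parser")

rows = []
for tr in soup.select("table tr")[1:]:   # [1:] skips the header row
    cells = [td.get_text(strip=True) for td in tr.find_all("td")]
    if cells:
        rows.append(cells)

with open("scraped.csv", "w", newline="") as f:
    csv.writer(f).writerows(rows)
```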

Get inspired:

Get the data behind some big data stories you admire, try and find a story, visualise it and blog about it. You’ll find that the whole process starts with the data, and your interpretation of it. That needs to be newsworthy/valuable.

Look to the past!

Edward Tufte’s work is very inspiring: http://www.edwardtufte.com/tufte/ His favourite data visualisation is from 1869! Or what about John Snow’s Cholera map? http://www.york.ac.uk/depts/maths/histstat/snow_map.htm

And for good luck here’s an assorted list of visualisation tutorials.

The Times’ Jonathan Richards:

I’d say a couple of blogs.

Others that spring to mind are:

If people want more specific advice, tell them to come to the next London Hack/Hackers and track me down!

The Guardian’s Charles Arthur:

Obvious thing: find a story that will be best told through numbers. (I’m thinking about quizzing my local council about the effects of stopping free swimming for children. Obvious way forward: get the numbers of children swimming before, during and after the free swimming offer.)

If someone already has the skills for data journalism (which I’d put at (1) understanding statistics and relevance, (2) understanding how to manipulate data, and (3) understanding how to make the data visual), the key, I’d say, is always being able to spot a story that can be told through data – one that only makes sense that way, and where being able to manipulate the data is key to extracting the story. It’s like interviewing the data. Good interviewers know how to get what they want out of the conversation. Ditto good data journalists and their data.

The New York Times’ Aron Pilhofer:

I would start small, and start with something you already know and already do. And always, always, always remember that the goal here is journalism. There is a tendency to focus too much on the skills for the sake of skills, and not enough on how those skills help enable you to do better journalism. Be pragmatic about it, and resist the tendency to think you need to know everything about the techy stuff before you do anything — nothing could be further from the truth.

Less abstractly, I would start out learning some basic computer-assisted reporting skills and then move on from there as your interests/needs dictate. A lot of people see the programmer/journalism thing as distinct from computer-assisted reporting, but I don’t. I see it as a continuum. I see CAR as a “gateway drug” of sorts: once you start working with small data sets using tools like Excel, Access, MySQL, etc., you’ll eventually hit the limits of what you can do with macros and SQL.

Soon enough, you’ll want to be able to script certain things. You’ll want to get data from the web. You’ll want to do things you can only do using some kind of scripting language, and so it begins.

But again, the place to start isn’t thinking about all these technologies. The place to start is thinking about how these technologies can enable you to tell stories you would never otherwise be able to tell. And you should start small. Look for little things to start with, and go from there.

September 22 2010

10:40

Why did you get into data journalism?

In researching my book chapter I asked a group of journalists who worked with data what led them to do so. Here are their answers:

Jonathan Richards, The Times:

The flood of information online presents an amazing opportunity for journalists, but also a challenge: how on earth does one keep up with it, let alone make sense of it? You could go about it in the traditional way, fossicking in individual sites, but much of the journalistic value in this outpouring, it seems, comes in aggregation: in processing large amounts of data, distilling them, and exploring them for patterns. To do that – unless you’re superhuman, or have a small army of volunteers – you need the help of a computer.

I ‘got into’ data journalism because I find this mix exciting. It appeals to the traditional journalistic instinct, but also calls for a new skill which, once harnessed, dramatically expands the realm of ‘stories I could possibly investigate…’

Mary Hamilton, Eastern Daily Press:

I started coding out of necessity, not out of desire. In my day-to-day work for local newspapers I came across stories that couldn’t be told any other way. Excel spreadsheets full of data that I knew was relevant to readers if I could break it down or aggregate it up. Lists of locations that meant nothing on the page without a map. Timelines of events and stacks of documents. The logical response for me was to try to develop the skills to parse data to get to the stories it can tell, and to present it in interactive, interesting and – crucially – relevant ways. I see data journalism as an important skill in my storytelling toolkit – not the only option, but an increasingly important way to open up information to readers and users.

Charles Arthur, The Guardian:

When I was really young, I read a book about computers which made the point – rather effectively – that if you found yourself doing the same process again and again, you should hand it over to a computer. That became a rule for me: never do some task more than once if you can possibly get a computer to do it.

Obviously, to implement that you have to do a bit of programming. It turns out all programming languages are much the same – they vary in their grammar, but they’re all about making the computer do stuff. And it’s often the same stuff (at least in my ambit) – fetch a web page, mash up two sets of data, filter out some rubbish and find the information you want.
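
As a concrete illustration of the mash-up-and-filter part of that routine, here is a minimal sketch in Python with pandas; the file names, column names and filter value are invented for illustration.

```python
# Mash up two sets of data and filter out the rubbish. File names,
# column names and the filter value are invented for illustration.
import pandas as pd

schools = pd.read_csv("schools.csv")     # one row per school
results = pd.read_csv("results.csv")     # exam results, keyed by school_id

merged = schools.merge(results, on="school_id", how="left")

# Keep only the rows you actually want: schools with a score, in your patch.
merged = merged.dropna(subset=["score"])
local = merged[merged["region"] == "Norfolk"]

print(local.sort_values("score", ascending=False).head(20))
```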

I got into data journalism because I also did statistics – and that taught me that people are notoriously bad at understanding data. Visualisation and simplification and exposition are key to helping people understand.

So data journalism is a compound of all those things: determination to make the computer do the slog, confidence that I can program it to, and the desire to tell the story that the data is holding and hiding.

I don’t think there was any particular point where I suddenly said “ooh, this is data journalism” – it’s more that the process of thinking “oh, big dataset, stuff it into an ad-hoc MySQL database, left join against that other database I’ve got, see what comes out” goes from being a huge experiment to your natural reaction.
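
That reflex looks something like the sketch below; SQLite stands in for MySQL so the example is self-contained, and the file, table and column names are invented.

```python
# The "stuff it into an ad-hoc database and left join" reflex, sketched
# with SQLite standing in for MySQL. File, table and column names are
# invented for illustration.
import csv
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE payments (council TEXT, supplier TEXT, amount REAL)")
conn.execute("CREATE TABLE councils (council TEXT, population INTEGER)")

for path, table, placeholders in [
    ("payments.csv", "payments", "?, ?, ?"),
    ("councils.csv", "councils", "?, ?"),
]:
    with open(path, newline="") as f:
        reader = csv.reader(f)
        next(reader, None)  # skip the header row
        conn.executemany(f"INSERT INTO {table} VALUES ({placeholders})", reader)

# See what comes out: spend per head of population, biggest first.
query = """
    SELECT p.council, SUM(p.amount) / c.population AS spend_per_head
    FROM payments p
    LEFT JOIN councils c ON p.council = c.council
    GROUP BY p.council
    ORDER BY spend_per_head DESC
"""
for council, spend_per_head in conn.execute(query):
    print(council, spend_per_head)
```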

It’s not just data though – I use programming to slough off the repetitive tasks of the day, such as collecting links, or resizing pictures, or getting the picture URL and photographer and licence from a Flickr page and stuffing it into a blogpost.

Data journalism is actually only half the story. The other half is that journalists should be **actively unwilling** to do repetitive, machine-like tasks (say, removing line breaks from a piece of copy, or changing a link format).

Time spent doing those sorts of tasks is time lost to journalism and given up to being a machine. Let the damn machines do it. Humans have better things to do.
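
Those machine-like chores usually take only a couple of lines. A sketch of the two examples above in Python; the link markup it produces is just an invented target format.

```python
# Two of the machine-like chores mentioned above, done by the machine.
import re

copy = "A paragraph that arrived\nwith hard line breaks\nscattered through it."

# Remove stray line breaks from a piece of copy.
clean = re.sub(r"\s*\n\s*", " ", copy)

# Change a link format: here, bare URLs into HTML anchors (the target
# markup is just an invented example of a format change).
linked = re.sub(r"(https?://\S+)", r'<a href="\1">\1</a>', clean)

print(linked)
```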

Stijn Debrouwere, Belgian information designer:

I used to love reading the daily newspaper, but lately I can’t seem to be bothered anymore. I’m part of that generation of people news execs fear so much: those who simply don’t care about what newspapers and news magazines have to offer. I enjoy being an information designer because it gives me a chance to help reinvent the way we engage and inform communities through news and analysis, both offline and online. Technology doesn’t solve everything, but it sure can help. My professional goal is simply this: to make myself love news and newspapers again, and thereby, hopefully, to get others to love them too.
