Tumblelog by Soup.io

June 03 2011


Speaker presentations: Session 2B – Social media strategy

Find below presentations from Session 2B – Social media strategy.

The session featured Jack Riley, head of digital audience and content development, the Independent; Stefan Stern, director of strategy, Edelman; Mark Jones, global communities editor, Reuters News; Mark Johnson, community editor, the Economist and Suw Charman-Anderson, social technologist.

Jack Riley, head of digital audience and content development, the Independent

Mark Jones, global communities editor, Reuters News

Mark Johnson, community editor, the Economist

April 21 2011


A template for successful 'Hack Days.' What are your tips?

“Always be releasing.”

I’m going to be helping with some ‘Hack Days’ later this year for the Knight-Mozilla partnership. One in Berlin in September or October, and another with the Boston Hacks/Hackers chapter during the Online News Association conference.

I’d also like to organize some here in Toronto with our local Hacks/Hackers chapter, with support from the local start-up community and the media organizations that come out to our events: CBC, The Globe & Mail, Postmedia, Global News, OpenFile.ca, and so on.

(Hey, while I’m thinking of it, the Toronto chapter of Hacks/Hackers is pushing toward 300 members — if you haven’t joined yet, why don’t you? It only takes a minute.)

I’ve attended more than a few hack days in my life (or hackathons, code jams, or whatever the kids are calling them now). Let me tell you, in case you don’t already know from experience, that building useful software in 1-2 days with people you don’t know is fucking hard.

That’s why I admire events that have the infrastructure figured out (and I’m not talking about pizza and Jolt cola). Specifically, I’m impressed when they’ve answered the question “how are we going to host these apps, show them off publicly, and make the code available for others to build on?”

It sounds like the recent Buttercamp in NYC did a great job, but they didn’t get 100% of the demos online at the end of the event, and that’s a missed opportunity.

That should be the target that teams strive for, and the eligibility criteria to ‘win’ at the event. Code and demos should be online and visible to the other teams and to the public. Always be releasing.

Surely this is a solved problem? I’m hoping there’s a write-up out there somewhere that will be posted in the comments. But, if that’s not the case, here’s what I’m proposing:

  • The event organizers do a fair bit of prep work in advance to make the above possible. Specifically, by setting up an event-specific GitHub ‘organization’ that participants will be added to.

  • There should be a straightforward LAMP-stack Amazon EC2 instance made available for each team, or a bare-bones instance that they can set up with their preferred stack.

  • Each of the public URLs for the instances should be listed, so that teams and organizers can check on progress. Always be releasing. (One could even have fun with a leaderboard-type setup, e.g., a simple HTML page with a bunch of iframes — one for each team’s instance — that reloads their app every minute or so. Same goes for a tally of GitHub commits by team.)

  • With a bit more planning, organizers could even look into using something like DotCloud — or one of the PaaS providers — or a fancy set-up like this one for Node.js that auto-deploys to the EC2 instance with the ‘git push’ command.
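The leaderboard idea above is simple enough to sketch. Here's a minimal, hypothetical version in Python that generates a single HTML page with one auto-reloading iframe per team instance; the team names and EC2 URLs are made up for illustration:

```python
# Sketch of the leaderboard page idea: one HTML file, one iframe per team.
# Team names and instance URLs below are hypothetical placeholders.
html_head = """<!DOCTYPE html>
<html><head><title>Hack day leaderboard</title>
<!-- reload the whole page every 60 seconds so every team's app refreshes -->
<meta http-equiv="refresh" content="60">
</head><body><h1>Hack day leaderboard</h1>
"""

def leaderboard(teams):
    """Build the leaderboard HTML from {team name: instance URL} pairs."""
    frames = "\n".join(
        f'<div><h2>{name}</h2>'
        f'<iframe src="{url}" width="400" height="300"></iframe></div>'
        for name, url in teams.items()
    )
    return html_head + frames + "\n</body></html>"

page = leaderboard({
    "Team Alpha": "http://ec2-xx-1.example.com/",
    "Team Beta": "http://ec2-xx-2.example.com/",
})

# write it out for hosting somewhere public during the event
with open("leaderboard.html", "w") as f:
    f.write(page)
```

A static page like this could live on the organizers' own instance, so spectators (and rival teams) can watch progress without anyone doing demos by hand.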

For the finale at the hack day, each team should have to present their work live, via the EC2 instance URL. It should also be required that what’s running on the EC2 instance is a straightforward ‘git pull’ of the code in their GitHub repository.

Maybe that’s the suspense-building moment, where the teams — live, in front of an audience — have to run a ‘git pull’ and deploy their app, and then load it in a browser. Oh, the suspense!

That’s how I would add some exponential generativity to a hack day. Mark my words, it would work.

UPDATE: While I’m thinking of it, it would also be great if each team had a person willing to live blog, or status update, during the event, to expose the whole process — the challenges, the breakthroughs, and opportunities for others to check out their app.

April 13 2011


Which blog platform should I use? A blog audit

When people start out blogging they often ask what blogging platform they should use – WordPress or Blogger? Tumblr or Posterous? It’s impossible to give an answer, because the first questions should be: who is going to use it, how, and for what and for whom?

To illustrate how the answers to those questions can help in choosing the best platform, I decided to go through the 35 or so blogs I have created, and why I chose the platforms that they use. As more and more publishing platforms have launched, and new features have been added, some blogs have changed platforms, while newer ones have made different choices to older ones.

Bookmark blogs (Klogging) – Blogger and WordPress to Delicious and Tumblr

When I first began blogging it was essentially what’s called ‘klogging’ (knowledge blogging) – a way to keep a record of useful information. I started doing this with three blogs on Blogger, each of which was for a different class I taught: O-Journalism recorded reports in the field for online journalism students; Interactive Promotion and PR was created to inform students on a module of the same name (later exported to WordPress); and students on the Web and New Media module could follow useful material on that blog.

The blogs developed with the teaching, from being a place where I published supporting material, to a group blog where students themselves could publish their work in progress.

As a result, Web and New Media was moved to WordPress where it became a group blog maintained by students (now taught by someone else). The blog I created for the MA in Television and Interactive Content was first written by myself, then quickly handed over to that year’s students to maintain. When I started requiring students to publish their own blogs the original blogs were retired.

One-click klogging

By this time my ‘klogging’ had moved to Delicious. Webpages mentioned in a specific class were given a class-specific tag such as MMJ02 or CityOJ09. And students who wanted to dig further into a particular subject could use subject-specific tags such as ‘onlinevideo‘ or ‘datajournalism‘.

For the MA in Television and Interactive Content, then, I simply invented a new tag – ‘TVI’ – and set up a blog using Tumblr to pull anything I bookmarked on Delicious with that tag. (This was done in five minutes by clicking on ‘Customise‘ on the main Tumblr page, then clicking on Services and scrolling down to ‘Automatically import my…‘ and selecting RSS feed as Links. Then in the Feed URL box paste the RSS feed at the bottom of delicious.com/paulb/tvi).

(You can do something similar with WordPress – which I did here for all my bookmarks – but it requires more technical knowhow).
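Under the hood, all these imports do the same thing: fetch a bookmark feed and turn each item into a post-ready link. For the curious, here's a rough sketch in Python using only the standard library; the feed content is a made-up stand-in for the Delicious tag feed, not real data:

```python
# Minimal sketch of an RSS-to-linkblog import, the kind of thing Tumblr's
# feed import does automatically. The sample feed below is hypothetical.
import xml.etree.ElementTree as ET

sample_feed = """<?xml version="1.0"?>
<rss version="2.0"><channel>
  <title>paulb's tvi bookmarks</title>
  <item><title>Interactive TV example</title>
        <link>http://example.com/itv</link></item>
  <item><title>Second screen research</title>
        <link>http://example.com/second-screen</link></item>
</channel></rss>"""

def items_from_feed(feed_xml):
    """Return (title, link) pairs for each item in an RSS 2.0 feed."""
    root = ET.fromstring(feed_xml)
    return [(item.findtext("title"), item.findtext("link"))
            for item in root.iter("item")]

# Each pair becomes one linkblog entry:
for title, link in items_from_feed(sample_feed):
    print(f'<a href="{link}">{title}</a>')
```

In practice you would fetch the live feed URL rather than a string, but the parsing step is the same; this is why any service that emits RSS can feed any platform that consumes it.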

For klogging quotes for research purposes I also use Tumblr for Paul’s Literature Review. I’ve not used this as regularly or effectively as I could or should, but if I were embarking on a particularly large piece of research it would be invaluable for keeping track of key passages in what I’m reading. Used in conjunction with a Kindle, it could be particularly powerful.

Back to the TVI bookmarks: another five minutes on Feedburner allowed me to set up a daily email newsletter of those bookmarks that students could subscribe to as well, and a further five minutes on Twitterfeed sent those bookmarks to a dedicated Twitter feed too (I could also have simply used Tumblr’s option to publish to a Twitter feed). ‘Blogging’ had moved beyond the blog.

Resource blogs – Tumblr and Posterous

For my Online Journalism module at City University London I use Tumblr to publish a curated, multimedia blog in addition to the Delicious bookmarks: Online Journalism Classes collects a limited number of videos, infographics, quotes and other resources for students. Tumblr was used because I knew most content would be instructional videos and I wanted a separate place to collect these.

The more general Paul Bradshaw’s Tumblelog (http://paulbradshaw.tumblr.com/) is where I maintain a collection of images, video, quotes and infographics that I look to whenever I need to liven up a presentation.

For resources based on notes or documents, however, Posterous is a better choice.

Python Notes and Notes on Spreadsheet Formulae and CAR, for example, both use Posterous as a simple way for me to blog my own notes on both (Python is a programming language) via a quick email (often drafted while on the move without internet access).

Posterous was chosen because it is very easy to publish and tag content, and I wanted to be able to access my notes based on tag (e.g. VLOOKUP) when I needed to remember how I’d used a particular formula or function.

Similarly, Edgbaston Election Campaign Expenses and Hall Green Election Campaign Expenses use Posterous as a quick way to publish and tag PDFs of election expense receipts from both constituencies (how this was done is explained here), allowing others to find expense details based on candidate, constituency, party or other details, and providing a space to post comments on findings or things to follow up.

Niche blogs – WordPress and Posterous

Although Online Journalism Blog began as ‘klogging’ it soon became something more, adding analysis, research, and contributions from other authors, and the number of users increased considerably. Blogger is not the most professional-looking of platforms, however (unless you’re prepared to do a lot of customisation), so I moved it to WordPress.com. And when I needed to install plugins for extra functionality I moved it again to a self-hosted WordPress site.

Finally, when the site was the victim of repeated hacking attempts I moved it to a WordPress MU (multi user) site hosted by Philip John’s Journal Local service, which provided technical support and a specialised suite of plugins.

If you want a powerful and professional-looking blogging platform it’s hard to beat WordPress.com, and if you want real control over how it works – such as installing plugins or customising themes – then a self-hosted WordPress site is, for me, your best option. I’d also recommend Journal Local if you want that combination of functionality and support.

If, however, you want to launch a niche blog quickly and functionality is not an issue then Posterous is an even better option, especially if there will be multiple contributors without technical skills. Council Coverage in Newspapers, for example, used Posterous to allow a group of people to publish the results of an investigation on my crowdsourced investigative journalism platform Help Me Investigate. The Hospital Parking Charges Blog did the same for another investigation, but as it was only me publishing, I used WordPress.

Group blogs – Posterous and Tumblr

Posterous suits groups particularly well because members only need to send their post to a specific email address that you give them (such as post@yourblog.posterous.com) to be published on the blog.

It also handles multimedia and documents particularly well – when I was helping Podnosh‘s Nick Booth train a group of people with Flip cameras we used Posterous as an easy way for members of a group to instantly publish the video interviews they were doing by simply sending them to the relevant email address (Posterous will also cross-publish to YouTube and Twitter, simplifying those processes).

A few months ago Posterous launched a special ‘Groups’ service that publishes content in a slightly different way to make it easier for members to collaborate. I used this for another Help Me Investigate investigation – Recording Council Meetings – where each part of the investigation is a post/thread that users can contribute to.

Again, Posterous provides an easy way to do this – all people need to know is the email address to send their contribution to, or the web address where they can add comments to other posts.

If your contributors are more blog-literate and want to retain more control over their content, another option for group blogs is Tumblr. Brumblr, for example, is one group blog I belong to for Birmingham bloggers, set up by Jon Bounds. ‘We Love Michael Grimes‘ is another, set up by Pete Ashton, that uses Tumblr for people to post images of Birmingham’s nicest blogger.

Blogs for events – Tumblr, Posterous, CoverItLive

When I organised a Citizen Journalism conference in 2007, I used a WordPress blog to build up to it, write about related stories, and then link to reports on the event itself. Likewise, when later that year the NUJ asked me to manage a team of student members as they blogged that year’s ADM, I used WordPress for a group blog.

As the attendees of further events began to produce their own coverage, the platforms I chose evolved. For JEEcamp.com (no longer online), I used a self-hosted WordPress blog with an aggregation plugin that pulled in anything tagged ‘JEEcamp’ on blogs or Twitter. CoverItLive was also used to liveblog – and was then adopted successfully by attendees when they returned to their own news operations around the country (and also, interestingly, by Downing Street after they saw the tool being used for the event).

For the final JEEcamp I used Tumblr as an aggregator, importing the RSS feed from blog search engine Icerocket for any mention of ‘JEEcamp’.

In future I may experiment with the Posterous iPhone app’s new Events feature, which aggregates posts in the same location as you.

Aggregators – Tumblr

Sometimes you just want a blog to keep a record of instances of a particular trend or theme. For example, I got so sick of people asking “Is blogging journalism?” that I set up Is Ice Cream Strawberry?, a Tumblr blog that aggregates any articles that mention the phrases “Is blogging journalism”, “Are bloggers journalists” and “Is Twitter journalism” on Google News.

This was set up in the same way as detailed above, with the Feed URL box completed using the RSS feed from the relevant search on Google News or Google Blog Search (repeat for each feed).

Likewise, Online Journalism Jobs aggregates – you’ve got it – jobs in online journalism or that use online journalism skills. It pulls from the RSS feed for anything I bookmark on Delicious with the tag ‘ojjobs’ – but it can also be done manually with the Tumblr bookmark or email address, which is useful when you want to archive an entire job description that is longer than Delicious’s character limit.

Easy hyperlocal blogging – WordPress, Posterous and Tumblr

For a devoted individual hyperlocal blog WordPress seems the best option due to its power, flexibility and professionalism. For a hyperlocal blog where you’re inviting contributions from community members via email, Posterous may be better.

But if you want to publish a hyperlocal blog and have never had the time to do it justice, Tumblr provides a good way to make a start without committing yourself to regular, wordy updates. Boldmere High Street is my own token gesture – essentially a photoblog that I update from my mobile phone when I see something of interest – and take a photo – as I walk down the high street.

Personal blogs

As personal blogs tend to contain off-the-cuff observations, copies of correspondence or media, Posterous suits them well. Paul Bradshaw O/T (Off Topic) is mine: a place to publish things that don’t fit on any of the other blogs I publish. I use Posterous as it tends to be email-based, sometimes just keeping web-based copies of emails I’ve sent elsewhere.

It’s difficult to prescribe a platform for personal blogs as they are so… personal. If you talk best about your life through snatches of images and quotes, Tumblr will work well. I have a family Tumblr, for example, that pulls images and video from a family Flickr account, tweets from a family Twitter feed, video from a family YouTube account, and also allows me to publish snatches of audio or quotes.

You could, for instance, use this to create an approved-members-only Facebook page for the family – so other family members can ‘follow’ their grandchildren – and publish updates from the Tumblr blog via RSS Graffiti. Facebook is, ultimately, the most popular personal blogging platform.

If it is hard to separate your personal life from your professional life, or your personal hobby involves playing with technology, WordPress may be a better choice.

And Blogger may be an easy way to bring together material from Google properties such as Picasa and Orkut.

Company blogs

Although Help Me Investigate’s blog started as two separate blogs on WordPress (one for company updates, the other for investigation tips), it now uses Posterous for both, as it’s an easier way for multiple people to contribute.

This is because ease of publishing is more important than power – but for many companies WordPress is going to be the most professional and flexible option.

For some, Tumblr will best communicate their highly visual and creative nature. And for others, Posterous may provide a good place to easily publish documents and video.

Blogs – flexible enough for anything

What emerges from all the above is that blogs are just a publishing platform. There was a time when you had to customise WordPress, Typepad or Blogger to do what you wanted – from linkblogging and photoblogging to group blogs and aggregation. But those problems have since been solved by an increasing range of bespoke platforms.

Social bookmarking platforms and Twitter made it easier to linkblog; Tumblr made it easier to photoblog or aggregate RSS feeds. Posterous lowered the barrier to make group blogging as easy as sending an email. CoverItLive piggybacked on Twitter to aggregate live event coverage. And Facebook made bloggers of everyone without them realising.

A blog can now syndicate itself across multiple networks: Tumblr and Posterous make it easy to automatically cross-publish links and media to Twitter, YouTube and any other media-specific platform. RSS feeds can be pulled from Flickr, Delicious, YouTube or any of dozens of other services into a Facebook page or a WordPress widget.

What is important is not to be distracted by the technology, but focus on the people who will have to use it, and what they want to use it for.

To give a concrete example: I was once advising an organisation who wanted to publish their work online and help young people get their work out there. The young people used mobile phones (Blackberrys) and were on Facebook, but the organisation also wanted the content created by those young people to be seen by potential funders, in a professional context.

I advised them to:

  • Set up a moderated Posterous so that it would cross-publish to individuals’ Facebook pages (so there would be instant feedback for those users rather than it be published in an isolated space online that their friends had to go off and find);
  • Give the Posterous blog email address to the young people so they could use it to send in their work (making it easy to use on a device they were comfortable with);
  • Then to set up a separate ‘official’ WordPress site that pulled in the Posterous feed into a side-widget alongside the more professional, centrally placed, content (meeting the objectives of the organisation).

This sounds more technically complex than it is in practice, and the key thing is that it makes publishing as easy as possible: the young users of the service only had to send images and comments to an email address; members of the organisation only had to write blog posts. Everything else, once set up, was automated. And free.

Many people hesitate before blogging, thinking that their effort has to be right first time. It doesn’t. Going through these blogs I counted around 35 that I’ve either created or been involved in. Many of those were retired when they ceased to be useful; some were transferred to new platforms. Some changed their names, some were deleted. Increasingly, they are intended from the start to have a limited shelf life. But every one has taught me something.

And those are just my experiences – how have you used blogs in different ways? And how has it changed?


April 12 2011


Hacks & Hackers Glasgow: the BBC College of Journalism video

Last month we celebrated the final leg of our UK & Ireland Hacks & Hackers tour in Glasgow, at an event hosted by BBC Scotland and supported by BBC College of Journalism and Guardian Open Platform. You can read more about it here. Other coverage includes:

The BBC College of Journalism kindly filmed the whole thing and the videos are now available to watch. The whole playlist can be viewed here, or watch each segment in the clips below:

April 11 2011


Links: Session 2B – social media strategy

Continuing with our links posts in the lead up to news:rewired – noise to signal, here is a collection of resources for session 2B, which will look at social media strategy – how news organisations and journalists are using social media platforms to report and engage with online communities.

Speakers on the panel will include Jack Riley, head of digital audience and content development, the Independent; Robin Hamman, director of digital, Edelman and Mark Jones, global communities editor, Reuters News.

In session 1B we’ll already have looked at how journalists can use social media to source news, and the filtering tools to bring this information into a news organisation, but what should they be putting out on these channels and how can they measure the impact of their online reputation?

Topical blog posts:

  • Over on the Guardian Data Blog an analysis of more than 82,000 tweets by journalists and UK media sources, carried out by Tweetminster, offers up a great visualisation showing exactly how the UK media used Twitter to report on different stories and topics in February. You can also access the raw data in the post.
  • Similarly, in this follow-up infographic Tony Hirst uses the Tweetminster API to gather lists of UK political and current affairs journalists, finds out who they follow on Twitter and then uses the free network visualisation software Gephi to visualise how they link together – raising important questions about social media networks.
  • At the end of this comment article by Jeff Reifman, founder of NewsCloud, he offers some tips on what news organisations can be doing better to “prepare for the future”, including through the use of social media such as by hosting community space on the web for reader interaction around content.
  • Back on the Guardian, editor-in-chief Alan Rusbridger outlines 15 things which he feels Twitter does “rather effectively and which should be of the deepest interest to anyone involved in the media at any level”.
  • This post on the Wall Blog speaks to news:rewired speaker Jack Riley about how the Independent managed to boost referrals from Facebook by 680 per cent from January to December 2010, and 250 per cent from Twitter.

Other resources/guides:

  • Poynter has a useful post looking at how newsrooms may want to develop social networking policies for journalists – an interesting side of the debate when looking at individual journalists reporting on social media platforms
  • It’s time for journalists to promote a better ‘Twitter style’ – the Online Journalism Review’s Robert Niles looks at how journalists using Twitter (for example) to report could work to promote a certain format for different forms of reporting, such as the use of tags to denote an eye-witness account
  • Media Helping Media has used Scoop.it to curate a social media kitbag, offering plenty of tools which may be of interest to journalists working on social media platforms, or looking to bring communities together around content
  • Paul Bradshaw outlines on his Online Journalism Blog how to create a Facebook news feed, which could be used by news organisations to promote the work of an individual journalist (with their permission)
  • On its developers blog Facebook itself outlines the different ways news organisations have been using the social media site effectively to report on stories and engage with readers, such as through implementation of the Activity Feed and Recommendations social plugins, or Live Stream for event coverage.

Issues to debate:

  • Should a news organisation’s social media strategy differ from an individual journalist’s – and if so, how? And how can both parties work to develop what they do?
  • Where do you start in developing a strategy? What should the main goals be and what are the ‘winning’ formulas for getting there?
  • Beyond setting up a Twitter or Facebook account, what other functionalities of these well-known platforms should a news organisation be exploiting? And what other platforms are out there to report on in innovative ways?
  • What role should a news organisation take in training/drawing up ethical guidelines, for the use of social media by its newsroom?
  • And one key question for debate – how can you measure the success, and ultimately the impact, of a news organisation’s social media strategy and its online reputation?

April 04 2011


D (ata) + J (ournalism) + Camp 2011 = #djcamp2011

Here at ScraperWiki we like to learn. And we also relish the opportunity to teach. Be it scraping or viewing, Ruby or Python or PHP: we want to spread the data and the scraping knowledge.

So it’ll come as no big surprise that our head professor, Francis Irving, will be lending a scraping hand at #djcamp2011.

“What is #djcamp2011?”, you ask. Here’s what you need to know:

So for the many hacks we’ve met at our Hacks and Hackers Hack Days, this is a brilliant opportunity to learn some of the hacker trade secrets! Sign up here.

March 15 2011


Cardiff Hacks and Hackers Hacks Day

What’s occurin’? Loads in fact, at our first Welsh Hacks and Hackers Hack Day! From schools from space to a catering college with a Food Hygiene Standard rating of 2, we had an amazing day.

We had five teams:

Co-Ordnance – This project aimed to be a local business tracker. They wanted to turn London Stock Exchange code into meaningful data, but alas, the stock exchange prevents scraping. So they decided to use company data from registers like the LSE and Companies House to extract business information and structure it for small businesses who need to know the best place to set up, and for local business activists.

The team consisted of 3 hacks (Steve Fossey, Eva Tallaksen from Intrafish and Gareth Morlais from BBC Cymru) and 3 hackers (Carey Hiles, Craig Marvelley and Warren Seymour, all from Box UK).

It’s a good thing they had some serious hackers as they had a serious hack on their hands. Here’s a scraper they did for the London Stock Exchange ticker. And here’s what they were able to get done in just one day!

This was just a locally hosted site but the map did allow users to search for types of businesses by region, see whether they’d been dissolved and by what date.

Open Senedd – This project aimed to be a Welsh version of TheyWorkForYou: a way for people in Wales to find out how assembly members voted in plenary meetings. It tackles the worthy task of making assembly members’ voting records accessible and transparent.

The team consisted of 2 hacks (Daniel Grosvenor from CLIConline and Hannah Waldram from Guardian Cardiff) and 2 hackers (Nathan Collins and Matt Dove).

They spent the day hacking away and drew up an outline for www.opensenedd.org.uk. We look forward to the birth of their project! Which may or may not look something like this (left). Minus Coke can and laptop hopefully!

They took on a lot for a one day project but devolution will not stop the ScraperWiki digger!

There’s no such thing as a free school meal – This project aimed to extract information on Welsh schools from inspection reports. This involved getting unstructured Estyn reports on all 2698 Welsh schools into ScraperWiki.

The team consisted of 1 hack (Izzy Kaminski) and 2 astronomer hackers (Edward Gomez and Stuart Lowe from LCOGT).

This small team managed to scrape Welsh schools data (which the next team stole!) and had time to make a heat map of schools in Wales. This was done using some sort of astronomical tool. Their longer term aim is to overlay the map with information on child poverty and school meals. A worthy venture and we wish them well.

Ysgoloscope – This project aimed to be a Welsh version of Schooloscope. Its aim was to make information about schools accessible and interactive for parents to explore. It used Edward’s scraper of horrible PDF Estyn inspection reports. These use a different rating methodology to Ofsted’s (devolution is not good for data journalism!).

The team consisted of 6 hacks (Joni Ayn Alexander, Chris Bolton, Bethan James from the Stroke Association, Paul Byers, Geraldine Nichols and Rachel Howells), 1 hacker (Ben Campbell from Media Standards Trust) and 1 troublemaker (Esko Reinikainen).

Maybe it was a case of too many hacks, or just trying to narrow down what area of local government to tackle, but the result was a plan. Here is their presentation, and I’m sure parents all over Wales are hoping to see Ysgoloscope up and running.

Blasus – This project aimed to map food hygiene ratings across Wales. They wanted to correlate this information with deprivation indices. They noticed that the Food Standards Agency site does not work – at least not for this purpose, for which it would be most useful.

The team consisted of 4 hacks (Joe Goodden from the BBC, Alyson Fielding, Charlie Duff from HRZone and Sophie Paterson from the ATRiuM) and 1 hacker (Dafydd Vaughan from CF Labs).

As you can see below they created something which they presented on the day. They used this scraper and made an interactive map with food hygiene ratings, symbols and local information. Amazing for just a day’s work!

And the winners are… (drum roll please)

  • 1st Prize: Blasus
  • 2nd Prize: Open Senedd
  • 3rd Prize: Co-Ordnance
  • Best Scoop: Blasus for finding a catering college in Merthyr with a Food Hygiene Standard rating of just 2
  • Best Scraper: Co-Ordnance

A big shout out

To our judges Glyn Mottershead from Cardiff School of Journalism, Media and Cultural Studies, Gwawr Hughes from Skillset and Sean Clarke from The Guardian.

And our sponsors Skillset, Guardian Open Platform, Guardian Local and Cardiff School of Journalism, Media and Cultural Studies.

Schools, businesses and eating places of Wales – you’ve been ScraperWikied!

February 22 2011


New event! Hacks & Hackers Glasgow (#hhhglas)

Calling journalists, bloggers, programmers and designers in Scotland!

ScraperWiki is pleased to announce another hacks & hackers hack day: in Glasgow. BBC Scotland is hosting and sponsoring the one-day event, with support from the BBC College of Journalism. As with our other UK hack days, Guardian Open Platform is providing the prizes.

Web developers and designers will pair up with journalists and bloggers to produce a number of projects and stories based on public data. It’s completely free (food provided) and open to both BBC and non-BBC staff. It will take place at the Viewing Theatre, Pacific Quay, Glasgow on Friday 25 March 2011.

Any questions? Please email judith@scraperwiki.com.

February 09 2011


New event! Hacks and Hackers Hack Day Cardiff (#hhhCar)

The UK Hacks & Hackers tour carries on – into 2011. Our first stop: Wales.

ScraperWiki, which provides award-winning tools for screen scraping, data mining and visualisation, will hold a one-day practical hack day at the Atrium in Cardiff on Friday 11 March 2011.

Web developers and designers will pair up with journalists and bloggers to produce a number of projects and stories based on public data.

We would like to thank our main sponsor Skillset Cymru, our hosts the Atrium and our prize sponsors Guardian Local, Guardian Open Platform and Cardiff School of Journalism, Media and Cultural Studies for making the event possible.

“Skillset Cymru is very pleased to be supporting the Cardiff Scraperwiki Hacks and Hackers Hack Day this March,” says Gwawr Hughes, director, Skillset Cymru.

“This exciting event will bring journalists and computer programmers and designers together to explore the scraping, storage, aggregation, and distribution of public data in more useful, structured formats.

“It is at the forefront of data journalism and should be of great interest to the media industry across the board here in Wales.”

More details

Who’s it for? We hope to attract ‘hacks’ and ‘hackers’ from all different types of backgrounds: people from big media organisations, as well as individual online publishers and freelancers.

What will I get out of it?
The aim is to show journalists how to use programming and design techniques to create online news stories and features; and vice versa, to show programmers how to find, develop, and polish stories and features. To see what happened at our past events in Liverpool and Birmingham visit the ScraperWiki blog. Here’s a video showing what happened in Belfast.

How much? NOTHING! It’s absolutely free, thanks to our sponsors. Food and refreshments will be provided throughout the day. If you have special dietary requirements please email judith [at] scraperwiki.com.

What should I bring? We would encourage people to come along with ideas for local ‘datasets’ that are of interest. In addition, we will present a list of suggested datasets during the introduction on the morning of the event, but flexibility is key. If you have a laptop, please bring it too.

So what exactly will happen on the day? Armed with their laptops and Wi-Fi, journalists and developers will be put into teams of around four to develop their ideas, with the aim of finishing projects that can be published and shared publicly. Each team will then present its project to the whole group. Winners will receive prizes at the end of the day.

*Not sure what a hack day is? Let’s go with the Wikipedia definition: it is “an event where developers, designers and people with ideas gather to build ‘cool stuff’”…

With thanks to our sponsors:

Keep an eye on the ScraperWiki blog for details about Scraperwiki events. Hacks & Hackers Hack Day Glasgow is scheduled for March 25 2011. For additional information please contact judith [at] scraperwiki.com.

December 23 2010


Student scraping in Liverpool: football figures and flying police

A final Hacks & Hackers report to end 2010! Happy Christmas from everyone at ScraperWiki!

Last month ScraperWiki put on its first ever student event, at Liverpool John Moores University in partnership with Open Labs, for students from both LJMU’s School of Journalism and the School of Computing & Mathematical Sciences, as well as external participants. This fabulous video comes courtesy of the Hatch. Alison Gow, digital executive editor at the Liverpool Daily Post and the Liverpool Echo, has kindly supplied us with the words (below the video).


Report: Hacks and Hackers Hack Day – student edition

By Alison Gow

At the annual conference of the Society of Editors, held in Glasgow in November, there was some debate about journalist training and whether journalism students currently learning their craft on college courses were a) of sufficient quality and b) likely to find work.

Plenty of opinions were presented as facts and there seemed to be no recognition that today’s students might not actually want to work for mainstream media once they graduated – with their varied (and relevant) skill sets they may have very different (and far more entrepreneurial) career plans in mind.

Anyway, that was last month. Scroll forward to December 8 and a rather more optimistic picture of the future emerges. I got to spend the day with a group of Liverpool John Moores University student journalists, programmers and lecturers, local innovators and programming experts, and it seemed to me that the students were going to do just fine in whatever field they eventually chose.

This was Hacks Meet Hackers (Students) – the first event that ScraperWiki (Liverpool’s own scraping and data-mining phenomenon that has done so much to facilitate collaborative learning projects between journalists and coders) had held for students. I was one of four Trinity Mirror journalists lucky enough to be asked along too.

Brought into being through assistance from the excellent LJMU Open Labs team, backed by LJMU journalism lecturer Steve Harrison, #hhhlivS as it was hashtagged was a real eye-opener. It wasn’t the largest group to attend a ScraperWiki hackday I suspect, but I’m willing to bet it was one of the most productive; relevant, viable projects were crafted over the course of the day and I’d be surprised if they didn’t find their way onto the LJMU Journalism news website in the near future.

The projects brought to the presentation room at the end of the day were:

  • The Class Divide: Investigating the educational background of Britain’s MPs
  • Are Police Helicopters Effective in Merseyside?
  • Football League Attendances 1980-2010
  • Sick of School: The link between ill health and unpopular schools

The prize for Idea With The Most Potential went to the Police Helicopters project. This group used a sample page from Merseyside Police’s helicopter movements report, which showed the time, geography, outcome and duration of each flight. They also determined that of the 33% of crimes solved, just 0.03% involved the helicopter. Using the scraped flight data, and comparing it with crime and policing-cost data, the group calculated that it cost £1,675 per hour to fly the helicopter (amounting to more than £100,000 a month), and by comparison with average officer salaries projected that this could fund the recruitment of 30 extra police officers. The team also suggested potential spin-off ideas around the data.
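The £1,675-per-hour and 30-officer figures hang together with some back-of-envelope arithmetic. A sketch, assuming roughly 60 flight hours a month and an average officer salary of about £40,000 – both assumptions for illustration, not figures from the team:

```python
# Back-of-envelope check of the helicopter figures above.
# Assumptions (not from the article): ~60 flight hours per month,
# and an average officer salary of ~£40,000 a year.
COST_PER_HOUR = 1675       # £ per flight hour (the team's figure)
HOURS_PER_MONTH = 60       # assumed
OFFICER_SALARY = 40_000    # assumed, £ per year

monthly_cost = COST_PER_HOUR * HOURS_PER_MONTH   # £100,500 – "more than £100,000 a month"
annual_cost = monthly_cost * 12                  # ~£1.2m a year
officers = annual_cost / OFFICER_SALARY          # ~30 officers

print(f"monthly: £{monthly_cost:,}, officers funded: {officers:.0f}")
```

Under those assumed inputs the numbers line up with the team’s projection.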

The Best Use of Data prize went to the Football League Figures team – an all-male bunch of journos and student journos aided by hacker Paul Freeman – who scraped attendance data for every Football League club and brought it together into a database that could be used to show attendance trends. These included the dramatic drop in Liverpool FC attendances during the Thatcher years and the rises that coincided with exciting new signings, plunging attendances for Manchester City and subsequent spikes during takeovers, and the effects of promotion to and relegation from the Premier League. The team suggested such data could be used for any number of stories, and would prove compelling information for statistics-hungry fans.

The Most Topical project went to the Class Divide group – LJMU students who worked with ScraperWiki’s Julian Todd to scrape data from the Telegraph’s politics web section and investigate the educational backgrounds of MPs. The group set out to investigate whether parliament consisted mainly of privately-educated elected members. The group said the data led them to discover that most Lib Dem MPs were state educated, and that there was no marked imbalance between state- and privately-educated MPs, contrary to what might have been expected. They added that the data they had uncovered would prove particularly interesting once MPs voted on university tuition fees.

The Best Presentation and the Overall Winner of the hack day went to Sick of Schools by Scraping The Barrel – a team of TM journos and students, hacker Brett and student nurse Claire Sutton – who used Office for National Statistics data, the Census, council information, and data scraped from school prospectuses and wards to investigate illness data and low demand for school places in Sefton borough. By overlaying health data with demand for school places they were able to highlight various outcomes which they believed would be valuable to a range of readers, from parents seeking school places to potential house buyers.

Paul Freeman, described in one tweet as “the Johan Cruyff of football data scraping”, was presented with a ScraperWiki mug as the Hacker of the Day, for his sterling work on the Football League data.

Judges Andy Goodwin, of Open Labs, and Chris Frost, head of the Journalism department, praised everyone for their efforts, and Aine McGuire, of ScraperWiki, highlighted the great quality of the ideas and subsequent projects. It was a long day but it passed incredibly quickly – I was really impressed not only by the ideas that came out but by the collaborative efforts between the students on their projects.

From my experience of the first Hacks Meet Hackers Day (held, again with support from Open Labs, in Liverpool last summer) there was quite a competitive atmosphere not just between the teams but even within teams as members – usually the journalists – pitched their ideas as the ones to run with. Yesterday was markedly less so, with each group working first to determine whether the data supported their ideas, and adapting those projects depending on what the information produced, rather than having a complete end in sight before they started. Maybe that’s why the projects that emerged were so good.

The Liverpool digital community is full of extraordinary people doing important, innovative work (and who don’t always get the credit they deserve). I first bumped into Julian and Aidan as they prepared to give a talk at a Liver and Mash libraries event earlier this year – I’d never heard of ScraperWiki and I was bowled over by the possibilities they talked about (once I got my brain around how it worked). Since then the team has done so much to promote the cause of open data and data journalism, the opportunities it can create, and the worth and value it can have for audiences; ScraperWiki hack days are attended by journalists from all media across the UK, eager to learn more about data scraping and collaborative projects with hackers.

With the Hacks Meet Hackers Students day, these ideas are being brought into the classroom, and the outcome can only benefit the colleges, students and journalism in the future. It was a great day, and the prospects for the future are exciting.

Watch this space for more ScraperWiki events in 2011!

December 10 2010


Hacks & Hackers RBI: Snow mashes, truckstops and moving home

Sarah Booker (@Sarah_Booker on Twitter), digital content and social media editor for the Worthing Herald series, has kindly provided us with this guest blog from the recent ScraperWiki B2B Hacks and Hackers Hack Day at RBI. Pictures courtesy of RBI’s Adam Tinworth.

Dealing with data is not new to me. Throughout my career I have dealt with plenty of stats, tables and survey results.

I have always asked myself, what’s the real story? Is this statistically significant? What are the numbers rather than the percentages?
Paying attention in maths O-level classes paid off because I know the difference between mean and mode, but there had to be more.

My goal was greater understanding so I decided to go along to the Scraperwiki day at Reed Business Information. I wanted to find out ways to get at information, learn how to scrape and create beautiful things from the data discovered.

It didn’t take long to realise I wanted to run before I could walk. Ideas are great, but when you’re starting out it’s difficult to deal with something when it turns out the information is full of holes.

My data sets were unstructured, my comma-separated values (CSV) had gaps, and it was almost impossible to parse them within the timeframe. My projects were abandoned after a couple of hours’ work, but as well as learning new terms I was able to see how ScraperWiki worked, even though I can’t work it myself, yet.

What helped me understand the structure, if not the language, was spending time with Scraperwiki co-founder Julian Todd. Using existing scraped data, he showed me how to make minor adjustments and transform maps.

Being shown the code structure by someone who understands it helped to build up my confidence to learn more in the future.

Our group eventually came up with an interesting idea to mash up the #uksnow Twitter feed with pre-scraped restaurant data, calling it a snow hole. It has the potential to be something but didn’t end up being an award-winning product by the day’s end.

Other groups produced extremely polished work. Where the Truck Stops was particularly impressive for combining information about crimes at truckstops with locations to find the most secure.

They won best scrape for achieving things my group had dreamed of. The top project, Is It Worth It?, paired astonishingly brilliant interactive graphics with an interesting idea.

Demand for workers and the cost of living in an area were matched with job aspirations to establish if it was worth moving. There has to be a future in projects like this.

It was a great experience and I went away with a greater understanding of structuring data gathering before it can be processed into something visual and a yearning to learn more.

Read more here:

December 09 2010


Hacks and Hackers Dublin: Data and the Dáil

[Video: courtesy of Cathal Furey]

“Dublin can be heaven, at a quarter past eleven and a stroll in Stephens Green, there’s no need to worry, there’s no need to hurry, you’re a king and the lady’s a queen…”

Onwards and downwards we headed towards Dublin, as part of our UK & Ireland Hacks & Hackers tour.

We were received as guests at the Irish Dáil and given a tour of Leinster House (see left), which was useful given that our event was all about opening up government data – thank you to Dermot Keehan (Irish Embassy London) and Patrick Rochford (private secretary to Conor Lenihan TD).

We attended and spoke at a meeting of one of our sponsors, Dublin Freelance Branch of the National Union of Journalists, by kind invitation of Gerard Cunningham and enjoyed an evening and a few pints in the warm and inviting Buswell’s Hotel opposite Leinster House.

On the HHH day we journeyed through Dublin along the River Liffey to Wood Quay, a site that houses the remains of a Viking city dating back to the 12th century and which was without doubt our most prestigious venue to date. We were there courtesy of Dublin City Council and Innovation Dublin, and we received a fantastic welcome and great support from all their staff, especially Maeve White and John Downey. We were also sponsored by Guardian Open Platform, and developer Michael Brunton-Spall (@bruntonspall) joined us for the event.

We had a great crowd on the day itself and we were delighted with the variety and scope of the projects.

First prize went to MonuMental, by Martha Rotter (@martharotter), Jane Ruffino (@janeruffino), John Craddon (@johncraddon), Elaine Edwards (@elaineedwards), Paul Barker, Michael Brunton-Spall (@bruntonspall), David Garavin (@newgraphic) and Alison Whelan (@smartdesigns). The project aimed to expose information on, and the location of, archaeological monuments and combine these with planning data to show the danger that exists if there is a lack of awareness of planned public works. The idea was that the project would be sustained and would help local people actively campaign for the preservation of sites treasured by their communities.

The second prize was awarded to eTenders: Follow the Money, by Fergal Reid (@fergal_reid), Gavin Sheridan (@gavinsblog), Julian Todd (@goatchurch) and Conor Ryan (@Connie_Zevon). The project was designed to highlight the issues facing people trying to understand how government contracts are distributed, and to show patterns and relationships between contracts, organisations and government representatives.

The third prize, and the much-coveted ScraperWiki mug, went to the ‘EPA Pollution Licenses and Enforcement’ project by Richard Cyganiak (@cygri). Since 1994 the EPA has been licensing large-scale industrial and agricultural activities. The project looked at the history of applications for these IPPC licenses and aligned them with enforcement activities, highlighting which sectors needed most attention for enforcement orders. The data was collated by scraping the EPA’s web-based IPPC database and a PDF listing enforcement activities.

Road Safety included team members Gerard Cunningham (@faduda), Phil Mac Giolla Bhain, Cian Ginty (@cianginty), Mary O’Carroll, Alison Spillane (@Alison_Spillane), Trish Morgan and Victor Akujobi (@akujobi). The objective of this project was to show the number of road deaths per county and, in parallel, the number of speed cameras and penalty points issued.

Twitter Mood Index, by Antonella Sassu (@misentoscossa), Marco Crosa, Victor Akujobi (@akujobi) and John Muldoon (@John__Muldoon), was a project designed to gauge the mood of Dublin people by sampling and analysing Twitter feeds.

‘Fingal County Council is first to market’: We were also delighted to have Dominic Byrne from Fingal County Council who explained how he had set our HHH date as the target date for his team to launch their Open Data initiative. It was a coup for the council and a very promising first. It was great to hear him talking about the value in and the process for the publishing of government data.

A special thanks to our judges Michael Fisher (@fishbelfast), Dominic Byrne (Fingal County Council) and Michael Stubbs (Dublin City Council).

Thank you to bloggers and journalists for the additional coverage. Read more here:

Finally a huge ‘thank you’ to the Woods, the Wheatleys and the McGuires for their generous hospitality during our visit.

Oo, and must not forget the obligatory pizza pic!…

We set sail (in a gale but only force 9 this time!) after a few more pints of Guinness in O’Shea’s on the Quays where Francis and Julian declared that they were up for a sprint in the Aran Islands off the coast of Galway in 2011: I guess that must mean that we are going back next year and that we should really do a #hhhgal!

Roll on 2011!

December 08 2010


Belfast Hacks & Hackers: a roundup

November was a hectic month so the blog posts have been a little tardy – apologies! After a full hacks and hackers day with the wonderful crowd in Lichfield #hhhlich we set off and sailed overnight to Belfast with a full force gale – force 11 to be precise!

The Titanic memorabilia available on board was a nostalgic touch, but we questioned the appropriateness of the shop’s choice, as the 20k ton ferry bobbed up and down like a cork and we were advised to stay in bed to avoid the worst effects of a rough crossing!

We received a rapturous welcome at the University of Ulster (Belfast Campus) from Milne Rowntree, our main host for the event. It was our first ever Saturday ‘Hacks and Hackers’ hack day. The university was bright and modern with great facilities, and it was wonderful to see how the city had been transformed by a big investment in infrastructure – three cheers and long may it continue in NI! Mr Cameron, please keep your mitts off their money!

Francis Irving gave an introduction to ScraperWiki and the teams soon split off into their chosen projects. The choice of subjects varied and included:

Mr ‘No Vote’
This was the winning entry of the day, all about politics in Northern Ireland (surprise, surprise!) and representation. Ivor Whitten (@iwhitten), Alan Meban (@alanbelfast), Matt Johnson (@cimota) and Rob Moore (@robsogc) set about gathering data and graphing the impact of people choosing not to vote, and what this meant for representative democracy across Northern Ireland.

Money for Mention
Jo Briggs, Dee Harvey (@deeharvey), Julian Todd (@goatchurch) and Ian Walsh (@ianwalshireland) worked on a project that examined patterns within the NI court system. It looked at how court case data appeared on a website for a single week, and at the implications and difficulty of measuring the costs of cases, as this information could not be captured or aggregated. The data was captured and will be maintained, so it will be interesting to look at the findings over time. This project scooped second prize.

A Bit of Red Sky Thinking
Tony Rice (@ricetony), Philip Bradfield and Francis Irving (@frabcus) set about looking into one of Northern Ireland’s property companies – Red Sky – and its relationship with the Northern Ireland Housing Executive.

Money (That’s What I Want)
Brian Pelan (@ckarkkent), Veronica Kelly (@veedles) and Declan McGrath (@theirishpenquin) decided to look at public and private sector pay in Northern Ireland.

The coveted ScraperWiki mug went to Declan McGrath, a hacker who had come from Dublin for the event!

Thank you to our judges Colm Murphy & Milne Rowntree, both from the University of Ulster and Damien Whinnery from Flagship Media. Also, a big thanks to our sponsors University of Ulster, Digital Circle and Guardian Platform.

Thanks to everyone who created additional posts which can be found here:

The Scraperwiki team would like to thank the McKeown family for all their hospitality in Belfast! Julian, Aidan and Francis at Belfast Lough and the Giants Causeway in Antrim:

December 07 2010


Hacks & Hackers Belfast: ‘You don’t realize how similar coding and reporting are until you watch a hack and a technologist work together to create something’

In November, ScraperWiki went to Belfast, and participant Lyra McKee, CEO of NewsRupt (creators of the news app Qluso), has kindly supplied us with this account!

The concept behind Hacks and Hackers, a global phenomenon, is simple: bring a bunch of hacks (journalists) and hackers (coders) together to build something really cool that other journalists and industry people can use. We were in nerd heaven.

The day kicked off with a talk from the lovely Francis Irving (@frabcus), ScraperWiki’s CEO. Francis talked about ScraperWiki’s main use – scraping data, stats and facts from large datasets – and the company’s background, from being built by coder Julian Todd to getting funded by 4IP.

After that, the gathered geeks split off into groups, all with the same goal: scrape data and find an explosive, exclusive story. First, second and third prizes would be awarded at the end of the day.

You don’t realize how similar coding and reporting are until you watch a hack and a technologist work together to create something. Both vocations have the same core purpose: creating something useful that others can use (or in the hack’s case, unearthing information that is useful to the public).

The headlines that emerged from the day were amazing. ‘Mr No Vote’ won first prize. When citizen hacks Ivor Whitten and Matt Johnston and coder Robert Moore of e-learning company Learning Pool used ScraperWiki to scrape electoral data from local government websites, they found that over 60% of the electorate in every constituency in Northern Ireland (save one) abstained from voting in the last election, raising questions about just how democratically MPs and MLAs have been elected.

What was really significant about the story was that the guys were able to uncover it within a matter of hours. One member of Team Qluso, an ex-investigative journalist, was astounded, calling ScraperWiki a “gamechanger” for the industry. It was an almost historic moment, seeing technology transform a small but significant part of the industry: the process of finding and analysing data. (A process that, according to said gobsmacked Team Qluso member, used to take days, weeks, even months.)

If you get a chance to chat with the Scraperwiki team, take it with both hands: these guys are building some cracking tools for hacks’n’hackers alike.

November 24 2010


Hacks/Hackers London

First of all, the Iraq War Logs:

Round One – The Cleaning

The documents, records and words are hugely intimidating in their vastness. Tools that help include MySQL, UltraEdit and Google Refine, but this stage is incredibly frustrating.

Round Two – The Problem

How do you tackle the different types of documents? There were even small PDF files. The team had to build a basic web interface for everyday queries. It needed multiple fields, and this part is extremely difficult – especially when you need to explain it to an editor. You have to have a healthy mistrust of the data. Asking the right questions is crucial; asking something the data is not structured to answer is the real problem.

Round Three – What We Did

They looked at key incidents and names of interest which the media had previously reported. The trick was to try to find what we didn’t know. First, start by looking at categories of deaths by time. They found that it was murders, rather than weapons fire, that killed the most people – civilian in-fighting. They used Tableau, on up to 100,000 records. They also had to get researchers to sift through reports and manually verify what the data meant. If you do that, make sure you organise a system that everyone uses to categorise, calculate and tabulate. You can then use Excel and filter – it’s quicker with Access.
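The counting step – tallying categories of deaths by time – can be sketched in a few lines of Python. This is a minimal sketch with hypothetical field names, not the logs’ actual schema:

```python
from collections import Counter

# Hypothetical incident records; the real logs ran to ~100,000 rows.
reports = [
    {"date": "2006-07-03", "category": "murder"},
    {"date": "2006-07-19", "category": "direct fire"},
    {"date": "2006-08-02", "category": "murder"},
]

# Tally incidents per (month, category) pair – "categories of deaths by time".
by_month = Counter((r["date"][:7], r["category"]) for r in reports)

print(by_month[("2006-07", "murder")])  # → 1
```

The same tallies could then be fed into a charting tool such as Tableau, as the team described.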

The data was used as part of research, not just to make loads of charts. Visual maps tell a story and are quite powerful for an audience. Maps can also be used for newsgathering: they asked journalists which areas they were interested in and sent them the reports, geocoded, so they could read all the reports for the area they were heading to on the plane. You can also link a big story to its log – the log can validate a report, proving it to be true.

What Did it Take?

10 weeks. 25 people. 30,000 reports. 5,000 reports manually recounted. More than one 18-hour day.


A lot of really useful information is not easily available on the web. Writing a web scraper not only makes searching and viewing information easier, but can bring to light stories that were hidden in the mass of digital structures.
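For illustration, here is the general shape of such a scraper using only Python’s standard library – not how any particular team built theirs, and the markup it parses is a made-up fragment rather than a real site:

```python
from html.parser import HTMLParser

# A tiny scraper sketch: collect the text of every table cell on a page.
# Real scrapers fetch a live URL and target a specific site's markup;
# here we parse a hard-coded fragment to keep the example self-contained.
class CellCollector(HTMLParser):
    def __init__(self):
        super().__init__()
        self.in_cell = False
        self.cells = []

    def handle_starttag(self, tag, attrs):
        if tag == "td":
            self.in_cell = True

    def handle_endtag(self, tag):
        if tag == "td":
            self.in_cell = False

    def handle_data(self, data):
        if self.in_cell and data.strip():
            self.cells.append(data.strip())

def scrape(html):
    parser = CellCollector()
    parser.feed(html)
    return parser.cells

print(scrape("<table><tr><td>Leeds</td><td>42</td></tr></table>"))
# → ['Leeds', '42']
```

Once the cells are in a plain list they can be written to CSV or a database and queried like any other structured data.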

November 15 2010


Lichfield Hacks and Hackers: PFIs, plotting future care needs, what’s on in Lichfield and mapping flood warnings

The winners with judges Lizzie and Rita. Pic: Nick Brickett

By Philip John, Journal Local. This has been cross-posted on the Journal Local blog.

It may be a tiny city but Lichfield has shown that it has some great talent at the Hacks and Hackers Hack Day.

Sponsored by Lichfield District Council and Lichfield-based Journal Local, the day was held at the George Hotel and attended by a good selection of local developers and journalists – some coming from much further afield.

Once the introductions were done and we’d all contributed a few ideas the work got started and five teams quickly formed around those initial thoughts.

The first two teams decided to look into Private Finance Initiatives (PFIs) and Information Asset Registers (IARs). The first scraped information from 470 councils to show which of them published information about PFIs. The results showed that only 10% of councils actually put out any details of their PFIs, highlighting a lack of openness in that area.

Also focused on PFIs was the ‘PFI wiki’ project which scraped the Partnerships UK database of PFIs and re-purposed it to allow deeper interrogation, such as by region and companies. It clearly paves the way for an OpenCharities style site for PFIs.

Future care needs was the focus of the third team who mapped care homes along with information on ownership, public vs private status and location. The next step, they said, is to add the number of beds and match that to the needs of the population based on demographic data, giving a clearer view of whether the facilities exist to cater for the future care needs in the area.

A Lichfield-related project was the focus of the fourth group who aimed to create a comprehensive guide to events going on in Lichfield District. Using about four or five scrapers, they produced a site that collated all the events listing sites serving Lichfield into one central site with a search facility. The group also spawned a new Hacks/Hackers group to continue their work.

Last but not least, the fifth group worked on flood warning information. By scraping the Environment Agency website they were able to display on a map the river level gauges and the flood warning levels, so that at a glance it’s possible to see the water level in relation to the flood warning limit.

So after a long day, Lizzie Thatcher and Rita Wilson from Lichfield District Council joined us to judge the projects. They came up with a clever matrix of key points by which to rate the projects, and chose the ‘what’s on’ and ‘flood warning’ projects as joint winners, each taking a prize of £75 in Amazon vouchers.

The coveted ScraperWiki mug also went to the ‘what’s on’ project for their proper use of ScraperWiki to create good quality scrapers.

Pictures from the event by Nick Brickett:



November 10 2010


Announcing The Big Clean, Spring 2011

We’re very excited to announce that we’re helping to organise an international series of events to convert not-very-useful, unstructured, non-machine-readable sources of public information into nice clean structured data.

This will make it much easier for people to reuse the data, whether this is mixing it with other data sources (e.g. different sources of information about the area you live in) or creating new useful services based on the data (like TheyWorkForYou or Where Does My Money Go?). The series of events will be called The Big Clean, and will take place next spring, probably in March.

The idea was originally floated by Antti Poikola on the OKF’s international open-government list back in September, and since then we’ve been working closely with Antti and Jonathan Gray at OKFN to start planning the events.

Antti and Francis Irving (mySociety) will be running a session on this at the Open Government Data Camp on the 18-19th November in London. If you’d like to attend this session, please add your name to the following list:

If you can’t attend but you’re interested in helping to organise an event near you, please add your name/location to the following wiki page:

All planning discussions will take place on the open-government list!

November 01 2010


Scraperwiki launches first student event in Liverpool

We’re happy to announce our first Hacks Meet Hackers event for students, to take place in Liverpool on Wednesday December 8, 2010 from 9.30am to 5pm at Liverpool John Moores University’s Art and Design Academy.

In partnership with Open Labs, we’re putting on this event for student developers and journalists from LJMU’s School of Journalism and other departments including the School of Computing & Mathematical Sciences.

So what’s this hack day all about?
It’s a practical event at which web developers and designers will pair up with journalists and bloggers to produce a number of projects and stories based on public data.

Who’s it for?
We hope to attract ‘hacks’ and ‘hackers’ from all different types of backgrounds – students with skills in journalism, data visualisation, design, programming, statistics, games development and so on.

What will you get out of it?
The aim is to show journalists how to use programming and design techniques to create online news stories and features; and vice versa, to show programmers how to find, develop, and polish stories and features.

What should participants bring?
We would encourage people to come along with ideas for local ‘datasets’ that are of interest. In addition, we will present a list of suggested datasets during the introduction on the morning of the event, but flexibility is key. If you have a laptop, please bring it too.

But what exactly will happen on the day itself? Armed with their laptops and Wi-Fi, journalists and developers will be put into teams of around four to develop ideas, with the aim of finishing projects that can be published and shared publicly. Each team will then present its project to the whole group.

Overall winners will receive a prize at the end of the day. Food and drink will be provided during the day!
Any more questions? Please get in touch via aine[at]scraperwiki.com.

October 29 2010


Video: Leeds Hacks & Hackers Hack Day

Here are a few video interviews from yesterday’s Hacks and Hackers Hack Day Leeds, with various participants and Sarah Hartley from Guardian Local. Follow this link for a write-up of all the projects.


Leeds Hacks and Hackers Hack Day: Planning maps; cutting up Leeds; researching brownfield; and finding the city’s blogging pulse

The fifth stop on Scraperwiki’s UK & Ireland Hacks & Hackers tour took us to West Yorkshire, at Old Broadcasting House in the excellent city of Leeds.

A varied crowd turned out for yesterday’s hack day, hosted by nti Leeds and also sponsored by Guardian Open Platform, Guardian Local and Leeds Trinity Centre for Journalism. It included participants from the city council and regional newspapers, independent bloggers, designers and computer programmers – with a wide range of experience among them.

With the introduction over, the competition began, fuelled by the usual Scraperwiki promise of pizza and beer; and Amazon vouchers for the winners – who would be decided by our three judges, Sarah Hartley, editor of Guardian Local, Linda Broughton, head of nti Leeds, and Richard Horsman, associate principal lecturer at Leeds Trinity Centre for Journalism.

Five groups formed around different areas of interest, all with a Leeds focus. Brownfield Research, by Greg Brant, Rebecca Whittington, Jon Eland and Tom Mortimer-Jones, set out to uncover the past, present and planned future of brownfield sites using scrapes of planning applications and change-of-use applications, combined with web chat and related documents. It also aimed to include the history of industrial disease, accidents and contamination on each site.

Leeds Planning Map, by Catherine O’Connor (@journochat), Elizabeth Sanderson (@Lizziesanderson), James Rothschild (@jrpmedia), John Baron (@GdnLeeds), Karl Schneider (@karlschneider) and Matt Jones (@matt_jones86), allowed users to view all planning decisions in Leeds, colour-coded by whether the application was accepted or refused.

Find Me, by software developer Marcus Houlden (@mhoulden), is a geolocation web application that displays the user’s current location, address and postcode, with links to the nearest bus stops. He also started adding Yorkshire Water roadworks data.
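The write-up doesn’t show how Find Me picks the nearest stops, but the usual server-side approach is a great-circle (haversine) distance check against a list of stop coordinates. A minimal sketch (stop names and coordinates below are made up for illustration):

```python
from math import radians, sin, cos, asin, sqrt

def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance between two points, in kilometres."""
    dlat = radians(lat2 - lat1)
    dlon = radians(lon2 - lon1)
    a = sin(dlat / 2) ** 2 + cos(radians(lat1)) * cos(radians(lat2)) * sin(dlon / 2) ** 2
    return 2 * 6371 * asin(sqrt(a))  # 6371 km = mean Earth radius

def nearest_stop(lat, lon, stops):
    """stops is a list of (name, lat, lon) tuples; returns the closest name."""
    return min(stops, key=lambda s: haversine_km(lat, lon, s[1], s[2]))[0]
```

For a handful of stops a linear scan like this is plenty; a real app with thousands of stops would index them spatially instead.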

The Leeds Pulse team scraped LiveJournal data to produce a web application, built on Django, mapping negative and positive blogging sentiment across Leeds – drawing on 8,500 blog posts. It categorised “love, like or good” as positive, and “hate, bad or meh” as negative. The judges certainly weren’t ‘meh’ about it, and chose it as the runner-up.
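The post doesn’t include the team’s code, but a keyword tagger along these lines captures the idea (the word lists come from the description above; the function name is mine):

```python
import re

POSITIVE = {"love", "like", "good"}
NEGATIVE = {"hate", "bad", "meh"}

def classify(post):
    """Tag a blog post as positive, negative or neutral by keyword counts."""
    words = set(re.findall(r"[a-z']+", post.lower()))
    pos = len(words & POSITIVE)
    neg = len(words & NEGATIVE)
    if pos > neg:
        return "positive"
    if neg > pos:
        return "negative"
    return "neutral"
```

Crude, but over 8,500 posts even a simple lexicon like this gives a usable signal for a one-day hack.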

Leeds Uncut, however, scooped the overall prize. Suzanne McTaggart, Amna Kaleem (@amnakaleem), Nick Crossland (@ncrossland) and Michael Brunton-Spall (@bruntonspall), with some help from developer Martin Dunschen, created a map of the eight constituencies in Leeds to highlight how they are being affected by spending cuts and redundancies.

They also looked at job vacancies in each of the constituencies, to identify whether the creation of new jobs is offsetting the doom and gloom caused by spending cuts and job losses. Different shades of colour in the form of an “economic health thermometer” gave a visually effective overview of which constituencies are suffering the most and least.

The data for the project was gathered from job websites, news websites, the Guardian’s Cutswatch page and the Office for National Statistics, which publishes monthly figures on how many people are claiming unemployment benefit/jobseeker’s allowance, giving an indication of the number of new redundancies.

“The three judges … were unanimous in deciding that the worthy winners had successfully collated trusted data and compiled an easy-to-use map visualisation,” commented judge Sarah Hartley, who has written this account of the beginning, middle and end of the day.

£250 worth of Amazon.co.uk vouchers will be split among the winners and runners-up. An extra prize for the best scraper work, chosen by Scraperwiki’s Julian Todd, went to Matt Jones, who will continue to maintain the planning data scraper.

With thanks to all our sponsors and helpers mentioned above, and additionally Leeds Trinity’s Catherine O’Connor and developer Imran Ali.

Twitter conversation was via the #hhhleeds tag, and see below for a visualisation of some of the geotagged tweets (courtesy of remote onlooker Tony Hirst, @psychemedia):

You can find a Twitter list of delegates at this link:

More links to be added as we spot them and photographs are coming… Please email judith at scraperwiki.com with more material, or leave links in the comment section below. I’d especially like to add in links to scrapers and data sets, so people can see how the projects were built.

Want to get involved? We’re still on tour! If you’d like to sponsor an event please get in touch with aine@scraperwiki.com.
