
December 23 2010


Student scraping in Liverpool: football figures and flying police

A final Hacks & Hackers report to end 2010! Happy Christmas from everyone at ScraperWiki!

Last month ScraperWiki put on its first ever student event, at Liverpool John Moores University in partnership with Open Labs, for students from both LJMU’s School of Journalism and the School of Computing & Mathematical Sciences, as well as external participants. This fabulous video comes courtesy of the Hatch. Alison Gow, digital executive editor at the Liverpool Daily Post and the Liverpool Echo, has kindly supplied us with the words (below the video).


Report: Hacks and Hackers Hack Day – student edition

By Alison Gow

At the annual conference of the Society of Editors, held in Glasgow in November, there was some debate about journalist training and whether journalism students currently learning their craft on college courses were a) of sufficient quality and b) likely to find work.

Plenty of opinions were presented as facts and there seemed to be no recognition that today’s students might not actually want to work for mainstream media once they graduated – with their varied (and relevant) skill sets they may have very different (and far more entrepreneurial) career plans in mind.

Anyway, that was last month. Scroll forward to December 8 and a rather more optimistic picture of the future emerges. I got to spend the day with a group of Liverpool John Moores University student journalists, programmers and lecturers, local innovators and programming experts, and it seemed to me that the students were going to do just fine in whatever field they eventually chose.

This was Hacks Meet Hackers (Students) – the first event that ScraperWiki (Liverpool’s own scraping and data-mining phenomenon that has done so much to facilitate collaborative learning projects between journalists and coders) had held for students. I was one of four Trinity Mirror journalists lucky enough to be asked along too.

Brought into being through assistance from the excellent LJMU Open Labs team, backed by LJMU journalism lecturer Steve Harrison, #hhhlivS as it was hashtagged was a real eye-opener. It wasn’t the largest group to attend a ScraperWiki hackday I suspect, but I’m willing to bet it was one of the most productive; relevant, viable projects were crafted over the course of the day and I’d be surprised if they didn’t find their way onto the LJMU Journalism news website in the near future.

The projects brought to the presentation room at the end of the day were:

  • The Class Divide: Investigating the educational background of Britain’s MPs
  • Are Police Helicopters Effective in Merseyside?
  • Football League Attendances 1980-2010
  • Sick of School: The link between ill health and unpopular schools

The prize for Idea With The Most Potential went to the Police Helicopters project. This group had used a sample page from the Merseyside Police helicopter movements report, which showed time of flight, geography, outcome and duration. They also determined that of the 33% of crimes that were solved, only 0.03% involved the helicopter. Using the data scraped for helicopter flights, and comparing it with crime and policing cost data, the group estimated it cost £1,675 per hour to fly the helicopter (amounting to more than £100,000 a month), and by comparing that with average officer salaries projected this could fund the recruitment of 30 extra police officers. The team also suggested potential spin-off ideas around the data.
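As a rough illustration of the arithmetic involved (the £1,675 hourly cost is the team’s figure; the flying hours and officer salary below are invented assumptions), the back-of-envelope sum looks something like this in Python:

```python
# Back-of-envelope check of the helicopter-cost projection.
# HOURLY_COST comes from the team's figures; the other two numbers
# are assumptions chosen purely for illustration.
HOURLY_COST = 1675        # estimated cost per flying hour, GBP
HOURS_PER_MONTH = 62      # assumed flying hours per month
OFFICER_SALARY = 40000    # assumed annual cost of one officer, GBP

monthly_cost = HOURLY_COST * HOURS_PER_MONTH
annual_cost = monthly_cost * 12
extra_officers = annual_cost // OFFICER_SALARY

print(f"Monthly cost: £{monthly_cost:,}")        # just over £100,000 a month
print(f"Roughly {extra_officers} officers' worth of spending per year")
```

Swap in the real scraped flight durations and cost data and the projection updates itself.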

The Best Use of Data went to the Football League Figures team – an all-male bunch of journos and student journos, aided by hacker Paul Freeman – who scraped attendance data for every Football League club and brought it together into a database that could be used to show attendance trends. These included the dramatic drop in Liverpool FC attendances during the Thatcher years and the rises that coincided with exciting new signings, plunging attendances for Manchester City and subsequent spikes during takeovers, and the effects of promotion and relegation on Premier League teams. The team suggested such data could be used for any number of stories, and would prove compelling information for statistics-hungry fans.

The Most Topical project went to the Class Divide group – LJMU students who worked with ScraperWiki’s Julian Todd to scrape data from the Telegraph’s politics web section and investigate the educational backgrounds of MPs. The group set out to investigate whether parliament consisted mainly of privately educated members. The group said the data led them to discover that most Lib Dem MPs were state educated, and that there was no marked skew between state and privately educated MPs overall, contrary to what might have been expected. They added that the data they had uncovered would prove particularly interesting once MPs voted on university tuition fees.

The Best Presentation and the Overall Winner of the hackday went to Sick of Schools by Scraping The Barrel – a team of TM journos and students, hacker Brett and student nurse Claire Sutton – who used Office for National Statistics, Census and council data, plus information scraped from school prospectuses and wards, to investigate illness data and low demand for school places in Sefton borough. By overlaying health data with demand for school places they were able to highlight various outcomes which they believed would be valuable for a range of readers, from parents seeking school places to potential house buyers.

Paul Freeman, described in one tweet as “the Johan Cruyff of football data scraping”, was presented with a ScraperWiki mug as the Hacker of the Day, for his sterling work on the Football League data.

Judges Andy Goodwin, of Open Labs, and Chris Frost, head of the Journalism department, praised everyone for their efforts, and Aine McGuire, of ScraperWiki, highlighted the great quality of the ideas and subsequent projects. It was a long day but it passed incredibly quickly – I was really impressed not only by the ideas that came out but by the collaborative efforts between the students on their projects.

From my experience of the first Hacks Meet Hackers Day (held, again with support from Open Labs, in Liverpool last summer) there was quite a competitive atmosphere not just between the teams but even within teams as members – usually the journalists – pitched their ideas as the ones to run with. Yesterday was markedly less so, with each group working first to determine whether the data supported their ideas, and adapting those projects depending on what the information produced, rather than having a complete end in sight before they started. Maybe that’s why the projects that emerged were so good.

The Liverpool digital community is full of extraordinary people doing important, innovative work (and who don’t always get the credit they deserve). I first bumped into Julian and Aidan as they prepared to give a talk at a Liver and Mash libraries event earlier this year – I’d never heard of ScraperWiki and I was bowled over by the possibilities they talked about (once I got my brain around how it worked). Since then the team has done so much to promote the cause of open data and data journalism, the opportunities it can create, and the worth and value it can have for audiences; ScraperWiki hackdays are attended by journalists from all media across the UK, eager to learn more about data-scraping and collaborative projects with hackers.

With the Hacks Meet Hackers Students day, these ideas are being brought into the classroom, and the outcome can only benefit the colleges, students and journalism in the future. It was a great day, and the prospects for the future are exciting.

Watch this space for more ScraperWiki events in 2011!

December 10 2010


Hacks & Hackers RBI: Snow mashes, truckstops and moving home

Sarah Booker (@Sarah_Booker on Twitter), digital content and social media editor for the Worthing Herald series, has kindly provided us with this guest blog from the recent ScraperWiki B2B Hacks and Hackers Hack Day at RBI. Pictures courtesy of RBI’s Adam Tinworth.

Dealing with data is not new to me. Throughout my career I have dealt with plenty of stats, tables and survey results.

I have always asked myself, what’s the real story? Is this statistically significant? What are the numbers rather than the percentages?

Paying attention in maths O level classes paid off because I know the difference between mean and mode, but there had to be more.

My goal was greater understanding so I decided to go along to the Scraperwiki day at Reed Business Information. I wanted to find out ways to get at information, learn how to scrape and create beautiful things from the data discovered.

It didn’t take long to realise I wanted to run before I could walk. Ideas are great, but when you’re starting out it’s difficult to deal with something when it turns out the information is full of holes.

My data sets were unstructured, my comma separated values (CSV) files had gaps, and it was almost impossible to parse them within the timeframe. My projects were abandoned after a couple of hours’ work, but as well as learning new terms I was able to see how ScraperWiki worked, even though I can’t work it myself, yet.
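For anyone wondering what “gaps in the CSV” means in practice, here is a minimal Python sketch (the data is made up) of the kind of defensive parsing that patchy files force on you:

```python
import csv
import io

# A sketch of the kind of patchy CSV the post describes: rows with
# missing values that break a naive int() conversion.
raw = """name,visits,spend
Acme,120,450
Widgets,,300
Gadgets,85,
"""

rows = []
for row in csv.DictReader(io.StringIO(raw)):
    # Treat empty strings as missing values rather than crashing.
    rows.append({
        "name": row["name"],
        "visits": int(row["visits"]) if row["visits"] else None,
        "spend": int(row["spend"]) if row["spend"] else None,
    })

complete = [r for r in rows if None not in r.values()]
print(len(rows), "rows parsed,", len(complete), "fully complete")
```

Deciding what to do with the incomplete rows – drop them, flag them, or go back to the source – is where the time goes.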

What helped me understand the structure, if not the language, was spending time with Scraperwiki co-founder Julian Todd. Using existing scraped data, he showed me how to make minor adjustments and transform maps.

Being shown the code structure by someone who understands it helped to build up my confidence to learn more in the future.

Our group eventually came up with an interesting idea to mash up the #uksnow Twitter feed with pre-scraped restaurant data, calling it a snow hole.  It has the potential to be something but didn’t end up being an award-winning product by the day’s end.

Other groups produced extremely polished work. Where the Truck Stops was particularly impressive for combining information about crimes at truckstops with locations to find the most secure.

They won best scrape for achieving things my group had dreamed of. The top project, Is It Worth It?, had astonishingly brilliant interactive graphics that polished an interesting idea.

Demand for workers and the cost of living in an area were matched with job aspirations to establish if it was worth moving. There has to be a future in projects like this.

It was a great experience and I went away with a greater understanding of structuring data gathering before it can be processed into something visual and a yearning to learn more.


December 09 2010


Hacks and Hackers Dublin: Data and the Dail

[Video: courtesy of Cathal Furey]

“Dublin can be heaven, at a quarter past eleven and a stroll in Stephens Green, there’s no need to worry, there’s no need to hurry, you’re a king and the lady’s a queen…”

Onwards and downwards we headed towards Dublin, as part of our UK & Ireland Hacks & Hackers tour.

We were received as guests at the Irish Dail and given a tour of Leinster House (see left) which was useful given that our event was all about opening up Government data – thank you to Dermot Keehan (Irish Embassy London) and Patrick Rochford (Private secretary to Conor Lenihan TD).

We attended and spoke at a meeting of one of our sponsors, Dublin Freelance Branch of the National Union of Journalists, by kind invitation of Gerard Cunningham and enjoyed an evening and a few pints in the warm and inviting Buswell’s Hotel opposite Leinster House.

On the HHH day we journeyed through Dublin along the River Liffey to Wood Quay, a site that houses the remains of a Viking city dating back to the 12th Century and which was without doubt our most prestigious venue to date. We were there courtesy of Dublin City Council and Innovation Dublin, and we received a fantastic welcome and great support from all their staff, especially Maeve White and John Downey. We were also sponsored by Guardian Open Platform, and developer Michael Brunton-Spall (@bruntonspall) joined us for the event.

We had a great crowd on the day itself and we were delighted with the variety and scope of the projects.

First prize was given to MonuMental, by Martha Rotter (@martharotter), Jane Ruffino (@janeruffino), John Craddon (@johncraddon), Elaine Edwards (@elaineedwards), Paul Barker, Michael Brunton-Spall (@bruntonspall), David Garavin (@newgraphic) and Alison Whelan (@smartdesigns). The project aimed to expose information about, and the locations of, archaeological monuments and combine these with planning data to show the danger that exists if there is a lack of awareness of planned public works. The idea was that the project would be sustained and would help local people actively campaign for the preservation of works treasured by their communities.

The second prize was awarded to eTenders: Follow the Money, by Fergal Reid (@fergal_reid), Gavin Sheridan (@gavinsblog), Julian Todd (@goatchurch) and Conor Ryan (@Connie_Zevon). The project was designed to highlight the issues facing people trying to understand how government contracts are distributed, and to show patterns and relationships between contracts, organisations and government representatives.

The third prize and much-coveted ScraperWiki mug prize went to the ‘EPA Pollution Licenses and Enforcement’ project by Richard Cyganiak (@cygri). Since 1994 the EPA has been licensing large-scale industrial and agricultural activities. The project looked at the history of the applications for these IPPC licenses and aligned these to enforcement activities. It highlighted which sectors needed most attention for enforcement orders. The data was collated by scraping the EPA’s web based IPPC database and a PDF listing enforcement activities.
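A minimal sketch of the kind of join the project relied on – matching licences scraped from the web database against enforcement actions scraped from the PDF – might look like this in Python (all licence numbers, sectors and records here are invented):

```python
from collections import Counter

# Licence applications, as scraped from the web-based IPPC database
# (records invented for illustration).
licences = [
    {"licence": "P0001", "sector": "agriculture"},
    {"licence": "P0002", "sector": "chemicals"},
    {"licence": "P0003", "sector": "chemicals"},
]

# Enforcement actions, as scraped from the PDF listing (also invented).
enforcements = [
    {"licence": "P0002"},
    {"licence": "P0003"},
    {"licence": "P0003"},
]

# Align the two datasets on licence number, then count actions by sector
# to see which sectors attract the most enforcement attention.
sector_of = {l["licence"]: l["sector"] for l in licences}
actions_per_sector = Counter(sector_of[e["licence"]] for e in enforcements)

print(actions_per_sector.most_common())
```

The real work, of course, was in the scraping; once both sources are in structured form, the alignment is a few lines.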

Road Safety included team members Gerard Cunningham (@faduda), Phil Mac Giolla Bhain, Cian Ginty (@cianginty), Mary O’Carroll, Alison Spillane (@Alison_Spillane), Trish Morgan and Victor Akujobi (@akujobi). The objective of this project was to show the number of road deaths per county and, in parallel, the number of speed cameras and penalty points issued.

Twitter Mood Index, by Antonella Sassu (@misentoscossa), Marco Crosa, Victor Akujobi (@akujobi) and John Muldoon (@John__Muldoon) was a project designed to gauge the mood of Dublin people by sampling and analysing twitter feeds.

‘Fingal County Council is first to market’: We were also delighted to have Dominic Byrne from Fingal County Council, who explained how he had set our HHH date as the target date for his team to launch their Open Data initiative. It was a coup for the council and a very promising first. It was great to hear him talk about the value of publishing government data and the process for doing so.

A special thanks to our judges Michael Fisher (@fishbelfast), Dominic Byrne (Fingal County Council) and Michael Stubbs (Dublin City Council).

Thank you to bloggers and journalists for the additional coverage. Read more here:

Finally a huge ‘thank you’ to the Woods, the Wheatleys and the McGuires for their generous hospitality during our visit.

Oh, and we must not forget the obligatory pizza pic!…

We set sail (in a gale but only force 9 this time!) after a few more pints of Guinness in O’Shea’s on the Quays where Francis and Julian declared that they were up for a sprint in the Aran Islands off the coast of Galway in 2011: I guess that must mean that we are going back next year and that we should really do a #hhhgal!

Roll on 2011!

December 08 2010


Belfast Hacks & Hackers: a roundup

November was a hectic month so the blog posts have been a little tardy – apologies! After a full hacks and hackers day with the wonderful crowd in Lichfield #hhhlich we set off and sailed overnight to Belfast with a full force gale – force 11 to be precise!

The Titanic memorabilia available on board was a nostalgic touch, but we questioned the appropriateness of the shop’s choice, as the 20k ton ferry bobbed up and down like a cork and we were advised to stay in bed to avoid the worst effects of a rough crossing!

We received a rapturous welcome at the University of Ulster (Belfast Campus) from Milne Rowntree, our main host for the event. It was our first ever Saturday ‘Hacks and Hackers’ Hack day. The University was bright and modern with great facilities and it was wonderful to see how the city had been transformed by a big investment in infrastructure – three cheers and long may it continue in NI! Mr Cameron please keep your mitts off their money!

Francis Irving gave an introduction to ScraperWiki and the teams soon split off into their chosen projects. The choice of subjects varied and included:

Mr ‘No Vote’
This was the winning entry for the day and all about politics in Northern Ireland (surprise surprise!) and representation. Ivor Whitten (@iwhitten), Alan Meban (@alanbelfast), Matt Johnson (@cimota) and Rob Moore (@robsogc) set about gathering data and graphing the impact of people choosing not to vote and what this meant for representative democracy across Northern Ireland.

Money for Mention
Jo Briggs, Dee Harvey (@deeharvey), Julian Todd (@goatchurch) and Ian Walsh (@ianwalshireland) all worked on a project that examined the patterns within the NI Court System. This was all about how court case data appeared on a web site for a single week and the implications and difficulty of measuring the costs of cases as this information could not be captured or aggregated. The data was captured and will be maintained so it will be interesting to look at the findings over time. This project scooped the 2nd prize.

A Bit of Red Sky Thinking
Tony Rice (@ricetony), Philip Bradfield and Francis Irving (@frabcus) set about looking at the depth of one of Northern Ireland’s Property companies – Red Sky and its relationship with the Northern Ireland Housing Executive.

Money (That’s What I Want)
Brian Pelan (@ckarkkent), Veronica Kelly (@veedles) and Declan McGrath (@theirishpenquin) decided to look at public and private sector pay in Northern Ireland.

The prize of the ScraperWiki mug was won by Declan McGrath, a hacker who had come from Dublin for the event!

Thank you to our judges Colm Murphy & Milne Rowntree, both from the University of Ulster, and Damien Whinnery from Flagship Media. Also, a big thanks to our sponsors University of Ulster, Digital Circle and Guardian Open Platform.

Thanks to everyone who created additional posts which can be found here:

The Scraperwiki team would like to thank the McKeown family for all their hospitality in Belfast! Julian, Aidan and Francis at Belfast Lough and the Giants Causeway in Antrim:

December 07 2010


Hacks & Hackers Belfast: ‘You don’t realize how similar coding and reporting are until you watch a hack and a technologist work together to create something’

In November, Scraperwiki went to Belfast and participant Lyra McKee, CEO, NewsRupt (creators of the news app Qluso) has kindly supplied us with this account!

The concept behind Hacks and Hackers, a global phenomenon, is simple: bring a bunch of hacks (journalists) and hackers (coders) together to build something really cool that other journalists and industry people can use. We were in nerd heaven.

The day kicked off with a talk from the lovely Francis Irving (@frabcus), ScraperWiki’s CEO. Francis talked about ScraperWiki’s main use – scraping data, stats and facts from large datasets – and the company’s background, from being built by coder Julian Todd to getting funded by 4IP.

After that, the gathered geeks split off into groups, all with the same goal: scrape data and find an explosive, exclusive story. First, second and third prizes would be awarded at the end of the day.

You don’t realize how similar coding and reporting are until you watch a hack and a technologist work together to create something. Both vocations have the same core purpose: creating something useful that others can use (or in the hack’s case, unearthing information that is useful to the public).

The headlines that emerged out of the day were amazing. ‘Mr No Vote’ won first prize. When citizen hacks Ivor Whitten, Matt Johnston and coder Robert Moore of e-learning company Learning Pool used Scraperwiki to scrape electoral data from local government websites, they found that over 60% of voters in every constituency in Northern Ireland (save one) abstained from voting in the last election, raising questions about just how democratically MPs and MLAs have been elected.
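The underlying sum is simple enough to sketch in a few lines of Python (the electorate and turnout numbers below are invented; the real figures came from the scraped electoral data):

```python
# Abstention rate per constituency: share of the electorate who did
# not cast a vote. Numbers are invented for illustration.
constituencies = {
    "North": {"electorate": 60000, "votes_cast": 21000},
    "South": {"electorate": 55000, "votes_cast": 23500},
}

for name, c in constituencies.items():
    abstained = 1 - c["votes_cast"] / c["electorate"]
    note = " (over 60% abstained)" if abstained > 0.6 else ""
    print(f"{name}: {abstained:.0%} abstention{note}")
```

Run over every constituency, this is the calculation behind the “all but one over 60%” headline.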

What was really significant about the story was that the guys were able to uncover it within a number of hours. One member of Team Qluso, an ex-investigative journalist, was astounded, calling ScraperWiki a “gamechanger” for the industry. It was an almost historic event, seeing technology transform a small but significant part of the industry: the process of finding and analyzing data. (A process that, according to said gobsmacked Team Qluso member, used to take days, weeks, even months.)

If you get a chance to chat with the Scraperwiki team, take it with both hands: these guys are building some cracking tools for hacks’n’hackers alike.

December 03 2010


Views part 1 – Canadian weather stations

(This is the first of two posts announcing ScraperWiki “views”, a new feature that Julian, Richard and Tom worked away on and quietly launched a couple of months ago. Once you’ve scraped your data, how can you get it out again in just the form you want?)

Canadian weather stations

Clear Climate Code is a timely project to reimplement the software of climate science academics in nicely structured and commented Python. David Jones has been using ScraperWiki views to find out which areas of the world they don’t have much surface temperature data for, so they can look for more sources.

Take a look at his scraper Canada Climate Sources. If you scroll down, there’s a section “Views using this data from this scraper”. That’s where you can make new views – small pieces of code that output the data the way you want. Think of them as little CGI scripts you can edit in your browser. This is a screenshot of the Canada Weather Station Map view.

It’s a basic Google Map, made for you from a template when you choose “create new view”. But David then edited it, to add conditional code to change the colours and letters on the pins according to the status of the stations.
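Conceptually, a view is just a small script that reads your already-scraped rows and re-emits them in whatever shape you need. Here is a hedged Python sketch of the idea – the table, columns and colour scheme are invented, not David’s actual code:

```python
import json
import sqlite3

# Stand-in for a scraper's datastore: a little SQLite table of
# weather stations (names, coordinates and statuses invented).
db = sqlite3.connect(":memory:")
db.execute("CREATE TABLE stations (name TEXT, lat REAL, lon REAL, status TEXT)")
db.executemany(
    "INSERT INTO stations VALUES (?, ?, ?, ?)",
    [("Toronto", 43.7, -79.4, "active"), ("Resolute", 74.7, -94.8, "closed")],
)

# The "view" part: conditional code that picks a pin colour per
# station status, then emits JSON ready for a map template.
COLOURS = {"active": "green", "closed": "red"}
pins = [
    {"name": n, "lat": lat, "lon": lon, "colour": COLOURS[s]}
    for n, lat, lon, s in db.execute("SELECT * FROM stations")
]
print(json.dumps(pins))
```

The same scraped table can feed any number of views – a map, a chart, a CSV export – without touching the scraper itself.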

This is the key powerful thing about ScraperWiki views – even if you start with a standard chart or map, you have the full power of the visualisation APIs you are using, and of HTML, Javascript and CSS, to do more interesting things later.

There’s more about ScraperWiki and the Canada weather stations in the posts Canada and Analysis of Canada Data on the Clear Climate Code blog.

Next week – part 2 will be about how to use views to output your data in the machine readable format that you want.

November 15 2010


Lichfield Hacks and Hackers: PFIs, plotting future care needs, what’s on in Lichfield and mapping flood warnings

The winners with judges Lizzie and Rita. Pic: Nick Brickett

By Philip John, Journal Local. This has been cross-posted on the Journal Local blog.

It may be a tiny city but Lichfield has shown that it has some great talent at the Hacks and Hackers Hack Day.

Sponsored by Lichfield District Council and Lichfield-based Journal Local, the day was held at the George Hotel and attended by a good selection of local developers and journalists – some coming from much further afield.

Once the introductions were done and we’d all contributed a few ideas the work got started and five teams quickly formed around those initial thoughts.

The first two teams decided to look into Private Finance Initiatives (PFIs) and Information Asset Registers (IARs). The first of these scraped information from 470 councils to show which of these published information about PFIs. The results showed that only 10% of councils actually put out any details of PFIs, highlighting a lack of openness in that area.
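A sketch of that kind of survey in Python (with stubbed-in pages standing in for the 470 real council websites) might look like this:

```python
# For each council, check whether its scraped page mentions PFIs at
# all. The page contents and council names here are invented stubs;
# the real project fetched live council websites.
def mentions_pfi(html):
    text = html.lower()
    return "private finance initiative" in text or "pfi" in text

pages = {
    "council-a": "<h1>Spending</h1> Our PFI contracts are listed below.",
    "council-b": "<h1>Spending</h1> Nothing to see here.",
}

publishers = [c for c, html in pages.items() if mentions_pfi(html)]
print(f"{len(publishers)}/{len(pages)} councils publish PFI details")
```

Scale the dictionary up to 470 fetched pages and the 10% figure falls straight out of the same loop.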

Also focused on PFIs was the ‘PFI wiki’ project which scraped the Partnerships UK database of PFIs and re-purposed it to allow deeper interrogation, such as by region and companies. It clearly paves the way for an OpenCharities style site for PFIs.

Future care needs was the focus of the third team who mapped care homes along with information on ownership, public vs private status and location. The next step, they said, is to add the number of beds and match that to the needs of the population based on demographic data, giving a clearer view of whether the facilities exist to cater for the future care needs in the area.

A Lichfield-related project was the focus of the fourth group who aimed to create a comprehensive guide to events going on in Lichfield District. Using about four or five scrapers, they produced a site that collated all the events listing sites serving Lichfield into one central site with a search facility. The group also spawned a new Hacks/Hackers group to continue their work.

Last but not least, the fifth group worked on flood warning information. By scraping the Environment Agency web site they were able to display on a map the river level gauges and the flood warning levels, so that at a glance it’s possible to see the water level in relation to the flood warning limit.
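The comparison at the heart of the project is easy to sketch (the gauge names, readings and warning limits below are invented; the real values were scraped from the Environment Agency site):

```python
# Compare each gauge's current river level against its flood warning
# limit. All figures are invented for illustration.
gauges = [
    {"name": "Trent at Yoxall", "level_m": 2.1, "warning_m": 2.4},
    {"name": "Tame at Hopwas", "level_m": 3.0, "warning_m": 2.6},
]

for g in gauges:
    headroom = g["warning_m"] - g["level_m"]
    state = "OVER warning level" if headroom < 0 else f"{headroom:.1f}m below limit"
    print(f"{g['name']}: {state}")
```

Colour-code the map pins on the sign of `headroom` and you have the at-a-glance view the team built.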

So after a long day Lizzie Thatcher and Rita Wilson from Lichfield District Council joined us to judge the projects. They came up with a clever matrix of key points to rate the projects by, and chose the ‘what’s on’ and ‘flood warning’ projects as joint winners; each shares a prize of £75 in Amazon vouchers.

The coveted ScraperWiki mug also went to the ‘what’s on’ project for their proper use of ScraperWiki to create good quality scrapers.

Pictures from the event by Nick Brickett:



November 11 2010


Nominate a Developer Working Towards Social Change for this Year's Pizzigati Prize

Nominations are now open for the fifth annual awarding of the $10,000 Antonio Pizzigati Prize for Software in the Public Interest, an award aimed at software developers working with nonprofits to help forge innovative social change. The prize welcomes applications from — and nominations of — single individuals who have demonstrated leadership in the field of public interest software. 


November 01 2010


Scraperwiki launches first student event in Liverpool

We’re happy to announce our first Hacks Meet Hackers event for students, to take place in Liverpool on Wednesday December 8, 2010 from 9.30am to 5pm at Liverpool John Moores University’s Art and Design Academy.

In partnership with Open Labs, we’re putting on this event for student developers and journalists from LJMU’s School of Journalism and other departments including the School of Computing & Mathematical Sciences.

So what’s this hack day all about?
It’s a practical event at which web developers and designers will pair up with journalists and bloggers to produce a number of projects and stories based on public data.

Who’s it for?
We hope to attract ‘hacks’ and ‘hackers’ from all different types of backgrounds – students with skills in journalism, data visualisation, design, programming, statistics, games development and more.

What will you get out of it?
The aim is to show journalists how to use programming and design techniques to create online news stories and features; and vice versa, to show programmers how to find, develop, and polish stories and features.

What should participants bring?
We would encourage people to come along with ideas for local ‘datasets’ that are of interest. In addition we will create a list of suggested data sets at the introduction on the morning of the event, but flexibility is key. If you have a laptop, please bring it too.

But what exactly will happen on the day itself? Armed with their laptops and WiFi, journalists and developers will be put into teams of around four to develop ideas, with the aim of finishing final projects that can be published and shared publicly. Each team will then present their project to the whole group.

Overall winners will receive a prize at the end of the day. Food and drink will be provided during the day!
Any more questions? Please get in touch via aine[at]scraperwiki.com.

October 29 2010


Video: Leeds Hacks & Hackers Hack Day

Here are a few video interviews from yesterday’s Hacks and Hackers Hack Day Leeds, with various participants and Sarah Hartley from Guardian Local. Follow this link for a write-up of all the projects.


Leeds Hacks and Hackers Hack Day: Planning maps; cutting up Leeds; researching brownfield; and finding the city’s blogging pulse

It was to West Yorkshire for the fifth stop on ScraperWiki’s UK & Ireland Hacks & Hackers tour, at Old Broadcasting House in the excellent city of Leeds.

A varied crowd turned out for yesterday’s hack day hosted by nti Leeds, and also sponsored by Guardian Open Platform and Guardian Local and Leeds Trinity Centre for Journalism. It included participants from the city council and regional newspapers, independent bloggers, designers and computer programmers – with all different kinds of experience.

With the introduction over, the competition began, fuelled by the usual Scraperwiki promise of pizza and beer; and Amazon vouchers for the winners – who would be decided by our three judges, Sarah Hartley, editor of Guardian Local, Linda Broughton, head of nti Leeds, and Richard Horsman, associate principal lecturer at Leeds Trinity Centre for Journalism.

Five groups formed around different areas of interest, but all with a Leeds focus. Brownfield Research, by Greg Brant, Rebecca Whittington, Jon Eland and Tom Mortimer-Jones was about discovering the past, present and planned future of brownfield sites using scrapes of planning applications and change of use applications combined with web-chat and related documents. It also aimed to include history of industrial disease and accidents and contamination on site.

Leeds Planning Map, by Catherine O’Connor (@journochat), Elizabeth Sanderson (@Lizziesanderson), James Rothschild (@jrpmedia), John Baron (@GdnLeeds), Karl Schneider (@karlschneider) and Matt Jones (@matt_jones86), allowed users to view all planning decisions in Leeds, colour coded as accepted or refused applications.

Find Me, by software developer Marcus Houlden (@mhoulden), was a geolocation web application that displays the user’s current location, address and postcode, with links to the nearest bus stops. He also started adding Yorkshire Water roadworks data.

The Leeds Pulse team scraped Live Journal data to produce a web application, built on Django, demonstrating negative and positive blogging attitudes across Leeds – drawing from 8,500 blog posts. It categorised “love, like or good” as positive, and “hate, bad or meh” as negative. The judges certainly weren’t ‘meh’ about it, and chose it as the runner-up.
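That keyword approach is simple to sketch in Python – the word lists come from the post, while the sample sentences are invented:

```python
# Crude keyword sentiment, as described in the post: "love, like, good"
# count as positive, "hate, bad, meh" as negative.
POSITIVE = {"love", "like", "good"}
NEGATIVE = {"hate", "bad", "meh"}

def mood(text):
    words = text.lower().split()
    score = sum(w in POSITIVE for w in words) - sum(w in NEGATIVE for w in words)
    return "positive" if score > 0 else "negative" if score < 0 else "neutral"

print(mood("I love the new market, really good food"))   # positive
print(mood("Traffic was bad and the weather was meh"))   # negative
```

Crude, certainly, but run over 8,500 posts it is enough to surface the broad pattern the team mapped.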

Leeds Uncut, however, scooped the overall prize. Suzanne McTaggart, Amna Kaleem (@amnakaleem), Nick Crossland (@ncrossland), Michael Brunton-Spall (@bruntonspall) with some help from developer Martin Dunschen created a map showing the eight constituencies in Leeds to highlight how they are being affected by spending cuts and redundancies.

They also looked at job vacancies in each of the constituencies, to identify whether the creation of new jobs is offsetting the doom and gloom caused by spending cuts and job losses. Different shades of colour in the form of an “economic health thermometer” gave a visually effective overview of which constituencies are suffering the most and least.

The data for the project was gathered from job websites, news websites, the Guardian’s Cutswatch page and the Office for National Statistics, which provides figures on how many people are claiming unemployment benefit/jobseekers allowance each month, giving an indication of the number of new redundancies.

The three judges … were unanimous in deciding that the worthy winners had successfully collated trusted data and compiled an easy to use map visualisation.

… commented judge Sarah Hartley, who has written this account of the beginning, middle and end of the day.

£250 worth of Amazon.co.uk vouchers will be split up among the winners and runners up. An extra prize for the best scraper work, chosen by Scraperwiki’s Julian Todd, went to Matt Jones, who will continue to maintain the planning data scraper.

With thanks to all our sponsors and helpers mentioned above, and additionally Leeds Trinity’s Catherine O’Connor and developer Imran Ali.

Twitter conversation was via the #hhhleeds tag, and see below for a visualisation of some of the geotagged tweets (courtesy of remote onlooker Tony Hirst, @psychemedia):

You can find a Twitter list of delegates here:

More links to be added as we spot them and photographs are coming… Please email judith at scraperwiki.com with more material, or leave links in the comment section below. I’d especially like to add in links to scrapers and data sets, so people can see how the projects were built.

Want to get involved? We’re still on tour! If you’d like to sponsor an event please get in touch with aine@scraperwiki.com.

October 19 2010


Scraperwiki/RBI launch first in-house Hacks & Hackers event – for B2Bs

Tickets are now available for a Scraperwiki hack day at Reed Business Information (RBI) on Monday 29th November in Quadrant House, Surrey, from 8am (registration) – 8.30pm.

B2B journalists, developers and designers are invited to attend the one-day ‘Hacks and Hackers’ event hosted and sponsored by RBI, B2B publisher of titles including FlightGlobal, Farmers Weekly and New Scientist.

The idea is that business journalists and bloggers (‘Hacks’) pair up with computer programmers and designers (‘Hackers’) to produce innovative data projects in the space of one day. Food and drink will be provided throughout the event. Prizes for the best projects will be awarded in the evening.

Any journalist from a B2B background, or developer/designer with an interest in business journalism is welcome to attend. We’re especially keen to welcome people who are interested in producing data visualisations.

“Data journalism is an important area of development for our editorial teams in RBI,” said Karl Schneider, RBI editorial development director:

“It’s a hot topic for all journalists, but it’s particularly relevant in the B2B sector. B2B journalism is focused on delivering information that its audience can act on, supporting important business decisions.

“Often a well-thought-out visualisation of data can be the most effective way of delivering critical information and helping users to understand key trends.

“We’re already having some successes with this kind of journalism, and we think we can do a lot more. So building up the skills of our editorial teams in this area is very important.”

The event is the first in-house hack day that Scraperwiki has organised as part of its UK and Ireland Hacks & Hackers tour.

50 places are available in total: half for RBI staff, half for external attendees. People wishing to attend should select the relevant ticket at this link.

Past hacks and hackers days have run in London, Liverpool, Birmingham and Manchester. For a flavour of the projects please see this blog post.

If you have any questions please contact Aine McGuire via Aine [at]scraperwiki.com.

October 17 2010


Video: Hacks and Hackers Hack Day Manchester

Hacks and Hackers Hack Day Manchester at Vision+Media in Salford, on 15th October 2010. Filmed (on a Flip) and edited by Joseph Stashko, who has kindly allowed us to re-publish the video here. A write-up of the day can be found at this link.


Hacks and Hackers Hack Day Manchester: Tweeting police, local gigs and Preston’s summer spend

We’re sure that Greater Manchester Police had us in mind when they set about tweeting 24 hours of calls on the eve of our Hacks and Hackers Hack Day Manchester. (Photos courtesy of Michael Brunton-Spall).

It proved a fantastic data set to work with and sparked four different ‘splinter’ groups. Michael Brunton-Spall, a developer from Guardian Platform (one of the event sponsors), set about making the tweets usable and created a JSON GMP24 dataset [link].
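Turning a raw feed into a shared dataset is mostly a matter of normalising each tweet into a record and serialising the lot. A minimal sketch, with invented records (the real GMP24 set was built from the Greater Manchester Police feeds):

```python
import json

# Hypothetical (timestamp, text) pairs standing in for scraped GMP tweets.
raw = [
    ("2010-10-14T09:01:00", "Call 1: suspicious vehicle reported in Salford"),
    ("2010-10-14T09:03:00", "Call 2: youths drinking in a park"),
]

# Normalise into one dict per incident, then serialise as JSON so any of
# the splinter groups can load the same dataset.
dataset = [{"time": t, "text": text} for t, text in raw]
gmp24_json = json.dumps(dataset, indent=2)
print(gmp24_json)
```

Once the data is in this shape, mapping, counting and searching it are all one list comprehension away, which is why a cleaned-up JSON dump was such a useful starting point for the other teams.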

Meanwhile, for the ‘Genetically Modified Policing’ project, Louise Bolotin, from Inside the M60, Lee Swettingham from MEN Media, programmer Dave Kendal and Megan Knight from the University of Central Lancashire scraped tweets and analysed peak times of tweets, the categories of calls and the number of followers of the feeds throughout the day.

Obviously, they would love to work with a dump of the police calls database, but in the meantime, this would do, said Megan, who presented the team’s work.

David Kendall also produced his own project mapping 999 calls in the area. He took the tweet data and put it through the Yahoo placemaker tool, plotting information on a Google map, to see which areas got calls over certain periods of time.

Yuwei Lin and Enrico Zini took the stage and First Prize for the final police project, a GMP tweet database, and showed a very neat search tool that allowed analysis of certain aspects of the police data (3257 items).

For example, we could look at the number of incidents that involved ‘sex’, or ‘youths and drinking’, whether the incidents involved males or females (“men are more troublesome than women!”), and at a tag cloud for certain locations. We could see a list of keywords and place names. It involved using the JSON dataset created by Michael Brunton-Spall [dataset link] and adding keyword sets. The source code has been released here, along with a handy explanation.
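The kind of keyword query the winning tool supported can be illustrated in a few lines. The incident texts and the dataset shape here are assumed, not taken from the released source:

```python
# Count incidents whose text mentions any of a set of search terms,
# in the style of the GMP tweet database's search tool.
incidents = [
    {"text": "Youths drinking in the park"},
    {"text": "Noise complaint on Oxford Road"},
    {"text": "Report of drinking outside a pub"},
]

def count_mentions(items, *terms):
    """How many incident records mention at least one of the terms?"""
    return sum(
        1 for item in items
        if any(term in item["text"].lower() for term in terms)
    )

print(count_mentions(incidents, "drinking"))          # 2
print(count_mentions(incidents, "youths", "noise"))   # 2
```

Add a keyword-to-category mapping on top of this and you get exactly the sort of breakdown presented on stage: incidents by topic, by location, by gender.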

Second prize went to ‘Preston’s Summer of Spend’, built by Uclan student Daniel Bentley and Scraperwiki’s Julian Todd. They took spending data from Preston City Council, converting PDFs to machine readable formats.

Once in a CSV file, they were able to create interactives, and identify interesting aspects of the data. It might be worth, for example, looking into why quite so much went to one individual Google told us was a “legal representative of a controversial city development”.  A further step might be to request the same information from other local councils and compare the spending levels.
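Once the tables are out of the PDFs, writing them to CSV and totalling spend per supplier is the easy part. A minimal sketch with invented rows (the real figures came from Preston City Council’s published PDFs):

```python
import csv
import io
from collections import defaultdict

# Hypothetical (date, supplier, amount) rows extracted from the PDFs.
rows = [
    ("2010-06-01", "Acme Legal Ltd", 12000.00),
    ("2010-06-15", "Road Repairs Co", 4300.50),
    ("2010-07-02", "Acme Legal Ltd", 8000.00),
]

# Write the machine-readable CSV (in-memory here; a file in practice).
buf = io.StringIO()
writer = csv.writer(buf)
writer.writerow(["date", "supplier", "amount"])
writer.writerows(rows)

# Total spend per supplier -- the first question to ask of the data.
totals = defaultdict(float)
for _, supplier, amount in rows:
    totals[supplier] += amount

biggest = max(totals, key=totals.get)
print(biggest, totals[biggest])
```

Ranking suppliers like this is how an outlier – one individual receiving an unusually large share – jumps out and becomes a story lead.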

Third prize and the Scraperwiki mug for best scraper went to the ‘Quarternote’ project built by developers Kane, Robin, Zen, Becky, Andrew and Andrew. This web application, which got many of the audience very interested,  provided local music and band information for venue owners, promoters and event organisers.

By scraping MySpace, you could easily find bands gigging in your area for your event. Simply put, you could put together a gig list in three clicks. While something like LastFM would have been an easier hack, the team targeted MySpace as a source to which more local bands were contributing. (Photo from video by @josephstash)

Tom Mortimer-Jones of Scraperwiki, freelance writer Ruth Rosselson, InsidetheM60’s Nigel Barlow, Journal Local developer Philip John and freelancer Mark Bentley decided to hack data showing ‘Manchester Rich and Poor’. They made a comparison by ward in Manchester, showing different factors, e.g. population density, unemployment rate, incapacity benefit and severe disablement allowance, and education.

Lastly, the Judgmental group, Francis, Chris and James, decided to do some work with legal data [disclaimer: I was also part of this one!]. Thanks to a friendly unknown donor, one of our team had been given a CD full of United Kingdom case judgment data. At the moment this is only available via Bailii, and the team wanted to make something more usable and searchable (Bailii’s data cannot be scraped or indexed by Google). So judgmental.org.uk was created.

It is still a work in progress, but could eventually provide a very useful tool for journalists. Although the data is not updated past a certain point, journalists would be able to analyse the information for different factors: which judges made which judgments? What is the level of activity in different courts? Which times of year are busier? It could be scrutinised to determine different aspects of the cases.
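The questions listed above are all group-and-count queries. A sketch of how they might look once the judgments are parsed into records; the records and field names here are invented for illustration:

```python
from collections import Counter

# Hypothetical judgment records; the real data came from a CD of UK case
# judgments. Counting per judge, court or month is the sort of question
# the team wanted the site to answer.
judgments = [
    {"judge": "Smith J", "court": "EWHC", "month": "2010-03"},
    {"judge": "Jones LJ", "court": "EWCA", "month": "2010-03"},
    {"judge": "Smith J", "court": "EWHC", "month": "2010-04"},
]

by_judge = Counter(j["judge"] for j in judgments)   # which judges made which judgments?
by_court = Counter(j["court"] for j in judgments)   # activity per court
by_month = Counter(j["month"] for j in judgments)   # busier times of year

print(by_judge.most_common(1))  # [('Smith J', 2)]
```

The hard part, of course, is not the counting but reliably extracting judge, court and date from thousands of inconsistently formatted judgment documents.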

Judge Andy Dickinson from the University of Central Lancashire has since blogged his thoughts about the day overall:

Given the increasing amount of raw data that organisations are pumping out, journalists will find themselves vital in making sure that they stay accountable. But I said in an earlier post that good journalists don’t need to know how to do everything, they just need to know who to ask.

With thanks to our judges (Andy, along with developer Tim Dobson and Julian Tait from Open Data Cities), our host Vision+Media and our lovely sponsors Inside the M60, Guardian Open Platform, the Digital Editors Network, Vision+Media (supported by the European Regional Development Fund and the Northwest Regional Development Agency), Journal Local and MEN Media.

A special thanks to Louise  & Nigel at InsidetheM60 and Jacqui at Vision+Media for the organisational help.

Links to posts about Hacks and Hackers Hack Day Manchester:

Any more you have spotted? Any names I’ve missed off? Videos will be added to this post soon. If you have technical detail, or screen shots, or presentations to add please email judith [at] scraperwiki.com.

Our youngest hacker yet, with Aidan:

October 15 2010


SOS! Hackers needed for transglobal ‘Hackathon’ in Birmingham

Emergency announcement: Digital Birmingham is hosting a transglobal ‘Hackathon’ in Birmingham, facilitated by Scraperwiki and in collaboration with hack events in Edmonton [Canada] and Seoul [Korea], with the aim of building a disaster situation app.

When? Wednesday 20th October 2010 – 14:00 to Thursday 21st October 14:00.  We know this announcement comes at short notice but you don’t get notice in a real emergency!

Where? Birmingham Science Park Aston Faraday Wharf, Holt Street.

What will our hackathon be like? It will be slightly different from most hackathons. As we work on our project, we will be sharing what we’re working on with Edmonton [Canada] and Seoul [Korea] as we ‘follow the sun’.

We’re going to spend 24 hours trying to create the blueprint for an app design that can be used in cities all over the world. This will be an application that will help families prepare during a disaster and will list emergency muster points, emergency info, alerts during disasters, and what to do. It will cater for different scenarios: although floods or a Buncefield-type explosion are both emergency situations, you don’t handle them the same way. The aim of this app is to keep people safe, and we hope to come up with a guideline for how it should look – so that app designers around the world can pick up the guideline and run with it.

We will join up with Edmonton [Canada] and the Girl Geeks Hackathon Team at 14:00 on the 20th and set the scene for what we will be doing.  We will have OPEN DATA sets from Edmonton.  We also hope to have a representative from Birmingham’s emergency planning team to talk about real scenarios and how they are dealt with by the authorities and emergency services. After our link up with Edmonton we will break into project groups and start the hack.

Why? This is not abandonware!  Think repository. Think CPAN! The code that is written will not be a temporary sand castle; it will be more like graffiti. This is a chance to work collaboratively with developers from across the world over 24 hrs. Come for 4 hrs or 8 hrs or do the full 24hr marathon and join us for what promises to be a great night.

Who should come? Hackers who are interested in making a difference or insomniacs!  We need coders, designers and creative people. We would also love to have people who are involved in emergency planning and crisis management.

What happens afterwards? The best ideas will be taken up and potentially used as the basis for new product development that could act as an app template for any city in the world.

Who else is helping? The event is facilitated by ScraperWiki, the 4IP funded data start-up behind the UK & Ireland Hacks & Hackers Hack Day tour.

Scraperwiki provides a platform that lets programmers develop, store and maintain software scrapers for extracting and linking data. In addition, ScraperWiki provides ‘views’, which let private individuals, researchers, journalists and commercial organisations interrogate and cross-reference public data in a simple and meaningful way.
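The scraper half of that pattern can be sketched in plain Python. This is a generic illustration, not ScraperWiki’s actual API; the URL and the table-based page layout are assumptions:

```python
# A minimal table scraper of the kind written on ScraperWiki: fetch a page,
# pull out the cell text, hand the rows to a 'view' for display.
import urllib.request
from html.parser import HTMLParser

class CellCollector(HTMLParser):
    """Collect the stripped text of every <td> on a page."""
    def __init__(self):
        super().__init__()
        self.in_td = False
        self.cells = []

    def handle_starttag(self, tag, attrs):
        if tag == "td":
            self.in_td = True

    def handle_endtag(self, tag):
        if tag == "td":
            self.in_td = False

    def handle_data(self, data):
        if self.in_td and data.strip():
            self.cells.append(data.strip())

def scrape(url):
    """Fetch a page and return the text of its table cells."""
    html = urllib.request.urlopen(url).read().decode("utf-8")
    parser = CellCollector()
    parser.feed(html)
    return parser.cells

# e.g. scrape("http://example.gov.uk/planning/applications") -- placeholder URL
```

On the real platform the scraper would store rows in a datastore and run on a schedule, so the dataset stays current without anyone re-running it by hand.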

Other (important) stuff! We will feed you and give you lots of coke, pizza and refreshments – we will also have beers at the end to celebrate.

October 08 2010


Venue change for Manchester Hacks and Hackers Hack Day

Due to a couple of logistical reasons we have decided to change the venue for our Manchester Hacks and Hackers Hack Day on Friday 15th October. It will now be held at:

Vision+Media, 100 Broadway, Salford, M50 2UW (Google Map)

Nearest tram is Broadway.

There are still a few hacker places left if you would like to sign up:

October 06 2010


Event: Hacks and Hackers Hack Day Lichfield (#hhhlich)

We have another event to announce, as part of Scraperwiki’s UK & Ireland tour. We’re going to Lichfield, Staffordshire! In partnership with Lichfield District Council, we’re holding a hacks and hackers hack day at Venture House on Monday 11th November.

“Lichfield District Council have been publishing open data for a while now, and it seems a good fit to put on a day where we can showcase the data we have published, as well as encourage people to do something with it,” said council webmaster Stuart Harrison.

“We’re not precious though, and if something is built using other public data, we’ll be just as happy!”

The details:

What? Scraperwiki, the award-winning new screen-scraping and data-mining tool funded by 4iP, and Lichfield District Council are putting on a one-day practical hack day* in Lichfield, Staffordshire, at which web developers and designers (hackers) will pair up with journalists and bloggers or anyone with an interest in media and communications (hacks) to produce a number of projects and stories based on public data. It’s all part of the ScraperWiki UK & Ireland Hacks and Hackers tour.

Who’s it for? We hope to attract ‘hacks’ and ‘hackers’ from all different types of backgrounds – across programming, media and communications.

What will I get out of it?
The aim is to show journalists how to use programming and design techniques to create online news stories and features; and vice versa, to show programmers how to find, develop, and polish stories and features. To see what happened at our past events in Liverpool and Birmingham visit the ScraperWiki blog.

How much? NOTHING! It’s free, thanks to our sponsors.

What should I bring? We would encourage people to come along with ideas for local ‘datasets’ that are of interest. In addition we will create a list of suggested data sets at the introduction on the morning of the event but flexibility is key for this event. If you have a laptop, please bring this too.

So what exactly will happen on the day? Armed with their laptops and WIFI, journalists and developers will be put into teams of around four to develop their ideas, with the aim of finishing final projects that can be published and shared publicly. Each team will then present their project to the whole group. Overall winners will receive a prize at the end of the day.

*Not sure what a hack day is? Let’s go with the Wikipedia definition: it is “an event where developers, designers and people with ideas gather to build ‘cool stuff’”…
