Tumblelog by Soup.io

October 17 2010


Hacks and Hackers Hack Day Manchester: Tweeting police, local gigs and Preston’s summer spend

We’re sure that Greater Manchester Police had us in mind when they set about tweeting 24 hours of calls on the eve of our Hacks and Hackers Hack Day Manchester. (Photos courtesy of Michael Brunton-Spall).

It proved a fantastic data set to work with and sparked four different ‘splinter’ groups. Michael Brunton-Spall, a developer from Guardian Platform (one of the event sponsors), set about making the tweets usable and created a JSON GMP24 dataset [link].

Meanwhile, for the ‘Genetically Modified Policing’ project, Louise Bolotin, from Inside the M60, Lee Swettingham from MEN Media, programmer Dave Kendal and Megan Knight from the University of Central Lancashire scraped tweets and analysed peak times of tweets, the categories of calls and the number of followers of the feeds throughout the day.

Obviously, they would love to work with a dump of the police calls database, but in the meantime, this would do, said Megan, who presented the team’s work.
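The peak-time part of that analysis can be sketched in a few lines of Python. This is only an illustration, not the team's code: the records below are invented, and the field names `created_at` and `text` are assumptions rather than the real GMP24 dataset schema.

```python
from collections import Counter
from datetime import datetime

# Invented sample records; "created_at" and "text" are assumed field names,
# not the actual schema of the GMP24 dataset.
SAMPLE_TWEETS = [
    {"created_at": "2010-10-14 09:15:00", "text": "Report of theft from shop"},
    {"created_at": "2010-10-14 23:05:00", "text": "Report of fight outside pub"},
    {"created_at": "2010-10-14 23:40:00", "text": "Noise complaint"},
    {"created_at": "2010-10-14 23:59:00", "text": "Report of youths drinking"},
    {"created_at": "2010-10-15 02:10:00", "text": "Alarm sounding at premises"},
    {"created_at": "2010-10-15 02:30:00", "text": "Abandoned vehicle"},
]


def tweets_per_hour(tweets):
    """Count how many tweets fall in each hour of the day (0-23)."""
    counts = Counter()
    for tweet in tweets:
        stamp = datetime.strptime(tweet["created_at"], "%Y-%m-%d %H:%M:%S")
        counts[stamp.hour] += 1
    return counts


def peak_hours(tweets, top=3):
    """Return the busiest hours as (hour, count) pairs, busiest first."""
    return tweets_per_hour(tweets).most_common(top)


if __name__ == "__main__":
    for hour, count in peak_hours(SAMPLE_TWEETS):
        print(f"{hour:02d}:00  {count} tweets")
```

The same counting approach would work for call categories, given a field or rule that assigns each tweet a category.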

David Kendall also produced his own project mapping 999 calls in the area. He took the tweet data and put it through the Yahoo Placemaker tool, plotting the results on a Google map to see which areas got calls over certain periods of time.

Yuwei Lin and Enrico Zini took the stage and First Prize for the final police project, a GMP tweet database, and showed a very neat search tool that allowed analysis of certain aspects of the police data (3257 items).

For example, we could look at the number of incidents that involved ‘sex’, or ‘youths and drinking’, whether the incidents involved males or females (“men are troublesome than women!”), and at a tag cloud for certain locations. We could see a list of keywords and place names. It involved using the JSON dataset created by Michael Brunton-Spall [dataset link] and adding keyword sets. The source code has been released here, along with a handy explanation.
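As a rough illustration of how keyword sets can be layered over a JSON list of tweets, here is a minimal Python sketch. It is not the released source code: the keyword sets, the sample tweets and the `text` field name are all hypothetical.

```python
import json
import re

# Hypothetical keyword sets; the real project defined its own.
KEYWORD_SETS = {
    "alcohol": {"drunk", "drinking", "intoxicated"},
    "youths": {"youth", "youths", "teenagers"},
}

# Invented sample data standing in for the JSON dataset ("text" is an assumed field).
SAMPLE_JSON = """[
  {"text": "Report of youths drinking in park"},
  {"text": "Drunk man refusing to leave pub"},
  {"text": "Abandoned vehicle on main road"}
]"""


def tag_tweet(text, keyword_sets=KEYWORD_SETS):
    """Return the sorted names of every keyword set that matches the text."""
    words = set(re.findall(r"[a-z]+", text.lower()))
    return sorted(name for name, keywords in keyword_sets.items() if words & keywords)


def search(tweets, *tags):
    """Return the tweets that match all of the requested keyword sets."""
    wanted = set(tags)
    return [t for t in tweets if wanted <= set(tag_tweet(t["text"]))]


if __name__ == "__main__":
    tweets = json.loads(SAMPLE_JSON)
    for tweet in search(tweets, "youths", "alcohol"):
        print(tweet["text"])
```

Matching on whole words rather than substrings avoids false hits, at the cost of missing variants that are not listed in a set.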

Second prize went to ‘Preston’s Summer of Spend’, built by UCLan student Daniel Bentley and Scraperwiki’s Julian Todd. They took spending data from Preston City Council, converting PDFs to machine-readable formats.

Once the data was in a CSV file, they were able to create interactives and identify interesting aspects of the data. It might be worth looking into, for example, why quite so much went to one individual who, Google told us, was a “legal representative of a controversial city development”. A further step might be to request the same information from other local councils and compare the spending levels.
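Once spending data is in CSV form, spotting unusually large recipients takes only a little code. The sketch below is an assumption-laden illustration rather than the team's method: the column names `payee` and `amount` and all of the figures are made up.

```python
import csv
import io
from collections import defaultdict
from decimal import Decimal

# Invented sample rows; the "payee" and "amount" column names are assumptions,
# not the layout of Preston City Council's published data.
SAMPLE_CSV = """payee,amount
Supplier A,1200.00
Supplier B,80500.00
Supplier A,650.50
Supplier C,1000.00
"""


def total_by_payee(csv_file):
    """Sum the payment amounts for each payee in an open CSV file."""
    totals = defaultdict(Decimal)
    for row in csv.DictReader(csv_file):
        totals[row["payee"]] += Decimal(row["amount"])
    return dict(totals)


def largest_payees(totals, top=3):
    """Return (payee, total) pairs sorted by total, largest first."""
    return sorted(totals.items(), key=lambda item: item[1], reverse=True)[:top]


if __name__ == "__main__":
    totals = total_by_payee(io.StringIO(SAMPLE_CSV))
    for payee, total in largest_payees(totals):
        print(f"{payee}: {total}")
```

`Decimal` is used instead of floats so that totals in pounds and pence add up exactly.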

Third prize and the Scraperwiki mug for best scraper went to the ‘Quarternote’ project built by developers Kane, Robin, Zen, Becky, Andrew and Andrew. This web application, which got many of the audience very interested, provided local music and band information for venue owners, promoters and event organisers.

By scraping MySpace, you could easily find bands gigging in your area for your event. Simply put, you could put together a gig list in three clicks. While something like LastFM would have been an easier hack, the team targeted MySpace as a source to which more local bands were contributing. (Photo from video by @josephstash)

Tom Mortimer-Jones of Scraperwiki, freelance writer Ruth Rosselson, Inside the M60’s Nigel Barlow, Journal Local developer Philip John and freelance Mark Bentley decided to hack data showing ‘Manchester Rich and Poor’. They made a comparison by ward in Manchester, showing different factors, e.g. population density, unemployment rate, incapacity benefit and severe disablement allowance, and education.

Lastly, the Judgmental group (Francis, Chris and James) decided to do some work with legal data [disclaimer: I was also part of this one!]. Thanks to a friendly unknown donor, one of our team had been given a CD full of United Kingdom case judgment data. At the moment this is only available via BAILII, and the team wanted to make something more usable and searchable (BAILII’s data cannot be scraped or indexed by Google). So judgmental.org.uk was created.

It is still a work in progress, but could eventually provide a very useful tool for journalists. Although the data is not updated past a certain point, journalists would be able to analyse the information for different factors: which judges made which judgments? What is the level of activity in different courts? Which times of year are busier? It could be scrutinised to determine different aspects of the cases.

Judge Andy Dickinson from the University of Central Lancashire has since blogged his thoughts about the day overall:

Given the increasing amount of raw data that organisations are pumping out, journalists will find themselves vital in making sure that they stay accountable. But I said in an earlier post that good journalists don’t need to know how to do everything, they just need to know who to ask.

With thanks to our judges (Andy, along with developer Tim Dobson and Julian Tait from Open Data Cities), our host Vision+Media and our lovely sponsors Inside the M60, Guardian Open Platform, the Digital Editors Network, Vision+Media (supported by the European Regional Development Fund and the Northwest Regional Development Agency), Journal Local and MEN Media.

A special thanks to Louise & Nigel at Inside the M60 and Jacqui at Vision+Media for the organisational help.

Links to posts about Hacks and Hackers Hack Day Manchester:

Any more you have spotted? Any names I’ve missed off? Videos will be added to this post soon. If you have technical detail, or screen shots, or presentations to add please email judith [at] scraperwiki.com.

Our youngest hacker yet, with Aidan:

October 16 2010


ScraperWiki: Hacks and Hackers day, Manchester.

If you’re not familiar with scraperwiki it’s “all the tools you need for Screen Scraping, Data Mining & visualisation”.

These guys are working really hard at convincing journos that data is their friend by staging a steady stream of events bringing journos and programmers together to see what happens.

So I landed at NWVM’s offices to what seemed like a mountain of laptops, fried food, coke and biscuits to be one of the judges of their latest hacks and hackers day in Manchester (#hhhmcr). I was expecting some interesting stuff. I wasn’t disappointed.

The winners

We had to pick three prizes from the six or so projects started that day and here’s what we (Tom Dobson, Julian Tait and me) ended up with.

The three winners, in reverse order:

Quarternote: A website that would ‘scrape’ myspace for band information. The idea was that you could put a location and style of music in to the system and it would compile a line-up of bands.

A great idea (although more hacker than hack) and if I was a dragon I would consider investing. These guys also won the Scraperwiki ‘cup’ award for actually being brave enough to have a go at scraping data from Myspace. Apparently myspace content has less structure than custard! The collective gasps from the geeks in the room when they said that was what they wanted to do underlined that.

Second was Preston’s summer of spend. Local councils are supposed to make details of any invoice over 500 pounds available, and many have. But many don’t make the data very useable. Preston City Council is no exception. PDFs!

With a little help from Scraperwiki the data was scraped, tidied and put in a spreadsheet and then organised. It threw up some fun stuff – 1000 pounds to The Bikini Beach Band! And some really interesting areas for exploration – like a single payment of over 80,000 to one person (why?) – and I’m sure we’ll see more from this as the data gets a good running through. A really good example of how a journo and a hacker can work together.

The winner was one of a number of projects that took the tweets from the GMP 24hr tweet experiment; what one group titled ‘Genetically modified police’ tweeting :). Enrico Zini and Yuwei Lin built a searchable GMP24 tweet database (with a great write-up of the process), which allowed searching by location, keyword, all kinds of things. It was a great use of the data and the working prototype was impressive given the time they had.

Credit should go to Michael Brunton-Spall of the Guardian, who turned the tweets into a useable dataset, which saved a lot of work for those groups using the tweets as the raw data for their projects.

Other projects included mapping deprivation in Manchester and a legal website that, if it comes off, will really be one to watch. All brilliant stuff.

Hacks and hackers we need you

Given the increasing amount of raw data that organisations are pumping out, journalists will find themselves vital in making sure that they stay accountable. But I said in an earlier post that good journalists don’t need to know how to do everything, they just need to know who to ask.

The day proved to me and, I think, to lots of people there that asking a hacker to help sort data out is really worth it.

I’m sure there will be more blogs etc about the day appearing over the next few days.

Thanks to everyone concerned for asking me along.

