What are the best tools for "scraping" data off a Web page for analysis in Excel or other software?

My former student Michelle Minkoff answered this question, at least in part, on Poynter.org today: http://www.poynter.org/column.asp?id=31&aid=183176. Her post includes links to two wonderful tutorials. I'm interested in other suggestions, and also in approaches for someone who isn't too afraid of coding to write or adapt their own scraper.
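To make the "write your own scraper" half of the question concrete, here is a minimal sketch of the kind of thing I have in mind: pull the cells out of an HTML table and write them as CSV, which Excel can open directly. It uses only Python's standard library (`html.parser` and `csv`). The names `TableScraper`, `table_to_csv` and `SAMPLE_HTML` are made up for illustration, the sample data is invented, and this is not taken from the tutorials linked above. It assumes a single, non-nested table; a real page would likely need more care.

```python
import csv
import io
from html.parser import HTMLParser


class TableScraper(HTMLParser):
    """Collects the text of every <td>/<th> cell, grouped by <tr> row.

    Hypothetical helper for illustration; assumes tables are not nested.
    """

    def __init__(self):
        super().__init__()
        self.rows = []      # finished rows, each a list of cell strings
        self._row = None    # row currently being built, or None
        self._cell = None   # text fragments of the current cell, or None

    def _finish_cell(self):
        # Close the open cell (if any) and add its text to the current row.
        if self._cell is not None and self._row is not None:
            self._row.append("".join(self._cell).strip())
        self._cell = None

    def _finish_row(self):
        # Close the open row (if any), including a cell left unclosed.
        self._finish_cell()
        if self._row is not None:
            self.rows.append(self._row)
        self._row = None

    def handle_starttag(self, tag, attrs):
        if tag == "tr":
            self._finish_row()   # tolerate a missing </tr>
            self._row = []
        elif tag in ("td", "th") and self._row is not None:
            self._finish_cell()  # tolerate a missing </td> or </th>
            self._cell = []

    def handle_data(self, data):
        # Only keep text that appears inside a table cell.
        if self._cell is not None:
            self._cell.append(data)

    def handle_endtag(self, tag):
        if tag in ("td", "th"):
            self._finish_cell()
        elif tag in ("tr", "table"):
            self._finish_row()


def table_to_csv(html):
    """Return the table cells found in `html` as CSV text."""
    scraper = TableScraper()
    scraper.feed(html)
    scraper.close()
    out = io.StringIO()
    csv.writer(out, lineterminator="\n").writerows(scraper.rows)
    return out.getvalue()


# Invented sample page; a real scraper would download the HTML first.
SAMPLE_HTML = """
<table>
  <tr><th>City</th><th>Population</th></tr>
  <tr><td>Springfield</td><td>30,000</td></tr>
</table>
"""

if __name__ == "__main__":
    print(table_to_csv(SAMPLE_HTML), end="")
```

Saving the returned text to a file ending in `.csv` should be enough to open it in Excel. Fetching the page itself (for example with `urllib.request.urlopen`) is left out so the sketch runs without a network connection.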

Part of the reason I ask is that I've been thinking that writing a scraper might be an interesting final project for a course introducing programming to journalists. The rationale (along the lines of how I've taught computer-assisted reporting in the past) is that it's the kind of project whose value a journalist would see immediately. So in addition to suggested tools and approaches, I'd be interested in feedback on this idea.
