Tumblelog by Soup.io

October 06 2010

14:50

Pushing the Limits of What a Wiki Can Do with Councilpedia

Barely two decades into the digital age, we take online media for granted. So much is so easy and convenient -- at our fingertips -- that we can forget technology can only do so much. Then we come up with a great idea that leaves us with the challenge of how to successfully push the limits.

This is what has confronted Gotham Gazette as we move into the final stages of creating our Councilpedia site. Councilpedia, a Knight News Challenge winner that I've blogged about here previously, will explore more fully the links between money and politics in New York City.

Councilpedia will enable visitors to the site to share what they know about politicians and their donors. It is to be powered by MediaWiki to let people flag something -- noting, for example, that one contributor to a candidate owns land she hopes to get rezoned for a Walmart. Gotham Gazette staff will then confirm -- or delete -- the comment.

Filtering Data

The core of Councilpedia is information already on Gotham Gazette, information from City Council (on earmarks, for example) and, above all, the massive records from the city Campaign Finance Board on giving and spending. The sheer magnitude of all this data has posed an array of problems.

The city data, while thorough and accessible, is inscrutable to most New Yorkers -- a list of largely meaningless names. To make it easier to search and understand, we set out to code the data (to indicate large donors, those from the city, unions, the real estate industry, etc.). With some candidates having thousands of contributors, this presented a massive task. Fortunately, we had some conscientious interns this summer who, between their other reporting responsibilities, dutifully researched and coded line after line of information under the supervision of our city government editor, Courtney Gross.

Readers will be able to examine this data in a number of ways. They can view by candidate. They can find out who else the contributor helped fund. They can look at intermediaries and determine whose money they bundled and then who it went to. And so on.

For the wiki, though, this mountain of information has been a bit much. When technical manager William JaVon Rice began uploading the data into spreadsheets he had created, the process took 36 hours and produced some 31,000 pages -- a sure indication no one would ever attempt this in print. The system balked, overwriting pages, for example, which required Rice to check every candidate's list of often hundreds of contributors to determine which ones had been overwritten. Then he had to undo the overwrite.

Pushing The Limits of MediaWiki

We're still planning to have this ready to show you in the next several weeks. And we think you'll be impressed. Not to boast, but the reporters, campaign finance aficionados and followers of city government who viewed our test felt that way.

But we do see a number of issues looming ahead. Councilpedia is intended as a living, breathing site, meaning data will continue to accumulate as officials collect more money, award more earmarks, pass more bills, and so on. The updating poses a challenge for a small non-profit like Gotham Gazette.

The magnitude of the new information -- added to the volumes we already have -- is likely to push the limits of MediaWiki even further.

With this in mind, we're looking for ways to automate more of the process. And we hope someone -- any takers out there? -- will make MediaWiki more robust or create an alternative.

As always, we appreciate your ideas, so feel free to share them in the comments below. And stay tuned for Councilpedia.

April 20 2010

19:43

Joomla to MediaWiki help

My programming skills are near nonexistent, so I need some serious help here.

I have some pages (about 150) in a Joomla install that I want to open up to collaborative editing. I figure the best way to do this is to put the content of those Joomla pages into a wiki. Since both are based on MySQL databases, I figure exporting, converting and importing is my best bet.

(I'm figuring I'll do the export and import using CSV files and phpMyAdmin, because I know how to do that and learning the database commands seems more difficult than it's worth here. Ditto for massive regexing, plus the fact that I don't know all the tags that might be lurking in the content.)

Perl seems to have the best combination of capabilities, but the CSV import extension returns an array of hashes. I know that data types can nest somewhat in Perl, but I have no idea how to traverse the hashes in the array to get the two fields out of the Joomla database export. Concatenating the fields should be simple, and the HTML-to-Wiki conversion extension looks reasonably easy. But again, I don't know how to create the CSV to import into the Wiki database.
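For comparison, here is a minimal Python sketch of the traversal and concatenation steps described above, using only the standard library. It rests on assumptions: the column names `title`, `introtext` and `fulltext` are guesses at what the Joomla export contains and should be checked against the real CSV header, and the function names `load_pages` and `write_pages` are made up for illustration. It does not convert HTML to wiki markup; that step would still need a separate tool.

```python
import csv
import io


def load_pages(csv_text):
    """Return a list of (title, body) tuples from a CSV export.

    Assumes 'title', 'introtext' and 'fulltext' columns (hypothetical
    names; adjust them to match the actual header line).
    """
    pages = []
    # DictReader yields one dict per row, keyed by the header line.
    # This is the Python counterpart of Perl's array of hashes.
    for row in csv.DictReader(io.StringIO(csv_text)):
        intro = (row.get("introtext") or "").strip()
        full = (row.get("fulltext") or "").strip()
        # Join the two HTML fields, skipping whichever one is empty.
        body = "\n\n".join(part for part in (intro, full) if part)
        pages.append((row["title"], body))
    return pages


def write_pages(pages, out_file):
    """Write (title, body) rows as CSV to an open file object."""
    writer = csv.writer(out_file)
    writer.writerow(["title", "body"])
    writer.writerows(pages)


if __name__ == "__main__":
    sample = (
        "title,introtext,fulltext\n"
        'About,"<p>Intro</p>","<p>More</p>"\n'
        'Contact,"<p>Call us</p>",\n'
    )
    for title, body in load_pages(sample):
        print(title, "->", repr(body))
```

Reading the rows as dictionaries means each field is fetched by column name rather than position, so the sketch keeps working if the export has extra columns or a different column order.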

PHP has a function to read CSV a line at a time, making the variable much less complex, but I can't find a converter from HTML to MediaWiki.

Python has been a bust so far as finding anything.

So, any direction? Things I have missed or not considered?

March 18 2010

21:26

In Search of a Wiki with Track Changes

Most of us have become so used to being able to do so much online that it comes as a surprise when we want to do something and can't find the tools to do it.

That's the situation confronting the Gotham Gazette staff as we move forward with our Councilpedia project that will use crowdsourcing to probe the links between money and politics. I'm hoping you can help. (For more on Councilpedia see my earlier post.)

Monitoring Revisions

The project will enable registered users to contribute information on campaign donors and the politicians they help. Like Wikipedia, Councilpedia needs to allow readers to easily provide us with information. But we also want the ability to monitor revisions much the same way that Microsoft Word's track changes does.

Our technical manager, JaVon Rice, has found that MediaWiki simply does not do everything we need it to do, and he is looking for something essentially like Writeboard or Google Docs, except for public rather than just internal use.

Any ideas? Please share them in the comments.


November 06 2009

14:54

Welcome to Davis, Calif.: Six lessons from the world’s best local wiki

Ah, Davis: home of 60,000 people, 30,000 students, 188 sunny days a year, a 16 percent bike commute mode share and the busiest local wiki in the world.

If I were Omaha World-Herald Publisher Terry Kroeger, I’d be booking my post-holiday flight immediately.

As Gina reported here last week, Omaha’s employee-owned metro daily just bought WikiCity, an Omaha-based Web startup that wants to provide mini-Wikipedias for every city in the country. Creating a cheap platform for evergreen, user-generated local Web content has been tried, um, once or twice before. But with some notable exceptions, corporations have turned out to be really, really bad at this.

Philip Neustrom hasn’t.

Today, the quirky 500-page wiki Neustrom launched with fellow UC Davis math student Mike Ivanov in 2004 has 14,000 pages and drew 13,000 edits by 3,300 users last month, averaging 10,000 unique visitors daily. More importantly, it’s the best way in town to find a lost cat, compare apartment rental prices or get a list of every business open past 10 p.m. Operating budget, not counting its founders’ part-time volunteer labor: about $2,000 a year.

What’s the secret? Neustrom, who now wrangles code for the Citizen Engagement Lab in the Bay Area, was nice enough to tell us.

Wikis need content to breed content. Or, as evergreen-content guru Matt Thompson put it last week, a wiki written primarily by robots will appeal primarily to robots.

“Starting anything is hard,” said Neustrom, now 25. “The issue is predominantly an issue of outreach, of coordinating people and making sure people understand that they can’t just put something up there and add 50 pages and walk away, and then come back in a month and hope that it’s taken off.”

Instead, Neustrom and Ivanov convinced some of their friends to spend four summer months writing snippets about things that only exist in Davis, like drunken biking through late-night fog, oversized playground equipment and the smell from the cow farm on the edge of town.

“We were just trying to do something that we liked,” Neustrom said. “We certainly weren’t trying to do anything that was very useful.”

Business information is the holy grail. Pages about your local toad tunnel are dandy, Neustrom said, and quirky content kept the site from feeling generic to early users. But the feature that made DavisWiki take off was what the traditional media calls “consumer reporting.”

“After we’d sort of seeded it with 500 pages or something like that, we opened it up to the public,” Neustrom said. “First, it was pretty slow going. Nothing really happened.”

Then, sometime in late 2005, pages on things like lunch specials and Davis’s nicest bathrooms started filling up. Local business coverage has been “a big driving force” ever since, Neustrom said. Today, he said, retail businesses in town often keep their own information on DavisWiki up to date.

A wiki’s strengths kick in after one year. The web craves news like kids crave sugar. Blogs and tweets are gobbled fast and burn quick. But wikis are the whole grains of the web: One year after news breaks, someone will want to find and link to it again — and a wiki is likely to be the only place it’s still hanging around.

“All of the existing online resources for sort of cataloging anything about the town were sort of time-based,” Neustrom said. “After about a year and a half, these things would sort of disappear, even if they’d been around for 100 years, like the local newspaper…So we became the resource of record.”

Start with a subculture, then build out to a general audience. DavisWiki has always aspired to cover its whole town, but it’s always served students best.

That’s all right, Neustrom thinks. If he’d tried to please everybody who showed up, no one would have come back.

“When building something like this, you can’t just aim for this wide spectrum at first,” Neustrom said. Some companies try to launch wikis by writing programs that “crawl through a database, that spit out statistics and create 13 million pages and put that out there and hope that it’s going to stick. You can’t do that. It’s just not going to work.”

Neustrom, who spent 2004 sharing a house with musicians, found his base among the artsy, but he thinks any subculture would do. “You could have, like, a physics grad student start a community for their town, and it’s a bunch of physics nerds,” he said. “And that could spiral out and out.”

Keep your content open source, no matter what. Don’t do it for marketing reasons or out of the kindness of your heart. Do it because it’s the only way to guarantee to your users that if you fold, all their hard work won’t die with you.

Good wikis inspire rabid devotion — if they don’t, they never become good wikis. Neustrom and Ivanov keep their budget online and think of the project as a user co-op. Their users did, too. “There are people on there who literally spend four hours a day looking at DavisWiki,” Neustrom said. “People had free [computer lab] pages every quarter, so they would use their excess printing to print out 400 fliers and staple them to every room on campus.”

People don’t do that for sites they think are “neat,” Neustrom said. They do it for sites they own.

Don’t get hung up on mimicking Wikipedia. Sure, it may be the most useful object ever created by human beings. But as Marshall Poe showed in his terrific biography of Wikipedia’s youth, its rules — universal editorship, neutral point of view, no original research — were forged out of year-long flamewars among the early Wikipedians. Neustrom and his friends didn’t think NPOV was suited to an inherently Davis-centric site, so they ditched it.

Wikipedia’s widely used software, MediaWiki, isn’t perfect either. DavisWiki uses a modified Sycamore platform but it, too, has flaws.

“People want to be able to search for all elementary schools within a certain radius of a certain point, or all of the restaurants that serve vegan food,” Neustrom said. “MediaWiki suffers the same issue [as Sycamore] — it was written before the advent of modern web framework.”

Neustrom is yearning for a modern wiki platform. That’s why he’s been messing around with Django this year. It’s also why he’s incorporating Wikispot, the nonprofit he set up to reproduce DavisWiki for other towns and topics, as a 501(c)(3).

Looking for a tax write-off, Terry?

Photo by Arlen used under a Creative Commons license.
