Tumblelog by Soup.io

July 26 2012

11:54

Analyzing documents with the help of the crowd

A Duke computer scientist and his graduate students are hoping their FirstPass project will help journalists analyze massive dumps of public records by harnessing the power of the crowd with Amazon's Mechanical Turk.

February 28 2012

16:12

How a conference taught me I know nothing

The 2012 Computer-Assisted Reporting conference in St. Louis provided journalists with plenty of new reporting tools. Here's our top-15 list of applications and websites from the weekend.

June 13 2011

02:13

Sarah Palin’s emails and a call for collaborative journalism

If you were committing an act of news on Friday, June 10, chances are every national news organization missed it.

Why? We all had boxes and boxes of printed emails of an ex-political official to go through. From the New York Times to Mother Jones/MSNBC/ProPublica, the Washington Post and my own employer – many national news sources spent enormous amounts [...]

December 21 2010

15:00

Tracking documents, numbers, and military social media: New tools for reporting in an age of swarming data

To conclude our series of videos from the Nieman Foundation’s secrecy and journalism conference, here’s a video of the day’s final session — the Labbiest of the bunch. Our own Megan Garber moderates a set of presentations on new digital approaches to dealing with new data and new sources.

The presenters: John Bohannon, contributing correspondent for Science Magazine; Teru Kuwayama, Knight News Challenge winner for Basetrack; Aron Pilhofer, editor of interactive news at The New York Times and Knight News Challenge winner for DocumentCloud; and Bill Allison, editorial director of the Sunlight Foundation. Below is an embed of the session’s liveblog.

December 09 2010

17:03

Altering Docs? Now There's a Tool for That in DocumentCloud

When we embarked on the DocumentCloud project, tools for altering documents were the furthest thing from our minds. After all, a responsible journalist doesn't tweak source documents!

But one of the first papers to embed material using DocumentCloud needed to do just that. The Chicago Tribune accompanied their coverage of a troubled foster home with a collection of letters and court orders. Though the documents offered an excellent illustration of the state child services agency's lax oversight and slipped follow-ups, they were predictably full of personal information about children in the foster care system, individual agency staff names and other personal and identifying details about private individuals that the Tribune opted to omit from their reporting. That decision, however, left the news apps team replacing the whole stack of letters multiple times before the package was finally ready to post.

A tool, right inside of DocumentCloud, for replacing, removing and re-ordering the pages of a document would have helped them a lot.

When the "PBS NewsHour" shared a century-old handwritten Mark Twain essay, our OCR tools were not nearly up to the task of reading his handwriting. NewsHour transcribed the 10-page essay by hand and we worked with them to replace the text stored in DocumentCloud and displayed on the embedded letters.

By the time that Memphis' Commercial Appeal wanted to run a lengthy series of handwritten letters from Abdulhakim Mujahid Muhammad, a young Memphis-born man who opened fire on a military recruiting center in Little Rock last summer, we at DocumentCloud were busy supporting nearly 200 newsrooms -- offering to hide the text tab was the best we could do.

What NewsHour and Commercial Appeal really needed was a tool, right inside of DocumentCloud, with which to edit the text of each document.

And so, we've assembled what we think is a sweet suite of tools to let you re-order pages, insert new ones, delete old ones and edit the text that will appear in your embedded document. Check out our user guide to see how it all works. We welcome your bugs, feedback, rants, raves and, as ever, your documents.

September 07 2010

19:55

DocumentCloud Helps Newspapers Bring Transparency to Government

Since we last updated readers on DocumentCloud's progress, we've made it much easier to upload a lot of documents at once, and introduced a related documents search that uses data about names and places provided by OpenCalais to find documents that are probably related to the one you're looking at. We've also added a bit more context to the data we help reporters comb through. Most of this work is happening inside the gates of the DocumentCloud workspace, but it is resulting in some lively reporting. For example...

Using Documents to Tell the Story

This summer, as the federal 5th Circuit Court of Appeals prepared to hear arguments in a challenge to the University of Texas's affirmative action policy, Texas Tribune complemented its coverage of the case with nearly 200 pages of annotated court documents, including the original district court ruling, the university's appellate brief, as well as that of the plaintiffs in the case.

The Las Vegas Sun incorporated quite a trove of documents into its series on hospital care in Las Vegas. Readers were invited to browse everything from Department of Health and Human Services reports to individual records, right along with the Sun's reporters. When they say that hospital-acquired infections cost the country $30 billion per year or account for close to 100,000 deaths, they back each number up with original documents.

The Columbia Missourian annotated the city budget and took a local blogger to task for exaggerating Columbia, Missouri's cash reserves.

When Texas Governor Rick Perry challenged reporters to find anyone who can out-work him, Texas Tribune posted the governor's May 2010 schedule alongside that of Florida's Gov. Crist, New York's Gov. Paterson and California's Gov. Schwarzenegger and invited readers to help them skim over a hundred pages of briefings, receptions and photo ops for stories deserving of a closer look.

The Washington Post supplemented its reporting on the cozy relationship between the oil industry and the federal agency assigned to regulate them with an annotated report on the prospects for "Moving beyond Conflict" between regulator and regulated. Their document cache also included reports outlining just how cozy things had gotten by 2008. As Emily Keller pointed out in Free Government Info, a transparency project, documents like these give more transparency to journalism itself.

New Features in the Testing Lab

We're also hard at work fine-tuning the document viewer, transforming it into something that users could reasonably plug into a template with a narrower content column. Thus far folks have been stuck with a full-page viewer. We haven't fully rolled it out yet, but we've worked with a couple of our beta testers to implement it already.

Iowa State has a new men's basketball coach, and the Des Moines Register added all 14 pages of his contract to its coverage of the finer points contained in it. Among the unusual clauses? Hoiberg can walk away if the university decides to increase academic standards for student athletes beyond the NCAA's minimum.

Meanwhile, at the Santa Fe Reporter, Alexa Schirtzinger opted not to publish tables of information right inside her story on elder abuse in New Mexico, but she did use her staff blog to share the data that she had such a hard time tracking down. An annotation highlights the numbers that showed her that New Mexico fields more abuse complaints per nursing home bed than any other state.

DocumentCloud watchers will notice that the Register posted the contract right on the same page as Randy Peterson's writeup instead of displaying the document in a full page. We'll be making tweaks like this a lot easier for all of our users. In the meantime, if you're skilled at the art of reverse engineering JavaScript, you can view the source of the Register's story (or the Reporter's) to see just how they toggled the sidebar or zoom on those documents.

August 03 2010

17:51

DocumentCloud Helps Arizona Paper with Annotated Immigration Law

We opened the DocumentCloud floodgates less than six months ago and we're still working hard to make DocumentCloud a better tool. We're rolling out improvements at a healthy clip including SSL support, better documentation, and support for cross-newsroom collaboration. We continue to listen to feedback from our really incredible crop of beta testers (who now number close to 500!).

There are nearly 100 newsrooms participating in the DocumentCloud beta and requests are still pouring in. We've been doing a fair amount of outreach and more is in the works, but it turns out that our users are our best advocates: After John Addams in Great Falls, Montana, blogged about his experiences with DocumentCloud we were deluged with requests from Montana news organizations large and small.

Use Cases in Arizona, Chicago, Memphis

The really great stories about how reporters are using DocumentCloud continue to surprise all of us.

Not long after Arizona's governor signed that state's now infamous immigration law, the Arizona Republic published the bill in full, complete with annotations by a local law professor. Republic reporters told us that traffic to the annotated legislation outpaced the paper's popular entertainment guide in its first weekend, and continues to draw traffic as the bill stays in the news.

Meanwhile, in Chicago, reporters at the Tribune have been uploading each document and transcript entered into evidence in former governor Rod Blagojevich's corruption trial -- the documents are just part of their extensive coverage of the trial.

In Memphis, the Commercial Appeal published a sample ballot alongside their voter guide.

These are just a few of the great uses reporters have put DocumentCloud to -- there are many more great stories already out there and plenty of new ones on the way.

January 21 2010

16:11

How Could News Organizations Manage Documents Better?

How are you handling primary source material on your website?

OaklandLocal is summarizing a new report on a shootout in March that left five people dead. They use Scribd to embed reports directly on their site, but can't provide annotations.

California Watch is looking at what campaign season generosity bought for agribusiness in the Sacramento-San Joaquin River Delta. They put together a great Flash widget that highlights noteworthy portions of the documents they reviewed, but they had to sit down with a highlighter, circle relevant passages, and then scan each document for the site.

ProPublica, the Los Angeles Times, ABC News and the Washington Post are collaborating to report on civilian contractors injured in Iraq who are now struggling to get badly needed health care. But if you want to read the Congressional report that found covering private contractors has proved quite profitable for insurance companies, you'll have to download the whole report.

If the reporters behind these stories had DocumentCloud at their fingertips, we could have saved them a decent amount of work, made the documents they wanted to share more accessible, and invited deeper reader examination. And that is just the beginning. We don't know yet what that examination might yield. Stay tuned!

As we work, we really do want to know: What are you doing with documents now? Can you point out a recent story that you'd like to look at more closely? Please share your thoughts and feedback in the comments.

November 05 2009

15:27

Kicking Off the Grant Process With Monitoring and Evaluation

We at the Jefferson Institute began our experience as a 2009 Knight News Challenge winner with one of the more exciting and misunderstood elements of the grant cycle: monitoring and evaluation (M&E).

When done properly, M&E begins with the grantee setting out clearly the objectives of the grant, the activities necessary to achieve the objectives, and the resources applied to make these activities happen. So, for example, blogging for Idea Lab is an activity. An objective might be to create a thriving community, or to help guide the way for community news in transition.

For our Knight project, the objective is a bit more specific: to create open source tools that make community news and information easy to visualize. Activities include mapping existing tools, surveying users for specific unmet needs, coding, testing, translating, demoing, fixing, etc. Our primary resource will be the Drupal community, which is also one of our project's main beneficiaries. Ideally, we will create a virtuous circle.

The grantee is expected to have a clear causal logic, setting out how the activities will achieve the objectives, and identifying verifiable measures to assess performance against targets at each level: resources, activities, and objectives. Especially objectives. It is important to do this well, because far too often the project gets underway and the grantee loses sight of the objectives. They end up obsessing about performance as it relates to activities and resources. This is natural because activities are much more easily controlled and measured than the messy causal chain leading to the objectives. The donor, meanwhile, is mostly interested in the objectives. These differing centers of attention are the root of most donor-grantee disputes.

By starting out so early on M&E -- essentially before the grant even begins -- Knight is demonstrating how these tools can be used for partnership and management, not merely bean-counting. Our opportunity as the grantee is to embrace their challenge of partnership.
